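[Question title]: List of built-in aggregation and transform primitives
[Posted]: 2019-06-14 21:26:00
[Question description]:

First of all, I love Featuretools. It has made my work much easier and more productive. A quick question: I'm looking for a complete list of the non-custom aggregation and transform primitives, but I can't seem to find one. Should I just take the list of classes in the API reference and replace the capital letters with lowercase letters (and underscores)?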

[Comments]:

    Tags: featuretools


    [Solution 1]:

    If you run featuretools.list_primitives(), it returns a DataFrame listing every built-in primitive. Any string in the "name" column can be passed to ft.dfs.

    >>> import featuretools as ft   
    >>> ft.list_primitives()
                                   name         type                                        description
    0                      percent_true  aggregation           Determines the percent of `True` values.
    1                              last  aggregation               Determines the last value in a list.
    2                          num_true  aggregation                Counts the number of `True` values.
    3                               std  aggregation  Computes the dispersion relative to the mean v...
    4                        num_unique  aggregation  Determines the number of distinct values, igno...
    5                               sum  aggregation     Calculates the total addition, ignoring `NaN`.
    6                              skew  aggregation  Computes the extent to which a distribution di...
    7                              mode  aggregation       Determines the most commonly repeated value.
    8                  time_since_first  aggregation  Calculates the time elapsed since the first da...
    9                               max  aggregation  Calculates the highest value, ignoring `NaN` v...
    10                           median  aggregation  Determines the middlemost number in a list of ...
    11                             mean  aggregation         Computes the average for a list of values.
    12                  time_since_last  aggregation  Calculates the time elapsed since the last dat...
    

    You can also import the primitive classes and pass them directly. For example, these two calls are equivalent.

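    >>> from featuretools.primitives import Max, TimeSincePrevious
    >>> # Max is an aggregation primitive; TimeSincePrevious is a transform primitive.
    >>> ft.dfs(agg_primitives=[Max], trans_primitives=[TimeSincePrevious], ...)
    >>> ft.dfs(agg_primitives=["max"], trans_primitives=["time_since_previous"], ...)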
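    On the naming question itself: the pairs shown here (Max and "max", TimeSincePrevious and "time_since_previous") are consistent with a plain CamelCase-to-snake_case conversion. The sketch below illustrates that conversion using only the standard library. The helper name primitive_name is hypothetical and not part of Featuretools, and the rule is an assumption drawn from these examples, so treat the "name" column of ft.list_primitives() as the authoritative source.

```python
import re


def primitive_name(class_name: str) -> str:
    """Convert a CamelCase class name to a lowercase, underscore-separated string.

    Hypothetical helper for illustration only; it is not a Featuretools function.
    """
    # Insert an underscore before every uppercase letter except the first
    # character, then lowercase the whole string.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


print(primitive_name("Max"))                # max
print(primitive_name("TimeSincePrevious"))  # time_since_previous
```

    If a converted name does not appear in the "name" column, the primitive's string name simply differs from this rule, and the value from list_primitives() is the one to pass to ft.dfs.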
    

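    Importing the primitive class is useful when you need to change one of its parameters. For example, to make TimeSincePrevious return its result in hours (the default is seconds):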

    >>> ft.dfs(agg_primitives=[Max], trans_primitives=[TimeSincePrevious(unit="hours")], ...)
    
