【Posted】: 2019-02-27 00:32:03
【Question】:
Featuretools offers built-in handling of categorical variables:
variable_types={"product_id": ft.variable_types.Categorical} https://docs.featuretools.com/loading_data/using_entitysets.html
But should these columns be strings or pandas.Category type for the best compatibility with Featuretools?
Edit
Also, do all columns need to be specified manually, as in https://github.com/Featuretools/predict-appointment-noshow/blob/master/Tutorial.ipynb, or will they be inferred automatically from suitable pandas data types?
import featuretools.variable_types as vtypes

variable_types = {'gender': vtypes.Categorical,
                  'patient_id': vtypes.Categorical,
                  'age': vtypes.Ordinal,
                  'scholarship': vtypes.Boolean,
                  'hypertension': vtypes.Boolean,
                  'diabetes': vtypes.Boolean,
                  'alcoholism': vtypes.Boolean,
                  'handicap': vtypes.Boolean,
                  'no_show': vtypes.Boolean,
                  'sms_received': vtypes.Boolean}
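To make the two options in the question concrete, here is a minimal pandas-only sketch. The sample rows are made up for illustration; only the column names come from the dictionary above. It shows the same columns first as plain strings and integers, then cast to the pandas category and bool dtypes. Whether Featuretools handles the two representations differently is exactly what the question is asking, so the sketch deliberately stops at the pandas side.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "patient_id": ["p1", "p2", "p3"],
    "gender": ["F", "M", "F"],
    "age": [34, 61, 8],
    "no_show": [0, 1, 0],
})

# Option 1: leave the columns as loaded (strings and integers).
print(df.dtypes)

# Option 2: cast explicitly so the intended type is visible in the dtype itself.
df = df.astype({"gender": "category", "no_show": "bool"})
print(df.dtypes)
```

Comparing the two printed dtype listings shows what a type-inference step would have to work with in each case.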
【Discussion】:
Tags: python pandas feature-extraction feature-engineering featuretools