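【Posted】: 2020-05-30 06:40:48
【Problem description】:
I am trying to filter a master dataset (a pandas DataFrame) by applying filter conditions that are supplied as input from another spreadsheet.
Master dataset: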
+---------+--------+-----+-----------+-------------+-------+-------------+-------------+----------+
| Cust Id | gender | Age | Indicator | X Indicator | State | foreign_ind | Eu Resident | address1 |
+---------+--------+-----+-----------+-------------+-------+-------------+-------------+----------+
| 987685 | M | 65 | Y | N | TX | N | N | XYZ,USA |
| 987686 | F | 54 | Y | N | NJ | N | N | XYZ,USA |
| 987687 | M | 75 | Y | Y | NJ | N | N | XYZ,USA |
| 987688 | M | 45 | N | Y | NY | N | N | XYZ,USA |
| 987689 | F | 45 | Y | Y | NJ | N | N | XYZ,USA |
+---------+--------+-----+-----------+-------------+-------+-------------+-------------+----------+
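Below is the configuration list. We receive this input from end users in spreadsheet format and need to apply these conditions to the master dataset.
Condition input from the other spreadsheet: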
+-------------+-----------+--------+------------------------------+---------+-----------+--------+
| column1 | operator1 | value1 | Logical Condition(And or OR) | column2 | operator2 | value2 |
+-------------+-----------+--------+------------------------------+---------+-----------+--------+
| gender | == | F | | | | |
| gender | == | M | | | | |
| Age | >= | 75 | || | Age | >= | 45 |
| Indicator | == | Y | | | | |
| X Indicator | == | Y | | | | |
| State | == | NJ | | | | |
+-------------+-----------+--------+------------------------------+---------+-----------+--------+
Expected output DataFrame after applying the filters:
+---------+--------+-----+-----------+-------------+-------+-------------+-------------+----------+
| Cust Id | gender | Age | Indicator | X Indicator | State | foreign_ind | Eu Resident | address1 |
+---------+--------+-----+-----------+-------------+-------+-------------+-------------+----------+
| 987687 | M | 75 | Y | Y | NJ | N | N | XYZ,USA |
| 987689 | F | 45 | N | Y | DL | N | N | XYZ,USA |
+---------+--------+-----+-----------+-------------+-------+-------------+-------------+----------+
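For reference, here is a minimal sketch of one way the condition rows could be turned into boolean masks. It is not a confirmed solution. Several things in it are assumptions that the question does not state: the shortened condition column names (`logic` stands in for "Logical Condition(And or OR)"), the helper names `simple_mask`, `rule_mask` and `apply_rules`, the `X Indicator` rule being spelled as in the master dataset, and above all the combining semantics (rows that target the same `column1` are OR-ed, different columns are AND-ed). Operators are looked up in a fixed table from the standard `operator` module rather than passed to `eval`, so only the listed comparisons are accepted.

```python
import operator

import pandas as pd

# Master dataset, copied from the table in the question.
df = pd.DataFrame({
    "Cust Id": [987685, 987686, 987687, 987688, 987689],
    "gender": ["M", "F", "M", "M", "F"],
    "Age": [65, 54, 75, 45, 45],
    "Indicator": ["Y", "Y", "Y", "N", "Y"],
    "X Indicator": ["N", "N", "Y", "Y", "Y"],
    "State": ["TX", "NJ", "NJ", "NY", "NJ"],
    "foreign_ind": ["N"] * 5,
    "Eu Resident": ["N"] * 5,
    "address1": ["XYZ,USA"] * 5,
})

# Condition rows; in practice these would come from the user's spreadsheet.
# Column names here are shortened, hypothetical versions of the headers.
rules = pd.DataFrame(
    [
        ("gender", "==", "F", None, None, None, None),
        ("gender", "==", "M", None, None, None, None),
        ("Age", ">=", 75, "||", "Age", ">=", 45),
        ("Indicator", "==", "Y", None, None, None, None),
        ("X Indicator", "==", "Y", None, None, None, None),
        ("State", "==", "NJ", None, None, None, None),
    ],
    columns=["column1", "operator1", "value1", "logic",
             "column2", "operator2", "value2"],
)

# Whitelist of supported comparison operators.
OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def simple_mask(frame, column, op, value):
    """Boolean Series for a single `column op value` comparison."""
    return OPS[op](frame[column], value)


def rule_mask(frame, rule):
    """Boolean Series for one condition row, including its optional second clause."""
    mask = simple_mask(frame, rule.column1, rule.operator1, rule.value1)
    if pd.notna(rule.logic):
        second = simple_mask(frame, rule.column2, rule.operator2, rule.value2)
        if rule.logic.strip().lower() in ("||", "or"):
            mask = mask | second
        else:
            mask = mask & second
    return mask


def apply_rules(frame, rules):
    """Assumed semantics: OR rows sharing column1, AND across different columns."""
    per_column = {}
    for rule in rules.itertuples(index=False):
        mask = rule_mask(frame, rule)
        if rule.column1 in per_column:
            per_column[rule.column1] = per_column[rule.column1] | mask
        else:
            per_column[rule.column1] = mask
    combined = pd.Series(True, index=frame.index)
    for mask in per_column.values():
        combined = combined & mask
    return frame[combined]


result = apply_rules(df, rules)
print(result["Cust Id"].tolist())  # [987687, 987689]
```

Under these assumed semantics the sketch keeps Cust Id 987687 and 987689, the same two ids that appear in the expected output. The `Indicator` and `State` values shown for 987689 in the expected output differ from the master dataset, and the sketch does not reproduce that difference.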
【Discussion】:
- Is the input spreadsheet always the same?
Tags: python pandas dataframe filtering