【Question title】: Create a pandas DataFrame from a Cartesian product of two large lists
【Posted】: 2020-09-12 13:32:04
【Question】:

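I am looking for the simplest way to build a DataFrame from two others so that it contains every combination of their elements. For example, given these two DataFrames: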

import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

The result should be:

   0   1
0  A  x1
1  A  x2
2  A  x3
3  A  x4
4  A  x5
5  A  x6
6  A  x7
7  A  x8
8  B  x1
9  B  x2

I tried merging from the lists. That works for small lists but not for large ones. Thanks.

【Question comments】:

Tags: python pandas dataframe cartesian-product


【Solution 1】:
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

df1 = pd.DataFrame(list1)
df2 = pd.DataFrame(list2)

# Add a constant helper key so every row of df1 matches every row of df2.
df1['key'] = 0
df2['key'] = 0
print(df1.merge(df2, on='key', how='outer').drop(columns='key'))

Prints:

   0_x 0_y
0    A  x1
1    A  x2
2    A  x3
3    A  x4
4    A  x5
5    A  x6
6    A  x7
7    A  x8
8    B  x1
9    B  x2

...
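
For comparison, here is a minimal sketch of the same Cartesian product without a helper key, assuming pandas 1.2 or later, where `merge` accepts `how="cross"`. The column names `letter` and `code` are illustrative and not part of the question.

```python
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

# "letter" and "code" are hypothetical column names chosen for readability.
df1 = pd.DataFrame({"letter": list1})
df2 = pd.DataFrame({"code": list2})

# how="cross" pairs every row of df1 with every row of df2,
# so no dummy key column has to be added and dropped again.
cross = df1.merge(df2, how="cross")
print(cross.head(10))
```

Naming the columns up front also avoids the `0_x` / `0_y` suffixes seen in the output above.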

【Comments】:

    【Solution 2】:

    You want to pair every element of df1 with every element of df2.

    You can use df.merge:

    In [1820]: df1['tmp'] = 1   ## Create a dummy key in df1
    In [1821]: df2['tmp'] = 1   ## Create a dummy key in df2
    
    ## Merge both frames on `tmp`
    In [1824]: df1.merge(df2, on='tmp').drop(columns='tmp').rename(columns={'0_x': '0', '0_y': '1'})
    Out[1824]: 
        0   1
    0   A  x1
    1   A  x2
    2   A  x3
    3   A  x4
    4   A  x5
    5   A  x6
    6   A  x7
    7   A  x8
    8   B  x1
    9   B  x2
    10  B  x3
    11  B  x4
    12  B  x5
    13  B  x6
    14  B  x7
    15  B  x8
    16  C  x1
    17  C  x2
    18  C  x3
    ...
    ...
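
If the inputs are plain lists, as in the question, the two intermediate DataFrames may not be needed at all. The following is a sketch using `pd.MultiIndex.from_product`, which builds the product directly from the lists. The level names `letter` and `code` are assumptions made for this example.

```python
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

# Build the Cartesian product as a MultiIndex; the level names are hypothetical.
index = pd.MultiIndex.from_product([list1, list2], names=["letter", "code"])

# Turn the index levels into ordinary columns with a fresh 0..n-1 row index.
product_df = index.to_frame(index=False)
print(product_df.head(10))
```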
    

    【Comments】:

      【Solution 3】:

      You can use itertools.product:

      import itertools
      import pandas as pd
      
      list1 = ["A", "B", "C", "D", "E"]
      list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]
      result = pd.DataFrame(list(itertools.product(list1, list2)))
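
`list(itertools.product(...))` creates one Python tuple per row before pandas sees any data, which can get expensive for large inputs. One possible alternative, sketched here under the assumption that the values in each list are unique, is to build integer codes with NumPy and wrap them in categorical columns. The column names are illustrative.

```python
import numpy as np
import pandas as pd

list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

# Each position of list1 is repeated len(list2) times: 0,0,...,1,1,...
left_codes = np.repeat(np.arange(len(list1)), len(list2))
# The positions of list2 are cycled len(list1) times: 0,1,...,7,0,1,...
right_codes = np.tile(np.arange(len(list2)), len(list1))

# Categorical columns store small integer codes plus one copy of each label.
# from_codes requires unique categories, hence the uniqueness assumption.
compact = pd.DataFrame({
    "letter": pd.Categorical.from_codes(left_codes, categories=list1),
    "code": pd.Categorical.from_codes(right_codes, categories=list2),
})
print(compact.head(10))
```

Whether this helps depends on why the original approach fails. It reduces memory per row, but the result still has `len(list1) * len(list2)` rows.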
      

      【Comments】:

      • Thanks for your answer. That is what I did, but it does not work for large DataFrames.
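
The comment does not say how the approach fails. If the cause is that the full product does not fit in memory (an assumption), one option is to produce it in chunks and process each chunk separately. The helper `product_chunks` and its `chunk_size` parameter below are hypothetical names introduced for this sketch.

```python
import itertools
import pandas as pd


def product_chunks(left, right, chunk_size=100_000):
    """Yield the Cartesian product as DataFrames of at most chunk_size rows."""
    pairs = itertools.product(left, right)  # lazy: pairs are produced on demand
    while True:
        # Pull the next chunk_size pairs from the iterator.
        chunk = list(itertools.islice(pairs, chunk_size))
        if not chunk:
            break
        yield pd.DataFrame(chunk, columns=["letter", "code"])


list1 = ["A", "B", "C", "D", "E"]
list2 = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8"]

# Collected into a list here only to inspect the chunk sizes.
chunks = list(product_chunks(list1, list2, chunk_size=16))
print([len(chunk) for chunk in chunks])  # [16, 16, 8]
```

With real data, each chunk would typically be written to disk or aggregated inside the loop rather than collected, so that only one chunk is held in memory at a time.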