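【Question title】: How to pick from multiple lists randomly to fill DataFrame columns
【Posted】: 2020-07-15 19:08:06
【Question description】:

I want to fill a Pandas DataFrame with 3 columns and 20 rows using random values drawn from the 3 lists below. I can't figure out what I'm doing wrong. Any suggestions?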

import random
import pandas as pd 
import numpy as np


tests= ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]

df = pd.DataFrame()

for i in range(1,21):
    df = df.append(
    {'TEST': random.choice(tests),
    'PROJ': random.choice(projects),
    'NUMBER': random.choice(number)})

【Question comments】:

    Tags: python pandas dataframe random


    【Solution 1】:

    You can use np.random.choice:

    import numpy as np
    import pandas as pd
    
    tests = ['TestA', 'TestB', 'TestC', 'TestD']
    projects = ['AK', 'AA', 'JH', 'WM']
    number = [10, 100, 200, 1000, 2000]
    
    num_rows = 20
    
    # for repeatability, drop in actual code
    np.random.seed(1)
    
    df = pd.DataFrame({
        'TEST': np.random.choice(tests, size=num_rows),
        'PROJ': np.random.choice(projects, size=num_rows),
        'NUMBER': np.random.choice(number, size=num_rows)
    })
    

    Output:

         TEST PROJ  NUMBER
    0   TestB   JH     100
    1   TestD   AA     100
    2   TestA   JH     100
    3   TestA   AK     100
    4   TestD   WM      10
    5   TestB   AK    2000
    6   TestD   JH     100
    7   TestB   AK      10
    8   TestD   AA      10
    9   TestA   JH    1000
    10  TestA   JH     200
    11  TestB   AK     100
    12  TestA   WM      10
    13  TestD   WM    1000
    14  TestB   AA     100
    15  TestA   AA     100
    16  TestC   WM    1000
    17  TestB   JH    2000
    18  TestC   AK      10
    19  TestA   JH     100
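
As a side note, the same idea can be sketched with NumPy's newer Generator API, which keeps the random state in a local object instead of the global seed. This is only a sketch: the name rng and the seed value 1 are my own choices, and because default_rng uses a different bit generator than np.random.seed, the values will not match the output shown above.

```python
import numpy as np
import pandas as pd

tests = ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]

num_rows = 20

# Local generator; the seed (1) is arbitrary and only there for repeatability
rng = np.random.default_rng(1)

# Each column is drawn independently, with replacement, from its own list
df = pd.DataFrame({
    'TEST': rng.choice(tests, size=num_rows),
    'PROJ': rng.choice(projects, size=num_rows),
    'NUMBER': rng.choice(number, size=num_rows),
})

print(df.shape)  # (20, 3)
```

Drop the seed argument in real code if you want different values on every run.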
    

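    【Comments】:

      【Solution 2】:

      Ignore the index when appending. DataFrame.append only accepts a dict when ignore_index=True is passed; without it, it raises a TypeError: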

      for i in range(1,21):
          df = df.append(
          {'TEST': random.choice(tests),
          'PROJ': random.choice(projects),
          'NUMBER': random.choice(number)},
          ignore_index=True)
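
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above only runs on older versions. A rough equivalent that works on current pandas is to collect the rows in a plain list and build the DataFrame once at the end. The variable name rows is just an illustrative choice:

```python
import random
import pandas as pd

tests = ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]

# Build one dict per row, picking each value independently at random
rows = [
    {'TEST': random.choice(tests),
     'PROJ': random.choice(projects),
     'NUMBER': random.choice(number)}
    for _ in range(20)
]

# Construct the DataFrame in a single step instead of appending row by row
df = pd.DataFrame(rows)

print(df.shape)  # (20, 3)
```

Building from a list also avoids copying the whole DataFrame on every iteration, which is what repeated appends did.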
      

      【Comments】:

        【Solution 3】:

        Very similar to @quang-hoang's version, except that it uses random.choices

        import random
        import pandas as pd
        
        
        tests = ['TestA', 'TestB', 'TestC', 'TestD']
        projects = ['AK', 'AA', 'JH', 'WM']
        number = [10, 100, 200, 1000, 2000]
        
        # call random.seed(...) first if you want reproducibility
        
        # random.choices samples with replacement; k is the number of draws
        _t = random.choices(tests, k=20)
        _p = random.choices(projects, k=20)
        _n = random.choices(number, k=20)
        
        df = pd.DataFrame({'Test': _t, 'Project': _p, 'Number': _n})
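
If reproducibility matters, one possible way to get it without touching the global random state is a dedicated random.Random instance. The helper build_frame and the seed value 42 below are hypothetical names and values picked for illustration, not part of the original answer:

```python
import random
import pandas as pd

tests = ['TestA', 'TestB', 'TestC', 'TestD']
projects = ['AK', 'AA', 'JH', 'WM']
number = [10, 100, 200, 1000, 2000]


def build_frame(seed, k=20):
    """Hypothetical helper: draw k rows using a private, seeded generator."""
    rng = random.Random(seed)  # independent of the module-level random state
    return pd.DataFrame({
        'Test': rng.choices(tests, k=k),
        'Project': rng.choices(projects, k=k),
        'Number': rng.choices(number, k=k),
    })


# The same seed yields the same DataFrame on every call
df_a = build_frame(42)
df_b = build_frame(42)
print(df_a.equals(df_b))  # True
```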
        

        【Comments】:
