【Question Title】: Understanding Correlation Between Columns in a Pandas DataFrame
【Posted】: 2019-03-09 21:57:57
【Question】:

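I have a dataset with the daily sales of two products over the first 10 days after release. The DataFrame below shows, for each product, how many single items and how many dozens were sold by each day. It is assumed that no dozens of a product are sold before a single item of that product has been sold. For each period (Period_ID), both products have an expected number of dozens sold.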

import pandas as pd

d = {'Period_ID':['A12']*10, 'Prod_A_Doz':[1.2]*10, 'Prod_B_Doz':[2.4]*10, 'A_Singles':[0,0,0,1,1,2,2,3,3,4], 'B_Singles':[0,0,1,1,2,2,3,3,4,4],
     'A_Dozens':[0,0,0,0,0,0,0,1,1,1], 'B_Dozens':[0,0,0,0,0,0,1,1,2,2]}
df = pd.DataFrame(data=d)

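Question

I want to build a descriptive analysis. One of my questions is: on average, how many single items of each product are sold before the 1st, 2nd, ..., 10th dozen of that product is sold?

Note that df.Period_ID.nunique() = 1568.

I changed the dataset to daily sales, as opposed to the cumulative sales above, and used Pankaj Joshi's solution with a small change: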

print(f'Average number of single items before {index + 1} dozen = {df1.A_Singles[:val+1].mean():0.2f}')


d = {'Period_ID':['A12']*10, 'Prod_A_Doz':[1.2]*10, 'Prod_B_Doz':[2.4]*10, 'A_Singles':[0,0,0,1,0,1,0,1,0,1], 'B_Singles':[0,0,1,0,1,0,1,0,1,0],
 'A_Dozens':[0,0,0,0,0,0,0,1,0,0], 'B_Dozens':[0,0,0,0,0,0,1,0,1,0]}
df1 = pd.DataFrame(data=d)

# For product A
Average number of single items before 1 dozen = 0.38

# For product B
6
Average number of single items before 1 dozen = 0.43
8
Average number of single items before 2 dozen = 0.44

However, I want the second value to be counted from the previous dozen sale onward, so it should be 0.5 rather than 0.44.

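The goal is that, once I have this information for each Period_ID, I will take the average over all df.Period_ID.nunique() (= 1568) periods and try to optimize the expected number of dozens sold for each product, given in the columns Prod_A_Doz and Prod_B_Doz.

Any help would be appreciated.

【Comments】:

    Tags: pandas dataframe statistics correlation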


    【Solution 1】:

    Here is how I would do it:

    import pandas as pd

    d = {'Period_ID':['A12']*10, 'Prod_A_Doz':[1.2]*10, 'Prod_B_Doz':[2.4]*10, 'A_Singles':[0,0,0,1,1,2,2,3,3,4], 'B_Singles':[0,0,1,1,2,2,3,3,4,4],
         'A_Dozens':[0,0,0,0,0,0,0,1,1,1], 'B_Dozens':[0,0,0,0,0,0,1,1,2,2]}
    df1 = pd.DataFrame(data=d)

    for per_id in df1.Period_ID.unique():
        print(per_id)
        # Reset the index so that positions start at 0 within each period
        df_temp = df1[df1.Period_ID == per_id].reset_index(drop=True)
        for prod in ('A', 'B'):
            singles = df_temp[f'{prod}_Singles']
            dozens = df_temp[f'{prod}_Dozens']
            # Visit each row where this product has dozens > 0
            for index, val in enumerate(df_temp.index[dozens > 0]):
                print(val)
                # val + 1 so that the row of the dozen sale itself is included
                print(f'Product {prod}: average number of single items before {index + 1} dozen = {singles.iloc[:val + 1].mean():0.2f}')
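The question also asks for the average to be counted from the previous dozen sale rather than from day 1. Here is a minimal sketch of one way to do that on the daily (non-cumulative) data. The helper name `singles_mean_per_dozen_sale` is made up for illustration, and the sketch assumes that any day with dozens > 0 counts as a single "dozen sale" day and that days after the last dozen sale are ignored.

```python
import pandas as pd

def singles_mean_per_dozen_sale(singles: pd.Series, dozens: pd.Series) -> pd.Series:
    """Mean daily singles in each stretch of days that ends with a dozen sale."""
    singles = singles.reset_index(drop=True)
    dozens = dozens.reset_index(drop=True)
    sale_day = dozens > 0
    if not sale_day.any():
        return pd.Series(dtype=float)
    # Segment id = number of dozen-sale days strictly before each day
    segment = sale_day.cumsum().shift(fill_value=0)
    # Days after the last dozen sale do not belong to a completed stretch
    last_sale = sale_day[sale_day].index.max()
    completed = singles.index <= last_sale
    means = singles[completed].groupby(segment[completed]).mean()
    means.index = means.index + 1  # label stretches 1, 2, ... instead of 0, 1, ...
    means.index.name = 'dozen_sale'
    return means

# Daily data for product B from the question
b_singles = pd.Series([0, 0, 1, 0, 1, 0, 1, 0, 1, 0])
b_dozens = pd.Series([0, 0, 0, 0, 0, 0, 1, 0, 1, 0])
b_means = singles_mean_per_dozen_sale(b_singles, b_dozens)
print(b_means.round(2))
```

For product B this gives about 0.43 for the first dozen sale (days 0 to 6) and 0.5 for the second (days 7 to 8), which matches the value the question asks for.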
    

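    【Discussion】:

    • (val+1) works for the data above. Now, how do I get the average across all 1568 df.Period_ID values?
    • Just put the code inside a for loop. I have modified the code above.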
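To get from per-period numbers to an average over all Period_ID values, one option is to collect the per-period results in a table and average its columns. This is only a sketch under assumptions: the second period 'B07' and its numbers are invented for illustration, the data is daily (not cumulative), only product A is shown, and the helper name `mean_singles_up_to_each_dozen` is hypothetical.

```python
import pandas as pd

# Hypothetical daily-sales data; 'A12' is from the question, 'B07' is invented
df = pd.DataFrame({
    'Period_ID': ['A12'] * 10 + ['B07'] * 10,
    'A_Singles': [0, 0, 0, 1, 0, 1, 0, 1, 0, 1] + [1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
    'A_Dozens':  [0, 0, 0, 0, 0, 0, 0, 1, 0, 0] + [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
})

def mean_singles_up_to_each_dozen(group: pd.DataFrame, prod: str) -> pd.Series:
    """Mean daily singles from the first day through the day of the n-th dozen sale."""
    singles = group[f'{prod}_Singles'].reset_index(drop=True)
    dozens = group[f'{prod}_Dozens'].reset_index(drop=True)
    sale_positions = dozens.index[dozens > 0]
    return pd.Series(
        [singles.iloc[:pos + 1].mean() for pos in sale_positions],
        index=pd.RangeIndex(1, len(sale_positions) + 1, name='dozen_sale'),
        dtype=float,
    )

# One Series per period, keyed by Period_ID
per_period = {
    per_id: mean_singles_up_to_each_dozen(group, 'A')
    for per_id, group in df.groupby('Period_ID')
}

# Rows: Period_ID; columns: n-th dozen sale; NaN where a period had fewer sales
table = pd.DataFrame(per_period).T
# Column means skip NaN, so each n is averaged only over periods that reached it
overall = table.mean()
print(table)
print(overall)
```

With 1568 periods the same pattern applies unchanged, since `groupby('Period_ID')` yields one group per period. Whether periods that never reach the n-th dozen sale should be skipped (as here) or counted in some other way is a modelling choice the thread does not settle.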