【Question title】: Getting correlation values from a certain column
【Posted】: 2017-10-28 04:46:43
【Question】:

Suppose Ratings.head() gives:

          critic               title  rating
0  Jack Matthews   Lady in the Water     3.0
1  Jack Matthews   Snakes on a Plane     4.0
2  Jack Matthews   You Me and Dupree     3.5
3  Jack Matthews    Superman Returns     5.0
4  Jack Matthews  The Night Listener     3.0

I want to get a correlation table like this:

title               Just My Luck  Lady in the Water  Snakes on a Plane  Superman Returns  The Night Listener  You Me and Dupree
Just My Luck            1.000000          -0.944911          -0.333333         -0.422890            0.555556          -0.485662
Lady in the Water      -0.944911           1.000000           0.577350          0.404226                 NaN           0.333333
Snakes on a Plane      -0.333333           0.577350           1.000000         -0.101929           -0.408248          -0.645497
Superman Returns       -0.422890           0.404226          -0.101929          1.000000           -0.062500           0.657952
The Night Listener      0.555556                NaN          -0.408248         -0.062500            1.000000          -0.250000
You Me and Dupree      -0.485662           0.333333          -0.645497          0.657952           -0.250000           1.000000

In Python, I tried using pivot, but it dropped the 0, 1, 2, 3, 4 index from the first table.

How can I get the correlation table above using pandas?

【Comments】:

  • Take the result of your pivot code, then run: df = df.reset_index()

Tags: python pandas numpy dataframe pivot


【Solution 1】:

Use pivot + corr:

df = df.pivot(index='critic', columns='title', values='rating').corr()

Alternatively, with unstack:

df = df.set_index(['critic','title'])['rating'].unstack().corr()
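To see how the two one-liners behave, here is a minimal sketch on made-up data. The frame name `ratings`, the critics "A" to "C", the titles "X" to "Z", and every rating value are hypothetical and not taken from the question. Both approaches first reshape the long table into one row per critic and one column per title, then correlate the title columns pairwise.

```python
import pandas as pd

# Hypothetical sample data in the same long format as the question
# (critic names, titles and ratings are invented for illustration).
ratings = pd.DataFrame({
    "critic": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "title":  ["X", "Y", "Z"] * 3,
    "rating": [3.0, 4.0, 3.5, 2.5, 3.5, 3.0, 4.0, 2.0, 5.0],
})

# Wide format: one row per critic, one column per title.
wide = ratings.pivot(index="critic", columns="title", values="rating")

# Pairwise Pearson correlation between the title columns.
corr_pivot = wide.corr()

# Same result via set_index + unstack.
corr_unstack = ratings.set_index(["critic", "title"])["rating"].unstack().corr()

print(corr_pivot)
```

The original 0, 1, 2, ... row index disappearing after the pivot is expected, since the critics become the index of the wide table. `corr` skips missing values pair by pair, which would explain the NaN entries in the desired output if two titles share too few critics. Note that `pivot` raises an error when a (critic, title) pair occurs more than once.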

【Comments】:
