【Posted】: 2022-01-10 13:01:53
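【Question】:
I am trying to get the mean of each feature for these four artists from a CSV, but I am getting NaN values.
My guess was that this happens because the CSV contains a lot of zeros, but when I plotted the 'acousticness' values for "Extremoduro" I did get a plot with actual values.
I tried storing the means in a column, but they are still NaN. I also tried changing the column dtype to int, but that changed nothing either.
Here is my code: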
import numpy as np
result=[]
audio_features=['danceability','energy','key','loudness','mode','speechiness','acousticness','instrumentalness',
                'liveness','valence','tempo','duration_ms']
artists=["Metallica", "Extremoduro", "AC/DC", "Hans Zimmer"]
for a in artists:
    for v in audio_features:
        result.append(np.nanmean(df[v].loc[df['name_artist'] == a]))
Output:
                  Metallica  Extremoduro       AC/DC    Hans Zimmer
danceability       0.349569     0.846328    5.425641      -7.707323
energy             0.581538     0.098277    0.082615       0.280533
key                0.413364     0.317938  122.677641  333602.843077
loudness           0.409805     0.794935    5.857143      -7.786104
mode               0.571429     0.084675    0.148247       0.169221
speechiness        0.277273     0.483052  139.082468  257855.370130
acousticness            NaN          NaN         NaN            NaN
instrumentalness        NaN          NaN         NaN            NaN
liveness                NaN          NaN         NaN            NaN
valence            0.282199     0.269810    4.561889     -18.791699
tempo              0.593920     0.046895    0.579343       0.742351
duration_ms        0.159604     0.139642  107.293903  245953.425081
Then, if I do this:
import numpy as np
result=[]
audio_features=['danceability','energy','key','loudness','mode','speechiness','acousticness','instrumentalness','liveness','valence','tempo','duration_ms']
artists=["Metallica", "Extremoduro", "AC/DC", "Hans Zimmer"]
for a in artists:
    for v in audio_features:
        result.append(np.nanmean(df[v].loc[df['name_artist'] == a]))
Output:
              Metallica  Extremoduro   AC/DC  Hans Zimmer
danceability       0.35         0.85    5.43        -7.71
energy             0.58         0.10    0.32       122.68
key           333602.84         0.41    0.79         5.86
loudness          -7.79         0.57    0.08         0.48
mode             139.08    257855.37     NaN          NaN
speechiness         NaN          NaN     NaN          NaN
valence             NaN          NaN     NaN         0.28
tempo              0.27         4.56  -18.79         0.59
duration_ms        0.05         0.14  107.29    245953.43
On the other hand, if I run the following in my code, it does return a float:
Input:
a=df['acousticness'].loc[df['name_artist'] == "Metallica"].mean()
Output:
0.08261538461538463
Here is my full code:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# Load the CSV into df, the name the loop below reads from
df=pd.read_csv('df.csv')
result=[]
audio_features=['danceability','energy','key','loudness','mode','speechiness','acousticness','instrumentalness',
                'liveness','valence','tempo','duration_ms']
artists=["Metallica", "Extremoduro", "AC/DC", "Hans Zimmer"]
for a in artists:
    for v in audio_features:
        result.append(df[v].loc[df['name_artist'] == a].mean())
result=np.reshape(result,(len(audio_features),len(artists)))
dataset=pd.DataFrame(result,audio_features,artists).round(2)
print(dataset)
【Discussion】:
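The two symptoms in the question look like separate issues. The loop appends values artist by artist (all twelve features for Metallica, then all twelve for Extremoduro, and so on), but np.reshape(result,(len(audio_features),len(artists))) reads that flat list as one row per feature, so the numbers land in the wrong cells. This matches the posted output: the Metallica acousticness mean of 0.0826 shows up in the energy row under AC/DC. The NaN cells cover exactly one artist's twelve values, which suggests (a guess, since the CSV is not shown) that the filter for that artist matches no rows, or only missing values. Below is a minimal sketch on a made-up DataFrame; the sample values and the missing "AC/DC" rows are assumptions for illustration, not data from df.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df.csv: two features, and no rows at all
# for "AC/DC" (an assumption used here to reproduce the NaN column).
df = pd.DataFrame({
    "name_artist": ["Metallica", "Metallica", "Extremoduro", "Hans Zimmer"],
    "danceability": [0.25, 0.5, 0.75, 0.25],
    "loudness": [-8.0, -6.0, -5.0, -20.0],
})
audio_features = ["danceability", "loudness"]
artists = ["Metallica", "Extremoduro", "AC/DC", "Hans Zimmer"]

# Same loop order as in the question: artists outside, features inside.
result = [df.loc[df["name_artist"] == a, v].mean()
          for a in artists for v in audio_features]

# The flat list is artist-major, so reshape to (artists, features) first,
# then transpose to get one row per feature and one column per artist.
table = pd.DataFrame(
    np.reshape(result, (len(artists), len(audio_features))).T,
    index=audio_features,
    columns=artists,
)

# The same table without a manual loop: group, average, reorder, transpose.
grouped = df.groupby("name_artist")[audio_features].mean().reindex(artists).T

# Row counts per artist reveal filters that match nothing.
counts = df["name_artist"].value_counts().reindex(artists, fill_value=0)

print(table)
print(counts)
```

If counts shows 0 for an artist, comparing the name against df['name_artist'].unique() should reveal how it is actually spelled in the CSV.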