【Question title】: User-by-item matrix in pandas
【Posted】: 2017-01-26 02:08:15
【Question】:

I am working on a recommender system. I followed this to build a user-by-item matrix, but I get the error IndexError: index 8928358160 is out of bounds for axis 0 with size 5

Below is a sample of the dataset.

import pandas as pd
import numpy as np

df = pd.read_csv('APRIL.csv')
df = df.drop(['BASKETID'], axis=1)
df = df.head(10)
df
Out[89]:
     MEMBERID       SKU  QTY
0  8928358161  37101163    2
1  8928358161  36618858    1
2  8928358161  40855129    1
3  8933444371  35010078    1
4  8932505053  36335949    1
5  8932505053  92100668    1
6  8932505053  36529730    2
7  8921161362  61814893    1
8  8915688100  34732853    1
9  8915688100  35122457    1


n_users = df.MEMBERID.unique().shape[0]
n_items = df.SKU.unique().shape[0]
print(str(n_users) + ' users')
print(str(n_items) + ' items')
5 users
10 items

ratings = np.zeros((n_users, n_items))
for row in df.itertuples():
    ratings[row[1]-1, row[2]-1] = row[3]
ratings
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-92-0a393963bf4c> in <module>()
      1 ratings = np.zeros((n_users, n_items))
      2 for row in df.itertuples():
----> 3     ratings[row[1]-1, row[2]-1] = row[3]
      4 ratings

IndexError: index 8928358160 is out of bounds for axis 0 with size 5

I still don't understand where index 8928358160 comes from.

【Comments】:

    Tags: pandas recommendation-engine


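    【Answer 1】:

    Why not convert the values to strings? Even though they are integers, they may be read in scientific notation and so end up as floating-point values.

    Try this:

    Convert cust_id and item_number from float to string: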

    mergedfinal['cust_id'] = mergedfinal['cust_id'].astype(str)
    mergedfinal['item_number'] = mergedfinal['item_number'].astype(str)
    mergedfinal['SKU'] = mergedfinal['SKU'].astype(str)
    

    mergedfinal is my DataFrame.
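
As for where the index in the error comes from: in the question's loop, row[1] is the MEMBERID of the row, so row[1]-1 for the first row is 8928358161 - 1 = 8928358160, which is then used as a row position in an array that has only 5 rows. Converting the IDs to strings does not by itself turn them into valid positions. One possible approach, shown here only as a minimal sketch, is to map each distinct MEMBERID and SKU to a contiguous 0-based position with pd.factorize before filling the matrix. The DataFrame below is rebuilt by hand from the sample rows in the question (an assumption standing in for APRIL.csv), and the names user_idx, item_idx, user_ids and item_ids are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (stand-in for APRIL.csv).
df = pd.DataFrame({
    "MEMBERID": [8928358161, 8928358161, 8928358161, 8933444371, 8932505053,
                 8932505053, 8932505053, 8921161362, 8915688100, 8915688100],
    "SKU": [37101163, 36618858, 40855129, 35010078, 36335949,
            92100668, 36529730, 61814893, 34732853, 35122457],
    "QTY": [2, 1, 1, 1, 1, 1, 2, 1, 1, 1],
})

# factorize gives each distinct ID a code 0..n-1 in order of first appearance,
# plus the list of distinct IDs so a code can be mapped back to its ID.
user_idx, user_ids = pd.factorize(df["MEMBERID"])
item_idx, item_ids = pd.factorize(df["SKU"])

ratings = np.zeros((len(user_ids), len(item_ids)))

# np.add.at accumulates QTY when the same (user, item) pair occurs more than once.
np.add.at(ratings, (user_idx, item_idx), df["QTY"].to_numpy())

print(ratings.shape)
```

Row i of ratings then corresponds to user_ids[i] and column j to item_ids[j], so the original IDs can still be recovered after the matrix is built.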

    【Comments】:
