【Question title】: Iterating Over CSV Deleting Analyzed Data
【Posted】: 2017-02-07 04:56:01
【Question】:

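Hello, I am trying to take a CSV file and iterate over each customer's data. To explain, each customer has 12 months of data. I want to analyze their yearly data, save the correlations of that data to a new list, and loop until all customers have been analyzed.

For example, here is what the customer data looks like (simplified case):

I have been able to get this to work to generate correlations from a CSV containing one customer's data. However, there are thousands of customers in my datasheet. I want to use a nested for loop to put all of the correlation values for each customer into a list/array. The list would hold one row of correlations for a specific customer, then the next row would be the next customer.

Here is my current code: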

import numpy
from numpy import genfromtxt
overalldata = genfromtxt('C:\Users\User V\Desktop\CUSTDATA.csv', delimiter=',')
emptylist = []
overalldatasubtract = overalldata[13::]
#This is where I try to use the for loop to go through all the customers. I don't know if len will give me all the rows or the number of columns.
for x in range(0,len(overalldata),11):
    for x in range(0,13,1):
            cust_months = overalldata[0:x,1]
            cust_balancenormal = overalldata[0:x,16]
            cust_demo_one = overalldata[0:x,2]
            cust_demo_two = overalldata[0:x,3]
            num_acct_A = overalldata[0:x,4]
            num_acct_B = overalldata[0:x,5]
    #Correlation Calculations
            demo_one_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_one)[1,0]
            demo_two_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_two)[1,0]
            demo_one_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_one)[1,0]
            demo_one_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_one)[1,0]
            demo_two_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_two)[1,0]
            demo_two_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_two)[1,0]

            result_correlation = [demo_one_corr_balance, demo_two_corr_balance, demo_one_corr_acct_a, demo_one_corr_acct_b, demo_two_corr_acct_a, demo_two_corr_acct_b]

result_correlation_combined = emptylist.append(result_correlation)
#This is where I try to delete the rows I have already analyzed.
overalldata = overalldata[11**x::]

print result_correlation_combined
print overalldatasubtract

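It looked like my subtraction method was working, but when I tried it with a larger data set, I realized my approach is completely wrong.

Would you do this a different way? I think it can work, but I cannot find my mistake.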

【Question comments】:

    Tags: python list csv append


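    【Solution 1】:

    You use the same variable x for both loops. In the second loop, x goes from 0 to 12 regardless of the customer, and since x is the only thing you use to set the row number, you are stuck on the first customer.

    Your double loop should look like this instead: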

    # loop over the customers
    for x_customer in range(0,len(overalldata),12):
        # loop over the months
        for x_month in range(0,12,1):
            # line number: x
            # x_customer already advances in steps of 12, so only the month offset is added
            x = x_customer + x_month
            ...
    

    I changed the bounds and steps of the loops because:

    • Loop 1: there are 12 months, so 12 rows per customer -> step = 12
    • Loop 2: there are 12 months, so the month index runs from 0 to 11 -> range(0,12,1)
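
As a rough sketch of the index arithmetic behind these two loops, the snippet below builds a small made-up array (a hypothetical stand-in for the CSV, not the asker's data) and records which row each iteration visits. It also shows that `len()` on a 2-D NumPy array returns the number of rows, which the question was unsure about. When the outer loop steps by 12, the row number is simply the start plus the month offset; multiplying by 12 again would only be right if the outer loop stepped by 1.

```python
import numpy as np

MONTHS_PER_CUSTOMER = 12  # from the question: 12 monthly rows per customer

# Hypothetical stand-in for the CSV: 3 customers x 12 months, 2 columns.
data = np.arange(3 * MONTHS_PER_CUSTOMER * 2).reshape(3 * MONTHS_PER_CUSTOMER, 2)

# len() of a 2-D array is its number of rows (same as data.shape[0]).
n_rows = len(data)

row_numbers = []
for start in range(0, n_rows, MONTHS_PER_CUSTOMER):  # 0, 12, 24
    for month in range(MONTHS_PER_CUSTOMER):         # 0 .. 11
        row_numbers.append(start + month)            # absolute row index
```

With this arrangement every row is visited exactly once, in order.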

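    【Comments】:

    • Thanks, this seems to be what I want to do, but I am still not getting any output. I want to save the resulting correlations with: result_correlation_combined = emptylist.append(result_correlation). However, this does not seem to save anything, because I keep getting an empty list.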
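
One likely contributor to the missing output described in this comment: `list.append` modifies the list in place and returns `None`, so a name assigned from the call never holds the data. A minimal illustration, with made-up variable names and values:

```python
results = []

# append mutates `results` in place; its return value is always None.
returned = results.append([0.1, 0.2])

# The data lives in `results`, not in `returned`.
first_row = results[0]
```

Printing `results` (here, `emptylist` in the question) rather than the assigned name shows what was actually collected.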
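    【Solution 2】:

    This is how I solved the problem: it was an issue with where my for loop was placed. A simple indentation problem. Thanks to the poster above for the help.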

    for x_customer in range(0,len(overalldata),12):

        for x in range(0,13,1):
                cust_months = overalldata[0:x,1]
                cust_balancenormal = overalldata[0:x,16]
                cust_demo_one = overalldata[0:x,2]
                cust_demo_two = overalldata[0:x,3]
                num_acct_A = overalldata[0:x,4]
                num_acct_B = overalldata[0:x,5]
    #Correlation Calculations
                demo_one_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_one)[1,0]
                demo_two_corr_balance = numpy.corrcoef(cust_balancenormal, cust_demo_two)[1,0]
                demo_one_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_one)[1,0]
                demo_one_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_one)[1,0]
                demo_two_corr_acct_a = numpy.corrcoef(num_acct_A, cust_demo_two)[1,0]
                demo_two_corr_acct_b = numpy.corrcoef(num_acct_B, cust_demo_two)[1,0]
    
                result_correlation = [(demo_one_corr_balance),(demo_two_corr_balance),(demo_one_corr_acct_a),(demo_one_corr_acct_b),(demo_two_corr_acct_a),(demo_two_corr_acct_b)]
                numpy.savetxt('correlationoutput.csv', (result_correlation))
        result_correlation_combined = emptylist.append([result_correlation])
        cust_delete_list = [0,(x_customer),1]
        overalldata = numpy.delete(overalldata, (cust_delete_list), axis=0)
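
For comparison, here is a hedged sketch of one way to get one row of correlations per customer without deleting rows as you go. It is not the poster's exact method: it slices each customer's 12-row block directly instead of using `overalldata[0:x, ...]`. It assumes each customer occupies 12 consecutive rows, that the array has no header row, and that the column positions (16, 2, 3, 4, 5) match those in the question. The function name `customer_correlations` and the random test data are made up for illustration.

```python
import numpy as np

MONTHS = 12  # rows per customer, as described in the question


def customer_correlations(data, months=MONTHS):
    """Return an array with one row of six correlations per customer."""
    rows = []
    # Stop early enough that a trailing partial block is skipped.
    for start in range(0, len(data) - months + 1, months):
        block = data[start:start + months]  # this customer's 12 rows
        balance = block[:, 16]
        demo_one, demo_two = block[:, 2], block[:, 3]
        acct_a, acct_b = block[:, 4], block[:, 5]
        rows.append([
            np.corrcoef(balance, demo_one)[1, 0],
            np.corrcoef(balance, demo_two)[1, 0],
            np.corrcoef(acct_a, demo_one)[1, 0],
            np.corrcoef(acct_b, demo_one)[1, 0],
            np.corrcoef(acct_a, demo_two)[1, 0],
            np.corrcoef(acct_b, demo_two)[1, 0],
        ])
    return np.array(rows)


# Synthetic stand-in for the CSV: 3 customers, 17 columns of random numbers.
rng = np.random.default_rng(0)
fake_data = rng.normal(size=(3 * MONTHS, 17))
correlations = customer_correlations(fake_data)
```

Because all rows are collected first, the result could be written once after the loop, for example with `np.savetxt('correlationoutput.csv', correlations, delimiter=',')`, rather than overwriting the file on every iteration.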
    

    【Comments】:
