Extract columns from a CSV file in Python: ValueError: too many values to unpack (expected 4)

【Question title】: Extract columns from CSV file Python: ValueError: too many values to unpack (expected 4)
【Posted】: 2019-09-25 17:04:53
【Question description】:

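I am trying to extract a few columns from a CSV file. It is a simplified version of the large panel dataset I am working with. Opened in Excel, it looks roughly like the table below. However, when I run my code I get the error message "ValueError: too many values to unpack (expected 4)". I have only included my file as an image to make it easier to view.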

CompanyID  year  company_age  Debt_TA  gcp
654001     2000  49           0.14     0
654001     2001  50           0.17     0
654001     2002  51           0.23     1
112089     2013  38           0.11     0
112089     2014  39           0.13     0
342980     2007  54           0.15     0
342980     2008  55           0.22     1

I have searched for and tried several answers about this kind of error, but none has worked for me so far. My code is shown below.

import csv
import numpy as np
from sklearn import feature_extraction

def parseFile (filename):
    companies = list ()
    with open (filename) as csvfile:
        reader = csv.reader (csvfile, delimiter = ',', quotechar = '"')       
        for index, line in enumerate (reader):
            #print index, line
            if (index > 0 and index < 150):
                CompanyID, year, company_age, gcp = line
                #print company_name
                company = {\
                    'CompanyID' : CompanyID,\
                    'year' : year,\
                    'company_age' : company_age,\
                    'gcp': int (gcp),\
                }
                companies.append (company)
    return companies

def extract_year_features (companies):
    year_list = list ()
    for company in companies:
        year_list.append (company['year'] * 10)
    tweet_vectorizer = feature_extraction.text.CountVectorizer ()
    X = tweet_vectorizer.fit_transform (year_list).toarray ()
    return X


def extract_company_age_features (companies):
    company_age_list = list ()
    for company in companies:
        company_age_list.append (company['company_age'] * 10)
    tweet_vectorizer = feature_extraction.text.CountVectorizer ()
    X = tweet_vectorizer.fit_transform (company_age_list).toarray ()
    return X

def extract_all_features (companies):
    return np.concatenate ( (extract_year_features (companies), \
                          extract_company_age_features (companies)), \
                          axis=1)


def generate_target (companies):
    y = [company['gcp'] for company in companies]
    return np.array (y)

companies = parseFile ("sample.csv")
X = extract_all_features (companies)
y = generate_target (companies)   
#credit to G.Li

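Can anyone point out what I am doing wrong? I am a Python beginner and have tried the answers to several similar questions, but none of them worked for me. Thanks in advance.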

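【Comments】:

Possible duplicate of Too many values to unpack (expected 4)
Suggestion: use pandas to import and manipulate CSV files.
Thank you both, Andrejs Cainikovs and user9940344. I will look into your suggestions and see how they work out.

Tags: python python-3.x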

【Solution 1】:

On this line

CompanyID, year, company_age, gcp = line

4 variables are expected on the left-hand side, but each row of your CSV has 5 fields. You need an extra variable for Debt_TA.
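A minimal sketch of that fix, assuming the file is comma-separated and has the five columns shown in the question. The fifth name, `_debt_ta`, is a placeholder introduced here for illustration: it receives the Debt_TA field so the unpacking matches, and is then ignored. `io.StringIO` stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Three rows copied from the question; the comma delimiter is an assumption.
SAMPLE = """CompanyID,year,company_age,Debt_TA,gcp
654001,2000,49,0.14,0
654001,2001,50,0.17,0
654001,2002,51,0.23,1
"""


def parse_rows(csvfile):
    """Read every data row, keeping four of the five fields."""
    companies = []
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    next(reader)  # skip the header row
    for line in reader:
        # Five fields per row, so five names on the left-hand side.
        # _debt_ta is unpacked only to make the counts match.
        company_id, year, company_age, _debt_ta, gcp = line
        companies.append({
            'CompanyID': company_id,
            'year': year,
            'company_age': company_age,
            'gcp': int(gcp),
        })
    return companies


companies = parse_rows(io.StringIO(SAMPLE))
print(companies[0])
```

The unused column is still read from each row; it is simply never stored, which keeps memory use down when only some columns are needed.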

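【Discussion】:

Thanks for the quick reply! Rather than selecting all of the variables, I want to select only some of them. The dataset I am working with is very large, and I do not need every variable.

【Solution 2】:

The problem is in the csv reader: your CSV has no ',' delimiter, so the line CompanyID, year, company_age, gcp = line fails because all of the columns end up in a single string. The CSV also has 5 columns.

Also take a look at the PEP8 style guide; you have some indentation issues.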
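One way to check which situation applies is to count the fields that `csv.reader` actually returns for each row. The sketch below uses a hypothetical space-separated sample (an assumption, since the real file is not shown as text) and a helper named `field_counts` that is introduced here for illustration. Note that a row that is not split at all would raise "not enough values to unpack" when assigned to four names, whereas "too many values to unpack" points to more than four fields per row.

```python
import csv
import io

# Hypothetical space-separated version of the data in the question.
SPACE_SEPARATED = (
    "CompanyID year company_age Debt_TA gcp\n"
    "654001 2000 49 0.14 0\n"
)


def field_counts(text, delimiter):
    """Return the number of fields csv.reader finds in each row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    return [len(row) for row in reader]


# With the wrong delimiter every row comes back as one long string.
print(field_counts(SPACE_SEPARATED, ','))  # prints [1, 1]
# With the matching delimiter the five columns are separated.
print(field_counts(SPACE_SEPARATED, ' '))  # prints [5, 5]
```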

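【Discussion】:

Thanks for your reply. When you say there is no ',', do you mean this line of code: reader = csv.reader (csvfile, delimiter = ',', quotechar = '"')? I see a ',' there. Could you explain further? Thanks.
Yes. In your CSV the delimiter is a space, not a comma. Maybe this example can help you: stackabuse.com/reading-and-writing-csv-files-in-python
Thanks for the information. I just checked my code and my CSV file and made sure the comma is present in both. After running the code I get the same error message. I have been working on this for a long time but cannot figure it out.
If your CSV now has comma delimiters, maybe @ashish14's answer will help. If you debug your program, you will see more clearly what is going on.
Thanks, I will try it now.

【Solution 3】:

Try assigning it this way, since there are 5 fields in the csv file: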

CompanyID, year, company_age, gcp = line[0], line[1], line[2], line[3]
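With the column order shown in the question, index 3 is Debt_TA and index 4 is gcp, so positional indexes are easy to get wrong. As an alternative to positional indexing (a swapped-in technique, not what this answer proposes), the sketch below selects columns by header name with `csv.DictReader`. It assumes a comma-separated file whose first row holds the header names from the question; `WANTED` and `parse_selected` are names introduced here for illustration.

```python
import csv
import io

# Rows copied from the question; the comma delimiter is an assumption.
SAMPLE = """CompanyID,year,company_age,Debt_TA,gcp
654001,2000,49,0.14,0
654001,2001,50,0.17,0
654001,2002,51,0.23,1
"""

# The columns to keep; Debt_TA is deliberately left out.
WANTED = ('CompanyID', 'year', 'company_age', 'gcp')


def parse_selected(csvfile, wanted=WANTED):
    """Keep only the named columns from each row."""
    reader = csv.DictReader(csvfile)  # first row supplies the field names
    companies = []
    for row in reader:
        company = {name: row[name] for name in wanted}
        company['gcp'] = int(company['gcp'])  # gcp is a 0/1 flag
        companies.append(company)
    return companies


companies = parse_selected(io.StringIO(SAMPLE))
print(companies[0])
```

Selecting by name keeps working if columns are added or reordered in the larger file, as long as the header names stay the same.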

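【Discussion】:

Thanks for your reply. I just tried your suggestion but got a different error message: "ValueError: invalid literal for int() with base 10: '0.14'". I think this may just be a type error, though. There are 5 variables in the csv file, but I only need four of them, all except the one named "Debt_TA".
I just tried your suggestion again, changing your "line[3]" to "line[4]", because I do not need the column "Debt_TA". It finished with no error message. That's great! However, when I added the code "print(X)", it printed the following: print(X) [[1 0 0 0 0 0 0 0 0 1 0 0 0 0] [0 1 0 0 0 0 0 0 0 0 1 0 0 0] [0 0 1 0 0 0 0 0 0 0 0 1 0 0] [0 0 0 0 0 1 0 1 0 0 0 0 0 0] [0 0 0 0 0 0 1 0 1 0 0 0 0 0] [0 0 0 1 0 0 0 0 0 0 0 0 1 0] [0 0 0 0 1 0 0 0 0 0 0 0 0 1]]. I don't think this is right, because I need to extract the column values. Do you know why? Thanks for your help.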
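The printed matrix is consistent with what `CountVectorizer` does in the question's code: each year and each company_age string (repeated ten times by `* 10`) is treated as one text token, and every distinct token gets its own 0/1 column. Seven distinct years plus seven distinct ages give the 14 columns shown. If the goal is the numeric column values themselves, a minimal sketch without the vectorizer could look like the following. It assumes a comma-separated file with the header names from the question; `load_features` is a name introduced here for illustration.

```python
import csv
import io

# All seven rows from the question; the comma delimiter is an assumption.
SAMPLE = """CompanyID,year,company_age,Debt_TA,gcp
654001,2000,49,0.14,0
654001,2001,50,0.17,0
654001,2002,51,0.23,1
112089,2013,38,0.11,0
112089,2014,39,0.13,0
342980,2007,54,0.15,0
342980,2008,55,0.22,1
"""


def load_features(csvfile):
    """Return (X, y): numeric feature rows and the gcp target."""
    reader = csv.DictReader(csvfile)
    X, y = [], []
    for row in reader:
        # Convert the text fields to numbers instead of vectorizing them.
        X.append([int(row['year']), int(row['company_age'])])
        y.append(int(row['gcp']))
    return X, y


X, y = load_features(io.StringIO(SAMPLE))
print(X[:3])  # prints [[2000, 49], [2001, 50], [2002, 51]]
print(y)      # prints [0, 0, 1, 0, 0, 0, 1]
```

Passing `X` and `y` to `np.array` would give the same array types that the question's `extract_all_features` and `generate_target` return.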