Read rows and columns from Excel in Python and put them in an array

【Question title】: Read rows and columns from Excel in Python and put them in an array
【Posted】: 2020-12-28 22:09:34
【Question description】:

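My goal is to read the Excel data and then sort each value into its own variable: every first name into a first-name variable, every second name into a second-name variable, and every domain into a domain variable.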

【Question comments】:

Tags: python arrays pandas export-to-csv xlrd

【Solution 1】:

You can use pandas to iterate over the rows, build the new values, and then use pandas again to save the result to Excel:

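import pandas as pd

# Read the sheet; the loop below expects 'First', 'Second' and 'Domain' columns.
df = pd.read_excel('input.xlsx', index_col=None)

# One list per address pattern, keyed by the output column name.
output = {'0': [], '1': [], '2': [], '3': [], '4': []}
for index, row in df.iterrows():
    output['0'].append(f"{row['First']}@{row['Domain']}")                  # first@domain
    output['1'].append(f"{row['Second']}@{row['Domain']}")                 # second@domain
    output['2'].append(f"{row['First']}{row['Second']}@{row['Domain']}")   # firstsecond@domain
    output['3'].append(f"{row['First']}.{row['Second']}@{row['Domain']}")  # first.second@domain
    output['4'].append(f"{row['First'][0]}{row['Second']}@{row['Domain']}")  # first initial + second@domain

# Build a new DataFrame from the collected lists and write it to Excel.
df = pd.DataFrame(output, columns=list(output.keys()))
df.to_excel('output.xlsx')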

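【Comments】:

Thanks! But wouldn't this be very inefficient with 10,000+ rows? I'd rather not have to initialize 10k arrays. Is there a faster way?
Sorry, I forgot to tag you.
Sorry, I don't know of a faster way. Possibly using C++.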
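
On the speed question, one option that stays within pandas is to build each output column with whole-column string operations instead of looping over rows. The following is only a minimal sketch, assuming the sheet has string columns named 'First', 'Second' and 'Domain' as in the solution above; the function name build_addresses and the sample rows are hypothetical, and no timing has been measured here.

```python
import pandas as pd

def build_addresses(df):
    """Return one column per address pattern, computed column-wise."""
    first, second = df["First"], df["Second"]
    suffix = "@" + df["Domain"]  # "@domain" for every row at once
    return pd.DataFrame({
        "0": first + suffix,                      # first@domain
        "1": second + suffix,                     # second@domain
        "2": first + second + suffix,             # firstsecond@domain
        "3": first + "." + second + suffix,       # first.second@domain
        "4": first.str[0] + second + suffix,      # first initial + second@domain
    })

# Hypothetical sample data standing in for pd.read_excel('input.xlsx')
sample = pd.DataFrame({
    "First": ["john", "ann"],
    "Second": ["smith", "lee"],
    "Domain": ["example.com", "example.org"],
})
addresses = build_addresses(sample)
```

With a real file, the sample frame would be replaced by the result of pd.read_excel, and addresses.to_excel('output.xlsx') would write the result as before.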

【Solution 2】:

As I understand it, you want something like this:

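import pandas

df = pandas.read_excel("input.xlsx")

def generate(data):
    # data is one row; it must hold exactly first name, last name and domain, in that order.
    first, last, domain = data
    local_parts = [first, last, first + last, first + '.' + last, first[0] + last]
    return [local + '@' + domain for local in local_parts]

# axis='columns' passes each row to generate; result_type='expand' spreads the returned list over columns.
df.apply(generate, axis='columns', result_type='expand').to_excel("output.xlsx")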

A good function for this is DataFrame.apply. The argument passed to generate must be a sequence corresponding to a single row.
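
To make the row-wise behaviour of DataFrame.apply concrete, here is a small sketch on an in-memory frame instead of an Excel file. The sample rows are hypothetical, and this variant of generate reads the values by column label ('First', 'Second', 'Domain', names assumed from Solution 1) so that it does not depend on the column order.

```python
import pandas as pd

# Hypothetical sample rows standing in for the contents of input.xlsx
df = pd.DataFrame({
    "First": ["john", "ann"],
    "Second": ["smith", "lee"],
    "Domain": ["example.com", "example.org"],
})

def generate(row):
    # With axis="columns", row is a Series holding one row, indexed by column name.
    first, last, domain = row["First"], row["Second"], row["Domain"]
    local_parts = [first, last, first + last, first + "." + last, first[0] + last]
    return [local + "@" + domain for local in local_parts]

# result_type="expand" turns each returned list into one row of a new DataFrame,
# so the result has one row per input row and one column per address pattern.
result = df.apply(generate, axis="columns", result_type="expand")
print(result)
```

The result can then be written out with result.to_excel("output.xlsx"), as in the solution above.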

【Comments】: