【Question title】: Pandas - read CSV stored as a string in memory into a data frame
【Posted】: 2019-10-08 22:56:10
【Question】:

I have comma-separated text stored in a variable, as follows:

data = """
Class,Name,Long,Lat
A,ABC11,139.6295542,35.61144069
A,ABC20,139.630596,35.61045559
A,ABC03,139.6300307,35.61327781
B,ABC54,139.7787818,35.68847945
B,ABC05,139.7814447,35.6816882
B,ABC06,139.7788191,35.681865
B,ABC24,139.7790396,35.67781697
"""

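Is there a quick way to read this into a pandas DataFrame without first writing it to a file and then calling pd.read_csv() on that file? I come from R, which offers a nice way to do this, as shown below.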

text <- "
State,District,County,Num Voters,Total Votes in State,Votes for None,Candidate Name,Party,Votes Scored
CA,San Diego,Delmar,190962,48026634,2511,A1,IND,949
CA,San Diego,Delmar,190962,48026634,2511,A2,RP(K),44815
"
df <- read.table(textConnection(text), sep = ",", header = TRUE)

【Comments】:

Tags: python pandas


【Solution 1】:

Use an io.StringIO object (an in-memory stream for text I/O) and pass it to pd.read_csv() in place of a file path:

import pandas as pd
from io import StringIO

data = """
Class,Name,Long,Lat
A,ABC11,139.6295542,35.61144069
A,ABC20,139.630596,35.61045559
A,ABC03,139.6300307,35.61327781
B,ABC54,139.7787818,35.68847945
B,ABC05,139.7814447,35.6816882
B,ABC06,139.7788191,35.681865
B,ABC24,139.7790396,35.67781697
"""

df = pd.read_csv(StringIO(data))
print(df)

Output:

  Class   Name        Long        Lat
0     A  ABC11  139.629554  35.611441
1     A  ABC20  139.630596  35.610456
2     A  ABC03  139.630031  35.613278
3     B  ABC54  139.778782  35.688479
4     B  ABC05  139.781445  35.681688
5     B  ABC06  139.778819  35.681865
6     B  ABC24  139.779040  35.677817
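
As a rough sketch of why this works: pd.read_csv() accepts any file-like object with a read() method, and with its default skip_blank_lines=True it ignores the empty line that the opening triple quote leaves at the start of the string. The shortened data and the checks below are illustrative assumptions, not part of the original answer.

```python
from io import StringIO

import pandas as pd

# Shortened copy of the sample data. The newline right after the opening
# triple quote is kept on purpose to show that it does no harm.
data = """
Class,Name,Long,Lat
A,ABC11,139.6295542,35.61144069
B,ABC54,139.7787818,35.68847945
"""

# StringIO wraps the string in an in-memory text stream, so read_csv can
# treat it like an open file. Blank lines are skipped by default.
df = pd.read_csv(StringIO(data))

print(df.shape)          # (2, 4)
print(list(df.columns))  # ['Class', 'Name', 'Long', 'Lat']
print(df["Long"].dtype)  # float64
```

Because the stream goes through the normal read_csv() parser, the usual keyword arguments (sep, header, dtype and so on) should apply here just as they do for a file on disk.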

【Comments】:

    【Solution 2】:
    from prettytable import PrettyTable
    
    def create_table(data):
        data = data.strip().split('\n')
        pt = PrettyTable()
        pt.field_names = data[0].split(',')
    
        for row in data[1:]:
            pt.add_row(row.split(','))
    
        return pt
    
    data = """
    Class,Name,Long,Lat
    A,ABC11,139.6295542,35.61144069
    A,ABC20,139.630596,35.61045559
    A,ABC03,139.6300307,35.61327781
    B,ABC54,139.7787818,35.68847945
    B,ABC05,139.7814447,35.6816882
    B,ABC06,139.7788191,35.681865
    B,ABC24,139.7790396,35.67781697
    """
    
    table = create_table(data)
    
    print(table)
    
    Output:
    
    +-------+-------+-------------+-------------+
    | Class |  Name |     Long    |     Lat     |
    +-------+-------+-------------+-------------+
    |   A   | ABC11 | 139.6295542 | 35.61144069 |
    |   A   | ABC20 |  139.630596 | 35.61045559 |
    |   A   | ABC03 | 139.6300307 | 35.61327781 |
    |   B   | ABC54 | 139.7787818 | 35.68847945 |
    |   B   | ABC05 | 139.7814447 |  35.6816882 |
    |   B   | ABC06 | 139.7788191 |  35.681865  |
    |   B   | ABC24 | 139.7790396 | 35.67781697 |
    +-------+-------+-------------+-------------+
    

    The same table can also be built with prettytable's from_csv:
    
    from prettytable import from_csv
    from io import StringIO
    
    table = from_csv(StringIO(data))
    
    print(table)
    

    【Comments】:

    • This does not return an instance of pandas.DataFrame, as the question asks; it returns an instance of prettytable.PrettyTable.
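
To illustrate the point in the comment above, the same manual row-splitting can produce a pandas.DataFrame instead of a PrettyTable. This is only a minimal sketch: the helper name csv_text_to_frame and its numeric_cols parameter are hypothetical, and the naive split on commas assumes no quoted fields that contain commas.

```python
import pandas as pd


def csv_text_to_frame(text, numeric_cols=()):
    """Hypothetical helper: split CSV text by hand and build a DataFrame."""
    lines = text.strip().split("\n")
    header = lines[0].split(",")
    rows = [line.split(",") for line in lines[1:]]
    frame = pd.DataFrame(rows, columns=header)
    # After a manual split every value is still a string, so the numeric
    # columns have to be converted explicitly.
    for col in numeric_cols:
        frame[col] = pd.to_numeric(frame[col])
    return frame


data = """
Class,Name,Long,Lat
A,ABC11,139.6295542,35.61144069
B,ABC54,139.7787818,35.68847945
"""

df = csv_text_to_frame(data, numeric_cols=("Long", "Lat"))
print(type(df).__name__)  # DataFrame
print(df["Lat"].dtype)    # float64
```

In practice pd.read_csv(StringIO(data)) from Solution 1 is likely the better choice, since it handles quoting and type inference without extra code.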