【Question title】: Reading a text file without any delimiter in Python
【Posted】: 2020-08-01 20:43:19
【Question】:

I have a text file with the following contents:

H26      1         2         3         4         5         6         7         8
H26 5678901234567890123456789012345678901234567890123456789012345678901234567890
H26                                                                        

A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 67 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 66 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 87 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.5
A           416.0  2008.51117 80   1  -1112380 86 58 96 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 57 99 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112180 85 58102 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 58103 264340.6 4122580.8 370.2
A           416.0  2008.51120 80   1  -2122179 82 51 67 264331.3 4122588.1 370.0
A           416.0  2008.51120 80   1  -2122279 82 51 69 264331.3 4122588.1 370.0

I want to extract the following columns from it:

 2008.5 264363.6 4122568.8
 2008.5 264363.6 4122568.8
 2008.5 264363.6 4122568.8

I tried pandas as shown below, but the result has only one column:

import pandas
df = pandas.read_csv("data.txt", header=4)

Any help would be appreciated. Thanks in advance.
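
A note on why the attempt above yields a single column: `read_csv` splits on commas by default, and these lines contain none. Splitting on whitespace is not a safe fallback either, because some fields appear to touch each other (for example `57101`). The sketch below only illustrates this on two data lines copied from the file; the variable names are made up for the example.

```python
import io
import pandas as pd

# Two data lines copied verbatim from the file shown in the question.
sample = (
    "A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6\n"
    "A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3\n"
)

# Default separator is ","; with no commas present, each line is one field.
one_col = pd.read_csv(io.StringIO(sample), header=None)
print(one_col.shape)  # (2, 1)

# Whitespace splitting gives a different token count per line, because
# "57101" looks like two adjacent fixed-width fields with no space between.
token_counts = [len(line.split()) for line in sample.splitlines()]
print(token_counts)  # [12, 11]
```

Since the fields are defined by character position rather than by a separator, a fixed-width reader is the better fit.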

【Question comments】:

Tags: python pandas


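【Solution 1】:

Use pandas.read_fwf, which reads fixed-width files by character position instead of by delimiter. Its colspecs argument takes one (start, end) pair per field, 0-based and half-open: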

import io
import pandas as pd

s = """
H26      1         2         3         4         5         6         7         8
H26 5678901234567890123456789012345678901234567890123456789012345678901234567890
H26                                                                        

A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 67 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 66 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 87 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.5
A           416.0  2008.51117 80   1  -1112380 86 58 96 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 57 99 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112180 85 58102 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 58103 264340.6 4122580.8 370.2
A           416.0  2008.51120 80   1  -2122179 82 51 67 264331.3 4122588.1 370.0
A           416.0  2008.51120 80   1  -2122279 82 51 69 264331.3 4122588.1 370.0"""

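# StringIO stands in for the file here; pass "data.txt" to read from disk
f = io.StringIO(s)
# 0-based, half-open (start, end) character offsets of the three wanted fields
cols = [(19, 30), (56, 65), (65, 75)]
# Skip the first four lines (in s: the leading empty line and the three H26 lines)
df = pd.read_fwf(f, colspecs=cols, skiprows=[0, 1, 2, 3], header=None)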

df.loc[8:]
             0         1          2
8   2008.51117  264340.6  4122580.8
9   2008.51117  264340.6  4122580.8
10  2008.51117  264340.6  4122580.8
11  2008.51117  264340.6  4122580.8
12  2008.51120  264331.3  4122588.1
13  2008.51120  264331.3  4122588.1
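
The output above keeps the whole `2008.51114` field, whereas the question's desired output shows `2008.5`. A minimal sketch of one way to match it, assuming the wanted value occupies only ruler columns 20-25 (this is a guess from the desired output, not something the question states). The column names `c1`, `c2`, `c3` are hypothetical labels added for illustration.

```python
import io
import pandas as pd

# Two data lines copied verbatim from the question; the real input
# would be the file itself (e.g. "data.txt") plus the skiprows argument.
sample = (
    "A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6\n"
    "A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3\n"
)

# colspecs are 0-based and half-open, so ruler column n starts at offset n - 1.
# Assumption: the first field is cut at ruler column 25 to keep only "2008.5".
colspecs = [(19, 25), (56, 64), (65, 74)]
names = ["c1", "c2", "c3"]  # hypothetical column labels

df = pd.read_fwf(io.StringIO(sample), colspecs=colspecs, names=names, header=None)
print(df)
```

If the full field is actually needed and only the display should be shorter, keeping `(19, 30)` and rounding afterwards would be the alternative.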

【Comments】:
