Reading a text file without any delimiter in Python

【Question title】: Reading text file without any delimiter in Python
【Posted】: 2020-08-01 20:43:19
【Question】:

I have a text file with the following contents:

H26      1         2         3         4         5         6         7         8
H26 5678901234567890123456789012345678901234567890123456789012345678901234567890
H26                                                                        

A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 67 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 66 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 87 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.5
A           416.0  2008.51117 80   1  -1112380 86 58 96 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 57 99 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112180 85 58102 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 58103 264340.6 4122580.8 370.2
A           416.0  2008.51120 80   1  -2122179 82 51 67 264331.3 4122588.1 370.0
A           416.0  2008.51120 80   1  -2122279 82 51 69 264331.3 4122588.1 370.0

I want to extract columns from it like this:

 2008.5 264363.6 4122568.8
 2008.5 264363.6 4122568.8
 2008.5 264363.6 4122568.8

I tried pandas as shown below, but it produces only one column:

import pandas
df = pandas.read_csv("data.txt", header=4)

Any help is appreciated. Thanks in advance.

【Question comments】:

Try reading the documentation for the read_csv method: pandas.pydata.org/pandas-docs/stable/reference/api/…
Does this answer your question? How to read file with space separated values in pandas
No, it does not. df = pandas.read_csv("data.txt", header=4, delim_whitespace=True) raises this error: pandas.errors.ParserError: Error tokenizing data. C error: Expected 12 fields in line 27, saw 13
Try: pd.read_csv('data.txt', skiprows=3, delim_whitespace=True, header = None)
Is your data different from the data.txt file in this implementation, which runs fine?
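
The two failures described above (one column with the default settings, inconsistent field counts with whitespace splitting) can be reproduced without pandas. The sketch below is only an illustration: it uses the standard-library csv module as a stand-in for the default comma splitting of read_csv, and str.split() as a stand-in for delim_whitespace=True. The two records are copied from the sample in the question.

```python
import csv

# Two records copied from the sample data in the question.
row_a = "A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6"
row_b = "A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3"

# Comma splitting (the read_csv default): the record contains no commas,
# so the whole line comes back as a single field.
comma_fields = next(csv.reader([row_a]))
print(len(comma_fields))   # 1

# Whitespace splitting: "57" and "101" touch in row_b, so the two
# records produce different numbers of tokens.
token_counts = [len(row.split()) for row in (row_a, row_b)]
print(token_counts)        # [12, 11]
```

Because adjacent fields can run together, no delimiter-based split recovers the columns reliably; the fields have to be cut by character position.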

Tags: python pandas

【Solution 1】:

Use pandas.read_fwf, which reads fixed-width columns by character position instead of by delimiter:

import io
import pandas as pd

s = """
H26      1         2         3         4         5         6         7         8
H26 5678901234567890123456789012345678901234567890123456789012345678901234567890
H26                                                                        

A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 67 68 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 66 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 87 65 69 264363.6 4122568.8 370.6
A           416.0  2008.51114 80   1  -4194881 88 65 69 264363.6 4122568.8 370.5
A           416.0  2008.51117 80   1  -1112380 86 58 96 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 57 99 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112180 85 58102 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -2112380 86 58103 264340.6 4122580.8 370.2
A           416.0  2008.51120 80   1  -2122179 82 51 67 264331.3 4122588.1 370.0
A           416.0  2008.51120 80   1  -2122279 82 51 69 264331.3 4122588.1 370.0"""

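f = io.StringIO(s)
# 0-based, half-open (start, end) character offsets of the three wanted fields
cols = [(19,30),(56,65),(65,75)]
# skip the empty first line of s and the three H26 header lines
df = pd.read_fwf(f,colspecs=cols,skiprows=[0,1,2,3],header=None)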

df.loc[8:]
             0         1          2
8   2008.51117  264340.6  4122580.8
9   2008.51117  264340.6  4122580.8
10  2008.51117  264340.6  4122580.8
11  2008.51117  264340.6  4122580.8
12  2008.51120  264331.3  4122588.1
13  2008.51120  264331.3  4122588.1
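
The output above keeps the full 2008.51117 value in the first column, while the question asks for 2008.5. A possible variation, sketched under assumptions: the H26 ruler numbers columns from 1, so a small helper (colspec, hypothetical) converts 1-based inclusive column ranges into the 0-based half-open pairs that colspecs expects. Treating the first field as columns 20 to 25 is a guess based on the desired output, not something the thread confirms. The column names time, x and y are placeholders, and selecting records by a leading "A" assumes every data line starts that way, as in the sample.

```python
import io
import pandas as pd

# Ruler lines plus three records copied from the question's sample.
raw = """H26      1         2         3         4         5         6         7         8
H26 5678901234567890123456789012345678901234567890123456789012345678901234567890
A           416.0  2008.51114 80   1  -4195081 88 68 68 264363.6 4122568.8 370.6
A           416.0  2008.51117 80   1  -1112380 86 58 96 264340.6 4122580.8 370.2
A           416.0  2008.51117 80   1  -1112280 86 57101 264340.6 4122580.8 370.3
"""


def colspec(first, last):
    """Convert a 1-based inclusive column range to a 0-based half-open pair."""
    return (first - 1, last)


# Keep only the data records (assumption: they all start with "A").
records = "\n".join(line for line in raw.splitlines() if line.startswith("A"))

# Columns 20-25 (assumed width of the first field), 57-65 and 66-75.
specs = [colspec(20, 25), colspec(57, 65), colspec(66, 75)]

df = pd.read_fwf(
    io.StringIO(records),
    colspecs=specs,
    header=None,
    names=["time", "x", "y"],  # placeholder names, not from the source
)
print(df)
```

The last two pairs come out as (56, 65) and (65, 75), the same offsets used in the answer; only the first field is narrowed.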

【Comments】: