Python: loading a txt file and splitting lines by position in the line

【Question title】: Python loading txt file and split lines by position in line
【Posted】: 2021-07-15 00:41:28
【Question description】:

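I am new here and also a Python beginner. I received a text file with 100k lines, each containing 120 characters. Each line represents 14 columns of data, but because some values are shorter than others, they are padded with blanks. So I have no delimiter such as ",". If I choose whitespace as the delimiter, the values will not end up in the correct columns.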

The lines look like this:

Char 1: O or L
Char 2-5: year
Char 6-13: name of month
Char 14-21: car brand
Char 22: .

O2020august  Opel    .
L2015may     BMW     .
L2016april   Mercedes.
O2021january Opel    .
L2023februaryAudi    .

I am stuck at:

df = pd.read_csv('text.txt', index_col=0, header=None)
print(df)

I am happy with any suggested approach. It does not have to be pandas.

Cheers, Jenny

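【Question comments】:

Please repeat on topic and how to ask from the intro tour. "Show me how to solve this coding problem" is not a Stack Overflow issue. We expect you to make an honest attempt, and then ask a specific question about your algorithm or technique. Stack Overflow is not intended to replace existing documentation and tutorials.
Let me offer a little hint: brand = line[14:22].rstrip().
Hey @Prune, sorry, I do not want to use this site as a tutorial, but I really do not know how to proceed. I checked different forums as well. As mentioned, I do not know what I can use as a delimiter. I tried to create a list, but that got me nowhere. As it stands, it splits the values at the blanks. file = open('text.txt', 'r') for line in file: line = line.strip() columns = line.split() print(columns)
@TimRoberts: Thank you for your help!
Being "stuck" does not make a question suitable for Stack Overflow. Again, please see the posting guidelines. You seem to need a general help site instead.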

Tags: python pandas string split

【Solution 1】:

I believe something like this will solve your problem.

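with open('text.txt', 'r') as txt:
    for line in txt:
        # each line should look something like "O2020august  Opel    ."
        line = line.rstrip('\n')
        print(line)
        s1 = line[:1]     # char 1: O or L
        s2 = line[1:5]    # chars 2-5: year
        s3 = line[5:13]   # chars 6-13: month name, still padded with blanks
        # ... slice the remaining columns the same way
        print(s1, s2, s3)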

You can read the file with the readline and readlines methods of Python's file objects.
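For a beginner, the file-handling step may be the unclear part, so here is a minimal sketch of it. It writes three sample rows from the question to a temporary file purely so that the example runs on its own; `SAMPLE` and `read_rows` are illustrative names, and only the first three columns are sliced, as in the snippet above.

```python
import os
import tempfile

# Sample rows copied from the question; the real file has 100k of them.
SAMPLE = (
    "O2020august  Opel    .\n"
    "L2015may     BMW     .\n"
    "L2023februaryAudi    .\n"
)


def read_rows(path):
    """Return (flag, year, month) tuples sliced by character position."""
    rows = []
    with open(path, "r", encoding="utf-8") as fp:
        for line in fp.readlines():       # readlines() loads every line into a list
            line = line.rstrip("\n")      # drop only the newline, keep the padding
            flag = line[0:1]              # char 1
            year = line[1:5]              # chars 2-5
            month = line[5:13].rstrip()   # chars 6-13, trailing blanks removed
            rows.append((flag, year, month))
    return rows


# Write the sample to a throwaway file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "text.txt")
    with open(path, "w", encoding="utf-8") as fp:
        fp.write(SAMPLE)

    with open(path, "r", encoding="utf-8") as fp:
        first_line = fp.readline()        # readline() returns one line per call

    rows = read_rows(path)

print(rows)
```

For a 100k-line file, iterating over the file object directly (`for line in fp:`) should behave the same as `readlines()` while avoiding holding the whole list in memory.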

【Comments】:

Since he is a beginner, I would suggest also explaining how to open the file :)

【Solution 2】:

Or you can use a simple helper function to do the job for you.

def split_by_pos(string_to_split, *args):
    """
    Splits a string at the given positions
    :param string_to_split: the string to be split
    :param args: the positions where the function will split the string.
    :return: the split string as a tuple
    """
    return_value = []
    args = sorted(args)
    previous = 0
    for position in args:
        return_value.append(string_to_split[previous:position])
        previous = position
    return_value.append(string_to_split[previous:])
    return tuple(return_value)


with open("a_random_file.txt", "r", encoding="utf-8") as fp:
    lines = fp.readlines()

for line in lines:
    # cut after chars 1, 5, 13 and 21 to match the layout in the question
    print(split_by_pos(line.rstrip("\n"), 1, 5, 13, 21))
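The cut positions can also be derived from the column widths instead of being typed by hand. Below is a minimal sketch of that idea, assuming the widths 1, 4, 8, 8, 1 implied by the layout in the question. `WIDTHS`, `NAMES` and `parse_fixed_width` are hypothetical names, and the real 14-column file would need its remaining widths added.

```python
from itertools import accumulate

# Column widths taken from the layout described in the question:
# flag (1), year (4), month (8), brand (8), terminator (1).
WIDTHS = [1, 4, 8, 8, 1]
NAMES = ["flag", "year", "month", "brand", "end"]  # hypothetical column names


def parse_fixed_width(line, widths=WIDTHS, names=NAMES):
    """Slice one line into named fields and strip the blank padding."""
    ends = list(accumulate(widths))   # running totals: [1, 5, 13, 21, 22]
    starts = [0] + ends[:-1]          # each field starts where the last ended
    return {
        name: line[start:end].strip()
        for name, start, end in zip(names, starts, ends)
    }


record = parse_fixed_width("L2023februaryAudi    .")
print(record)
```

Keeping the widths in one list means a layout change only touches `WIDTHS` and `NAMES`, not the slicing code.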

【Comments】: