【Question title】: Compare lines in a single text file [duplicate]
【Posted】: 2020-07-17 03:49:35
【Question】:

I have a text file made up of similar lines, plus a few lines that only partly match other lines in the file.

Input.txt

I would like to play: Volleyball
I would like to play: Volleyball
I would like to play: TableTennis
I would like to play: Baseball
I do not know how to play: Volleyball
She would like to play: TableTennis
I want to learn how to play: Baseball
They like to play: all the three

From the input file, I want to remove the duplicate lines, as shown:

I would like to play: Volleyball
I would like to play: TableTennis
I would like to play: Baseball
I do not know how to play: Volleyball
She would like to play: TableTennis
I want to learn how to play: Baseball
They like to play: all three

Next step:

I would like to play
They like to play

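Brief explanation of the output file: the statement "I would like to play" covers many different sports, so I want it printed. The last line, "They like to play", is a different case, so I want that line printed too. (We write these results in .csv format, with the statements that cover the largest number of sports and all the unique sports printed in separate columns.)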

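Note: I do not want to print

I do not know how to play: Volleyball
She would like to play: TableTennis
I want to learn how to play: Baseball

because those three sports are already covered.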

I am confused about how to compare one line with another line in the same text file. Any help would be appreciated. Thanks.
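One possible reading of the requirement above, sketched in Python: split each line at the colon into a statement and a sport, then greedily pick the statement that covers the most sports not yet covered until every sport is accounted for. The function name `covering_statements`, the CSV column names, and the `; ` separator are assumptions made for illustration; the question does not specify them.

```python
import csv
import io


def covering_statements(lines):
    """Greedily pick statements until every sport is covered.

    Returns a list of (statement, newly_covered_sports) tuples.
    """
    # Map each statement to the set of sports it mentions (first-seen order).
    sports_by_statement = {}
    for line in lines:
        if not line.strip():
            continue
        statement, _, sport = line.partition(":")
        sports_by_statement.setdefault(statement.strip(), set()).add(sport.strip())

    uncovered = set().union(*sports_by_statement.values())
    chosen = []
    while uncovered:
        # Statement covering the most still-uncovered sports;
        # on a tie, max() keeps the statement that appeared first.
        best = max(sports_by_statement,
                   key=lambda s: len(sports_by_statement[s] & uncovered))
        newly_covered = sports_by_statement[best] & uncovered
        chosen.append((best, sorted(newly_covered)))
        uncovered -= newly_covered
    return chosen


sample = [
    "I would like to play: Volleyball",
    "I would like to play: Volleyball",
    "I would like to play: TableTennis",
    "I would like to play: Baseball",
    "I do not know how to play: Volleyball",
    "She would like to play: TableTennis",
    "I want to learn how to play: Baseball",
    "They like to play: all the three",
]

rows = covering_statements(sample)

# Write the result as CSV (in memory here; a real file would work the same way).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["statement", "sports"])  # assumed column layout
for statement, sports in rows:
    writer.writerow([statement, "; ".join(sports)])
print(buffer.getvalue())
```

With the sample input, only "I would like to play" and "They like to play" are selected, because the other three statements add no sport that is not already covered. Duplicate lines need no separate removal step here, since each statement's sports are stored in a set.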

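【Comments】:

  • Are you saying that if two lines end with the same word, only the first one should be kept?
  • Try this: "\n".join(set(text.splitlines()))
  • @Sushanth I don't want to join the lines. Sorry, I didn't follow you.
  • @CarySwoveland I've updated the question with more details. Please take a look.
  • It looks like you want to create a regular expression but don't know where to start. Please check the Reference - What does this regex mean resource, which has plenty of hints. Also see the Learning Regular Expressions post for some basic regex information. Once you have an expression ready and still have trouble with the solution, edit the question with the latest details and we'll be glad to help you fix it.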
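A note on the `set` one-liner suggested in the comments: a set has no guaranteed order, so the surviving lines may come back shuffled. A minimal order-preserving variant, assuming Python 3.7 or later (where dicts keep insertion order), could look like this; the sample `text` is taken from the question's input.

```python
text = (
    "I would like to play: Volleyball\n"
    "I would like to play: Volleyball\n"
    "I would like to play: TableTennis\n"
    "I would like to play: Baseball\n"
)

# dict keys are unique and keep insertion order, so the first
# occurrence of each line survives in its original position.
unique_lines = list(dict.fromkeys(text.splitlines()))
deduplicated = "\n".join(unique_lines)
print(deduplicated)
```

The `"\n".join(...)` call only rebuilds the text with one line per entry; it does not merge the lines into a single line.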

Tags: python python-3.x regex string


【Solution 1】:

You can do it like this:

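import pandas as pd

# Option 1: pandas keeps the first occurrence of each line, in order
with open('Input.txt') as f:
    content = f.readlines()
content = pd.Series(content).unique().tolist()

# Option 2: plain Python, same result without pandas
with open('Input.txt') as f:
    content = f.readlines()
result = []
for line in content:
    if line not in result:
        result.append(line)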

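【Comments】:

  • This is good. Maybe you could even convert "content" from a list to a set and not create "result" at all.
  • For this case, converting the list to a set is a good idea. Thanks.
  • @Md. Fantacher Islam With the code above, I get multi-line output, with each non-duplicate line appended in turn. I don't want to print a line every time one is appended. I've updated the question to make this clearer. Thanks.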
【Solution 2】:

This is simple; do this in your ".py" file:

"""Simple Solution To Your Problem!"""

# Creating The Input File `input.txt` (mode 'w' overwrites an existing file)
new_file = '\
I would like to play: Volleyball\n\
I would like to play: Volleyball\n\
I do not know how to play: Volleyball\n\
I would like to play: Baseball\n\
I want to learn how to play: Volleyball'
with open('input.txt', encoding='utf-8', mode='w') as f:
    f.write(new_file)  # The file is closed when the block ends


# Next, Printing Lines 1, 4, 5 (0-based indices 0, 3, 4)
with open('input.txt', encoding='utf-8', mode='r') as f:
    lines = f.readlines()
wanted_lines = [0, 3, 4]
for each_line in wanted_lines:
    # Strip the trailing newline so print() does not add a blank line
    print(lines[each_line].rstrip('\n'))

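【Comments】:

  • Actually, the input is not fixed, so we cannot know in advance which lines we need. Could you take another look once I have updated the question to be clearer?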
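As the comment above points out, hard-coded indices only work for one particular file. A rough sketch of deriving them from the data instead, keeping the index of the first occurrence of each distinct line, might look like this. The helper name `first_occurrence_indices` is hypothetical, and the sample lines are the ones Solution 2 writes to `input.txt`.

```python
def first_occurrence_indices(lines):
    """Return the 0-based index of the first occurrence of each distinct line."""
    seen = set()
    indices = []
    for index, line in enumerate(lines):
        key = line.strip()  # ignore trailing newlines and stray spaces
        if key and key not in seen:
            seen.add(key)
            indices.append(index)
    return indices


lines = [
    "I would like to play: Volleyball\n",
    "I would like to play: Volleyball\n",
    "I do not know how to play: Volleyball\n",
    "I would like to play: Baseball\n",
    "I want to learn how to play: Volleyball",
]

wanted_lines = first_occurrence_indices(lines)
for each_line in wanted_lines:
    print(lines[each_line].strip())
```

This only removes exact duplicates; deciding which statements to keep based on the sports they cover is a separate step.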