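【Posted】:2020-07-17 03:49:35
【Question】:
I have a text file made up of similar lines, and a few of those lines match other lines in the file only partially (roughly half of the text is the same).
input.txt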
I would like to play: Volleyball
I would like to play: Volleyball
I would like to play: TableTennis
I would like to play: Baseball
I do not know how to play: Volleyball
She would like to play: TableTennis
I want to learn how to play: Baseball
They like to play: all the three
From the input file, I want to remove the duplicate lines, as shown:
I would like to play: Volleyball
I would like to play: TableTennis
I would like to play: Baseball
I do not know how to play: Volleyball
She would like to play: TableTennis
I want to learn how to play: Baseball
They like to play: all three
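One way to do this first step is sketched below. It assumes that "duplicate" means an exact, character-for-character match and that the first occurrence should stay in its original position; the function name `unique_lines` is made up for illustration. Building a `dict` from the lines keeps insertion order, unlike a `set`.

```python
def unique_lines(lines):
    """Drop exact duplicate lines, keeping the first occurrence of each in order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    stripped = (line.rstrip("\n") for line in lines)
    return list(dict.fromkeys(stripped))


sample = [
    "I would like to play: Volleyball",
    "I would like to play: Volleyball",
    "I would like to play: TableTennis",
    "I would like to play: Baseball",
    "I do not know how to play: Volleyball",
    "She would like to play: TableTennis",
    "I want to learn how to play: Baseball",
    "They like to play: all the three",
]

deduped = unique_lines(sample)
for line in deduped:
    print(line)
```

To read from the file instead of the in-memory list, something like `unique_lines(open("input.txt", encoding="utf-8"))` should work, since the function accepts any iterable of strings.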
Next step:
I would like to play
They like to play
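Brief explanation of the output file: the statement "I would like to play" covers many different sports, so I want it printed. The last line, "They like to play", is a different case, so I want to print that line as well. (We write these results in .csv format, printing in separate columns the statement that covers the largest number of sports along with all of the unique sports.)
Note: I do not want to print "I do not know how to play: Volleyball", "She would like to play: TableTennis", or "I want to learn how to play: Baseball",
because those three sports are already covered.
I am confused about how to compare one line with another line in the same text file. Any help would be appreciated. Thanks.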
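A possible way to compare the lines with each other is sketched below. It rests on one reading of the question, which is an assumption: each line is split at the first `:` into a statement and a sport, and statements are then picked greedily, always taking the one that adds the most sports not yet covered and stopping when no statement adds anything new. The function names and the CSV column names (`statement`, `sports`) are invented for this example.

```python
import csv
import io


def group_sports(lines):
    """Map each statement (text before ':') to its sports, in first-seen order."""
    groups = {}
    for line in lines:
        statement, sep, sport = line.partition(":")
        if not sep:
            continue  # skip lines without a ':' separator
        groups.setdefault(statement.strip(), []).append(sport.strip())
    # remove repeated sports within a statement, keeping order
    return {s: list(dict.fromkeys(v)) for s, v in groups.items()}


def select_statements(groups):
    """Greedily pick statements that add sports not covered by earlier picks."""
    covered = set()
    chosen = []
    remaining = dict(groups)
    while remaining:
        # statement contributing the most uncovered sports; ties go to the earliest
        best = max(remaining, key=lambda s: len(set(remaining[s]) - covered))
        new = [x for x in remaining[best] if x not in covered]
        if not new:
            break  # everything left is already covered
        chosen.append((best, new))
        covered.update(new)
        del remaining[best]
    return chosen


def to_csv(chosen):
    """Render the chosen statements as CSV text (hypothetical column names)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["statement", "sports"])
    for statement, sports in chosen:
        writer.writerow([statement, "; ".join(sports)])
    return buf.getvalue()


input_lines = [
    "I would like to play: Volleyball",
    "I would like to play: Volleyball",
    "I would like to play: TableTennis",
    "I would like to play: Baseball",
    "I do not know how to play: Volleyball",
    "She would like to play: TableTennis",
    "I want to learn how to play: Baseball",
    "They like to play: all the three",
]

chosen = select_statements(group_sports(input_lines))
print(to_csv(chosen))
```

On the sample input this keeps "I would like to play" (three sports) and "They like to play" (its value is not covered by anything else), and skips the other three statements because their sports are already covered. If the intended rule is different, for example if "all the three" should count as covering the three named sports, the selection step would need to change.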
【Comments】:
-
Are you saying that if two lines end with the same word, only the first one should be kept?
-
Try this:
"\n".join(set(text.splitlines()))
-
@Sushanth I do not want to join the lines. Sorry, I did not follow what you meant.
-
@CarySwoveland I have updated the question with more details. Please take a look.
-
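It looks like you are trying to create a regular expression but do not know where to start. Please check the "Reference - What does this regex mean" resource, which has plenty of hints. Also see the "Learning Regular Expressions" post for some basic regex information. Once you have an expression ready and still have trouble with the solution, edit the question with the latest details and we will be glad to help you fix it.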
Tags: python python-3.x regex string