【Question title】: Convert a text file into a list using empty lines as the delimiter in Python [duplicate]
【Posted】: 2021-08-12 22:04:02
【Question】:

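I have a text file containing paragraphs separated by empty lines, as shown below. I am trying to create a list in which each element is one paragraph from the text file.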

The Yakutia region, or Sakha Republic, where the Siberian wildfires are mainly 
taking place is one of the most remote parts of Russia. 

The capital city, Yakutsk, recorded one of the coldest temperatures on Earth in 
February 1891, of minus 64.4 degrees Celsius (minus 83.9 degrees Fahrenheit); but 
the region saw record high temperatures this winter. 

The Siberian Times reported in mid-July that residents were breathing smoke from more than 
300 separate wildfires, but that only around half of the forest blazes were being tackled 
by firefighters — including paratroopers flown in by the Russian military — because 
the rest were thought to be too dangerous.

The wildfires have grown in size since then and have engulfed an estimated 62,300 square 
miles (161,300 square km) since the start of the year.

So in the example above, the list would have 4 elements, one per paragraph.

I can easily combine the paragraphs into a single string with the following code:

mystr = " ".join([line.strip() for line in lines])

But I can't figure out how to create a list from the text file using the empty lines between paragraphs as the delimiter. I tried:

with open('texr.txt', encoding='utf8') as f:
    lines = [line for line in f]

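hoping I could turn each line into a list element and then combine everything between the blank lines into one string. But that doesn't seem to work. I must be missing something very basic here.

Thanks.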
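The line-by-line idea described above can be made to work. Here is a minimal sketch, assuming a line counts as blank when it contains only whitespace; `paragraphs_from_lines` is a hypothetical helper name and the sample lines are made up for illustration:

```python
from itertools import groupby

def paragraphs_from_lines(lines):
    """Group consecutive non-blank lines into paragraphs (hypothetical helper)."""
    paragraphs = []
    # groupby yields runs of consecutive lines that share the same key:
    # True for blank (whitespace-only) lines, False for lines with text.
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if not is_blank:
            # Join the wrapped lines of one paragraph into a single string.
            paragraphs.append(" ".join(line.strip() for line in group))
    return paragraphs

# Made-up sample standing in for the lines read from the file.
sample = [
    "First paragraph, line one \n",
    "line two.\n",
    "\n",
    "Second paragraph.\n",
    "   \n",
    "\n",
    "Third paragraph, line one\n",
    "line two.\n",
]
result = paragraphs_from_lines(sample)
```

Since an open file object iterates over its lines, the same helper should also accept `f` directly inside the `with open(...)` block.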

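【Comments】:

  • Have you inspected the resulting string mystr to see what pattern appears where there was an empty line? Have you tried splitting mystr on that pattern?

Tags: python text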


【Solution 1】:

Try:

with open('textr.txt') as fp:
    lst = [p.strip() for p in fp.read().split('\n\n')]
>>> lst

['The Yakutia region, or Sakha Republic, where the Siberian wildfires are mainly \ntaking place is one of the most remote parts of Russia.',
 'The capital city, Yakutsk, recorded one of the coldest temperatures on Earth in \nFebruary 1891, of minus 64.4 degrees Celsius (minus 83.9 degrees Fahrenheit); but \nthe region saw record high temperatures this winter.',
 'The Siberian Times reported in mid-July that residents were breathing smoke from more than \n300 separate wildfires, but that only around half of the forest blazes were being tackled \nby firefighters — including paratroopers flown in by the Russian military — because \nthe rest were thought to be too dangerous.',
 'The wildfires have grown in size since then and have engulfed an estimated 62,300 square \nmiles (161,300 square km) since the start of the year.']
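Splitting on exactly '\n\n' works for the sample file, but it can miss separators that contain stray spaces or tabs. The following is a sketch of a more tolerant variant, not part of the original answer; `split_paragraphs` is a hypothetical name and the sample text is made up:

```python
import re

def split_paragraphs(text):
    """Split text on blank lines, tolerating whitespace-only lines (hypothetical helper)."""
    # A separator is a newline, optional whitespace (including further
    # newlines), and then another newline.
    parts = re.split(r"\n\s*\n", text.strip())
    # Drop any empty pieces and trim each paragraph.
    return [p.strip() for p in parts if p.strip()]

# Made-up sample: two blank lines after the first paragraph, and a
# "blank" line containing spaces and a tab after the second.
text = "Para one line a \nline b.\n\n\nPara two.\n  \t\nPara three.\n"
paragraphs = split_paragraphs(text)
```

On this sample a plain `text.split('\n\n')` finds only two pieces, because the whitespace-only line is not an exact '\n\n' match.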

【Comments】:

  • Thanks. I changed the line in your code to lst = [x.replace('\n', ' ') for x in [p.strip() for p in fp.read().split('\n\n')]] so that the elements aren't broken across lines.
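One thing to watch with the replace('\n', ' ') approach: the sample lines end with a trailing space before the newline, so the replacement leaves two spaces at each join. Here is a small sketch of an alternative that collapses every run of whitespace; `flatten_paragraph` is a hypothetical name:

```python
def flatten_paragraph(paragraph):
    """Collapse newlines and repeated spaces into single spaces (hypothetical helper)."""
    # str.split() with no arguments splits on any run of whitespace,
    # so re-joining with one space normalizes the paragraph.
    return " ".join(paragraph.split())

# Fragment of the first paragraph as it appears in the answer's output.
raw = "wildfires are mainly \ntaking place is one of the most remote parts of Russia."
with_replace = raw.replace("\n", " ")   # keeps the trailing space, giving a double space
flattened = flatten_paragraph(raw)      # single spaces throughout
```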