【Question title】: Python: extract values between two strings in text file
【Posted】: 2020-05-27 02:35:34
【Question】:

I have a text file containing a conversation like this:

    Mom:
    Hi
    Dad:
    Hi
    Mom:
    Bye
    Dad:
    Bye
    Dad:
    :)

I need to copy each speaker's lines into that speaker's own text file (mom.txt and dad.txt). This works, except when the same speaker has two or more turns in a row.

def sort(path):
    inFile= open(path, 'r')
    inFile1= open(path, 'r')
    copy = False
    outFile = open('mom.txt', 'w')
    outFile1 = open('dad.txt', 'w')
    keepCurrentSet = False
    for line in inFile:
        if line.startswith("Dad:"):
            keepCurrentSet = False

        if keepCurrentSet:
            outFile.write(line)

        if line.startswith("Mom:"):
            keepCurrentSet = True

    for line1 in inFile1:
        if line1.startswith("Mom:"):
            keepCurrentSet = False

        if keepCurrentSet:
            outFile1.write(line1)

        if line1.startswith("Dad:"):
            keepCurrentSet = True


    outFile.close()        
    outFile1.close()
    inFile1.close()

The contents of outFile1 (dad.txt) look like this:

Hi
Bye
Dad:
:)

It should look like this:

Hi
Bye
:)

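Any ideas, or a simpler way to do this? Thanks.

【Comments】:

  • What exactly are you trying to do? Please explain how and why you should get the output you say you should get.
  • Dad's lines go to the text file dad.txt and Mom's lines go to the text file mom.txt.

Tags: python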


【Solution 1】:

You can use:

def sort(path):
    with open(path) as f,\
            open('mom.txt', 'w') as mom,\
            open('dad.txt', 'w') as dad:
        curr = None # keeps track of the current speaker
        for line in f:
            if 'Mom:' in line:
                curr = 'Mom' # set the current speaker to Mom
            elif 'Dad:' in line:
                curr = 'Dad' # set the current speaker to Dad
            else:
                if curr == 'Mom':
                    mom.write(line)
                elif curr == 'Dad':
                    dad.write(line)

The resulting mom.txt and dad.txt files should look like this:

# mom.txt
Hi
Bye

# dad.txt
Hi
Bye
:)
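
The same "remember the current speaker" idea can be sketched for any number of speakers. This is only an illustration, not part of the answer above: the function name `split_by_speaker` and its `speakers` parameter are made up for this example, and it assumes a label line contains nothing but the speaker's name followed by a colon. It works on a list of lines rather than on files, so it is easy to try out.

```python
from collections import defaultdict

def split_by_speaker(lines, speakers=("Mom", "Dad")):
    """Group spoken lines under the most recent speaker label (hypothetical helper)."""
    # Assumption: a label line is exactly "<name>:" once whitespace is stripped.
    labels = {f"{name}:": name for name in speakers}
    grouped = defaultdict(list)
    current = None  # no speaker seen yet
    for line in lines:
        key = line.strip()
        if key in labels:
            current = labels[key]  # label line: switch speaker, copy nothing
        elif current is not None:
            grouped[current].append(line)  # spoken line: goes to the current speaker
    return dict(grouped)

sample = ["Mom: \n", "Hi\n", "Dad: \n", "Hi\n", "Mom: \n", "Bye\n",
          "Dad: \n", "Bye\n", "Dad: \n", ":)\n"]
result = split_by_speaker(sample)
print(result["Dad"])  # ['Hi\n', 'Bye\n', ':)\n']
```

Each list in the result could then be written to its own file with `writelines`. Matching the whole stripped line, rather than testing `'Mom:' in line`, avoids treating a spoken line that merely mentions "Mom:" as a label.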

【Comments】:

    【Solution 2】:

    Here is one way to write mom.txt and dad.txt in a single loop:

    def sort(path):
        inFile = open(path, 'r')
        outFile = open('mom.txt', 'w')   # receives Mom's lines
        outFile1 = open('dad.txt', 'w')  # receives Dad's lines
        keepCurrentSetDad = False
        keepCurrentSetMom = False
        for line in inFile:
            # A speaker label switches the active speaker and is not copied.
            if 'Dad' in line:
                keepCurrentSetDad = True
                keepCurrentSetMom = False
                continue
            elif 'Mom' in line:
                keepCurrentSetMom = True
                keepCurrentSetDad = False
                continue
            # Any other line belongs to the most recent speaker.
            if keepCurrentSetDad:
                outFile1.write(line)
            elif keepCurrentSetMom:
                outFile.write(line)
        inFile.close()
        outFile.close()
        outFile1.close()
    

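    I only edited your code. Please check your txt file: in what you have shown here, the speaker label is on one line and what the speaker says is on the next line. I have kept to that format.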
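
One way to check whether a file really follows that "label line, then one spoken line" layout is sketched below. It is an illustrative helper and not part of the answer: the name `check_format`, the `speakers` parameter, and the problem messages are all invented here, and a label is assumed to be just the speaker's name plus a colon. Blank lines are treated as spoken lines, which may or may not suit your file.

```python
def check_format(lines, speakers=("Mom", "Dad")):
    """Return (line_number, problem) pairs for lines that break the layout (hypothetical helper)."""
    labels = {f"{name}:" for name in speakers}
    problems = []
    expecting_speech = False  # True right after a label line
    number = 0
    for number, line in enumerate(lines, start=1):
        is_label = line.strip() in labels
        if is_label and expecting_speech:
            problems.append((number, "label follows a label"))
        elif not is_label and not expecting_speech:
            problems.append((number, "spoken line without its own label"))
        expecting_speech = is_label
    if expecting_speech:
        problems.append((number, "file ends with a label"))
    return problems

print(check_format(["Mom:\n", "Hi\n", "Dad:\n", "Hi\n"]))  # []
print(check_format(["Mom:\n", "Dad:\n", "Hi\n", "Bye\n"]))
# [(2, 'label follows a label'), (4, 'spoken line without its own label')]
```

An empty list means every label is followed by exactly one spoken line, which is the layout shown in the question.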

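    【Comments】:

      【Solution 3】:

      My answer is shorter and needs only one conditional check inside the loop. Depending on your Python version, you can choose one of the following two: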

      Python 3.7+

      def sort(path):
          with open(path, 'r') as inFile, open('mom.txt', 'w+') as momFile, open('dad.txt', 'w+') as dadFile:
              line = inFile.readline()
              while line != '':
                  if line.startswith('Mom:'):
                      momFile.write(inFile.readline())
                  elif line.startswith('Dad:'):
                      dadFile.write(inFile.readline())
                  line = inFile.readline()
      

      Python 3.8+ (note the walrus operator :=):

      def sort(path):
          with open(path, 'r') as inFile, open('mom.txt', 'w+') as momFile, open('dad.txt', 'w+') as dadFile:
              while (line := inFile.readline()) != '':
                  if line.startswith('Mom:'):
                      momFile.write(inFile.readline())
                  elif line.startswith('Dad:'):
                      dadFile.write(inFile.readline())
      

      Output:

      mom.txt:
      Hi
      Bye
      
      dad.txt:
      Hi
      Bye
      :)
      

      If you spot any mistakes or possible improvements, please let me know.
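
One behavior worth knowing about, shown as a small sketch: because each label triggers exactly one extra `readline()`, only the first line after a label is copied. The example below reuses the same loop on in-memory `io.StringIO` objects instead of real files. The function name `sort_streams` and the two-line turn in the sample text are assumptions made for this illustration, not part of the question's data.

```python
import io

def sort_streams(in_file, mom_file, dad_file):
    # Same loop as the Python 3.7+ version, but on already-open file objects.
    line = in_file.readline()
    while line != '':
        if line.startswith('Mom:'):
            mom_file.write(in_file.readline())  # copies only the next line
        elif line.startswith('Dad:'):
            dad_file.write(in_file.readline())
        line = in_file.readline()

# Hypothetical input in which Mom says two lines in a single turn.
source = io.StringIO("Mom:\nHi\nHow are you?\nDad:\nFine\n")
mom, dad = io.StringIO(), io.StringIO()
sort_streams(source, mom, dad)

print(repr(mom.getvalue()))  # 'Hi\n'  ("How are you?" is skipped)
print(repr(dad.getvalue()))  # 'Fine\n'
```

For the file in the question, where every label is followed by a single line, this makes no difference. If turns can span several lines, an approach that tracks the current speaker would copy all of them.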

      【Comments】:
