【Question title】: Structure text from words of triplets saved in a 2D list
【Posted】: 2021-01-12 19:07:13
【Question】:

I currently have a text whose words are saved as triplets in a 2D list.

My 2D list:

[['Python', 'is', 'an'], ['interpreted,', 'high-level', 'and'], ['general-purpose', 'programming', 'language.'], ["Python's", 'design', 'philosophy'], ['emphasizes', 'code', 'readability'], ['with', 'its', 'notable'], ['use', 'of', 'significant'], ['whitespace.', 'Its', 'language'], ['constructs', 'and', 'object-oriented'], ['approach', 'aim', 'to'], ['help', 'programmers', 'write'], ['clear,', 'logical', 'code'], ['for', 'small', 'and'], ['large-scale', 'projects.']]

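I am writing a Python program that picks one of these triplets at random and then tries to build a new random text: it takes the last two words of the current triplet and picks a triplet that starts with those two words. The program ends once it has written 200 words or when no further triplet can be chosen.

Code so far: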

import random

with open(r'c:\\python\4_TRIPLETS\Sample.txt', 'r') as file:
    data = file.read().replace('\n', '').split()
    lines = [data[i:i + 3] for i in range(0, len(data), 3)]
    random.shuffle([random.shuffle(i) for i in lines])

first_triplet = random.choice(lines)
last_two = first_triplet[1:3]


output_text=[]
while True:
    candidates = [t for t in lines if t[0:2] == last_two]
    if not candidates:
        break
    
    next_triplet = random.choice(candidates)
    last_two = next_triplet[1:3]
    output_text.append(next_triplet)

I cannot automate the repeating step of searching for matches and storing them in a new list.

Any ideas?

【Question comments】:

    Tags: python arrays python-3.x list 2d


    【Solution 1】:
    import random
    import sys
    import json
    
    outlist = []
    file_in = sys.argv[1]
    file_ot = str(file_in) + ".ot"
    
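    with open(file_in, 'r') as file:
        # split() already splits on newlines; replacing '\n' with '' first
        # would glue the last word of a line to the first word of the next
        data = file.read().split()
        # overlapping triplets (step 1); stop two short so each has three words
        lines = [data[i:i + 3] for i in range(len(data) - 2)]
    print("\nText split into a list of triplets:\n", lines)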
    
    
    triplet = random.choice(lines)
    last_two = triplet[1:3]
    
    print("\nSelected triplet: \n", triplet)
    print("\nIts last two words:\n", last_two)
    outlist.extend(triplet)
    
    proc_list = lines
    # first selected, remove from list
    proc_list.remove(triplet)
    
    n = 0
    while True:
    
        n += 1
        print("\nIteration {0}\n".format(n))
    
        if not proc_list:  # a list never equals 0; test for emptiness instead
            print("a")
            break
    
        random.shuffle(proc_list)
        
        candidates = []
        
        for element in proc_list:
            if element[:2] == last_two:
                candidates.append(element)
        
        print(candidates)
    
        if not candidates:
            print("b")
            break
        
        
        if len(outlist) >= 200:   
            print("c")
            break
        
        triplet = random.choice(candidates)
        outlist.append(triplet[-1])
        proc_list.remove(triplet)
        
        last_two = triplet[1:3]
        print(outlist)
    
    with open(file_ot, 'w') as f:
        f.write(json.dumps(outlist, indent=10))
    
    print(" ".join(outlist))
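
Solution 1 works because it builds overlapping triplets (a window that moves one word at a time), so the last two words of one triplet are the first two words of the next. The non-overlapping list in the question has no such matches. A minimal sketch of the same idea follows, assuming a dictionary lookup in place of the repeated list scan. The names `build_index` and `generate` and the sample sentence are hypothetical, and unlike Solution 1 this sketch does not remove used triplets, so a text can repeat itself until the word limit is hit.

```python
import random
from collections import defaultdict


def build_index(words):
    """Map each pair of consecutive words to the words seen right after it."""
    index = defaultdict(list)
    for first, second, third in zip(words, words[1:], words[2:]):
        index[(first, second)].append(third)
    return index


def generate(words, max_words=200, rng=random):
    """Chain overlapping triplets until max_words is reached or no follower exists."""
    if len(words) < 3:
        return list(words)  # too short to form a single triplet
    index = build_index(words)
    start = rng.randrange(len(words) - 2)  # random starting triplet
    output = list(words[start:start + 3])
    while len(output) < max_words:
        # .get() avoids inserting empty entries into the defaultdict
        followers = index.get((output[-2], output[-1]))
        if not followers:
            break
        output.append(rng.choice(followers))
    return output


# Hypothetical sample text, used only for illustration
sample = "the cat sat on the cat sat down".split()
print(" ".join(generate(sample, max_words=20)))
```

Each lookup is a single dictionary access, so the cost per generated word no longer grows with the length of the input text.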
    
    

    【Comments】:

      【Solution 2】:

      You can use a recursive function (I changed part of the code; check the comments):

      import random
      
      # adding ["is", "an", "experiment"] to check that it works (no other triplet satisfied the condition)
      lines = [['Python', 'is', 'an'], ['interpreted,', 'high-level', 'and'], ['general-purpose', 'programming', 'language.'], ["Python's", 'design', 'philosophy'], ['emphasizes', 'code', 'readability'], ['with', 'its', 'notable'], ['use', 'of', 'significant'], ['whitespace.', 'Its', 'language'], ['constructs', 'and', 'object-oriented'], ['approach', 'aim', 'to'], ['help', 'programmers', 'write'], ['clear,', 'logical', 'code'], ['for', 'small', 'and'], ['large-scale', 'projects.'], ["is", "an", "experiment"]]
      
      first_triplet = ['Python', 'is', 'an'] # random.choice(lines)
      
      
      def appendNextTriplet(output_text, lines):
          if len(output_text) >= 200:
              return output_text
          candidates = [t for t in lines if t[:2] == output_text[-2:]]
          if not candidates:
              return output_text
          next_triplet = random.choice(candidates)
          output_text += next_triplet # changed from append to concatenation, it was not correct
          return appendNextTriplet(output_text, lines)
      
      print(appendNextTriplet(first_triplet, lines)) # ['Python', 'is', 'an', 'is', 'an', 'experiment']
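
The recursive function above can also be written as a loop. The sketch below is one possible iterative variant, not the answerer's code: the name `chain_triplets` and the third sample triplet are hypothetical. It differs in two labeled ways. It copies the starting triplet instead of extending it in place, and it appends only the third word of each match, since the first two words are already at the end of the output.

```python
import random


def chain_triplets(first_triplet, lines, max_words=200):
    """Iterative counterpart of appendNextTriplet; leaves its inputs untouched."""
    output_text = list(first_triplet)  # copy so the caller's list is not modified
    while len(output_text) < max_words:
        last_two = output_text[-2:]
        # Skip incomplete triplets such as ['large-scale', 'projects.']
        candidates = [t for t in lines if len(t) == 3 and t[:2] == last_two]
        if not candidates:
            break
        # Append only the third word; the first two are already in the output.
        output_text.append(random.choice(candidates)[2])
    return output_text


# Hypothetical data: the third triplet is added only to show a second step
lines = [["Python", "is", "an"], ["is", "an", "experiment"], ["an", "experiment", "indeed"]]
print(chain_triplets(["Python", "is", "an"], lines))
# ['Python', 'is', 'an', 'experiment', 'indeed']
```

A loop also avoids any dependence on Python's recursion limit if the word cap is raised well beyond 200.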
      

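      【Comments】:

      • I get a syntax error at candidates. Not sure why.
      • I used the website's editor, so it may be an indentation issue; I will check on my computer.
      • OK, thanks. I will provide a better text input, because I want to build a new text of at most 200 words, in a new list, from a collection of these triplets. Do you think this will work?
      • @AndreasKreouzos Yes, it should work; it follows the same mechanism.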