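【Question title】:A program that identifies individual words in a sentence, stores these in a list and replaces each word with the position of that word in the list [closed]
【Posted】:2016-04-20 20:28:22
【Question description】:

I am developing a program that identifies the individual words in a sentence, stores them in a list, and replaces each word in the original sentence with the position of that word in the list, so that the sentence can be recreated from the positions of the words in this list using the sequence 1,2,3,4,5,6,7,8,9,1,3,9,6,7,8,4,5. My code so far is below, but I would like some advice on how to make it more efficient and shorter: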

import time

sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"
s = sentence.split() 
another = [0]
time.sleep(0.5)
print(sentence)
for count, i in enumerate(s): 
    if s.count(i) < 2:
        another.append(max(another) + 1)
    else:
        another.append(s.index(i) +1)
another.remove(0)
time.sleep(0.5)
print(another)

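【Comments】:

  • If this is working code that you think could be improved, see Code Review. If not, please clarify the problem with a minimal reproducible example.
  • Well, if you dropped the time.sleep calls, the code would be one second faster and two lines shorter.
  • I know, I just want to make it more efficient, because my friend managed to get the same result using 4 lines of code
  • @MrPython Fewer lines do not (by themselves) make code more efficient or more readable.

Tags: python string python-3.x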


【Solution 1】:

Here is a linear-time algorithm:

position = {} # word -> position
words = sentence.split()
for word in words:
    if word not in position: # new word
        position[word] = len(position) + 1 # store its position
print(*map(position.__getitem__, words), sep=",")
# -> 1,2,3,4,5,6,7,8,9,1,3,9,6,7,8,4,5

The print() call uses the Python 3 * syntax to unpack the result returned by map(), which here yields the position of each word. See What does ** (double star) and * (star) do for parameters?
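The same dictionary idea can be wrapped into a pair of functions that also rebuild the sentence, which is the round trip the question describes. This is only a sketch: the names `encode` and `decode` are made up for illustration, and it assumes words are separated by single spaces and that the dict keeps insertion order (guaranteed from Python 3.7).

```python
def encode(sentence):
    """Return (unique_words, positions) for a space-separated sentence."""
    words = sentence.split()
    position = {}  # word -> 1-based position of its first appearance
    for word in words:
        if word not in position:  # new word
            position[word] = len(position) + 1
    unique_words = list(position)  # keys come back in insertion order
    positions = [position[word] for word in words]
    return unique_words, positions


def decode(unique_words, positions):
    """Rebuild the sentence from the word list and the 1-based positions."""
    return " ".join(unique_words[p - 1] for p in positions)


sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"
unique_words, positions = encode(sentence)
print(*positions, sep=",")
# -> 1,2,3,4,5,6,7,8,9,1,3,9,6,7,8,4,5
print(decode(unique_words, positions) == sentence)
# -> True
```

Each word is looked up in the dictionary a constant number of times, so the work grows linearly with the number of words.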

【Comments】:

    【Solution 2】:

    To get a list of the word positions in sentence and to recreate the original sentence from this list:

    sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"
    s = sentence.split()
    positions = [s.index(x)+1 for x in s]
    recreated = [s[i-1] for i in positions]
    # the reconstructed sentence
    print(" ".join(recreated))
    # the list of index numbers/word positions
    print(positions)
    # the positions as a space separated string of numbers
    print(" ".join(map(str, positions)))
    

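    Lists are zero-indexed, so the first element is at index 0, not 1. If you want the positions to start at 1, add 1 to every index in the list comprehension, as s.index(x)+1 does above, and subtract 1 again when looking a word up.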
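As a small illustration of the offset, the sketch below builds both the 0-based and the 1-based positions for a shortened sentence (chosen here only to keep the output short) and shows that the look-up has to undo whichever offset was applied.

```python
words = "ASK NOT WHAT ASK".split()

# list.index() returns 0-based indices of the first occurrence
zero_based = [words.index(w) for w in words]
# adding 1 gives the 1-based positions the question asks for
one_based = [words.index(w) + 1 for w in words]

print(zero_based)  # [0, 1, 2, 0]
print(one_based)   # [1, 2, 3, 1]

# Rebuilding the sentence: index directly for 0-based, subtract 1 for 1-based
from_zero = " ".join(words[i] for i in zero_based)
from_one = " ".join(words[i - 1] for i in one_based)
print(from_zero == from_one == "ASK NOT WHAT ASK")  # True
```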

    To get exactly the same output as your script produces:

    sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"
    s = sentence.split()
    positions = [s.index(x)+1 for x in s]
    print(sentence)
    print(positions)
    

    Output:

    ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 3, 9, 6, 7, 8, 4, 5]
    

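    【Comments】:

    • Although not technically required, you may want to de-duplicate s by doing something like list(set(sentence.split())); this could matter if the corpus gets large and you make a lot of s.index() calls.
    • But this does not print the numbers as the final result; it only prints the sentence
    • @MrPython print(positions) or print(" ".join(positions)). I just assumed that reconstructing the sentence from the positions was the end goal
    • Yes, now it prints the numbers, but it starts from 0. I want the result to be 1,2,3,4,5,6,7,8,9,1,3,9,6,7,8,4,5
    • @MrPython I did!? That's the whole code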
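On the de-duplication suggestion in the comments: a set does not keep the words in their original order, so positions taken from list(set(...)) would not match the expected sequence. One order-preserving alternative, shown here as a sketch rather than as what the commenter had in mind, is dict.fromkeys, which keeps the first occurrence of each key in insertion order (Python 3.7+). The variable name `unique` is chosen for illustration.

```python
sentence = "ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY"
s = sentence.split()

# De-duplicate while keeping first-occurrence order
unique = list(dict.fromkeys(s))

# Positions now refer to the shorter, de-duplicated list
positions = [unique.index(x) + 1 for x in s]
recreated = " ".join(unique[i - 1] for i in positions)

print(len(s), len(unique))    # 17 9
print(positions)              # [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 3, 9, 6, 7, 8, 4, 5]
print(recreated == sentence)  # True
```

Each unique.index() call now scans at most the distinct words instead of the whole sentence.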
    【Solution 3】:
    sentence = 'ASK NOT WHAT YOUR COUNTRY CAN DO FOR YOU ASK WHAT YOU CAN DO FOR YOUR COUNTRY'
    words = sentence.split()
    
    # Counting backwards through the words means the last seen one will have 
    # the lowest index without needing 'if' tests or incrementing counters.
    positions = {word:index for index, word in reversed(list(enumerate(words, 1)))}
    
    print(' '.join(str(positions.get(word)) for word in words))
    

    Try it on repl.it: https://repl.it/CHvy/0
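The comment in the code above can be illustrated on a shorter sentence. In a dict comprehension, a later assignment to the same key overwrites the earlier value, so iterating in reverse leaves each word mapped to the index of its first occurrence. The shortened sentence and the names `pairs`, `forward` and `backward` are assumptions made for this sketch.

```python
words = "YOU ASK WHAT YOU ASK".split()

# (1-based index, word) pairs in sentence order
pairs = list(enumerate(words, 1))

# Forward iteration: the LAST occurrence of each word wins
forward = {word: index for index, word in pairs}
# Reverse iteration: the FIRST occurrence is assigned last, so it wins
backward = {word: index for index, word in reversed(pairs)}

print(forward["YOU"], forward["ASK"], forward["WHAT"])     # 4 5 3
print(backward["YOU"], backward["ASK"], backward["WHAT"])  # 1 2 3

print(" ".join(str(backward[word]) for word in words))     # 1 2 3 1 2
```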

    【Comments】:

      【Solution 4】:

      Not very efficient, but it takes only two lines.

      words = sentence.split()
      positions = [words.index(word) + 1 for word in words]
      

      Note that list.index(entry) returns the index of the first occurrence of entry. If you can accept 0-based indices, the following is very concise:

      positions = list(map(words.index, words))
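One consequence of using list.index on the full word list is that the numbers are positions in the original sentence, not in a list of distinct words. For the example sentence the two coincide, because its first nine words are all different. The sketch below uses a made-up sentence where they differ, so the choice becomes visible; `first_index`, `seen` and `dense` are illustrative names.

```python
words = "A B A C".split()

# Position of the first occurrence in the original sentence (1-based)
first_index = [words.index(w) + 1 for w in words]

# Position in the list of distinct words, numbered in order of appearance
seen = {}
for w in words:
    seen.setdefault(w, len(seen) + 1)
dense = [seen[w] for w in words]

print(first_index)  # [1, 2, 1, 4]  (C is the 4th word of the sentence)
print(dense)        # [1, 2, 1, 3]  (C is the 3rd distinct word)
```

Either numbering lets the sentence be rebuilt, as long as the look-up uses the matching list: the full word list for `first_index`, the distinct words for `dense`. Note also that every words.index call scans the list from the start, so the index-based version does work proportional to the square of the number of words in the worst case.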
      

      【Comments】: