I have an issue with "collections" module in Python: answers

【Question title】: I have an issue with "collections" module in Python
【Posted】: 2018-06-04 19:54:12
【Question】:

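I need to rewrite sentences (read from a txt file) so that each letter in a word is repeated as many times as it occurs in that word.

Example:

"I need a drink" must become: "I neeeed a drink"

Here is the code. I know it's bad: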

import collections

c = collections.Counter()

words_title = []

new_word = ''

new_word2 = ''

with open("file.txt", "r", encoding = "utf-8-sig") as file:                   
    reading = file.read()
    splitted = reading.split()

words_title = [word.title() for word in reading]

for word in words_title:
    for wor in word:
        for wo in word:
            c[wo] += 1
            new_word += word

for word2 in new_word:
    word2 = word2 * c[word2]
    new_word2 += word2

print(c)
print(new_word)
print(new_word2)

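【Comments】:

Which part of your code are you asking about? What do you want it to do, what does it do instead, and where are you stuck trying to fix it? I'm pretty sure for wor in word: for wo in word: is not a useful thing to do, but I don't know what you wanted it to do, so I don't know how to fix it.
The middle part, to be exact. In my example it produces 4 letters where there should be 2 (e.g. "ccccooooccccoooonut"). I want the word to be "ccooccoonut".
What your nested loop does is iterate over every pair of characters. For example, if word is 'abc', you are effectively looping over 'aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc'. Except that you never use wor, so you are really just looping over 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'. What is that supposed to do?
I'd suggest leaving the file handling out of the question, since it is irrelevant. The question should be simplified to use start_string = "whatever" instead of loading it from a file. That also makes testing easier for you and for us.
I have a whole sentence. I should split it into words, then split the words into letters, and then loop over those letters.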
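
The nested-loop behaviour described in the comments can be checked directly. The following is a minimal sketch, assuming the sample word 'abc' from the comment; the names `nested` and `single` are illustrative and do not appear in the original post.

```python
from collections import Counter

word = "abc"

# Nested loops as in the question: the outer variable is never used,
# so the inner loop simply runs len(word) times over the same letters.
nested = Counter()
for _outer in word:
    for inner in word:
        nested[inner] += 1

# A single pass counts each letter once per occurrence.
single = Counter(word)

print(nested)  # every count is inflated by a factor of len(word)
print(single)
```

With the nested loops every count is multiplied by the word length, which is why the repeated letters come out longer than intended.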

Tags: python collections

【Solution 1】:

Here is my attempt at what I guess you want to do:

from collections import Counter

start_string = '  coconuts taste great '

words = start_string.strip().split() # get single words from string

for word in words: # loop over individual words
    c = Counter(word) # count letters
    new_word = ''
    for w in word: # loop over letters in the original word
        new_word += w*c[w] # write the new word
    print(new_word)  # print() call works on Python 3

#ccooccoonuts
#ttastte
#great
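
If the rebuilt sentence is needed as a single string rather than printed word by word, the same idea can be wrapped in functions. This is only a sketch under that assumption; the function names `repeat_letters` and `repeat_letters_in_sentence` are hypothetical and not part of the original answer.

```python
from collections import Counter


def repeat_letters(word):
    """Repeat each letter as many times as it occurs in the word."""
    counts = Counter(word)  # letter counts for this word only
    return ''.join(letter * counts[letter] for letter in word)


def repeat_letters_in_sentence(sentence):
    """Apply repeat_letters to every whitespace-separated word."""
    return ' '.join(repeat_letters(word) for word in sentence.split())


result = repeat_letters_in_sentence('  coconuts taste great ')
print(result)
```

Calling `split()` without arguments already discards leading, trailing, and repeated whitespace, so a separate `strip()` is not required here.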

【Comments】:

Exactly what I needed! Thanks a lot! ;)

【Solution 2】:

sentence = "I need a drink"
words = sentence.split()

out_sentence = ""

for word in words:
    for letter in word:
        for _ in range(word.count(letter)):
            out_sentence += letter
    out_sentence += " "

out_sentence = out_sentence[:-1]

print(out_sentence)

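【Comments】:

Exactly what I needed! Thanks a lot! ;)

【Solution 3】:

Here is a one-liner similar to Dux's answer, but it uses a generator expression and joins all the character sequences once at the end rather than on every iteration: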

from collections import Counter

s = 'I need a drink, coconut'

print(''.join(c * n[c] for w in s.split() for n in (Counter(w + ' '),) for c in w + ' '))
# Output: I neeeed a drink, ccooccoonut

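Note that the second "for" iterates only once, in order to bind the Counter object to n; this small trick ensures that a new Counter object is created only once per word w rather than once per character c.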
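
To make the single-iteration trick easier to follow, here is a sketch that compares the one-liner with an expanded loop. The names `one_liner`, `expanded`, `parts`, and `padded` are illustrative and not from the original answer.

```python
from collections import Counter

s = 'I need a drink, coconut'

# One-liner from the answer: `for n in (Counter(...),)` loops over a
# one-element tuple, so it runs once per word and just binds n.
one_liner = ''.join(
    c * n[c] for w in s.split() for n in (Counter(w + ' '),) for c in w + ' '
)

# Expanded form: the Counter is built once per word, before the letter loop.
parts = []
for w in s.split():
    padded = w + ' '          # the trailing space separates the words
    n = Counter(padded)
    for c in padded:
        parts.append(c * n[c])
expanded = ''.join(parts)

print(one_liner == expanded)
print(repr(expanded))
```

Both forms leave one trailing space at the end of the result, because a space is appended to every word, including the last one.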

【Comments】: