【Question title】: Storing a string and a set in a dictionary
【Posted】: 2014-06-14 02:55:14
【Question description】:

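I am trying to build a dictionary that contains the unique words appearing in an input file, together with the line numbers for each unique word. This is what I have so far.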

def unique_word_index():
    line_no = 0
    word_set=set()
    line_no_set=set()
    word_map = {}
    for line in input_file:

       word_lst=line.strip().split()
       word_lst=[w.lower().strip(string.punctuation) for w in word_lst]
       line_no += 1

       for word in word_lst:
           if word !="":
               line_no_set.add(line_no)
           if 'word' in word_map.keys():
                word_map['word']=line_no_set
           else:
                word_map['word']=''

【Comments】:

  • Well, what are you asking?
  • How should I build a dictionary of the form ['word': line_no]?

Tags: python dictionary set


【Solution 1】:

Try the following code:

def unique_words(input_file):
    wordlist = {}  # word -> index of the last line it was seen on
    dups = []      # every word in the file, repeats included
    copy = []      # words already deleted from wordlist
    # A context manager closes the file once reading is done.
    with open(input_file) as f:
        for index, value in enumerate(f):
            words = value.split()
            for word in words:
                wordlist[word] = index
                dups.append(word)
    for word in dups:
        if dups.count(word) != 1 and word not in copy:
            del wordlist[word]
            copy.append(word)
    for item in wordlist:
        print('The unique word '+item+' occurs on line '+str(wordlist[item]))

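It adds every word to both a dictionary and a list, then walks the list to check that each word occurs only once. Any word that occurs more than once is deleted from the dictionary, leaving only the unique words.

It runs as follows (line numbers are zero-based):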

>>> unique_words('test.txt')
The unique word them occurs on line 2
The unique word I occurs on line 1
The unique word there occurs on line 0
The unique word some occurs on line 2
The unique word times occurs on line 3
The unique word say occurs on line 2
The unique word too occurs on line 3
The unique word have occurs on line 1
The unique word of occurs on line 2
>>> 

【Comments】:

    【Solution 2】:

    You can do it like this:

    import string

    def unique_words(input_file):
        word_map = dict()
        for i, line in enumerate(input_file):
            words = line.strip().split()
            for word in words:
                word = word.lower().strip(string.punctuation)
                if word in word_map:
                    word_map[word] = None
                else:
                    word_map[word] = i
        return dict((w, i) for w, i in word_map.items() if i is not None)
    

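    It adds each word and its line number to the dictionary word_map. When a word is seen more than once, its line number is replaced with None. The last line drops the entries whose line number is None.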
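To make the None-marking idea concrete, here is a small usage sketch. It restates the function so the snippet runs on its own. The sample text is made up for illustration, and `io.StringIO` merely stands in for an open file, on the assumption that `input_file` can be any iterable of lines.

```python
import io
import string

def unique_words(input_file):
    # Same logic as the answer: None marks a word seen more than once.
    word_map = {}
    for i, line in enumerate(input_file):
        for word in line.strip().split():
            word = word.lower().strip(string.punctuation)
            if word in word_map:
                word_map[word] = None
            else:
                word_map[word] = i
    # Keep only the words that were never marked as repeated.
    return {w: i for w, i in word_map.items() if i is not None}

# Hypothetical sample input; "hello" appears on both lines.
sample = io.StringIO("Hello there, world.\nHello again!\n")
result = unique_words(sample)
print(result)
```

Here "hello" is dropped because it occurs on both lines, and the remaining words map to zero-based line indexes, since `enumerate` starts at 0 by default.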

    Now a more compact version, using Counter:

    import string
    from collections import Counter
    
    def unique_words(input_file):
        words = [(i, w.lower().strip(string.punctuation))
                for i, line in enumerate(input_file) for w in line.strip().split()]
        word_counts = Counter(w for _, w in words)
        return dict((w, i) for i, w in words if word_counts[w] == 1)
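Both versions above keep only words that occur exactly once. If the goal is instead what the question title describes, a dictionary whose keys are words and whose values are sets of line numbers, a minimal sketch might look like the following. This is one possible reading of the question, not something either answer states. The function name `word_line_index` and the sample text are hypothetical, and line numbers start at 1 here to match the question's `line_no` counter.

```python
import io
import string
from collections import defaultdict

def word_line_index(input_file):
    # Map each word to the set of 1-based line numbers it appears on.
    index = defaultdict(set)
    for line_no, line in enumerate(input_file, start=1):
        for raw in line.split():
            word = raw.lower().strip(string.punctuation)
            if word:  # skip tokens that were pure punctuation
                index[word].add(line_no)
    return dict(index)

# Hypothetical sample input; io.StringIO stands in for an open file.
sample = io.StringIO("Hello there, world.\nHello again -- world!\n")
index = word_line_index(sample)
for word in sorted(index):
    print(word, sorted(index[word]))
```

Using `defaultdict(set)` avoids the explicit "is the key already present" check from the question's code, and a fresh set is created per word rather than one set shared across all words.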
    

    【Comments】:
