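[Posted]: 2017-02-25 18:02:05
[Question]:
This is related to "How to append to the end of an empty list?", but I don't have enough reputation to comment there yet, so I'm posting a new question here.
I need to append terms to an empty list of lists. I start with: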
Talks[eachFilename][TermVectors]=
[['paragraph','1','text'],
['paragraph','2','text'],
['paragraph','3','text']]
I want to end up with:
Talks[eachFilename][SomeTermsRemoved]=
[['paragraph','text'],
['paragraph','2'],
['paragraph']]
Talks[eachFilename][SomeTermsRemoved] starts out empty. I can't assign the values I want by index:
Talks[eachFilename][SomeTermsRemoved][0][0]='paragraph'
Talks[eachFilename][SomeTermsRemoved][0][1]='text'
Talks[eachFilename][SomeTermsRemoved][1][0]='paragraph'
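and so on (IndexError: list index out of range). If I force-fill the list with placeholder strings and then try to change them, I get an error saying that strings are immutable.
So how do I specify that I want Talks[eachFilename][SomeTermsRemoved][0] to be ['paragraph','text'], Talks[eachFilename][SomeTermsRemoved][1] to be ['paragraph','2'], and so on?
.append works, but it only produces one long column of terms rather than a set of lists.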
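For illustration, here is a minimal sketch of one way to get the nested result: build a fresh inner list for each paragraph and append that whole list to the outer one, instead of appending individual terms to the outer list. The names `term_vectors`, `terms_to_remove`, and `some_terms_removed` are hypothetical stand-ins for the dict entries above, and the per-paragraph removal sets are assumptions chosen only to reproduce the desired output.

```python
# Hypothetical stand-in for Talks[eachFilename]['TermVectors'].
term_vectors = [
    ['paragraph', '1', 'text'],
    ['paragraph', '2', 'text'],
    ['paragraph', '3', 'text'],
]

# Assumed per-paragraph terms to drop (not from the post); picked to match
# the desired result shown above.
terms_to_remove = [{'1'}, {'text'}, {'3', 'text'}]

# Starts empty, so some_terms_removed[0][0] = ... would raise IndexError.
some_terms_removed = []

for paragraph, unwanted in zip(term_vectors, terms_to_remove):
    kept = []                          # fresh inner list for this paragraph
    for term in paragraph:
        if term not in unwanted:
            kept.append(term)          # grow the inner list term by term
    some_terms_removed.append(kept)    # append the inner list once per paragraph

print(some_terms_removed)
```

The key point is that the outer list receives one `append` per paragraph (a list), while individual terms are appended only to the inner list.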
More specifically, I have a number of lists initialized inside a dict:
Talks = {}
Talks[eachFilename]= {}
Talks[eachFilename]['StartingText']=[]
Talks[eachFilename]['TermVectors']=[]
Talks[eachFilename]['TermVectorsNoStops']=[]
eachFilename is populated from a list of text files, for example:
Talks[eachFilename]=['filename1','filename2']
StartingText holds several lines of long text (one paragraph per entry):
Talks[filename1][StartingText]=['This is paragraph one','paragraph two']
TermVectors is populated by the NLTK package and contains lists of terms, still grouped by their original paragraphs:
Talks[filename1][TermVectors]=
[['This','is','paragraph','one'],
['paragraph','two']]
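I want to process TermVectors further while keeping the original list-per-paragraph structure. The following instead creates a list with one term per row: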
for eachFilename in Talks:
    for eachTerm in range( 0, len( Talks[eachFilename]['TermVectors'] ) ):
        for term in Talks[eachFilename]['TermVectors'][ eachTerm ]:
            if unicode(term) not in stop_words:
                Talks[eachFilename]['TermVectorsNoStops'].append( term )
Result (I lose my paragraph structure):
Talks[filename1][TermVectorsNoStops]=
[['This'],
['is'],
['paragraph'],
['one'],
['paragraph'],
['two']]
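As a hedged sketch of how the loop above could keep the paragraph grouping, a nested list comprehension can rebuild one inner list per paragraph. The `stop_words` set here is a small hypothetical stand-in (the post does not show how it is defined), the sample data mirrors the TermVectors example above, and lowercasing each term before the lookup is an assumption rather than something the original code does.

```python
# Hypothetical stop-word set; the real stop_words in the post is not shown.
stop_words = {'this', 'is'}

# Sample data mirroring the structure described in the question.
Talks = {
    'filename1': {
        'TermVectors': [
            ['This', 'is', 'paragraph', 'one'],
            ['paragraph', 'two'],
        ],
        'TermVectorsNoStops': [],
    }
}

for each_filename in Talks:
    # Outer comprehension: one inner list per paragraph.
    # Inner comprehension: keep only the terms that are not stop words.
    Talks[each_filename]['TermVectorsNoStops'] = [
        [term for term in paragraph if term.lower() not in stop_words]
        for paragraph in Talks[each_filename]['TermVectors']
    ]

print(Talks['filename1']['TermVectorsNoStops'])
```

Because each paragraph yields its own inner list, a paragraph made up entirely of stop words would still appear as an empty list, so the paragraph indices stay aligned with TermVectors.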
[Comments]:
- It is unclear what the question is here. How is Talks[eachFilename][SomeTermsRemoved] defined in your code?