【Posted】:2021-05-11 22:42:37
【Question】:
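While using Word2Vec, I got the following message:
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:3: DeprecationWarning: Call to deprecated `__getitem__` (Method will be removed in 4.0.0, use self.wv.__getitem__() instead).
  This is separate from the ipykernel package so we can avoid doing imports until
Why does this appear, and how can I fix it?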
import re
import nltk
import numpy as np
from gensim.models import Word2Vec
from nltk.corpus import stopwords
from nltk.tokenize import sent_tokenize
nltk.download('punkt')
sentences=sent_tokenize(text)  # `text` is the input document (not shown in the question)
sentences
nltk.download('stopwords')
sentences_clean=[re.sub(r'[^\w\s]','',sentence.lower()) for sentence in sentences] # strip punctuation, lowercase
stop_words = stopwords.words('english')
sentence_tokens=[[words for words in sentence.split(' ') if words not in stop_words] for sentence in sentences_clean] # remove stop words
sentence_tokens
# WORD EMBEDDING
w2v=Word2Vec(sentence_tokens,size=1,min_count=1,iter=1000)
sentence_embeddings=[[w2v[word][0] for word in words] for words in sentence_tokens] # w2v[word] indexes the model directly, which is the deprecated lookup
max_len=max([len(tokens) for tokens in sentence_tokens]) # length of the longest sentence
sentence_embeddings=[np.pad(embedding,(0,max_len-len(embedding)),'constant') for embedding in sentence_embeddings] # pad so that every sentence has the same length
#print(sentence_embeddings) # the words' positions in the vector space
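On the cause: this is a warning rather than an error, so on gensim versions before 4.0 the code still runs. The message is raised by indexing the model directly (`w2v[word]`), and it names the replacement itself: index the model's `wv` attribute, so in the code above `w2v[word][0]` would become `w2v.wv[word][0]`. The second line of the message is just the source line of `ipykernel_launcher.py` that Python echoes along with the warning. The sketch below illustrates the mechanism with standard-library Python only; `ToyModel` and `ToyKeyedVectors` are hypothetical stand-ins made up for this illustration, not gensim's actual classes.

```python
import warnings


class ToyKeyedVectors:
    """Hypothetical stand-in for the object stored in `model.wv`."""

    def __init__(self, vectors):
        self._vectors = vectors  # maps word -> list of floats

    def __getitem__(self, word):
        return self._vectors[word]


class ToyModel:
    """Hypothetical stand-in for a model that keeps its vectors in `.wv`."""

    def __init__(self, vectors):
        self.wv = ToyKeyedVectors(vectors)

    def __getitem__(self, word):
        # Deprecated shortcut: emit a warning, then delegate to the `wv` lookup.
        warnings.warn(
            "Call to deprecated `__getitem__` (use self.wv.__getitem__() instead).",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.wv[word]


model = ToyModel({"cat": [0.25], "dog": [-0.5]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = model["cat"][0]     # indexing the model: triggers the warning
    n_after_old = len(caught)
    new_style = model.wv["cat"][0]  # indexing model.wv: same value, no warning
    n_after_new = len(caught)
```

Both lookups return the same value; only the first one records a `DeprecationWarning`. If the code is later moved to gensim 4.x, note that the `size` and `iter` arguments of `Word2Vec` were renamed to `vector_size` and `epochs` in that release.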
【Comments】:
Tags: nlp extract stanford-nlp word2vec summarization