【Question title】:NLTK panlex_lite giving me error
【Posted】:2016-06-30 16:40:10
【Question】:

I am trying to use NLTK for my NLP learning in Python.

A package called "panlex_lite" keeps giving me an error, so I tried the following:

import nltk
nltk.download('all', halt_on_error = False)

It gave me the following error:

[nltk_data]    | Downloading package panlex_lite to
[nltk_data]    |     /Users/Harshil/nltk_data...
[nltk_data]    |   Unzipping corpora/panlex_lite.zip.
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    nltk.download('all', halt_on_error = False)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 664, in download
    for msg in self.incr_download(info_or_id, download_dir, force):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 543, in incr_download
    for msg in self.incr_download(info.children, download_dir, force):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 529, in incr_download
    for msg in self._download_list(info_or_id, download_dir, force):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 572, in _download_list
    for msg in self.incr_download(item, download_dir, force):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 549, in incr_download
    for msg in self._download_package(info, download_dir, force):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 638, in _download_package
    for msg in _unzip_iter(filepath, zipdir, verbose=False):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 2039, in _unzip_iter
    outfile.write(contents)
OSError: [Errno 22] Invalid argument

Is there any way to fix this? I tried passing "halt_on_error = False", but it still gives me the error.

Thanks.

【Comments】:

    Tags: python nlp nltk


    【Solution 1】:

    Here is a "dirty" hack:

    $ rm /Users/Harshil/nltk_data/corpora/panlex_lite.zip
    $ rm -r /Users/Harshil/nltk_data/corpora/panlex_lite
    $ python
    
    >>> import nltk
    >>> dler = nltk.downloader.Downloader()
    >>> dler._update_index()
    >>> dler._status_cache['panlex_lite'] = 'installed'  # Trick the downloader into treating panlex_lite as already installed.
    >>> dler.download('all')
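
The same idea can be sketched without touching the private `_status_cache`: list the package ids from the index and download each one except `panlex_lite`. This is a minimal sketch, not a tested fix for this exact error. `select_packages` and `download_all_except` are hypothetical helper names; `Downloader.packages()` and `Downloader.download()` are the public NLTK calls it relies on, and catching `OSError` is an assumption based on the traceback above.

```python
def select_packages(all_ids, skip=("panlex_lite",)):
    """Return the ids from all_ids that are not in skip, preserving order."""
    skipped = set(skip)
    return [pkg_id for pkg_id in all_ids if pkg_id not in skipped]


def download_all_except(skip=("panlex_lite",), download_dir=None):
    """Download every indexed NLTK package except those in skip.

    Returns the ids that failed, so one bad package does not stop the rest.
    """
    # Imported here so select_packages() stays usable without NLTK installed.
    from nltk.downloader import Downloader

    downloader = Downloader(download_dir=download_dir)
    # packages() fetches the package index, so this needs network access.
    all_ids = [pkg.id for pkg in downloader.packages()]

    failed = []
    for pkg_id in select_packages(all_ids, skip):
        try:
            ok = downloader.download(pkg_id, quiet=True)
        except OSError:
            # The error in the question surfaced as OSError during unzipping.
            ok = False
        if not ok:
            failed.append(pkg_id)
    return failed
```

Calling `download_all_except()` then plays the role of `dler.download('all')` above, and the returned list shows which packages still need attention.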
    

    Alternatively, try earthy:

    pip install earthy
    

    TL;DR

    import earthy
    path_to_nltk_data = '/home/yourusername/nltk_data/'
    earthy.download('all', path_to_nltk_data) # Excludes the third party (non-NLTK) packages.
    

    To download only panlex_lite:

    import earthy
    earthy.download('panlex_lite', path_to_nltk_data)
    

    To download all third-party datasets that are not hosted natively in the nltk_data GitHub repository:

    import earthy
    earthy.download('third_party', path_to_nltk_data)
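
If panlex_lite itself is needed, note that the log in the question shows the archive was downloaded and only the unzip step failed, at a single `outfile.write(contents)` call. One possible cause (an assumption, not confirmed in this thread) is that one very large file is written in a single call. The sketch below extracts an already downloaded archive with the standard library's `zipfile.ZipFile.extractall`, which copies each member to disk in chunks. `unzip_streaming` is a hypothetical helper name.

```python
import zipfile


def unzip_streaming(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the member names.

    extractall() streams each member to disk instead of reading the
    whole file into memory and writing it with one call.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

With the paths from the question, this would be called as `unzip_streaming('/Users/Harshil/nltk_data/corpora/panlex_lite.zip', '/Users/Harshil/nltk_data/corpora')`, assuming the zip file has not been deleted yet.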
    

    【Comments】:
