【Question Title】:Python Flask application on IBM Cloud/Bluemix using the TextBlob library throws textblob.exceptions.MissingCorpusError
【Posted】:2018-07-04 05:10:56
【Question】:

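I am trying to run a Python Flask application with some text-analysis features (using TextBlob) on IBM Cloud/Bluemix. After deploying the app with the cf push command, I get the error shown below. According to the TextBlob documentation, this exception is raised when a user tries to use a feature that requires a dataset or model that is not present on the user's system.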

error:
Error while running the app:
textblob.exceptions.MissingCorpusError
MissingCorpusError: 
Looks like you are missing some required data for this feature.

To download the necessary data, simply run

python -m textblob.download_corpora
or use the NLTK downloader to download the missing data: 
http://nltk.org/data.html
If this doesn't fix the problem, file an issue at 
https://github.com/sloria/TextBlob/issues.

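My problem: I have added Flask, TextBlob and NLTK to my requirement.txt, as shown below. Please suggest how I can run the python -m textblob.download_corpora command so that the missing dataset/model becomes available in the Bluemix environment. If that command cannot be run there, is there another way to make this work? Note: the app runs perfectly on my local system.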

requirement.txt content:
Flask==0.12.2
cloudant==2.4.0
textblob==0.15.1
nltk==3.3

This is the error/warning I get while deploying the app with the push command:

        -----> Downloading NLTK corpora...
!     nltk.txt not found, not downloading any corpora

Edit in response to Henrik's questions: when I run python -m textblob.download_corpora, the corpora below are downloaded on my system. I listed the same corpora in my nltk.txt file.

 [nltk_data] Downloading package brown to
 [nltk_data]     C:\Users\MohanaKrishnaV\AppData\Roaming\nltk_data...
 [nltk_data]   Package brown is already up-to-date!
 [nltk_data] Downloading package punkt to
 [nltk_data]     C:\Users\MohanaKrishnaV\AppData\Roaming\nltk_data...
 [nltk_data]   Package punkt is already up-to-date!
 [nltk_data] Downloading package wordnet to
 [nltk_data]     C:\Users\MohanaKrishnaV\AppData\Roaming\nltk_data...
 [nltk_data]   Package wordnet is already up-to-date!
 [nltk_data] Downloading package averaged_perceptron_tagger to
 [nltk_data]     C:\Users\MohanaKrishnaV\AppData\Roaming\nltk_data...
 [nltk_data]   Package averaged_perceptron_tagger is already up-to-
 [nltk_data]       date!
 [nltk_data] Downloading package conll2000 to
 [nltk_data]     C:\Users\MohanaKrishnaV\AppData\Roaming\nltk_data...
 [nltk_data]   Package conll2000 is already up-to-date!
 [nltk_data] Downloading package movie_reviews to
 [nltk_data]     C:\Users\MohanaKrishnaV\AppData\Roaming\nltk_data...
 [nltk_data]   Package movie_reviews is already up-to-date!
 Finished.

This is what my nltk.txt looks like:

brown wordnet
averaged_perceptron_tagger
brown
sentence_polarity
sentiwordnet
subjectivity
words
punkt
maxent_treebank_pos_tagger
movie_reviews
conll2000

I added some extra corpora to my nltk.txt, listed below; I hope that is not the problem:

 sentence_polarity
 sentiwordnet
 subjectivity
 words

The error log looks like this:

   -------> Buildpack version 1.5.22
   -----> Installing pip-pop (0.1.1)
   Downloaded [https://buildpacks.cloudfoundry.org/dependencies/manual-binaries/pip-pop/pip-pop-0.1.1-d410583a.tar.gz]
   -----> Installing pipenv (4.0.1)
   Downloaded [https://buildpacks.cloudfoundry.org/dependencies/manual-binaries/pipenv/pipenv-4.0.1-148f753f.tar.gz]
    $ pip install -r requirements.txt
   You are using pip version 9.0.1, however version 10.0.1 is available.
   You should consider upgrading via the 'pip install --upgrade pip' command.
   You are using pip version 9.0.1, however version 10.0.1 is available.
   You should consider upgrading via the 'pip install --upgrade pip' command.
   -----> Downloading NLTK corpora...
   -----> Downloading NLTK packages: brown wordnet
   averaged_perceptron_tagger
   brown
   sentence_polarity
   sentiwordnet
   subjectivity
   words
   punkt
   maxent_treebank_pos_tagger
   movie_reviews
      [nltk_data] Downloading package brown to
      [nltk_data]     /tmp/contents525031002/deps/0/python/nltk_data...
      [nltk_data]   Package brown is already up-to-date!
      [nltk_data] Error loading wordnet : Package 'wordnet\r' not found in
      [nltk_data]     index
      Error installing package. Retry? [n/y/e]
    Traceback (most recent call last):
      File "/tmp/contents525031002/deps/0/python/lib/python2.7/runpy.py", line 174, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/tmp/contents525031002/deps/0/python/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/tmp/contents525031002/deps/0/python/lib/python2.7/site-packages/nltk/downloader.py", line 2272, in <module>
        halt_on_error=options.halt_on_error)
      File "/tmp/contents525031002/deps/0/python/lib/python2.7/site-packages/nltk/downloader.py", line 681, in download
        choice = input().strip()
    EOFError: EOF when reading a line
    Exit status 0
    Staging complete
    Uploading droplet, build artifacts cache...
    Uploading build artifacts cache...
    Uploading droplet...
    Uploaded build artifacts cache (64.3M)
    Uploaded droplet (105.6M)
    Uploading complete
    Stopping instance 6cbf3cbc-aef1-4a73-a7ab-d562a606fe5b
    Destroying container
    Successfully destroyed container

This is how I push my app: cf login >> [I enter my login details] >> cf push

【Discussion】:

    Tags: python flask ibm-cloud nltk textblob


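    【Solution 1】:

    It looks like there is no nltk.txt in the root directory of your deployed app. The Cloud Foundry Python buildpacks have built-in support for NLTK. The text file lists the corpora that need to be installed during deployment.

    Sample contents of nltk.txt: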

    wordnet averaged_perceptron_tagger brown sentence_polarity
    

    Make sure it is a single line, with no duplicates and no stray characters.
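
As a minimal sketch of that cleanup (not part of the original answer), the following Python helper collapses an nltk.txt into one space-separated line, drops duplicate names, and removes carriage returns left by Windows line endings. The function names `normalize_nltk_txt` and `rewrite_nltk_txt` are hypothetical, and the file is assumed to be UTF-8 text sitting in the app's root directory.

```python
def normalize_nltk_txt(text):
    """Return the package names in `text` as one space-separated line.

    str.split() with no arguments splits on any run of whitespace, so it
    discards '\r', '\n', tabs and repeated spaces in a single pass.
    """
    seen = set()
    names = []
    for name in text.split():
        if name not in seen:  # keep only the first occurrence of each name
            seen.add(name)
            names.append(name)
    return " ".join(names)


def rewrite_nltk_txt(path="nltk.txt"):
    """Rewrite the file at `path` in place (hypothetical helper).

    The file is opened in binary mode so that Python on Windows does not
    translate '\n' back into '\r\n' when writing.
    """
    with open(path, "rb") as f:
        raw = f.read().decode("utf-8")
    with open(path, "wb") as f:
        f.write((normalize_nltk_txt(raw) + "\n").encode("utf-8"))
```

Calling `rewrite_nltk_txt()` from the app's root directory before `cf push` would leave a single clean line; whether every name on that line exists in the NLTK index still has to be checked against the push log.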

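    【Comments】:

    • Hi Henrik, I followed your suggestion and added nltk.txt to the root folder. I also tried adding one of the dataset sets mentioned in the CF documentation, "brown wordnet". That package was installed when I pushed the app. The problem persisted, so I started adding more datasets, specifically the ones mentioned in download_corpora (brown, punkt, wordnet, conll2000, maxent_treebank_pos_tagger, movie_reviews), because I do not know which dataset would fix my problem. The same error still pops up. Please help!
    • Which corpora does your app use? Can you see in the logs that some corpora were downloaded? Did the error message change? Please add the command showing how you push the app.
    • Hi Henrik, I have edited my post with all the requested information. Let me know if you need more details.
    • I added more details. Your error shows that there is a control character (\r) in your file. Try putting everything on a single line, without duplicates.
    • Henrik, after putting everything on one line, the above error went away. Thank you very much for your help. I was also getting the error "502 Bad Gateway: Registered endpoint failed to handle the request". I suspected it might be memory-related and doubled the memory, and it now works. Thanks again for your help :)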
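
To illustrate the control-character diagnosis from the comments, here is a small hedged sketch that lists the names in an nltk.txt that would still carry a control character if the file were split on line feeds and spaces only. How the buildpack actually tokenizes the file is an assumption here, and `find_suspect_names` is a hypothetical helper, not part of NLTK or the buildpack.

```python
def find_suspect_names(text):
    """Return names that contain an ASCII control character.

    The text is split on '\n' and ' ' only (an assumed, naive tokenization),
    so a trailing '\r' from Windows line endings stays attached to the
    last name on each line.
    """
    suspects = []
    for line in text.split("\n"):
        for name in line.split(" "):
            if any(ord(ch) < 32 or ord(ch) == 127 for ch in name):
                suspects.append(name)
    return suspects


# Sample input modeled on the first two lines of the nltk.txt in the question,
# saved with Windows (CRLF) line endings.
sample = "brown wordnet\r\naveraged_perceptron_tagger\r\n"
suspects = find_suspect_names(sample)
```

With this sample, `suspects` contains `'wordnet\r'`, the same name the downloader reported as not found in the index.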