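【Posted】: 2019-09-19 06:46:19
【Question】:
Can someone explain the error below?
multiprocessing.pool.MaybeEncodingError: Error sending result: '[>]'.
Reason: 'NotImplementedError("[E112] Pickling a span is not supported, because spans are only views of the parent Doc and can't exist on their own. A pickled span would always have to include its Doc and Vocab, which has practically no advantage over pickling the parent Doc directly. So instead of pickling the span, pickle the Doc it belongs to or use Span.as_doc to convert the span to a standalone Doc object.")'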
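The E112 message says that whatever a worker sends back must be picklable, and that a span is not. Below is a minimal, stdlib-only sketch of that idea. `FakeSpan` is a hypothetical stand-in for spaCy's `Span` (it is not spaCy's class), and `pickle.dumps` stands in for the step where the pool transfers a result back to the parent process.

```python
import pickle


class FakeSpan:
    """Hypothetical stand-in for spaCy's Span: refuses to be pickled."""

    def __init__(self, text):
        self.text = text

    def __reduce__(self):
        # Mimics the behaviour described by E112; this is not spaCy's code.
        raise NotImplementedError("Pickling a span is not supported")


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, as a pool result must."""
    try:
        pickle.dumps(obj)
        return True
    except NotImplementedError:
        return False


chunks = [FakeSpan("Java developer"), FakeSpan("five years")]

# Sending the span-like objects themselves fails ...
spans_ok = can_pickle(chunks)

# ... while sending plain strings taken from them works.
texts = [chunk.text for chunk in chunks]
texts_ok = can_pickle(texts)

print(spans_ok, texts_ok)
```

With real spaCy objects, the equivalent move would be to return `span.text` (or a `Doc` obtained from `Span.as_doc`, as the message suggests) rather than the span.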
Here is my code:
resume_parser2.py
import multiprocessing as mp
import os

from spacy.matcher import Matcher

# nlp (a loaded spaCy pipeline) and utils (my helper module) are defined elsewhere.


class Resume_parser2(object):
    inputString = ''
    skill = ['Java']

    def __init__(self, resume):
        self.__resume = resume  # path of the resume file, read by extract_text below
        self.__matcher = Matcher(nlp.vocab)
        self.__skills = {
            'skills': None
        }
        self.__text = utils.extract_text(self.__resume, os.path.splitext(self.__resume)[1])
        self.__text = ' '.join(self.__text.split())
        self.__nlp = nlp(self.__text)
        self.__noun_chunks = list(self.__nlp.noun_chunks)

    def __get_basic_details(self):
        skills = utils.extract_skills(self.__nlp, self.__noun_chunks)
        self.__skills['skills'] = skills
        return

    def check_skills(self):
        skill = ['Java']
        if skill in self.__skills:
            return self.__skills


def filtered_resume(resume):
    parser = Resume_parser2(resume)
    return parser.check_skills


if __name__ == '__main__':
    pool = mp.Pool(mp.cpu_count())
    resumes = []
    data = []
    for root, directories, filenames in os.walk('resumes'):
        for filename in filenames:
            file = os.path.join(root, filename)
            resumes.append(file)
    results = [pool.apply_async(filtered_resume, args=(x,)) for x in resumes]
    results = [p.get() for p in results]
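utils.extract_text is a function that extracts the text from a file, and utils.extract_skills is a function that extracts the skill set from that text. Both of them extract their data correctly.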
Api.py
from resume_parser2 import Resume_parser2
from flask import Flask, request, redirect, url_for, send_from_directory, jsonify
from werkzeug.utils import secure_filename
import json
import multiprocessing as mp
import pandas as pd
import os

direct = 'E:/parsertool/backupresumes'
app = Flask(__name__)


def print_cyan(text):
    print("\033[96m {}\033[00m".format(text))


def extract_from_directory(directory):
    if os.path.exists(directory):
        pool = mp.Pool(mp.cpu_count())
        resumes = []
        for root, directories, filenames in os.walk(directory):
            for filename in filenames:
                extension = os.path.splitext(filename)[1]
                if extension == '.pdf' or extension == '.docx':
                    file = os.path.join(root, filename)
                    resumes.append(file)
        results = pool.map(filtered_resume, resumes)
        pool.close()
        pool.join()
        return results
    else:
        return 'Directory not found.'


def filtered_resume(resume):
    parser = Resume_parser2(resume)
    return parser.check_skills


@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        #files = request.files.getlist('files')
        #folder = dirname(files[0].filename)
        #print(folder)
        return jsonify(extract_from_directory(direct))


if __name__ == '__main__':
    app.run(debug=True)
I can't make sense of this error. Can someone explain what it means and how to fix it?
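One thing that may be worth checking in the code above: `filtered_resume` returns `parser.check_skills` without calling it, so the worker's result is a bound method rather than the skills dict. Pickling a bound method pickles the instance it belongs to, along with everything stored on that instance, which in this case would include the list of noun chunks. The sketch below shows that effect with the standard library only. `Unpicklable` and `Parser` are hypothetical names standing in for a span and for `Resume_parser2`; this is an illustration of the mechanism, not a confirmed diagnosis.

```python
import pickle


class Unpicklable:
    """Hypothetical stand-in for an object that refuses to be pickled."""

    def __reduce__(self):
        raise NotImplementedError("cannot pickle this object")


class Parser:
    """Hypothetical parser that keeps an unpicklable object on the instance."""

    def __init__(self):
        self.chunks = [Unpicklable()]
        self.skills = {"skills": ["Java"]}

    def check_skills(self):
        return self.skills


def try_pickle(obj):
    """Return 'ok' if obj can be pickled, otherwise the exception's type name."""
    try:
        pickle.dumps(obj)
        return "ok"
    except Exception as exc:
        return type(exc).__name__


parser = Parser()

# The bound method drags the whole Parser instance into the pickle.
method_result = try_pickle(parser.check_skills)

# Calling the method yields a plain dict, which pickles without trouble.
call_result = try_pickle(parser.check_skills())

print(method_result, call_result)
```

If this is what is happening, returning `parser.check_skills()` (with parentheses) and making sure the returned value contains only plain Python data would be the first thing to try.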
【Discussion】:
Tags: python-3.x flask multiprocessing python-multiprocessing