[Posted]: 2021-05-04 14:36:01
[Question]:
I am new to Google Colab and Python.
I have loaded a file from Google Drive and am trying to run a MapReduce job with mrjob.
import sys
sys.argv=['0']
from mrjob.job import MRJob
from mrjob.protocol import JSONProtocol, RawValueProtocol
from mrjob.step import MRStep

#creating an mrjob
class averagerating(MRJob):
    def steps(self):
        return [MRStep(mapper=self.mapper_average_rating,reducer=self.reducer_average_rating)]

    #creating a mapping function
    def mapper_average_rating(self):
        x_teleplay=dfR_new['teleplay_id']
        y_rating=dfR_new.iloc[:, -1:].mean(axis=1)
        average_rate_per_id=dfR_new.groupby(['teleplay_id'])[['rating']].mean()
        yield y_rating, x_teleplay

    #creating a reducer function
    def reducer_average_rating(self,key,values):
        key=average_rate_per_id['teleplay_id']
        values=average_rate_per_id['rating']
        yield key,values
        print(key,values)

#main function
if __name__ == "__main__":
    averagerating.run()
However, it raises a TypeError.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-44-d5d6bd5175a1> in <module>()
27 #main function
28 if __name__ == "__main__":
---> 29 averagerating.run()
7 frames
/usr/local/lib/python3.7/dist-packages/mrjob/job.py in run(cls)
614 """
615 # load options from the command line
--> 616 cls().execute()
617
618 def run_job(self):
/usr/local/lib/python3.7/dist-packages/mrjob/job.py in execute(self)
685
686 else:
--> 687 self.run_job()
688
689 def make_runner(self):
/usr/local/lib/python3.7/dist-packages/mrjob/job.py in run_job(self)
632 stream=log_stream)
633
--> 634 with self.make_runner() as runner:
635 try:
636 runner.run()
/usr/local/lib/python3.7/dist-packages/mrjob/job.py in make_runner(self)
702
703 runner_class = self._runner_class()
--> 704 kwargs = self._runner_kwargs()
705
706 # screen out most false-ish args so that it's readable
/usr/local/lib/python3.7/dist-packages/mrjob/job.py in _runner_kwargs(self)
725 # don't screen out irrelevant opts (see #1898)
726 self._kwargs_from_switches(set(_RUNNER_OPTS)),
--> 727 self._job_kwargs(),
728 )
729
/usr/local/lib/python3.7/dist-packages/mrjob/job.py in _job_kwargs(self)
244 self.jobconf(), self.options.jobconf),
245 libjars=combine_lists(
--> 246 self.libjars(), self.options.libjars),
247 partitioner=self.partitioner(),
248 sort_values=self.sort_values(),
/usr/local/lib/python3.7/dist-packages/mrjob/job.py in libjars(self)
1371 ``--libjars`` option
1372 """
-> 1373 script_dir = os.path.dirname(self.mr_job_script())
1374
1375 paths = []
/usr/lib/python3.7/posixpath.py in dirname(p)
154 def dirname(p):
155 """Returns the directory component of a pathname"""
--> 156 p = os.fspath(p)
157 sep = _get_sep(p)
158 i = p.rfind(sep) + 1
TypeError: expected str, bytes or os.PathLike object, not NoneType
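I would like to ask what is wrong with the last line of my code, and how I can fix this error.
I added sys.argv=['0'] because if I write sys.argv[] or do not set sys.argv at all, I get a "list index out of range" error.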
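The last frames of the traceback can be reproduced without mrjob. There, libjars() passes the result of self.mr_job_script() to os.path.dirname, and the error message says that value was None. The sketch below only mirrors that one call; the helper name script_dir and the path /content/job.py are made up for illustration and are not part of mrjob.

```python
import os


def script_dir(script_path):
    """Mirror the failing traceback line: os.path.dirname(<script path>)."""
    return os.path.dirname(script_path)


# With a real path string the call succeeds (hypothetical path).
ok = script_dir("/content/job.py")

# With None, as in the traceback, os.fspath() rejects the argument.
error_message = None
try:
    script_dir(None)
except TypeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

So the TypeError is not raised by the mapper or reducer. It is raised earlier, while mrjob builds the runner and cannot determine the path of the job script.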
[Comments]:
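- Is this the full error message? It shows the first and last frames, but it is missing the `7 frames`, which may have useful information. The error shows a problem with some path which is `None`, but it does not show where it gets this path from.
- Why do you use `sys.argv=['0']`? Usually the first item in this list should be the path to the script. I am only guessing: maybe the code uses this path for some reason, but you set a wrong path and it cannot find it, so it gives `None`, which causes your error.
- @furas Thanks for your comment! I have updated the error message!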
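The point about sys.argv made in the comments can be checked with the standard library alone. The following is a minimal sketch, not mrjob code: it writes a throwaway script (the file name show_argv.py is made up) and runs it in a child interpreter, which then reports its own sys.argv[0].

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical two-line script that just reports its own sys.argv[0].
SCRIPT_SOURCE = "import sys\nprint(sys.argv[0])\n"


def reported_script_path():
    """Run a throwaway script in a child interpreter and return what it saw."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        script_path = os.path.join(tmp_dir, "show_argv.py")
        with open(script_path, "w") as handle:
            handle.write(SCRIPT_SOURCE)
        result = subprocess.run(
            [sys.executable, script_path],
            capture_output=True,
            text=True,
            check=True,
        )
        return script_path, result.stdout.strip()


script_path, reported = reported_script_path()
print("launched:", script_path)
print("sys.argv[0] in child:", reported)
```

When Python runs a file, sys.argv[0] names that file. A notebook cell has no such file, and replacing the list with ['0'] only silences the index error without supplying a real script path. One commonly suggested workaround, stated here as an assumption rather than a verified fix for this case, is to save the job class into its own .py file and run that file as a separate process, so that the job has a real script path.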
Tags: python mapreduce google-colaboratory mrjob