【Posted】: 2012-11-15 12:39:00
【Question】:
I'm new to Hadoop Streaming and ran into a problem while learning MapReduce.
Here is my mapper.py:
#!/usr/bin/env python
import sys

# input comes from STDIN (standard input)
for line in sys.stdin:
    # remove leading and trailing whitespace
    line = line.strip()
    # split the line into words
    words = line.split()
    # increase counters
    for word in words:
        # write the results to STDOUT (standard output);
        # what we output here will be the input for the
        # Reduce step, i.e. the input for reducer.py
        #
        # tab-delimited; the trivial word count is 1
        print '%s\t%s' % (word, 1)
When I run the following:
hadoop@Chris-ubuntu:/home/test$ echo "I love China I love ieee I love python" | /home/test/mapper.py
I get this result:
: No such file or directory
However, I'm sure the file exists at that path; ls shows it there. How can I fix this?
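One common cause of exactly this message (an empty name before the colon, for a file that clearly exists) is a script saved with Windows line endings: the shebang line then asks for an interpreter whose name ends in a carriage return, and that interpreter is what cannot be found. Whether that is what happened here is an assumption, since the post does not show the file's raw bytes. The sketch below checks for that condition and rewrites the file with Unix line endings; the function names and the `fix_newlines.py` script name are hypothetical.

```python
import sys


def has_crlf_shebang(data):
    """Return True if the first line is a shebang that ends in a carriage return."""
    first_line = data.split(b"\n", 1)[0]
    return first_line.startswith(b"#!") and first_line.endswith(b"\r")


def to_unix_newlines(data):
    """Convert Windows (CRLF) line endings to Unix (LF) line endings."""
    return data.replace(b"\r\n", b"\n")


def fix_file(path):
    """Rewrite the file in place with LF endings; return True if it changed."""
    with open(path, "rb") as f:
        data = f.read()
    fixed = to_unix_newlines(data)
    if fixed == data:
        return False
    with open(path, "wb") as f:
        f.write(fixed)
    return True


if __name__ == "__main__":
    # Hypothetical usage: python fix_newlines.py /home/test/mapper.py
    for path in sys.argv[1:]:
        status = "converted" if fix_file(path) else "unchanged"
        print("%s: %s" % (path, status))
```

Running it through the interpreter explicitly (`python fix_newlines.py ...`) sidesteps the shebang line, so it works even if the fixer script itself was saved with the same line endings. If the file turns out to be unchanged, the cause lies elsewhere, for example a `python` executable that is not on the `PATH`.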
【Comments】:
- Does this answer solve your problem?
Tags: python hadoop hadoop-streaming