[Question title]: mapred.JobClient: Error reading task output http:... when running hadoop from Cygwin on Windows OS
[Posted]: 2013-05-14 17:49:23
[Question]:

I am running the "generating vectors from documents" example from the book "Mahout in Action" under Cygwin on Windows. Hadoop is started on the local machine only.

Here is the command I ran:

$ bin/mahout seq2sparse -i reuters-seqfiles/ -o reuters-vectors -ow

But it fails with the java.io.IOException shown below. Does anyone know what causes this? Thanks in advance!

Running on hadoop, using HADOOP_HOME=my_hadoop_path
HADOOP_CONF_DIR=my_hadoop_conf_path
13/05/13 18:38:03 WARN driver.MahoutDriver: No seq2sparse.props found on classpath, will use command-line arguments only
13/05/13 18:38:03 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 1
13/05/13 18:38:03 INFO common.HadoopUtil: Deleting reuters-vectors
13/05/13 18:38:04 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 1.0
13/05/13 18:38:04 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 1
13/05/13 18:38:04 INFO input.FileInputFormat: Total input paths to process : 2
13/05/13 18:38:04 INFO mapred.JobClient: Running job: job_201305131836_0001
13/05/13 18:38:05 INFO mapred.JobClient:  map 0% reduce 0%
13/05/13 18:38:15 INFO mapred.JobClient: Task Id : attempt_201305131836_0001_m_000003_0, Status : FAILED
java.io.IOException: Task process exit with nonzero status of 1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)

13/05/13 18:38:15 WARN mapred.JobClient: Error reading task outputhttp://namenode_address:50060/tasklog?plaintext=true&taskid=attempt_201305131836_0001_m_000003_0&filter=stdout
13/05/13 18:38:15 WARN mapred.JobClient: Error reading task outputhttp://namenode_address:50060/tasklog?plaintext=true&taskid=attempt_201305131836_0001_m_000003_0&filter=stderr
13/05/13 18:38:21 INFO mapred.JobClient: Task Id : attempt_201305131836_0001_m_000003_1, Status : FAILED
java.io.IOException: Task process exit with nonzero status of 1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)

Below is the tasktracker log:

 INFO org.apache.hadoop.mapred.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
 INFO org.apache.hadoop.mapred.TaskTracker: ProcessTree implementation is missing on this system. TaskMemoryManager is disabled.
 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201305141049_0001_m_000002_0 task's state:UNASSIGNED
 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201305141049_0001_m_000002_0
 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201305141049_0001_m_000002_0
INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201305141049_0001_m_1036671648
INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201305141049_0001_m_1036671648 spawned.
 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201305141049_0001_m_1036671648 exited. Number of tasks it ran: 0
 WARN org.apache.hadoop.mapred.TaskRunner: attempt_201305141049_0001_m_000002_0 Child Error
java.io.IOException: Task process exit with nonzero status of 1.
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)
 INFO org.apache.hadoop.mapred.TaskRunner: attempt_201305141049_0001_m_000002_0 done; removing files.
 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2

[Discussion]:

    Tags: windows hadoop cygwin mahout


    [Solution 1]:

    Judging from the logs you posted, it looks like you have not actually set HADOOP_HOME=my_hadoop_path and HADOOP_CONF_DIR=my_hadoop_conf_path. You need to set these to directory paths, e.g. HADOOP_HOME=/usr/lib/hadoop and HADOOP_CONF_DIR=/usr/lib/hadoop/conf
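    A minimal sketch of setting both variables in the shell before running Mahout. The paths are the example values from this answer, not the asker's real locations, and `check_dir` is a hypothetical helper added here only to report whether each path exists:

```shell
# Example paths from the answer; substitute your own Hadoop install location.
export HADOOP_HOME=/usr/lib/hadoop
export HADOOP_CONF_DIR="$HADOOP_HOME/conf"

# Hypothetical helper: report whether a variable points at an existing directory.
# $1 = variable name, $2 = its value
check_dir() {
  if [ -d "$2" ]; then
    echo "$1=$2 (ok)"
  else
    echo "$1=$2 (directory not found)"
  fi
}

check_dir HADOOP_HOME "$HADOOP_HOME"
check_dir HADOOP_CONF_DIR "$HADOOP_CONF_DIR"
```

    If either line reports "directory not found", the variable is set but does not point at a real directory, which is worth ruling out before looking further.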

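    If that is not the case, try running just bin/mahout on its own and check whether seq2sparse appears somewhere in the list. This line explicitly says that seq2sparse.props was not found: driver.MahoutDriver: No seq2sparse.props found on classpath, will use command-line arguments only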
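    The check described above could be scripted roughly as follows. This is a sketch that assumes bin/mahout, when run without arguments, prints its program names one per line in the form `name: ...`; `has_program` is a hypothetical helper, not part of Mahout:

```shell
# Hypothetical helper: read a program listing on stdin and succeed
# if some line starts (after optional whitespace) with "<name>:".
has_program() {
  grep -q "^[[:space:]]*$1:"
}

# Only run the check when bin/mahout exists in the current directory.
if [ -x bin/mahout ]; then
  if bin/mahout 2>&1 | has_program seq2sparse; then
    echo "seq2sparse is listed"
  else
    echo "seq2sparse is NOT listed"
  fi
fi
```

    Run it from the Mahout installation directory, the same place the seq2sparse command in the question was run from.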

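    [Comments]:

    • That is not the root cause. I have already set HADOOP_HOME and HADOOP_CONF_DIR.
    • Can you run bin/mahout and edit your question to include the output?
    • Thanks. I solved the problem by simply reinstalling and reconfiguring Hadoop and Mahout. Everything works now. Thanks a lot anyway. :-)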