【Question title】: Why is the Hadoop output file part-r-00000 empty?
【Posted】: 2016-08-05 14:01:32
【Question】:

My MapReduce job log is:

[root@sicongli hadoop-2.4.1]# hadoop jar flowcount.jar cn.itheima.bigdata.hadoop.mr.flowcount.FlowCount /data/join.txt /out
16/04/13 23:32:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/04/13 23:32:22 INFO client.RMProxy: Connecting to ResourceManager at sicongli/192.168.218.111:8032

16/04/13 23:32:28 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/04/13 23:32:35 INFO input.FileInputFormat: Total input paths to process : 1
16/04/13 23:32:38 INFO mapreduce.JobSubmitter: number of splits:1
16/04/13 23:32:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1460601112521_0002
16/04/13 23:32:47 INFO impl.YarnClientImpl: Submitted application application_1460601112521_0002
16/04/13 23:32:47 INFO mapreduce.Job: The url to track the job: http://sicongli:8088/proxy/application_1460601112521_0002/
16/04/13 23:32:47 INFO mapreduce.Job: Running job: job_1460601112521_0002
16/04/13 23:35:20 INFO mapreduce.Job: Job job_1460601112521_0002 running in uber mode : false
16/04/13 23:35:28 INFO mapreduce.Job:  map 0% reduce 0%
16/04/13 23:36:47 INFO mapreduce.Job:  map 100% reduce 0%
16/04/13 23:37:25 INFO mapreduce.Job:  map 100% reduce 100%
16/04/13 23:37:48 INFO mapreduce.Job: Job job_1460601112521_0002 completed successfully
16/04/13 23:38:16 INFO mapreduce.Job: Counters: 49
    File System Counters
            FILE: Number of bytes read=6
            FILE: Number of bytes written=186579
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=399
            HDFS: Number of bytes written=0
            HDFS: Number of read operations=6
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
    Job Counters 
            Launched map tasks=1
            Launched reduce tasks=1
            Data-local map tasks=1
            Total time spent by all maps in occupied slots (ms)=17296
            Total time spent by all reduces in occupied slots (ms)=36727
            Total time spent by all map tasks (ms)=17296
            Total time spent by all reduce tasks (ms)=36727
            Total vcore-seconds taken by all map tasks=17296
            Total vcore-seconds taken by all reduce tasks=36727
            Total megabyte-seconds taken by all map tasks=17711104
            Total megabyte-seconds taken by all reduce tasks=37608448
    Map-Reduce Framework
            Map input records=23
            Map output records=0
            Map output bytes=0
            Map output materialized bytes=6
            Input split bytes=99
            Combine input records=0
            Combine output records=0
            Reduce input groups=0
            Reduce shuffle bytes=6
            Reduce input records=0
            Reduce output records=0
            Spilled Records=0
            Shuffled Maps =1
            Failed Shuffles=0
            Merged Map outputs=1
            GC time elapsed (ms)=217
            CPU time spent (ms)=1150
            Physical memory (bytes) snapshot=277962752
            Virtual memory (bytes) snapshot=1689296896
            Total committed heap usage (bytes)=127127552
    Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
    File Input Format Counters 
            Bytes Read=300
    File Output Format Counters 
            Bytes Written=0
16/04/13 23:38:18 INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/04/13 23:38:19 INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/04/13 23:38:20 INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/04/13 23:38:23 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server

The output is:

[root@sicongli ~]# hadoop fs -ls /out
16/04/14 00:00:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   3 root supergroup          0 2016-04-13 23:37 /out/_SUCCESS
-rw-r--r--   3 root supergroup          0 2016-04-13 23:37 /out/part-r-00000

I have two questions:

1. Why is the output file part-r-00000 empty?

2. Why does this warning appear: INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)

【Question comments】:

    Tags: java hadoop mapreduce


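    【Solution 1】:

    Question 1 - read the counters:

    Map input records=23

    Map output records=0

    part-r-00000 is empty because your map task read 23 records but emitted none, so the reducer had nothing to write. If you add the code of your map task to your post, we may be able to tell you why.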
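Since the map code is not in the post, the following is only a guess at one common way to end up with input records but zero output records: the mapper splits each line on the wrong delimiter and then silently skips every line that has too few fields. The sketch below uses no Hadoop classes at all; `MapperSketch`, its `map` method, and the sample rows are hypothetical stand-ins for a real `Mapper.map()` body and for the contents of /data/join.txt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Hadoop Mapper; plain Java only, not the poster's code.
class MapperSketch {

    /**
     * Mimics a map() body that splits each line and emits "field0<TAB>field1".
     * Lines with fewer than minFields fields are skipped without any error,
     * which is one way a job can report input records but no output records.
     */
    static List<String> map(List<String> lines, String delimiterRegex, int minFields) {
        List<String> emitted = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(delimiterRegex);
            if (fields.length < minFields) {
                continue; // silently dropped: nothing is emitted for this line
            }
            emitted.add(fields[0] + "\t" + fields[1]);
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Made-up, tab-separated sample rows (assumption: the real file's format is unknown).
        List<String> input = Arrays.asList("1001\t200", "1002\t350");

        List<String> wrongDelimiter = map(input, ",", 2);  // file is not comma-separated
        List<String> rightDelimiter = map(input, "\t", 2); // matches the sample rows

        System.out.println("split on comma: input=" + input.size()
                + " output=" + wrongDelimiter.size());
        System.out.println("split on tab:   input=" + input.size()
                + " output=" + rightDelimiter.size());
    }
}
```

If something like this turns out to be the cause, counting the skipped lines (for example with a custom counter in the real mapper) makes the problem visible in the job log instead of only showing up as an empty output file.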

    Question 2 - read the answers to this question; they may help you.

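    【Discussion】:

    • What about my second question?
    • Try updating to Hadoop 2.7.2. If that is not an option, try disabling the firewall on your machine.
    • Now I get the result. But the message INFO ipc.Client: Retrying connect to server: sicongli/192.168.218.111:49806. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS) still appears every time. My firewall is turned off.
    • This is listed as a bug in certain versions of Hadoop, up to 2.6.0, which is why I suggested trying 2.7.2. Seeing as you are running the job on a local machine, I don't think it will affect your final result.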