Here are the last 100 lines of my log file:
2020-10-07 17:25:39,353 INFO [main] com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream: close closed:false s3://pt-test-1/nutch/99930k-crawls/segments/20201007165453/content/part-r-00001/data
2020-10-07 17:25:39,647 INFO [s3n-worker-4] com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream: uploadPart: partNum 3 of 's3://pt-test-1/nutch/99930k-crawls/segments/20201007165453/content/part-r-00001/data' from local file '/mnt1/s3/emrfs-51822051222537493780/0000000002', 13187820 bytes in 294 ms, md5: wYPoxIwg=294 ms md5hex: c1836fda4d3b4aeada1e0f32a0fa3123
2020-10-07 17:25:40,476 INFO [main] com.amazon.ws.emr.hadoop.fs.s3.upload.dispatch.DefaultMultipartUploadDispatcher: Completed multipart upload of 3 parts 281623276 bytes
2020-10-07 17:25:40,477 INFO [main] com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream: close closed:false s3://pt-test-1/nutch/99930k-crawls/segments/20201007165453/content/part-r-00001/index
2020-10-07 17:25:40,526 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1601725692999_0072_r_000001_0 is done. And is in the process of committing
2020-10-07 17:25:40,540 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1601725692999_0072_r_000001_0' done.
2020-10-07 17:25:40,546 INFO [main] org.apache.hadoop.mapred.Task: Final Counters for attempt_1601725692999_0072_r_000001_0: Counters: 37
File System Counters
2020-10-07 17:25:40,556 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ReduceTask metrics system...
2020-10-07 17:25:40,557 INFO [cloudwatch] org.apache.hadoop.metrics2.impl.MetricsSinkAdapter: cloudwatch thread interrupted.
2020-10-07 17:25:40,557 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system stopped.
2020-10-07 17:25:40,557 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system shutdown complete.
End of LogType:syslog
LogType:syslog.shuffle
LogLastModifiedTime:Wed Oct 07 17:25:50 +0000 2020
LogLength:2318
LogContents:
2020-10-07 17:24:58,847 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: MergerManager: memoryLimit=3207593984, maxSingleShuffleLimit=801898496, mergeThreshold=2117012096, ioSortFactor=48, memToMemMergeOutput=48
2020-10-07 17:24:58,849 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1601725692999_0072_r_000001_0 Thread started: EventFetcher for fetching Map Completion Events
2020-10-07 17:24:58,855 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: attempt_1601725692999_0072_r_000001_0: Got 1 new map-outputs
2020-10-07 17:24:58,867 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: attempt_1601725692999_0072_m_000000_0: Shuffling to disk since 1506408373 is greater than maxSingleShuffleLimit (801898496)
2020-10-07 17:24:58,869 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.Fetcher: fetcher#1 about to shuffle output of map attempt_1601725692999_0072_m_000000_0 decomp: 1506408373 len: 450132846 to DISK
2020-10-07 17:24:59,216 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.OnDiskMapOutput: Read 450132846 bytes from map-output for attempt_1601725692999_0072_m_000000_0
2020-10-07 17:24:59,217 INFO [EventFetcher for fetching Map Completion Events] org.apache.hadoop.mapreduce.task.reduce.EventFetcher: EventFetcher is interrupted.. Returning
2020-10-07 17:24:59,217 INFO [fetcher#1] org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: ip-172-31-67-60.ec2.internal:13562 freed by fetcher#1 in 361ms
2020-10-07 17:24:59,222 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: finalMerge called with 0 in-memory map-outputs and 1 on-disk map-outputs
2020-10-07 17:24:59,244 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merging 1 files, 450132846 bytes from disk
2020-10-07 17:24:59,244 INFO [main] org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
2020-10-07 17:24:59,247 INFO [main] org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
2020-10-07 17:24:59,254 INFO [main] org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1506408349 bytes