1. Setting compression in code

  

Enable compression for the map-stage output:

 

Configuration configuration = new Configuration();
configuration.set("mapreduce.map.output.compress","true");
configuration.set("mapreduce.map.output.compress.codec","org.apache.hadoop.io.compress.SnappyCodec");

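Enable compression for the reduce-stage (final job) output. The compress.type setting only takes effect when the output is written as SequenceFiles: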

 

configuration.set("mapreduce.output.fileoutputformat.compress","true");
configuration.set("mapreduce.output.fileoutputformat.compress.type","RECORD");
configuration.set("mapreduce.output.fileoutputformat.compress.codec","org.apache.hadoop.io.compress.SnappyCodec");
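The five settings above can be kept in one place and applied together. The following is a minimal sketch that uses only the Java standard library; the class name CompressionSettings and its methods are hypothetical helpers for illustration, not part of Hadoop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop): gathers the compression
// properties from this section so they can be applied in one step.
class CompressionSettings {
    static final String SNAPPY = "org.apache.hadoop.io.compress.SnappyCodec";

    // Properties for compressing the intermediate map output.
    static Map<String, String> mapOutput(String codec) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("mapreduce.map.output.compress", "true");
        settings.put("mapreduce.map.output.compress.codec", codec);
        return settings;
    }

    // Properties for compressing the final job output.
    static Map<String, String> jobOutput(String codec, String type) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("mapreduce.output.fileoutputformat.compress", "true");
        settings.put("mapreduce.output.fileoutputformat.compress.type", type);
        settings.put("mapreduce.output.fileoutputformat.compress.codec", codec);
        return settings;
    }

    // Both groups combined, in the order used in this section.
    static Map<String, String> all(String codec) {
        Map<String, String> settings = new LinkedHashMap<>(mapOutput(codec));
        settings.putAll(jobOutput(codec, "RECORD"));
        return settings;
    }

    public static void main(String[] args) {
        // Print the settings as name=value pairs.
        all(SNAPPY).forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

With Hadoop on the classpath, a call such as CompressionSettings.all(CompressionSettings.SNAPPY).forEach(configuration::set) should copy these values into a Configuration object, since Configuration.set takes a name and a value as two strings.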

2. Configuring MapReduce compression globally

We can edit the mapred-site.xml configuration file and then restart the cluster, so that compression applies to all MapReduce jobs.

Compress the map output data:

 

<property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
</property>
<property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>

 

 

Compress the reduce output data:

 

<property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>true</value>
</property>
<property>
    <name>mapreduce.output.fileoutputformat.compress.type</name>
    <value>RECORD</value>
</property>
<property>
    <name>mapreduce.output.fileoutputformat.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>

mapred-site.xml must be updated on every node. Remember to restart the cluster after making the changes.
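Since the file has to be edited on each node, a quick local check can help catch a node that was missed. The following is a rough sketch using only the Java standard library XML parser; the class name MapredSiteCheck and its methods are hypothetical. It inspects only the file it is given, so it does not reflect Hadoop defaults or per-job overrides.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical checker (not part of Hadoop) for a mapred-site.xml file.
class MapredSiteCheck {

    // Parses the XML text and returns every <property> as a name -> value entry.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = childText(property, "name");
                if (name != null) {
                    result.put(name, childText(property, "value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    // Returns the trimmed text of the first child element with the given tag, or null.
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    // True when both the map-output and job-output compression switches are on.
    static boolean compressionEnabled(Map<String, String> props) {
        return "true".equalsIgnoreCase(props.get("mapreduce.map.output.compress"))
                && "true".equalsIgnoreCase(props.get("mapreduce.output.fileoutputformat.compress"));
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java MapredSiteCheck <path-to-mapred-site.xml>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("compression enabled: " + compressionEnabled(parse(xml)));
    }
}
```

Run it on each node with the path to that node's mapred-site.xml as the only argument.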

 

 

 
