【Question title】: Flume HDFS rollSize not working with multiple channels and multiple sinks
【Posted】: 2016-08-24 02:17:29
【Question description】:

I am trying to use Flume-ng to collect log data and write it to HDFS in 128 MB files, but the HDFS roll options are not taking effect: Flume-ng writes a new log file every second. How should I fix my flume.conf file?

agent01.sources = avroGenSrc
agent01.channels = memoryChannel hdfsChannel
agent01.sinks = fileSink hadoopSink

# For each one of the sources, the type is defined
agent01.sources.avroGenSrc.type = avro
agent01.sources.avroGenSrc.bind = dev-hadoop03.ncl
agent01.sources.avroGenSrc.port = 3333

# The channel can be defined as follows.
agent01.sources.avroGenSrc.channels = memoryChannel hdfsChannel

# Each sink's type must be defined
agent01.sinks.fileSink.type = file_roll
agent01.sinks.fileSink.sink.directory = /home1/irteam/flume/data
agent01.sinks.fileSink.sink.rollInterval = 3600
agent01.sinks.fileSink.sink.batchSize = 100

#Specify the channel the sink should use
agent01.sinks.fileSink.channel = memoryChannel



agent01.sinks.hadoopSink.type = hdfs
agent01.sinks.hadoopSink.hdfs.useLocalTimeStamp = true
agent01.sinks.hadoopSink.hdfs.path = hdfs://dev-hadoop04.ncl:9000/user/hive/warehouse/raw_logs/year=%Y/month=%m/day=%d
agent01.sinks.hadoopSink.hdfs.filePrefix = AccessLog.%Y-%m-%d.%Hh
agent01.sinks.hadoopSink.hdfs.fileType = DataStream
agent01.sinks.hadoopSink.hdfs.writeFormat = Text
agent01.sinks.hadoopSink.hdfs.rollInterval = 0
agent01.sinks.hadoopSink.hdfs.rollSize = 134217728
agent01.sinks.hadoopSink.hdfs.rollCount = 0

#Specify the channel the sink should use
agent01.sinks.hadoopSink.channel = hdfsChannel


# Each channel's type is defined.
agent01.channels.memoryChannel.type = memory
agent01.channels.hdfsChannel.type = memory

# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
agent01.channels.memoryChannel.capacity = 100000
agent01.channels.memoryChannel.transactionCapacity = 10000

agent01.channels.hdfsChannel.capacity = 100000
agent01.channels.hdfsChannel.transactionCapacity = 10000

【Question comments】:

    Tags: hdfs flume flume-ng


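    【Solution 1】:

    I found the solution. The problem was caused by a dfs.replication mismatch: the configured replication factor was higher than the number of data nodes in my cluster.

    My Hadoop configuration (hadoop-2.7.2/etc/hadoop/hdfs-site.xml) contained: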

    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>
    

    I only have 2 data nodes, so I changed it to:

    <property>
      <name>dfs.replication</name>
      <value>2</value>
    </property>
    

    I also added this setting to flume.conf:

    agent01.sinks.hadoopSink.hdfs.minBlockReplicas = 2
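
As far as I understand it, the HDFS sink also rolls a file when it sees the current block as under-replicated, regardless of the rollInterval, rollSize, and rollCount settings. That would explain the tiny files when the replication factor is 3 but only 2 data nodes exist. The fragment below is a minimal sketch that combines the roll settings from the question with the fix above. The value 2 is specific to this two-node cluster and is an assumption for any other setup.

```properties
# Sketch only: hadoopSink roll settings from the question plus the fix from this answer.

# Disable time-based and event-count-based rolling so that only size triggers a roll.
agent01.sinks.hadoopSink.hdfs.rollInterval = 0
agent01.sinks.hadoopSink.hdfs.rollCount = 0

# Roll at 128 MB: 128 * 1024 * 1024 = 134217728 bytes.
agent01.sinks.hadoopSink.hdfs.rollSize = 134217728

# Assumption: set this no higher than the number of replicas the cluster can
# actually provide (2 data nodes here), so the sink does not treat every
# block as under-replicated and roll early.
agent01.sinks.hadoopSink.hdfs.minBlockReplicas = 2
```

The agent has to be restarted, or the configuration reloaded, before the new values take effect.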
    

    Thanks to:

    https://qnalist.com/questions/5015704/hit-max-consecutive-under-replication-rotations-error

    Flume HDFS sink keeps rolling small files

    【Comments】:
