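【Posted】: 2015-07-15 13:51:17
【Question】:
I am using Flume to stream data from server logs into HDFS. However, while the data is being streamed into HDFS, Flume first creates a .tmp file. Is there a configuration option that hides the .tmp files, or that changes their names by putting a "." in front? My collector agent file looks like this: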
## TARGET AGENT ##
## configuration file location: /etc/flume/conf
## START Agent: flume-ng agent -c conf -f /etc/flume/conf/flume-trg-agent.conf -n collector
#http://flume.apache.org/FlumeUserGuide.html#avro-source
collector.sources = AvroIn
collector.sources.AvroIn.type = avro
collector.sources.AvroIn.bind = 0.0.0.0
collector.sources.AvroIn.port = 4545
collector.sources.AvroIn.channels = mc1 mc2
## Channels ##
## Source writes to 2 channels, one for each sink
collector.channels = mc1 mc2
#http://flume.apache.org/FlumeUserGuide.html#memory-channel
collector.channels.mc1.type = memory
collector.channels.mc1.capacity = 100
collector.channels.mc2.type = memory
collector.channels.mc2.capacity = 100
## Sinks ##
collector.sinks = LocalOut HadoopOut
## Write copy to Local Filesystem
#http://flume.apache.org/FlumeUserGuide.html#file-roll-sink
#collector.sinks.LocalOut.type = file_roll
#collector.sinks.LocalOut.sink.directory = /var/log/flume
#collector.sinks.LocalOut.sink.rollInterval = 0
#collector.sinks.LocalOut.channel = mc1
## Write to HDFS
#http://flume.apache.org/FlumeUserGuide.html#hdfs-sink
collector.sinks.HadoopOut.type = hdfs
collector.sinks.HadoopOut.channel = mc2
collector.sinks.HadoopOut.hdfs.path = /user/root/flume-channel/%{log_type}
collector.sinks.HadoopOut.hdfs.filePrefix = events-
collector.sinks.HadoopOut.hdfs.fileType = DataStream
collector.sinks.HadoopOut.hdfs.writeFormat = Text
collector.sinks.HadoopOut.hdfs.rollSize = 1000000
Any help would be appreciated.
【Comments】:
-
I solved this by setting collector.sinks.HadoopOut.hdfs.inUsePrefix = . This puts a "." prefix in front of the temporary file's name, which hides it from other applications.
-
That is exactly how you do it.
-
Please mark the answer below as accepted.
-
issues.apache.org/jira/browse/FLUME-2653 indicates that, with this change request, the suffix and prefix can be empty as of version 1.8.0.
Tags: hadoop hdfs flume flume-ng
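The fix from the first comment can be sketched as the following additions to the HDFS sink section. This is a minimal sketch, assuming the `HadoopOut` sink name from the question. `hdfs.inUsePrefix` and `hdfs.inUseSuffix` are HDFS sink properties listed in the Flume User Guide, and the `.tmp` value below is just the documented default written out explicitly.

```properties
## Prefix for files Flume is still writing; a leading "." marks them as hidden
collector.sinks.HadoopOut.hdfs.inUsePrefix = .
## Suffix for in-progress files (.tmp is the default, repeated here for clarity)
collector.sinks.HadoopOut.hdfs.inUseSuffix = .tmp
```

With these settings, a file that is still being written gets a name that starts with "." and ends with ".tmp", and Flume renames it to its final name when it closes the file. Hadoop's FileInputFormat skips paths whose names begin with "." or "_", so MapReduce-style jobs reading the directory should not pick up the in-progress files.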