[Question title]: Can multiple sinks read from the same channel, or how do I load balance Flume sinks?
[Posted]: 2017-06-19 09:43:59
[Question]:

According to multiple sources, for example Hadoop Application Architecture, multiple sinks can read from the same channel to increase throughput: "A sink can only fetch data from a single channel, but many sinks can fetch data from that same channel. A sink runs in a single thread, which has huge limitations on a single sink, for example, throughput to disk. Assume with HDFS you get 30 MBps to a single disk; if you only have one sink writing to HDFS then all you’re going to get is 30 MBps throughput with that sink. More sinks consuming from the same channel will resolve this bottleneck. The limitation with more sinks should be the network or the CPU. Unless you have a really small cluster, HDFS should never be your bottleneck."
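
The setup described in the quote (several independent sinks draining one channel, with no sink group) might look roughly like the fragment below. This is a minimal sketch, not a tested deployment: the agent name `a1`, the component names `c1`, `k1`, `k2`, and the HDFS path are hypothetical, and the source definition is left out.

```properties
# Hypothetical agent "a1": two HDFS sinks reading from the same channel.
a1.channels = c1
a1.sinks = k1 k2

a1.channels.c1.type = memory

# Each sink runs in its own thread and takes events from c1 independently.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/events
# Distinct file prefixes (assumed values) keep the two sinks from writing to the same file.
a1.sinks.k1.hdfs.filePrefix = events-k1

a1.sinks.k2.type = hdfs
a1.sinks.k2.channel = c1
a1.sinks.k2.hdfs.path = hdfs://namenode/flume/events
a1.sinks.k2.hdfs.filePrefix = events-k2
```

Each event taken from `c1` goes to exactly one of the two sinks, so the sinks share the work rather than each receiving a full copy.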

But besides that, there is also the concept of a sink group with a load balancing sink processor. According to the article, there is no need to create a sink group in order to consume events faster: "It is important to understand that all sinks within a sink group are not active at the same time; only one of them is sending data at any point in time. Therefore, sink groups should not be used to clear off the channel faster; in this case, multiple sinks should simply be set to operate by themselves with no sink group, and they should be configured to read from the same channel."
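
For comparison, a sink group with a load balancing sink processor could be declared roughly as follows. This is only a sketch under assumptions: the agent and component names (`a1`, `c1`, `k1`, `k2`, `g1`) and the collector hostnames and port are hypothetical, and the source definition is omitted.

```properties
# Hypothetical agent "a1": two Avro sinks placed in one load-balancing sink group.
a1.channels = c1
a1.sinks = k1 k2
a1.sinkgroups = g1

a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = collector1.example.com
a1.sinks.k1.port = 4545

a1.sinks.k2.type = avro
a1.sinks.k2.channel = c1
a1.sinks.k2.hostname = collector2.example.com
a1.sinks.k2.port = 4545

# The group's processor picks one sink per turn (round robin here)
# and temporarily backs off from a sink that fails.
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
a1.sinkgroups.g1.processor.backoff = true
```

As the quoted article points out, only one sink in the group sends at any moment, so this arrangement spreads load across downstream hosts instead of draining the channel faster.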

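So I really don't understand when I should use a sink group with a load balancer, and when I should simply add more sinks that read from one particular channel.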

[Comments on the question]:

    Tags: flume flume-ng


    [Answer 1]:

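    Multiple sinks can read from the same channel, but it is important to remember that Flume only guarantees that each event will be pushed to at least one sink, not to every connected sink. The sinks process events at different speeds, and there is no way to predict which sink a given event will be pushed to. If you need multiple sinks reading from the same channel, always use a failover or load balancing sink processor.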
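
The failover option mentioned in this answer could be configured along these lines. This is a hedged sketch: the names `a1`, `g1`, `k1`, and `k2` are hypothetical, the priority and penalty values are illustrative, and both sinks are assumed to be defined elsewhere in the same file and attached to the same channel.

```properties
# Hypothetical failover sink group: k2 is preferred, k1 takes over if k2 fails.
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
# A larger number means a higher priority.
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
# Upper limit, in milliseconds, on the backoff period for a failed sink.
a1.sinkgroups.g1.processor.maxpenalty = 10000
```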

    [Comments]:
