1. Package the Java program as a jar
In IntelliJ IDEA, double-click "package" in the Maven Projects panel; the jar is generated under target.
In Eclipse, right-click the project and choose Run As > Maven build; the resulting jar is also placed under target.
Run the jar to start generating data:
java -cp /home/hadoop/data/hadoopProject-1.0-SNAPSHOT.jar com.qf.phone.PhoneData /home/hadoop/data/didi.txt
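The jar appends generated records to /home/hadoop/data/didi.txt, which Flume tails in the next step. The actual record format is defined by PhoneData inside the jar; as a quick stand-in for smoke-testing the Flume → Kafka path without the jar, a shell loop can append placeholder lines to a file (the local path and the record format below are assumptions, not PhoneData's real output):

```shell
# Append a few placeholder records to the file that Flume will tail.
# PhoneData's real record format is defined by the jar; these lines are
# only stand-ins for exercising the tail -F -> Kafka pipeline.
OUT=didi.txt                          # assumption: a local test path
for i in 1 2 3; do
  echo "record-$i,$(date +%s)" >> "$OUT"
done
wc -l "$OUT"
```

Because the Flume source below uses tail -F, lines appended this way are picked up the same way as lines written by the jar.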
2. Install and configure Flume
Under the Flume home directory, create a job directory and create flume.conf inside it.
vi flume.conf and add the following:
agent.sources= s1
agent.channels= c1
agent.sinks= k1
agent.sources.s1.type=exec
# Exec source: tail the data file that Flume should watch
agent.sources.s1.command= tail -F /home/hadoop/data/didi.txt
agent.sources.s1.channels=c1
agent.channels.c1.type=memory
agent.channels.c1.capacity=10000
agent.channels.c1.transactionCapacity=100
# Kafka sink
agent.sinks.k1.type=org.apache.flume.sink.kafka.KafkaSink
# Kafka broker addresses and ports (on Flume 1.7+ this property is named kafka.bootstrap.servers)
agent.sinks.k1.brokerList=node-1:9092,node-2:9092,node-3:9092
# Kafka topic to publish to (on Flume 1.7+: kafka.topic)
agent.sinks.k1.topic=didi
# Serializer class
agent.sinks.k1.serializer.class=kafka.serializer.StringEncoder
agent.sinks.k1.channel=c1
Start Flume:
bin/flume-ng agent --conf conf/ --name agent --conf-file job/flume.conf
3. Set up Kafka
Start Kafka (run on each broker node):
bin/kafka-server-start.sh -daemon config/server.properties
Create the topic (named didi so it matches the topic the Flume sink publishes to):
./bin/kafka-topics.sh --create --zookeeper node-1:2181,node-2:2181,node-3:2181 --partitions 3 --replication-factor 3 --topic didi
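To confirm the topic exists with the expected partition and replica layout, the same script can list and describe topics. These commands assume the Kafka home directory and the same ZooKeeper quorum as above, and target didi, the topic the Flume sink writes to:

```shell
# List all topics known to the cluster
./bin/kafka-topics.sh --list --zookeeper node-1:2181,node-2:2181,node-3:2181
# Show the partition/replica layout of the sink's topic
./bin/kafka-topics.sh --describe --zookeeper node-1:2181,node-2:2181,node-3:2181 --topic didi
```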
Start a console consumer on the same topic:
./bin/kafka-console-consumer.sh --zookeeper node-1:2181,node-2:2181,node-3:2181 --topic didi --from-beginning
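With the generator, the Flume agent, and the consumer all running, a simple end-to-end smoke test is to append one line to the tailed file and watch it arrive at the console consumer (the consumer must be subscribed to the topic the Flume sink publishes to, didi; the path is the one configured in the exec source above):

```shell
echo "smoke-test,$(date +%s)" >> /home/hadoop/data/didi.txt
# The line should appear in the console consumer within a few seconds.
```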