[Posted]: 2020-10-27 21:52:04
[Question]:
We stream data from an on-premises Kafka cluster into Azure Databricks. We use the following query to connect to the on-premises host:
df = spark \
    .readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "host1:10.10.10.120:9092") \
    .option("subscribe", "SIP.SIP.MENT") \
    .option("minPartitions", "10") \
    .option("startingOffsets", "earliest") \
    .load()
Next we call display(df).
No results are ever displayed, although a consumer running on the Kafka server itself works fine.
Full error:
[Consumer clientId=consumer-spark-kafka-source-6c634c0d-01de-4840-a7b9-414326972173-2063739220-driver-0-1, groupId=spark-kafka-source-6c634c0d-01de-4840-a7b9-414326972173-2063739220-driver-0] Discovered group coordinator xyz.xyz.com:9092 (id: 2147483647 rack: null)
20/10/28 01:26:20 WARN NetworkClient: [Consumer clientId=consumer-spark-kafka-source-6c634c0d-01de-4840-a7b9-414326972173-2063739220-driver-0-1, groupId=spark-kafka-source-6c634c0d-01de-4840-a7b9-414326972173-2063739220-driver-0] Error connecting to node xyz.xyz.com:9092 (id: 2147483647 rack: null)
java.net.UnknownHostException: xyz.xyz.com: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
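The log above appears to show the driver discovering the group coordinator under the name xyz.xyz.com and then failing to resolve that name. One way to confirm this from a notebook cell is a plain DNS lookup on the driver. The following is only a minimal sketch: `check_resolvable` is a hypothetical helper written for illustration and is not part of Spark or Kafka, and it only tests name resolution, not whether port 9092 is reachable.

```python
import socket


def check_resolvable(hosts, resolver=socket.gethostbyname):
    """Return a dict mapping each hostname to its resolved IPv4 address,
    or to None when the lookup fails.

    `resolver` is injectable so the function can be exercised without
    real DNS; by default it performs an actual lookup on this machine.
    """
    results = {}
    for host in hosts:
        try:
            results[host] = resolver(host)
        except OSError:
            # socket.gaierror (raised on "Name or service not known")
            # is a subclass of OSError.
            results[host] = None
    return results
```

Calling `check_resolvable(["xyz.xyz.com"])` on the driver would return `None` for that host if the name cannot be resolved there, which would be consistent with the `UnknownHostException` in the trace.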
[Comments]:
Tags: azure apache-kafka-streams databricks azure-databricks pyspark-dataframes