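【Posted】: 2020-05-30 20:45:42
【Problem description】:
I cannot get the Spark code I wrote in Eclipse to connect to HDFS.
The code is below. Please guide me on how to make this work; any help would be appreciated.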
>
> import java.util.Arrays;
>
> import org.apache.spark.SparkConf;
> import org.apache.spark.api.java.JavaPairRDD;
> import org.apache.spark.api.java.JavaRDD;
> import org.apache.spark.api.java.JavaSparkContext;
>
> public class SparkTest {
>
>     public static void main(String[] args) {
>         // Run Spark locally inside the Eclipse JVM.
>         SparkConf conf = new SparkConf()
>                 .setAppName("JD Word Counter").setMaster("local");
>
>         JavaSparkContext sc = new JavaSparkContext(conf);
>         // Read the input file from HDFS.
>         JavaRDD<String> inputFile = sc.textFile("hdfs://localhost:8020/user/root/textfile/test.txt");
>         System.out.println("Hello start");
>         System.out.println(inputFile.collect());
>         // Split each line into words on single spaces.
>         JavaRDD<String> wordsFromFile = inputFile.flatMap(content ->
>                 Arrays.asList(content.split(" ")).iterator());
>         System.out.println("hello end");
>
>         //JavaPairRDD countData = wordsFromFile.mapToPair(t -> new Tuple2(t, 1)).reduceByKey((x, y) -> (int) x + (int) y);
>         // Write the words back to HDFS.
>         wordsFromFile.saveAsTextFile("hdfs://localhost:8020/user/root/fileTest/");
>
>         System.out.println(" This java program is complete");
>     }
>
> }
>
Error:
> I/O error constructing remote block reader.
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout
> while waiting for channel to be ready for connect. ch :
> java.nio.channels.SocketChannel[connection-pending
> remote=/172.18.0.2:50010] at org.apache.hadoop.net.NetUtils.c
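
The trace suggests the client obtained block locations from the NameNode and then timed out opening a connection to a DataNode at 172.18.0.2:50010. A quick way to confirm this from the machine running Eclipse is a plain TCP reachability check. The following is a minimal sketch using only the Java standard library; the class name `PortCheck` and the helper `canConnect` are hypothetical names introduced for illustration, and the two addresses are taken from the code and the error message above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // NameNode address, taken from the hdfs:// URI in the question.
        System.out.println("NameNode localhost:8020 reachable: "
                + canConnect("localhost", 8020, 5000));
        // DataNode address, taken from the error message.
        System.out.println("DataNode 172.18.0.2:50010 reachable: "
                + canConnect("172.18.0.2", 50010, 5000));
    }
}
```

If the first check succeeds and the second fails, the DataNode address is probably not routable from the client machine, which can happen when the cluster runs inside a container or sandbox and advertises an internal IP. In that situation the HDFS client property `dfs.client.use.datanode.hostname` is sometimes relevant, though whether it helps depends on how the environment maps hostnames and ports.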
【Discussion】:
Tags: eclipse apache-spark hadoop hortonworks-data-platform