Preparation: IntelliJ IDEA on a Windows 10 machine with Scala configured. Because the cluster runs Spark 1.6, the Scala version must be 2.10.5.
Next, unzip the Spark distribution on the Windows 10 machine and copy the contents of its lib directory
to a path containing no Chinese characters. There are two jars there; the one we need is the larger one (the spark-assembly jar).
Open IDEA, create a new project, and make the selections shown in the figure.
When selecting the Scala SDK, choose 2.10.5. Since 2.11 was installed earlier, take care to pick 2.10.
The directory structure:
Add a Scala object with the following code:
package cn.spark.study.core

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

/**
 * @author Administrator
 */
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("WordCount")
    val sc = new SparkContext(conf)
    // Read the input file from HDFS as a single partition
    val lines = sc.textFile("hdfs://master:9000/spark.txt", 1)
    // Split each line into words
    val words = lines.flatMap { line => line.split(" ") }
    // Map each word to a (word, 1) pair
    val pairs = words.map { word => (word, 1) }
    // Sum the counts for each word
    val wordCounts = pairs.reduceByKey { _ + _ }
    wordCounts.foreach(wordCount => println(wordCount._1 + " appeared " + wordCount._2 + " times."))
  }
}
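The flatMap/map/reduceByKey pipeline above can be tried out on plain Scala collections first, without a cluster or a Spark dependency. A minimal sketch (the sample lines are made up; groupBy stands in for reduceByKey's shuffle-and-reduce):

```scala
object WordCountLocal {
  def main(args: Array[String]): Unit = {
    // Sample input standing in for the lines of the HDFS text file
    val lines = List("hello spark", "hello world")

    val wordCounts = lines
      .flatMap(_.split(" "))                        // split each line into words
      .groupBy(identity)                            // group identical words together
      .map { case (word, ws) => (word, ws.size) }   // count each group

    wordCounts.foreach { case (w, n) => println(w + " appeared " + n + " times.") }
  }
}
```

Once the logic looks right locally, the same chain of transformations is handed to Spark's RDD API unchanged.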
Next, select File -> Project Structure.
Under Artifacts, click + -> JAR -> From modules with dependencies.
Click Apply, then OK.
Select Build -> Build Artifacts.
Choose Clean first, then Build; after a moment the build will finish, and the notification bar will show the jar's location.
Upload the jar to the server and create a submit script:
/usr/local/src/spark/bin/spark-submit \
--class cn.spark.study.core.WordCount \
--num-executors 100 \
--driver-memory 4G \
--executor-memory 4G \
--executor-cores 1 \
/root/spark_Test/scala/sparkTest.jar
Run the script; once it completes successfully, the job is done.