How to read only a few columns from Elasticsearch with Spark?

【Question title】: How to read a few columns of elasticsearch by spark?
【Posted】: 2017-05-04 15:07:52
【Question description】:

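Our ES cluster holds a large amount of data. We process it with Spark through elasticsearch-hadoop, following https://www.elastic.co/guide/en/elasticsearch/hadoop/current/spark.html

With this approach we have to read every column of the index. Is there a way to read only a few of them?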

【Comments】:

Tags: apache-spark elasticsearch-hadoop

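【Solution 1】:

Yes. Set the configuration parameter "es.read.field.include" to list the fields you want to read, or "es.read.field.exclude" to list the fields you want to skip. Both take a comma-separated list of field names and are described in the elasticsearch-hadoop configuration documentation. An example, assuming Spark 2 or later: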

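import org.apache.spark.sql.SparkSession

// Settings passed to the session builder apply to every Elasticsearch read in this session.
val sparkSession: SparkSession = SparkSession
  .builder()
  .appName("jobName")
  .config("es.nodes", "elastichostc1n1.example.com")
  // Only the fields "foo" and "bar" are returned from each document.
  .config("es.read.field.include", "foo,bar")
  .getOrCreate()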
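
If the restriction should apply to a single read rather than the whole session, the same setting can probably be passed as a reader option instead. The following is only a sketch: it assumes the elasticsearch-spark connector is on the classpath, and the index name "myindex" is a placeholder that does not come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark: SparkSession = SparkSession
  .builder()
  .appName("jobName")
  .config("es.nodes", "elastichostc1n1.example.com")
  .getOrCreate()

// "myindex" is a hypothetical index name; replace it with your own.
// The include list is set per read here instead of on the session.
val df: DataFrame = spark.read
  .format("org.elasticsearch.spark.sql")
  .option("es.read.field.include", "foo,bar")
  .load("myindex")

// Inspect which columns were actually picked up.
df.printSchema()
```

With the DataFrame API, selecting columns (for example df.select("foo", "bar")) may already limit the fields fetched from Elasticsearch, so it is worth checking whether the explicit include list is needed in your setup.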

【Discussion】: