【Question title】: Extract value from Cloudant IBM Bluemix NoSQL Database
【Posted】: 2017-08-15 09:46:42
【Question】:

How do I extract values from a Cloudant IBM Bluemix NoSQL database whose documents are stored as JSON?

I tried this code:

def readDataFrameFromCloudant(host, user, pw, database):
    # Load the Cloudant database into a Spark DataFrame
    cloudantdata = spark.read.format("com.cloudant.spark"). \
        option("cloudant.host", host). \
        option("cloudant.username", user). \
        option("cloudant.password", pw). \
        load(database)

    # Register a temp view so the data can be queried with Spark SQL
    cloudantdata.createOrReplaceTempView("washing")
    spark.sql("SELECT * from washing").show()
    return cloudantdata

hostname = ""
user = ""
pw = ""
database = "database"
cloudantdata=readDataFrameFromCloudant(hostname, user, pw, database)

The documents are stored in this format:

{
  "_id": "31c24a382f3e4d333421fc89ada5361e",
  "_rev": "1-8ba1be454fed5b48fa493e9fe97bedae",
  "d": {
    "count": 9,
    "hardness": 72,
    "temperature": 85,
    "flowrate": 11,
    "fluidlevel": "acceptable",
    "ts": 1502677759234
  }
}

I want this result:

Expected

Actual result

【Comments】:

    Tags: pyspark ibm-cloud cloudant pyspark-sql


    【Solution 1】:

    Create a dummy dataset to reproduce the problem:

    cloudantdata = spark.read.json(sc.parallelize(["""
    {
      "_id": "31c24a382f3e4d333421fc89ada5361e",
      "_rev": "1-8ba1be454fed5b48fa493e9fe97bedae",
      "d": {
        "count": 9,
        "hardness": 72,
        "temperature": 85,
        "flowrate": 11,
        "fluidlevel": "acceptable",
        "ts": 1502677759234
      }
    }
    """]))
    cloudantdata.take(1)
    

    Returns:

    [Row(_id='31c24a382f3e4d333421fc89ada5361e', _rev='1-8ba1be454fed5b48fa493e9fe97bedae', d=Row(count=9, flowrate=11, fluidlevel='acceptable', hardness=72, temperature=85, ts=1502677759234))]
    

    Now flatten it by expanding the nested "d" struct into top-level columns:

    flat_df = cloudantdata.select("_id", "_rev", "d.*")
    flat_df.take(1)
    

    Returns:

    [Row(_id='31c24a382f3e4d333421fc89ada5361e', _rev='1-8ba1be454fed5b48fa493e9fe97bedae', count=9, flowrate=11, fluidlevel='acceptable', hardness=72, temperature=85, ts=1502677759234)]
    

    I tested this code in an IBM Data Science Experience notebook using Python 3.5 (experimental) with Spark 2.0.
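For readers without a Spark environment at hand, the effect of the `"d.*"` expansion can be checked on a plain dictionary. This is only an illustrative sketch of the reshaping, not how Spark implements it; the helper name `flatten_doc` is hypothetical.

```python
import json

# The sample document from the question.
doc = json.loads("""
{
  "_id": "31c24a382f3e4d333421fc89ada5361e",
  "_rev": "1-8ba1be454fed5b48fa493e9fe97bedae",
  "d": {
    "count": 9,
    "hardness": 72,
    "temperature": 85,
    "flowrate": 11,
    "fluidlevel": "acceptable",
    "ts": 1502677759234
  }
}
""")


def flatten_doc(document, nested_key="d"):
    """Return a copy of document with the fields under nested_key promoted to the top level."""
    # Keep every top-level field except the nested object itself.
    flat = {key: value for key, value in document.items() if key != nested_key}
    # Promote the nested fields; a missing nested object contributes nothing.
    flat.update(document.get(nested_key, {}))
    return flat


flat = flatten_doc(doc)
print(flat["temperature"], flat["fluidlevel"])
```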

    This answer is based on: https://stackoverflow.com/a/45694796/1033422

    【Comments】:
