1. How to make spark-sql able to access Hive?

Just place hive-site.xml under spark/conf. For the contents of hive-site.xml, refer to the Hive cluster setup guide.
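For orientation, a minimal hive-site.xml sketch; the metastore host and port below are placeholders and must match your own Hive cluster:

<configuration>
  <!-- Thrift address of the Hive metastore service (hypothetical host/port) -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>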

2. To run SQL against Hive from Spark code, call enableHiveSupport() when initializing the SparkSession:

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("df")
  .master("local[*]")
  .enableHiveSupport() // enables the Hive metastore connection and HiveQL support
  .getOrCreate()
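With Hive support enabled, Hive tables can be queried directly through spark.sql. A quick smoke test (the database and table names below are placeholders):

spark.sql("SHOW DATABASES").show()
spark.sql("SELECT * FROM some_db.some_table LIMIT 10").show()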

3. Enable Hive dynamic partitioning in Spark

spark.sql("SET hive.exec.dynamic.partition = true")
spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")
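With both settings in place, partition values are taken from the data itself rather than hard-coded. A sketch with hypothetical table names, where the dynamic partition column (dt) must come last in the SELECT list:

spark.sql(
  """INSERT OVERWRITE TABLE target_db.target_table PARTITION (dt)
    |SELECT col1, col2, dt
    |FROM source_db.source_table""".stripMargin)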

4. Check from Spark whether a Hive table exists

val exists = spark.catalog.tableExists(db, tb) // db = database name, tb = table name, both Strings
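This is handy for create-if-missing logic; a minimal sketch with hypothetical names and schema:

if (!spark.catalog.tableExists("some_db", "some_table")) {
  spark.sql("CREATE TABLE some_db.some_table (id BIGINT, name STRING) STORED AS PARQUET")
}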

5. Delete an HDFS path from Spark (used when recreating a Hive table at a specified location)

import org.apache.hadoop.fs.{FileSystem, Path}

val hadoopConf = spark.sparkContext.hadoopConfiguration
val hdfs = FileSystem.get(hadoopConf)
val path = new Path(location)
if (hdfs.exists(path)) {
  // To guard against accidental deletion, recursive delete is disabled:
  // this call only succeeds for files and empty directories
  hdfs.delete(path, false)
}
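After the old path is cleared, the table can be recreated at the same location. The table name and schema below are assumptions:

spark.sql(
  s"""CREATE TABLE IF NOT EXISTS some_db.some_table (id BIGINT, name STRING)
     |STORED AS PARQUET
     |LOCATION '$location'""".stripMargin)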
