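If you want to change the time zone only for specific DataFrame operations, regardless of how the Spark session was created, this should help.
The default is the JVM's system time zone: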
java.time.ZoneId.systemDefault
res50: java.time.ZoneId = Asia/Calcutta
The same value shows up when you query the Spark configuration:
spark.sql("SET spark.sql.session.timeZone").show(false)
+--------------------------+-------------+
|key                       |value        |
+--------------------------+-------------+
|spark.sql.session.timeZone|Asia/Calcutta|
+--------------------------+-------------+
Now create the DataFrame, with the timestamp as epoch milliseconds:
val df = Seq((1580452395095L)).toDF("DATE")
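Change the session time zone to Europe/London (which is at UTC+0 on this January date, though not during British Summer Time):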
spark.conf.set("spark.sql.session.timeZone","Europe/London")
Querying the config setting now shows London:
spark.sql("SET spark.sql.session.timeZone").show(false)
+--------------------------+-------------+
|key                       |value        |
+--------------------------+-------------+
|spark.sql.session.timeZone|Europe/London|
+--------------------------+-------------+
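A note on the zone ID: Europe/London matches UTC only while the UK is on GMT. The sketch below uses plain java.time (no Spark needed) to check the offset at two instants; the helper name offsetAt is made up for illustration.

```scala
import java.time.{Instant, ZoneId, ZoneOffset}

// Hypothetical helper: the UTC offset that a zone uses at a given instant.
def offsetAt(zone: String, instant: Instant): ZoneOffset =
  ZoneId.of(zone).getRules.getOffset(instant)

val january = Instant.parse("2020-01-31T06:33:15Z") // the sample timestamp, truncated to seconds
val july    = Instant.parse("2020-07-31T06:33:15Z") // same time of day, six months later

println(offsetAt("Europe/London", january)) // Z (same as UTC)
println(offsetAt("Europe/London", july))    // +01:00 (British Summer Time)
```

If the goal is UTC all year round, setting spark.sql.session.timeZone to "UTC" avoids the daylight-saving shift.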
Result:
df.withColumn("NEW_DATE", to_timestamp(from_unixtime(col("DATE") / 1000))).show(false)
+-------------+-------------------+
|DATE         |NEW_DATE           |
+-------------+-------------------+
|1580452395095|2020-01-31 06:33:15|
+-------------+-------------------+
Change it back to the system default and run the same conversion:
spark.conf.set("spark.sql.session.timeZone",java.time.ZoneId.systemDefault.toString)
df.withColumn("NEW_DATE", to_timestamp(from_unixtime(col("DATE") / 1000))).show(false)
+-------------+-------------------+
|DATE         |NEW_DATE           |
+-------------+-------------------+
|1580452395095|2020-01-31 12:03:15|
+-------------+-------------------+
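Both rendered values can be cross-checked without Spark. This is a small sketch with java.time, assuming the default "yyyy-MM-dd HH:mm:ss" layout that from_unixtime produces; renderIn is a hypothetical helper name.

```scala
import java.time.{Instant, ZoneId}
import java.time.format.DateTimeFormatter

// Hypothetical helper: render epoch milliseconds as wall-clock time in a zone.
// The pattern has no fractional seconds, so the milliseconds are dropped.
def renderIn(epochMillis: Long, zone: String): String = {
  val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
  Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zone)).format(fmt)
}

println(renderIn(1580452395095L, "Europe/London")) // 2020-01-31 06:33:15
println(renderIn(1580452395095L, "Asia/Calcutta")) // 2020-01-31 12:03:15
```

The two strings differ by 5 hours 30 minutes, the offset of Asia/Calcutta from UTC, while the underlying instant is the same.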
Querying the config setting again shows the system default:
spark.sql("SET spark.sql.session.timeZone").show(false)
+--------------------------+-------------+
|key                       |value        |
+--------------------------+-------------+
|spark.sql.session.timeZone|Asia/Calcutta|
+--------------------------+-------------+
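The set, run, restore steps can be wrapped so the previous value is restored even if the operation throws. This is only a sketch: withSessionTimeZone is a hypothetical helper, not part of Spark, and the usage assumes a spark-shell session where spark and the df defined earlier exist.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_unixtime, to_timestamp}

// Hypothetical helper: run `body` with the session time zone set to `zone`,
// then put back whatever value was in effect before.
def withSessionTimeZone[T](spark: SparkSession, zone: String)(body: => T): T = {
  val key = "spark.sql.session.timeZone"
  val previous = spark.conf.get(key) // current value (the JVM default zone if never set)
  spark.conf.set(key, zone)
  try body
  finally spark.conf.set(key, previous)
}

// Usage (assumes `spark` and `df` from the spark-shell session):
withSessionTimeZone(spark, "Europe/London") {
  df.withColumn("NEW_DATE", to_timestamp(from_unixtime(col("DATE") / 1000))).show(false)
}
```

Build the DataFrame and trigger the action (here show) inside the block; a transformation defined inside but executed afterwards may see the restored setting instead.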