【Posted】: 2021-07-08 06:26:30
【Question】:
I created a function in Python in a Databricks notebook:
%python
import numpy as np
from pyspark.sql.functions import udf
# from pyspark.sql.types import DateType

def get_work_day(start_date, work_days_to_be_added, site_work_days, holidays_list):
    holidays_list = list(holidays_list)
    # The weekmask runs Monday..Sunday; '1' marks a working day.
    if site_work_days == 5:
        work_days = '1111100'
    elif site_work_days == 6:
        work_days = '1111110'
    elif site_work_days == 7:
        work_days = '1111111'
    elif site_work_days == 1:
        work_days = '1000000'
    elif site_work_days == 2:
        work_days = '1100000'
    elif site_work_days == 3:
        work_days = '1110000'
    elif site_work_days == 4:
        work_days = '1111000'
    dt = np.busday_offset(start_date, work_days_to_be_added, roll='forward', weekmask=work_days, holidays=holidays_list)
    return str(dt)

# Register under the bare function name (no parentheses) so SQL can resolve get_work_day(...).
spark.udf.register("get_work_day", get_work_day)
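It works fine when I call it from the same notebook, but it throws an error when I call it from a different notebook.
I call the function from SQL. The query below runs fine in the notebook where the function is registered, but it fails when I run it in another notebook: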
Select column_with_date_value,get_work_day(column_with_date_value,4,4,('2021-05-06','2021-05-07')) from db.samp
The error I get is:
DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Undefined function: 'get_work_day'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 28
Can anyone tell me how to register this function so that it can be used across notebooks?
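For context, a function registered with spark.udf.register is a temporary function tied to the Spark session that registered it, which would explain why a second notebook cannot see it. One common workaround, sketched here under assumptions and not as a confirmed fix for this setup, is to keep the definition in a shared notebook and have each consuming notebook pull it in with `%run` and then register it. The names `build_weekmask` and `register_udfs`, and the notebook path `./shared_udfs` mentioned in the comments, are hypothetical and not part of the original post.

```python
def build_weekmask(site_work_days):
    """Return a Monday..Sunday weekmask with the first N days marked as working days."""
    if not 1 <= site_work_days <= 7:
        raise ValueError("site_work_days must be between 1 and 7")
    return "1" * site_work_days + "0" * (7 - site_work_days)


def get_work_day(start_date, work_days_to_be_added, site_work_days, holidays_list):
    # NumPy is imported inside the function so this file loads even where
    # NumPy is only needed once the UDF is actually invoked.
    import numpy as np

    dt = np.busday_offset(
        start_date,
        work_days_to_be_added,
        roll="forward",
        weekmask=build_weekmask(site_work_days),
        holidays=list(holidays_list),
    )
    return str(dt)


def register_udfs(spark):
    # Assumption: each consuming notebook first runs `%run ./shared_udfs`
    # (hypothetical path) and then calls register_udfs(spark), so the
    # function is registered in that notebook's own Spark session.
    spark.udf.register("get_work_day", get_work_day)
```

In a consuming notebook the first cell would be `%run ./shared_udfs`, followed by a Python cell calling `register_udfs(spark)`; after that, the SQL query above should be able to resolve get_work_day in that notebook.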
【Discussion】:
Tags: python sql apache-spark pyspark databricks