Snowflake: SQL compilation error: error line invalid identifier '"dateutc"' (answers)

【Question title】: Snowflake: SQL compilation error: error line invalid identifier '"dateutc"'
【Posted】: 2021-07-21 02:25:55
【Question】:

I am moving data from Postgres to Snowflake. It worked at first, but then I added:

df_postgres["dateutc"]= pd.to_datetime(df_postgres["dateutc"])

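because the date format was being loaded into Snowflake incorrectly, and now I get this error:

SQL compilation error: error line 1 at position 87 invalid identifier '"dateutc"'

Here is my code: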

from sqlalchemy import create_engine
import pandas as pd
import glob
import os
from config import postgres_user, postgres_pass, host, port, postgres_db, snow_user, snow_pass, snow_account, snow_warehouse
from snowflake.connector.pandas_tools import pd_writer
from snowflake.sqlalchemy import URL


from sqlalchemy.dialects import registry
registry.register('snowflake', 'snowflake.sqlalchemy', 'dialect')

    
engine = create_engine(f'postgresql://{postgres_user}:{postgres_pass}@{host}:{port}/{postgres_db}')


conn = engine.connect()

#reads query
df_postgres = pd.read_sql("SELECT * FROM rok.my_table", conn)

#dropping these columns
drop_cols=['RPM', 'RPT']
df_postgres.drop(drop_cols, inplace=True, axis=1)

#changed columns to lowercase
df_postgres.columns = df_postgres.columns.str.lower()


df_postgres["dateutc"]= pd.to_datetime(df_postgres["dateutc"])


print(df_postgres.dateutc.dtype)

sf_conn = create_engine(URL(
    account = snow_account,
    user = snow_user,
    password = snow_pass,
    database = 'test',
    schema = 'my_schema',
    warehouse = 'test',
    role = 'test',
))



df_postgres.to_sql(name='my_table',
                 index = False,  
                 con = sf_conn,
                 if_exists = 'append', 
                 chunksize = 300,
                 method = pd_writer)

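【Question comments】:

I'm fairly sure your table was at some point created with regular identifiers, i.e. unquoted. In that case Snowflake stores them in uppercase: docs.snowflake.com/en/sql-reference/…. Now, for some reason, pd_writer is instructed to quote identifiers (delimited), so "dateutc" cannot be found. Either that, or the table really does not have the column, and it fails because you are using 'append'. Try naming the column DATEUTC and see what happens.
@IljaEverilä Thanks for the reply. I saw that you mentioned pd_writer, so I removed it and it worked!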
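
The identifier rule described in the comment above can be modeled in a few lines of plain Python. This is only a sketch of Snowflake's default behavior (it ignores session settings such as QUOTED_IDENTIFIERS_IGNORE_CASE), and `resolve_identifier` is a hypothetical helper written for illustration, not part of any Snowflake library.

```python
def resolve_identifier(sql_identifier: str) -> str:
    """Return the name Snowflake would look up for an identifier as written in SQL.

    Hypothetical helper that models the default rule only:
    unquoted identifiers are folded to uppercase, quoted ones keep their case.
    """
    is_quoted = (
        len(sql_identifier) >= 2
        and sql_identifier.startswith('"')
        and sql_identifier.endswith('"')
    )
    if is_quoted:
        # Delimited identifier: case is preserved, doubled quotes unescape to one.
        return sql_identifier[1:-1].replace('""', '"')
    # Regular identifier: stored and resolved in uppercase.
    return sql_identifier.upper()


stored_column = resolve_identifier("dateutc")    # column created without quotes
quoted_lookup = resolve_identifier('"dateutc"')  # how a quoting writer refers to it

# The two names differ, which is why the quoted lookup fails.
print(stored_column, quoted_lookup, stored_column == quoted_lookup)
```

Under this model the quoted form that matches a column created without quotes is "DATEUTC", which is what the suggestion to rename the column relies on.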

Tags: python sqlalchemy snowflake-cloud-data-platform

【Solution 1】:

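Moving Ilja's answer from the comments into a full answer:

Snowflake identifiers are case sensitive.
When you write SQL without quotes, Snowflake converts table and column names to uppercase.
This usually works, until someone decides to start quoting their identifiers in SQL.
pd_writer adds quotes to identifiers.
So when you have df_postgres["dateutc"], the name stays lowercase once it is turned into a fully quoted query, and it no longer matches the uppercase DATEUTC column in the table.
Writing df_postgres["DATEUTC"] in Python should fix the issue.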
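
A minimal sketch of that fix, assuming the target table was created with unquoted (and therefore uppercase) column names: rename the DataFrame columns to uppercase before handing the frame to pd_writer. The helper name `uppercase_columns` and the sample data are made up for illustration; only the `dateutc` column comes from the question.

```python
import pandas as pd


def uppercase_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose column names are uppercased.

    Hypothetical helper: it assumes the Snowflake table's columns were
    created unquoted, so their stored names are uppercase.
    """
    renamed = df.copy()
    renamed.columns = renamed.columns.str.upper()
    return renamed


# Stand-in for the frame read from Postgres (sample data, not from the question).
df_postgres = pd.DataFrame({"dateutc": ["2021-07-21 02:25:55"], "value": [1]})
df_postgres["dateutc"] = pd.to_datetime(df_postgres["dateutc"])

df_snowflake = uppercase_columns(df_postgres)
print(list(df_snowflake.columns))
```

The resulting `df_snowflake` would then be passed to the same `to_sql(..., method=pd_writer)` call as in the question, in place of `df_postgres`. In the question's script this amounts to replacing the `.str.lower()` step with `.str.upper()`.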

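【Comments】:

By the way, isn't it the case that Pandas first creates the table with regular identifiers, because the column names are all lowercase, and then pd_writer does what it is supposed to do, i.e. uses delimited identifiers? I have seen similar issues come up elsewhere as well.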