【Posted】:2016-09-13 13:34:20
【Question】:
I am using a cross join to access data from 2 tables, but with the cross join I get the error 'd.DebugData not found in table "bigdata:RawDebug.CarrierDetails"'. Any help would be appreciated!
SELECT
HardwareId, DebugReason, DebugData,
CASE
WHEN REGEXP_MATCH(DebugData,'\\d+') THEN c.Network
ELSE REGEXP_REPLACE(DebugData,'\\?',' ')
END
as ActualDebugData
FROM(
SELECT
HardwareId, DebugReason, DebugData
FROM TABLE_DATE_RANGE([bigdata:RawDebug.T],TIMESTAMP ('2016-05-15'),TIMESTAMP('2016-05-15'))
WHERE Reason = 500
) as d
CROSS JOIN (
SELECT Network
FROM [bigdata:RawDebug.CarrierDetails]
WHERE Mcc = substr(d.DebugData,0,3) AND Mnc = substr(d.DebugData,4,LENGTH(d.Reason - 1))
LIMIT 1
) AS c
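I then tried the following, but got this error: "ON clause must be AND of = comparisons of one field name from each table, with all field names prefixed with table name."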
%%sql --module Test2
DEFINE QUERY Test2
SELECT
HardwareId, DebugReason, DebugData,
CASE
WHEN REGEXP_MATCH(DebugData,'\\d+') THEN c.Network
ELSE REGEXP_REPLACE(DebugData,'\\?',' ')
END AS ActualDebugData
FROM (
SELECT
HardwareId, DebugReason, DebugData,
SUBSTR(DebugData,0,3) AS d1, REGEXP_REPLACE(SUBSTR(DebugData,3,LENGTH(DebugData)-1),'%[^a-zA-Z0-9, ]%',' ') as d2
FROM TABLE_DATE_RANGE([bigdata:RawDebug.T],TIMESTAMP('2016-05-15'),TIMESTAMP('2016-05-15'))
WHERE DebugReason = 500
) AS d
LEFT JOIN (
SELECT
Network, Mcc, Mnc
,ROW_NUMBER() OVER(PARTITION BY Mcc, Mnc) AS pos
FROM [bigdata:RawDebug.CarrierDetails]
) AS c
ON c.Mcc = INTEGER(d.d1) AND c.Mnc = INTEGER(d.d2)
WHERE c.pos = 1
I am adding the table structure below:
RawDebug:
HardwareId  DebugReason  DebugData
550029358   50013        VER%
550029359   50013        RO%
550029360   50013        34020?
550029361   50013        34021?
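When DebugData contains letters, the CASE statement handles it. When it contains digits, I have to take the first 3 characters as a substring and match them against Mcc in CarrierDetails, then match the remaining characters against Mnc in CarrierDetails.
The most recent query does not cover all cases. Instead, it picks up one particular number and uses that ActualDebugData for all rows.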
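One possible way to express the matching described above, offered as a sketch in BigQuery legacy SQL rather than a verified fix. It assumes that Mcc and Mnc are INTEGER columns, that numeric DebugData values consist of a 3-digit Mcc followed by the Mnc digits and a trailing '?', and the key_mcc / key_mnc aliases are made up for illustration. The casts are done inside the subquery so that the ON clause compares plain field names only, and the pos = 1 filter is applied before the join so that rows without a matching carrier are kept.

```sql
-- Sketch only (BigQuery legacy SQL); column types and data format are assumptions.
SELECT
  d.HardwareId AS HardwareId,
  d.DebugReason AS DebugReason,
  d.DebugData AS DebugData,
  CASE
    -- Values starting with a digit are treated as Mcc + Mnc codes
    WHEN REGEXP_MATCH(d.DebugData, '^[0-9]+') THEN c.Network
    ELSE REGEXP_REPLACE(d.DebugData, '[?]', ' ')
  END AS ActualDebugData
FROM (
  SELECT
    HardwareId, DebugReason, DebugData,
    -- First 3 digits become the Mcc key; NULL when DebugData is not numeric
    INTEGER(REGEXP_EXTRACT(DebugData, '^([0-9]{3})[0-9]+')) AS key_mcc,
    -- Remaining digits become the Mnc key
    INTEGER(REGEXP_EXTRACT(DebugData, '^[0-9]{3}([0-9]+)')) AS key_mnc
  FROM TABLE_DATE_RANGE([bigdata:RawDebug.T], TIMESTAMP('2016-05-15'), TIMESTAMP('2016-05-15'))
  WHERE DebugReason = 500
) AS d
LEFT JOIN (
  -- Keep a single Network per (Mcc, Mnc) pair before joining
  SELECT Network, Mcc, Mnc
  FROM (
    SELECT
      Network, Mcc, Mnc,
      ROW_NUMBER() OVER(PARTITION BY Mcc, Mnc) AS pos
    FROM [bigdata:RawDebug.CarrierDetails]
  )
  WHERE pos = 1
) AS c
ON d.key_mcc = c.Mcc AND d.key_mnc = c.Mnc
```

With this shape, a row such as 34020? would be looked up with key_mcc = 340 and key_mnc = 20, while rows such as VER% get NULL keys, find no carrier, and fall through to the ELSE branch.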
【Discussion】:
Tags: google-bigquery cross-join