你不需要“投射”任何东西,你只需要分解数组然后解压结构。我为您的数据添加了一个索引,以便更清楚地了解事情的结局。
数据:
idx arr_of_structs
0 [{periode:20160118-20160205,nb:1},{periode:20161130-20161130,nb:1},{periode:20161130-20161221,nb:1}]
1 [{periode:20161212-20161217,nb:0}]
查询:
SELECT idx -- index
, my_struct.periode AS periode -- unpacks periode
, my_struct.nb AS nb -- unpacks nb
FROM database.table
LATERAL VIEW EXPLODE(arr_of_structs) exptbl AS my_struct
输出:
idx periode nb
0 20160118-20160205 1
0 20161130-20161130 1
0 20161130-20161221 1
1 20161212-20161217 0
您的问题有点不清楚所需的结果是什么,但是一旦您更新它,我就会相应地修改查询。
编辑:
上述解决方案不正确,我没有发现您的输入是STRING。
查询:
SELECT REGEXP_EXTRACT(tmp_arr[0], "([0-9]{8}-[0-9]{8})") AS periode
, REGEXP_EXTRACT(tmp_arr[1], ":([0-9]*)") AS nb
FROM (
SELECT idx
, pos
, COLLECT_SET(tmp_col) AS tmp_arr
FROM (
SELECT idx
, tmp_col
, CASE WHEN PMOD(pos, 2) = 0 THEN pos+1 ELSE pos END AS pos
FROM (
SELECT *
, ROW_NUMBER() OVER () AS idx
FROM database.table ) x
LATERAL VIEW POSEXPLODE(SPLIT(periode, ',')) exptbl AS pos, tmp_col ) y
GROUP BY idx, pos) z
输出:
periode nb
20160118-20160205 1
20161130-20161130 1
20161130-20161221 1
20161212-20161217 0