【Question title】: How to get the first not null value from a column of values in Big Query?
【Posted】: 2015-12-23 16:28:14
【Question description】:

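I am trying to extract the first non-null value from a column of values, ordered by timestamp. Can somebody share their thoughts on this? Thank you.

What have I tried so far?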

FIRST_VALUE( column ) OVER ( PARTITION BY id ORDER BY timestamp) 

Input :-

id,column,timestamp
1,NULL,10:30 am
1,NULL,10:31 am
1,'xyz',10:32 am
1,'def',10:33 am
2,NULL,11:30 am
2,'abc',11:31 am

Output(expected) :-
1,'xyz',10:30 am
1,'xyz',10:31 am
1,'xyz',10:32 am
1,'xyz',10:33 am
2,'abc',11:30 am
2,'abc',11:31 am

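【Comments on the question】:

  • Your opening statement and your sample output seem inconsistent. It looks like you want to fill the NULL values with the first non-NULL value.
  • No. I need the first non-null value as the output for every value of col at the id level.

Tags: sql bigdata google-bigquery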


【Solution 1】:

You can modify your SQL like this to get the data you want.

FIRST_VALUE( column )
  OVER ( 
    PARTITION BY id
    ORDER BY
      CASE WHEN column IS NULL then 0 ELSE 1 END DESC,
      timestamp
  )

【Comments】:

  • MikeD, are you sure this query works? I tried it and got the error message: "In analytic expression, ORDER BY must refer to a named column. Found CASE"
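
If the engine rejects an expression inside the window's ORDER BY, as the comment above reports, one possible workaround is to compute the null flag as a named column in a subquery and order by that column instead. This is a sketch under assumptions, not a tested BigQuery query: the table name `original_data` and the column names `col` and `ts` are placeholders standing in for the question's `column` and `timestamp`.

```sql
-- Assumed placeholders: original_data(id, col, ts)
SELECT
  id,
  -- Non-null rows sort first (is_null = 0), then by time,
  -- so the first row of each partition is the earliest non-null value.
  FIRST_VALUE(col) OVER (PARTITION BY id ORDER BY is_null, ts) AS first_non_null,
  ts
FROM (
  SELECT
    id,
    col,
    ts,
    IF(col IS NULL, 1, 0) AS is_null  -- named column replacing the CASE expression
  FROM original_data
)
```

Because rows with a non-null `col` sort first within each id, every row in the partition receives the same value. An id whose values are all NULL still yields NULL.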
【Solution 2】:

Try this old string manipulation trick:

Select 
ID,
  Column,
  ttimestamp,
  LTRIM(Right(CColumn,20)) as CColumn,
  FROM
(SELECT
  ID,
  Column,
  ttimestamp,
  MIN(Concat(RPAD(IF(Column is null, '9999999999999999',STRING(ttimestamp)),20,'0'),LPAD(Column,20,' '))) OVER (Partition by ID) CColumn
FROM (

  SELECT
    *
  FROM (Select 1 as ID, STRING(NULL) as Column, 0.4375 as ttimestamp),
        (Select 1 as ID, STRING(NULL) as Column, 0.438194444444444 as ttimestamp),
        (Select 1 as ID, 'xyz' as Column, 0.438888888888889 as ttimestamp),
        (Select 1 as ID, 'def' as Column, 0.439583333333333 as ttimestamp),
        (Select 2 as ID, STRING(NULL) as Column, 0.479166666666667 as ttimestamp),
        (Select 2 as ID, 'abc' as Column, 0.479861111111111 as ttimestamp)
))

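【Comments】:

    【Solution 3】:

    As far as I know, Big Query has no options such as "IGNORE NULLS" or "NULLS LAST". Given that, this is the simplest solution I could come up with. I would love to see a simpler one. Assuming the input data is in the table "original_data",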

    select w2.id, w1.column, w2.timestamp
    from
    (select id,column,timestamp
       from
         (select id,column,timestamp, row_number() 
                       over (partition BY id ORDER BY timestamp) position
           FROM original_data
           where column is not null
        )
       where position=1 
    ) w1
    right outer join
     original_data as w2
    on w1.id = w2.id 
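
For readers on BigQuery's standard SQL dialect (GoogleSQL) rather than legacy SQL: that dialect documents an `IGNORE NULLS` modifier for `FIRST_VALUE`, which may remove the need for the join. The sketch below assumes the same hypothetical `original_data` table, with placeholder column names `col` and `ts` in place of `column` and `timestamp`. Treat it as illustrative rather than as a verified query.

```sql
-- Assumed placeholders: original_data(id, col, ts); GoogleSQL dialect only
SELECT
  id,
  FIRST_VALUE(col IGNORE NULLS) OVER (
    PARTITION BY id
    ORDER BY ts
    -- Widen the frame to the whole partition; the default frame ends at the
    -- current row, so leading NULL rows would otherwise still return NULL.
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS first_non_null,
  ts
FROM original_data
```

The explicit frame is the important design choice here: without it, the rows at 10:30 and 10:31 for id 1 would only see NULL values up to their own position.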
    

    【Comments】:

    【Solution 4】:

    select ID,
    (SELECT top(1) column FROM test1 where id=1 and column is not null order by autoID desc) as name, timestamp from yourtable

    Output:-
    1,'xyz',10:30 am
    1,'xyz',10:31 am
    1,'xyz',10:32 am
    1,'xyz',10:33 am
    2,'abc',11:30 am
    2,'abc',11:31 am

    【Comments】:
