【Posted】: 2018-09-02 07:32:25
【Question】:
I created my first ORC table in Athena:
CREATE EXTERNAL TABLE `orc_test`(
`hexid` string,
`version` int,
`ts` int,
`campaignid` int,
`sourceuid` string,
`publisherid` string,
`adslot` string,
`share` double,
`earn` double,
`spent` double,
`rebate` double,
`payout` double,
`cost` double,
`ip` string,
`useragent` string,
`referer` string,
`param1` string,
`param2` string,
`param3` string,
`param4` string,
`param5` string,
`param6` string,
`appid` string,
`bundleid` string,
`deviceid` string,
`uuid` string,
`callbackurl` string)
PARTITIONED BY (
`year` string,
`month` string,
`day` string,
`hour` string,
`minute` string)
STORED AS orc
LOCATION
's3://someplace/orc_2'
TBLPROPERTIES ('orc.compress'='SNAPPY')
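When I search for a specific value (for example, by hexid), the query scans the entire table. Since ORC is a columnar format, shouldn't the query read only the hexid column instead of scanning the whole table?
I feel like I'm missing something.
Thanks!
PS: Sorry for my English!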
【Discussion】:
Tags: performance amazon-athena snappy orc