【Question title】: Offset functionality in Hive
【Posted】: 2013-10-02 22:07:15
【Question】:

How can I get the same behavior in Hive as SQL's OFFSET?

SELECT * from table LIMIT 20 OFFSET 30

Thanks!

【Comments】:

Tags: hive hiveql


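【Answer 1】:

I'm not aware of a built-in function or UDF that mimics this behavior, but if you are using Hive 0.13 or later you can use the row_number() window function to get the desired result. Number the rows in a subquery, then keep only the range you want in the outer query; for LIMIT 20 OFFSET 30 that range is rows 31 through 50.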

select pk, col_1, col_2, ... , col_n
from (
    select pk, col_1, col_2, ... , col_n, row_number() OVER (ORDER by pk) as rank
    from some_database.some_table
    ) x
where rank between 31 and 50
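
The bounds 31 and 50 follow from the question's LIMIT 20 OFFSET 30: the first kept row is offset + 1 and the last is offset + limit. A minimal Java sketch of that arithmetic is below. The class `RowNumberPaging` and its methods are hypothetical names chosen for illustration and are not part of Hive or any library.

```java
// Hypothetical helper (not part of Hive): maps LIMIT/OFFSET onto row_number() bounds.
final class RowNumberPaging {

    // OFFSET skips `offset` rows and row_number() starts at 1,
    // so the first row to keep is offset + 1.
    static long firstRow(long offset) {
        return offset + 1;
    }

    // The last row to keep (inclusive) is offset + limit.
    static long lastRow(long limit, long offset) {
        return offset + limit;
    }

    // Builds the predicate for the outer WHERE clause.
    // `column` is the alias given to row_number() in the subquery.
    static String predicate(String column, long limit, long offset) {
        if (limit < 1 || offset < 0) {
            throw new IllegalArgumentException("limit must be >= 1 and offset >= 0");
        }
        return column + " between " + firstRow(offset) + " and " + lastRow(limit, offset);
    }

    public static void main(String[] args) {
        // LIMIT 20 OFFSET 30 from the question
        System.out.println(predicate("rank", 20, 30)); // prints "rank between 31 and 50"
    }
}
```

The generated predicate only pages consistently if the ORDER BY inside the OVER clause is deterministic, which is why the answer orders by the primary key.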

【Comments】:

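    【Answer 2】:

    LIMIT accepts two forms: LIMIT count, and LIMIT offset, count. Note that the two-argument form is only available in newer Hive releases (Hive 2.0.0 and later).

    So use the second form. With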

    select salary from employee order by salary desc limit 0,1
    

    you get the highest salary.

    Here the offset is 0 (start at the first row) and the count is 1.
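
Applied to the original question, the argument order is the part that is easy to get wrong: SQL's LIMIT 20 OFFSET 30 becomes LIMIT 30,20, with the offset first. A minimal Java sketch of that translation follows. `HiveLimitClause` and the table name `some_table` are hypothetical, and the sketch assumes a Hive version that accepts the two-argument LIMIT.

```java
// Hypothetical helper (not part of Hive): renders LIMIT/OFFSET in the
// "limit offset,count" form used in this answer.
final class HiveLimitClause {

    // Note the order: the offset comes first, then the row count.
    static String limitClause(long limit, long offset) {
        if (limit < 0 || offset < 0) {
            throw new IllegalArgumentException("limit and offset must be non-negative");
        }
        return "limit " + offset + "," + limit;
    }

    public static void main(String[] args) {
        // Equivalent of: SELECT * FROM some_table ORDER BY pk LIMIT 20 OFFSET 30
        // An ORDER BY is included so that the skipped rows are well defined.
        String sql = "select * from some_table order by pk " + limitClause(20, 30);
        System.out.println(sql); // prints "select * from some_table order by pk limit 30,20"
    }
}
```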

    【Comments】:

      【Answer 3】:
      public class CountRatingQueryBuilder {
      
      private static final String SCORING_TABLE_NAME = "web_resource_rating";
      
      private final Connection connection;
      private final ScoringMetadata scoringMetadata;
      
      private final SelectSelectStep select;
      private final Factory create;
      
      public CountRatingQueryBuilder(Connection connection, ScoringMetadata scoringMetadata){
          this.connection = connection;
          this.scoringMetadata = scoringMetadata;
      
          create = new Factory(this.connection, SQLDialect.MYSQL);
          select = create.select();
      
          withSelectFieldsClause();
      }
      
      public CountRatingQueryBuilder withLimit(int limit){
          select.limit(limit);
          return this;
      }
      
      public CountRatingQueryBuilder withRegionId(Integer regionId){
          select.where(REGION_ID.field().equal(regionId));
          return this;
      }
      
      public CountRatingQueryBuilder withResourceTypeId(int resourceTypeId){
          select.where(RESOURCE_TYPE_ID.field().equal(resourceTypeId));
          return this;
      }
      
      public CountRatingQueryBuilder withRequestTimeBetween(long beginTimestamp, long endTimestamp){
          select.where(REQUEST_TIME.field().between(beginTimestamp, endTimestamp));
          return this;
      }
      
      public CountRatingQueryBuilder withResourceId(int resourceId){
          select.where(RESOURCE_ID.field().equal(resourceId));
          return this;
      }
      
      
      
      protected void withGroupByClause(){
          select.groupBy(REGION_ID.field());
          select.groupBy(RESOURCE_TYPE_ID.field());
          select.groupBy(RESOURCE_ID.field());
          select.groupBy(CONTENT_ID.field());
      }
      
      protected void withSelectFieldsClause(){
          select.select(REGION_ID.field());
          select.select(RESOURCE_TYPE_ID.field());
          select.select(CONTENT_ID.field());
          select.select(RESOURCE_ID.field());
          select.select(Factory.count(HIT_COUNT.field()).as(SUM_HIT_COUNT.fieldName()));
      }
      
      protected void withFromClause(){
          select.from(SCORING_TABLE_NAME);
      }
      
      protected void withOrderByClause(){
          select.orderBy(SUM_HIT_COUNT.field().desc());
      }
      
      public String build(){
          withGroupByClause();
          withOrderByClause();
          withFromClause();
          return select.getSQL().replace("offset ?","");//dirty hack for MySQL dialect. TODO: we can try to implement our own SQL dialect for Hive :)
      
      }
      
      public List<ResultRow> buildAndFetch(){
          String sqlWithPlaceholders = build();
      
          List<ResultRow> scoringResults = new ArrayList<ResultRow>(100);
          Object[] bindValues = select.getBindValues().toArray();
          // Drop the last bind value: it presumably belongs to the "offset ?" placeholder that build() strips out.
          List<Record> recordResults = create.fetch(sqlWithPlaceholders, ArrayUtils.subarray(bindValues, 0, bindValues.length - 1));
          for(Record record : recordResults){
              ResultRowBuilder resultRowBuilder = ResultRowBuilder.create();
      
              resultRowBuilder.withContentType(scoringMetadata.getResourceType(record.getValue(RESOURCE_TYPE_ID.fieldName(), Integer.class)));
              resultRowBuilder.withHitCount(record.getValue(SUM_HIT_COUNT.fieldName(), Long.class));
              resultRowBuilder.withUrl(record.getValue(CONTENT_ID.fieldName(), String.class));
              scoringResults.add(resultRowBuilder.build());
          }
          return scoringResults;
      }
      
      }
      

      Hopefully this is the right answer; it was copied from the question linked here. See "jooq extend existing dialect. Adopt MySQL dialect to apache Hive dialect" for details.

      【Comments】:
