【问题标题】:Keep top N elements of an array by group按组保留数组的前 N ​​个元素
【发布时间】:2015-08-02 02:25:16
【问题描述】:

我在 Blaze Advisor(规则引擎)中使用专有语言。我正在寻找一种算法,如何按特定属性形成的组仅保留数组中的前 N ​​个项目。例如有两个数组:

parrent[0].id = 1
parrent[1].id = 2

第二个数组:

child[0].parrentId = 1
child[0].result = 3.0
child[1].parrentId = 1
child[1].result = 2.0
child[2].parrentId = 1
child[2].result = 4.0
child[3].parrentId = 1
child[3].result = 6.0
child[4].parrentId = 1
child[4].result = -1.0
child[5].parrentId = 2
child[5].result = 1.0
child[6].parrentId = 2
child[6].result = 16.0
child[7].parrentId = 2
child[7].result = 2.0
child[8].parrentId = 2
child[8].result = -10.0
child[9].parrentId = 2
child[9].result = 5.0

我想只保留parrentId 数组中每个parrentId 的前三个元素,如result 属性所示。在我的语言中,我可以完成所有基本操作——我可以为每个构造使用 if/else、while、for、并创建新变量。我可以对数组 asc/desc 进行排序并获取已排序元素的索引。我可以删除数组的元素。

对于我的数据,我需要以下结果:

child[0].parrentId = 1
child[0].result = 3.0
child[1].parrentId = 1
child[2].result = 4.0
child[3].parrentId = 1
child[3].result = 6.0
child[6].parrentId = 2
child[6].result = 16.0
child[7].parrentId = 2
child[7].result = 2.0
child[9].parrentId = 2
child[9].result = 5.0

【问题讨论】:

    标签: arrays algorithm language-agnostic srl blaze-advisor


    【解决方案1】:

    使用辅助类:

    还有功能:

    上面有代码:

    len is an integer initially top.children.count - 1;
    idx is an integer initially len;
    while idx > atIdx do {
        top.children[idx] = top.children[idx-1];
        decrement idx;
    }
    top.children[atIdx] = child;
    

    此代码可以满足您的要求:

    child is an fixed array of 10 Child;
    
    counter is an integer initially 0;
    while counter < 10 do { child[counter] = a Child; increment counter }
    
    child[0].parrentId = 1;
    child[0].result = 3.0;
    child[1].parrentId = 1;
    child[1].result = 2.0;
    child[2].parrentId = 1;
    child[2].result = 4.0;
    child[3].parrentId = 1;
    child[3].result = 6.0;
    child[4].parrentId = 1;
    child[4].result = -1.0;
    child[5].parrentId = 2;
    child[5].result = 1.0;
    child[6].parrentId = 2;
    child[6].result = 16.0;
    child[7].parrentId = 2;
    child[7].result = 2.0;
    child[8].parrentId = 2;
    child[8].result = -10.0;
    child[9].parrentId = 2;
    child[9].result = 5.0;
    
    groups is an array of real;
    
    topN is an integer initially 4;
    
    //Init the hashmap of [group] -> [array of 'N' top Child]
    top3fromGroup is an association from real to TopChildren;
    for each Child in child do if not groups.contains(it.parrentId) then { 
        top3fromGroup[it.parrentId] = a TopChildren;
        initCounter is an integer initially 0;
        while initCounter < topN do {
            top3fromGroup[it.parrentId].children[initCounter] = a Child initially { it.result = Double.MIN_VALUE;} 
            increment initCounter;
        }
        groups.append(it.parrentId);
    }
    
    //Filling the groups at the hashmap with the Child elements ordered inside its groups
    for each real in groups do { 
        group is a real initially it;
        for each Child in child do {
            localChild is some Child initially it;
            if it.parrentId = group then {
                top is some TopChildren initially top3fromGroup[group]; 
                topValuesIdx is an integer initially 0;
                while topValuesIdx < top.children.count do {
                    topChild is some Child initially top.children[topValuesIdx];
                    if localChild.result > topChild.result then { 
                        insertAt(topValuesIdx, localChild, top);
                        topValuesIdx = top.children.count;
                    } 
                    increment topValuesIdx;
                }
            }
        }
    }
    
    //Printing the hashmap
    for each real in groups do {
        group is a real initially it;
        print("Group: " group);
        childIdx is an integer initially 0;
        for each Child in top3fromGroup[it].children do {
            print("\tchild["childIdx"].parrentId = " it.parrentId); 
            print("\tchild["childIdx"].result = " it.result);
            increment childIdx;
        }
    }
    

    Eclipse/Blaze 控制台上的输出将是:

    Group: 1.0
        child[0].parrentId = 1.0
        child[0].result = 6.0
        child[1].parrentId = 1.0
        child[1].result = 4.0
        child[2].parrentId = 1.0
        child[2].result = 3.0
        child[3].parrentId = 1.0
        child[3].result = 2.0
    Group: 2.0
        child[0].parrentId = 2.0
        child[0].result = 16.0
        child[1].parrentId = 2.0
        child[1].result = 5.0
        child[2].parrentId = 2.0
        child[2].result = 2.0
        child[3].parrentId = 2.0
        child[3].result = 1.0
    
    Execution complete.
    

    我知道这是一个非常简单的解决方案,而不是最佳解决方案。

    【讨论】:

    • 很高兴知道有人在使用 Blaze 顾问 :)
    【解决方案2】:

    可以使用selection algorithm 来保留数组的前 N ​​个元素。将数据分组这一事实不会给问题增加太多复杂性,只需忽略不在组中的元素即可。

    例如,如果您使用partial sorting,您可以只对组中的元素进行部分排序。您不会像在标准部分排序中那样提前知道第 N 个条目的索引,但您可以修改外部循环以遍历所有条目(而不仅仅是第一个 N),但跟踪您有多少已经部分排序并在完成时中断。

    顺便说一句,既然您在问题中声明您可以用您的专有语言对数组进行排序,那么如果它不是太低效,您只需对整个数组进行排序,然后循环整个数组,直到找到顶部所有感兴趣的组的 N 个项目。这是蛮力,但如果您没有性能问题,那么它可能是最简单的答案。

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 2011-04-12
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多