【Question title】: Deduplicate array of arrays in javascript
【Posted】: 2020-11-01 12:45:10
【Question】:

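I want to deduplicate an array of arrays. A duplicate array is one that matches another array on a subset of element indices; in this case, say, index [1] and index [3].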

const unDeduplicated = [
  [ 11, 12, 13, 14, 15, ],
  [ 21, 22, 23, 24, 25, ],
  [ 31, 88, 33, 99, 35, ], // duplicate in indices: 1, 3 with row index 4
  [ 41, 42, 43, 44, 45, ],
  [ 51, 88, 53, 99, 55, ], // duplicate in indices: 1, 3 // delete this row from result
];

const deduplicated = getDeduplicated( unDeduplicated, [ 1, 3, ], );

console.log( deduplicated );
// expected result:
// [
//   [ 11, 12, 13, 14, 15, ],
//   [ 21, 22, 23, 24, 25, ],
//   [ 31, 88, 33, 99, 35, ],
//   [ 41, 42, 43, 44, 45, ],
//   // this row was omitted from result because it was duplicated at indices 1 and 3 with row index 2
// ]

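What function getDeduplicated() would give me this result?

I have tried the following function, but it is only a start. It is nowhere near giving me the result I want, but it gives an idea of what I am trying to do.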

/**
 * Returns deduplicated array as a data grid ([][] -> 2D array)
 * @param { [][] } unDedupedDataGrid The original data grid to be deduplicated to include only unique rows as defined by the indices2compare.
 * @param { Number[] } indices2compare An array of indices to compare for each array element.
 * If every element at each index for a given row is duplicated elsewhere in the array,
 * then the array element is considered a duplicate
 * @returns { [][] }
 */
const getDeduplicated = ( unDedupedDataGrid, indices2compare, ) => {
  let deduped = [];
  unDedupedDataGrid.forEach( row => {
    const matchedArray = a.filter( row => row[1] === 88 && row[3] === 99 );
    const matchedArrayLength = matchedArray.length;
    if( matchedArrayLength ) return;
    deduped.push( row, );
  });
}

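I have looked into some lodash functions that might help, such as _.filter and _.some, but so far I can't seem to find a structure that produces the desired result.

【Comments】:

  • Shouldn't the third row of the expected result have 99 at index 3? It seems to have become 88.
  • @LeaftheLegend: You're right. Good catch!

Tags: javascript arrays duplicates lodash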


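【Solution 1】:

You can build a Set from the column values as you iterate over the rows. You can choose to create sets only for the specified columns, e.g. 1 and 3 in your case. Then, while iterating over each row, check whether any of the specified columns in that row has a value that is already in the corresponding set, and discard the row if it does.

(On my phone, so I can't type actual code. I'd guess the code is simple anyway.)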
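
The per-column Set idea described above could be sketched roughly as follows. The function name dedupeByAnyColumn is hypothetical and not from the answer. Note that this follows the answer's wording literally: a row is dropped when any one of the compared columns repeats an earlier value, which is stricter than requiring all compared columns to match the same earlier row. For the sample data in the question both rules give the same result.

```javascript
// Hypothetical helper; a sketch of the "one Set per compared column" idea.
function dedupeByAnyColumn(rows, indices) {
  // One Set per compared column, keyed by the column index.
  const seen = new Map(indices.map(i => [i, new Set()]));
  return rows.filter(row => {
    // Drop the row if ANY compared column repeats a value already seen in that column.
    if (indices.some(i => seen.get(i).has(row[i]))) return false;
    // Otherwise remember this row's values and keep it.
    indices.forEach(i => seen.get(i).add(row[i]));
    return true;
  });
}

const sample = [
  [11, 12, 13, 14, 15],
  [21, 22, 23, 24, 25],
  [31, 88, 33, 99, 35],
  [41, 42, 43, 44, 45],
  [51, 88, 53, 99, 55],
];
console.log(dedupeByAnyColumn(sample, [1, 3]));
```

If two rows should count as duplicates only when every compared index matches, a single Set keyed on the combined values is the closer fit.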

【Comments】:

    【Solution 2】:

    This may not be the most efficient algorithm, but I would do something like this:

    function getDeduplicated(unDeduplicated, idxs) {
      const result = [];
      const used = new Set();
      unDeduplicated.forEach(arr => {
        const vals = idxs.map(i => arr[i]).join();
        if (!used.has(vals)) {
          result.push(arr);
          used.add(vals);
        }
      });
    
      return result;
    }
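
One caveat with a joined-string key: join() separates values with commas and coerces them to strings, so the string "a,b" followed by "c" yields the same key as "a" followed by "b,c", and the number 1 collides with the string "1". That does not matter for the numeric sample data, but if the compared cells may hold strings, a variant like the one below might be safer. It is only a sketch: the name getDeduplicatedStrict is hypothetical, and it assumes the compared values are JSON-serializable.

```javascript
// Hypothetical variant of the Set-based approach, using JSON.stringify for the key.
function getDeduplicatedStrict(rows, idxs) {
  const used = new Set();
  return rows.filter(row => {
    // JSON.stringify keeps value boundaries and types distinct,
    // e.g. ["a,b","c"] and ["a","b,c"] produce different keys.
    const key = JSON.stringify(idxs.map(i => row[i]));
    if (used.has(key)) return false; // an earlier row had the same values
    used.add(key);
    return true;
  });
}

const tricky = [
  ["a,b", "c"],
  ["a", "b,c"],
];
console.log(getDeduplicatedStrict(tricky, [0, 1]).length);
```

Using filter instead of forEach with a result array is purely a stylistic choice; the first occurrence of each key is still the one that is kept.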
    

    【Comments】:

      【Solution 3】:

      I'm not sure I fully understood what you are trying to do, but this is what I did:

      const list = [
        [ 11, 12, 13, 14, 15, ],
        [ 21, 22, 23, 24, 25, ],
        [ 21, 58, 49, 57, 28, ],
        [ 31, 88, 33, 88, 35, ],
        [ 41, 42, 43, 44, 45, ],
        [ 51, 88, 53, 88, 55, ],
        [ 41, 77, 16, 29, 37, ],
      ];
      
      const el_list = []  // Auxiliary list of every value seen so far
      const res_list = list.reduce(
          (_list, row) => {
              // console.log(_list)
              const this_rows_el = []  // Auxiliary list of this row's elements
              _list.push(row.reduce(
                  (keep_row, el) => {
                      // console.log(keep_row, this_rows_el, el)
                      if(keep_row && el_list.indexOf(el)==-1 ){
                          el_list.push(el)
                          this_rows_el.push(el)
                          return true
                      }else if(this_rows_el.indexOf(el)!=-1) return true  // Bypass repeated elements in this row
                      else return false
                  }, true) ? row : null)  // To get only duplicated rows (...) ? null : row )
              return _list
          }, []
      )
      
      console.log(res_list)

      【Comments】:

        【Solution 4】:

        This is fairly concise. It relies on nested filtering. It also works for any number of duplicates, keeping only the first.

        const init = [
          [ 11, 12, 13, 14, 15],
          [ 21, 22, 23, 24, 25],
          [ 31, 88, 33, 99, 35],
          [ 41, 42, 43, 44, 45],
          [ 51, 88, 53, 99, 55],
        ];
        
        const deDuplicate = function (array, indices) {
          return array.filter(
            (elem, elemIdx) => !array.some(
              (el, elIdx) =>
                elIdx < elemIdx && // only earlier rows count, so the first dupe is never discarded
                // check that the values at the requested indexes are the same;
                // compared one by one because arrays can't be compared with ===
                indices.every((i) => el[i] === elem[i])
            )
          );
        };
        console.log(deDuplicate(init,[1,3]));
        

        【Comments】:

          【Solution 5】:

          Not the most efficient, but this removes multiple copies of duplicated arrays.

          const unDeduplicated = [ [ 11, 12, 13, 14, 15, ], [ 21, 22, 23, 24, 25, ], [ 31, 88, 33, 99, 35, ], [ 41, 33, 43, 44, 45, ], [ 51, 88, 53, 99, 55, ]]
          const unDeduplicated1 = [
            [ 11, 12, 13, 14, 15, ],
            [ 21, 22, 23, 24, 25, ],// duplicate in indices: 1, 3 with row index 3
            [ 31, 88, 33, 99, 35, ], // duplicate in indices: 1, 3 with row index 4
            [ 21, 22, 43, 24, 45, ],// duplicate in indices: 1, 3 // delete this
            [ 51, 88, 53, 99, 55, ], // duplicate in indices: 1, 3 // delete this row from result
          ];
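          function getDeduplicated(arr, arind) {
            for (let i = 0; i < arr.length; i++) {
              for (let j = 1 + i; j < arr.length; j++) {
                // rows are duplicates when they agree at every index listed in arind
                if (arind.every(k => arr[j][k] === arr[i][k])) {
                  arr.splice(j, 1) // remove the later row (this mutates arr in place)
                  j-- // re-check the row that shifted into position j
                }
              }
            }
            return arr
          }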
          const deduplicated = getDeduplicated(unDeduplicated, [1, 3]);
          const deduplicated2 = getDeduplicated(unDeduplicated1, [1, 3]);
          
          console.log(deduplicated)
          console.log("#####################")
          console.log(deduplicated2)

          【Comments】:
