【Question title】: MongoDB - finding common value count between a set of documents
【Posted】: 2017-12-06 22:21:33
【Question】:

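I have documents with two fields, and I want to find the count of common values between rows of the same group (category). For example:

The current data looks like this (assume JSON format):

I need output like this:

Any guidance or pointers would be appreciated. Thanks.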

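【Comments】:

  • Can you get the data from Mongo in this format: [{category: "Music", book: "A"}, {category: "Music", book: "B"}, {category: "Film", book: "A"}]? If so, I can help you find a solution.
  • Yes, the data is in that format, e.g. {category: "Music", book: "A", size: 3}. size gives the number of rows for that particular category. Thanks.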

Tags: javascript arrays mongodb mongodb-query


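【Solution 1】:

First, fetch the data as an array of objects. Then the following algorithm gives you what you need:

  1. Get the unique categories: ["Music", "Film", "History", "Science"]
  2. Get the pairwise combinations of those categories: [["Music", "Film"], ["Music", "History"], ["Music", "Science"], ["Film", "History"], ...]
  3. Create a map from each category name to the books in that category. We can use a Set to ensure the values are unique. The map will have a structure like {"Music": Set("A", "B"), "Film": Set("A", "B", "C"), "History": Set("C", "B"), "Science": Set("C")}
  4. Use the object and array you just created to find the books that appear in both categories of each combination.
  5. When all is said and done, you will have an array with the following structure: [ [cat1, cat2, [bookInCommon1, bookInCommon2]], [cat1, cat3, [bookInCommon1, bookInCommon2]], ...]

Run the code below to see it in action. mongoData holds the data fetched from Mongo.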

const mongoData = [{
    category: "Music",
    book: "A"
}, {
    category: "Music",
    book: "B"
},{
    category: "Music",
    book: "A"
},{
    category: "Film",
    book: "A"
},{
    category: "Film",
    book: "A"
},{
    category: "Film",
    book: "B"
},{
    category: "Film",
    book: "C"
},{
    category: "Film",
    book: "C"
},{
    category: "Film",
    book: "A"
},{
    category: "History",
    book: "C"
},{
    category: "History",
    book: "C"
},{
    category: "History",
    book: "B"
},{
    category: "History",
    book: "B"
},{
    category: "Science",
    book: "C"
},{
    category: "Science",
    book: "C"
},{
    category: "Science",
    book: "C"
}];

// Step 1: Get the categories
const categories = Array.from(new Set(mongoData.map(x => x.category)));

// Step 2: Get combinations of those categories
const combos = [];
for(let i = 0; i < categories.length - 1; i++) {

    let outerCat = categories[i];

    for(let j = i + 1; j < categories.length; j++) {

        let innerCat = categories[j];

        combos.push([
            outerCat,
            innerCat
        ]);
    }
}

// Step 3: Map the categories to the books that they contain
const catBooks = mongoData.reduce((map, entry) => {

    map[entry.category] = map[entry.category] || new Set(); 
    map[entry.category] = map[entry.category].add(entry.book);

    return map;

}, {});

// Step 4: Get the duplicate books for each combo
combos.forEach((combo, index) => {
    
    const cat1 = combo[0];
    const cat2 = combo[1];

    const cat1BooksArr = Array.from(catBooks[cat1]);
    const cat2BooksSet = catBooks[cat2];

    const dupes = cat1BooksArr.filter(book => {
        return cat2BooksSet.has(book);
    });

    combos[index].push(dupes); // push into combos array
});

// Done! Your combos array contains arrays that look like this: [cat1, cat2, [dupes]]
combos.forEach(combo => {
    console.log("Combo: " + combo[0] + ", " + combo[1]);
    console.log("\tNumber of dupes: " + combo[2].length); 
});
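
The same steps can also be packaged as a reusable function. The following is only a sketch of that idea: the function name commonBookCounts, the result field names (cat1, cat2, common, count), and the shortened sample array are illustrative assumptions, not part of the original answer.

```javascript
// Hypothetical helper (name and result shape are assumptions for illustration).
// Returns one { cat1, cat2, common, count } object per pair of categories.
function commonBookCounts(rows) {
    // Map each category to the Set of distinct books it contains.
    const booksByCategory = new Map();
    for (const { category, book } of rows) {
        if (!booksByCategory.has(category)) {
            booksByCategory.set(category, new Set());
        }
        booksByCategory.get(category).add(book);
    }

    // A Map keeps insertion order, so categories appear in first-seen order.
    const categories = Array.from(booksByCategory.keys());
    const result = [];

    // Visit every unordered pair of categories exactly once.
    for (let i = 0; i < categories.length - 1; i++) {
        for (let j = i + 1; j < categories.length; j++) {
            const cat1 = categories[i];
            const cat2 = categories[j];
            const books2 = booksByCategory.get(cat2);
            // Books of cat1 that are also present in cat2.
            const common = Array.from(booksByCategory.get(cat1))
                .filter(book => books2.has(book));
            result.push({ cat1, cat2, common, count: common.length });
        }
    }
    return result;
}

// Shortened sample with the same distinct (category, book) pairs as mongoData.
const sample = [
    { category: "Music", book: "A" },
    { category: "Music", book: "B" },
    { category: "Film", book: "A" },
    { category: "Film", book: "B" },
    { category: "Film", book: "C" },
    { category: "History", book: "B" },
    { category: "History", book: "C" },
    { category: "Science", book: "C" }
];

const pairCounts = commonBookCounts(sample);
pairCounts.forEach(p => console.log(p.cat1 + ", " + p.cat2 + ": " + p.count));
```

Returning objects instead of nested arrays makes the result easier to consume, and pairs with nothing in common are still reported with a count of 0.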

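【Comments】:

  • Thanks. +2 for laying this out so clearly. I think this should meet the requirement. I wonder whether something similar could be done at the database level, since I will have more than 100k records?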
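
On the database-level question, one possible direction is an aggregation pipeline. This is an untested sketch, not something from the answer above. It assumes documents shaped like {category: "Music", book: "A"}, a placeholder collection name "books", and MongoDB 3.4 or later (for $addFields). The output field names cat1, cat2, order, and commonCount are made up for illustration.

```javascript
// Sketch only: assumes documents like { category: "Music", book: "A" }.
const pipeline = [
    // 1. Collapse duplicate rows into distinct (category, book) pairs.
    { $group: { _id: { category: "$category", book: "$book" } } },
    // 2. For each book, collect the set of categories it appears in.
    { $group: { _id: "$_id.book", categories: { $addToSet: "$_id.category" } } },
    // 3. Copy the array twice and unwind both copies to get every ordered pair.
    { $project: { cat1: "$categories", cat2: "$categories" } },
    { $unwind: "$cat1" },
    { $unwind: "$cat2" },
    // 4. Keep each unordered pair once: only rows where cat1 sorts before cat2.
    { $addFields: { order: { $cmp: ["$cat1", "$cat2"] } } },
    { $match: { order: -1 } },
    // 5. Each remaining row is one shared book; count rows per category pair.
    { $group: { _id: { cat1: "$cat1", cat2: "$cat2" }, commonCount: { $sum: 1 } } }
];

// Hypothetical usage in the mongo shell ("books" is a placeholder name):
// db.books.aggregate(pipeline)
```

Two caveats with this approach: pairs of categories with no book in common do not appear in the output at all, and step 3 produces k × k intermediate rows for a book that belongs to k categories, so it is worth checking against real data volumes before relying on it.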