Select random document from Firestore: answers

【Question Title】: Select random document from Firestore
【Posted】: 2021-10-17 17:27:23
【Question Description】:

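I have 1000 documents in a single collection in Cloud Firestore. Is it possible to fetch random documents?

For example: Students is a collection in Firestore with 1000 students in it, and my requirement is to pick 10 students at random on each call.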

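【Question Discussion】:

If the students have ID numbers, couldn't you use a randomly generated number to choose which student to pick?
@Geshode I assigned the student ID as the document ID.
So, couldn't you use a randomly generated number? Just an idea.
Can you explain your idea?
This question has been reopened because the question it was marked as a duplicate of concerns a different language.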

Tags: java android firebase google-cloud-firestore

【Solution 1】:

Based on @ajzbc's answer, I wrote this for Unity3D, and it works for me.

// Requires: using Firebase.Firestore; using Firebase.Extensions;
FirebaseFirestore db;

    void Start()
    {
        db = FirebaseFirestore.DefaultInstance;
    }

    public void GetRandomDocument()
    {
        // Generate one random auto-ID and reuse it for both queries, so the
        // fallback query covers exactly the documents the first query excluded.
        string randomKey = db.Collection("Sports").Document().Id;

        Query query1 = db.Collection("Sports").WhereGreaterThanOrEqualTo(FieldPath.DocumentId, randomKey).Limit(1);
        Query query2 = db.Collection("Sports").WhereLessThan(FieldPath.DocumentId, randomKey).Limit(1);

        query1.GetSnapshotAsync().ContinueWithOnMainThread((querySnapshotTask1) =>
        {
            if (querySnapshotTask1.Result.Count > 0)
            {
                foreach (DocumentSnapshot documentSnapshot in querySnapshotTask1.Result.Documents)
                {
                    Debug.Log("Random ID: " + documentSnapshot.Id);
                }
            }
            else
            {
                // No document ID is >= randomKey, so wrap around to the start of the collection.
                query2.GetSnapshotAsync().ContinueWithOnMainThread((querySnapshotTask2) =>
                {
                    foreach (DocumentSnapshot documentSnapshot in querySnapshotTask2.Result.Documents)
                    {
                        Debug.Log("Random ID: " + documentSnapshot.Id);
                    }
                });
            }
        });
    }
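
For readers on the Java side, the idea behind the code above can be sketched without Firebase at all: generate a random key, take the first document ID that is greater than or equal to it, and wrap around to the smallest ID if nothing matches. This is only an in-memory illustration. The class name RandomKeyPick, the helper names, and the sample IDs are made up for this sketch, and the TreeSet merely stands in for a collection ordered by document ID.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.TreeSet;

class RandomKeyPick {
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a random 20-character alphanumeric key (an assumption that mimics
    // the shape of an auto-generated document ID).
    static String randomKey() {
        StringBuilder sb = new StringBuilder(20);
        for (int i = 0; i < 20; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    // Returns the first ID >= key, or the smallest ID if no such ID exists.
    // Returns null when there are no IDs at all.
    static String pick(TreeSet<String> sortedIds, String key) {
        if (sortedIds.isEmpty()) {
            return null;
        }
        String hit = sortedIds.ceiling(key);           // stands in for "documentId >= key, limit 1"
        return hit != null ? hit : sortedIds.first();  // stands in for the wrap-around query
    }

    public static void main(String[] args) {
        // Hypothetical document IDs, for illustration only.
        TreeSet<String> ids = new TreeSet<>(Arrays.asList("Badminton", "Cricket", "Tennis"));
        System.out.println(pick(ids, randomKey()));
    }
}
```

Note that the pick is only close to uniform when the document IDs are themselves spread evenly over the key space, as auto-generated IDs roughly are. With hand-chosen IDs like the ones in main, an ID that follows a large gap is selected more often.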

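【Discussion】:

【Solution 2】:

I had a similar problem (I only needed one random document every 24 hours, or whenever the user manually refreshed the page, but you can apply this solution to your case too), and what worked for me was the following: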

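Technique

1. Read a small batch of documents the first time, say 1 to 10 documents (in your case 10 to 30 or 50).
2. Pick a random document using a randomly generated number within the range of the loaded list.
3. Save the ID of the last document you selected locally on the client device (for example in shared preferences, as I did).
4. When you want a new random document, start the process again (steps 1 to 3) after the saved document ID, which excludes every document that came before it.
5. Repeat until there are no more documents after the saved document ID. Then start over from the beginning, as if this were the first run of the algorithm, by setting the saved document ID to null and running the process again (steps 1 to 4).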
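
The steps above can be sketched in plain Java without any Firebase calls. This is only an in-memory model under stated assumptions: the sorted list stands in for the collection ordered by document ID, the savedId field stands in for the value kept in shared preferences, and the class name CursorWalk and the maxJump parameter are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class CursorWalk {
    private final List<String> sortedIds; // stands in for the collection, ordered by document ID
    private final int maxJump;            // largest batch size to read in one step
    private final Random random;
    private String savedId;               // stands in for the ID saved on the device

    CursorWalk(List<String> sortedIds, int maxJump, Random random) {
        this.sortedIds = new ArrayList<>(sortedIds);
        this.maxJump = maxJump;
        this.random = random;
    }

    // Returns the next pseudo-random ID, or null if there are no documents.
    String next() {
        if (sortedIds.isEmpty()) {
            return null;
        }
        // Step 4: continue after the saved ID (indexOf gives -1 for an unknown ID, so we start at 0).
        int start = (savedId == null) ? 0 : sortedIds.indexOf(savedId) + 1;
        if (start >= sortedIds.size()) {
            start = 0; // Step 5: nothing left after the saved ID, so start over
        }
        int limit = random.nextInt(maxJump) + 1;             // Step 1: read 1..maxJump documents
        int end = Math.min(start + limit, sortedIds.size());
        List<String> batch = sortedIds.subList(start, end);
        savedId = batch.get(random.nextInt(batch.size()));   // Steps 2 and 3: pick one and save it
        return savedId;
    }

    public static void main(String[] args) {
        CursorWalk walk = new CursorWalk(Arrays.asList("a", "b", "c", "d", "e"), 3, new Random());
        for (int i = 0; i < 5; i++) {
            System.out.println(walk.next());
        }
    }
}
```

Within one pass the returned IDs only move forward, so nothing repeats until the walk reaches the last document and wraps around. That is the trade-off this technique makes: cheap reads in exchange for picks that are not independent of each other.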

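Pros and cons of the technique

Pros:

You can decide the jump size each time you fetch a new random document.
No need to modify the original model class of your objects.
No need to modify the database you already have or have designed.
No need to add a document to the collection and handle a random ID for each document when adding new documents, as in the solution mentioned here.
No need to load a large list of documents just to get one document or a small list of documents.
Works well if you are using Firestore auto-generated IDs (because the documents in the collection are already slightly randomized).
Works well if you want one random document or a small random list of documents.
Works on all platforms (including iOS, Android, Web).

Cons:

You have to save the document ID for use in the next random-document request (which is still easier than handling a new field in every document, or adding the ID of every document in the collection to the main collection).
You may get some documents more than once if the list is not large enough (not a problem in my case), and I have not found a solution that avoids this entirely.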

Implementation (Kotlin on Android):

// Read the saved document ID from shared preferences (null if it was never saved).
val documentId: String? = null // replace null with the value read from shared preferences
getRandomDocument(documentId)

fun getRandomDocument(documentId: String?) {
    if (documentId == null) {
        val query = FirebaseFirestore.getInstance()
                .collection(COLLECTION_NAME)
                .limit(getLimitSize())
        loadDataWithQuery(query)
    } else {
        val docRef = FirebaseFirestore.getInstance()
                .collection(COLLECTION_NAME).document(documentId)
        docRef.get().addOnSuccessListener { documentSnapshot ->
            val query = FirebaseFirestore.getInstance()
                    .collection(COLLECTION_NAME)
                    .startAfter(documentSnapshot)
                    .limit(getLimitSize())
            loadDataWithQuery(query)
        }.addOnFailureListener { e ->
            // handle on failure
        }
    }
}

fun loadDataWithQuery(query: Query) {
    query.get().addOnSuccessListener { queryDocumentSnapshots ->
        val documents = queryDocumentSnapshots.documents
        if (documents.isNotEmpty() && documents[documents.size - 1].exists()) {
            //select one document from the loaded list (I selected the last document in the list)
            val snapshot = documents[documents.size - 1]
            var documentId = snapshot.id
            //SAVE the document id in shared preferences here
            //handle the random document here
        } else {
            //handle in case you reach to the end of the list of documents
            //so we start over again as this is the first time we get a random document
            //by calling getRandomDocument() with a null as a documentId
            getRandomDocument(null)
        }
    }
}

fun getLimitSize(): Long {
    val random = Random()
    val listLimit = 10
    return (random.nextInt(listLimit) + 1).toLong()
}

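【Discussion】:

【Solution 3】:

The second approach described by Alex Mamo looks similar to this:

Get the array list that stores the document IDs.
Get a number of strings from that list (I store the document IDs as strings).

The code below takes 3 random, unique strings from the array and stores them in a list, from which you can access the strings and run your queries. I use this code in a fragment: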

    @Nullable
    @Override
    public View onCreateView(@NonNull LayoutInflater inflater, @Nullable ViewGroup container, @Nullable Bundle savedInstanceState) {

        View view = inflater.inflate(R.layout.fragment_category_selection, container, false);

        btnNavFragCat1 = view.findViewById(R.id.btn_category_1);

        btnNavFragCat1.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {

                questionKeyRef.document(tvCat1).get().addOnCompleteListener(new OnCompleteListener<DocumentSnapshot>() {
                    @Override
                    public void onComplete(@NonNull Task<DocumentSnapshot> task) {
                        if (task.isSuccessful()) {

                            DocumentSnapshot document = task.getResult();
                            List<String> questions = (List<String>) document.get("questions"); // This gets the array list from Firestore

                            List<String> randomList = getRandomElement(questions, 0);

                            removeDuplicates(randomList);

                            ...
                        }
                    }
                });

            }
        });

        ...

        return view;
    }

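    private List<String> getRandomElement(List<String> list, int totalItems) {
        int PICK_RANDOM_STRING = 3;
        // Never ask for more unique values than the list holds; otherwise the
        // loop below would never terminate. (Needs java.util.HashSet.)
        int target = Math.min(PICK_RANDOM_STRING, new HashSet<>(list).size());
        Random rand = new Random();
        List<String> newList = new ArrayList<>();
        int count = 0;
        while (count < target) {

            int randomIndex = rand.nextInt(list.size());
            String currentValue = list.get(randomIndex);
            // Only keep values that have not been picked yet.
            if (!newList.contains(currentValue)) {
                newList.add(currentValue);
                count++;
            }
        }

        return newList;
    }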

    private void removeDuplicates(List<String> list) {
        try {
            Log.e("One", list.get(0));
            Log.e("Two", list.get(1));
            Log.e("Three", list.get(2));

            query1 = list.get(0); // In this vars are the strings stored with them you can then make a normal query in Firestore to get the actual document
            query2 = list.get(1);
            query3 = list.get(2);
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

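This is the array I get from Firestore:

【Discussion】:

That's the point... fetching all the documents from a collection is very expensive and not efficient!

【Solution 4】:

Yes, to achieve this, use the following code: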

FirebaseFirestore rootRef = FirebaseFirestore.getInstance();
CollectionReference studentsCollectionReference = rootRef.collection("students");
studentsCollectionReference.get().addOnCompleteListener(new OnCompleteListener<QuerySnapshot>() {
    @Override
    public void onComplete(@NonNull Task<QuerySnapshot> task) {
        if (task.isSuccessful()) {
            List<Student> studentList = new ArrayList<>();
            for (DocumentSnapshot document : task.getResult()) {
                Student student = document.toObject(Student.class);
                studentList.add(student);
            }

            int studentListSize = studentList.size();
            List<Student> randomStudentList = new ArrayList<>();
            for(int i = 0; i < studentListSize; i++) {
                Student randomStudent = studentList.get(new Random().nextInt(studentListSize));
                if(!randomStudentList.contains(randomStudent)) {
                    randomStudentList.add(randomStudent);
                    if(randomStudentList.size() == 10) {
                        break;
                    }
                }
            }
        } else {
            Log.d(TAG, "Error getting documents: ", task.getException());
        }
    }
});

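This is called the classic solution, and you can use it for collections that contain only a few records. If you are worried about the number of reads, I recommend a second approach. It involves a small change to your database: add a new document that holds an array of all student IDs. To get the 10 random students, you then need a single get() call for that document, which means a single read operation. Once you have the array, you can use the same algorithm to pick 10 random IDs. With those random IDs, you can fetch the corresponding documents and add them to a list. This costs only 10 more reads to get the actual random students, so 11 document reads in total.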
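
The ID-picking step of this second approach can be sketched in plain Java. This is a minimal illustration under stated assumptions, not code from the answer: the class name RandomIdSampler and the "student1"..."student1000" IDs are invented, and the list is built in memory where a real app would read it from the array field of the extra document. It uses a partial Fisher-Yates shuffle, which always finishes after exactly `count` swaps and never returns the same position twice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class RandomIdSampler {
    // Picks up to `count` entries from `ids` at distinct positions.
    // Assumes `ids` contains no duplicate values, as document IDs are unique.
    static List<String> sample(List<String> ids, int count, Random random) {
        List<String> pool = new ArrayList<>(ids); // copy, so the caller's list is untouched
        int k = Math.min(count, pool.size());
        for (int i = 0; i < k; i++) {
            // Swap a random element from the not-yet-chosen tail into position i.
            int j = i + random.nextInt(pool.size() - i);
            Collections.swap(pool, i, j);
        }
        return new ArrayList<>(pool.subList(0, k));
    }

    public static void main(String[] args) {
        // Hypothetical ID list standing in for the array stored in the extra document.
        List<String> studentIds = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            studentIds.add("student" + i);
        }
        List<String> picked = sample(studentIds, 10, new Random());
        System.out.println(picked); // each picked ID would then be fetched with one document read
    }
}
```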

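This practice is called denormalization (duplicating data) and is common with Firebase. If you are new to NoSQL databases, I recommend watching this video for a better understanding: Denormalization is normal with the Firebase Database. It is about the Firebase Realtime Database, but the same principles apply to Cloud Firestore.

Remember, though, that just as you add student IDs to this newly created document, you also need to remove them when they are no longer needed.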

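To add a student ID to the array, update the array field like this (studentIdsDocRef and studentId are placeholder names):

studentIdsDocRef.update("yourArrayProperty", FieldValue.arrayUnion(studentId));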

To remove a student ID, use (again with placeholder names):

studentIdsDocRef.update("yourArrayProperty", FieldValue.arrayRemove(studentId));

To get all 10 random students at once, you can use a List<Task<DocumentSnapshot>> and then call Tasks.whenAllSuccess(tasks), as I explained in my answer to this post:

Android Firestore convert array of document references to List<Pojo>

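【Discussion】:

Not memory efficient. If I only want 10 documents, why retrieve what could be 1000?
Use the first solution for small datasets; the second approach means only 11 reads for 10 random students.
randomNumber is the same throughout the for loop, so I think the same document will be stored 10 times.
@PratikButani Good catch, Pratik. I've just updated my answer with the correct approach, and thanks for helping me correct it. No more duplicates now ;)
Done on my side as well, since in the end your answer gave me the hint. :) Thanks.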

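【Solution 5】:

Based on Alex's answer, I got a hint for fetching random records from the Firebase Firestore database (especially for small amounts of data).

I ran into a few problems with his answer, as follows:

It returns the same record every time, because randomNumber is not updated.
Even if we update randomNumber on every iteration, the final list may still contain duplicate records.
It may contain records that we are already displaying.

I updated the answer as follows: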

    FirebaseFirestore database = FirebaseFirestore.getInstance();
    CollectionReference collection = database.collection(VIDEO_PATH);
    collection.get().addOnCompleteListener(new OnCompleteListener<QuerySnapshot>() {
        @Override
        public void onComplete(@NonNull Task<QuerySnapshot> task) {
            if (task.isSuccessful()) {
                List<VideoModel> videoModelList = new ArrayList<>();
                for (DocumentSnapshot document : Objects.requireNonNull(task.getResult())) {
                    VideoModel student = document.toObject(VideoModel.class);
                    videoModelList.add(student);
                }

                /* Get Size of Total Items */
                int size = videoModelList.size();
                /* Random Array List */
                ArrayList<VideoModel> randomVideoModels = new ArrayList<>();
                /* for-loop: It will loop all the data if you want 
                 * RANDOM + UNIQUE data.
                 * */
                for (int i = 0; i < size; i++) {
                    // Getting random number (inside loop just because every time we'll generate new number)
                    int randomNumber = new Random().nextInt(size);

                    VideoModel model = videoModelList.get(randomNumber);

                    // Check with current items whether its same or not
                    // It will helpful when you want to show related items excepting current item
                    if (!model.getTitle().equals(mTitle)) {
                        // Check whether current list is contains same item.
                        // May random number get similar again then its happens
                        if (!randomVideoModels.contains(model))
                            randomVideoModels.add(model);

                        // How many random items you want 
                        // I want 6 items so It will break loop if size will be 6.
                        if (randomVideoModels.size() == 6) break;
                    }
                }

                // Bind adapter
                if (randomVideoModels.size() > 0) {
                    adapter = new RelatedVideoAdapter(VideoPlayerActivity.this, randomVideoModels, VideoPlayerActivity.this);
                    binding.recyclerView.setAdapter(adapter);
                }
            } else {
                Log.d("TAG", "Error getting documents: ", task.getException());
            }
        }
    });

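I hope this logic helps everyone who has a small amount of data. I don't think it will cause any problems for 1,000 to 5,000 records.

Thanks.

【Discussion】:

What does the VideoModel class look like?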