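【Question Title】: How to find Jaccard similarity of classes in Pandas
【Posted】: 2017-01-27 11:00:28
【Question Description】:

I want to find the Jaccard similarity between every pair of groups in my dataset. My data looks like this; the first column holds the data items and the second column holds the class label: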

import pandas as pd
import numpy as np
df = pd.DataFrame({'Data' : ["a1","a2","a3","a4","a5","a6","a7"], 'ClassLable' :     ["c1","c2","c2","c2","c3","c3","c1"]}); df
df2 = pd.DataFrame({'Data' : ["a1","a2","a4","a6","a7","a8","a9"], 'ClassLable' : ["c11","c21","c21","c12","c13","c13","c11"]}); df2

I want to compute the Jaccard index for every pair of class labels between df and df2. For example:

c1class = pd.DataFrame({'Data':["a1","a7"]})
c11class = pd.DataFrame({'Data':["a1","a9"]})
Jaccard = 1/3

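In other words, for each pair of class labels taken from df and df2, I want the number of items in the intersection divided by the number of items in the union.

【Question Comments】:

    Tags: python pandas scikit-learn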


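    【Solution 1】:

    Are you looking for something like this:

    # Note: jaccard_similarity_score was deprecated in scikit-learn 0.21 and removed in 0.23
    from sklearn.metrics import jaccard_similarity_score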
    
    jaccard_similarity_score(df['Data'],df2['Data'])
    Out[92]: 0.2857142857142857
    
    jaccard_similarity_score(c1class, c11class)
    Out[93]: 0.5
    

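    【Comments】:

    • No. I want to find the Jaccard similarity between different groups of data. I want to compute the Jaccard index for each pair of classes, so that I can find similar classes based on the objects they share.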
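
One possible way to get what this comment describes, sketched under a few assumptions: each class is treated as the set of its `Data` values, and the Jaccard index of two classes is the size of their intersection divided by the size of their union. The helper names `class_sets` and `pairwise_jaccard` are made up for this illustration and are not part of pandas or scikit-learn; the column names follow the question, including the `ClassLable` spelling.

```python
import pandas as pd

# Data from the question (column name 'ClassLable' kept as spelled there).
df = pd.DataFrame({
    "Data": ["a1", "a2", "a3", "a4", "a5", "a6", "a7"],
    "ClassLable": ["c1", "c2", "c2", "c2", "c3", "c3", "c1"],
})
df2 = pd.DataFrame({
    "Data": ["a1", "a2", "a4", "a6", "a7", "a8", "a9"],
    "ClassLable": ["c11", "c21", "c21", "c12", "c13", "c13", "c11"],
})


def class_sets(frame):
    """Hypothetical helper: map each class label to the set of its Data values."""
    return frame.groupby("ClassLable")["Data"].apply(set)


def pairwise_jaccard(left, right):
    """Hypothetical helper: Jaccard index for every (left class, right class) pair."""
    left_sets = class_sets(left)
    right_sets = class_sets(right)
    rows = {}
    for a, items_a in left_sets.items():
        rows[a] = {}
        for b, items_b in right_sets.items():
            union = items_a | items_b
            # |A ∩ B| / |A ∪ B|; the guard only matters if both sets were empty.
            rows[a][b] = len(items_a & items_b) / len(union) if union else 0.0
    # Rows are classes of `left`, columns are classes of `right`.
    return pd.DataFrame.from_dict(rows, orient="index")


jaccard = pairwise_jaccard(df, df2)
print(jaccard)
```

With the question's data, the `c1`/`c11` cell comes out as 1/3, which matches the hand-computed example in the question. The nested loop is quadratic in the number of classes, which should be fine for a small number of labels; for many classes a sparse-matrix approach would probably be a better fit.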