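[Question title]: Cassandra-based Mahout user friend recommendations
[Posted]: 2014-02-26 20:17:35
[Question]:

I want to recommend to a user a list of other users that the current user could add as friends.

I am using Cassandra and Mahout. The Mahout integration package already contains a CassandraDataModel implementation, and I would like to use that class.

So my recommender class looks like this: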

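import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.cassandra.CassandraDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.CachingRecommender;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.AveragingPreferenceInferrer;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class UserFriendsRecommender {

    // Injected by the DI container; the @Inject import depends on the framework in use.
    @Inject
    private CassandraDataModel dataModel;

    public List<RecommendedItem> recommend(Long userId, int number) throws TasteException {
        UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(dataModel);
        // Optional:
        userSimilarity.setPreferenceInferrer(new AveragingPreferenceInferrer(dataModel));

        // Use the 3 most similar users as the neighborhood.
        UserNeighborhood neighborhood =
                new NearestNUserNeighborhood(3, userSimilarity, dataModel);
        Recommender recommender =
                new GenericUserBasedRecommender(dataModel, neighborhood, userSimilarity);
        Recommender cachingRecommender = new CachingRecommender(recommender);
        return cachingRecommender.recommend(userId, number);
    }
}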

CassandraDataModel has 4 column families:

static final String USERS_CF = "users";
static final String ITEMS_CF = "items";
static final String USER_IDS_CF = "userIDs";
static final String ITEM_IDS_CF = "itemIDs";

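I am having trouble understanding this class, especially its column families. Is there an example I could look at, or could someone explain it with a small example?

The javadoc says: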

 * <p>
 * First, it uses a column family called "users". This is keyed by the user ID
 * as an 8-byte long. It contains a column for every preference the user
 * expresses. The column name is item ID, again as an 8-byte long, and value is
 * a floating point value represented as an IEEE 32-bit floating point value.
 * </p>
 * 
 * <p>
 * It uses an analogous column family called "items" for the same data, but
 * keyed by item ID rather than user ID. In this column family, column names are
 * user IDs instead.
 * </p>
 * 
 * <p>
 * It uses a column family called "userIDs" as well, with an identical schema.
 * It has one row under key 0. It contains a column for every user ID in the
 * model. It has no values.
 * </p>
 * 
 * <p>
 * Finally it also uses an analogous column family "itemIDs" containing item
 * IDs.
 * </p>

[Comments]:

    Tags: cassandra mahout mahout-recommender


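    [Answer 1]:

    All of the following statements for the column families required by CassandraDataModel should be executed in cassandra-cli, under the keyspace you created (recommender or some other name).

    1: Table users

    userID is the row key, each itemID has its own column name, and the value is the preference: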

    CREATE COLUMN FAMILY users
    WITH comparator = LongType
    AND key_validation_class=LongType
    AND default_validation_class=FloatType;
    

    Insert values:

    set users[0][0]='1.0';
    set users[1][0]='3.0';
    set users[2][2]='1.0';
    

    2: Table items

    itemID is the row key, each userID has its own column name, and the value is the preference:

    CREATE COLUMN FAMILY items
    WITH comparator = LongType
    AND key_validation_class=LongType
    AND default_validation_class=FloatType;
    

    Insert values:

    set items[0][0]='1.0';
    set items[0][1]='3.0';
    set items[2][2]='1.0';
    

    3: Table userIDs

    This table has only one row but many columns: each userID has its own column.

    CREATE COLUMN FAMILY userIDs
    WITH comparator = LongType
    AND key_validation_class=LongType;
    

    Insert values:

    set userIDs[0][0]='';
    set userIDs[0][1]='';
    set userIDs[0][2]='';
    

    4: Table itemIDs

    This table has only one row but many columns: each itemID has its own column.

    CREATE COLUMN FAMILY itemIDs
    WITH comparator = LongType
    AND key_validation_class=LongType;
    

    Insert values:

    set itemIDs[0][0]='';
    set itemIDs[0][1]='';
    set itemIDs[0][2]='';
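
The four column families have to be kept consistent with each other, so it can help to generate the `set` statements from one list of preferences. The following is a minimal sketch that prints cassandra-cli statements in the same form as the ones above. The class `CliStatementSketch` and its method are hypothetical helpers, not part of Mahout or Cassandra.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper: builds cassandra-cli "set" statements for all four
// column families from (userID, itemID, preference) triples.
final class CliStatementSketch {

    // pairs[i] = {userID, itemID}; values[i] = preference for that pair.
    static List<String> statements(long[][] pairs, float[] values) {
        List<String> out = new ArrayList<>();
        TreeSet<Long> userIds = new TreeSet<>();
        TreeSet<Long> itemIds = new TreeSet<>();
        for (int i = 0; i < pairs.length; i++) {
            long user = pairs[i][0];
            long item = pairs[i][1];
            // Same preference, stored once per lookup direction.
            out.add("set users[" + user + "][" + item + "]='" + values[i] + "';");
            out.add("set items[" + item + "][" + user + "]='" + values[i] + "';");
            userIds.add(user);
            itemIds.add(item);
        }
        // Single row under key 0; one valueless column per known ID.
        for (long user : userIds) {
            out.add("set userIDs[0][" + user + "]='';");
        }
        for (long item : itemIds) {
            out.add("set itemIDs[0][" + item + "]='';");
        }
        return out;
    }

    public static void main(String[] args) {
        long[][] pairs = {{0L, 0L}, {1L, 0L}, {2L, 2L}};
        float[] values = {1.0f, 3.0f, 1.0f};
        for (String statement : statements(pairs, values)) {
            System.out.println(statement);
        }
    }
}
```

Only IDs that appear in at least one preference end up in `userIDs` and `itemIDs` in this sketch.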
    

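    [Comments]:

      [Answer 2]:

      As a supplement to the answer above: for Cassandra 2.0, where the cli is deprecated, the equivalent in the new syntax is as follows.

      Table users: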

      CREATE TABLE users (userID bigint, itemID bigint, value float, PRIMARY KEY (userID, itemID));

      Table items:

      CREATE TABLE items (itemID bigint, userID bigint, value float, PRIMARY KEY (itemID, userID));

      Table userIDs:

      CREATE TABLE userIDs (id bigint, userID bigint, PRIMARY KEY (id, userID));

      Table itemIDs:

      CREATE TABLE itemIDs (id bigint, itemID bigint, PRIMARY KEY (id, itemID));
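
The answer gives the table definitions but no inserts. As a rough sketch, assuming the four tables and column names defined above, one preference maps to one CQL row per table, with `id` fixed at 0 for the two ID tables. The class `CqlInsertSketch` is a hypothetical helper that only builds the statement strings. It does not talk to Cassandra.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds CQL INSERT statements for one preference,
// assuming the users/items/userIDs/itemIDs tables defined above.
final class CqlInsertSketch {

    static List<String> inserts(long userID, long itemID, float value) {
        List<String> out = new ArrayList<>();
        out.add("INSERT INTO users (userID, itemID, value) VALUES ("
                + userID + ", " + itemID + ", " + value + ");");
        out.add("INSERT INTO items (itemID, userID, value) VALUES ("
                + itemID + ", " + userID + ", " + value + ");");
        // The ID tables keep every entry in the single partition id = 0.
        out.add("INSERT INTO userIDs (id, userID) VALUES (0, " + userID + ");");
        out.add("INSERT INTO itemIDs (id, itemID) VALUES (0, " + itemID + ");");
        return out;
    }

    public static void main(String[] args) {
        for (String statement : inserts(1L, 0L, 3.0f)) {
            System.out.println(statement);
        }
    }
}
```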

      [Comments]:
