【Posted】: 2015-06-29 05:51:52
【Question】:
I have an RDD:
JavaPairRDD<Long, ViewRecord> myRDD
created with the newAPIHadoopRDD method. I have an existing Hadoop map function that I would like to implement the Spark way:
LongWritable one = new LongWritable(1L);
protected void map(Long key, ViewRecord viewRecord, Context context)
        throws IOException, InterruptedException {
    String url = viewRecord.getUrl();
    long day = viewRecord.getDay();
    tuple.getKey().set(url);
    tuple.getValue().set(day);
    context.write(tuple, one);
}
PS: the tuple is derived from:
KeyValueWritable<Text, LongWritable>
which can be found here: TextLong.java
【Discussion】:
Tags: hadoop apache-spark
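For reference, the map function above can be reduced to a pure function that turns one record into a ((url, day), 1) pair, which is the shape Spark's mapToPair expects. The following is only a minimal sketch in plain Java with no Spark dependency. The ViewRecord class here is a hypothetical stand-in that models just the two getters used above, and the names ViewRecordMapping and toPair are made up for illustration. Map.Entry plays the role that scala.Tuple2 would play in a Spark job.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

final class ViewRecordMapping {

    // Hypothetical stand-in for the question's ViewRecord class;
    // only getUrl() and getDay() are modelled.
    static final class ViewRecord {
        private final String url;
        private final long day;

        ViewRecord(String url, long day) {
            this.url = url;
            this.day = day;
        }

        String getUrl() { return url; }
        long getDay() { return day; }
    }

    // Pure equivalent of the Mapper body: emits ((url, day), 1L).
    // The incoming key is ignored, as in the original map method.
    // A new immutable pair is built per record instead of reusing a
    // mutable Writable tuple.
    static Map.Entry<Map.Entry<String, Long>, Long> toPair(Long key, ViewRecord record) {
        Map.Entry<String, Long> urlAndDay =
                new SimpleImmutableEntry<>(record.getUrl(), record.getDay());
        return new SimpleImmutableEntry<>(urlAndDay, 1L);
    }

    // With Spark on the classpath, the same logic would be passed to mapToPair
    // (assuming myRDD is the JavaPairRDD<Long, ViewRecord> from the question):
    //
    //   JavaPairRDD<Tuple2<String, Long>, Long> ones = myRDD.mapToPair(
    //       kv -> new Tuple2<>(new Tuple2<>(kv._2().getUrl(), kv._2().getDay()), 1L));

    public static void main(String[] args) {
        ViewRecord record = new ViewRecord("http://example.com/a", 20150629L);
        System.out.println(toPair(42L, record));
    }
}
```

Building a fresh immutable pair per record, rather than mutating one shared Writable as the Mapper does, is a deliberate choice in this sketch: Hadoop Writable types do not implement java.io.Serializable, and reusing a single mutable object tends to cause trouble once Spark caches or shuffles the data.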