
Hadoop MapReduce Data Sorting

Time: 2021-07-01 10:21:17

Suppose we have the following three input files, one number per line:

file0

[plain]

2
32
654
32
15
756
65223

file1

[plain]

5956
22
650
92

file2

[plain]

26
54
6

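Because the framework sorts the keys before handing them to reduce, and IntWritable keys compare by numeric value rather than in dictionary order, the default rules are sufficient. The result is globally ordered only when the job runs a single reduce task, which is the default.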
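The ordering of the keys depends on the key type: IntWritable keys compare by numeric value, whereas string-like keys would compare in dictionary order. The following is a minimal sketch of that difference in plain Java, without Hadoop; `KeyOrderDemo` and its two methods are hypothetical names used only for illustration, and `Integer` and `String` stand in for the Hadoop key types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class KeyOrderDemo {
    // Sorts the numbers by numeric value, the way integer keys compare.
    static List<Integer> numericOrder(List<Integer> numbers) {
        List<Integer> sorted = new ArrayList<>(numbers);
        Collections.sort(sorted);
        return sorted;
    }

    // Sorts the same numbers by their string form, i.e. in dictionary order.
    static List<String> dictionaryOrder(List<Integer> numbers) {
        List<String> sorted = new ArrayList<>();
        for (Integer n : numbers) {
            sorted.add(String.valueOf(n));
        }
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // The contents of file1 from the example.
        List<Integer> file1 = Arrays.asList(5956, 22, 650, 92);
        System.out.println(numericOrder(file1));    // [22, 92, 650, 5956]
        System.out.println(dictionaryOrder(file1)); // [22, 5956, 650, 92]
    }
}
```

This is why the mapper parses each line into an integer key instead of passing the text through unchanged.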

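[java]

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Both classes below are static nested classes of the job's driver class (not shown).

// map parses each input line into an IntWritable and emits it as the output key
public static class Map extends
        Mapper<Object, Text, IntWritable, IntWritable> {
    private static IntWritable data = new IntWritable();

    // map implementation
    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        // trim so that stray whitespace does not break parseInt
        data.set(Integer.parseInt(line.trim()));
        context.write(data, new IntWritable(1));
    }
}

// reduce writes the input key as the output value, once for each element
// in the input value list, so duplicates are kept;
// the shared linenum is written as the output key and holds the key's rank
public static class Reduce extends
        Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
    private static IntWritable linenum = new IntWritable(1);

    // reduce implementation
    @Override
    public void reduce(IntWritable key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
        for (IntWritable val : values) {
            context.write(linenum, key);
            linenum = new IntWritable(linenum.get() + 1);
        }
    }
}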

The output is as follows (rank, then value):

[plain]

1 2
2 6
3 15
4 22
5 26
6 32
7 32
8 54
9 92
10 650
11 654
12 756
13 5956
14 65223
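To see how the listing above comes about without a cluster, the three phases can be imitated in plain Java. This is only a sketch under stated assumptions, not the Hadoop job itself: `SortSimulation` and `run` are hypothetical names, a `TreeMap` stands in for the framework's sort-and-group step, and a space is used as the separator between rank and value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortSimulation {
    // Runs an in-memory imitation of the job and returns the "rank value" lines.
    static List<String> run(List<List<String>> files) {
        // "Map" and "shuffle": each parsed number becomes a key. The TreeMap keeps
        // keys in ascending numeric order and collects one value (1) per occurrence.
        TreeMap<Integer, List<Integer>> grouped = new TreeMap<>();
        for (List<String> file : files) {
            for (String line : file) {
                int key = Integer.parseInt(line.trim());
                grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
            }
        }
        // "Reduce": write each key once per collected value, numbering the lines.
        List<String> output = new ArrayList<>();
        int linenum = 1;
        for (Map.Entry<Integer, List<Integer>> entry : grouped.entrySet()) {
            for (Integer ignored : entry.getValue()) {
                output.add(linenum + " " + entry.getKey());
                linenum++;
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // The three input files from the example.
        List<List<String>> files = Arrays.asList(
                Arrays.asList("2", "32", "654", "32", "15", "756", "65223"),
                Arrays.asList("5956", "22", "650", "92"),
                Arrays.asList("26", "54", "6"));
        for (String line : run(files)) {
            System.out.println(line);
        }
    }
}
```

Because the value list keeps one entry per occurrence, the duplicate 32 is written twice and receives ranks 6 and 7, matching the listing.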
