tutorial:ibm_demo
  • Grep

To generate the input files:

time hadoop jar /opt/hadoop/hadoop-1.2.1/hadoop-examples-1.2.1.jar  randomtextwriter -Ddfs.block.size=1073741824 -Dtest.randomtextwrite.bytes_per_map=19478485 -Dtest.randomtextwrite.maps_per_host=16  /Workloads/grep/datagrep_1
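
To make the sizes in this command easier to read, here is a small arithmetic sketch. It assumes, as in the Hadoop 1.x RandomTextWriter example, that each node runs maps_per_host map tasks and each map writes bytes_per_map bytes. NODES is a hypothetical cluster size that does not come from this page; set it to your own number of TaskTrackers.

```shell
#!/bin/sh
# Values copied from the randomtextwriter command above.
BLOCK_SIZE=1073741824      # -Ddfs.block.size
BYTES_PER_MAP=19478485     # -Dtest.randomtextwrite.bytes_per_map
MAPS_PER_HOST=16           # -Dtest.randomtextwrite.maps_per_host

# Assumption: cluster size is not given on this page, 4 is only a placeholder.
NODES=${NODES:-4}

block_gib=$((BLOCK_SIZE / 1024 / 1024 / 1024))
bytes_per_host=$((BYTES_PER_MAP * MAPS_PER_HOST))
total_maps=$((MAPS_PER_HOST * NODES))
total_bytes=$((bytes_per_host * NODES))

echo "HDFS block size:        ${block_gib} GiB"
echo "Bytes written per node: ${bytes_per_host}"
echo "Map tasks in total:     ${total_maps}"
echo "Bytes written in total: ${total_bytes}"
```

Each map writes far less than one block, so every output file should occupy a single HDFS block.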

To grep for "toto" in the generated files:

time hadoop jar /opt/hadoop/hadoop-1.2.1/hadoop-examples-1.2.1.jar grep /Workloads/grep/datagrep_1 /Workloads/grep/datagrep-out "toto"
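
The grep job refuses to start if its output directory already exists, so repeated runs need a cleanup step. The following is a minimal sketch of a rerunnable wrapper, not part of the original demo. It assumes the input is the directory written by randomtextwriter (/Workloads/grep/datagrep_1) and that the result lands in part-* files under the output directory. The run helper and the DRY_RUN switch are hypothetical names introduced here; by default the script only prints the commands.

```shell
#!/bin/sh
# Paths taken from the commands on this page.
JAR=/opt/hadoop/hadoop-1.2.1/hadoop-examples-1.2.1.jar
IN=/Workloads/grep/datagrep_1      # assumption: output of randomtextwriter
OUT=/Workloads/grep/datagrep-out

# Hypothetical switch: 1 prints the commands, 0 executes them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove a previous result (Hadoop 1.x syntax), then run the job.
run hadoop fs -rmr "$OUT"
run hadoop jar "$JAR" grep "$IN" "$OUT" toto

# Show the match counts; the glob is expanded by Hadoop, not by the shell.
run hadoop fs -cat "$OUT/part-*"
```

Run it with DRY_RUN=0 once the printed commands look right for your cluster.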
  • TeraSort

To generate the input files:

time hadoop jar /opt/hadoop/hadoop-1.2.1/hadoop-examples-1.2.1.jar teragen -Ddfs.block.size=536870912 -Dmapred.map.tasks=32 -Dmapred.reduce.tasks=16 -Dmapred.map.tasks.speculative.execution=true -Dmapred.compress.map.output=true   10000000 /Workloads/teragen/ter_in.4967
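
As a rough check of how much data this teragen run produces, the sketch below multiplies the row count by the TeraGen row size of 100 bytes and divides it across the requested map tasks. It assumes teragen honours the mapred.map.tasks value given above, so that each map writes one file.

```shell
#!/bin/sh
# Values copied from the teragen command above.
ROWS=10000000              # number of rows requested
MAP_TASKS=32               # -Dmapred.map.tasks
BLOCK_SIZE=536870912       # -Ddfs.block.size

ROW_BYTES=100              # TeraGen writes 100-byte rows

total_bytes=$((ROWS * ROW_BYTES))
rows_per_map=$((ROWS / MAP_TASKS))
bytes_per_map=$((total_bytes / MAP_TASKS))
block_mib=$((BLOCK_SIZE / 1024 / 1024))

echo "Total input size:   ${total_bytes} bytes"
echo "Rows per map task:  ${rows_per_map}"
echo "Bytes per map file: ${bytes_per_map}"
echo "HDFS block size:    ${block_mib} MiB"
```

With these values the whole data set is about 1 GB, and each map output file is much smaller than one 512 MiB block.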

To sort the generated data:

time hadoop jar /opt/hadoop/hadoop-1.2.1/hadoop-examples-1.2.1.jar  terasort -Ddfs.block.size=536870912 -Dio.file.buffer.size=524288  -Dmapred.map.tasks=32 -Dmapred.reduce.tasks=16 -Dio.sort.factor=48 -Dio.sort.mb=650 -Dio.sort.record.percent=0.138 -Dio.sort.spill.percent=1.0  /Workloads/teragen/ter_in.4967 /Workloads/terasort/ter_out.4967
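
The io.sort.* values in this command can be read as a split of the map-side sort buffer. The sketch below is one interpretation, not something stated on this page. It assumes the Hadoop 1.x behaviour where io.sort.record.percent reserves part of io.sort.mb for per-record accounting at 16 bytes per record, and it assumes records of roughly 100 bytes (serialization overhead ignored). ACCT_BYTES and RECORD_BYTES are therefore assumptions.

```shell
#!/bin/sh
# Values copied from the terasort command above.
IO_SORT_MB=650             # -Dio.sort.mb
RECORD_PERCENT=0.138       # -Dio.sort.record.percent

# Assumptions, not taken from this page.
ACCT_BYTES=16              # accounting bytes per record in Hadoop 1.x
RECORD_BYTES=100           # approximate size of one TeraSort record

buffer_bytes=$((IO_SORT_MB * 1024 * 1024))

# Records that fit in the accounting part of the buffer.
acct_records=$(awk -v b="$buffer_bytes" -v p="$RECORD_PERCENT" -v a="$ACCT_BYTES" \
    'BEGIN { printf "%d", b * p / a }')

# Records that fit in the data part of the buffer.
data_records=$(awk -v b="$buffer_bytes" -v p="$RECORD_PERCENT" -v r="$RECORD_BYTES" \
    'BEGIN { printf "%d", b * (1 - p) / r }')

# Fraction at which both parts fill up at the same time.
balanced=$(awk -v a="$ACCT_BYTES" -v r="$RECORD_BYTES" \
    'BEGIN { printf "%.3f", a / (a + r) }')

echo "Accounting capacity: ${acct_records} records"
echo "Data capacity:       ${data_records} records"
echo "Balanced fraction:   ${balanced}"
```

Under these assumptions both capacities come out nearly equal, which would explain the choice of 0.138. Together with io.sort.spill.percent=1.0, the buffer would then spill only when it is completely full.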

Here parameters

tutorial/ibm_demo.txt · Last modified: 2014/12/17 09:29 (external edit)