时间:2021-07-01 10:21:17 帮助过:25人阅读
最近要做一次关于yarn的分享,于是想搭建一个Hadoop环境。Hadoop 2.0较之前的Hadoop 0.1x变化比较大,折腾了好久了,终于把环境搞好了。我搭建了一个两节点的集群,只配置了一些必须的参数,让集群勉强跑起来。 1、core-site.xml configurationpropertynamef
最近要做一次关于yarn的分享,于是想搭建一个Hadoop环境。Hadoop 2.0较之前的Hadoop 0.1x变化比较大,折腾了好久了,终于把环境搞好了。我搭建了一个两节点的集群,只配置了一些必须的参数,让集群勉强跑起来。
1、core-site.xml
- <configuration>
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://10.232.42.91:19000/</value>
- </property>
- </configuration>
2、mapred-site.xml
- <configuration>
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
- </configuration>
3、yarn-site.xml
- <configuration>
- <property>
- <name>yarn.resourcemanager.address</name>
- <value>hdfs://10.232.42.91:19001/</value>
- </property>
- <property>
- <name>yarn.resourcemanager.resource-tracker.address</name>
- <value>hdfs://10.232.42.91:19002/</value>
- </property>
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce.shuffle</value>
- </property>
- <property>
- <name>yarn.resourcemanager.scheduler.address</name>
- <value>10.232.42.91:8030</value>
- </property>
- </configuration>
把JAVA_HOME、HADOOP_HOME都设置到.bashrc里面去,然后运行sbin/start-all.sh。使用jps可以看到两个节点下运行的进程如下。
- [master] jps
- 31318 ResourceManager
- 28981 DataNode
- 11580 JobHistoryServer
- 28858 NameNode
- 29155 SecondaryNameNode
- 31426 NodeManager
- 11016 Jps
- [slave] jps
- 12592 NodeManager
- 11711 DataNode
- 17699 Jps
上面这个JobHistoryServer需要单独启动,通过它可以看到每个application的详细日志。启动命令如下。
- sbin/mr-jobhistory-daemon.sh start historyserver
打开http://10.232.42.91:8088/cluster/cluster这个地址可以看到cluster的介绍信息。这里再也看不到slot相关的数据了。
万事俱备。放点文本数据到hdfs://10.232.42.91:19000/input这个目录下,运行wordcount看看效果。
- $ cd hadoop/share/hadoop/mapreduce
- $ hadoop jar hadoop-mapreduce-examples-2.0.3-alpha.jar wordcount hdfs://10.232.42.91:19000/input hdfs://10.232.42.91:19000/output
- 13/03/07 21:08:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- 13/03/07 21:08:26 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
- 13/03/07 21:08:26 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
- 13/03/07 21:08:26 INFO input.FileInputFormat: Total input paths to process : 3
- 13/03/07 21:08:26 INFO mapreduce.JobSubmitter: number of splits:3
- 13/03/07 21:08:26 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
- 13/03/07 21:08:26 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
- 13/03/07 21:08:26 WARN conf.Configuration: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
- 13/03/07 21:08:26 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
- 13/03/07 21:08:26 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
- 13/03/07 21:08:26 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
- 13/03/07 21:08:26 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
- 13/03/07 21:08:26 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
- 13/03/07 21:08:26 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
- 13/03/07 21:08:26 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
- 13/03/07 21:08:26 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
- 13/03/07 21:08:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1362658309553_0019
- 13/03/07 21:08:26 INFO client.YarnClientImpl: Submitted application application_1362658309553_0019 to ResourceManager at /10.232.42.91:19001
- 13/03/07 21:08:26 INFO mapreduce.Job: The url to track the job: http://search042091.sqa.cm4.tbsite.net:8088/proxy/application_1362658309553_0019/
- 13/03/07 21:08:26 INFO mapreduce.Job: Running job: job_1362658309553_0019
- 13/03/07 21:08:33 INFO mapreduce.Job: Job job_1362658309553_0019 running in uber mode : false
- 13/03/07 21:08:33 INFO mapreduce.Job: map 0% reduce 0%
- 13/03/07 21:08:39 INFO mapreduce.Job: map 100% reduce 0%
- 13/03/07 21:08:44 INFO mapreduce.Job: map 100% reduce 100%
- 13/03/07 21:08:44 INFO mapreduce.Job: Job job_1362658309553_0019 completed successfully
- 13/03/07 21:08:44 INFO mapreduce.Job: Counters: 43
- File System Counters
- FILE: Number of bytes read=12698
- FILE: Number of bytes written=312593
- FILE: Number of read operations=0
- FILE: Number of large read operations=0
- FILE: Number of write operations=0
- HDFS: Number of bytes read=16947
- HDFS: Number of bytes written=8739
- HDFS: Number of read operations=12
- HDFS: Number of large read operations=0
- HDFS: Number of write operations=2
- Job Counters
- Launched map tasks=3
- Launched reduce tasks=1
- Rack-local map tasks=3
- Total time spent by all maps in occupied slots (ms)=10750
- Total time spent by all reduces in occupied slots (ms)=4221
- Map-Reduce Framework
- Map input records=317
- Map output records=2324
- Map output bytes=24586
- Map output materialized bytes=12710
- Input split bytes=316
- Combine input records=2324
- Combine output records=885
- Reduce input groups=828
- Reduce shuffle bytes=12710
- Reduce input records=885
- Reduce output records=828
- Spilled Records=1770
- Shuffled Maps =3
- Failed Shuffles=0
- Merged Map outputs=3
- GC time elapsed (ms)=376
- CPU time spent (ms)=4480
- Physical memory (bytes) snapshot=557428736
- Virtual memory (bytes) snapshot=2105122816
- Total committed heap usage (bytes)=254607360
- Shuffle Errors
- BAD_ID=0
- CONNECTION=0
- IO_ERROR=0
- WRONG_LENGTH=0
- WRONG_MAP=0
- WRONG_REDUCE=0
- File Input Format Counters
- Bytes Read=16631
- File Output Format Counters
- Bytes Written=8739
接下来玩玩yarn吧。Hadoop官方文档那篇WritingYarnApplications太让人蛋碎了,好在我领悟到distributedshell就是使用yarn编写的。要研究yarn的话,直接去Hadoop source里面找相应的代码研究即可。
- $ hadoop jar hadoop-yarn-applications-distributedshell-2.0.3-alpha.jar --jar hadoop-yarn-applications-distributedshell-2.0.3-alpha.jar org.apache.hadoop.yarn.applications.distributedshell.Client --shell_command uname --shell_args '-a'
- 13/03/07 21:42:44 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
- 13/03/07 21:42:44 INFO distributedshell.Client: Initializing Client
- 13/03/07 21:42:44 INFO distributedshell.Client: Running Client
- 13/03/07 21:42:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- 13/03/07 21:42:44 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
- 13/03/07 21:42:44 INFO distributedshell.Client: Got Cluster metric info from ASM, numNodeManagers=2
- 13/03/07 21:42:44 INFO distributedshell.Client: Got Cluster node info from ASM
- 13/03/07 21:42:44 INFO distributedshell.Client: Got node report from ASM for, nodeId=search042091.sqa.cm4:39557, nodeAddresssearch042091.sqa.cm4:8042, nodeRackName/default-rack, nodeNumContainers0, nodeHealthStatusis_node_healthy: true, health_report: "", last_health_report_time: 1362663711950,
- 13/03/07 21:42:44 INFO distributedshell.Client: Got node report from ASM for, nodeId=search041134.sqa.cm4:49313, nodeAddresssearch041134.sqa.cm4:8042, nodeRackName/default-rack, nodeNumContainers0, nodeHealthStatusis_node_healthy: true, health_report: "", last_health_report_time: 1362663712038,
- 13/03/07 21:42:44 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.0, queueMaxCapacity=1.0, queueApplicationCount=17, queueChildQueueCount=0
- 13/03/07 21:42:44 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS
- 13/03/07 21:42:44 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=ADMINISTER_QUEUE
- 13/03/07 21:42:44 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
- 13/03/07 21:42:44 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE
- 13/03/07 21:42:44 INFO distributedshell.Client: Min mem capabililty of resources in this cluster 1024
- 13/03/07 21:42:44 INFO distributedshell.Client: Max mem capabililty of resources in this cluster 8192
- 13/03/07 21:42:44 INFO distributedshell.Client: AM memory specified below min threshold of cluster. Using min value., specified=10, min=1024
- 13/03/07 21:42:44 INFO distributedshell.Client: Setting up application submission context for ASM
- 13/03/07 21:42:44 INFO distributedshell.Client: Copy App Master jar from local filesystem and add to local environment
- 13/03/07 21:42:45 INFO distributedshell.Client: Set the environment for the application master
- 13/03/07 21:42:45 INFO distributedshell.Client: Setting up app master command
- 13/03/07 21:42:45 INFO distributedshell.Client: Completed setting up app master command ${JAVA_HOME}/bin/java -Xmx1024m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 10 --num_containers 1 --priority 0 --shell_command uname --shell_args -a --debug 1><log_dir>/AppMaster.stdout 2><log_dir>/AppMaster.stderr
- 13/03/07 21:42:45 INFO distributedshell.Client: Submitting application to ASM
- 13/03/07 21:42:45 INFO client.YarnClientImpl: Submitted application application_1362658309553_0020 to ResourceManager at /10.232.42.91:19001
- 13/03/07 21:42:46 INFO distributedshell.Client: Got application report from ASM for, appId=20, clientToken=null, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=0, appStartTime=1362663765373, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=search042091.sqa.cm4.tbsite.net:8088/proxy/application_1362658309553_0020/, appUser=henshao
- 13/03/07 21:42:47 INFO distributedshell.Client: Got application report from ASM for, appId=20, clientToken=null, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=0, appStartTime=1362663765373, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=search042091.sqa.cm4.tbsite.net:8088/proxy/application_1362658309553_0020/, appUser=henshao
- 13/03/07 21:42:48 INFO distributedshell.Client: Got application report from ASM for, appId=20, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1362663765373, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=search042091.sqa.cm4.tbsite.net:8088/proxy/application_1362658309553_0020/, appUser=henshao
- 13/03/07 21:42:49 INFO distributedshell.Client: Got application report from ASM for, appId=20, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1362663765373, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=search042091.sqa.cm4.tbsite.net:8088/proxy/application_1362658309553_0020/, appUser=henshao
- 13/03/07 21:42:50 INFO distributedshell.Client: Got application report from ASM for, appId=20, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1362663765373, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=search042091.sqa.cm4.tbsite.net:8088/proxy/application_1362658309553_0020/, appUser=henshao
- 13/03/07 21:42:51 INFO distributedshell.Client: Got application report from ASM for, appId=20, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1362663765373, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=search042091.sqa.cm4.tbsite.net:8088/proxy/application_1362658309553_0020/, appUser=henshao
- 13/03/07 21:42:52 INFO distributedshell.Client: Got application report from ASM for, appId=20, clientToken=null, appDiagnostics=, appMasterHost=, appQueue=default, appMasterRpcPort=0, appStartTime=1362663765373, yarnAppState=FINISHED, distributedFinalState=SUCCEEDED, appTrackingUrl=search042091.sqa.cm4.tbsite.net:8088/proxy/application_1362658309553_0020/, appUser=henshao
- 13/03/07 21:42:52 INFO distributedshell.Client: Application has completed successfully. Breaking monitoring loop
- 13/03/07 21:42:52 INFO distributedshell.Client: Application completed successfully
- </log_dir></log_dir>
运行完成之后,找不到输出在哪儿,费了好大的劲,终于在hadoop/logs/userlogs下面找到输出了。不知道为何运行了两个container。
- $ tree hadoop/logs/userlogs/application_1362658309553_0018
- application_1362658309553_0018
- |-- container_1362658309553_0018_01_000001
- | |-- AppMaster.stderr
- | `-- AppMaster.stdout
- `-- container_1362658309553_0018_01_000002
- |-- stderr
- `-- stdout
- $ cat hadoop/logs/userlogs/application_1362658309553_0018/container_1362658309553_0018_01_000002/stdout
- Linux search042091.sqa.cm4 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
好,开始用yarn调度一个程序。我写了一个脚本,里面启动了服务器。
- $ cat ~/start_sp.sh
- #!/bin/env bash
- source /home/admin/.bashrc
- /home/admin/sp/bin/sap_server -c /home/admin/sp/sp_worker/etc/sap_server_app.cfg -l /home/admin/sp/sp_worker/etc/sap_server_log.cfg -k restart
启动起来之后,进程关系图如下。
接着我把脚本直接kill掉,期待yarn给我重启脚本。发现application运行结束了,AppMaster.stderr日志里面有如下内容。
- 13/03/08 21:40:02 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
- 13/03/08 21:40:02 INFO distributedshell.ApplicationMaster: Got container status for containerID=container_1362747551045_0017_01_000002, state=COMPLETE, exitStatus=137, diagnostics=
- Killed by external signal
- 13/03/08 21:40:02 INFO distributedshell.ApplicationMaster: Current application state: loop=464, appDone=true, total=1, requested=1, completed=1, failed=1, currentAllocated=1
- 13/03/08 21:40:02 INFO distributedshell.ApplicationMaster: Application completed. Signalling finish to RM
- 13/03/08 21:40:02 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.AMRMClientImpl is stopped.
- 13/03/08 21:40:02 INFO distributedshell.ApplicationMaster: Application Master failed. exiting
原文地址:Hadoop 2.0配置, 感谢原作者分享。