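The Hadoop YARN web service REST APIs are a set of URI resources that give access to cluster, node, application, and application history information. The URI resources are grouped into APIs by the type of information they return. Some URI resources return collections, while others return singletons.
The URIs for the REST-based web services have the following syntax:
http://{http address of service}/ws/{version}/{resourcepath}
The elements in this syntax are as follows:
{http address of service} - The HTTP address of the service to get information about. Currently supported are the ResourceManager, NodeManager, MapReduce application master, and history server.
{version} - The version of the APIs. In this release, the version is v1.
{resourcepath} - A path that defines a singleton resource or a collection of resources.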
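As a minimal sketch of the syntax above, the three elements can be joined with a small helper. The function name `build_url` is hypothetical and not part of any Hadoop client library; the host and resource path are taken from the examples in this document.

```python
def build_url(service_address: str, resource_path: str, version: str = "v1") -> str:
    """Compose http://{http address of service}/ws/{version}/{resourcepath}."""
    # Strip stray slashes so callers can pass "cluster/apps" or "/cluster/apps/".
    return f"http://{service_address}/ws/{version}/{resource_path.strip('/')}"


print(build_url("rmhost.domain:8088", "cluster/apps/application_1324057493980_0001"))
```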
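The next few sections describe some of the syntax and other details of the HTTP responses of the web service REST APIs.
This release of the web service REST APIs supports responses in JSON and XML formats. JSON is the default. To set the response format, specify the format in the Accept header of the HTTP request.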
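One possible way to set the Accept header from Python's standard library is sketched below. The helper `make_get_request` and the `SUPPORTED_FORMATS` table are illustrative names, not part of the API; the request object is only built here, not sent.

```python
import urllib.request

# The two response formats this release supports, keyed by a short name.
SUPPORTED_FORMATS = {"json": "application/json", "xml": "application/xml"}


def make_get_request(url: str, fmt: str = "json") -> urllib.request.Request:
    """Build (but do not send) a GET request that asks for the given format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response format: {fmt}")
    return urllib.request.Request(
        url, headers={"Accept": SUPPORTED_FORMATS[fmt]}, method="GET"
    )


request = make_get_request(
    "http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_0001",
    "xml",
)
print(request.get_method(), request.get_header("Accept"))
```

Rejecting unknown formats on the client side avoids a round trip that would otherwise end in an error status.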
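As indicated by the HTTP response code, the response body can contain either the data that represents the resource or an error message. On success, the response body is in the selected format, either JSON or XML. On error, the response body is also in JSON or XML, depending on the format requested. The Content-Type header of the response contains the format requested. If the application requests an unsupported format, the response status code is 500. Note that the order of the fields within the response body is not specified and might change. Additional fields might also be added to a response body. Your applications should therefore use parsing routines that can extract data from a response body in any order.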
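The sketch below illustrates order-independent parsing: fields are looked up by name, so reordered or newly added fields do not affect the result. The field `someFutureField` is made up purely to stand in for a field added by a later release.

```python
import json

# Two bodies carrying the same data; the second reorders the fields and
# contains an extra, hypothetical field.
body_a = '{"app": {"id": "application_1324057493980_0001", "state": "ACCEPTED"}}'
body_b = (
    '{"app": {"state": "ACCEPTED", "someFutureField": 1, '
    '"id": "application_1324057493980_0001"}}'
)


def app_id_and_state(body: str) -> tuple:
    """Extract fields by key, ignoring order and any unknown fields."""
    app = json.loads(body)["app"]
    return app["id"], app.get("state")


print(app_id_and_state(body_a) == app_id_and_state(body_b))
```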
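After calling an HTTP request, an application should check the response status code to verify success or detect an error. If the response status code indicates an error, the response body contains an error message. The first field is the exception type; currently only RemoteException is returned. The following table lists the items within the RemoteException error message:
Item | Data Type | Description |
---|---|---|
exception | String | Exception type |
javaClassName | String | Java class name of the exception |
message | String | Detailed message of the exception |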
HTTP Request: GET http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_0001
Response Status Line: HTTP/1.1 200 OK
Response Header:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26)
Response Body:
{
  "app": {
    "id": "application_1324057493980_0001",
    "user": "user1",
    "name": "",
    "queue": "default",
    "state": "ACCEPTED",
    "finalStatus": "UNDEFINED",
    "progress": 0,
    "trackingUI": "UNASSIGNED",
    "diagnostics": "",
    "clusterId": 1324057493980,
    "startedTime": 1324057495921,
    "finishedTime": 0,
    "elapsedTime": 2063,
    "amContainerLogs": "http:\/\/amNM:2\/node\/containerlogs\/container_1324057493980_0001_01_000001",
    "amHostHttpAddress": "amNM:2"
  }
}
Here we request information about an application that does not exist yet.
HTTP Request: GET http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_9999
Response Status Line: HTTP/1.1 404 Not Found
Response Header:
HTTP/1.1 404 Not Found
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26)
Response Body:
{
  "RemoteException": {
    "javaClassName": "org.apache.hadoop.yarn.webapp.NotFoundException",
    "exception": "NotFoundException",
    "message": "java.lang.Exception: app with id: application_1324057493980_9999 not found"
  }
}
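A client might handle such a response as sketched below: check the status code first, and only on error read the three RemoteException items. The `RemoteError` class and `parse_response` function are hypothetical names chosen for this illustration, and the sample body mirrors the 404 example above.

```python
import json


class RemoteError(Exception):
    """Carries the three RemoteException items plus the HTTP status."""

    def __init__(self, status, exception, java_class_name, message):
        super().__init__(f"HTTP {status} {exception}: {message}")
        self.status = status
        self.exception = exception
        self.java_class_name = java_class_name
        self.message = message


def parse_response(status: int, body: str) -> dict:
    """Return the decoded body on success, raise RemoteError otherwise."""
    payload = json.loads(body)
    if 200 <= status < 300:
        return payload
    err = payload.get("RemoteException", {})
    raise RemoteError(
        status, err.get("exception"), err.get("javaClassName"), err.get("message")
    )


not_found_body = json.dumps({
    "RemoteException": {
        "javaClassName": "org.apache.hadoop.yarn.webapp.NotFoundException",
        "exception": "NotFoundException",
        "message": "java.lang.Exception: app with id: "
                   "application_1324057493980_9999 not found",
    }
})

try:
    parse_response(404, not_found_body)
except RemoteError as error:
    print(error.status, error.exception)
```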
You can use any number of tools or languages to call the web service REST APIs. This example uses the curl command line interface to make the REST GET calls.
In this example, a user submits a MapReduce application to the ResourceManager with a command like:
hadoop jar hadoop-mapreduce-test.jar sleep -Dmapred.job.queue.name=a1 -m 1 -r 1 -rt 1200000 -mt 20
The client prints information about the submitted job along with the application ID, similar to:
12/01/18 04:25:15 INFO mapred.ResourceMgrDelegate: Submitted application application_1326821518301_0010 to ResourceManager at host.domain.com/10.10.10.10:8032
12/01/18 04:25:15 INFO mapreduce.Job: Running job: job_1326821518301_0010
12/01/18 04:25:21 INFO mapred.ClientServiceDelegate: The url to track the job: host.domain.com:8088/proxy/application_1326821518301_0010/
12/01/18 04:25:22 INFO mapreduce.Job: Job job_1326821518301_0010 running in uber mode : false
12/01/18 04:25:22 INFO mapreduce.Job: map 0% reduce 0%
The user then wishes to track the application. The user starts by getting information about the application from the ResourceManager. The --compressed option requests compressed output; curl handles decompression on the client side.
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"
Output:
{
  "app": {
    "finishedTime": 0,
    "amContainerLogs": "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
    "trackingUI": "ApplicationMaster",
    "state": "RUNNING",
    "user": "user1",
    "id": "application_1326821518301_0010",
    "clusterId": 1326821518301,
    "finalStatus": "UNDEFINED",
    "amHostHttpAddress": "host.domain.com:8042",
    "progress": 82.44703,
    "name": "Sleep job",
    "startedTime": 1326860715335,
    "elapsedTime": 31814,
    "diagnostics": "",
    "trackingUrl": "http://host.domain.com:8088/proxy/application_1326821518301_0010/",
    "queue": "a1"
  }
}
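The same call could be made from Python roughly as follows. This is a sketch under stated assumptions: it asks only for gzip encoding (a simplification of what curl's --compressed negotiates), and `decode_body` and `fetch_app` are hypothetical helper names. `fetch_app` needs a reachable ResourceManager, so only the decoding step is exercised here.

```python
import gzip
import json
import urllib.request
from typing import Optional


def decode_body(raw: bytes, content_encoding: Optional[str]) -> dict:
    """Decompress the body if the server gzip-encoded it, then parse JSON."""
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))


def fetch_app(rm_address: str, app_id: str) -> dict:
    """GET /ws/v1/cluster/apps/{app_id} from the ResourceManager."""
    request = urllib.request.Request(
        f"http://{rm_address}/ws/v1/cluster/apps/{app_id}",
        headers={"Accept": "application/json", "Accept-Encoding": "gzip"},
    )
    with urllib.request.urlopen(request) as response:
        body = response.read()
        return decode_body(body, response.headers.get("Content-Encoding"))["app"]


# Offline demonstration of the decoding step with a locally compressed body.
sample = gzip.compress(b'{"app": {"state": "RUNNING", "progress": 82.44703}}')
print(decode_body(sample, "gzip")["app"]["state"])
```

Against a live cluster, the call would look like `fetch_app("host.domain.com:8088", "application_1326821518301_0010")`.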
The user then wishes to get more details about the running application and goes directly to the MapReduce application master for this application. The ResourceManager lists the trackingUrl that can be used for this application: http://host.domain.com:8088/proxy/application_1326821518301_0010. This URL can be opened in a web browser or used with the web service REST APIs. The user uses the web service REST APIs to get the list of jobs this MapReduce application master is running:
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs"
Output:
{
  "jobs": {
    "job": [
      {
        "runningReduceAttempts": 1,
        "reduceProgress": 72.104515,
        "failedReduceAttempts": 0,
        "newMapAttempts": 0,
        "mapsRunning": 0,
        "state": "RUNNING",
        "successfulReduceAttempts": 0,
        "reducesRunning": 1,
        "acls": [
          {"value": "", "name": "mapreduce.job.acl-modify-job"},
          {"value": "", "name": "mapreduce.job.acl-view-job"}
        ],
        "reducesPending": 0,
        "user": "user1",
        "reducesTotal": 1,
        "mapsCompleted": 1,
        "startTime": 1326860720902,
        "id": "job_1326821518301_10_10",
        "successfulMapAttempts": 1,
        "runningMapAttempts": 0,
        "newReduceAttempts": 0,
        "name": "Sleep job",
        "mapsPending": 0,
        "elapsedTime": 64432,
        "reducesCompleted": 0,
        "mapProgress": 100,
        "diagnostics": "",
        "failedMapAttempts": 0,
        "killedReduceAttempts": 0,
        "mapsTotal": 1,
        "uberized": false,
        "killedMapAttempts": 0,
        "finishTime": 0
      }
    ]
  }
}
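The step from trackingUrl to the jobs list can be sketched as below: append the application master's resource path to the proxy URL, then pick the running jobs out of the response. Both helper names are illustrative, and the sample body is trimmed to the fields the sketch reads.

```python
import json


def am_jobs_url(tracking_url: str) -> str:
    """Append the MapReduce application master jobs path to the proxy URL."""
    return tracking_url.rstrip("/") + "/ws/v1/mapreduce/jobs"


def running_job_ids(body: str) -> list:
    """Return the IDs of all jobs whose state is RUNNING."""
    jobs = json.loads(body)["jobs"]["job"]
    return [job["id"] for job in jobs if job.get("state") == "RUNNING"]


jobs_body = json.dumps({
    "jobs": {"job": [{"id": "job_1326821518301_10_10", "state": "RUNNING",
                      "mapProgress": 100, "reduceProgress": 72.104515}]}
})

print(am_jobs_url("http://host.domain.com:8088/proxy/application_1326821518301_0010/"))
print(running_job_ids(jobs_body))
```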
The user then wishes to get the task details for the job listed above, with job ID job_1326821518301_10_10.
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks"
Output:
{
  "tasks": {
    "task": [
      {
        "progress": 100,
        "elapsedTime": 5059,
        "state": "SUCCEEDED",
        "startTime": 1326860725014,
        "id": "task_1326821518301_10_10_m_0",
        "type": "MAP",
        "successfulAttempt": "attempt_1326821518301_10_10_m_0_0",
        "finishTime": 1326860730073
      },
      {
        "progress": 72.104515,
        "elapsedTime": 0,
        "state": "RUNNING",
        "startTime": 1326860732984,
        "id": "task_1326821518301_10_10_r_0",
        "type": "REDUCE",
        "successfulAttempt": "",
        "finishTime": 0
      }
    ]
  }
}
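To move from the task list to the attempts of a task that is still running, a client could select tasks by state and build the attempts URL for each. The function name below is hypothetical, and the sample payload keeps only the fields the sketch needs.

```python
def running_task_attempt_urls(jobs_url: str, job_id: str, tasks_payload: dict) -> list:
    """Build .../jobs/{job}/tasks/{task}/attempts for every RUNNING task."""
    urls = []
    for task in tasks_payload["tasks"]["task"]:
        if task["state"] == "RUNNING":
            urls.append(f"{jobs_url}/{job_id}/tasks/{task['id']}/attempts")
    return urls


tasks_payload = {
    "tasks": {"task": [
        {"id": "task_1326821518301_10_10_m_0", "type": "MAP", "state": "SUCCEEDED"},
        {"id": "task_1326821518301_10_10_r_0", "type": "REDUCE", "state": "RUNNING"},
    ]}
}

jobs_url = "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs"
for url in running_task_attempt_urls(jobs_url, "job_1326821518301_10_10", tasks_payload):
    print(url)
```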
The map task has finished, but the reduce task is still running. The user wishes to get the task attempt information for the reduce task task_1326821518301_10_10_r_0. Note that the Accept header is not really required here, since JSON is the default output format:
curl --compressed -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts"
Output:
{
  "taskAttempts": {
    "taskAttempt": [
      {
        "elapsedMergeTime": 158,
        "shuffleFinishTime": 1326860735378,
        "assignedContainerId": "container_1326821518301_0010_01_000003",
        "progress": 72.104515,
        "elapsedTime": 0,
        "state": "RUNNING",
        "elapsedShuffleTime": 2394,
        "mergeFinishTime": 1326860735536,
        "rack": "/10.10.10.0",
        "elapsedReduceTime": 0,
        "nodeHttpAddress": "host.domain.com:8042",
        "type": "REDUCE",
        "startTime": 1326860732984,
        "id": "attempt_1326821518301_10_10_r_0_0",
        "finishTime": 0
      }
    ]
  }
}
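The timing fields in this sample appear to be related by simple subtraction: the shuffle phase runs from startTime to shuffleFinishTime, and the merge phase from shuffleFinishTime to mergeFinishTime. That reading is an assumption inferred from the numbers in this one response, not a documented definition; the sketch below just checks it against the sample values.

```python
# Timing fields copied from the reduce attempt shown above (milliseconds).
attempt = {
    "startTime": 1326860732984,
    "shuffleFinishTime": 1326860735378,
    "mergeFinishTime": 1326860735536,
    "elapsedShuffleTime": 2394,
    "elapsedMergeTime": 158,
}


def reduce_phase_durations(a: dict) -> dict:
    """Derive phase lengths from the timestamps (assumed interpretation)."""
    return {
        "shuffle": a["shuffleFinishTime"] - a["startTime"],
        "merge": a["mergeFinishTime"] - a["shuffleFinishTime"],
    }


durations = reduce_phase_durations(attempt)
print(durations["shuffle"] == attempt["elapsedShuffleTime"],
      durations["merge"] == attempt["elapsedMergeTime"])
```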
The reduce attempt is still running, and the user wishes to see the current counter values for that attempt:
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts/attempt_1326821518301_10_10_r_0_0/counters"
Output:
{
  "JobTaskAttemptCounters": {
    "taskAttemptCounterGroup": [
      {
        "counterGroupName": "org.apache.hadoop.mapreduce.FileSystemCounter",
        "counter": [
          {"value": 4216, "name": "FILE_BYTES_READ"},
          {"value": 77151, "name": "FILE_BYTES_WRITTEN"},
          {"value": 0, "name": "FILE_READ_OPS"},
          {"value": 0, "name": "FILE_LARGE_READ_OPS"},
          {"value": 0, "name": "FILE_WRITE_OPS"},
          {"value": 0, "name": "HDFS_BYTES_READ"},
          {"value": 0, "name": "HDFS_BYTES_WRITTEN"},
          {"value": 0, "name": "HDFS_READ_OPS"},
          {"value": 0, "name": "HDFS_LARGE_READ_OPS"},
          {"value": 0, "name": "HDFS_WRITE_OPS"}
        ]
      },
      {
        "counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
        "counter": [
          {"value": 0, "name": "COMBINE_INPUT_RECORDS"},
          {"value": 0, "name": "COMBINE_OUTPUT_RECORDS"},
          {"value": 1767, "name": "REDUCE_INPUT_GROUPS"},
          {"value": 25104, "name": "REDUCE_SHUFFLE_BYTES"},
          {"value": 1767, "name": "REDUCE_INPUT_RECORDS"},
          {"value": 0, "name": "REDUCE_OUTPUT_RECORDS"},
          {"value": 0, "name": "SPILLED_RECORDS"},
          {"value": 1, "name": "SHUFFLED_MAPS"},
          {"value": 0, "name": "FAILED_SHUFFLE"},
          {"value": 1, "name": "MERGED_MAP_OUTPUTS"},
          {"value": 50, "name": "GC_TIME_MILLIS"},
          {"value": 1580, "name": "CPU_MILLISECONDS"},
          {"value": 141320192, "name": "PHYSICAL_MEMORY_BYTES"},
          {"value": 1118552064, "name": "VIRTUAL_MEMORY_BYTES"},
          {"value": 73728000, "name": "COMMITTED_HEAP_BYTES"}
        ]
      },
      {
        "counterGroupName": "Shuffle Errors",
        "counter": [
          {"value": 0, "name": "BAD_ID"},
          {"value": 0, "name": "CONNECTION"},
          {"value": 0, "name": "IO_ERROR"},
          {"value": 0, "name": "WRONG_LENGTH"},
          {"value": 0, "name": "WRONG_MAP"},
          {"value": 0, "name": "WRONG_REDUCE"}
        ]
      },
      {
        "counterGroupName": "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
        "counter": [
          {"value": 0, "name": "BYTES_WRITTEN"}
        ]
      }
    ],
    "id": "attempt_1326821518301_10_10_r_0_0"
  }
}
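Because the counters arrive as nested lists of groups, it can be convenient to flatten them into a dictionary keyed by group and counter name. The sketch below does this for an abbreviated copy of the response; `flatten_counters` is an illustrative helper, not part of the API.

```python
def flatten_counters(payload: dict) -> dict:
    """Turn the counter groups into {group name: {counter name: value}}."""
    groups = payload["JobTaskAttemptCounters"]["taskAttemptCounterGroup"]
    return {
        group["counterGroupName"]: {c["name"]: c["value"] for c in group["counter"]}
        for group in groups
    }


# Abbreviated copy of the response above: two groups, two counters each.
counters_payload = {
    "JobTaskAttemptCounters": {
        "taskAttemptCounterGroup": [
            {"counterGroupName": "org.apache.hadoop.mapreduce.FileSystemCounter",
             "counter": [{"value": 4216, "name": "FILE_BYTES_READ"},
                         {"value": 77151, "name": "FILE_BYTES_WRITTEN"}]},
            {"counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
             "counter": [{"value": 1767, "name": "REDUCE_INPUT_RECORDS"},
                         {"value": 25104, "name": "REDUCE_SHUFFLE_BYTES"}]},
        ],
        "id": "attempt_1326821518301_10_10_r_0_0",
    }
}

counters = flatten_counters(counters_payload)
print(counters["org.apache.hadoop.mapreduce.TaskCounter"]["REDUCE_SHUFFLE_BYTES"])
```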
The job finishes, and the user wishes to get the final job information for this job from the history server.
curl --compressed -X GET "http://host.domain.com:19888/ws/v1/history/mapreduce/jobs/job_1326821518301_10_10"
Output:
{
  "job": {
    "avgReduceTime": 1250784,
    "failedReduceAttempts": 0,
    "state": "SUCCEEDED",
    "successfulReduceAttempts": 1,
    "acls": [
      {"value": "", "name": "mapreduce.job.acl-modify-job"},
      {"value": "", "name": "mapreduce.job.acl-view-job"}
    ],
    "user": "user1",
    "reducesTotal": 1,
    "mapsCompleted": 1,
    "startTime": 1326860720902,
    "id": "job_1326821518301_10_10",
    "avgMapTime": 5059,
    "successfulMapAttempts": 1,
    "name": "Sleep job",
    "avgShuffleTime": 2394,
    "reducesCompleted": 1,
    "diagnostics": "",
    "failedMapAttempts": 0,
    "avgMergeTime": 2552,
    "killedReduceAttempts": 0,
    "mapsTotal": 1,
    "queue": "a1",
    "uberized": false,
    "killedMapAttempts": 0,
    "finishTime": 1326861986164
  }
}
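The startTime and finishTime values are milliseconds since the Unix epoch. A short sketch of converting them to UTC timestamps and a job duration follows; `from_millis` is a hypothetical helper, and integer arithmetic via `timedelta` is used to avoid floating-point rounding.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_millis(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


# Values copied from the history server response above.
job = {"startTime": 1326860720902, "finishTime": 1326861986164}

started = from_millis(job["startTime"])
finished = from_millis(job["finishTime"])
duration = finished - started

print(started.isoformat())
print(duration.total_seconds())
```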
The user can also get the final application information from the ResourceManager.
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"
Output:
{
  "app": {
    "finishedTime": 1326861991282,
    "amContainerLogs": "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
    "trackingUI": "History",
    "state": "FINISHED",
    "user": "user1",
    "id": "application_1326821518301_0010",
    "clusterId": 1326821518301,
    "finalStatus": "SUCCEEDED",
    "amHostHttpAddress": "host.domain.com:8042",
    "progress": 100,
    "name": "Sleep job",
    "startedTime": 1326860715335,
    "elapsedTime": 1275947,
    "diagnostics": "",
    "trackingUrl": "http://host.domain.com:8088/proxy/application_1326821518301_0010/jobhistory/job/job_1326821518301_10_10",
    "queue": "a1"
  }
}
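Rather than re-running the request by hand, a client could poll the ResourceManager until the application reaches a terminal state. The sketch below is one way to structure that; `wait_for_app` is a hypothetical helper that takes the fetch function as a parameter, so it is demonstrated here with canned responses instead of real HTTP calls.

```python
import time

# Application states after which no further change is expected.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def wait_for_app(fetch_app, interval_s=5.0, max_polls=100, sleep=time.sleep):
    """Call fetch_app() until the returned app dict is in a terminal state."""
    for _ in range(max_polls):
        app = fetch_app()
        if app["state"] in TERMINAL_STATES:
            return app
        sleep(interval_s)
    raise TimeoutError("application did not reach a terminal state")


# Canned responses standing in for successive GETs on the application resource.
responses = iter([
    {"state": "RUNNING", "finalStatus": "UNDEFINED", "progress": 82.44703},
    {"state": "FINISHED", "finalStatus": "SUCCEEDED", "progress": 100},
])

final_app = wait_for_app(lambda: next(responses), sleep=lambda seconds: None)
print(final_app["state"], final_app["finalStatus"])
```

Passing the fetch and sleep functions in keeps the loop independent of any particular HTTP client and makes it easy to exercise without a cluster.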