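The Hadoop YARN web service REST APIs are a set of URI resources that give access to cluster, node, application, and application history information. The URI resources are grouped into APIs based on the type of information returned. Some URI resources return collections, while others return singletons.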
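The URIs for the REST-based web services have the following syntax:
http://{http address of service}/ws/{version}/{resourcepath}
The elements in this syntax are as follows:
{http address of service} - The HTTP address of the service to get information about. Currently supported are the ResourceManager, NodeManager, MapReduce application master, and history server.
{version} - The version of the APIs. In this release, the version is v1.
{resourcepath} - A path that defines a singleton resource or a collection of resources.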
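The URI syntax above can be sketched as a small helper. This is only an illustration: the function name `build_ws_uri` and its parameters are hypothetical and not part of Hadoop, and the host and application ID are sample values.

```python
from urllib.parse import quote


def build_ws_uri(service_address: str, resource_path: str, version: str = "v1") -> str:
    """Assemble http://{http address of service}/ws/{version}/{resourcepath}."""
    # Strip stray slashes so the pieces join with exactly one "/" each.
    path = resource_path.strip("/")
    # quote() leaves "/" and "_" untouched and escapes anything unsafe in a path.
    return f"http://{service_address}/ws/{version}/{quote(path)}"


uri = build_ws_uri("rmhost.domain:8088", "cluster/apps/application_1324057493980_0001")
print(uri)
```

The version defaults to v1 because that is the only version this release describes.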
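The next few sections describe the syntax and other details of the HTTP responses of the web service REST APIs.
This release of the web service REST APIs supports responses in JSON and XML formats. JSON is the default. To set the response format, specify the format in the Accept header of the HTTP request.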
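As a rough sketch of selecting the response format, the following builds a GET request with an Accept header using Python's standard `urllib.request`. The helper name `make_get_request` and the `MEDIA_TYPES` table are assumptions made for this example; the request is constructed but not sent.

```python
import urllib.request

# Media types for the two response formats the API supports.
MEDIA_TYPES = {"json": "application/json", "xml": "application/xml"}


def make_get_request(url: str, response_format: str = "json") -> urllib.request.Request:
    """Build (but do not send) a GET request that asks for the given format."""
    return urllib.request.Request(
        url,
        headers={"Accept": MEDIA_TYPES[response_format]},
        method="GET",
    )


request = make_get_request("http://rmhost.domain:8088/ws/v1/cluster/apps", "xml")
print(request.get_header("Accept"))
# Sending it would be: urllib.request.urlopen(request)
```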
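As specified in the HTTP response codes, the response body contains either the data that represents the resource or an error message. On success, the response body is in the selected format, either JSON or XML. On error, the response body is likewise in JSON or XML, depending on the format requested. The Content-Type header of the response contains the format requested. If the application requests an unsupported format, the response status code is 500. Note that the order of the fields within the response body is not specified and might change. Additional fields might also be added to a response body. Therefore, your applications should use parsing routines that can extract data from a response body in any order.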
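One way to follow this advice is to look fields up by key and ignore anything unrecognized. The sketch below assumes a JSON body; the function name `summarize_app` and the field `someFutureField` are invented for illustration and do not exist in the API.

```python
import json


def summarize_app(body: str) -> dict:
    """Pick the fields we care about by name, regardless of order or extras."""
    app = json.loads(body)["app"]
    wanted = ("id", "state", "progress")
    # .get() tolerates a missing field instead of raising KeyError.
    return {key: app.get(key) for key in wanted}


body_a = '{"app": {"id": "application_1324057493980_0001", "state": "ACCEPTED", "progress": 0}}'
# Same data, different field order, plus a hypothetical field added later.
body_b = '{"app": {"progress": 0, "someFutureField": true, "state": "ACCEPTED", "id": "application_1324057493980_0001"}}'

print(summarize_app(body_a) == summarize_app(body_b))
```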
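After calling an HTTP request, an application should check the response status code to verify success or detect an error. If the response status code indicates an error, the response body contains an error message. The first field is the exception type; currently only RemoteException is returned. The following table lists the items within the RemoteException error message: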
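| Item | Data Type | Description |
|---|---|---|
| exception | String | Exception type |
| javaClassName | String | Java class name of the exception |
| message | String | Detailed message of the exception |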
HTTP Request: GET http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_0001
Response Status Line: HTTP/1.1 200 OK
Response Header:
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26)
Response Body:
{
  "app":
  {
    "id":"application_1324057493980_0001",
    "user":"user1",
    "name":"",
    "queue":"default",
    "state":"ACCEPTED",
    "finalStatus":"UNDEFINED",
    "progress":0,
    "trackingUI":"UNASSIGNED",
    "diagnostics":"",
    "clusterId":1324057493980,
    "startedTime":1324057495921,
    "finishedTime":0,
    "elapsedTime":2063,
    "amContainerLogs":"http:\/\/amNM:2\/node\/containerlogs\/container_1324057493980_0001_01_000001",
    "amHostHttpAddress":"amNM:2"
  }
}
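Here we request information about an application that does not exist yet.
HTTP Request: GET http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_9999
Response Status Line: HTTP/1.1 404 Not Found
Response Header:
HTTP/1.1 404 Not Found
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26)
Response Body: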
{
  "RemoteException" : {
    "javaClassName" : "org.apache.hadoop.yarn.webapp.NotFoundException",
    "exception" : "NotFoundException",
    "message" : "java.lang.Exception: app with id: application_1324057493980_9999 not found"
  }
}
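The status check and the RemoteException fields can be combined in client code roughly as follows. This is a sketch under stated assumptions: `RemoteError` and `parse_body` are hypothetical names, the body is assumed to be JSON, and the sample body mirrors the 404 response shown above.

```python
import json


class RemoteError(Exception):
    """Hypothetical client-side exception carrying the RemoteException items."""

    def __init__(self, status: int, fields: dict):
        self.status = status
        self.exception = fields.get("exception")
        self.java_class_name = fields.get("javaClassName")
        self.message = fields.get("message")
        super().__init__(f"HTTP {status}: {self.exception}: {self.message}")


def parse_body(status: int, body: str) -> dict:
    """Return the parsed body on success; raise RemoteError otherwise."""
    payload = json.loads(body)
    if 200 <= status < 300:
        return payload
    raise RemoteError(status, payload.get("RemoteException", {}))


error_body = json.dumps({
    "RemoteException": {
        "javaClassName": "org.apache.hadoop.yarn.webapp.NotFoundException",
        "exception": "NotFoundException",
        "message": "java.lang.Exception: app with id: application_1324057493980_9999 not found",
    }
})

try:
    parse_body(404, error_body)
except RemoteError as err:
    print(err.status, err.exception)
```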
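You can use the web services REST APIs from many tools and languages. This example uses the curl command-line interface to make the REST GET calls.
In this example, a user submits a MapReduce application to the ResourceManager using a command like:
hadoop jar hadoop-mapreduce-test.jar sleep -Dmapred.job.queue.name=a1 -m 1 -r 1 -rt 1200000 -mt 20
The client prints information about the submitted job along with the application ID, similar to: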
12/01/18 04:25:15 INFO mapred.ResourceMgrDelegate: Submitted application application_1326821518301_0010 to ResourceManager at host.domain.com/10.10.10.10:8032
12/01/18 04:25:15 INFO mapreduce.Job: Running job: job_1326821518301_0010
12/01/18 04:25:21 INFO mapred.ClientServiceDelegate: The url to track the job: host.domain.com:8088/proxy/application_1326821518301_0010/
12/01/18 04:25:22 INFO mapreduce.Job: Job job_1326821518301_0010 running in uber mode : false
12/01/18 04:25:22 INFO mapreduce.Job: map 0% reduce 0%
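The user then wants to track the application. The user starts by getting information about the application from the ResourceManager. The --compressed option requests compressed output; curl handles decompression on the client side.
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"
Output: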
{
  "app" : {
    "finishedTime" : 0,
    "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
    "trackingUI" : "ApplicationMaster",
    "state" : "RUNNING",
    "user" : "user1",
    "id" : "application_1326821518301_0010",
    "clusterId" : 1326821518301,
    "finalStatus" : "UNDEFINED",
    "amHostHttpAddress" : "host.domain.com:8042",
    "progress" : 82.44703,
    "name" : "Sleep job",
    "startedTime" : 1326860715335,
    "elapsedTime" : 31814,
    "diagnostics" : "",
    "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326821518301_0010/",
    "queue" : "a1"
  }
}
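A tracking script might reduce this response to a one-line status. The sketch below is illustrative only: `describe_app` is a hypothetical helper, and treating FINISHED, FAILED, and KILLED as the terminal application states is an assumption of this example rather than something stated in this section.

```python
# Assumed terminal values of the application "state" field.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def describe_app(response: dict) -> str:
    """Summarize an application response as a single status line."""
    app = response["app"]
    status = "done" if app["state"] in TERMINAL_STATES else "in progress"
    return f'{app["id"]} is {app["state"]} ({status}, {app["progress"]:.1f}% complete)'


# Trimmed copy of the output above.
sample_app = {
    "app": {
        "id": "application_1326821518301_0010",
        "state": "RUNNING",
        "progress": 82.44703,
    }
}
print(describe_app(sample_app))
```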
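The user then wants more details about the running application and goes directly to the MapReduce application master for this application. The ResourceManager lists the trackingUrl that can be used for this application: http://host.domain.com:8088/proxy/application_1326821518301_0010. This URL can be opened in a web browser or used with the web services REST APIs. The user uses the web services REST APIs to get the list of jobs this MapReduce application master is running:
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs"
Output: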
{
  "jobs" : {
    "job" : [
      {
        "runningReduceAttempts" : 1,
        "reduceProgress" : 72.104515,
        "failedReduceAttempts" : 0,
        "newMapAttempts" : 0,
        "mapsRunning" : 0,
        "state" : "RUNNING",
        "successfulReduceAttempts" : 0,
        "reducesRunning" : 1,
        "acls" : [
          {
            "value" : "",
            "name" : "mapreduce.job.acl-modify-job"
          },
          {
            "value" : "",
            "name" : "mapreduce.job.acl-view-job"
          }
        ],
        "reducesPending" : 0,
        "user" : "user1",
        "reducesTotal" : 1,
        "mapsCompleted" : 1,
        "startTime" : 1326860720902,
        "id" : "job_1326821518301_10_10",
        "successfulMapAttempts" : 1,
        "runningMapAttempts" : 0,
        "newReduceAttempts" : 0,
        "name" : "Sleep job",
        "mapsPending" : 0,
        "elapsedTime" : 64432,
        "reducesCompleted" : 0,
        "mapProgress" : 100,
        "diagnostics" : "",
        "failedMapAttempts" : 0,
        "killedReduceAttempts" : 0,
        "mapsTotal" : 1,
        "uberized" : false,
        "killedMapAttempts" : 0,
        "finishTime" : 0
      }
    ]
  }
}
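To drill down from the job list, a client can derive each job's tasks URL from the trackingUrl and the job id. A minimal sketch, assuming the response has already been parsed into a dict; `job_task_urls` is a hypothetical helper name.

```python
def job_task_urls(tracking_url: str, jobs_response: dict) -> dict:
    """Map each job id in a jobs response to its tasks resource URL."""
    # Drop the trailing "/" that the ResourceManager includes in trackingUrl.
    base = tracking_url.rstrip("/")
    return {
        job["id"]: f"{base}/ws/v1/mapreduce/jobs/{job['id']}/tasks"
        for job in jobs_response["jobs"]["job"]
    }


# Trimmed copy of the output above.
sample_jobs = {"jobs": {"job": [{"id": "job_1326821518301_10_10", "state": "RUNNING"}]}}
task_urls = job_task_urls(
    "http://host.domain.com:8088/proxy/application_1326821518301_0010/", sample_jobs
)
print(task_urls["job_1326821518301_10_10"])
```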
The user then wants the task details for the job listed above, whose job ID is job_1326821518301_10_10.
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks"
Output:
{
  "tasks" : {
    "task" : [
      {
        "progress" : 100,
        "elapsedTime" : 5059,
        "state" : "SUCCEEDED",
        "startTime" : 1326860725014,
        "id" : "task_1326821518301_10_10_m_0",
        "type" : "MAP",
        "successfulAttempt" : "attempt_1326821518301_10_10_m_0_0",
        "finishTime" : 1326860730073
      },
      {
        "progress" : 72.104515,
        "elapsedTime" : 0,
        "state" : "RUNNING",
        "startTime" : 1326860732984,
        "id" : "task_1326821518301_10_10_r_0",
        "type" : "REDUCE",
        "successfulAttempt" : "",
        "finishTime" : 0
      }
    ]
  }
}
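Picking out the tasks that are still running can be done by grouping task ids by state. This is a sketch with a hypothetical helper name, `task_ids_by_state`, applied to a trimmed copy of the output above.

```python
from collections import defaultdict


def task_ids_by_state(tasks_response: dict) -> dict:
    """Group task ids by the value of their "state" field."""
    grouped = defaultdict(list)
    for task in tasks_response["tasks"]["task"]:
        grouped[task["state"]].append(task["id"])
    return dict(grouped)


sample_tasks = {
    "tasks": {
        "task": [
            {"id": "task_1326821518301_10_10_m_0", "type": "MAP", "state": "SUCCEEDED"},
            {"id": "task_1326821518301_10_10_r_0", "type": "REDUCE", "state": "RUNNING"},
        ]
    }
}
by_state = task_ids_by_state(sample_tasks)
print(by_state.get("RUNNING", []))
```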
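The map task has finished, but the reduce task is still running. The user wants the task attempt information for the reduce task task_1326821518301_10_10_r_0. Note that the Accept header is not really required here, since JSON is the default output format:
curl --compressed -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts"
Output: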
{
  "taskAttempts" : {
    "taskAttempt" : [
      {
        "elapsedMergeTime" : 158,
        "shuffleFinishTime" : 1326860735378,
        "assignedContainerId" : "container_1326821518301_0010_01_000003",
        "progress" : 72.104515,
        "elapsedTime" : 0,
        "state" : "RUNNING",
        "elapsedShuffleTime" : 2394,
        "mergeFinishTime" : 1326860735536,
        "rack" : "/10.10.10.0",
        "elapsedReduceTime" : 0,
        "nodeHttpAddress" : "host.domain.com:8042",
        "type" : "REDUCE",
        "startTime" : 1326860732984,
        "id" : "attempt_1326821518301_10_10_r_0_0",
        "finishTime" : 0
      }
    ]
  }
}
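In this sample output the elapsed shuffle and merge times line up with differences between the reported timestamps. The sketch below recomputes them; the relationship is inferred from these sample values only and is not a documented guarantee, and `reduce_phase_durations` is a hypothetical helper.

```python
def reduce_phase_durations(attempt: dict) -> dict:
    """Derive shuffle and merge durations (ms) from a reduce attempt's timestamps."""
    return {
        "shuffle_ms": attempt["shuffleFinishTime"] - attempt["startTime"],
        "merge_ms": attempt["mergeFinishTime"] - attempt["shuffleFinishTime"],
    }


# Timing fields copied from the attempt shown above.
sample_attempt = {
    "startTime": 1326860732984,
    "shuffleFinishTime": 1326860735378,
    "mergeFinishTime": 1326860735536,
    "elapsedShuffleTime": 2394,
    "elapsedMergeTime": 158,
}
durations = reduce_phase_durations(sample_attempt)
print(durations)
```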
The reduce attempt is still running, and the user wants to see the current counter values for that attempt:
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts/attempt_1326821518301_10_10_r_0_0/counters"
Output:
{
  "JobTaskAttemptCounters" : {
    "taskAttemptCounterGroup" : [
      {
        "counterGroupName" : "org.apache.hadoop.mapreduce.FileSystemCounter",
        "counter" : [
          { "value" : 4216, "name" : "FILE_BYTES_READ" },
          { "value" : 77151, "name" : "FILE_BYTES_WRITTEN" },
          { "value" : 0, "name" : "FILE_READ_OPS" },
          { "value" : 0, "name" : "FILE_LARGE_READ_OPS" },
          { "value" : 0, "name" : "FILE_WRITE_OPS" },
          { "value" : 0, "name" : "HDFS_BYTES_READ" },
          { "value" : 0, "name" : "HDFS_BYTES_WRITTEN" },
          { "value" : 0, "name" : "HDFS_READ_OPS" },
          { "value" : 0, "name" : "HDFS_LARGE_READ_OPS" },
          { "value" : 0, "name" : "HDFS_WRITE_OPS" }
        ]
      },
      {
        "counterGroupName" : "org.apache.hadoop.mapreduce.TaskCounter",
        "counter" : [
          { "value" : 0, "name" : "COMBINE_INPUT_RECORDS" },
          { "value" : 0, "name" : "COMBINE_OUTPUT_RECORDS" },
          { "value" : 1767, "name" : "REDUCE_INPUT_GROUPS" },
          { "value" : 25104, "name" : "REDUCE_SHUFFLE_BYTES" },
          { "value" : 1767, "name" : "REDUCE_INPUT_RECORDS" },
          { "value" : 0, "name" : "REDUCE_OUTPUT_RECORDS" },
          { "value" : 0, "name" : "SPILLED_RECORDS" },
          { "value" : 1, "name" : "SHUFFLED_MAPS" },
          { "value" : 0, "name" : "FAILED_SHUFFLE" },
          { "value" : 1, "name" : "MERGED_MAP_OUTPUTS" },
          { "value" : 50, "name" : "GC_TIME_MILLIS" },
          { "value" : 1580, "name" : "CPU_MILLISECONDS" },
          { "value" : 141320192, "name" : "PHYSICAL_MEMORY_BYTES" },
          { "value" : 1118552064, "name" : "VIRTUAL_MEMORY_BYTES" },
          { "value" : 73728000, "name" : "COMMITTED_HEAP_BYTES" }
        ]
      },
      {
        "counterGroupName" : "Shuffle Errors",
        "counter" : [
          { "value" : 0, "name" : "BAD_ID" },
          { "value" : 0, "name" : "CONNECTION" },
          { "value" : 0, "name" : "IO_ERROR" },
          { "value" : 0, "name" : "WRONG_LENGTH" },
          { "value" : 0, "name" : "WRONG_MAP" },
          { "value" : 0, "name" : "WRONG_REDUCE" }
        ]
      },
      {
        "counterGroupName" : "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
        "counter" : [
          { "value" : 0, "name" : "BYTES_WRITTEN" }
        ]
      }
    ],
    "id" : "attempt_1326821518301_10_10_r_0_0"
  }
}
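The nested counter groups are easier to query once flattened into a single lookup table. A minimal sketch, assuming the response is already parsed into a dict; `flatten_counters` is a hypothetical helper, and the sample is a trimmed copy of the output above.

```python
def flatten_counters(response: dict) -> dict:
    """Return {(group name, counter name): value} for a task attempt counters response."""
    groups = response["JobTaskAttemptCounters"]["taskAttemptCounterGroup"]
    return {
        (group["counterGroupName"], counter["name"]): counter["value"]
        for group in groups
        for counter in group["counter"]
    }


sample_counters = {
    "JobTaskAttemptCounters": {
        "taskAttemptCounterGroup": [
            {
                "counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
                "counter": [
                    {"value": 1767, "name": "REDUCE_INPUT_RECORDS"},
                    {"value": 25104, "name": "REDUCE_SHUFFLE_BYTES"},
                ],
            },
            {
                "counterGroupName": "Shuffle Errors",
                "counter": [{"value": 0, "name": "IO_ERROR"}],
            },
        ],
        "id": "attempt_1326821518301_10_10_r_0_0",
    }
}
flat = flatten_counters(sample_counters)
print(flat[("org.apache.hadoop.mapreduce.TaskCounter", "REDUCE_SHUFFLE_BYTES")])
```

Keying on the (group, name) pair avoids collisions if two groups ever define a counter with the same name.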
The job finishes, and the user wants the final job information from the history server for this job.
curl --compressed -X GET "http://host.domain.com:19888/ws/v1/history/mapreduce/jobs/job_1326821518301_10_10"
Output:
{
  "job" : {
    "avgReduceTime" : 1250784,
    "failedReduceAttempts" : 0,
    "state" : "SUCCEEDED",
    "successfulReduceAttempts" : 1,
    "acls" : [
      {
        "value" : "",
        "name" : "mapreduce.job.acl-modify-job"
      },
      {
        "value" : "",
        "name" : "mapreduce.job.acl-view-job"
      }
    ],
    "user" : "user1",
    "reducesTotal" : 1,
    "mapsCompleted" : 1,
    "startTime" : 1326860720902,
    "id" : "job_1326821518301_10_10",
    "avgMapTime" : 5059,
    "successfulMapAttempts" : 1,
    "name" : "Sleep job",
    "avgShuffleTime" : 2394,
    "reducesCompleted" : 1,
    "diagnostics" : "",
    "failedMapAttempts" : 0,
    "avgMergeTime" : 2552,
    "killedReduceAttempts" : 0,
    "mapsTotal" : 1,
    "queue" : "a1",
    "uberized" : false,
    "killedMapAttempts" : 0,
    "finishTime" : 1326861986164
  }
}
The user also gets the final application information from the ResourceManager.
curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"
Output:
{
  "app" : {
    "finishedTime" : 1326861991282,
    "amContainerLogs" : "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
    "trackingUI" : "History",
    "state" : "FINISHED",
    "user" : "user1",
    "id" : "application_1326821518301_0010",
    "clusterId" : 1326821518301,
    "finalStatus" : "SUCCEEDED",
    "amHostHttpAddress" : "host.domain.com:8042",
    "progress" : 100,
    "name" : "Sleep job",
    "startedTime" : 1326860715335,
    "elapsedTime" : 1275947,
    "diagnostics" : "",
    "trackingUrl" : "http://host.domain.com:8088/proxy/application_1326821518301_0010/jobhistory/job/job_1326821518301_10_10",
    "queue" : "a1"
  }
}
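The time fields in these responses are milliseconds since the Unix epoch. As a sketch, the helper below (a hypothetical name, `ms_to_utc`) converts them to readable UTC timestamps and checks that, for this sample, elapsedTime equals finishedTime minus startedTime.

```python
from datetime import datetime, timedelta, timezone


def ms_to_utc(epoch_ms: int) -> datetime:
    """Convert epoch milliseconds to an aware UTC datetime without float rounding."""
    seconds, millis = divmod(epoch_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


# Time fields copied from the final application output above.
final_app = {
    "startedTime": 1326860715335,
    "finishedTime": 1326861991282,
    "elapsedTime": 1275947,
}
started = ms_to_utc(final_app["startedTime"])
elapsed_ms = final_app["finishedTime"] - final_app["startedTime"]
print(started.isoformat(), elapsed_ms)
```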