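Overview

The Hadoop YARN web service REST APIs are a set of URI resources that give access to the cluster, nodes, applications, and application historical information. The URI resources are grouped into APIs based on the type of information returned. Some URI resources return collections while others return singletons.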

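URIs

The URIs for the REST-based web services have the following syntax:

  http://{http address of service}/ws/{version}/{resourcepath}

The elements in this syntax are as follows:

  {http address of service} - The http address of the service to get information about.
                              Currently supported are the ResourceManager, NodeManager,
                              MapReduce application master, and history server.
  {version} - The version of the APIs. In this release, the version is v1.
  {resourcepath} - A path that defines a singleton resource or a collection of resources.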
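
The URI syntax above can be assembled mechanically. The following is a minimal sketch, not part of the Hadoop distribution: the helper name `build_ws_uri` is hypothetical, and the host and resource path are taken from the examples later in this document.

```python
def build_ws_uri(http_address: str, resource_path: str, version: str = "v1") -> str:
    """Assemble http://{http address of service}/ws/{version}/{resourcepath}.

    build_ws_uri is a hypothetical helper used only for illustration.
    """
    # Strip stray slashes so the pieces are joined by exactly one "/" each.
    return f"http://{http_address.strip('/')}/ws/{version}/{resource_path.strip('/')}"


# Host and application id are the sample values used in this document.
print(build_ws_uri("rmhost.domain:8088", "cluster/apps/application_1324057493980_0001"))
```

Defaulting `version` to `v1` mirrors the statement that v1 is the version in this release; a caller targeting another version would pass it explicitly.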

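HTTP Requests

To invoke a REST API, your application calls an HTTP operation on the URI associated with a resource.

Summary of HTTP operations

Currently only GET is supported. It retrieves information about the specified resource.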

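Security

The security for the web service REST APIs is the same as for the web UI. If your cluster administrators have filters enabled, you must authenticate via the mechanism they specified.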

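Headers Supported

  • Accept
  • Accept-Encoding

Currently the only fields used in the header are Accept and Accept-Encoding. Accept currently supports XML and JSON as the response type you accept. Accept-Encoding currently supports only the gzip format: if gzip is specified, gzip-compressed output is returned; otherwise the output is uncompressed. All other header fields are ignored.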
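
As an illustration of the two supported headers, the sketch below prepares a GET request with Python's standard `urllib.request` module without sending it. The function name `make_get_request` is hypothetical, and the URL is the sample ResourceManager address used elsewhere in this document.

```python
import urllib.request


def make_get_request(url: str, fmt: str = "json", want_gzip: bool = True) -> urllib.request.Request:
    """Prepare (but do not send) a GET request carrying the supported headers.

    make_get_request is a hypothetical helper used only for illustration.
    """
    if fmt not in ("json", "xml"):
        raise ValueError("Accept supports only JSON and XML")
    headers = {"Accept": f"application/{fmt}"}
    if want_gzip:
        # gzip is the only encoding the web services honor.
        headers["Accept-Encoding"] = "gzip"
    return urllib.request.Request(url, headers=headers, method="GET")


req = make_get_request("http://rmhost.domain:8088/ws/v1/cluster/apps")
print(req.get_method(), req.full_url)
print(req.header_items())
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; that step is left out here so the sketch runs without a cluster.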

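HTTP Responses

The next few sections describe some of the syntax and other details of the HTTP responses of the web service REST APIs.

Compression

This release supports gzip compression if you specify gzip in the Accept-Encoding header of the HTTP request (Accept-Encoding: gzip).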
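
A client that does not decompress automatically has to undo the gzip encoding itself. The sketch below shows one way to do that with Python's standard `gzip` module. It assumes the server marks compressed bodies with a `Content-Encoding: gzip` response header (standard HTTP behavior, not stated in this document) and that the body text is UTF-8; `decode_body` is a hypothetical helper name.

```python
import gzip
from typing import Optional


def decode_body(raw: bytes, content_encoding: Optional[str]) -> str:
    """Return the response body as text, decompressing it when it is gzip-encoded.

    Assumes UTF-8 text and a Content-Encoding response header; both are
    assumptions made for this sketch.
    """
    if content_encoding and content_encoding.strip().lower() == "gzip":
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")


# Simulate a compressed and an uncompressed response body.
compressed = gzip.compress(b'{"app": {"state": "RUNNING"}}')
print(decode_body(compressed, "gzip"))
print(decode_body(b'{"app": {"state": "RUNNING"}}', None))
```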

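Response Formats

This release of the web service REST APIs supports responses in JSON and XML formats. JSON is the default. To set the response format, specify the format in the Accept header of the HTTP request.

As specified in HTTP Response Codes, the response body can contain the data that represents the resource or an error message. In the case of success, the response body is in the selected format, either JSON or XML. In the case of error, the response body is in either JSON or XML based on the format requested. The Content-Type header of the response contains the format requested. If the application requests an unsupported format, the response status code is 500. Note that the order of the fields within the response body is not specified and might change. Also, additional fields might be added to a response body. Therefore, your applications should use parsing routines that can extract data from a response body in any order.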
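
The advice about field order can be made concrete with a small sketch. It reads fields by name from a parsed JSON body, so reordering or extra fields do not matter. The helper `read_app_fields`, the tuple `WANTED`, and the field `someNewField` are hypothetical and exist only for this illustration.

```python
import json

# Fields this hypothetical client cares about.
WANTED = ("id", "state", "progress")


def read_app_fields(body: str) -> dict:
    """Extract fields by name, ignoring their order and any unknown extras."""
    app = json.loads(body)["app"]
    return {name: app.get(name) for name in WANTED}


# Two bodies with the same data: different field order, one with an extra field.
first = '{"app": {"id": "application_1324057493980_0001", "state": "ACCEPTED", "progress": 0}}'
second = ('{"app": {"progress": 0, "someNewField": true, "state": "ACCEPTED", '
          '"id": "application_1324057493980_0001"}}')

print(read_app_fields(first) == read_app_fields(second))
```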

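Response Errors

After calling an HTTP request, an application should check the response status code to verify success or detect an error. If the response status code indicates an error, the response body contains an error message. The first field is the exception type; currently only RemoteException is returned. The following table lists the items within the RemoteException error message:

Item            Data Type   Description
exception       String      Exception type
javaClassName   String      Java class name of exception
message         String      Detailed message of exception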
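
A client might pull the three RemoteException items out of a JSON error body as sketched below. The function name `parse_remote_exception` is hypothetical; the sample body follows the 404 example shown later in this document.

```python
import json
from typing import Optional


def parse_remote_exception(body: str) -> Optional[dict]:
    """Return the RemoteException items, or None if the body carries no such error."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None  # not JSON, for example an XML error body
    if not isinstance(payload, dict):
        return None
    error = payload.get("RemoteException")
    if not isinstance(error, dict):
        return None
    # Pick the documented items by name; missing ones come back as None.
    return {key: error.get(key) for key in ("exception", "javaClassName", "message")}


sample = """{"RemoteException": {
  "javaClassName": "org.apache.hadoop.yarn.webapp.NotFoundException",
  "exception": "NotFoundException",
  "message": "java.lang.Exception: app with id: application_1324057493980_9999 not found"}}"""

error = parse_remote_exception(sample)
print(error["exception"], "-", error["message"])
```

The status code should still be checked first; this helper only interprets a body that is already known to be an error response.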

Response Examples

JSON response with single resource

HTTP Request: GET http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_0001

Response Status Line: HTTP/1.1 200 OK

Response Header:

  HTTP/1.1 200 OK
  Content-Type: application/json
  Transfer-Encoding: chunked
  Server: Jetty(6.1.26)

Response Body:

{
  "app":
  {
    "id": "application_1324057493980_0001",
    "user": "user1",
    "name": "",
    "queue": "default",
    "state": "ACCEPTED",
    "finalStatus": "UNDEFINED",
    "progress": 0,
    "trackingUI": "UNASSIGNED",
    "diagnostics": "",
    "clusterId": 1324057493980,
    "startedTime": 1324057495921,
    "finishedTime": 0,
    "elapsedTime": 2063,
    "amContainerLogs": "http:\/\/amNM:2\/node\/containerlogs\/container_1324057493980_0001_01_000001",
    "amHostHttpAddress": "amNM:2"
  }
}

JSON response with error response

Here we request information about an application that doesn't exist yet.

HTTP Request: GET http://rmhost.domain:8088/ws/v1/cluster/apps/application_1324057493980_9999

Response Status Line: HTTP/1.1 404 Not Found

Response Header:

  HTTP/1.1 404 Not Found
  Content-Type: application/json
  Transfer-Encoding: chunked
  Server: Jetty(6.1.26)

Response Body:

{
   "RemoteException": {
      "javaClassName": "org.apache.hadoop.yarn.webapp.NotFoundException",
      "exception": "NotFoundException",
      "message": "java.lang.Exception: app with id: application_1324057493980_9999 not found"
   }
}

Sample Usage

You can use any number of ways or languages to call the web services REST APIs. This example uses the curl command line interface to do the REST GET calls.

In this example, a user submits a MapReduce application to the ResourceManager using a command like:

  hadoop jar hadoop-mapreduce-test.jar sleep -Dmapred.job.queue.name=a1 -m 1 -r 1 -rt 1200000 -mt 20

The client prints information about the submitted job along with the application id, similar to:

12/01/18 04:25:15 INFO mapred.ResourceMgrDelegate: Submitted application application_1326821518301_0010 to ResourceManager at host.domain.com/10.10.10.10:8032
12/01/18 04:25:15 INFO mapreduce.Job: Running job: job_1326821518301_0010
12/01/18 04:25:21 INFO mapred.ClientServiceDelegate: The url to track the job: host.domain.com:8088/proxy/application_1326821518301_0010/
12/01/18 04:25:22 INFO mapreduce.Job: Job job_1326821518301_0010 running in uber mode : false
12/01/18 04:25:22 INFO mapreduce.Job:  map 0% reduce 0%

The user then wishes to track the application. The user starts by getting the information about the application from the ResourceManager. The --compressed option requests compressed output; curl handles uncompressing on the client side.

  curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"

Output:

{
   "app": {
      "finishedTime": 0,
      "amContainerLogs": "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
      "trackingUI": "ApplicationMaster",
      "state": "RUNNING",
      "user": "user1",
      "id": "application_1326821518301_0010",
      "clusterId": 1326821518301,
      "finalStatus": "UNDEFINED",
      "amHostHttpAddress": "host.domain.com:8042",
      "progress": 82.44703,
      "name": "Sleep job",
      "startedTime": 1326860715335,
      "elapsedTime": 31814,
      "diagnostics": "",
      "trackingUrl": "http://host.domain.com:8088/proxy/application_1326821518301_0010/",
      "queue": "a1"
   }
}

The user then wishes to get more details about the running application and goes directly to the MapReduce application master for this application. The ResourceManager lists the trackingUrl that can be used for this application: http://host.domain.com:8088/proxy/application_1326821518301_0010. This URL can be opened in a web browser or used with the web service REST APIs. The user uses the web services REST APIs to get the list of jobs this MapReduce application master is running:

  curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs"

Output:

{
   "jobs": {
      "job": [
         {
            "runningReduceAttempts": 1,
            "reduceProgress": 72.104515,
            "failedReduceAttempts": 0,
            "newMapAttempts": 0,
            "mapsRunning": 0,
            "state": "RUNNING",
            "successfulReduceAttempts": 0,
            "reducesRunning": 1,
            "acls": [
               {
                  "value": "",
                  "name": "mapreduce.job.acl-modify-job"
               },
               {
                  "value": "",
                  "name": "mapreduce.job.acl-view-job"
               }
            ],
            "reducesPending": 0,
            "user": "user1",
            "reducesTotal": 1,
            "mapsCompleted": 1,
            "startTime": 1326860720902,
            "id": "job_1326821518301_10_10",
            "successfulMapAttempts": 1,
            "runningMapAttempts": 0,
            "newReduceAttempts": 0,
            "name": "Sleep job",
            "mapsPending": 0,
            "elapsedTime": 64432,
            "reducesCompleted": 0,
            "mapProgress": 100,
            "diagnostics": "",
            "failedMapAttempts": 0,
            "killedReduceAttempts": 0,
            "mapsTotal": 1,
            "uberized": false,
            "killedMapAttempts": 0,
            "finishTime": 0
         }
      ]
   }
}
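
A script consuming the jobs list would typically reduce it to a few fields per job. The following sketch does that for a trimmed copy of the output above; `summarize_jobs` is a hypothetical helper, and treating a missing or null "jobs" element as an empty list is an assumption of this sketch, not documented behavior.

```python
import json


def summarize_jobs(body: str) -> list:
    """Return (id, state, mapProgress, reduceProgress) for each job in the response."""
    jobs = json.loads(body).get("jobs") or {}  # assumption: tolerate a missing/null element
    return [
        (job["id"], job["state"], job["mapProgress"], job["reduceProgress"])
        for job in jobs.get("job", [])
    ]


# Trimmed copy of the sample output; only the fields used here are kept.
sample = """{"jobs": {"job": [{
  "id": "job_1326821518301_10_10", "state": "RUNNING",
  "mapProgress": 100, "reduceProgress": 72.104515}]}}"""

for job_id, state, map_pct, reduce_pct in summarize_jobs(sample):
    print(f"{job_id}: {state}, map {map_pct}%, reduce {reduce_pct}%")
```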

The user then wishes to get the task details about the job with job id job_1326821518301_10_10 that was listed above.

  curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks"

Output:

{
   "tasks": {
      "task": [
         {
            "progress": 100,
            "elapsedTime": 5059,
            "state": "SUCCEEDED",
            "startTime": 1326860725014,
            "id": "task_1326821518301_10_10_m_0",
            "type": "MAP",
            "successfulAttempt": "attempt_1326821518301_10_10_m_0_0",
            "finishTime": 1326860730073
         },
         {
            "progress": 72.104515,
            "elapsedTime": 0,
            "state": "RUNNING",
            "startTime": 1326860732984,
            "id": "task_1326821518301_10_10_r_0",
            "type": "REDUCE",
            "successfulAttempt": "",
            "finishTime": 0
         }
      ]
   }
}
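
To decide which task to inspect next, a client can filter the task list by state. This is a minimal sketch over a trimmed copy of the output above; `tasks_in_state` is a hypothetical helper name.

```python
import json


def tasks_in_state(body: str, state: str) -> list:
    """Return (id, type) pairs for the tasks whose state matches."""
    tasks = json.loads(body)["tasks"]["task"]
    return [(task["id"], task["type"]) for task in tasks if task["state"] == state]


# Trimmed copy of the sample output; only the fields used here are kept.
sample = """{"tasks": {"task": [
  {"id": "task_1326821518301_10_10_m_0", "type": "MAP", "state": "SUCCEEDED"},
  {"id": "task_1326821518301_10_10_r_0", "type": "REDUCE", "state": "RUNNING"}]}}"""

print(tasks_in_state(sample, "RUNNING"))
```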

The map task has finished but the reduce task is still running. The user wishes to get the task attempt information for the reduce task task_1326821518301_10_10_r_0. Note that the Accept header isn't really required here since JSON is the default output format:

  curl --compressed -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts"

Output:

{
   "taskAttempts": {
      "taskAttempt": [
         {
            "elapsedMergeTime": 158,
            "shuffleFinishTime": 1326860735378,
            "assignedContainerId": "container_1326821518301_0010_01_000003",
            "progress": 72.104515,
            "elapsedTime": 0,
            "state": "RUNNING",
            "elapsedShuffleTime": 2394,
            "mergeFinishTime": 1326860735536,
            "rack": "/10.10.10.0",
            "elapsedReduceTime": 0,
            "nodeHttpAddress": "host.domain.com:8042",
            "type": "REDUCE",
            "startTime": 1326860732984,
            "id": "attempt_1326821518301_10_10_r_0_0",
            "finishTime": 0
         }
      ]
   }
}

The reduce attempt is still running and the user wishes to see the current counter values for that attempt:

  curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/proxy/application_1326821518301_0010/ws/v1/mapreduce/jobs/job_1326821518301_10_10/tasks/task_1326821518301_10_10_r_0/attempts/attempt_1326821518301_10_10_r_0_0/counters"

Output:

{
   "JobTaskAttemptCounters": {
      "taskAttemptCounterGroup": [
         {
            "counterGroupName": "org.apache.hadoop.mapreduce.FileSystemCounter",
            "counter": [
               {
                  "value": 4216,
                  "name": "FILE_BYTES_READ"
               },
               {
                  "value": 77151,
                  "name": "FILE_BYTES_WRITTEN"
               },
               {
                  "value": 0,
                  "name": "FILE_READ_OPS"
               },
               {
                  "value": 0,
                  "name": "FILE_LARGE_READ_OPS"
               },
               {
                  "value": 0,
                  "name": "FILE_WRITE_OPS"
               },
               {
                  "value": 0,
                  "name": "HDFS_BYTES_READ"
               },
               {
                  "value": 0,
                  "name": "HDFS_BYTES_WRITTEN"
               },
               {
                  "value": 0,
                  "name": "HDFS_READ_OPS"
               },
               {
                  "value": 0,
                  "name": "HDFS_LARGE_READ_OPS"
               },
               {
                  "value": 0,
                  "name": "HDFS_WRITE_OPS"
               }
            ]
         },
         {
            "counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
            "counter": [
               {
                  "value": 0,
                  "name": "COMBINE_INPUT_RECORDS"
               },
               {
                  "value": 0,
                  "name": "COMBINE_OUTPUT_RECORDS"
               },
               {
                  "value": 1767,
                  "name": "REDUCE_INPUT_GROUPS"
               },
               {
                  "value": 25104,
                  "name": "REDUCE_SHUFFLE_BYTES"
               },
               {
                  "value": 1767,
                  "name": "REDUCE_INPUT_RECORDS"
               },
               {
                  "value": 0,
                  "name": "REDUCE_OUTPUT_RECORDS"
               },
               {
                  "value": 0,
                  "name": "SPILLED_RECORDS"
               },
               {
                  "value": 1,
                  "name": "SHUFFLED_MAPS"
               },
               {
                  "value": 0,
                  "name": "FAILED_SHUFFLE"
               },
               {
                  "value": 1,
                  "name": "MERGED_MAP_OUTPUTS"
               },
               {
                  "value": 50,
                  "name": "GC_TIME_MILLIS"
               },
               {
                  "value": 1580,
                  "name": "CPU_MILLISECONDS"
               },
               {
                  "value": 141320192,
                  "name": "PHYSICAL_MEMORY_BYTES"
               },
               {
                  "value": 1118552064,
                  "name": "VIRTUAL_MEMORY_BYTES"
               },
               {
                  "value": 73728000,
                  "name": "COMMITTED_HEAP_BYTES"
               }
            ]
         },
         {
            "counterGroupName": "Shuffle Errors",
            "counter": [
               {
                  "value": 0,
                  "name": "BAD_ID"
               },
               {
                  "value": 0,
                  "name": "CONNECTION"
               },
               {
                  "value": 0,
                  "name": "IO_ERROR"
               },
               {
                  "value": 0,
                  "name": "WRONG_LENGTH"
               },
               {
                  "value": 0,
                  "name": "WRONG_MAP"
               },
               {
                  "value": 0,
                  "name": "WRONG_REDUCE"
               }
            ]
         },
         {
            "counterGroupName": "org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter",
            "counter": [
               {
                  "value": 0,
                  "name": "BYTES_WRITTEN"
               }
            ]
         }
      ],
      "id": "attempt_1326821518301_10_10_r_0_0"
   }
}
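
The nested counter groups are easier to query once flattened into a lookup table. The sketch below keys each value by (group name, counter name); `flatten_counters` is a hypothetical helper, and the sample is a trimmed copy of the output above.

```python
import json


def flatten_counters(body: str) -> dict:
    """Map (counterGroupName, counter name) to the counter value."""
    root = json.loads(body)["JobTaskAttemptCounters"]
    flat = {}
    for group in root.get("taskAttemptCounterGroup", []):
        for counter in group.get("counter", []):
            flat[(group["counterGroupName"], counter["name"])] = counter["value"]
    return flat


# Trimmed copy of the sample output with two groups and three counters.
sample = """{"JobTaskAttemptCounters": {
  "taskAttemptCounterGroup": [
    {"counterGroupName": "org.apache.hadoop.mapreduce.FileSystemCounter",
     "counter": [{"value": 4216, "name": "FILE_BYTES_READ"},
                 {"value": 77151, "name": "FILE_BYTES_WRITTEN"}]},
    {"counterGroupName": "org.apache.hadoop.mapreduce.TaskCounter",
     "counter": [{"value": 25104, "name": "REDUCE_SHUFFLE_BYTES"}]}],
  "id": "attempt_1326821518301_10_10_r_0_0"}}"""

counters = flatten_counters(sample)
print(counters[("org.apache.hadoop.mapreduce.TaskCounter", "REDUCE_SHUFFLE_BYTES")])
```

Keying on the group as well as the name avoids collisions if two groups ever use the same counter name.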

The job finishes and the user wishes to get the final job information from the history server for this job.

  curl --compressed -X GET "http://host.domain.com:19888/ws/v1/history/mapreduce/jobs/job_1326821518301_10_10"

Output:

{
   "job": {
      "avgReduceTime": 1250784,
      "failedReduceAttempts": 0,
      "state": "SUCCEEDED",
      "successfulReduceAttempts": 1,
      "acls": [
         {
            "value": "",
            "name": "mapreduce.job.acl-modify-job"
         },
         {
            "value": "",
            "name": "mapreduce.job.acl-view-job"
         }
      ],
      "user": "user1",
      "reducesTotal": 1,
      "mapsCompleted": 1,
      "startTime": 1326860720902,
      "id": "job_1326821518301_10_10",
      "avgMapTime": 5059,
      "successfulMapAttempts": 1,
      "name": "Sleep job",
      "avgShuffleTime": 2394,
      "reducesCompleted": 1,
      "diagnostics": "",
      "failedMapAttempts": 0,
      "avgMergeTime": 2552,
      "killedReduceAttempts": 0,
      "mapsTotal": 1,
      "queue": "a1",
      "uberized": false,
      "killedMapAttempts": 0,
      "finishTime": 1326861986164
   }
}

The user also gets the final application information from the ResourceManager.

  curl --compressed -H "Accept: application/json" -X GET "http://host.domain.com:8088/ws/v1/cluster/apps/application_1326821518301_0010"

Output:

{
   "app": {
      "finishedTime": 1326861991282,
      "amContainerLogs": "http://host.domain.com:8042/node/containerlogs/container_1326821518301_0010_01_000001",
      "trackingUI": "History",
      "state": "FINISHED",
      "user": "user1",
      "id": "application_1326821518301_0010",
      "clusterId": 1326821518301,
      "finalStatus": "SUCCEEDED",
      "amHostHttpAddress": "host.domain.com:8042",
      "progress": 100,
      "name": "Sleep job",
      "startedTime": 1326860715335,
      "elapsedTime": 1275947,
      "diagnostics": "",
      "trackingUrl": "http://host.domain.com:8088/proxy/application_1326821518301_0010/jobhistory/job/job_1326821518301_10_10",
      "queue": "a1"
   }
}
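
The time fields in these responses can be turned into readable values. The sketch below assumes that startedTime and finishedTime are milliseconds since the Unix epoch and that elapsedTime is a duration in milliseconds; the document does not state this, so treat it as an assumption. `to_utc` is a hypothetical helper, and the numbers are copied from the final application output above.

```python
from datetime import datetime, timedelta, timezone


def to_utc(millis: int) -> datetime:
    """Convert epoch milliseconds (assumed unit) to a timezone-aware UTC datetime."""
    # Split seconds and milliseconds to avoid floating-point rounding.
    seconds, remainder = divmod(millis, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=remainder)


# Values copied from the sample "app" output.
app = {"startedTime": 1326860715335, "finishedTime": 1326861991282, "elapsedTime": 1275947}

print("started: ", to_utc(app["startedTime"]).isoformat())
print("finished:", to_utc(app["finishedTime"]).isoformat())
# In this sample the elapsed time matches the difference of the two timestamps.
print("elapsed matches:", app["finishedTime"] - app["startedTime"] == app["elapsedTime"])
```

For an application that is still running, finishedTime is 0 in the earlier output, so a client would need to skip the conversion in that case.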