Reproducing the error:

hive> select count(*) from student;

The error output is as follows:

2020-06-03 22:00:36,787 ERROR [57ee4918-ac03-4f15-82c0-0cfd7cbcda73 main] exec.Task: Failed to execute tez graph.
java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.getSession(TezSessionState.java:711)
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:646)
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeIfNotDefault(TezSessionPoolManager.java:353)
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:467)
	at org.apache.hadoop.hive.ql.exec.tez.WorkloadManagerFederation.getUnmanagedSession(WorkloadManagerFederation.java:66)
	at org.apache.hadoop.hive.ql.exec.tez.WorkloadManagerFederation.getSession(WorkloadManagerFederation.java:38)
	at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:180)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2479)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2150)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1826)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1561)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
2020-06-03 22:00:36,804 INFO  [57ee4918-ac03-4f15-82c0-0cfd7cbcda73 main] reexec.ReOptimizePlugin: ReOptimization: retryPossible: false
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
2020-06-03 22:00:36,804 ERROR [57ee4918-ac03-4f15-82c0-0cfd7cbcda73 main] ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
2020-06-03 22:00:36,804 INFO  [57ee4918-ac03-4f15-82c0-0cfd7cbcda73 main] ql.Driver: Completed executing command(queryId=appleyuchi_20200603220035_d33c8bdf-800c-4ddb-9306-5e8e6e939b38); Time taken: 0.119 seconds
2020-06-03 22:00:36,804 INFO  [57ee4918-ac03-4f15-82c0-0cfd7cbcda73 main] lockmgr.DbTxnManager: Stopped heartbeat for query: appleyuchi_20200603220035_d33c8bdf-800c-4ddb-9306-5e8e6e939b38
2020-06-03 22:00:36,893 INFO  [57ee4918-ac03-4f15-82c0-0cfd7cbcda73 main] exec.ListSinkOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_LIST_SINK_12:0, 
2020-06-03 22:00:36,997 INFO  [57ee4918-ac03-4f15-82c0-0cfd7cbcda73 main] conf.HiveConf: Using the default value passed in for log id: 57ee4918-ac03-4f15-82c0-0cfd7cbcda73
2020-06-03 22:00:36,997 INFO  [57ee4918-ac03-4f15-82c0-0cfd7cbcda73 main] session.SessionState: Resetting thread name to  main

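Troubleshooting:

Troubleshooting steps
Open the YARN web UI at http://desktop:8088/cluster and find the failed application:
ID: application_1591188209071_0001

Click Logs.

Click "here" (note: the page contains two "here" links; use the later one, not the one at the very top).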
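The same logs can usually be pulled from the command line instead of the web UI. The following is a minimal sketch, assuming the `yarn` executable is on PATH and that log aggregation is enabled on the cluster; the helper names `build_yarn_logs_command` and `fetch_application_logs` are hypothetical and only wrap the standard `yarn logs -applicationId` command.

```python
import re
import subprocess

# Shape of a YARN application ID: application_<cluster timestamp>_<sequence number>.
APP_ID_PATTERN = re.compile(r"^application_\d+_\d+$")


def build_yarn_logs_command(app_id):
    """Return the argument list for `yarn logs -applicationId <app_id>`."""
    if not APP_ID_PATTERN.match(app_id):
        raise ValueError(f"not a YARN application ID: {app_id!r}")
    return ["yarn", "logs", "-applicationId", app_id]


def fetch_application_logs(app_id):
    """Run the yarn CLI and return the application's container logs as text.

    Assumption: `yarn` is on PATH and log aggregation is enabled.
    """
    result = subprocess.run(
        build_yarn_logs_command(app_id),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return result.stdout
```

Calling `fetch_application_logs("application_1591188209071_0001")` on a cluster node would return the text that the "here" link displays, which can then be searched for exceptions.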

After following these steps, the log shows this error:

java.lang.RuntimeException: Failed to connect to timeline server. Connection retries limit exceeded. The posted timeline event may be missing
	at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:357)
	at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter.handle(TimelineConnector.java:404)
	at com.sun.jersey.api.client.Client.handle(Client.java:652)
	at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
	at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
	at com.sun.jersey.api.client.WebResource$Builder.put(WebResource.java:539)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPostingObject(TimelineWriter.java:166)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:115)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:112)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPosting(TimelineWriter.java:112)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putDomain(TimelineWriter.java:98)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putDomain(TimelineClientImpl.java:183)
	at org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.createTimelineDomain(ATSHistoryACLPolicyManager.java:127)
	at org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.createSessionDomain(ATSHistoryACLPolicyManager.java:164)
	at org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.setupSessionACLs(ATSHistoryACLPolicyManager.java:222)
	at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.createSessionDomain(ATSHistoryLoggingService.java:426)
	at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.serviceStart(ATSHistoryLoggingService.java:164)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
	at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
	at org.apache.tez.dag.history.HistoryEventHandler.serviceStart(HistoryEventHandler.java:110)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
	at org.apache.tez.dag.app.DAGAppMaster$ServiceWithDependency.start(DAGAppMaster.java:1865)
	at org.apache.tez.dag.app.DAGAppMaster$ServiceThread.run(DAGAppMaster.java:1886)
2020-06-03 22:01:03,700 [WARN] [ServiceThread:org.apache.tez.dag.history.HistoryEventHandler] |ats.ATSHistoryLoggingService|: Could not setup history acls, disabling history logging.
org.apache.tez.common.security.HistoryACLPolicyException: Fail to create ACL-related domain in Timeline
	at org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.createTimelineDomain(ATSHistoryACLPolicyManager.java:131)
	at org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.createSessionDomain(ATSHistoryACLPolicyManager.java:164)
	at org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.setupSessionACLs(ATSHistoryACLPolicyManager.java:222)
	at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.createSessionDomain(ATSHistoryLoggingService.java:426)
	at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.serviceStart(ATSHistoryLoggingService.java:164)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
	at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
	at org.apache.tez.dag.history.HistoryEventHandler.serviceStart(HistoryEventHandler.java:110)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
	at org.apache.tez.dag.app.DAGAppMaster$ServiceWithDependency.start(DAGAppMaster.java:1865)
	at org.apache.tez.dag.app.DAGAppMaster$ServiceThread.run(DAGAppMaster.java:1886)
Caused by: java.lang.RuntimeException: Failed to connect to timeline server. Connection retries limit exceeded. The posted timeline event may be missing
	at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:357)
	at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter.handle(TimelineConnector.java:404)
	at com.sun.jersey.api.client.Client.handle(Client.java:652)
	at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
	at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
	at com.sun.jersey.api.client.WebResource$Builder.put(WebResource.java:539)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPostingObject(TimelineWriter.java:166)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:115)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:112)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPosting(TimelineWriter.java:112)
	at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putDomain(TimelineWriter.java:98)
	at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putDomain(TimelineClientImpl.java:183)
	at org.apache.tez.dag.history.ats.acls.ATSHistoryACLPolicyManager.createTimelineDomain(ATSHistoryACLPolicyManager.java:127)
	... 10 more
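When logs are this long, it can help to scan them for a few known messages rather than read every frame. Below is a small sketch of that idea; the signature strings are copied from the two logs above, while the `KNOWN_SIGNATURES` table, the diagnosis wording, and the `diagnose` helper are illustrative assumptions, not part of Hive or Tez.

```python
# Message fragments copied from the logs above, mapped to a short diagnosis.
# The diagnosis texts are this sketch's own wording.
KNOWN_SIGNATURES = {
    "Failed to connect to timeline server": "Timeline Server is unreachable",
    "Could not setup history acls": "Tez disabled history logging",
    "Failed to execute tez graph": "Hive could not run the Tez DAG",
}


def diagnose(log_text):
    """Return the diagnoses whose signature occurs in log_text, in first-seen order."""
    hits = []
    for line in log_text.splitlines():
        for signature, diagnosis in KNOWN_SIGNATURES.items():
            if signature in line and diagnosis not in hits:
                hits.append(diagnosis)
    return hits
```

Feeding the application log above into `diagnose` would flag the unreachable Timeline Server first, which is the lead followed in the rest of this post.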

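According to [1], the Timeline Server is essentially a more advanced version of the JobHistoryServer.

Since the JobHistoryServer runs as a standalone process that shows up in jps, the Timeline Server can presumably also be run as a standalone process that shows up in jps.

In other words, at least in a setup like this one, where Tez writes its history to the Timeline Server (see ATSHistoryLoggingService in the stack trace above), the Timeline Server must be running before Hive can use the Tez engine.

This analysis also shows that Tez-UI is optional: it is only an interface for displaying the logs of Tez runs.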

Solution:

Run the following in a terminal:

yarn timelineserver
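To confirm that the Timeline Server is actually up before rerunning the Hive query, one option is to probe its web port. This is only a sketch under assumptions: it presumes the server runs on the host `desktop` from the URL above and listens on 8188, the default port of `yarn.timeline-service.webapp.address`; adjust both if yarn-site.xml says otherwise. The helper name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure, and timeout.
        return False
```

For example, `is_port_open("desktop", 8188)` should return True once the server has finished starting. An open port only shows that something is listening there, so checking jps for the Timeline Server process remains a useful second check.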

 

 

Reference:

[1] YARN Timeline Server介绍 (Introduction to the YARN Timeline Server)
