[FLINK-8070][yarn][tests] Print errors found in log files #5012

Closed
wants to merge 2 commits into from

Contributor

@zentol zentol commented Nov 14, 2017

What is the purpose of the change

This PR modifies YarnTestBase so that exceptions found in the log files are printed in the test failure message.

The output now looks like this:

java.lang.AssertionError(Found a file /home/Zento/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1510667711263_0001/container_1510667711263_0001_01_000001/jobmanager.log with a prohibited string (one of [Exception, Started [email protected]:8081]). Excerpts:
[
java.lang.Exception: Could not create actor system
        at org.apache.flink.runtime.clusterframework.BootstrapTools.startActorSystem(BootstrapTools.java:171)
        at org.apache.flink.runtime.clusterframework.BootstrapTools.startActorSystem(BootstrapTools.java:115)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.runApplicationMaster(YarnApplicationMasterRunner.java:313)
        at org.apache.flink.yarn.YarnApplicationMasterRunner$1.call(YarnApplicationMasterRunner.java:199)
        at org.apache.flink.yarn.YarnApplicationMasterRunner$1.call(YarnApplicationMasterRunner.java:196)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.run(YarnApplicationMasterRunner.java:196)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.main(YarnApplicationMasterRunner.java:123)
Caused by: java.lang.VerifyError: Inconsistent stackmap frames at branch target 152
])
        at org.junit.runners.model.MultipleFailureException.assertEmpty(MultipleFailureException.java:67)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:39)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

Verifying this change

Create some failure condition in the YARN tests and check the test output. (Luckily, I have one on hand locally due to FLINK-8071.)

@greghogan
Contributor

LGTM. I'm puzzled why the ]) is printed out-of-order in the middle of the stack trace.

@zentol
Contributor Author

zentol commented Nov 21, 2017

This particular output combines two things: the stack trace of the AssertionError itself

java.lang.AssertionError(... exception message ...)
        at org.junit.runners.model.MultipleFailureException.assertEmpty(MultipleFailureException.java:67)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:39)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

and the excerpts embedded in the exception message of the AssertionError, printed as a list of exceptions (hence the enclosing [ and ])

Found a file /home/Zento/flink/flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1510667711263_0001/container_1510667711263_0001_01_000001/jobmanager.log with a prohibited string (one of [Exception, Started [email protected]:8081]). Excerpts:
[
java.lang.Exception: Could not create actor system
        at org.apache.flink.runtime.clusterframework.BootstrapTools.startActorSystem(BootstrapTools.java:171)
        at org.apache.flink.runtime.clusterframework.BootstrapTools.startActorSystem(BootstrapTools.java:115)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.runApplicationMaster(YarnApplicationMasterRunner.java:313)
        at org.apache.flink.yarn.YarnApplicationMasterRunner$1.call(YarnApplicationMasterRunner.java:199)
        at org.apache.flink.yarn.YarnApplicationMasterRunner$1.call(YarnApplicationMasterRunner.java:196)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.run(YarnApplicationMasterRunner.java:196)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.main(YarnApplicationMasterRunner.java:123)
Caused by: java.lang.VerifyError: Inconsistent stackmap frames at branch target 152
]

The formatting isn't ideal, but I don't know of an easy way to change it. (And it's still better than nothing.)
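
To illustrate why the closing ] lands before the JUnit frames, here is a minimal sketch. It assumes the excerpts are held in a List<String> and appended to the message via List.toString(); the class name, method name, and sample strings are hypothetical and not taken from YarnTestBase.

```java
import java.util.Arrays;
import java.util.List;

public class FailureMessageFormat {

    /**
     * Builds a failure message and appends the excerpts using List.toString(),
     * which wraps the elements in "[" and "]".
     * (Hypothetical helper, not the actual YarnTestBase code.)
     */
    public static String buildMessage(String file, String[] prohibited, List<String> excerpts) {
        return "Found a file " + file
                + " with a prohibited string (one of " + Arrays.toString(prohibited)
                + "). Excerpts:\n" + excerpts;
    }

    public static void main(String[] args) {
        // A single multi-line excerpt, shortened for illustration.
        List<String> excerpts = Arrays.asList(
                "\njava.lang.Exception: Could not create actor system\n"
                        + "Caused by: java.lang.VerifyError: Inconsistent stackmap frames\n");

        AssertionError error = new AssertionError(
                buildMessage("jobmanager.log", new String[] {"Exception"}, excerpts));

        // The whole message, including the closing "]", is printed first;
        // the frames of the AssertionError follow after it.
        error.printStackTrace();
    }
}
```

Because the excerpt list is part of the message, any printer that emits the message before the frames will show the closing bracket in the middle of the combined output.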

@zentol
Contributor Author

zentol commented Nov 21, 2017

Also, the excerpt for this particular exception stops after Caused by: java.lang.VerifyError: Inconsistent stackmap frames at branch target 152 because the next lines don't look like a stack trace.

Here's a snippet:

        at org.apache.flink.yarn.YarnApplicationMasterRunner.run(YarnApplicationMasterRunner.java:196)
        at org.apache.flink.yarn.YarnApplicationMasterRunner.main(YarnApplicationMasterRunner.java:123)
Caused by: java.lang.VerifyError: Inconsistent stackmap frames at branch target 152
Exception Details:
  Location:
    akka/dispatch/Mailbox.processAllSystemMessages()V @152: getstatic
  Reason:
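
A rough sketch of this kind of cut-off, for illustration only. It assumes an excerpt starts at the first line containing the marker and continues only while the following lines look like stack frames or "Caused by:" headers. The class name, method name, and regular expression are hypothetical and not the code from this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class StackTraceExcerpt {

    // Assumed heuristic: a continuation line is either a frame ("    at ...")
    // or a cause header ("Caused by: ...").
    private static final Pattern CONTINUATION =
            Pattern.compile("^(\\s+at\\s.*|Caused by: .*)$");

    /** Returns the first line containing the marker plus the trace-like lines that follow it. */
    public static List<String> extract(List<String> logLines, String marker) {
        List<String> excerpt = new ArrayList<>();

        // Find the first line that contains the marker.
        int start = -1;
        for (int i = 0; i < logLines.size(); i++) {
            if (logLines.get(i).contains(marker)) {
                start = i;
                break;
            }
        }
        if (start < 0) {
            return excerpt; // marker not found, empty excerpt
        }

        excerpt.add(logLines.get(start));
        // Keep following lines only while they still look like a stack trace.
        for (int j = start + 1; j < logLines.size(); j++) {
            String line = logLines.get(j);
            if (!CONTINUATION.matcher(line).matches()) {
                break;
            }
            excerpt.add(line);
        }
        return excerpt;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "java.lang.Exception: Could not create actor system",
                "        at org.apache.flink.yarn.YarnApplicationMasterRunner.main(YarnApplicationMasterRunner.java:123)",
                "Caused by: java.lang.VerifyError: Inconsistent stackmap frames at branch target 152",
                "Exception Details:",
                "  Location:");
        // Prints the first three lines; "Exception Details:" ends the excerpt.
        extract(log, "Exception").forEach(System.out::println);
    }
}
```

With a heuristic like this, the VerifyError detail block is dropped because "Exception Details:" is neither a frame nor a cause header.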

@zentol
Contributor Author

zentol commented Nov 22, 2017

merging.

@asfgit asfgit closed this in 7a434c3 Nov 22, 2017
asfgit pushed a commit that referenced this pull request Nov 22, 2017
@zentol zentol deleted the 8070 branch November 22, 2017 11:03
@greghogan
Contributor

Ah, thanks @zentol for the clarification. Very nice to have this!
