CLOUDSTACK-9299: Out-of-band Management for CloudStack #1502

Merged
merged 5 commits into from
May 12, 2016

rohityadavcloud
Member

@rohityadavcloud rohityadavcloud commented Apr 20, 2016

Support access to a host’s out-of-band management interface (e.g. IPMI, iLO,
DRAC, etc.) to manage host power operations (on/off etc.) and query the current
power state in CloudStack.

Given the wide range of out-of-band management interfaces such as iLO and iDRAC,
the service implementation allows for development of separate drivers as plugins.
This feature comes with an ipmitool-based driver that uses
ipmitool (https://linux.die.net/man/1/ipmitool) to communicate with any
out-of-band management interface that supports IPMI 2.0.
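
As a rough illustration of what an ipmitool-based driver has to do, the sketch below assembles an ipmitool command line for a power operation. The names `IpmitoolCommandSketch`, `PowerOp` and `powerCommand` are hypothetical and are not taken from this PR; the `-I lanplus`, `-H`, `-p`, `-U` and `-P` options and the `chassis power` subcommands are standard ipmitool usage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; not the driver code from this PR.
final class IpmitoolCommandSketch {
    // Operations that map onto "ipmitool chassis power <op>".
    enum PowerOp { ON, OFF, CYCLE, RESET, SOFT, STATUS }

    // Builds the argument list for one power operation against one host.
    static List<String> powerCommand(String ipmitoolPath, String host, int port,
                                     String user, String password, PowerOp op) {
        List<String> cmd = new ArrayList<>(Arrays.asList(
                ipmitoolPath,
                "-I", "lanplus",              // IPMI 2.0 (RMCP+) LAN interface
                "-H", host,                   // address of the management interface
                "-p", String.valueOf(port),   // remote port
                "-U", user,
                "-P", password));
        cmd.addAll(Arrays.asList(
                "chassis", "power", op.name().toLowerCase(Locale.ROOT)));
        return cmd;
    }
}
```

The resulting list could be handed to `ProcessBuilder`. Keeping the arguments as a list rather than a single string avoids shell quoting problems with passwords that contain spaces or special characters.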

This feature enables the following common use cases:

  • Restarting stalled/failed hosts
  • Powering off under-utilised hosts
  • Powering on hosts for provisioning or to increase capacity
  • Allowing system administrators to see the current power state of the host
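
For the last use case, a driver has to turn the tool's output into a power state. A minimal sketch, assuming the usual `ipmitool chassis power status` output of the form `Chassis Power is on` or `Chassis Power is off`; the `PowerStateSketch` class and its `PowerState` enum are illustrative names, not the types introduced by this PR.

```java
import java.util.Locale;

// Hypothetical sketch; not the types introduced by this PR.
final class PowerStateSketch {
    enum PowerState { ON, OFF, UNKNOWN }

    // Maps the text printed by "ipmitool chassis power status" to a state.
    static PowerState parse(String output) {
        if (output == null) {
            return PowerState.UNKNOWN;
        }
        String text = output.trim().toLowerCase(Locale.ROOT);
        if (text.endsWith("is on")) {
            return PowerState.ON;
        }
        if (text.endsWith("is off")) {
            return PowerState.OFF;
        }
        // Anything else (errors, empty output) is reported as unknown.
        return PowerState.UNKNOWN;
    }
}
```

Falling back to `UNKNOWN` rather than throwing keeps a failed or garbled status query from being mistaken for a powered-off host.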

For testing this feature, please install ipmitool (using yum/apt/brew) and ipmisim:
https://pypi.python.org/pypi/ipmisim

The default ipmitool location is assumed to be /usr/bin; if this is different in your environment, please change the global setting. See the FS for details on the various global settings.

FS:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Out-of-band+Management+for+CloudStack

/cc @jburwell @swill @abhinandanprateek @murali-reddy @borisstoyanov

@rohityadavcloud rohityadavcloud force-pushed the outofband-master branch 5 times, most recently from 497906b to f76ab8e Compare April 20, 2016 08:57
@rohityadavcloud
Member Author

rohityadavcloud commented Apr 20, 2016

UI screenshots:

In host view:
screenshot from 2016-04-20 14-34-00

In host metrics view:
screenshot from 2016-04-20 14-34-08

Quick action view:
screenshot from 2016-04-20 14-33-55

Configure box:
screenshot from 2016-04-20 14-34-17

Issue power action:
screenshot from 2016-04-20 14-33-43

Information/detail tab:
screenshot from 2016-04-20 14-34-23

@rohityadavcloud rohityadavcloud changed the title [WIP] Don't start review yet -- CLOUDSTACK-9299: Out-of-band Management for CloudStack CLOUDSTACK-9299: Out-of-band Management for CloudStack Apr 20, 2016
@rohityadavcloud rohityadavcloud force-pushed the outofband-master branch 8 times, most recently from a77a527 to 6b03133 Compare April 22, 2016 09:54
}

private void validateParams() {
if (getHostId() == null || getHostId() < 1L) {
Member Author

@rohityadavcloud rohityadavcloud Apr 22, 2016

I'll remove validateParams() as soon as dynamic-roles gets merged to master; I'll then rebase this PR and use the new validations annotation field.

@rohityadavcloud rohityadavcloud force-pushed the outofband-master branch 2 times, most recently from a60d38b to d614c70 Compare April 22, 2016 10:43
@rohityadavcloud
Member Author

/////////////// API Implementation///////////////////
/////////////////////////////////////////////////////

private void validateParams() {
Contributor

Pity your new annotation parameter for @Parameter, validators, is not in yet.

@DaanHoogland
Contributor

I did a one-hour code review and found nothing that would make me 👎 it, so LGTM. @swill, I will however start integration tests and find more time, as the size of the change warrants a two-day review.

looks good @bhaisaab

@rohityadavcloud
Member Author

@DaanHoogland thanks, @borisstoyanov will share QA results next week as well

@swill
Contributor

swill commented Apr 22, 2016

Thanks guys. I will try to get this one queued up for CI.

@kiwiflyer
Contributor

Really exciting PR @bhaisaab!

We'll pull this in for testing as well.

@rohityadavcloud
Member Author

Thanks @swill @kiwiflyer

@DaanHoogland
Contributor

@swill I am running the tests on this one. Do you want to run it as well, or beat me to it?

@pyr
Contributor

pyr commented Apr 22, 2016

First read-through didn't raise any eyebrows on my end. LGTM

@swill
Contributor

swill commented Apr 22, 2016

Perfect, thanks @DaanHoogland. Once we get more people using bubble, maybe we can start posting to the PR that we are kicking off a CI run (like you did here) so we can be sure to make the best use of our limited resources. I have been helping @kiwiflyer and @dmabry get bubble set up in their environment, and I think they are all set now, so they will likely be contributing to this effort as well.

} finally {
if (Strings.isNullOrEmpty(stdError)) {
stdOutput = readStream(process.getInputStream());
stdError = readStream(process.getErrorStream());
Contributor

Have you considered using Guava's CharStreams.toString(new InputStreamReader(process.getErrorStream())) instead of readStream?
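
For comparison, here is a stdlib-only sketch of draining a process stream into a string. It is not the `readStream` from this PR: `StreamSketch` and `readFully` are illustrative names, and UTF-8 output is an assumption. Guava's `CharStreams.toString` takes a `Readable`, so the Guava variant would wrap the stream in an `InputStreamReader` in the same way.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the readStream helper from this PR.
final class StreamSketch {
    // Reads the whole stream as UTF-8 text and closes it afterwards.
    static String readFully(InputStream in) {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers are not forced to handle it inline.
            throw new UncheckedIOException(e);
        }
        return out.toString();
        // Guava equivalent of the loop above (assuming Guava is on the classpath):
        //   CharStreams.toString(new InputStreamReader(in, StandardCharsets.UTF_8))
    }
}
```

Whichever helper is used, reading stdout to the end before touching stderr can block if the child fills the stderr pipe first; reading the two streams on separate threads, or merging them with `ProcessBuilder.redirectErrorStream(true)`, avoids that.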

@rohityadavcloud rohityadavcloud force-pushed the outofband-master branch 2 times, most recently from f6ae5fa to 6aa0fbd Compare May 11, 2016 06:27
@rohityadavcloud
Member Author

@jburwell can you do a final review and either LGTM or share further improvements? Thanks.

@rohityadavcloud
Member Author

@nvazquez can you check why test_03_list_snapshots failed in the Travis run with a tearDown exception?

@nvazquez
Contributor

@rhtyd sure, I'll examine this. These tests were introduced in #1497; I'll work on this.

@nvazquez
Contributor

nvazquez commented May 11, 2016

@rhtyd @swill a PR fixing the problem: #1539

@serg38

serg38 commented May 11, 2016

@swill PR1539 passed Jenkins and Travis. Once you merge it, it should resolve the issue in other PRs.

CLOUDSTACK-9299: Out-of-band Management for CloudStack

Signed-off-by: Rohit Yadav <[email protected]>
- For out-of-band management feature (CLOUDSTACK-9299) use patched version of
  ipmitool that would work on trusty travis machines
- The ipmitool used is from xenial/16.04 release with patch from RedHat
  https://bugzilla.redhat.com/show_bug.cgi?id=1286035
- Installs ipmitool from xenial repositories to get all the dependencies
  and then install patched deb version
- Skip test if the known failure occurs

Signed-off-by: Rohit Yadav <[email protected]>
Increases timeout to a larger value to avoid failures in VM environments such as
TravisCI.

Signed-off-by: Rohit Yadav <[email protected]>
Reorder cleanup items so cleanup won't fail

Signed-off-by: Rohit Yadav <[email protected]>
This fixes several Jenkins failures, as previous runs don't clean up this
file created by one of the unit tests.

Signed-off-by: Rohit Yadav <[email protected]>
@rohityadavcloud
Member Author

@jburwell fixed the ProcessRunner issues, please do a final review and LGTM or suggest changes. Thanks.

@nvazquez @swill I've fixed two CI issues (Travis and Jenkins issues) in this PR as well

@rohityadavcloud
Member Author

tag:mergeready

/cc @swill all green now

@jburwell
Contributor

@swill @rhtyd LGTM based on code review

@asfgit asfgit merged commit 12fff7d into apache:master May 12, 2016
asfgit pushed a commit that referenced this pull request May 12, 2016
CLOUDSTACK-9299: Out-of-band Management for CloudStack

* pr/1502:
  maven: ignore utils/testsmallfileinactive for rat checking
  CLOUDSTACK-9378: Fix for #1497
  HypervisorUtilsTest: increate timeout to 8seconds
  travis: Use patched version of ipmitool for tests
  CLOUDSTACK-9299: Out-of-band Management for CloudStack

Signed-off-by: Will Stevens <[email protected]>
@swill
Contributor

swill commented May 12, 2016

I see the following error which is causing the PR #1297 to fail. Suggestions?

+---------------------------------------------+----------------------+--------+
| test_oobm_zchange_password                  | exceptions.Exception | 5.220  |
+---------------------------------------------+----------------------+--------+

More details:

======================================================================
ERROR: Tests out-of-band management change password feature
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/apache/cloudstack/test/integration/smoke/test_outofbandmanagement.py", line 560, in test_oobm_zchange_password
    response = self.apiclient.changeOutOfBandManagementPassword(cmd)
  File "/home/travis/.local/lib/python2.7/site-packages/marvin/cloudstackAPI/cloudstackAPIClient.py", line 1553, in changeOutOfBandManagementPassword
    response = self.connection.marvinRequest(command, response_type=response, method=method)
  File "/home/travis/.local/lib/python2.7/site-packages/marvin/cloudstackConnection.py", line 379, in marvinRequest
    raise e
Exception: Job failed: {jobprocstatus : 0, created : u'2016-05-12T19:06:38+0000', cmd : u'org.apache.cloudstack.api.command.admin.outofbandmanagement.ChangeOutOfBandManagementPasswordCmd', userid : u'629e5dba-1873-11e6-8612-42010a80001c', jobstatus : 2, jobid : u'35b81153-96cd-4d1e-8353-eb9b746dd145', jobresultcode : 530, jobresulttype : u'object', jobresult : {errorcode : 530, errortext : u'Failed to change out-of-band management password for host (7c3c6475-d9de-4892-a20e-c81b01351fcd) due to driver error: Failed to find IPMI user to change password, error: packet session id 0x0 does not match active session 0xa0a2a3a4\nERROR: Received an Unexpected message ID\nSet Session Privilege Level to ADMINISTRATOR failed\nError: Unable to establish IPMI v2 / RMCP+ session\nClose Session command failed\n'}, accountid : u'629e41c4-1873-11e6-8612-42010a80001c'}

----------------------------------------------------------------------
Ran 16 tests in 188.550s

FAILED (errors=1)

@swill
Contributor

swill commented May 12, 2016

Jenkins for #1537 is also being held up by these tests:

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.cloudstack.outofbandmanagement.driver.ipmitool.IpmitoolWrapperTest
Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 300.337 sec <<< FAILURE! - in org.apache.cloudstack.outofbandmanagement.driver.ipmitool.IpmitoolWrapperTest
testExecuteCommands(org.apache.cloudstack.outofbandmanagement.driver.ipmitool.IpmitoolWrapperTest)  Time elapsed: 300.086 sec  <<< FAILURE!
java.lang.AssertionError: null
    at org.junit.Assert.fail(Assert.java:86)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at org.junit.Assert.assertTrue(Assert.java:52)
    at org.apache.cloudstack.outofbandmanagement.driver.ipmitool.IpmitoolWrapperTest.testExecuteCommands(IpmitoolWrapperTest.java:112)

Results :
Failed tests: 
  IpmitoolWrapperTest.testExecuteCommands:112 null

Tests run: 8, Failures: 1, Errors: 0, Skipped: 0

@rohityadavcloud
Member Author

@swill thanks, will have a look at it

@rohityadavcloud
Member Author

@swill I've tried to fix them here: #1544
