Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Secure environment variables encrypted in Yaml #1145

Closed
bradrydzewski opened this issue Aug 18, 2015 · 36 comments
Closed

Secure environment variables encrypted in Yaml #1145

bradrydzewski opened this issue Aug 18, 2015 · 36 comments
Milestone

Comments

@bradrydzewski
Copy link

Currently we have the ability to store private environment variables in the database. I'd like to expand this capability to store encrypted environment variables in the yaml itself, similar to travis.

secure:
  docker_user: U2FsdGVkX19g96OdOIRPtVNGzPP1CgGlxRxRVDvlVHZKYrulqPLlYFG3TfUw53Gk
  docker_pass: U2FsdGVkX18B3tY76uPxLe2tA6UzaBrVMMsArAUS4bo=

the secure parameters are decrypted at runtime and injected into your build yaml, similar to how we inject private variables (entered in the UI) or matrix parameters:

deploy:
  docker:
    email: $$docker_user
    password: $$docker_pass

Unlike travis, injecting the private parameters as environment variables is not sufficient since publish, deploy and notify sections (which are typically the only sections that require such secrets) are executed outside of the build container and expect very specific parameters in the yaml. This is a result of the new 0.4 plugin model.

So for now we need to inject using the same syntax as matrix and private variables. I will spend some time, however, thinking up alternate approaches ...

@bradrydzewski bradrydzewski added this to the v0.4.0 milestone Aug 18, 2015
@bradrydzewski
Copy link
Author

cc @shawnzhu

@bradrydzewski
Copy link
Author

note that this implementation uses AES256 encryption because all parameters are encrypted via the endpoint. We use a per-repo encryption key (repo.Hash as the secret) that never leaves Drone nor is exposed to any end user or any build.

I'm hesitant to use rsa encryption because the rsa.EncryptPKCS1v15 function is considered insecure. The benefit of RSA is that we could encrypt with only the public key, and it could be done outside of Drone on a developers laptop, without sending private data to the Drone server. At the same time, Drone has the private key and can always decrypt, so I'm not sure it really buys much more security.

@tboerger
Copy link

And how do we get the encrypted value that have to be placed in the yaml?

@shawnzhu
Copy link

@tboerger one idea is creating a new sub-command for drone CLI to get the encrypted version:

$ drone encrypt <owner/repo-name> <sensitive-value>
U2FsdGVkX18B3tY76uPxLe2tA6UzaBrVMMsArAUS4bo=

@tboerger
Copy link

great if users can access the cli command standalone :)

@shawnzhu
Copy link

the way over new drone CLI sub command is just one way to enable user to encrypt sensitive data locally without sending to drone server.

it also provided a new endpoint /api/repos/:owner/:name/encrypt to encrypt request body, but I'm hesitate to keep this because the idea of using RSA is avoid sending sensitive information over network. @bradrydzewski comments?

@bradrydzewski
Copy link
Author

And how do we get the encrypted value that have to be placed in the yaml?

@tboerger we'll also expose something in the UI to generate the encrypted strings, for those that don't have the CLI installed

but I'm hesitate to keep this because the idea of using RSA is avoid sending sensitive information over network.

@shawnzhu I think both approaches have their pros and cons.

If we use EncryptOAEP we'll need to send the public key and secret hash down to the client to encrypt. If a team member uploads their own rsa public and private keys to the repository, it is possible a team member can decrypt any secrets I put in the yaml.

If we use AES we'll need to send the secret over the network and encrypt via the REST API. The key used to AES encrypt secrets is never shared or exposed to the end user. This means I could add a secret to a yaml and none of my team members would be able to decrypt it [1].

So it seems to come down to which you trust more, your network or your team members? I don't think there is any right answer here. I'm definitely open to more dialog.

[1] we currently don't inject any secrets in the build in 0.4 so that people can't echo them or leak them. Secrets are only sent to plugins, which are much more limited in functionality, and generally cannot execute arbitrary commands. Not sure if this will change / need to change in the future.

@shawnzhu
Copy link

@bradrydzewski my thought is the key difference is whether it allows user (repo owner) to inject their own encryption key (private key for RSA, or repo secret hash for AES). because one should trust team members (when adding them to administrate a hosted git repo) who have the equal privilege to maintain sensitive data.

I would prefer to:

  1. fix drone set-key command to be capable to change keypair via regenerate instead of injecting given private key (like how to manage repository secret hash)
  2. be capable to regenerate repository secret hash
  3. make this feature extensible like <AES>U2FsdGVkX18B3tY76uPxLe2tA6UzaBrVMMsArAUS4bo= and <RSA>some-long-base64-string

then we can keep existing AES based solution with an RSA solution. users should not care about this since it's generated by either a drone sub command or web form

@bradrydzewski
Copy link
Author

be capable to regenerate repository secret hash

important to note that the hash is used to sign the hook URL added to github etc. changing the hash would invalidate the hook, so we'll need to re-generate the hook

fix drone set-key command to be capable to change keypair via regenerate instead of injecting given private key (like how to manage repository secret hash)

the reason we added this feature was to allow people to share a key across multiple repos. There were a few different use cases, but the biggest was to avoid having to manually move deploy-keys to the user account-level to enable cloning other private repos (when using git submodules, npm deps from private repos, etc). So we would break one feature to add another :( This was a highly requested feature, so I'm sure it is a good idea to remove


Either way we're going to be sending decrypted secrets over the network. Even if we implement RSA encryption we still end up sending the decrypted values to the Docker daemon when launching the build (over tcp). The alternative is sending the encrypted values to the Docker daemon with the Key and Hash (still over tcp). So it seems using RSA only solves 1/2 of the problem :(

Any thoughts how to resolve the second half?

@shawnzhu
Copy link

important to note that the hash is used to sign the hook URL added to github etc. changing the hash would invalidate the hook

Yes, which means drone must update hook URL if repo hash is changed. but if one repo Hash never be leaked to client side, no need to expose any user interface to regenerate it.

I'm fine to leave drone set-key as it was if it is still a highly requested feature. However, one can not decrypt secured variables in yaml unless they got original encryption key (impossible over drone's user interface). I just want to avoid transferring encryption key over network.


it is another topic about trust issue between drone-build and docker machines where builds run. Docker itself enables TLS from transportation layer in default (it is another thing if drone administrator disable this). according to application layer, transferring encrypted values is acceptable. The only input I have is keeping shared Hash and/or Key possessed in a secure store (libtrust?). Anyway, specific operational security requirement to drone + docker would be a good starting point.

@bradrydzewski
Copy link
Author

@shawnzhu agreed, but if a team is very security conscious perhaps they should run drone with SSL? If you aren't running drone with SSL, an eavesdropper would have the ability to intercept the authorization token or intercept the GitHub access token at login and gain access to your GitHub account ...

I'm mostly playing devils advocate here (I don't mean to be difficult) but I want to understand if the real answer is "use SSL if you need strong security" ... if yes, we should heavily promote SSL in the documentation.

@shawnzhu
Copy link

@bradrydzewski yes, SSL/TLS should be default option to secure endpoint over HTTP. Per this question I found this from twitter API doc:

It is strongly recommended you use HTTPS for all OAuth authorization steps.

BTW, drone may use "insecure_ssl": "1" option when creating a github webhook for those who provides inscured SSL certificate (self signed, or not recognized internal CA) for drone host.

@bradrydzewski
Copy link
Author

@shawnzhu also may make sense to consider:
https://godoc.org/golang.org/x/crypto/nacl/box
https://godoc.org/golang.org/x/crypto/nacl/secretbox

Instead of using the RSA keys, we could generate an alternate set of keys strictly used for private variables. Secretbox seems like a pretty interesting alternative here

@shawnzhu
Copy link

nacl/box is also public key cryptography, my only concern is the length of encryption key is 256 (bit), which is smaller than that of drone's RSA private key which is 2048 (bit).

@bradrydzewski
Copy link
Author

here is an example using nacl/box purely for encryption:
https://gist.github.com/bradrydzewski/7bf42524aae5a1316df0

it doesn't use nacl/box for authentication, and I'm not entirely sure what impact this has on the overall effectiveness of the algorithm, but may be worth considering.

@bradrydzewski
Copy link
Author

@shawnzhu this is way outside my area of expertise, but AES uses a 128 bit to 256 bit key and it is considered pretty hard (impossible?) to brute force. Wondering if this is an issue for box.

@bradrydzewski
Copy link
Author

I see #1154 and I really like taking the sha256 of repo.Hash to avoid leaking the value outside of Drone. So when we run drone encrypt (eventually) would we send the sha256 of repo.Hash + public key down to the client?

@shawnzhu
Copy link

yes, this is what I thought. the downside is, technically the ciphertext (never change) of RSA OAEP leaked and it won't change until repo.Hash changes.

Actually this is because I tried my best to get a hash.Hash type from given repo.Hash without checksum but no luck.

@bradrydzewski
Copy link
Author

@shawnzhu awesome work. Testing it now, we may need to use base64.URLEncoding. The yaml file seems to be picky about string escaping so ideally we can use an encoding that just copies / pastes directly into the yaml without too much effort.

I'll play around with it more tomorrow

@bradrydzewski
Copy link
Author

@shawnzhu after some yaml parser issues I adjusted to use base64.RawURLEncoding which drops some of the special characters. I deployed and have started using this feature to secure DockerHub credentials so that I can automate plugin deployment:
http:https://beta.drone.io/drone-plugins/drone-git/7

Works like a charm 😄

I've been thinking about your prior comment regarding embedding the variable names in the encrypted string itself. So for example, instead of this:

secure:
  - DOCKER_PASS: WsQVT0g...
  - DOCKER_PASS: ECLQAnlq...

Doing this:

secure:
  - WsQVT0g...
  - ECLQAnlq...

I think this may actually be possible. We would still need to return a map[string]string with the decrypted variables so that they could be injected properly, but there is no reason we couldn't store them as a []string in the yaml file, decrypt, and split the string by = to separate the key from the value ...

Would love to hear your thoughts.

Also, I'm going to add a form to the UI so that people can generate encrypted strings without having the CLI installed. The CLI will still remain the recommended way to encrypt, so this will be more for convenience.

Once I add the UI functionality, and we decide on whether to embed the parameter names in the encrypted string, or keep external, we can close this one out!

@tboerger
Copy link

I don't like the variable included within secret. After half of a year nobody remembers which variables are already part of it

@bradrydzewski
Copy link
Author

@tboerger great point, I can see how that would get annoying. As a side not, you will still see $$param in the yaml itself, for example:

deploy:
  heroku:
    token: $$heroku_token
secure:
  heroku_token: <encrypted>

would alternatively look like this:

deploy:
  heroku:
    token: $$heroku_token
secure:
  - <encrypted>

so you can still see which private tokens are required by looking at the $$ sign. The only benefit (and I'm not sure it's a big benefit) is that you don't know which encrypted string is for which parameter. Is this a big enough benefit to change the implementation?

@tboerger
Copy link

I think the confusion will be bigger than benefit :)

What if you want to remove the encrypted value? You don't know which one :)

@bradrydzewski
Copy link
Author

I've run into one issue with RSA keys. In some cases developers want to encrypt large strings, for example, other keys or certificates used to sign and publish Android applications. RSA can only encrypt a maximum of 245 bytes 😢

@shawnzhu
Copy link

this should be a limitation of rsa.EncryptOAEP:

The message must be no longer than the length of the public modulus less twice the hash length plus 2.

but the number 245 (256 minus 11) looks like the limitation of rsa.EncryptPKCS1v15:

The message must be no longer than the length of the public modulus minus 11 bytes

do you think it is a good idea to improve the implementation by splitting long message into array of 245 bytes bytes array?

@benschumacher
Copy link

What if the solution was to offer both options, and leave it to users to determine what is most appropriate for their project? This would likely provide the most future-proof solution, too, as it would necessitate choosing a format for the encryption details that'd be flexible enough to accommodate different schemes, which provides both backwards and forward compatibility.

One option would be using one of the JOSE standards as the container for the encryption details and content, and then putting that entire JSON object (Base64 encoded, or otherwise) directly into the .drone.yml's secure content area. There's a Go library already for this standard (h/t @linuxwolf) that could be leveraged: https://github.com/square/go-jose

I know that Drone currently uses RSA for public keys, but has there been any investigation into using something elliptic curve-based instead? I'm not sure if that causes compatibility issues with GitHub, or other VCSes.

@benschumacher
Copy link

I have a friend (cc @hildjj) who mentioned that he was using this functionality in Travis CI. Might be interesting to get some feedback from somebody who's used the system in their environment. It looks like the Travis implementation in their CLI is just using RSA directly... I guess it works, but it feels like it could be restrictive.

@hildjj
Copy link

hildjj commented Aug 25, 2015

@benschumacher successfully summons me.

First: travis gets this really right. Their job is easier because they have a fixed service to talk to. However, you might be able to use the public key information you've already got from the cert that you offer for either TLS or SSH, which makes this not as bad as you might think.

Second, get a crypto person to help you. They're going to ask you what your threat model is first, so be prepared to answer that. A start:

  • I have a bit of plaintext that I want to share with a process that will run later on a machine that I can identify now.
  • I want to check an encrypted version of that plaintext into a file that is publicly available (e.g. through GitHub)
  • I don't really care if other authorized users can see the plaintext.
  • It is necessary for the machine running the process to see the plaintext.
  • I don't need replay protection at this layer.
  • The plaintext may be bigger than one RSA chunk with reasonable-sized keys.
  • The contents of the file are under control of attackers.

That that person will likely tell you: just use JOSE. You MUST do some public-key crypto here. AES alone likely doesn't solve the use case (but you'll actually use both RSA and AES before you're done). Then you'll want to add algorithm agility so you can support elliptic curves. JOSE has all of that stuff layered correctly out of the box:

https://tools.ietf.org/html/rfc7516

Next, please include the variable name in the plaintext, even if you feel like you must repeat it outside as a label for maintainability. I'm worried about them being able to assign the plaintext to a new variable, and use that to exfiltrate data or attack unexpected parts of the system. Note that your command-line tool might be able to make an API call to list the variable names relatively safely.

@bradrydzewski
Copy link
Author

@benschumacher @hildjj thanks for the input and I really like the suggestion to go down the JOSE path. I need to familiarize myself with the specification a bit more, but the Go implementation that @benschumacher referenced looks pretty solid.

Their job is easier because they have a fixed service to talk to. However, you might be able to use the public key information you've already got from the cert that you offer for either TLS or SSH, which makes this not as bad as you might think.

Yes, we can use the keys generated for SSH here

Next, please include the variable name in the plaintext, even if you feel like you must repeat it outside as a label for maintainability. I'm worried about them being able to assign the plaintext to a new variable

The drone implementation works slightly differently than Travis. The secure parameters are only provided to publish, deploy and notification plugins. Each plugin runs in its own Docker container. The main build container, which can execute arbitrary commands and arbitrary code, won't have access to secure variables to overwrite, alter or echo.

Regardless, we can include the variable names in the plaintext. I've received feedback from a few others that would prefer this approach as well.

Thanks again for jumping in here, much appreciated!

@benschumacher
Copy link

Next, please include the variable name in the plaintext, even if you feel like you must repeat it outside as a label for maintainability. I'm worried about them being able to assign the plaintext to a new variable

The drone implementation works slightly differently than Travis. The secure parameters are only provided to publish, deploy and notification plugins. Each plugin runs in its own Docker container. The main build container, which can execute arbitrary commands and arbitrary code, won't have access to secure variables to overwrite, alter or echo.

Regardless, we can include the variable names in the plaintext. I've received feedback from a few others that would prefer this approach as well.

This is an excellent point, and I hadn't thought about this. Being able to move the encrypted portion from one variable to another could expose a secret intended for one plugin container to another. It does make the data appear more opaque, but this could be addressed via comments in the YAML.

@bradrydzewski
Copy link
Author

This also has me thinking ...

What if the encrypted string included the plugin / step for which it was intended? This would prevent us from sending secrets to a plugin that shouldn't receive them. For example, let's say we encrypted the following string:

plugins/slack|SLACK_TOKEN=3da541559918a808c2402bba5012f6c60b27661c

When Drone executes a build, we could ensure SLACK_TOKEN only gets sent to the plugins/slack plugin. This eliminates a potential vector where a malicious pull request is issued, attempts to substitute the slack notification step with a different docker image, which attempts to leak or expose your secret variables. For example:

notify:
  slack:
-   image: plugins/slack
+   image: evil-plugins/slack
    token: $$slack_token
    channel: foo

We prevent the use of secrets in pull requests for this reason.. Travis does as well. We also white-list plugins for this reason. But ... since Drone will only send secrets to separate plugin containers, and never injects in the build environment ... if we can verify it is sent to the correct, trusted plugin, we should be able to safely enable secrets in pull requests.

(hopefully that made sense)

@shawnzhu
Copy link

I really like plugin whitelist feature which means it won't download evil plugins in drone via using docker hub trusted build. However, the following steps should be a threat:

  1. during build step, tag evil docker image to plugin/evil-plugin:latest within privileged build container, so that it injects an evil plugin with the whitelisted prefix.
  2. when plugin container runs within same docker machine, attacker creates a pull request to use plugin/evil-plugin and drone-build doesn't re-pull plugin image from a registry, then it will create plugin container from the previous tagged evil plugin image.

So I would say it should always re-pull plugin image in https://github.com/drone/drone/blob/0.4.0/cmd/drone-build/client.go#L94-L95 to make sure it always download from trusted docker registry.

@bradrydzewski
Copy link
Author

I really like plugin whitelist feature which means it won't download evil plugins in drone via using docker hub trusted build.

This isn't going away. We'll definitely keep this functionality

during build step, tag evil docker image to plugin/evil-plugin:latest within privileged build container, so that it injects an evil plugin with the whitelisted prefix.

The privileged, volumes, and net=host options are all disabled and can only be enabled for builds that are set as trusted by the system administrator. So while this attack is possible, someone would have to knowingly allow these vulnerabilities for a public repository. If someone allows this configuration, a malicious plugin is the least of their worries -- the attacker will own the whole box. So enabling trusted mode for a repository implies trust, and I would advise against enabling trusted mode if the repository is public.

The ability to build and publish Docker images in 0.4 no longer requires a privileged build container, and no longer allows access to the host machines Docker daemon. In 0.4, we have a special plugin that runs Docker in Docker to avoid this exploit. This plugin can only run a pre-defined set of commands (docker login, docker build and docker push) to prevent the user from running arbitrary commands.

There are plenty of potential exploits that we need to consider and I'm very glad you're thinking about them -- and we should document them all -- but probably in a separate issue or markdown file in /doc. Let's keep this discussion focused on secure parameters 😄

@hildjj
Copy link

hildjj commented Aug 27, 2015

@bradrydzewski I like that. I'd suggest that since you're Base64-protected already, you could use NULL's to separate the tokens (or better yet, length encode) in order to reduce the chances of parser attacks (what happens when someone puts in a plugin or variable name containing pipe or equal?).

plugins/slack\0SLACK_TOKEN\03da541559918a808c2402bba5012f6c60b27661c

@bradrydzewski
Copy link
Author

related to this issue, #1189 is a proposal by @benschumacher that we discuss and I've drafted. I would love to get some feedback.

@bradrydzewski
Copy link
Author

moving discussion to #1189

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants