-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TLS: add {verify, verify_peer}
to enable verification
#4575
Conversation
9fd56d5
to
3cb9716
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is right. We don't fix the TLS warning by weakening security. I think we either do nothing or we need to set this up correctly.
3cb9716
to
7d24b5a
Compare
b155ad2
to
32d36f4
Compare
71d54cd
to
9108a3c
Compare
9108a3c
to
6d602ac
Compare
6d602ac
to
9edbf4b
Compare
f24da6d
to
2359657
Compare
I agree with nick. The value of I've also suggested elsewhere to move the logic from configure to dev/run as we only know the number of nodes at dev/run time. With that done it will be possible to automatically add the correct arguments to |
2359657
to
aa0b6f4
Compare
90e9ee7
to
5cc5aaa
Compare
dev/run
Outdated
"enable_tls": opts.enable_tls, | ||
"tcp_node": opts.tcp_node, | ||
"hostname": socket.gethostname() if opts.enable_tls else "127.0.0.1", | ||
"bind_address": "0.0.0.0" if opts.enable_tls else "127.0.0.1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this necessary? Ideally the dev nodes remain bound only to 127.0.0.1. Is something in openssl forcing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTP 26, 127.0.0.1
works fine because I can specify server_name_indication
in conf file.
However in OTP 25, if the IP is 127.0.0.1
, I tried different hostnames (CN), and the verify process always returns hostname_check_failed
. If I set local.ini bind_address to 0.0.0.0
, using my hostname as CN, no errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to override server_name_indication? I thought the CN once set to the erlang node name would be the right value. did that not work? ("/[email protected]"
, I mean.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np, [email protected]
does not work. CN needs to be equal to the hostname
.
e.g.: If the node name is ssl_test@localhost
, CN needs to equal localhost
.
I've tried different CNs, such as 127.0.0.1, {127,0,0,1}, 127.0.1.1, localhost, localhost.local, $hostname -s/-f
, etc. If the node name contains 127.0.0.1, they just return hostname_check_failed
in OTP 25.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's more complicated than that. reading inet_tls_dist.erl
makes it clear that they expect the host name part in the SAN field and the node name in a utf8string as a commonName under a directoryName. I have not yet persuaded openssl to generate a certificate that inet_tls_dist:cert_nodes/1
is happy with. I think the author(s) of inet_tls_dist
has made a cryptic choice;
%% Look in Extensions, in all subjectAltName:s
%% to find node names in this certificate.
%% Host names are picked up as a subjectAltName containing
%% a dNSName, and the first subjectAltName containing
%% a commonName is the node name.
%%
The nearest I can get so far is;
[{'Extension',{2,5,29,17},
false,
[{dNSName,"127.0.0.1"},
{directoryName,{rdnSequence,[[{'AttributeTypeAndValue',{2,5, 4,3},
<<12,7,99,111,117,99,104,100,98>>}]]}}]},
cert_nodes
finds the hostname but because the <<12,7,99,111,117,99,104,100,98>>
is not a utf8String
tuple it doesn't find the node name.
This is the (bizarre) way I have managed to add the dirName to the certificate. all other methods have failed (like the invitingly simple -addext
).
printf "[SAN]\nsubjectAltName=DNS:127.0.0.1,dirName:dir_ext\n[dir_ext]\nCN=couchdb\n" > ext.cnf
openssl x509 -req -days 3650 -set_serial 01 -in node-req.pem -out node-cert.pem -CA ca-cert.pem -CAkey ca-key.pem -extfile ext.cnf -extensions SAN
Thus far I think the openssl cli is mediocre and you can't actually do everything there that openssl is capable of. It seems you must use openssl conf files to do anything interesting.
This has, so far, been frustrating and illuminating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mkdir -p out
./certs self-signed \
--out-cert out/ca-cert.pem --out-key out/ca-key.pem \
--template root-ca \
--subject '/CN=CouchDB Root CA'
for NODE in $@; do
./certs create-cert \
--issuer-cert out/ca-cert.pem --issuer-key out/ca-key.pem \
--out-cert "out/${NODE}-cert.pem" --out-key "out/${NODE}-key.pem" \
--template node \
--subject '/OU=CouchDB' \
--host 127.0.0.1 \
--node "${NODE}"
done
1> Parse = fun(File) ->
1> {ok, PemBin} = file:read_file(File),
1> [{_, DerCert, _}] = public_key:pem_decode(PemBin),
1> OTPCert = public_key:pkix_decode_cert(DerCert,otp),
1> inet_tls_dist:cert_nodes(OTPCert) end.
#Fun<erl_eval.44.65746770>
2>
2> Parse("node1-cert.pem").
["[email protected]"]
3> Parse("node2-cert.pem").
["[email protected]"]
4> Parse("node3-cert.pem").
["[email protected]"]
5>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
standalone demo;
start_ssl.rel
{release, {"OTP APN 181 01","R15A"}, {erts, "10.7"},
[{kernel,"8.3.2.3"},
{stdlib,"3.17.2.2"},
{crypto, "5.0.6.3"},
{public_key, "1.12.0.1"},
{sasl, "4.1.2"},
{asn1, "5.0.18.1"},
{ssl, "10.7.3.6"}
]}.
> systools:make_script("start_ssl",[]).
ok
couchdb.sh
#!/bin/sh
mkdir -p out
./certs self-signed \
--out-cert out/ca-cert.pem --out-key out/ca-key.pem \
--template root-ca \
--subject '/CN=CouchDB Root CA'
for NODE in $@; do
./certs create-cert \
--issuer-cert out/ca-cert.pem --issuer-key out/ca-key.pem \
--out-cert "out/${NODE}-cert.pem" --out-key "out/${NODE}-key.pem" \
--template node \
--subject '/OU=CouchDB' \
--host 127.0.0.1 \
--node "${NODE}"
cat <<EOF > out/$NODE-dist.conf
[
{server, [
{cacertfile, "$(pwd)/out/ca-cert.pem"},
{certfile, "$(pwd)/out/${NODE}-cert.pem"},
{keyfile, "$(pwd)/out/${NODE}-key.pem"},
{secure_renegotiate, true},
{verify, verify_peer},
{fail_if_no_peer_cert, true}
]},
{client, [
{cacertfile, "$(pwd)/out/ca-cert.pem"},
{certfile, "$(pwd)/out/${NODE}-cert.pem"},
{keyfile, "$(pwd)/out/${NODE}-key.pem"},
{secure_renegotiate, true},
{verify, verify_peer},
{fail_if_no_peer_cert, true}
]}
].
EOF
done
./couchdb.sh node1 node2 node3
then start up two nodes;
erl -boot ./start_ssl -proto_dist inet_tls -ssl_dist_optfile out/node1-dist.conf -name [email protected]
erl -boot ./start_ssl -proto_dist inet_tls -ssl_dist_optfile out/node2-dist.conf -name [email protected]
then ping
Eshell V12.3.2.9 (abort with ^G)
([email protected])1> net_adm:ping('[email protected]').
pong
([email protected])2>
no Authenticity is not established by certificate path validation
warning, yay!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a remsh using inet_tls_dist but without presenting a client certificate is rejected as expected;
erl -boot ./start_ssl -name [email protected] -hidden -proto_dist inet_tls -remsh [email protected]
Erlang/OTP 24 [erts-12.3.2.9] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit]
=WARNING REPORT==== 25-May-2023::13:09:35.881585 ===
Description: "Authenticity is not established by certificate path validation"
Reason: "Option {verify, verify_peer} and cacertfile/cacerts is missing"
*** ERROR: Shell process terminated! (^G to start new job) ***
=NOTICE REPORT==== 25-May-2023::13:09:36.033411 ===
TLS client: In state cipher received SERVER ALERT: Fatal - Handshake Failure
([email protected])1> =NOTICE REPORT==== 25-May-2023::13:09:36.023069 ===
TLS server: In state certify at tls_dtls_connection.erl:315 generated SERVER ALERT: Fatal - Handshake Failure
- no_client_certificate_provided
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the info, especially for generating the correct certificates to pass validation!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thank you, @rnewson, for figuring it out. This would a nice resource for users to start with and get a decent setup without too much of a headache.
5cc5aaa
to
5fce901
Compare
{verify, verify_peer}
to enable verification
6494515
to
b04ad5b
Compare
9676e9d
to
2f505dd
Compare
@@ -24,6 +24,6 @@ if [ -z $HOST ]; then | |||
fi | |||
|
|||
NAME="remsh$$@$HOST" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will we need to generate a remsh certificate dynamically now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
erl -name $NAME -remsh "$NODE@$HOST" -hidden -proto_dist inet_tls -ssl_dist_optfile "src/couch_dist/certs/out/${NODE}-dist.conf"
Maybe not, since we're using a dynamic "${NODE}-dist.conf" and pointing to the relevant certificates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
despite the code in inet_tls_dist.erl
that looks for the dirname entry (and parses it out in cert_nodes/1), it appears to be ignored.
It suffices to have a subjectAltName of the hostname. My elixir-certs fork is unnecessary and should not be used.
The cert resulting from
./certs create-cert --issuer-cert out/ca-cert.pem --issuer-key out/ca-key.pem \
--out-cert out/node-cert.pem --out-key out/node-key.pem \
--template server --subject '/CN=127.0.0.1'
is sufficient and can be used as the client and server cert of all nodes with the same hostname, which is all of the dev nodes.
Obviously a 'real' cluster would have distinct ip's or hostnames for each node, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So
commit 5586959ed008c09141689f1e8865476150e48519
Author: Raimo Niskanen <[email protected]>
Date: Thu Apr 26 09:37:24 2018 +0200
Allow check for node name
I think introduced the ability to check the node names from the data in the certificate
and
commit 794df8cbba8d7942dcb3bf2cbdfa526b04d41dd3
Author: Raimo Niskanen <[email protected]>
Date: Fri Jun 8 15:49:06 2018 +0200
Use public_key to verify client hostname
removed it (and used the net_kernel:allow list to restrict node names instead).
they just left the cert_nodes function in but it's unreachable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I keep your elixir-certs
fork in this demo, as it might be used for a real
cluster setup or maybe in the future?
I can modify the gen_cert
script to use the server
template and point to one conf and certificate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was maybe optional previously with {verify_fun, {inet_tls_dist, verify_client, undefined}},
option, but I am getting confused at this point.
Jessica and I have independently shown that '[email protected]' can start up with node2-dist.conf (which points at node2-cert.pem which has the dirname of 'node2') and still ping '[email protected]' which uses node2-dist.conf also.
So erlang does not check the node name embedded in the cert and apparently cannot be configured to do so.
the inet_tls_dist:cert_nodes/1
function (and parse_extensions) just exists to tease me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RaimoNiskanen in case you might find our journey amusing ^.
2f505dd
to
766cf2e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work on this so far. a few comments before we can merge.
src/couch_dist/README.md
Outdated
|
||
- `erlserver.pem`: contains the certificate and its private key. | ||
- `{verify, verify_peer}`: you can specify the hostname with `{server_name_indication, <hostname>}`. | ||
- `{fail_if_no_peer_cert, true}`: should be used on the server side only, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fail_if_no_peer_cert
should be set on client and server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The connection succeeds on OTP 24 and 25 but fails on OTP 26.
./dev/remsh-tls
fails to connect to node1 if {fail_if_no_peer_cert, true}
is present on the client side.
See https://www.erlang.org/blog/otp-26-highlights/#ssl-improved-checking-of-options
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that tells us remsh-tls is not right. It needs to present a valid certificate too. It wouldn't be very secure if you could choose not to present a certificate to join the cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding this to the gen_cert
script?
Generate different couch_dist.conf
files for different erlang versions.
ERL_VER=$(erl -eval "io:put_chars(erlang:system_info(otp_release)), halt()." -noshell)
if [ $ERL_VER -ge "26" ]; then
OPTS="{verify, verify_peer}"
else
OPTS="{verify, verify_peer},
{fail_if_no_peer_cert, true}"
fi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In OTP 26, the checking of options is strengthened to return errors for incorrect options that used to be silently ignored. For example, ssl now rejects the fail_if_no_peer_cert option if used for the client:
In OTP 25, the option would be silently ignored.
https://www.erlang.org/blog/otp-26-highlights/#ssl-improved-checking-of-options
Or just leave {fail_if_no_peer_cert, true}
on the server side. Because when we specify {verify, verify_peer}
, it requires us to provide cacertfile, certfile, and keyfile to connect successfully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you. I re-read https://www.erlang.org/doc/apps/ssl/ssl_distribution.html and I think you are right. no need for {fail_if_no_peer_cert, true}
on the client config side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert gen_cert
script back, so {fail_if_no_peer_cert, true}
would only appear on server config side.
1b56df4
to
3a19aed
Compare
When the CouchDB custom (couch) distribution is enabled, OTP 24 and 25, you will get a warning when using `remsh-tls` or `remsh -t`; but in OTP 26, this is an error. Because the default value of the verify option is `verify_peer` instead of `verify_none`. We need to provide the appropriate CA certificates to pass the validation. ```bash $ ./dev/remsh-tls =WARNING REPORT==== 4-May-2023::12:43:28.893022 === Description: "Server authenticity is not verified since certificate path validation is not enabled" Reason: "The option {verify, verify_peer} and one of the options 'cacertfile' or 'cacerts' are required to enable this." ``` ```bash $ ./dev/remsh-tls Could not connect to "[email protected]" ``` - Add `{verify, verify_peer}` to enable verification - Add `certs` in `couch_dist` app to generate proper certificates - Add `-t`/`--enable-tls` mode in ./dev/run to automatically generate vm.args, certificates and configuration files - Add `--no-tls <node>` option to specify node to use TCP only - Add README and documentation - Remove TLS certificate generation code from `configure` Greate thanks to Robert Newson for figuring out how to generate the correct certificates to pass verification! Ref: - https://www.erlang.org/doc/apps/ssl/ssl_distribution.html - https://www.erlang.org/doc/man/ssl_app.html - https://www.erlang.org/blog/otp-26-highlights/#ssl-safer-defaults - https://github.com/rnewson/elixir-certs
3a19aed
to
d92f9df
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me, great job!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Tested on 24 and 26 with a 3 node dev/run cluster. Cluster started, all nodes connected via TLS, remsh-tls worked.
net_kernel:nodes_info().
{ok,[{'[email protected]',[{owner,<0.456.0>},
{state,up},
{address,{net_address,{{127,0,0,1},60461},
"127.0.0.1",tls,inet}},
{type,normal},
{in,69},
{out,98}]},
{'[email protected]',[{owner,<0.1176.0>},
{state,up},
{address,{net_address,{{127,0,0,1},60489},
"127.0.0.1",tls,inet}},
{type,hidden},
{in,14},
{out,23}]},
{'[email protected]',[{owner,<0.449.0>},
{state,up},
{address,{net_address,{{127,0,0,1},60463},
"127.0.0.1",tls,inet}},
{type,normal},
{in,76},
{out,103}]}]}
Fantastic work @jiahuili430 and @rnewson !
When the CouchDB custom (couch) distribution is enabled, OTP 24 and 25,
you will get a warning when using
remsh-tls
orremsh -t
; but in OTP 26,this is an error. Because the default value of the verify option is
verify_peer
instead of
verify_none
. We need to provide the appropriate CA certificates topass the validation.
$ ./dev/remsh-tls Could not connect to "[email protected]"
{verify, verify_peer}
to enable verificationcerts
incouch_dist
app to generate proper certificates-t
/--enable-tls
mode in ./dev/run to automatically generatevm.args, certificates and configuration files
--no-tls <node>
option to specify node to use TCP onlyconfigure
Greate thanks to Robert Newson for figuring out how to generate the
correct certificates to pass verification!
Ref:
Overview
Testing recommendations
Related Issues or Pull Requests
Checklist
rel/overlay/etc/default.ini
src/docs
folder