Customizing SPIFFE ID format if using an external SPIFFE-compliant SDS should be supported #43105
Comments
Our authz API requires extracting the source principal from SPIFFE ID, for example, to restrict access by the source namespace. How would you support this API with custom SPIFFE IDs? Similarly, our telemetry is very much "workload" oriented, meaning we drop the qualified pod name as soon as possible and only report the deployment name, in order to reduce the metric cardinality. In general, using per-pod identity might give a false sense of security. Kubernetes itself doesn't distinguish between pods of the same KSA from authorization perspective (in RBAC, etc), and even then, tenancy by SA is very weak, and namespaces are much closer to an isolation unit. |
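
For illustration, here is a minimal sketch of the kind of extraction the comment above describes: pulling the source namespace out of a SPIFFE ID so access can be restricted by namespace. It assumes the `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>` layout that is discussed later in this thread; the regex and the `source_namespace` helper are hypothetical and are not Istio's actual implementation.

```python
import re

# Assumed layout: spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
# (hypothetical pattern for illustration, not copied from Istio's source).
ISTIO_STYLE_ID = re.compile(r"spiffe://([^/]+)/ns/([^/]+)/sa/([^/]+)")


def source_namespace(spiffe_id):
    """Return the namespace of an Istio-style SPIFFE ID, or None if the ID
    does not match the strict layout exactly."""
    match = ISTIO_STYLE_ID.fullmatch(spiffe_id)
    return match.group(2) if match else None


# A default-style ID parses; an ID with extra trailing segments does not.
default_ns = source_namespace("spiffe://cluster.local/ns/foo/sa/bar")
custom_ns = source_namespace("spiffe://cluster.local/ns/foo/sa/bar/pod/bar-1234")
```

With a strict match like this, any ID that carries additional segments is rejected outright, which is the behaviour the rest of the thread debates.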
Couple of ways I see
Do our telemetry APIs rely on parsing the ISTIO-SPIFFE ID format today? If they do, the above solutions would work. If they do not, they shouldn't be affected.
Depends - my point is that attesting pod identity is the provenance and sole responsibility of the SDS server/workload CA you happen to be using - and the granularity to which identity is attested also belongs to that. Even today - the default If we support pluggable SDS servers, and we do, we should consider respecting whatever the workload CA attests (or at least just respect the parts we care about and ignore the rest), rather than creating downstream de-facto assumptions that constrain what the workload CA can attest and encode in the cert, which is frankly backwards. If you use the default istiod SDS workload CA, all we attest is Kubernetes service account, as attested by the Kubernetes API - but we are necessarily trusting SDS server to attest those things. If you swap that out for another SPIFFE-compliant SDS server, like say SPIRE, it can attest a superset of that - we don't have to care about the superset, but we shouldn't prevent the superset from being represented, which is what we do today. |
Yes, but that's an API change. We're very wary of making any semantic changes to the existing APIs since any change can potentially break users.
That could work, but our implementation does strict regex matching I think. We'd need to make sure pattern matching is backwards compatible.
It matters because we report principals literally as primary metric tags. Having a pod name as a principal will overwhelm the metric systems (none of them scale well to POD^2 cardinality).
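
A rough back-of-the-envelope sketch of the cardinality concern above. The deployment and pod counts are made-up numbers, and the model assumes the worst case where every source principal talks to every destination principal.

```python
def principal_pair_series(num_principals):
    """Worst-case number of (source principal, destination principal) tag
    combinations when every principal can talk to every other principal."""
    return num_principals * num_principals


# Hypothetical mesh size, chosen only for illustration.
deployments = 50
pods_per_deployment = 20

# One principal per deployment (workload-oriented identity).
per_workload = principal_pair_series(deployments)

# One principal per pod (pod name embedded in the identity).
per_pod = principal_pair_series(deployments * pods_per_deployment)

growth_factor = per_pod // per_workload
```

Under these assumptions the number of principal pairs grows with the square of the pods-per-deployment count, before any other metric labels are multiplied in.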
If SPIFFE certs are only used by Istio, then it's better to propose to SPIRE to generate Istio-compatible identities, because Istio in general simply doesn't make use of pod names in the APIs. The only issue is inter-op with another system that shares the identities, and for that, we'd need more details on what the other system is. |
Yep. Also, this would only matter to people using a nonstandard SDS. Anyone continuing to use the default
That is useful info and something to consider.
We may require a subset of the spec for our own purposes, but we should not disallow (or refuse to pass thru) a superset of our requirements that are fully within the spec we support just due to some naive validation rules on our part - that's what we do today, and IMO that's a bug and demonstrably not strictly necessary if you are already using an alternate SDS/workload CA due to nonstandard requirements. We should expect the things we need in the cert to be in the cert - if there are more things that external entities might want, that should be negotiated between the custom workload CA you are using to mint Istio workload certs, and your external entities - we shouldn't get in the middle of that and block it, or try to support all permutations of that ourselves. |
For what case do you need more granular workload identity? For stateless application, k8s designed deployment as a logic concept for a group of instances, and each one has same permission. Why does istio need to separate them for auth? |
It doesn't - but there are external systems or integrations that will handle workload certs that might want or need that (see #42114 and @costinm's use cases), and Istio should not prevent you from putting more granular workload identity in the certs than what Istio itself needs. Especially since we support alternative workload CAs that provide more granularity - we just make it unnecessarily difficult to use that additional granularity. Today Istio does prevent you from doing that, practically speaking, even if you use a non-default workload CA that supports this. Additionally, we support replacing the istiod-default workload CA with alternate workload CAs which can attest a much more granular identity than istiod can - which is good - but rather than passing thru additional granularity that the customizable workload CA might put in workload certs, we put an upper bound on it
|
@bleggett, you have hit the nail on the head. I'm working on a project now and would like to use Istio, but this Istio limitation stops me from choosing it, as we have our own SPIFFE CA that creates identities that don't follow the Istio-required pattern, even though they are 100% SPIFFE compliant! |
- modify cluster spiffe ids to use custom format - modify federation trust relationships to use new ids - add templated destination rule to workloads 1 and 2 with a DestinationRule as suggested in istio/istio#43105
I am quite in favor of having more flexibility in how we check identities and apply authz - but I am not sure a SPIFFE URL with the 'workload name' as part of the URL is the right solution. It will be critical for interop with other mesh implementations - that may not use SPIFFE but DNS or other identities, and it will also allow passing secured info about node, cluster, etc which are missing. One proposal that I think would solve this nicely ( and much more ) is to add a second SAN with the fully qualified Having a URL with hard to predict format and regex or other ugly ways to guess what variant of SPIFFE was used is quite dangerous and complicated. |
You'd have to do practically the same thing in the same way with a fully-qualified DNS name in most cases if you wanted to extract parts of the hostname identity as "descriptive metadata" - e.g. parsing segments out of

In general the problem here is that Istio is overly prescriptive of the SPIFFE format - it doesn't just expect certain fields, it precludes any other fields or additional specifiers from being used and imposes an ordering on the fields which are present, which is an unnecessary/overly-opinionated fragility that makes it impossible to consume SPIFFE IDs generated outside of Istio, among other things. Istio does not need to predict the SPIFFE format at all - it can simply expect certain named segments to exist in whatever SPIFFE URI it gets, as per the SPIFFE spec, and complain if they are not there. The fact that it mandates a complete SPIFFE format today is an Istio bug, really - if we need to integrate with things like e.g. Cilium I expect we will have to fix this and become more flexible WRT the SPIFFE formats we handle anyway in a way similar to what I'm describing above.

Instead of this, we could add a SAN, or extra non-compliant x509 cert fields - the thing that bothers me is that we don't really need to, if we fix the above.

At the end of the day though, I am interested in a standardized, not-just-Istio-parsable form of globally-unique workload identity, however we can get there. And I am very interested in Istio not coming up with its own mechanism for this. SPIFFE is designed to address exactly this problem, we already use it in spots, other projects also use it, and it's a well-defined CNCF spec - so IMO a clear and compelling argument needs to exist for why we shouldn't use it, if we don't want to. The best argument against it so far is "because K8S chose not to follow it as a standard" - which is fine, but since we span clusters and might have mesh interop concerns, we might have needs that extend beyond current K8S requirements. |
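
The "expect certain named segments and complain if they are not there" idea above could look roughly like the following sketch. It assumes the segment keys `ns` and `sa` used by the default layout; the function name and the key/value scanning heuristic are hypothetical, not an existing Istio or SPIFFE library API.

```python
from urllib.parse import urlsplit

# Keys this sketch treats as mandatory (an assumption based on the
# ns/<namespace>/sa/<service-account> layout discussed in this thread).
REQUIRED_KEYS = ("ns", "sa")


def parse_named_segments(spiffe_id, required=REQUIRED_KEYS):
    """Extract the required key/value segments from a SPIFFE ID path,
    ignoring any extra segments and not imposing an ordering."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError("not a SPIFFE ID: %r" % spiffe_id)

    segments = [s for s in parts.path.split("/") if s]
    found = {}
    i = 0
    while i < len(segments):
        key = segments[i]
        if key in required and key not in found and i + 1 < len(segments):
            # Consume the key and its value together.
            found[key] = segments[i + 1]
            i += 2
        else:
            # Unknown segment: skip it rather than reject the whole ID.
            i += 1

    missing = [k for k in required if k not in found]
    if missing:
        raise ValueError("missing required segments: %s" % ", ".join(missing))
    return {"trust_domain": parts.netloc, **found}


# Extra leading and trailing segments are tolerated.
identity = parse_named_segments(
    "spiffe://example.org/cluster/c1/ns/foo/sa/bar/pod/bar-1"
)
```

One limitation of this heuristic: an unrelated segment whose literal text happens to be `ns` or `sa` would be misread as a key, so a real implementation would need a stricter convention for where the named segments may appear.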
The other nice thing about SPIFFE is that identity can be described in a way that is not inherently rooted in DNS server trust, which is nice because not everyone can rely on a fully end-to-end attestably-secure DNS stack in all scenarios - Google and other cloud providers naturally do not have this problem within their own clouds. |
I don't see why a name in the hostname syntax is 'rooted' in DNS - it's
also an opaque identifier like an email address or URL.
You don't need to do any DNS lookup in any of the verifications you do.
Connections to www.google.com do not rely on
a secure DNS - but on the server having a signed certificate for the name.
In most cases for internal traffic - the DNS SAN will be the same with the
service - example.namespace.svc.cluster.local - and while a
DNS lookup is done by the client, the verification is based on the signed
certificate, so even if DNS is insecure the communication is secure.
That is not actually the case with Istio and SPIFFE today - if DNS is
hacked, the VIP of a different service can be returned and
the entire security and checks are messed up. This is well known and why
Istio REQUIRES a secure DNS to be secure.
There is nothing special about expressing something as URL instead of
hostname or email, from a security perspective. The advantage
of DNS over URL, when client authenticates the server, is that it is
independent of a discovery server mapping DNS to VIP to URLs,
and fully interoperable and well known mechanism.
|
I think the root problem is that SPIFFE is over-selling the use of a URL
(that in most cases is NOT a workload identity)
to magically make things secure, and ignoring the complexities and insecure
side-channels it introduces.
Even if they had a well defined schema - like the distinguished name or JWT
claims - it would still be tied to
a discovery system to map what users want - access example.namespace.svc -
to the URLs representing identity
in whatever control plane is used. Complexity and mappings are not good
for security.
SPIFFE to represent a client identity - no problem, it's as good as any
other opaque identifier. But if you want to extract
metadata - a schema like JWT would still be better than opaque URL.
|
If you want to do any sort of attesting that a specific identifier belongs to a specific service and all you are using is DNS, you have to trust the DNS server to make that attestation, and all it can reasonably express is a name <-> IP mapping. I think we're saying the same thing here. SPIRE identifiers are also fully opaque, but unlike DNS, SPIRE offers many forms of workload attestation that go far beyond trusting the DNS records a given server possesses. All DNS can do is attest the validity of a name <-> IP map entry. That's a weak form of workload identity, and is not multifactor.
This is a constraint of Istio (and kubernetes) yes. It has very little to do with SPIRE - I think the point you are making is that it will always be a constraint of Istio and Kubernetes, whether we used DNS or SPIRE to identify workloads, which I would certainly agree with - workload identity is one part of the puzzle.
Correct - the difference is in what attestations you can practically cryptographically attest against that identifier - DNS is not designed to attest anything besides a name <-> IP mapping, which by itself is not sufficient for attesting workload identity. |
Sure - parsing the specific fields of the identifier is out of scope of what SPIFFE (and SPIRE) offers. Same with the DNS naming you're suggesting - DNS has no such constraint or standard but conventions can be overlaid on it. The difference is that DNS identifiers are only designed to attest a single identity factor historically (and until recently didn't even offer any real security guarantees about that attestation), and SPIFFE is expressly more general than that - cryptographically binding attestations of multiple factors of workload identity (as SPIRE does to SPIFFE IDs) to DNS records is simply not something you can or will ever be able to do within the DNS standard. It's a fundamentally unsound basis for workload identity - unless you invent several layers of de-facto standards that live outside normative DNS implementations, at which point you arrive at something that looks exactly like what SPIFFE/SPIRE already is, but with DNS names instead of SPIRE IDs. Which seems like a rather extreme and unproductive form of NIH. And at that point what you have done is invest a lot of work to avoid using an existing standard, so you can craft another, even more de-facto standard around DNS records that is potentially worse, and certainly no better. I'm not against putting DNS records in certs as a shortcut, or for admitting that we are probably, in the short term, bound to what K8S has decided to do - but I am saying that (vanilla, secure or not) DNS records are not, in the long term, a sufficient mechanism for representing workload identity (or for acting as a generic identifier that more specific forms of workload identity attestations can be cryptographically bound to), unless we invent several layers of nonstandard extensions to/assumptions around DNS. And if we do that, we have essentially reinvented SPIFFE/SPIRE but done some violence to an older, established, and simpler standard to get there. |
On Mon, May 1, 2023 at 8:06 AM Ben Leggett ***@***.***> wrote:
> > I don't see why a name in the hostname syntax is 'rooted' in DNS - it's
> > also an opaque identifier like an email address or URL. You don't need to
> > do any DNS lookup in any of the verifications you do.
>
> If you want to do any sort of attesting that a specific identifier belongs
> to a specific service and all you are using is DNS, you have to trust the
> DNS server to make that attestation.
>
> I think we're saying the same thing here.
>
> SPIRE identifiers are also fully opaque, but unlike DNS, SPIRE offers many
> forms of workload attestation that go far beyond trusting the DNS records a
> given server possesses. All DNS can do is attest the validity of a name <->
> map. That's a weak form of workload identity, and is not multifactor.

Not sure what you mean by 'opaque' - Istio use of Spiffe is certainly not
opaque, and fully opaque identifiers are not very useful without a service
that
can provide info about them making them less opaque.
Spire != Spiffe and Istio is also a strange user - yes, it would be great
to have 'more workload attestations' ( I have a proposal about adding
telemetry info ),
that has little to do with Spire or Spiffe. A certificate or JWT can
attest/sign multiple things - including ownership of a hostname ( which may
be in DNS or not -
I think we should be more clear that FQDN is a host identifier which may or
may not be recorded in DNS and is defined in a different RFC).
I also don't understand what 'weak' means - a certificate or JWT represents
a set of signed statements that the signer has verified. What makes 'I can
attest
the FQDN - pod name, namespace and cluster suffix' weaker than 'I can
attest the service account, namespace and cluster' ? Same source ( K8S
APIserver and JWTs).
Multifactor is another thing I don't get - Istio doesn't have any
multi-factor concept.
I do agree that Spire ( and other CAs - but not Citadel because we chose
not to ) can attest more than service account. I don't understand the
opposition to have Citadel do this - but it's fine since other CAs can do
it. I suspect we do agree that more than service account should
be 'attested' - and I hope FQDN ( pod name, namespace, cluster) or service
FQDN would be among the things we treat as first class, since
the names are used as first class in k8s.
> > That is not actually the case with Istio and SPIFFE today - if DNS is
> > hacked, the VIP of a different service can be returned and the entire
> > security and checks are messed up. This is well known and why Istio
> > REQUIRES a secure DNS to be secure.
>
> This is a constraint of Istio (and kubernetes) yes. It has very little to
> do with SPIRE - I think the point you are making is that it will always be
> a constraint of Istio and Kubernetes, whether we used DNS or SPIRE to
> identify workloads, which I would certainly agree with - workload identity
> is one part of the puzzle.

I'm not sure what SPIRE has to do with this discussion - it is one of the
many certificate providers, and each CA can attest and include different
things in a cert.
What Istio and K8S are concerned with is attestations for the things we use
and verify - regardless of CA ( or form of attestation - JWT tokens are
also fine for
most use cases if used properly ). That is the service account (in
particular for servers), FQDN - and in an ideal world VIPs and IPs.
> > There is nothing special about expressing something as URL instead of
> > hostname or email, from a security perspective. The advantage of DNS over
> > URL, when client authenticates the server, is that it is independent of a
> > discovery server mapping DNS to VIP to URLs, and fully interoperable and
> > well known mechanism.
>
> Correct - the difference is in what attestations you can practically
> cryptographically attest against that identifier - DNS is not designed to
> attest anything besides a name <-> IP mapping, which by itself is not
> sufficient for attesting workload identity.

That is not true on multiple levels. DNS is not only for name to IP - it is
commonly used to represent for example certs associated with a name,
And nobody is discussing attesting the DNS A record - the cert is attesting
a FQDN, either a hostname ( pod name, namespace, cluster) or a service FQDN.
A waypoint may get a service cert - if the CA can attest that the gateway
is authorized (RBAC) to serve it, very much like ACME works.
The FQDN is intended to avoid the weaknesses of DNS - it is a proof you own
the hostname, even if DNS returns the wrong IP.
|
On Mon, May 1, 2023 at 8:18 AM Ben Leggett ***@***.***> wrote:
> > I think the root problem is that SPIFFE is over-selling the use of a URL
> > (that in most cases is NOT a workload identity) to magically make things
> > secure, and ignoring the complexities and insecure side-channels it
> > introduces.
> >
> > Even if they had a well defined schema - like the distinguished name or
> > JWT claims - it would still be tied to a discovery system to map what users
> > want - access example.namespace.svc - to the URLs representing identity in
> > whatever control plane is used.
>
> Sure - parsing the specific fields of the identifier is out of scope of
> what SPIFFE (and SPIRE) offers. Same with the DNS naming you're suggesting
> - DNS has no such constraint or standard but conventions can be overlaid on
> it.

There are a couple of RFCs and docs defining such constraints.
Again - this is not DNS, but hostnames, and the requirements for hostnames
( uniqueness on each domain, how it is represented in OS and language APIs,
etc) are well known.
Unfortunately K8S has some gaps - .cluster.local is good, but representing
cluster names in MCS is still a bit weak and needs improvements. However
for all use
cases we care about - since the client starts with a FQDN ( or just hostname
with clear expansion ) the naming must be working, that's the 'original
intent' and
everything we do is to make sure the destination is entitled to use it.
> The difference is that DNS identifiers are only designed to attest a
> single identity factor historically (and until recently didn't even offer
> any real security guarantees about that attestation), and SPIFFE is
> expressly more general than that - cryptographically binding attestations of
> multiple factors of workload identity (as SPIRE does to SPIFFE IDs) to DNS
> records is simply not something you can or will ever be able to do within
> the DNS standard.

FQDN identifiers can be attested in many ways. Nothing makes a URL more
'attestable' than a FQDN, and what a CA provider uses to attest is strictly
separated from the format of the signed info.
I still don't understand what DNS records have to do with any of this -
there are a lot of things in DNS secure and a lot of uses of DNS for
representing certs and trust, but I don't
see what it has to do with using the FQDN in the cert. There is no DNS
involved in what we are discussing ( except that original client must still
make a DNS request
with a FQDN to resolve the IP we intercept - and need to ultimately map to
a certificate - and if DNS is not secure the rest falls)
> It's a fundamentally unsound basis for workload identity - unless you
> invent several layers of de-facto standards that live outside normative DNS
> implementations, at which point you arrive at something that looks
> *exactly* like what SPIFFE/SPIRE already is, but with DNS names instead
> of SPIRE IDs.

I still don't understand what you mean. What workload identity - Istio
service account ? Spiffe is pretty vague on what is the identity of the
workload except an opaque URL,
and nothing makes it more 'sound' than any other opaque identifier.
While 'workload identity' is not clearly defined - the use of FQDN is very
sound and the basis of all internet communication - as well as what K8S and
Istio are
really handling, for client to server communication. What is unsound is the
disconnect between what user is using ( example.namespace.svc ) which is
clearly what needs to be validated because that's what the user wants to
communicate with - and the various vaguely defined and
implementation-specific
identities in Spiffe.
> And at that point what you have done is invest a lot of work to avoid
> using an existing standard, so you can craft another, even more de-facto
> standard around DNS records that is potentially worse, and certainly no
> better.

I don't know if this is sarcasm, hard to tell in comments :-)
The use of FQDN and hostnames in certificates is the most broadly used
standard on the internet and in enterprise - and K8S is not so different.
There is not a lot of work to use it - virtually every library and CA
supports it.
What we are doing with SPIFFE is investing a lot of work in something that
is certainly no better - as we know very well ( i.e. breaks interop,
has a strong dependency on a secure DNS, etc).
|
BTW - I want to be clear I do not disagree with you that including multiple
'attestations' in a certificate is a good thing, and Spiffe remains
essential for client identity ( attesting the service account of the client
) and is useful for some attestation about servers as well.
My point in these comments is that we must support the broader internet
standards for interoperability and compatibility - and that
includes using FQDN SANs and SNIs in our certs and support them first class.
|
I think the list of discussions there was toward the format that adds additional information to the end of the spiffe path, but it can be added in the beginning, e.g., Adding it would also solve the problem of multicluster deployment when the limitation is that you HAVE TO avoid namespace collision. Adding it will not have to change a lot of internals here because it is already part of the configuration. I understand that making the arbitrary SPIFFE ID support, as @bleggett mentioned, has some challenges to make it work with telemetry and access control. Still, using Istio for mTLS only and other systems like OPA for authorization might be a conscious choice. In this case, the user can opt-in to disable the default istio spiffe id template scheme. |
If you read the SPIFFE specification, you can see that it is not only about URL, and not only about x.509 as an identity document. SPIFFE is the only mechanism that can enable federation easily for Istio and as authentication in multiple CA federated environments spiffe auth prevents cases of identity spoofing because of the trust domain.
|
I was thinking: what if we allowed users to provide a scheme that Istio should use, and left the default as it is today: |
On Tue, Sep 26, 2023 at 11:53 PM Eli Nesterov ***@***.***> wrote:
> @costinm <https://github.com/costinm>
>
> > I think the root problem is that SPIFFE is over-selling the use of a URL
> > (that in most cases is NOT a workload identity)
> > to magically make things secure, and ignoring the complexities and insecure
> > side-channels it introduces.
>
> If you read the SPIFFE specification, you can see that it is not only
> about URL, and not only about x.509 as an identity document. SPIFFE is the
> only mechanism that can enable federation easily for Istio and as
> authentication in multiple CA federated environments spiffe auth prevents
> cases of identity spoofing because of the trust domain.

I wouldn't say it's the 'only' mechanism - even the existing Istio APIs
allow control over which CAs are accepted for a particular host.
Identity federation is not yet a supported feature in Istio and we don't
have agreement on what will be supported - so it's not a factor
beyond what our APIs already support ( both around CAs and OIDC ).
… and insecure side-channels it introduces.
I would love to learn more about side channels here :) in the case of
istio and SPIRE, it uses the same mechanism of delivery of X509-SVID to the
envoy. It actually makes it better from a security standpoint because
doesn't need to rely on service accounts only as a security mechanism
(which forces Istio users to create SA they might not even use or they all
just use default)
|
AFAIK you can already use any SAN you want - using the explicit APIs we
support.
I think we agreed ( at least a subgroup ) that for extra info we can use
the newly defined cert fields, believe
K8S got some assigned.
The risks of changing the identity format in Istio are hard to overstate...
For ambient - I don't mind making some
changes, but more towards using DNS certs for servers.
…On Wed, Sep 27, 2023 at 10:03 AM Eli Nesterov ***@***.***> wrote:
I was thinking: what if allowed to provide a scheme that should be used by
Istio and leave the default as it is today:
ns/namespace/sa/service-account in this case, if I opt-in using a
different scheme, I just need to provide it to Istio and as long it has
namespace and service accounts in the spiffe id everything should function
as it is now.
|
@costinm do you mean by using a DestinationRule, or some other mechanism? Could you point me to where I can read more about it? |
Also, is it possible to configure Envoy through Istio to use different SPIFFE ID formats in different contexts (e.g. internal inside the cluster, and external when talking to services outside of the mesh, e.g. via service records + destination rules)? |
"Identity Federation" usually means communicating with a peer that has a
different identity provider/roots.
It has 2 sides - client verifying the server and server authorizing a
client.
DestinationRule allows you to specify which root CAs to trust and any SAN
you want - URL or DNS are both fine.
For verifying a client using a different identity domain - with JWTs we
have normal OIDC, for client certs we can
also specify any CA we want - and I believe the authz rules still allow
rules using the full SAN ( in addition to
the extracted namespace and SA ).
This works not only with SPIFFE and URL-based SANs - and has been quite
stable for a long time. It is not
automatic or default.
OIDC also has a very long history with this - unfortunately the constraints
in certs were never broadly implemented,
just like Spiffe 'federation' is unlikely to have broad adoption - in
practice when you have 2 domains using
different identity providers there are some gateways in the middle, you
don't have 'flat network' which is
required for mTLS. So you end up with most of the federation config in the
gateway - and for that the current
APIs work as expected ( and many other gateways besides Istio provide
similar controls ).
I personally wouldn't mind too much more flexibility in the format of
client certificates - but I am strongly
against messing up with the identity of the servers, we already have a very
fragile and incompatible
mechanism in 'secure naming' ( i.e. guessing which identity a server might
have by looking at each pod that
is selected across the mesh is quite bad IMHO). And with ACME - including
'private' ACME as in step-ca -
I think we would be far better moving back to using standard DNS SAN for
services ( in particular in ambient)
…On Wed, Sep 27, 2023 at 10:19 AM Eli Nesterov ***@***.***> wrote:
Also is it possible to configure envoy through istio to use different
formats of spiffe id in different contexts (e.g. internal inside cluster
and external when talking to services outside of the mesh e.g. via service
records + destination rules?)
|
For more context on ambient - if we are moving towards L7 'gateways'
enforcing the policies - and possibly
interop with other 'classes' of gateway ( not only istio ): most of the
mTLS will be between a client
and a gateway which might not be an Istio ( i.e. spiffe) gateway.
Ztunnel is per node - and currently has some special requirements for
certs, and I don't think we have
any design on 'federation', in particular as we hope ztunnel to converge
with CNI and maybe have
improved netpol.
As I mentioned - 'flat network' and a lot of trust in the control planes is
generally required for normal
workload to workload mTLS - for everything else mTLS will terminate in
gateway and most likely
will use JWTs for multi-hop proof. And for JWTs I know SPIFFE has their
'standard' - but so does OIDC
and that is already broadly supported.
In this context - not the best time to mess with 'istio sidecar' identity
model or add one more migration
problem for ambient, when the use case is exceedingly uncommon.
|
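@costinm I think this flexibility only works when you use secrets in the destination rule.
When it comes to using SDS I cannot find a way to do that. I think using different contexts would solve that, e.g. using builtin://external would tell SDS to use context foo so that Envoy can get the client cert/key and root for that external out-of-cluster mTLS (this is where a different SPIFFE ID and any other settings can be used). I think that wouldn't mess with the internal Istio SPIFFE ID format. |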
The DestinationRule approach is per-workload. That means if you have 1000 workloads you need to create 1000 DestinationRule overrides to "fix" the Istio-hardcoded SAN format. That's a kludge, not an API. The hardcoded SAN format is:
- Istio specific
- Not sufficient for all use cases
- Enforced in one spot in the Istio envoy config
It's just not particularly robust as-implemented - the inflexibility is a side effect of the fragility.
While changing the default format would be invasive (though certainly not the most invasive change Istio has ever presented users with), there are clearly several other opt-in options that wouldn't harm the default functionality that are worth pursuing here.
the tl;dr is that if
It simply should not be Istio's business how a CA ties a workload to a cert. Currently, it is, and that creates a lot of problems - not just "I have a niche use case" problems, but also general fragility and scoping problems from the auth APIs on down. |
The Istio SPIFFE format IS Istio specific. There is no standard URL format.
If you have 1000 workloads not using Istio SPIFFE - maybe you should not
use Istio, or modify whatever is generating the certs
to use Istio format. Why would we take the risks and do the work
maintaining a separate format ( one ? 10 ? how many
ways do you want to parse a URL - and make sure this is reflected in all
code that deals with certs). Are we going to have
regex to extract arbitrary URLs ?
And the format is the least of your problems - how do you discover what
identity to use for each of the 1000 workloads ?
Secure naming generates istio style URLs.
You do not have to use our CA - but the URL format we use to represent
identities. Any CA would work. And you
can also just use DNS SANs for the services - to avoid the much larger
problem of configuring which SANs are allowed
for each service.
I am not a big fan of using an arbitrary URL and requiring an external
mapping service from a service name to a
VIP to a list of URLs representing any KSA. But that's what we have used,
and I have not seen any reason why
a different URL format would be better ( or we need to support 7 URL
formats )
…On Thu, Sep 28, 2023 at 9:25 AM Ben Leggett ***@***.***> wrote:
The DestinationRule approach is per-workload. That means if you have 1000
workloads you need to create 1000 DestinationRule overrides to "fix" the
Istio-hardcoded SAN format. That's a kludge, not an API.
- Istio specific
- Not sufficient for all use cases
- Enforced in one spot in the Istio envoy config
Other parts of the Istio code make assumptions about the format of the
SPIFFE ID, based on that config, it's not particularly robust
as-implemented.
While changing the default format would be invasive, there's clearly
plenty of other opt-in options that wouldn't harm the default functionality.
If istiod currently is the only thing that can act as a CA for workloads,
that's tech debt - not essential protected functionality. "You must use our
CA because our CA is special" is a bug. Nothing about *Istio's*
functionality needs to depend on you selecting a specific CA.
|
On Thu, Sep 28, 2023 at 8:56 AM Eli Nesterov ***@***.***> wrote:
"Identity Federation" usually means communicating with a peer that has a
different identity provider/roots.
It has 2 sides - client verifying the server and server authorizing a
client.
DestinationRule allows you to specify which root CAs to trust and any SAN
you want - URL or DNS are both fine.
@costinm <https://github.com/costinm> I think this flexibility only works
when you use secrets in the destination rule.
Something like:
spec:
  host: mydbserver.prod.svc.cluster.local
  trafficPolicy:
    tls:
      mode: MUTUAL
      clientCertificate: /etc/certs/myclientcert.pem
      privateKey: /etc/certs/client_private_key.pem
      caCertificates: /etc/certs/rootcacerts.pem
When it comes to using SDS I cannot find a way to do that. I think using
different contexts would solve that e.g. using builtin://external would
tell sds to use context foo so the envoy can get client cert\key and root
for that external out-of-cluster mTLS (this is where different spiffe id
and any other setting can be used). I think that wouldn't mess with
internal Istio spiffe id format.
I'm all for improving the UX for DestinationRule loading certificates. It's
a pretty stable API/feature, we went through this with gateway so
if anything is broken or hard to use - I would rather fix it than try to
hack the URL format in the spiffe cert and any associated
code.
This will also help with many other use cases and improve an existing API.
BTW - in your example you would still need a list of SPIFFE URLs to
validate as proof for the host (unless the host is using
a DNS SAN). This thread is about custom SPIFFE URLs - and as I mentioned
in a previous comment the list
of URLs may change and it is not so easy to discover if the workloads are
in other clusters ( which may not
be watched by Istiod ).
… |
There is nothing Istio specific about the format. There is a hardcoded default in Istio, which is not exposed in any API, but which several of our APIs have to implicitly assume and be aware of. There are only 2 questions here: Istio currently DOES hardcode the SPIFFE ID format - prefix AND postfix, which some code (and some APIs) opaquely assume.
The answer to 2 is clearly "no, it does not". The answer to 1 is likely "no, it does not, but it's a little hard to fix". There's not much more to argue about here from a design perspective.
A SPIFFE ID is a pointer. It is not an identity. All Istio really, genuinely needs to do is hand that pointer to a workload CA and get an x509 cert back. The degree to which Istio cares about the contents of the pointer, versus the thing it points to, is the degree to which Istio is making fragile assumptions. We should minimize, not maximize, the number of fragile assumptions.
This is exactly analogous to DNS - assumptions should be made about the thing the pointer points to, and not the format of the pointer itself. k8s needs to force a DNS name postfix format, and does, but it hardcodes/forces a minimal opinion - just the postfix is non-negotiable, it has no opinion about any other part of the DNS name. There's no reason why we can't, or shouldn't, follow that for SPIFFE ID formats.
Again - Istio's value is not in being a workload CA. Nor is Istio's value in being a remarkably inflexible workload CA. It's 5% of the functionality and there are many implementations. It really shouldn't matter what workload CA Istio uses at all. Istio binds policy to workload identities. How those workload identities are generated, or mapped to workloads, is an implementation detail of the workload CA. Istio provides a basic workload CA implementation, much like it provides a basic waypoint implementation. Istio currently overindexes on its own basic workload CA implementation as a de-facto standard, when that is not strictly required. Istio's invariants around the workload CA are
That's it. Nothing there requires the use of a specific workload CA. It requires a (very basic, probably x509) contract between Istio and the workload CA.
Again - see DNS and the previous examples. How many "DNS formats" does Istio (or K8S) support? What parts of the DNS name MUST FOLLOW a fixed format for Istio to function? It's not a matter of "what Istio must support" - it's a matter of Istio making minimal, versus maximal, assumptions for the constraints it places around an external system (DNS, SPIFFE, x509, etc).
No. Secure naming, from Istio's current perspective, generates URLs that contain at least a trust domain, a service account name, and a namespace. We have some code that confuses at least with at most, for no particularly defensible reason. And again - the pointer is not "secure", it just needs to be "sufficiently" unambiguous, and tightly bound to the cryptographic identity the workload CA resolves it to. The latter is where the "secure" comes from.
The current URL format is an over-fragile hack, with no API. We're right back at what I said previously:
Why on earth would we break a very well-scoped public API (arguably one of the few well-scoped APIs we have) in order to work around an unnecessarily hard-coded internal default for a workload's SPIFFE ID format? DestinationRules are a symptom, they aren't the problem here. If I have a system where the requirement is that "DNS names MUST HAVE AT LEAST a postfix that matches this pattern" and I write code that creates an implicit requirement that "DNS names CAN ONLY match this pattern" - I've made a mistake, not created a defensible internal standard. |
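To make the "at least" versus "at most" distinction concrete, here is a minimal sketch of a parser that requires a trust domain, a namespace, and a service account but tolerates any additional segments. It assumes the path is a sequence of key/value pairs (as in `/ns/<namespace>/sa/<serviceaccount>/...`); that pairing rule and the function name `extract_identity` are assumptions made for illustration, not something Istio implements.

```python
from urllib.parse import urlsplit


def extract_identity(spiffe_id: str) -> dict[str, str]:
    """Pull out the fields policy needs; ignore everything else in the path."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")

    # Assumption: the path is key/value pairs, e.g. ns/<ns>/sa/<sa>/wl/<name>.
    segments = parts.path.strip("/").split("/")
    if len(segments) % 2 != 0:
        raise ValueError("expected key/value path segments")

    keys, values = segments[0::2], segments[1::2]
    attrs = dict(zip(keys, values))
    if len(attrs) != len(keys):
        # A repeated key would make the extracted identity ambiguous.
        raise ValueError("duplicate key in SPIFFE ID path")

    missing = {"ns", "sa"} - attrs.keys()
    if missing:
        raise ValueError(f"SPIFFE ID is missing required keys: {sorted(missing)}")

    return {
        "trust_domain": parts.netloc,
        "namespace": attrs["ns"],
        "service_account": attrs["sa"],
    }
```

Under these assumptions, `spiffe://cluster.local/ns/prod/sa/db` and `spiffe://cluster.local/ns/prod/sa/db/nodeid/n1/wl/api` resolve to the same namespace and service account, while an ID that omits either field is rejected.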
What exactly assumes the format in Istio? My understanding is that it's fairly small:
Is there anything consuming the principal as a non-opaque string? There's a lot of infrastructure that produces the principals (CA, constraints, etc), but I'm asking about the consumers strictly. |
SAN matching for TLS context is the one that comes to my mind; you may have implied that in your first bullet though |
The client side part is missing. We have code that is pod -> full URI. Need
a way to replicate that
…On Fri, Sep 29, 2023 at 3:17 PM Keith Mattix II ***@***.***> wrote:
SAN matching for TLS context is the one that comes to my mind; you may
have implied that in your first bullet though
|
@keithmattix Where is it parsing SPIFFE ID? I only see exact matching (and some trust domain stuff, which is also precise match). Ambient does something different (point 2), but we can ignore that |
Ah I see what you mean. In that case, I don't think we have a ton of parsing; SPIFFE as a whole just seems like a convenient format to store certain information. |
Yeah, it's an encoding of certain attested workload attributes. John's point is valid - xDS control plane is attesting the server SVIDs on the client by using the service registry and filling out the templates, which requires that the client and the server agree on the template format. So there are two major places:
|
@EItanya any update on the design doc for this? |
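Update on this - one of the things we have settled on I think is that
- Allowing completely arbitrary SPIFFE IDs is problematic and requires quite a bit of rework, breaking existing policies, and potentially several knock-on effects to the trust model - we probably will not do this in the short term.
- Allowing appending arbitrary segments to the suffix/prefix of the Istio SPIFFE ID should be pretty doable (suffix might just be a slight Envoy tweak to support, actually) - we probably will do this in the medium term.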
|
I don't think appending more info is the wrong solution, now that OIDs have
been defined for pod info. Also using an additional DNS SAN with the pod
fqdn is cleaner and provides more interop for server authentication.
Parsing SPIFFE URLs and defining more istio-only formats is a dead end.
…On Fri, Jul 26, 2024, 08:46 Ben Leggett ***@***.***> wrote:
Update on this - one of the things we have settled on I think is that
- Allowing *completely arbitrary* SPIFFEIDs is problematic and
requires quite a bit of rework, breaking existing policies, and several
knock-on effects to the trust model potentially - we probably will not do
this in the short term.
- Allowing *appending* arbitrary segments to the suffix/prefix of the
Istio SPIFFEID should be pretty doable (suffix might just be a slight Envoy
tweak to support actually) - we probably *will* do this in the medium
term.
|
Summary
Detail
Currently, Istio uses a nonstandard variant of the SPIFFE ID spec that mandates a SPIFFE ID format in the URI SAN field of the x509 workload certs:
spiffe://<trust_domain>/ns/<workload_namespace>/sa/<workload_service_account>
This means that workload certs minted by the default Istio SDS are indistinguishable - if I have 5 pods under the same service account, they share the same credentials, even if they may have different containers, run on different nodes, etc etc.
That is because the default Istio SDS is simplistic, does no granular workload identity attestation, and merely passes through trust and workload identity to K8S service accounts, which is Good Enough Most Of The Time.
Now that Istio supports replacing the default SDS provider with alternative SPIFFE-compliant SDS servers, such as SPIRE, this restriction makes less sense - the SDS server does (and should) control the format of the SPIFFE ID, and the granularity of the workload identity - for instance, if I use SPIRE with Istio and want to do workload attestation beyond just the service account level, I can easily do that today, and the SPIFFE ID format is defined with SPIRE, not Istio.
In fact, it is perfectly possible to do this today - I can integrate SPIRE with Istio as per our current docs, and configure SPIRE to mint SPIFFE IDs in a non-Istio-standard format, appending more granularity to the SPIFFE identifier to suit the level of attestation granularity my SPIFFE authority is actually engaging in:
spiffe://<trust_domain>/ns/<workload_namespace>/sa/<workload_service_account>/nodeid/<node_id>/wl/<workload_name>
- for instance.
This works just fine with Istio, with the following exception - SPIFFE SAN validation is a hardcoded Envoy config that requires an exact match on
spiffe://<trust_domain>/ns/<workload_namespace>/sa/<workload_service_account>
- even though other forms of matching for SANs are supported by Envoy, we do not support them or expose them as configurable options.
This can be worked around with a DestinationRule such as the following:
Once you do this, SPIFFE IDs can be constructed with whatever level of granularity you desire, and workload certs can be distinguishable by consumers at the level of attestation that is actually performed by the SDS, rather than the level of attestation that Istio's default SDS performs.
There has been resistance to changing the default SPIFFE ID format due to back compat with existing customer rules that also hardcode SPIFFE IDs in the format that the default Istio SDS emits, and that's reasonable - but given that we support pluggable SPIFFE-compliant SDS implementations, there is no good reason why Istio itself should forbid or otherwise prevent customers who use an alternate SDS from using more granular SPIFFE IDs than the default.
Especially since this works fine today with a simple DestinationRule tweak, indicating that the problem is a simple set of currently-unconfigurable defaults, and not a systemic obstacle.
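The difference between the exact SAN match and a match that tolerates appended segments can be sketched as follows. This is plain string comparison for illustration only, not Envoy's matcher configuration; the function names are hypothetical.

```python
def istio_identity(trust_domain: str, namespace: str, service_account: str) -> str:
    """Build the SPIFFE ID in the default Istio layout."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def san_matches_exact(san: str, expected: str) -> bool:
    """Exact match: any appended segment makes validation fail."""
    return san == expected


def san_matches_prefix(san: str, expected: str) -> bool:
    """Prefix match on a path boundary: appended segments are allowed.

    Requiring the "/" boundary keeps .../sa/db from matching .../sa/db-admin.
    """
    return san == expected or san.startswith(expected + "/")
```

With an exact match, a cert carrying `.../sa/db/nodeid/n1/wl/api` is rejected even though it attests the same namespace and service account; with the boundary-aware prefix match it is accepted, while a different service account such as `db-admin` is still rejected.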
Frankly, outside of maybe requiring that a SPIFFE ID have at a minimum several expected parsable fields in it so Istio itself can extract the information it needs from SPIFFE IDs (/ns/<namespace> and /sa/<serviceaccount>/), it isn't really Istio's business what the SPIFFE ID format is - the SPIFFE ID format is and should be owned by the SPIFFE-compliant SDS instance, and we support more than one SPIFFE-compliant SDS instance. We just make bad assumptions elsewhere in the code that force those compliant instances to hew exclusively to the SPIFFE ID format our default SDS emits, which is an unnecessary restriction.
Affected product area (please put an X in all that apply)
[x] Ambient
[x] Docs
[x] Installation
[ ] Networking
[ ] Performance and Scalability
[ ] Extensions and Telemetry
[x] Security
[ ] Test and Release
[x] User Experience
[ ] Developer Infrastructure
Affected features (please put an X in all that apply)
[ ] Multi Cluster
[ ] Virtual Machine
[ ] Multi Control Plane
Additional context