Customizing SPIFFE ID format if using an external SPIFFE-compliant SDS should be supported #43105

bleggett · 2023-02-02T16:57:45Z

Summary

When using the default Istio SDS, the current SPIFFE ID format should be the default
If using an alternative SPIFFE-compliant SDS, using an alternative SPIFFE ID format should be allowed without having to resort to DestRule hacks
Other parts of Istio should actively avoid introducing unnecessary implicit assumptions about the SPIFFE ID format and what is minting them to avoid creating more de-facto restrictions around their use, format, and the level of attestation a given user-supplied SPIFFE-compliant SDS server is actually configured for.

Detail

Currently, Istio uses a nonstandard variant of the SPIFFE ID spec, that mandates a SPIFFE ID format in the URI SAN field of the x509 workload certs:

spiffe://<trust_domain>/ns/<workload_namespace>/sa/<workload_service_account>

This means that workload certs minted by the default Istio SDS are indistinguishable - if I have 5 pods under the same service account, they share the same credentials, even if they may have different containers, run on different nodes, etc etc.

That is because the default Istio SDS is simplistic, does no granular workload identity attestation, and merely passes through trust and workload identity to K8S service accounts, which is Good Enough Most Of The Time.
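To make the shared-identity point concrete, here is a minimal Python sketch. It assumes the standard `spiffe://` scheme and the ns/sa layout shown above; the helper name `istio_spiffe_id` and the pod names are hypothetical and not part of Istio.

```python
def istio_spiffe_id(trust_domain: str, namespace: str, service_account: str) -> str:
    """Build an ID in the ns/sa layout described above (hypothetical helper)."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


# Five pods under one service account. The pod name is not an input to the
# ID, so every pod ends up with the same identity string.
pods = [f"reviews-{i}" for i in range(5)]
ids = {pod: istio_spiffe_id("cluster.local", "default", "reviews") for pod in pods}
distinct_ids = set(ids.values())
```

Nothing in the ID distinguishes one pod from another, which is the indistinguishability described above.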

Now that Istio supports replacing the default SDS provider with alternative SPIFFE-compliant SDS servers, such as SPIRE, this restriction makes less sense - the SDS server does (and should) control the format of the SPIFFE ID, and the granularity of the workload identity - for instance, if I use SPIRE with Istio and want to do workload attestation beyond just the service account level, I can easily do that today, and the SPIFFE ID format is defined with SPIRE, not Istio.

In fact, it is perfectly possible to do this today - I can integrate SPIRE with Istio as per our current docs, and configure SPIRE to mint SPIFFE IDs in a non-Istio-standard format, appending more granularity to the SPIFFE identifier to suit the level of attestation granularity my SPIFFE authority is actually engaging in:

spiffe://<trust_domain>/ns/<workload_namespace>/sa/<workload_service_account>/nodeid/<node_id>/wl/<workload_name> - for instance

This works just fine with Istio, with the following exception - SPIFFE SAN validation is a hardcoded Envoy config that requires an exact match on spiffe://<trust_domain>/ns/<workload_namespace>/sa/<workload_service_account> - even though other forms of matching for SANs are supported by Envoy, we do not support them or expose them as configurable options.
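The difference between exact and prefix SAN matching can be sketched in a few lines of Python. This is an illustration only, not Envoy's or Istio's implementation; the function names are hypothetical, and the path-segment boundary check in `prefix_match` is a choice made for this sketch.

```python
def exact_match(expected: str, presented: str) -> bool:
    """Accept only a SAN identical to the expected ID."""
    return presented == expected


def prefix_match(expected_prefix: str, presented: str) -> bool:
    """Accept the expected ID itself, or the expected ID plus extra path segments.

    Requiring a "/" after the prefix keeps ".../sa/reviews" from matching
    ".../sa/reviews-admin".
    """
    return presented == expected_prefix or presented.startswith(expected_prefix + "/")


standard = "spiffe://example.org/ns/default/sa/reviews"
extended = standard + "/nodeid/node-1/wl/reviews-v1"
```

With these definitions the extended ID fails `exact_match` against the standard form but passes `prefix_match`, which is the behaviour the DestinationRule workaround approximates by listing the full extended ID explicitly.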

This can be worked around with a DestinationRule such as the following:

# TODO destination rules need to be created for any SPIFFE IDs that don't follow the
# format that Istio expects (ns/NAMESPACE/sa/TARGET_POD_SVC_ACCOUNT)
# because ATM Istio defaults to clientside SAN checks that assume that SPIFFE ID format
# and this is not currently configurable
#
# Additionally, since DestinationRules override Istio's default auto mTLS settings, we need `mode: ISTIO_MUTUAL`
# in each DestRule to tell Istio that even though we have a custom destination config, we still want mTLS.
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: {{ .serviceName }}-custom-spire-destrule
spec:
  host: {{ .serviceName }}
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
      subjectAltNames:
      - spiffe://example.org/ns/{{ $.Release.Namespace }}/sa/{{ .serviceName }}/wl/foo

Once you do this, SPIFFE IDs can be constructed with whatever level of granularity you desire, and workload certs can be distinguishable by consumers at the level of attestation that is actually performed by the SDS, rather than the level of attestation that Istio's default SDS performs.
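Because one such rule is needed per service, it may be easier to generate them. The following is a rough Python sketch that builds the same DestinationRule as a plain dictionary and prints it as JSON (which `kubectl apply -f` also accepts). The function name and the example service and IDs are hypothetical; the field names are taken from the YAML above.

```python
import json


def custom_san_destination_rule(service_name: str, spiffe_ids: list[str]) -> dict:
    """Return a DestinationRule that pins the accepted SANs for one service."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "DestinationRule",
        "metadata": {"name": f"{service_name}-custom-spire-destrule"},
        "spec": {
            "host": service_name,
            "trafficPolicy": {
                "tls": {
                    # A DestinationRule overrides auto mTLS, so mTLS is requested explicitly.
                    "mode": "ISTIO_MUTUAL",
                    "subjectAltNames": list(spiffe_ids),
                }
            },
        },
    }


rule = custom_san_destination_rule(
    "reviews", ["spiffe://example.org/ns/default/sa/reviews/wl/foo"]
)
print(json.dumps(rule, indent=2))
```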

There has been resistance to changing the default SPIFFE ID format due to back compat with existing customer rules that also hardcode SPIFFE IDs in the format that the default Istio SDS emits, and that's reasonable - but given that we support pluggable SPIFFE-compliant SDS implementations, there is no good reason why Istio itself should forbid or otherwise prevent customers who use an alternate SDS from using more granular SPIFFE IDs than the default.

Especially since this works fine today with a simple DestinationRule tweak, indicating that the problem is a simple set of currently-unconfigurable defaults, and not a systemic obstacle.

Frankly, outside of maybe requiring that a SPIFFE ID have at a minimum several expected parsable fields in it so Istio itself can extract the information it needs from SPIFFE IDs (/ns/<namespace> and /sa/<serviceaccount>), it isn't really Istio's business what the SPIFFE ID format is - the SPIFFE ID format is and should be owned by the SPIFFE-compliant SDS instance, and we support more than one SPIFFE-compliant SDS instance. We just make bad assumptions elsewhere in the code that force those compliant instances to hew exclusively to the SPIFFE ID format our default SDS emits, which is an unnecessary restriction.

Affected product area (please put an X in all that apply)

[x] Ambient
[x] Docs
[x] Installation
[ ] Networking
[ ] Performance and Scalability
[ ] Extensions and Telemetry
[x] Security
[ ] Test and Release
[x] User Experience
[ ] Developer Infrastructure

Affected features (please put an X in all that apply)

[ ] Multi Cluster
[ ] Virtual Machine
[ ] Multi Control Plane

Additional context


bleggett · 2023-02-02T16:58:46Z

Some context:

#28712

#42114

kyessenov · 2023-02-02T17:31:52Z

Our authz API requires extracting the source principal from SPIFFE ID, for example, to restrict access by the source namespace. How would you support this API with custom SPIFFE IDs?

Similarly, our telemetry is very much "workload" oriented, meaning we drop the qualified pod name as soon as possible and only report the deployment name, in order to reduce the metric cardinality.

In general, using per-pod identity might give a false sense of security. Kubernetes itself doesn't distinguish between pods of the same KSA from authorization perspective (in RBAC, etc), and even then, tenancy by SA is very weak, and namespaces are much closer to an isolation unit.

bleggett · 2023-02-02T17:50:50Z

Our authz API requires extracting the source principal from SPIFFE ID, for example, to restrict access by the source namespace. How would you support this API with custom SPIFFE IDs?

Couple of ways I see

Allowing the AuthZ API to use SPIFFE ID URI matching to allow users to customize for themselves what constitutes "source principal" as defined by whatever SDS/workload CA they are using, if they want.
Or simply saying "whatever SPIFFE ID your workload CA mints, it MUST have /sa/<service_account> and /ns/<namespace> fields in it somewhere so K8S-service account authZ rules can work, but your workload CA/SDS server is free to add as many other fields to the SPIFFE ID as you need"
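The second option could look roughly like the following Python sketch. It assumes the ID path is a sequence of alternating key/value segments, which is a convention assumed for this sketch rather than something the SPIFFE spec mandates; the function names are hypothetical and this is not how Istio parses principals today.

```python
from urllib.parse import urlsplit


def named_segments(spiffe_id: str) -> tuple[str, dict]:
    """Split a SPIFFE ID into its trust domain and a dict of key/value path segments."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) % 2 != 0:
        raise ValueError(f"path is not a sequence of key/value pairs: {parts.path!r}")
    return parts.netloc, dict(zip(segments[0::2], segments[1::2]))


def source_principal(spiffe_id: str) -> tuple[str, str, str]:
    """Return (trust_domain, namespace, service_account), ignoring any other fields."""
    trust_domain, fields = named_segments(spiffe_id)
    missing = [key for key in ("ns", "sa") if key not in fields]
    if missing:
        raise ValueError(f"SPIFFE ID is missing required fields: {missing}")
    return trust_domain, fields["ns"], fields["sa"]
```

Under these assumptions, `source_principal("spiffe://example.org/ns/default/sa/reviews/nodeid/n1/wl/foo")` yields the same namespace and service account as the unextended ID would, and the extra `nodeid` and `wl` fields are simply ignored.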

Similarly, our telemetry is very much "workload" oriented, meaning we drop the qualified pod name as soon as possible and only report the deployment name, in order to reduce the metric cardinality.

Do our telemetry APIs rely on parsing the ISTIO-SPIFFE ID format today? If they do, the above solutions would work. If they do not, they shouldn't be affected.

In general, using per-pod identity might give a false sense of security. Kubernetes itself doesn't distinguish between pods of the same KSA from authorization perspective (in RBAC, etc), and even then, tenancy by SA is very weak, and namespaces are much closer to an isolation unit.

Depends - my point is that attesting pod identity is the province and sole responsibility of the SDS server/workload CA you happen to be using - and the granularity to which identity is attested also belongs to that. Even today - the default istiod SDS/workload CA owns those guarantees, and is the trust root of all of them.

If we support pluggable SDS servers, and we do, we should consider respecting whatever the workload CA attests (or at least just respect the parts we care about and ignore the rest), rather than creating downstream de-facto assumptions that constrain what the workload CA can attest and encode in the cert, which is frankly backwards.

If you use the default istiod SDS workload CA, all we attest is Kubernetes service account, as attested by the Kubernetes API - but we are necessarily trusting SDS server to attest those things. If you swap that out for another SPIFFE-compliant SDS server, like say SPIRE, it can attest a superset of that - we don't have to care about the superset, but we shouldn't prevent the superset from being represented, which is what we do today.

kyessenov · 2023-02-02T18:36:34Z

Allowing the AuthZ API to use SPIFFE ID URI matching to allow users to customize for themselves what constitutes "source principal" as defined by whatever SDS/workload CA they are using, if they want.

Yes, but that's an API change. We're very wary of making any semantic changes to the existing APIs since any change can potentially break users.

Or simply saying "whatever SPIFFE ID your workload CA mints, it MUST have /sa/<service_account> and /ns/<namespace> fields in it somewhere so K8S-service account authZ rules can work, but your workload CA/SDS server is free to add as many other fields to the SPIFFE ID as you need"

That could work, but our implementation does strict regex matching I think. We'd need to make sure pattern matching is backwards compatible.
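One way to reason about backwards compatibility is to compare a strict pattern with a relaxed one that only adds an optional tail. The patterns below are illustrative assumptions, not the expressions Istio actually uses.

```python
import re

# Hypothetical strict pattern: exactly trust domain, namespace, service account.
STRICT = re.compile(r"spiffe://(?P<td>[^/]+)/ns/(?P<ns>[^/]+)/sa/(?P<sa>[^/]+)")

# Hypothetical relaxed pattern: the same three fields, plus any number of
# additional path segments captured in "extra".
RELAXED = re.compile(
    r"spiffe://(?P<td>[^/]+)/ns/(?P<ns>[^/]+)/sa/(?P<sa>[^/]+)(?P<extra>(?:/[^/]+)*)"
)

standard = "spiffe://example.org/ns/default/sa/reviews"
extended = standard + "/nodeid/node-1/wl/foo"

strict_std = STRICT.fullmatch(standard)
relaxed_std = RELAXED.fullmatch(standard)
relaxed_ext = RELAXED.fullmatch(extended)
```

Every ID the strict pattern accepts is also accepted by the relaxed one with identical `td`, `ns`, and `sa` groups, so existing rules keyed on those fields would see no change; only IDs the strict pattern rejects today gain a non-empty `extra` group.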

Do our telemetry APIs rely on parsing the ISTIO-SPIFFE ID format today? If they do, the above solutions would work. If they do not, they shouldn't be affected.

It matters because we report principals literally as primary metric tags. Having a pod name as a principal will overwhelm the metric systems (none of them scale well to POD^2 cardinality).
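A quick back-of-the-envelope calculation shows the scale of the concern. The cluster size below is hypothetical and chosen only to illustrate the quadratic growth.

```python
def pair_cardinality(num_principals: int) -> int:
    """Upper bound on distinct (source principal, destination principal) label pairs."""
    return num_principals * num_principals


service_accounts = 50            # hypothetical cluster
pods_per_service_account = 20    # hypothetical replica count

per_service_account = pair_cardinality(service_accounts)
per_pod = pair_cardinality(service_accounts * pods_per_service_account)
```

In this example the bound grows from 2,500 pairs with per-service-account principals to 1,000,000 pairs with per-pod principals, a factor of `pods_per_service_account ** 2`.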

If you use the default istiod SDS workload CA, all we attest is Kubernetes service account, as attested by the Kubernetes API - but we are necessarily trusting SDS server to attest those things. If you swap that out for another SPIFFE-compliant SDS server, like say SPIRE, it can attest a superset of that - we don't have to care about the superset, but we shouldn't prevent the superset from being represented, which is what we do today.

If SPIFFE certs are only used by Istio, then it's better to propose to SPIRE to generate Istio-compatible identities, because Istio in general simply doesn't make use of pod names in the APIs. The only issue is inter-op with another system that shares the identities, and for that, we'd need more details on what the other system is.

bleggett · 2023-02-02T19:27:48Z

Allowing the AuthZ API to use SPIFFE ID URI matching to allow users to customize for themselves what constitutes "source principal" as defined by whatever SDS/workload CA they are using, if they want.

Yes, but that's an API change. We're very wary of making any semantic changes to the existing APIs since any change can potentially break users.

Or simply saying "whatever SPIFFE ID your workload CA mints, it MUST have /sa/<service_account> and /ns/<namespace> fields in it somewhere so K8S-service account authZ rules can work, but your workload CA/SDS server is free to add as many other fields to the SPIFFE ID as you need"

That could work, but our implementation does strict regex matching I think. We'd need to make sure pattern matching is backwards compatible.

Yep. Also, this would only matter to people using a nonstandard SDS. Anyone continuing to use the default istiod SDS shouldn't be affected - if you make an explicit choice to use a different SDS than the one we ship, we should support that and document it, but it (and the config required to support extended SPIFFE IDs) doesn't need to be the default.

Do our telemetry APIs rely on parsing the ISTIO-SPIFFE ID format today? If they do, the above solutions would work. If they do not, they shouldn't be affected.

It matters because we report principals literally as primary metric tags. Having a pod name as a principal will overwhelm the metric systems (none of them scale well to POD^2 cardinality).

That is useful info and something to consider.

If you use the default istiod SDS workload CA, all we attest is Kubernetes service account, as attested by the Kubernetes API - but we are necessarily trusting SDS server to attest those things. If you swap that out for another SPIFFE-compliant SDS server, like say SPIRE, it can attest a superset of that - we don't have to care about the superset, but we shouldn't prevent the superset from being represented, which is what we do today.

If SPIFFE certs are only used by Istio, then it's better to propose to SPIRE to generate Istio-compatible identities, because Istio in general simply doesn't make use of pod names in the APIs. The only issue is inter-op with another system that shares the identities, and for that, we'd need more details on what the other system is.

The only issue is inter-op with another system that shares the identities, and for that, we'd need more details on what the other system is.

Or, we could not care about what other identity properties other systems want or need and simply act as a minimal intermediary between whatever workload CA you want to use that is compatible with us, and whatever external requirements you have - when it comes to SPIFFE, we are the ones that aren't compliant with (or allow fully-compliant implementations of) the published standard, and IMO that's on us, not SPIRE (or any other SDS).

We may require a subset of the spec for our own purposes, but we should not disallow (or refuse to pass thru) a superset of our requirements that are fully within the spec we support just due to some naive validation rules on our part - that's what we do today, and IMO that's a bug and demonstrably not strictly necessary if you are already using an alternate SDS/workload CA due to nonstandard requirements.

We should expect the things we need in the cert to be in the cert - if there are more things that external entities might want, that should be negotiated between the custom workload CA you are using to mint Istio workload certs, and your external entities - we shouldn't get in the middle of that and block it, or try to support all permutations of that ourselves.

hzxuzhonghu · 2023-02-03T09:14:35Z

For what case do you need more granular workload identity? For stateless applications, k8s designed the deployment as a logical concept for a group of instances, and each one has the same permissions. Why does Istio need to separate them for auth?

bleggett · 2023-02-03T15:01:09Z

For what case do you need more granular workload identity? For stateless applications, k8s designed the deployment as a logical concept for a group of instances, and each one has the same permissions. Why does Istio need to separate them for auth?

It doesn't - but there are external systems or integrations that will handle workload certs that might want or need that (see #42114 and @costinm's use cases), and Istio should not prevent you from putting more granular workload identity in the certs than what Istio itself needs. Especially since we support alternative workload CAs that provide more granularity - we just make it unnecessarily difficult to use that additional granularity.

Today Istio does prevent you from doing that, practically speaking, even if you use a non-default workload CA that supports this.

Additionally, we support replacing the istiod-default workload CA with alternate workload CAs which can attest a much more granular identity than istiod can - which is good - but rather than passing thru additional granularity that the customizable workload CA might put in workload certs, we put an upper bound on it

Istio itself doesn't need more granular identity attestation than the identity attestation that istiod's default workload CA supports.
Istio supports alternative workload CAs
Istio currently prevents you from using the more granular identity attestation of those external workload CAs because it insists on a de-facto standard, rather than simply insisting that the fields it needs are present, and ignoring+passing thru more granular specifiers.

dafang982 · 2023-04-06T15:02:02Z

We may require a subset of the spec for our own purposes, but we should not disallow (or refuse to pass thru) a superset of our requirements that are fully within the spec we support just due to some naive validation rules on our part - that's what we do today, and IMO that's a bug and demonstrably not strictly necessary if you are already using an alternate SDS/workload CA due to nonstandard requirements.

@bleggett, you have hit the nail on the head. I'm working on a project now and would like to use Istio, but this Istio limitation stops me from choosing it, as we have our own SPIFFE CA that creates identities that don't follow the Istio-required pattern, even though it is 100% SPIFFE compliant!

- modify cluster spiffe ids to use custom format - modify federation trust relationships to use new ids - add templated destination rule to workloads 1 and 2 with a DestinationRule as suggested in istio/istio#43105

costinm · 2023-04-28T18:24:34Z

I am quite in favor of having more flexibility in how we check identities and apply authz - but I am not sure Spiffe URL and having the 'workload name' as part of the URL is the right solution.

It will be critical for interop with other mesh implementations - that may not use SPIFFE but DNS or other identities, and it will also allow passing secured info about node, cluster, etc which are missing.

One proposal that I think would solve this nicely (and much more) is to add a second SAN with the fully qualified pod hostname - and/or add custom extensions for the extra info, and extend the API to allow the use of such extensions.

Having a URL with hard to predict format and regex or other ugly ways to guess what variant of SPIFFE was used is quite dangerous and complicated.

bleggett · 2023-04-28T18:33:48Z

Having a URL with hard to predict format and regex or other ugly ways to guess what variant of SPIFFE was used is quite dangerous and complicated.

You'd have to do practically the same thing in the same way with a fully-qualified DNS name in most cases if you wanted to extract parts of the hostname identity as "descriptive metadata" - e.g. parsing segments out of pod-ip-address.my-namespace.pod.cluster-domain.example doesn't strike me as inherently more efficient than doing the same with spiffeid://cd/<cluster-domain>/ns/<namespace>/<...>/pod/<podid>, so I'm not sure it's realistically better to use DNS - tho it is more in line with what K8S has (for now) chosen to standardize on, I'll grant you that.

In general the problem here is that Istio is overly prescriptive of the SPIFFE format - it doesn't just expect certain fields, it precludes any other fields or additional specifiers from being used and imposes an ordering on the fields which are present, which is an unnecessary/overly-opinionated fragility that makes it impossible to consume SPIFFE IDs generated outside of Istio, among other things.

Istio does not need to predict the SPIFFE format at all - it can simply expect certain named segments to exist in whatever SPIFFE URI it gets, as per the SPIFFE spec, and complain if they are not there.

The fact that it mandates a complete SPIFFE format today is an Istio bug, really - if we need to integrate with things like e.g. Cilium I expect we will have to fix this and become more flexible WRT the SPIFFE formats we handle anyway in a way similar to what I'm describing above.

Instead of this, we could add a SAN, or extra non-compliant x509 cert fields - the thing that bothers me is that we don't really need to, if we fix the above.

At the end of the day though, I am interested in a standardized, not-just-Istio-parsable form of globally-unique workload identity, however we can get there. And I am very interested in Istio not coming up with its own mechanism for this. SPIFFE is designed to address exactly this problem, we already use it in spots, other projects also use it, and it's a well-defined CNCF spec - so IMO a clear and compelling argument needs to exist for why we shouldn't use it, if we don't want to.

The best argument against it so far is "because K8S chose not to follow it as a standard" - which is fine, but since we span clusters and might have mesh interop concerns, we might have needs that extend beyond current K8S requirements.

bleggett · 2023-04-28T18:58:51Z

The other nice thing about SPIFFE is that identity can be described in a way that is not inherently rooted in DNS server trust, which is nice because not everyone can rely on a fully end-to-end attestably-secure DNS stack in all scenarios - Google and other cloud providers naturally do not have this problem within their own clouds.

costinm · 2023-04-29T01:13:15Z

I don't see why a name in the hostname syntax is 'rooted' in DNS - it's also an opaque identifier like an email address or URL. You don't need to do any DNS lookup in any of the verifications you do.

Connections to www.google.com do not rely on a secure DNS - but on the server having a signed certificate for the name. In most cases for internal traffic - the DNS SAN will be the same with the service - example.namespace.svc.cluster.local - and while a DNS lookup is done by the client, the verification is based on the signed certificate, so even if DNS is insecure the communication is secure.

That is not actually the case with Istio and SPIFFE today - if DNS is hacked, the VIP of a different service can be returned and the entire security and checks are messed up. This is well known and why Istio REQUIRES a secure DNS to be secure.

There is nothing special about expressing something as URL instead of hostname or email, from a security perspective. The advantage of DNS over URL, when client authenticates the server, is that it is independent of a discovery server mapping DNS to VIP to URLs, and fully interoperable and well known mechanism.

costinm · 2023-04-29T01:17:44Z

I think the root problem is that SPIFFE is over-selling the use of a URL (that in most cases is NOT a workload identity) to magically make things secure, and ignoring the complexities and insecure side-channels it introduces.

Even if they had a well defined schema - like the distinguished name or JWT claims - it would still be tied to a discovery system to map what users want - access example.namespace.svc - to the URLs representing identity in whatever control plane is used.

Complexity and mappings are not good for security. SPIFFE to represent a client identity - no problem, it's as good as any other opaque identifier. But if you want to extract metadata - a schema like JWT would still be better than opaque URL.

bleggett · 2023-05-01T15:06:01Z

I don't see why a name in the hostname syntax is 'rooted' in DNS - it's also an opaque identifier like an email address or URL. You don't need to do any DNS lookup in any of the verifications you do.

If you want to do any sort of attesting that a specific identifier belongs to a specific service and all you are using is DNS, you have to trust the DNS server to make that attestation, and all it can reasonably express is a name <-> IP mapping.

I think we're saying the same thing here.

SPIRE identifiers are also fully opaque, but unlike DNS, SPIRE offers many forms of workload attestation that go far beyond trusting the DNS records a given server possesses. All DNS can do is attest the validity of a name <-> IP map entry. That's a weak form of workload identity, and is not multifactor.

That is not actually the case with Istio and SPIFFE today - if DNS is hacked, the VIP of a different service can be returned and the entire security and checks are messed up. This is well known and why Istio REQUIRES a secure DNS to be secure.

This is a constraint of Istio (and kubernetes) yes. It has very little to do with SPIRE - I think the point you are making is that it will always be a constraint of Istio and Kubernetes, whether we used DNS or SPIRE to identify workloads, which I would certainly agree with - workload identity is one part of the puzzle.

There is nothing special about expressing something as URL instead of hostname or email, from a security perspective. The advantage of DNS over URL, when client authenticates the server, is that it is independent of a discovery server mapping DNS to VIP to URLs, and fully interoperable and well known mechanism.

Correct - the difference is in what attestations you can practically cryptographically attest against that identifier - DNS is not designed to attest anything besides a name <-> IP mapping, which by itself is not sufficient for attesting workload identity.

bleggett · 2023-05-01T15:18:47Z

I think the root problem is that SPIFFE is over-selling the use of a URL (that in most cases is NOT a workload identity) to magically make things secure, and ignoring the complexities and insecure side-channels it introduces.

Even if they had a well defined schema - like the distinguished name or JWT claims - it would still be tied to a discovery system to map what users want - access example.namespace.svc - to the URLs representing identity in whatever control plane is used.

Sure - parsing the specific fields of the identifier is out of scope of what SPIFFE (and SPIRE) offers. Same with the DNS naming you're suggesting - DNS has no such constraint or standard but conventions can be overlaid on it.

The difference is that DNS identifiers are only designed to attest a single identity factor historically (and until recently didn't even offer any real security guarantees about that attestation), and SPIFFE is expressly more general than that - cryptographically binding attestations of multiple factors of workload identity (as SPIRE does to SPIFFE IDs) to DNS records is simply not something you can or will ever be able to do within the DNS standard.

It's a fundamentally unsound basis for workload identity - unless you invent several layers of de-facto standards that live outside normative DNS implementations, at which point you arrive at something that looks exactly like what SPIFFE/SPIRE already is, but with DNS names instead of SPIRE IDs. Which seems like a rather extreme and unproductive form of NIH.

And at that point what you have done is invest a lot of work to avoid using an existing standard, so you can craft another, even more de-facto standard around DNS records that is potentially worse, and certainly no better.

I'm not against putting DNS records in certs as a shortcut, or for admitting that we are probably, in the short term, bound to what K8S has decided to do - but I am saying that (vanilla, secure or not) DNS records are not, in the long term, a sufficient mechanism for representing workload identity (or for acting as a generic identifier that more specific forms of workload identity attestations can be cryptographically bound to), unless we invent several layers of nonstandard extensions to/assumptions around DNS. And if we do that, we have essentially reinvented SPIFFE/SPIRE but done some violence to an older, established, and simpler standard to get there.

costinm · 2023-05-01T15:54:44Z

I don't see why a name in the hostname syntax is 'rooted' in DNS - it's also an opaque identifier like an email address or URL. You don't need to do any DNS lookup in any of the verifications you do.

If you want to do any sort of attesting that a specific identifier belongs to a specific service and all you are using is DNS, you have to trust the DNS server to make that attestation. I think we're saying the same thing here. SPIRE identifiers are also fully opaque, but unlike DNS, SPIRE offers many forms of workload attestation that go far beyond trusting the DNS records a given server possesses. All DNS can do is attest the validity of a name <-> map. That's a weak form of workload identity, and is not multifactor.

Not sure what you mean by 'opaque' - Istio use of Spiffe is certainly not opaque, and fully opaque identifiers are not very useful without a service that can provide info about them making them less opaque. Spire != Spiffe and Istio is also a strange user - yes, it would be great to have 'more workload attestations' (I have a proposal about adding telemetry info), that has little to do with Spire or Spiffe.

A certificate or JWT can attest/sign multiple things - including ownership of a hostname (which may be in DNS or not - I think we should be more clear that FQDN is a host identifier which may or may not be recorded in DNS and is defined in a different RFC). I also don't understand what 'weak' means - a certificate or JWT represents a set of signed statements that the signer has verified. What makes 'I can attest the FQDN - pod name, namespace and cluster suffix' weaker than 'I can attest the service account, namespace and cluster'? Same source (K8S APIserver and JWTs). Multifactor is another thing I don't get - Istio doesn't have any multi-factor concept.

I do agree that Spire (and other CAs - but not Citadel because we chose not to) can attest more than service account. I don't understand the opposition to have Citadel do this - but it's fine since other CAs can do it. I suspect we do agree that more than service account should be 'attested' - and I hope FQDN (pod name, namespace, cluster) or service FQDN would be among the things we treat as first class, since the names are used as first class in k8s.

That is not actually the case with Istio and SPIFFE today - if DNS is hacked, the VIP of a different service can be returned and the entire security and checks are messed up. This is well known and why Istio REQUIRES a secure DNS to be secure. This is a constraint of Istio (and kubernetes) yes. It has very little to do with SPIRE - I think the point you are making is that it will always be a constraint of Istio and Kubernetes, whether we used DNS or SPIRE to identify workloads, which I would certainly agree with - workload identity is one part of the puzzle.

I'm not sure what SPIRE has to do with this discussion - it is one of the many certificate providers, and each CA can attest and include different things in a cert. What Istio and K8S are concerned with is attestations for the things we use and verify - regardless of CA ( or form of attestation - JWT tokens are also fine for most use cases if used properly ). That is the service account (in particular for servers), FQDN - and in an ideal world VIPs and IPs.

There is nothing special about expressing something as URL instead of hostname or email, from a security perspective. The advantage of DNS over URL, when client authenticates the server, is that it is independent of a discovery server mapping DNS to VIP to URLs, and fully interoperable and well known mechanism. Correct - the difference is in what attestations you can practically cryptographically attest against that identifier - DNS is not designed to attest anything besides a name <-> IP mapping, which by itself is not sufficient for attesting workload identity.

That is not true on multiple levels. DNS is not only for name to IP - it is commonly used to represent for example certs associated with a name, And nobody is discussing attesting the DNS A record - the cert is attesting a FQDN, either a hostname ( pod name, namespace, cluster) or a service FQDN. A waypoint may get a service cert - if the CA can attest that the gateway is authorized (RBAC) to serve it, very much like ACME works. The FQDN is intended to avoid the weaknesses of DNS - it is a proof you own the hostname, even if DNS returns the wrong IP.


costinm · 2023-05-01T16:10:08Z

On Mon, May 1, 2023 at 8:18 AM Ben Leggett wrote:

>> I think the root problem is that SPIFFE is over-selling the use of a URL (that in most cases is NOT a workload identity) to magically make things secure, and ignoring the complexities and insecure side-channels it introduces.
>>
>> Even if they had a well defined schema - like the distinguished name or JWT claims - it would still be tied to a discovery system to map what users want - access example.namespace.svc - to the URLs representing identity in whatever control plane is used.
>
> Sure - parsing the specific fields of the identifier is out of scope of what SPIFFE (and SPIRE) offers. Same with the DNS naming you're suggesting - DNS has no such constraint or standard but conventions can be overlaid on it.

There are a couple of RFCs and docs defining such constraints. Again - this is not DNS, but hostnames, and the requirements for hostnames ( uniqueness on each domain, how it is represented in OS and language APIs, etc) are well known. Unfortunately K8S has some gaps - .cluster.local is good, but representing cluster names in MCS is still a bit weak and needs improvements. However for all use cases we care about - since the client starts with a FQDN ( or just hostname with clear expansion ) the naming must be working, that's the 'original intent' and everything we do is to make sure the destination is entitled to use it.

> The difference is that DNS identifiers are only designed to attest a single identity factor historically (and until recently didn't even offer any real security guarantees about that attestation), and SPIFFE is expressly more general than that - cryptographically binding attestations of multiple factors of workload identity (as SPIRE does to SPIFFE IDs) to DNS records is simply not something you can or will ever be able to do within the DNS standard.

FQDN identifiers can be attested in many ways. Nothing makes a URL more 'attestable' than a FQDN, and what a CA provider uses to attest is strictly separated from the format of the signed info. I still don't understand what DNS records have to do with any of this - there are a lot of things in DNS secure and a lot of uses of DNS for representing certs and trust, but I don't see what it has to do with using the FQDN in the cert. There is no DNS involved in what we are discussing ( except that original client must still make a DNS request with a FQDN to resolve the IP we intercept - and need to ultimately map to a certificate - and if DNS is not secure the rest falls).

> It's a fundamentally unsound basis for workload identity - unless you invent several layers of de-facto standards that live outside normative DNS implementations, at which point you arrive at something that looks *exactly* like what SPIFFE/SPIRE already is, but with DNS names instead of SPIRE IDs.

I still don't understand what you mean. What workload identity - Istio service account ? Spiffe is pretty vague on what is the identity of the workload except an opaque URL, and nothing makes it more 'sound' than any other opaque identifier. While 'workload identity' is not clearly defined - the use of FQDN is very sound and the basis of all internet communication - as well as what K8S and Istio are really handling, for client to server communication. What is unsound is the disconnect between what user is using ( example.namespace.svc ) which is clearly what needs to be validated because that's what the user wants to communicate with - and the various vaguely defined and implementation-specific identities in Spiffe.

> And at that point what you have done is invest a lot of work to avoid using an existing standard, so you can craft another, even more de-facto standard around DNS records that is potentially worse, and certainly no better.

I don't know if this is sarcasm, hard to tell in comments :-) The use of FQDN and hostnames in certificates is the most broadly used standard on the internet and in enterprise - and K8S is not so different. There is not a lot of work to use it - virtually every library and CA supports it. What we are doing with Spiffe is investing a lot of work in something that is certainly no better - as we know very well ( i.e. breaks interop, has a strong dependency on a secure DNS, etc).


costinm · 2023-05-01T16:13:08Z

BTW - I want to be clear I do not disagree with you that including multiple 'attestations' in a certificate is a good thing, and Spiffe remains essential for client identity ( attesting the service account of the client ) and is useful for some attestation about servers as well. My point in this comments is that we must support the broader internet standards for interoperability and compatibility - and that includes using FQDN SANs and SNIs in our certs and support them first class.


elinesterov · 2023-09-27T06:42:18Z

I think most of the discussion here was about a format that adds additional information to the end of the SPIFFE path, but it can also be added at the beginning, e.g., /cluster-id/ns/namespace/sa/service-account. The cluster_id is already part of the Istio configuration and env variables, and makes total sense in multicluster scenarios.

Adding it would also solve the problem of multicluster deployment when the limitation is that you HAVE TO avoid namespace collision.

Adding it will not have to change a lot of internals here because it is already part of the configuration.

I understand that supporting arbitrary SPIFFE IDs, as @bleggett mentioned, has some challenges around making it work with telemetry and access control. Still, using Istio for mTLS only and other systems like OPA for authorization might be a conscious choice. In this case, the user can opt in to disabling the default Istio SPIFFE ID template scheme.
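To illustrate the cluster-id idea, here is a minimal Python sketch of a parser that accepts both the current default path and a variant with one extra leading segment. The `IstioIdentity` type, the `parse_spiffe_id` function, and the "exactly one optional leading segment" rule are assumptions made for illustration, not anything Istio implements today.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class IstioIdentity:
    """Hypothetical container for the fields extracted from a SPIFFE ID."""
    trust_domain: str
    namespace: str
    service_account: str
    cluster_id: Optional[str] = None  # only set when the optional prefix is present


def parse_spiffe_id(spiffe_id: str) -> IstioIdentity:
    """Parse spiffe://<trust-domain>/[<cluster-id>/]ns/<namespace>/sa/<service-account>."""
    url = urlparse(spiffe_id)
    if url.scheme != "spiffe" or not url.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")

    segments = [s for s in url.path.split("/") if s]
    cluster_id = None
    if len(segments) == 5:
        # Assumption: a fifth segment can only be a leading cluster id.
        cluster_id, segments = segments[0], segments[1:]

    if len(segments) != 4 or segments[0] != "ns" or segments[2] != "sa":
        raise ValueError(f"unsupported SPIFFE ID path: {url.path!r}")

    return IstioIdentity(url.netloc, segments[1], segments[3], cluster_id)
```

With this shape, two workloads in the same namespace and service account but in different clusters resolve to different identities, which is the namespace-collision case mentioned above.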

elinesterov · 2023-09-27T06:53:29Z

@costinm

I think the root problem is that SPIFFE is over-selling the use of a URL
(that in most cases is NOT a workload identity)
to magically make things secure, and ignoring the complexities and insecure
side-channels it introduces.

If you read the SPIFFE specification, you can see that it is not only about URL, and not only about x.509 as an identity document. SPIFFE is the only mechanism that can enable federation easily for Istio and as authentication in multiple CA federated environments spiffe auth prevents cases of identity spoofing because of the trust domain.

and insecure side-channels it introduces.
I would love to learn more about side channels here :) In the case of Istio and SPIRE, it uses the same mechanism for delivering the X509-SVID to Envoy. It actually makes things better from a security standpoint because it doesn't need to rely on service accounts only as a security mechanism (which forces Istio users to create SAs they might not even use, or they all just use default).

elinesterov · 2023-09-27T17:03:30Z

I was thinking: what if users were allowed to provide a scheme for Istio to use, leaving the default as it is today: ns/namespace/sa/service-account? In this case, if I opt in to a different scheme, I just need to provide it to Istio, and as long as it has the namespace and service account in the SPIFFE ID, everything should function as it does now.
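A rough Python sketch of what such an opt-in scheme could look like: a path template that is accepted only if it still contains the namespace and service account. The `{namespace}`/`{service_account}` placeholder syntax and every function name here are hypothetical; Istio exposes no such setting today.

```python
import string

# Assumption: these two fields are what the rest of Istio needs to find in the ID.
REQUIRED_FIELDS = {"namespace", "service_account"}
DEFAULT_SCHEME = "ns/{namespace}/sa/{service_account}"


def scheme_fields(scheme: str) -> set:
    """Return the placeholder names used in a scheme template."""
    return {name for _, name, _, _ in string.Formatter().parse(scheme) if name}


def validate_scheme(scheme: str) -> None:
    """Reject schemes that drop the namespace or the service account."""
    missing = REQUIRED_FIELDS - scheme_fields(scheme)
    if missing:
        raise ValueError(f"scheme is missing required fields: {sorted(missing)}")


def render_spiffe_id(trust_domain: str, scheme: str = DEFAULT_SCHEME, **fields: str) -> str:
    """Render a SPIFFE ID from a validated scheme and concrete field values."""
    validate_scheme(scheme)
    return f"spiffe://{trust_domain}/{scheme.format(**fields)}"
```

For example, `render_spiffe_id("cluster.local", "{cluster}/ns/{namespace}/sa/{service_account}", cluster="c1", namespace="foo", service_account="bar")` would keep both required fields while adding a cluster segment in front.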

costinm · 2023-09-27T17:09:02Z

On Tue, Sep 26, 2023 at 11:53 PM Eli Nesterov wrote:

> @costinm
>
>> I think the root problem is that SPIFFE is over-selling the use of a URL (that in most cases is NOT a workload identity) to magically make things secure, and ignoring the complexities and insecure side-channels it introduces.
>
> If you read the SPIFFE specification, you can see that it is not only about URL, and not only about x.509 as an identity document. SPIFFE is the only mechanism that can enable federation easily for Istio and as authentication in multiple CA federated environments spiffe auth prevents cases of identity spoofing because of the trust domain.

I wouldn't say it's the 'only' mechanism - even the existing Istio APIs allow control over which CAs are accepted for a particular host. Identity federation is not yet a supported feature in Istio and we don't have agreement on what will be supported - so it's not a factor beyond what our APIs already support ( both around CAs and OIDC ).


costinm · 2023-09-27T17:11:49Z

AFAIK you can already use any SAN you want - using the explicit APIs we support. I think we agreed ( at least a subgroup ) that for extra info we can use the newly defined cert fields, believe K8S got some assigned. The risks of changing the identity format in Istio are hard to overstate... For ambient - I don't mind making some changes, but more towards using DNS certs for servers.


elinesterov · 2023-09-27T17:17:43Z

AFAIK you can already use any SAN you want - using the explicit APIs we
support.

@costinm do you mean by using a DestinationRule, or some other mechanism? Would you mind pointing me to where I can read more about it?

elinesterov · 2023-09-27T17:19:03Z

Also, is it possible to configure Envoy through Istio to use different formats of SPIFFE ID in different contexts (e.g. internal inside the cluster, and external when talking to services outside of the mesh, e.g. via service records + destination rules)?

costinm · 2023-09-28T01:24:54Z

"Identity Federation" usually means communicating with a peer that has a different identity provider/roots. It has 2 sides - client verifying the server and server authorizing a client.

DestinationRule allows you to specify which root CAs to trust and any SAN you want - URL or DNS are both fine. For verifying a client using a different identity domain - with JWTs we have normal OIDC, for client certs we can also specify any CA we want - and I believe the authz rules still allow rules using the full SAN ( in addition to the extracted namespace and SA ). This works not only with Spiffe and URL-based SANs - and has been quite stable for a long time. It is not automatic or default.

OIDC also has a very long history with this - unfortunately the constraints in certs were never broadly implemented, just like Spiffe 'federation' is unlikely to have broad adoption - in practice when you have 2 domains using different identity providers there are some gateways in the middle, you don't have 'flat network' which is required for mTLS. So you end up with most of the federation config in the gateway - and for that the current APIs work as expected ( and many other gateways besides Istio provide similar controls ).

I personally wouldn't mind too much more flexibility in the format of client certificates - but I am strongly against messing with the identity of the servers, we already have a very fragile and incompatible mechanism in 'secure naming' ( i.e. guessing which identity a server might have by looking at each pod that is selected across the mesh is quite bad IMHO). And with ACME - including 'private' ACME as in step-ca - I think we would be far better moving back to using standard DNS SAN for services ( in particular in ambient).


costinm · 2023-09-28T01:36:11Z

For more context on ambient - if we are moving towards L7 'gateways' enforcing the policies - and possibly interop with other 'classes' of gateway ( not only istio ): most of the mTLS will be between a client and a gateway which might not be an Istio ( i.e. spiffe) gateway. Ztunnel is per node - and currently has some special requirements for certs, and I don't think we have any design on 'federation', in particular as we hope ztunnel to converge with CNI and maybe have improved netpol. As I mentioned - 'flat network' and a lot of trust in the control planes is generally required for normal workload to workload mTLS - for everything else mTLS will terminate in gateway and most likely will use JWTs for multi-hop proof. And for JWTs I know SPIFFE has their 'standard' - but so does OIDC and that is already broadly supported. In this context - not the best time to mess with 'istio sidecar' identity model or add one more migration problem for ambient, when the use case is exceedingly uncommon.


elinesterov · 2023-09-28T15:56:32Z

"Identity Federation" usually means communicating with a peer that has a
different identity provider/roots.
It has 2 sides - client verifying the server and server authorizing a
client.

DestinationRule allows you to specify which root CAs to trust and any SAN
you want - URL or DNS are both fine.

@costinm I think this flexibility only works when you use secrets in the destination rule.
Something like:

spec:
  host: mydbserver.prod.svc.cluster.local
  trafficPolicy:
    tls:
      mode: MUTUAL
      clientCertificate: /etc/certs/myclientcert.pem
      privateKey: /etc/certs/client_private_key.pem
      caCertificates: /etc/certs/rootcacerts.pem

When it comes to using SDS I cannot find a way to do that. I think using different contexts would solve that e.g. using builtin://external would tell sds to use context foo so the envoy can get client cert\key and root for that external out-of-cluster mTLS (this is where different spiffe id and any other setting can be used). I think that wouldn't mess with internal Istio spiffe id format.

bleggett · 2023-09-28T16:25:13Z

The DestinationRule approach is per-workload. That means if you have 1000 workloads you need to create 1000 DestinationRule overrides to "fix" the Istio-hardcoded SAN format. That's a kludge, not an API.
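To make the scaling concern concrete, here is a rough Python sketch that generates one DestinationRule override per workload. It assumes the override is expressed through the `subjectAltNames` field of the DestinationRule TLS settings; the host names, the `-san-override` naming convention, and the extra `/pool/blue` path segment are all made up for illustration.

```python
from typing import Dict, List


def destination_rule_override(host: str, namespace: str, spiffe_ids: List[str]) -> Dict:
    """Build one DestinationRule manifest that pins the expected SANs for a host."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "DestinationRule",
        "metadata": {
            # Hypothetical naming convention for the generated objects.
            "name": host.split(".")[0] + "-san-override",
            "namespace": namespace,
        },
        "spec": {
            "host": host,
            "trafficPolicy": {
                "tls": {
                    "mode": "ISTIO_MUTUAL",
                    "subjectAltNames": spiffe_ids,
                }
            },
        },
    }


# Hypothetical inventory: 1000 services whose CA mints a non-default ID format.
workloads = {
    f"svc-{i}.prod.svc.cluster.local": [f"spiffe://example.org/ns/prod/sa/svc-{i}/pool/blue"]
    for i in range(1000)
}

rules = [destination_rule_override(host, "prod", ids) for host, ids in workloads.items()]
print(f"{len(rules)} DestinationRule objects to create and keep in sync")
```

Every one of those objects has to be kept up to date as identities change, which is the maintenance burden being described.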

The hardcoded SAN format is:

Istio specific
Not sufficient for all use cases
Enforced in one spot in the Istio envoy config
Implicitly assumed everywhere else

It's just not particularly robust as-implemented - the inflexibility is a side effect of the fragility.

While changing the default format would be invasive (though certainly not the most invasive change Istio has ever presented users with), there's clearly several other opt-in options that wouldn't harm the default functionality that are worth pursuing here.

the tl;dr is that if istiod is currently the only thing that can act as a CA for workloads, that's tech debt - not essential protected functionality. "You must use our CA because our CA is special" is a bug. Nothing about Istio's functionality needs to depend on you selecting a specific workload CA. We simply overindex on our built-in implementation of the workload CA.

It simply should not be Istio's business how a CA ties a workload to a cert. Currently, it is, and that creates a lot of problems - not just "I have a niche use case" problems, but also general fragility and scoping problems from the auth APIs on down.

costinm · 2023-09-29T02:42:54Z

The Istio spiffe format IS Istio specific. There is no standard URL format.

If you have 1000 workloads not using Istio Spiffe - maybe you should not use Istio, or modify whatever is generating the certs to use Istio format.

Why would we take the risks and do the work maintaining a separate format ( one ? 10 ? how many ways do you want to parse a URL - and make sure this is reflected in all code that deals with certs). Are we going to have regex to extract arbitrary URLs ? And the format is the least of your problems - how do you discover what identity to use for each of the 1000 workloads ?

Secure naming generates Istio style URLs.

You do not have to use our CA - only the URL format we use to represent identities. Any CA would work. And you can also just use DNS SANs for the services - to avoid the much larger problem of configuring which SANs are allowed for each service.

I am not a big fan of using an arbitrary URL and requiring an external mapping service from a service name to a VIP to a list of URLs representing any KSA. But that's what we have used, and I have not seen any reason why a different URL format would be better ( or why we need to support 7 URL formats ).


costinm · 2023-09-29T02:49:51Z

On Thu, Sep 28, 2023 at 8:56 AM Eli Nesterov wrote:

>> "Identity Federation" usually means communicating with a peer that has a different identity provider/roots. It has 2 sides - client verifying the server and server authorizing a client.
>>
>> DestinationRule allows you to specify which root CAs to trust and any SAN you want - URL or DNS are both fine.
>
> @costinm I think this flexibility only works when you use secrets in the destination rule. Something like:
>
>     spec:
>       host: mydbserver.prod.svc.cluster.local
>       trafficPolicy:
>         tls:
>           mode: MUTUAL
>           clientCertificate: /etc/certs/myclientcert.pem
>           privateKey: /etc/certs/client_private_key.pem
>           caCertificates: /etc/certs/rootcacerts.pem
>
> When it comes to using SDS I cannot find a way to do that. I think using different contexts would solve that e.g. using builtin://external would tell sds to use context foo so the envoy can get client cert\key and root for that external out-of-cluster mTLS (this is where different spiffe id and any other setting can be used). I think that wouldn't mess with internal Istio spiffe id format.

I'm all for improving the UX for DestinationRule loading certificates. It's a pretty stable API/feature, we went through this with gateway so if anything is broken or hard to use - I would rather fix it than try to hack the URL format in the spiffe cert and any associated code. This will also help with many other use cases and improve an existing API. BTW - in your example you would still need a list of Spiffe URLs to validate as proof for the host (unless the host is using a DNS SAN). This thread is about custom Spiffe URLs - and as I mentioned in a previous comment the list of URLs may change and it is not so easy to discover if the workloads are in other clusters ( which may not be watched by Istiod ).


bleggett · 2023-09-29T15:55:14Z

The Istio spiffe format IS Istio specific. There is no standard URL format.

There is nothing Istio specific about the format. There is a hardcoded default in Istio, which is not exposed in any API, but which several of our APIs have to implicitly assume and be aware of.

There are only 2 questions here:

Istio currently DOES hardcode the SPIFFE ID format - prefix AND postfix, which some code (and some APIs) opaquely assume.

1. Does that code actually NEED to make that assumption?
2. If it does, does it NEED to make that assumption about the prefix and the postfix, or just one or the other?

The answer to 2 is clearly "no, it does not".

The answer to 1 is likely "no, it does not, but it's a little hard to fix".

There's not much more to argue about here from a design perspective.

A SPIFFE ID is a pointer. It is not an identity. All Istio really, genuinely needs to do is hand that pointer to a workload CA and get a x509 cert back. To the degree that istio cares about the contents of the pointer, versus the thing it points to, is a degree to which Istio is making fragile assumptions. We should minimize, not maximize, the number of fragile assumptions.

This is exactly analogous to DNS - assumptions should be made about the thing the pointer points to, and not the format of the pointer itself. k8s needs to force a DNS name postfix format, and does, but it hardcodes/forces a minimal opinion - just the postfix is non-negotiable, it has no opinion about any other part of the DNS name.

There's no reason why we can't, or shouldn't, follow that for SPIFFE ID formats.

If you have 1000 workloads not using Istio Spiffe - maybe you should not use Istio, or modify whatever is generating the certs to use Istio format.

Again - Istio's value is not in being a workload CA. Nor is Istio's value in being a remarkably inflexible workload CA. It's 5% of the functionality and there are many implementations. It really shouldn't matter what workload CA Istio uses at all. Istio binds policy to workload identities. How those workload identities are generated, or mapped to workloads, is an implementation detail of the workload CA. Istio provides a basic workload CA implementation, much like it provides a basic waypoint implementation. Istio currently overindexes on its own basic workload CA implementation as a de-facto standard, when that is not strictly required.

Istio's invariants around the workload CA are:

Every workload should have a cryptographic identity (how the workload is bound to the identity -> owned by the workload CA, not Istio's CP).
Istio should be able to figure out which workload CA to talk to.
Istio should be able to obtain a cryptographic identity from the workload CA by giving it some pointer.
The pointer should be "sufficiently" (where "sufficient" is determined by the workload CA) unambiguous, and return a single cryptographic identity (which encodes the pointer)
The cryptographic identity should be validate-able by Istio.

That's it. Nothing there requires the use of a specific workload CA. It requires a (very basic, probably x509) contract between Istio and the workload CA.
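A rough sketch of that contract in Go, under stated assumptions: `WorkloadCA`, `Obtain`, `EncodesPointer`, and `fakeCA` are hypothetical names invented for illustration, not Istio or SPIRE APIs. The fake CA returns an unsigned certificate struct, and chain validation (the last invariant) is deliberately left out. The point is only that the caller treats the pointer as opaque and checks that the returned cert encodes it.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
	"net/url"
)

// WorkloadCA is a hypothetical contract: hand it a pointer (a SPIFFE ID),
// get a certificate back. How the CA binds workloads to identities is its
// own business.
type WorkloadCA interface {
	Issue(pointer string) (*x509.Certificate, error)
}

// EncodesPointer reports whether cert carries the pointer as a URI SAN.
func EncodesPointer(cert *x509.Certificate, pointer string) bool {
	for _, u := range cert.URIs {
		if u.String() == pointer {
			return true
		}
	}
	return false
}

// Obtain asks the CA for an identity and checks only that the result
// encodes the pointer that was requested. It never inspects the path.
func Obtain(ca WorkloadCA, pointer string) (*x509.Certificate, error) {
	cert, err := ca.Issue(pointer)
	if err != nil {
		return nil, err
	}
	if !EncodesPointer(cert, pointer) {
		return nil, errors.New("certificate does not encode the requested pointer")
	}
	return cert, nil
}

// fakeCA is a stand-in for illustration. It does no attestation and no
// signing; it just echoes the pointer into the URI SAN field.
type fakeCA struct{}

func (fakeCA) Issue(pointer string) (*x509.Certificate, error) {
	u, err := url.Parse(pointer)
	if err != nil {
		return nil, err
	}
	return &x509.Certificate{URIs: []*url.URL{u}}, nil
}

func main() {
	cert, err := Obtain(fakeCA{}, "spiffe://example.org/ns/prod/sa/web/node/n1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("URI SAN:", cert.URIs[0])
}
```

Nothing in `Obtain` depends on how many path segments the pointer has, which is the property being argued for.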

Why would we take the risks and do the work maintaining a separate format (one? 10? how many ways do you want to parse a URL - and make sure this is reflected in all code that deals with certs). Are we going to have regex to extract arbitrary URLs? And the format is the least of your problems - how do you discover what identity to use for each of the 1000 workloads?

Again - see DNS and the previous examples. How many "DNS formats" does Istio (or K8S) support? What parts of the DNS name MUST FOLLOW a fixed format for Istio to function? It's not a matter of "what Istio must support" - it's a matter of Istio making minimal, versus maximal, assumptions for the constraints it places around an external system (DNS, SPIFFE, x509, etc).

Secure naming generates istio style URLs.

No. Secure naming, from Istio's current perspective, generates URLs that contain at least a trust domain, a service account name, and a namespace. We have some code that confuses at least with at most, for no particularly defensible reason.
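To make the "at least" versus "at most" distinction concrete, here is a minimal Go sketch. The type and function names (`Identity`, `ParseAtLeast`, `ParseAtMost`) are made up for this example and are not taken from the Istio codebase; the example trust domain and extra segments are likewise illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// Identity holds the three attributes named above, plus whatever else
// the workload CA chose to append.
type Identity struct {
	TrustDomain    string
	Namespace      string
	ServiceAccount string
	Extra          []string // trailing path segments, opaque to this parser
}

// ParseAtLeast accepts any SPIFFE ID whose path starts with
// /ns/<namespace>/sa/<service-account> and tolerates trailing segments.
func ParseAtLeast(id string) (Identity, error) {
	u, err := url.Parse(id)
	if err != nil {
		return Identity{}, err
	}
	if u.Scheme != "spiffe" || u.Host == "" {
		return Identity{}, errors.New("not a SPIFFE ID")
	}
	seg := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(seg) < 4 || seg[0] != "ns" || seg[2] != "sa" || seg[1] == "" || seg[3] == "" {
		return Identity{}, errors.New("missing /ns/<namespace>/sa/<service-account>")
	}
	return Identity{
		TrustDomain:    u.Host,
		Namespace:      seg[1],
		ServiceAccount: seg[3],
		Extra:          seg[4:],
	}, nil
}

// ParseAtMost is the stricter reading: same prefix, nothing after it.
func ParseAtMost(id string) (Identity, error) {
	ident, err := ParseAtLeast(id)
	if err != nil {
		return Identity{}, err
	}
	if len(ident.Extra) > 0 {
		return Identity{}, errors.New("unexpected trailing path segments")
	}
	return ident, nil
}

func main() {
	id := "spiffe://example.org/ns/prod/sa/web/node/n1"
	if ident, err := ParseAtLeast(id); err == nil {
		fmt.Println(ident.Namespace, ident.ServiceAccount, ident.Extra)
	}
	_, err := ParseAtMost(id)
	fmt.Println("strict parse:", err)
}
```

Both functions recover the same namespace and service account from a default-format ID; they only differ on IDs that carry more than the minimum.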

And again - the pointer is not "secure", it just needs to be "sufficiently" unambiguous, and tightly bound to the cryptographic identity the workload CA resolves it to. The latter is where the "secure" comes from.

Istio overspecifies an external system's spec, which creates needless inflexibility and API problems.
Istio does not actually need to engage in that overspecification, but does.

I'm all for improving the UX for DestinationRule loading certificates. It's a pretty stable API/feature, we went through this with gateway so if anything is broken or hard to use - I would rather fix it than try to hack the URL format in the spiffe cert and any associated code.

The current URL format is an over-fragile hack, with no API.

We're right back at what I said previously:

that creates a lot of problems - not just "I have a niche use case" problems, but also general fragility and scoping problems from the auth APIs on down.

Why on earth would we break a very well-scoped public API (arguably one of the few well-scoped APIs we have) in order to work around an unnecessarily hard-coded internal default for a workload's SPIFFE ID format?

DestinationRules are a symptom, they aren't the problem here.

If I have a system where the requirement is that "DNS names MUST HAVE AT LEAST a postfix that matches this pattern" and I write code that creates an implicit requirement that "DNS names CAN ONLY match this pattern" - I've made a mistake, not created a defensible internal standard.

kyessenov · 2023-09-29T22:12:26Z

What exactly assumes the format in Istio? My understanding is that it's fairly small:

Authz API translates source principal namespace constraints into regex templates.
Ambient translates trust domain constraints into regex / prefix template.
Telemetry code validates namespace from SPIFFE against peer header.
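For the first item, a hedged sketch of what "namespace constraint into regex template" could look like. The pattern below is an assumption written for illustration, not the matcher Istio actually generates; `NamespaceMatcher` and its `allowExtra` flag are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// NamespaceMatcher builds a regex that matches principals in namespace ns.
// allowExtra controls whether path segments after the service account are
// tolerated. The pattern is illustrative only.
func NamespaceMatcher(ns string, allowExtra bool) *regexp.Regexp {
	tail := ""
	if allowExtra {
		tail = "(/.+)?"
	}
	return regexp.MustCompile(
		"^spiffe://[^/]+/ns/" + regexp.QuoteMeta(ns) + "/sa/[^/]+" + tail + "$")
}

func main() {
	base := "spiffe://example.org/ns/prod/sa/web"
	extended := base + "/node/n1"

	strict := NamespaceMatcher("prod", false)
	loose := NamespaceMatcher("prod", true)

	fmt.Println("strict:", strict.MatchString(base), strict.MatchString(extended))
	fmt.Println("loose: ", loose.MatchString(base), loose.MatchString(extended))
}
```

In this sketch the namespace constraint keeps working when extra segments are appended, as long as the template anchors on the `/ns/<namespace>/sa/` portion rather than on the end of the string.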

Is there anything consuming the principal as a non-opaque string? There's a lot of infrastructure that produces the principals (CA, constraints, etc), but I'm asking about the consumers strictly.

keithmattix · 2023-09-29T22:17:37Z

SAN matching for TLS context is the one that comes to my mind; you may have implied that in your first bullet though

howardjohn · 2023-09-29T22:21:21Z

The client side part is missing. We have code that is pod -> full URI. Need a way to replicate that


kyessenov · 2023-09-29T22:23:02Z

@keithmattix Where is it parsing SPIFFE ID? I only see exact matching (and some trust domain stuff, which is also precise match). Ambient does something different (point 2), but we can ignore that

keithmattix · 2023-09-29T22:33:28Z

Where is it parsing SPIFFE ID

Ah I see what you mean. In that case, I don't think we have a ton of parsing; SPIFFE as a whole just seems like a convenient format to store certain information.

kyessenov · 2023-09-29T22:51:42Z

Yeah, it's an encoding of certain attested workload attributes. John's point is valid - xDS control plane is attesting the server SVIDs on the client by using the service registry and filling out the templates, which requires that the client and the server agree on the template format.

So there are two major places:

A client must attest server SPIFFE ID from the server service account (fill the template on the client).
A server must be able to extract the namespace from the client SPIFFE ID (parse the template on the server). This is both to express a policy and to emit telemetry.
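A small Go sketch of the client-side half, assuming the default Istio-style template. The function names are hypothetical and the matching logic is a simplification of whatever the proxy actually does with SANs. It shows why exact matching forces client and server to agree on the full format, and what a suffix-tolerant match would need to watch for.

```go
package main

import (
	"fmt"
	"strings"
)

// ExpectedServerID fills the Istio-style template from service registry
// data (trust domain, namespace, service account).
func ExpectedServerID(trustDomain, namespace, serviceAccount string) string {
	return "spiffe://" + trustDomain + "/ns/" + namespace + "/sa/" + serviceAccount
}

// MatchExact is the strict check: the presented SAN must equal the
// filled-in template byte for byte.
func MatchExact(expected, presented string) bool {
	return expected == presented
}

// MatchWithSuffix accepts the expected ID itself, or the expected ID
// followed by "/" and further segments. The "/" boundary matters:
// without it, ".../sa/web" would also match ".../sa/webhook".
func MatchWithSuffix(expected, presented string) bool {
	return presented == expected || strings.HasPrefix(presented, expected+"/")
}

func main() {
	expected := ExpectedServerID("example.org", "prod", "web")
	presented := expected + "/node/n1" // e.g. minted by an external workload CA

	fmt.Println("exact: ", MatchExact(expected, presented))
	fmt.Println("suffix:", MatchWithSuffix(expected, presented))
}
```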

linsun · 2023-12-15T14:17:37Z

@EItanya any update on the design doc for this?

bleggett · 2024-07-26T15:45:56Z

Update on this - one of the things we have settled on I think is that

Allowing completely arbitrary SPIFFEIDs is problematic and requires quite a bit of rework, breaking existing policies, and several knock-on effects to the trust model potentially - we probably will not do this in the short term.
Allowing appending arbitrary segments to the suffix/prefix of the Istio SPIFFEID should be pretty doable (suffix might just be a slight Envoy tweak to support actually) - we probably will do this in the medium term.

costinm · 2024-07-28T04:06:34Z

I don't think appending more info is the wrong solution, now that OIDs have been defined for pod info. Also using an additional DNS SAN with the pod FQDN is cleaner and provides more interop for server authentication. Parsing SPIFFE URLs and defining more Istio-only formats is a dead end.


istio-policy-bot added area/ambient Issues related to ambient mesh area/environments area/security area/user experience kind/docs kind/enhancement labels Feb 2, 2023

bleggett changed the title ~~Customizing SPIFFE ID format if using an external SPIFFE-compliant SDS should be possible~~ Customizing SPIFFE ID format if using an external SPIFFE-compliant SDS should be supported Feb 2, 2023

howardjohn removed the area/ambient Issues related to ambient mesh label Feb 13, 2023

JamesCallaghan mentioned this issue Apr 24, 2023

feat: use istio destination rules to allow for arbitrary SPIFFE IDs controlplaneio/threat-modelling-zero-trust-talk#4

Merged

bleggett mentioned this issue Jun 13, 2023

Added identity path prefix variable to meshconfig #45441

Closed

linsun assigned EItanya Dec 15, 2023

EItanya mentioned this issue Dec 15, 2023

SPIRE integration with Ambient #42339

Open

keithmattix mentioned this issue Mar 25, 2024

proposal: support customized spiffeId format #50064

Closed

istio-policy-bot added the lifecycle/stale Indicates a PR or issue hasn't been manipulated by an Istio team member for a while label Jun 13, 2024

howardjohn added lifecycle/staleproof Indicates a PR or issue has been deemed to be immune from becoming stale and/or automatically closed and removed lifecycle/stale Indicates a PR or issue hasn't been manipulated by an Istio team member for a while labels Jun 13, 2024
