
Ability to pass providers to modules in for_each #24476

Open
mightyguava opened this issue Mar 26, 2020 · 162 comments

Comments

@mightyguava

Use-cases

I'd like to be able to provision the same set of resources in multiple regions using a for_each on a module. However, looping over providers (which are tied to regions) is currently not supported.

We deploy most of our infra in 2 regions in an active-passive configuration. So being able to instantiate both regions using the same module block would be a huuuge win. It's also our primary use case for for_each on modules being implemented in #17519.
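
For contrast, a minimal sketch of what this forces today without for_each: one module block per region, each hard-wired to its own aliased provider (the ./modules/vpc path is hypothetical).

provider "google" {
  alias  = "us_east1"
  region = "us-east1"
}

provider "google" {
  alias  = "us_west1"
  region = "us-west1"
}

module "vpc_us_east1" {
  source    = "./modules/vpc"
  providers = { google = google.us_east1 }
}

module "vpc_us_west1" {
  source    = "./modules/vpc"
  providers = { google = google.us_west1 }
}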

Proposal

Proposed syntax from @jakebiesinger-onduo

provider "google" {
  alias = "goog-us-east1"
  region = "us-east1"
}
provider "google" {
  alias = "goog-us-west1"
  region = "us-west1"
}
locals {
  regions = toset(["us-east1", "us-west1"])
  providers = {
    us-east1 = google.goog-us-east1
    us-west1 = google.goog-us-west1
  }
}
module "vpc" {
  for_each = local.regions
  providers = {
    google = local.providers[each.key]
  }
  ...
}

Another option would be to de-couple the region from providers, and allow the region to be passed in to individual resources that are region aware. As far as I know, both AWS and GCP credentials at least are global.
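
As a hedged sketch of that second option, assuming a hypothetical per-resource region argument (not something the provider actually offers here), the credentials would live in a single provider block and the region would move onto each resource:

provider "aws" {}

resource "aws_s3_bucket" "replica" {
  for_each = toset(["us-east-1", "us-west-2"])

  region = each.value                      # hypothetical per-resource argument
  bucket = "example-replica-${each.value}" # hypothetical bucket name
}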

References

@hashibot
Contributor

Hello! 🤖

This issue seems to be covering the same problem or request as #9448, so we're going to close it just to consolidate the discussion over there. Thanks!

@jspiro

jspiro commented Mar 27, 2020

Uh. These requests are in no way similar. Bad bot.

@mightyguava
Author

Yeah, this is not the same as #9448 at all. @pselle @apparentlymart could you help with reopening this issue?

@pkolyvas pkolyvas changed the title Support for_each over providers Ability to provide specific providers, based on an each.key, to a module Mar 30, 2020
@pkolyvas
Contributor

Hey there @mightyguava & @jspiro,

I'm going to re-open the issue as I agree that the concerns are not the same.

I did rename it for clarity, to distinguish this request from instantiating providers with for_each.

@pkolyvas pkolyvas reopened this Mar 30, 2020
@s1mark

s1mark commented Apr 25, 2020

@mightyguava
I ran into the same abstraction issue with the azurerm provider. My goal was to automate multiple Azure subscriptions and keep the code as DRY as possible. Since I have to use Service Principals for auth with the azurerm provider, each subscription requires a separate provider declaration. I ended up using terragrunt's generate function (https://terragrunt.gruntwork.io/docs/reference/config-blocks-and-attributes/#generate)

.
├── dev
│   └── terragrunt.hcl
├── modules
│   └── my_module
│       └── main.tf
├── prod
│   └── terragrunt.hcl
├── stage
│   └── terragrunt.hcl
└── variables.tf

./dev/terragrunt.hcl:

terraform {
  source = "${get_parent_terragrunt_dir()}/../"
}

# will generate content to ./providers.tf
generate "providers" {
  path      = "providers.tf"
  if_exists = "overwrite"
  contents = <<EOF
provider "azurerm" {
  # my main provider
  version         = "~> 2.6"
  subscription_id = "11111111-2222-3333-4444-555555555555"
  client_id       = var.client_id
  client_secret   = var.client_secret
  tenant_id       = var.tenant_id
  features {}
}

provider "azurerm" {
  alias           = "my_alias_provider"
  version         = "~> 2.6"
  subscription_id = "66666666-7777-8888-9999-000000000000"
  client_id       = var.client_id
  client_secret   = var.client_secret
  tenant_id       = var.tenant_id
  features {}
}
EOF
}

# will generate content to ./main.tf
generate "main" {
  path      = "main.tf"
  if_exists = "overwrite"
  contents = <<EOF
module "my_main_provider" {
  source = "./modules/my_module"
}

module "my_alias_provider" {
  source    = "./modules/my_module"
  providers = {
    azurerm = azurerm.my_alias_provider
  }
}
EOF
}

./variables.tf:

# passing these from cli or exporting to TF_VAR
# Note that both of my subscriptions use the same SP for auth and
# both are in the same tenant, so the only difference is the subscription_id
variable "client_id" {}
variable "client_secret" {}
variable "tenant_id" {}

The ./modules/my_module/main.tf contains the desired code without any provider block declaration (the provider configuration is passed down from the root to the child module).

Maybe this does not fully cover your scenario, but it provides some flexibility across different environment settings.

@vivanov-dp

vivanov-dp commented Aug 25, 2020

First of all, thanks for the great work adding iteration and depends_on for modules - both are going to be really useful and I wished for them so many times back during 0.11 days when we were building the majority of our config.

In addition to each.key, I'd expect to be able to freely use maps with for_each and have each.<property> be a provider. This would require the ability to assign a provider "instance" to a local or list/map members.
For example:

locals {
  modules_vars = {
    instance_1 = {
      var1 = ...
      var2 = ...
      provider = aws.euw1            // ERROR
    }
    instance_2 = {
      var1 = ...
      var2 = ...
      provider = aws.cnnw1           // ERROR
    }
  }
}

module "something" {
  source = "./module_something"
  for_each = local.modules_vars

  providers = { aws = each.provider }           // ERROR
  var1 = each.var1
  var2 = each.var2
}

This was the first thing I tried to do when I learned that 0.13 has for_each for modules, which brought me to #17519 and eventually - here. Since we maintain infrastructure in multiple AWS regions and availability zones around the world, most of the modules in our configuration require passing a provider along with at least a few other variables.

@mightyguava mightyguava changed the title Ability to provide specific providers, based on an each.key, to a module Ability to pass providers to modules in for_each Aug 25, 2020
@gbataille
Contributor

While not strictly the same as #9448, I think they might be solved together.

First, like @vivanov-dp said, thanks for adding the for_each support for modules. I had been expecting it for a long time.

However, I had not realized that provider configuration in modules was deprecated.

Here is my use case:

  • I use terraform to manage the list of AWS accounts I have in my organization
  • When I create a new account (from a variable list), I then want to provision it with a few common standard resources.

What I have now is that all those standard resources are in a module. I instantiate the module once per sub-account and I pass the IAM role to the module. The module then opens a provider connection to the right account and the right role (different for each module instance).

This still works in 0.13. However, when I tried to migrate to "for_each" to instantiate all the modules for all the sub-accounts in a single module block, I hit the issue that providers inside modules are not supported anymore.
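
A minimal sketch of the pattern being described, with hypothetical names; adding for_each to the calling module block is exactly what Terraform rejects, because the child module contains its own provider configuration:

# modules/account-baseline/main.tf (hypothetical path): the child module
# configures its own AWS provider from a role ARN passed in by the caller.
variable "admin_role_arn" {
  type = string
}

provider "aws" {
  assume_role {
    role_arn = var.admin_role_arn
  }
}

resource "aws_iam_account_alias" "this" {
  account_alias = "example-alias" # hypothetical
}

# Root module: one module block per sub-account.
module "account_baseline_a" {
  source         = "./modules/account-baseline"
  admin_role_arn = "arn:aws:iam::111111111111:role/OrganizationAccountAccessRole" # hypothetical
}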

And since I can't construct a dynamic list of providers, I think I'm stuck.

Am I right to think there is currently no workaround for my use case?

Should I then split account creation and account "basic provisioning" into 2 different Terraform projects?

thanks

@nikolay

nikolay commented Sep 25, 2020

I personally think that an inline provider declaration, which honors the module's for_each or count, is the cleanest solution:

module "some_module" {
  source = "./some-module"
  for_each = local.modules_elements

  provider "provider1" {
    ...
  }

  provider "provider2" {
    ...
  }

  var1 = each.var1
  var2 = each.var2
}

Ideally, this would support dynamic blocks for providers as well.

Another option is to add for_each for provider as well along these lines:

provider "aws" {
  for_each = var.regions
  region = each.value
}

resource "some_type" "some_id" {
  provider = aws["us-east-1"]
}

@vivanov-dp

@nikolay
That can do the job, but why create new providers for each module invocation?
Even if it is "for free" in terms of performance, which I don't really know, there are a bunch of properties that configure the provider, and this approach would require putting them all into local.modules_elements and listing them all in each provider declaration in each module invocation.
You can't really declare an AWS provider just by setting the region. It also requires a profile, or an access_key and secret_key, and it is very likely that assume_role would also be set. It has 15+ properties, and many of them become useful as the architecture grows.

@nikolay

nikolay commented Sep 25, 2020

@vivanov-dp This was pseudocode just to illustrate my point, which was that the logic of how the provider should be initialized could be encapsulated in the module. I can't think of a situation where the instantiation of a provider would be an expensive operation. Also, providers with assume_role have session information, which may not make sense to be reused across different modules, but will happen due to natural laziness if we have to create too many aliases.

@vivanov-dp

@nikolay
What I understand is that you propose to have this:

locals {
  modules_vars = {
    instance_1 = {
      var1 = ...
      var2 = ...
      region = ...
      profile = ...
      role_arn = ...
    }
    instance_2 = {
      var1 = ...
      var2 = ...
      region = ...
      profile = ...
      role_arn = ...
    }
  }
}

module "some_module" {
  source = "./some-module"
  for_each = local.modules_vars

  provider "aws" {
      region  = each.region
      profile = each.profile
      assume_role {
          role_arn = each.role_arn
      }
  }

  var1 = each.var1
  var2 = each.var2
}

instead of:

provider "aws" {
    alias   = "euw1"
    region  = "eu-west-1"
    profile = var.aws_west_profile
    assume_role {
        role_arn = "arn:${var.aws_partition}:iam::${var.aws_account_id}:role/TerraformRole"
    }
}
provider "aws" {
    alias   = "cnnw1"
    region  = "cn-northwest-1"
    profile = var.aws_cn_profile
    assume_role {
        role_arn = "arn:${var.aws_cn_partition}:iam::${var.aws_cn_account_id}:role/TerraformRole"
    }
}

locals {
  modules_vars = {
    instance_1 = {
      var1 = ...
      var2 = ...
      provider = aws.euw1
    }
    instance_2 = {
      var1 = ...
      var2 = ...
      provider = aws.cnnw1
    }
  }
}

module "something" {
  source = "./module_something"
  for_each = local.modules_vars

  providers = { aws = each.provider }
  var1 = each.var1
  var2 = each.var2
}

But then I have one set of providers for everything else and one set of the same properties just for the modules. So which one is the source of truth? Unless I declare my providers using the same sets of properties, I have to create a new abstraction - the set of provider properties - and use that in all places.

As I said - it can do the job, I think it looks nice and is not a bad idea, but IMO it involves more changes to the existing configuration than if we could just use the already defined providers - which in my case are in an external file, often 1 or 2 directories up the hierarchy and propagated down via a script that generates main.tf & variables.tf.

@jon-walton

jon-walton commented Sep 25, 2020

A use case for me would be to configure a dynamic provider based on output from a module using for_each

such as creating multiple kubernetes clusters (foo) and optionally applying resources (bar)

module "foo" {
  source = "./foo"
  for_each = var.foo_things

  var1 = each.key
  var2 = each.value.something
}

module "bar" {
  source = "./bar"
  for_each = { for k, v in var.bar_things : k => v if v.add_bar_to_foo == true }

  provider "some_provider" {
    config1 = module.foo[each.value.foo_thing].output1
    config2 = module.foo[each.value.foo_thing].output2
    config3 = module.foo[each.value.foo_thing].output3
  }

  var1 = each.key
  var2 = each.value.something
}

@nikolay

nikolay commented Sep 26, 2020

@vivanov-dp The ideal approach is to have identical code, with only the data varying between environments and between clusters within an environment. Right now, almost everything supports for_each/count except providers.

@nikolay

nikolay commented Sep 26, 2020

@jon-walton Your example is identical to mine, but I illustrated the case where the module needs more than a single provider.

@jon-walton

@jon-walton Your example is identical to mine, but I illustrated the case where the module needs more than a single provider.

My example illustrates that the provider config supplied to a module is set by the output of another module which also uses for_each.

@vivanov-dp

@nikolay Sure, having for_each for providers sounds logical and natural, and I fully support it; I believe it deserves its own feature request.

@nikolay

nikolay commented Sep 28, 2020

@jon-walton Fair enough, we need dynamic providers - one way or another. Right now providers and outputs are the only two static resources in Terraform.

@apparentlymart
Member

Hi all! Thanks for the interesting discussion here.

It feels to me that both this issue and #9448 are covering the same underlying use-case, which I would describe as: the ability to dynamically declare and use zero or more provider configurations based on data determined at runtime.

These various proposals all have in common a single underlying design constraint: unlike most other concepts in Terraform, provider configurations must be available for operations on resources that belong to them, which includes planning, updating, and eventually destroying. This means that a provider configuration must be available at the same time a new resource is added to the configuration, must have a stable name that can be tracked between runs in the Terraform state, and they must continue to be available until every resource instance belonging to them has been destroyed and/or removed from the state.

It is due to that design constraint that provider configurations remain separated from all other concepts in the restrictions placed on them in the configuration. Design work so far seems to suggest that there are some paths forward to making provider configuration associations (that is, the association of resources to provider configurations) more dynamic, but the requirement that each provider configuration be defined by a static provider block in the root module seems necessary to ensure that the provider block can remain in the configuration long enough to destroy existing resource instances associated with it, which happens after they are removed from the configuration.

One design we've considered (though this is not necessarily the final design we'd move forward with) is to make provider configurations a special kind of value in the language, which can be passed by reference through expressions in a similar sense that other values can. For example:

variable "networks" {
  type = map(
    object({
      cidr_block   = string
      aws_provider = providerconfig(aws)
    })
  )
}

resource "aws_vpc" "example" {
  for_each = var.networks
  provider = each.value.aws_provider

  cidr_block = each.value.cidr_block
}

The aws_provider attribute here is showing a hypothetical syntax for declaring that an attribute requires a configuration for the aws provider, with that reference then usable in provider arguments in resource and data blocks where static references would be required today. That syntax is intended to replace the current "proxy provider configuration" special-case syntax, by allowing provider configurations to pass through variables instead. However, this design does have the disadvantage of requiring explicit provider configuration passing, whereas today child modules can potentially inherit non-aliased provider configurations automatically in simple cases.

However, the calling module would still be required to declare the provider configurations statically with provider blocks, perhaps like this:

provider "aws" {
  alias = "usw2"

  region = "us-west-2"
}

provider "aws" {
  alias = "use2"

  region = "us-east-2"
}

module "example" {
  source = "./modules/example"

  networks = {
    usw2 = {
      cidr_block   = "10.1.0.0/16"
      aws_provider = provider.aws.usw2
    }
    use2 = {
      cidr_block   = "10.2.0.0/16"
      aws_provider = provider.aws.use2
    }
  }
}

Notice that the two provider configurations must still have separate static identities, which can be saved in the state to remember which resource belongs to which. But this new capability of sending dynamic references for provider configurations still allows writing a shared module that can be generic in the number of provider configurations it works with; only the root module is required to retain a static set of provider configurations.

There is also some possibility here of allowing count and for_each in provider blocks to permit provider addresses like provider.aws["use2"] (where use2 is the each.key), but this is more problematic because it creates another opportunity to "trap" yourself in an invalid situation: if you use the same value in for_each for both a resource configuration and its associated provider configuration, removing an item from that map would cause the resource instance and the provider configuration to be removed at the same time, which violates the constraint that the provider configuration must live long enough to destroy the instance recorded in the state. Given how common it is to get into that trap with provider blocks inside child modules today (which is why we've been recommending against that since Terraform 0.11), we're reluctant to introduce another feature that has a similar trap. For that reason, I predict that for_each and count for provider configurations (as proposed in #9448) won't make it through a more detailed design pass for this family of features.


I've shared the above mainly to just show some initial design work that happened for this family of features. However, I do have to be honest and share some unfortunate news: the focus of our work is now shifting towards stabilizing Terraform's current featureset (with minor modifications where necessary) in preparation for a Terraform 1.0, and a mechanism like the one I described above would be too disruptive to Terraform's internal design to arrive before that point.

The practical upshot of this is that further work on this feature couldn't begin until at least after Terraform 1.0 is released. Being realistic about what other work likely confronts us even after the 1.0 release, I'm going to hazard a guess that it will be at least a year before we'd be able to begin detailed design and implementation work for features in this family.

I understand that this is not happy news: I want this feature at least as much as you all do, but with finite resources and conflicting priorities we must unfortunately make some hard tradeoffs. I strongly believe that there is a technical design to address the use-cases discussed here, but I also want to be candid with you all about the timeline so that you can set your expectations accordingly.

@mmathys

mmathys commented Dec 13, 2023

Would like to have that as well

@chrisk-tbot

chrisk-tbot commented Jan 20, 2024

I just want to chime in and note how challenging this missing abstraction has been. My use case was provisioning AWS GuardDuty resources across multiple regions and accounts. I use a module to eliminate the duplication across the various accounts, but it's impossible to use for_each with different providers to run it across regions.

I appreciate all the hard work people have poured into this project, but this is the missing functionality that nags at me the most and results in painful duplication in my code. Would love to have this abstraction here

@claytonolley

Have you considered CloudFormation StackSets? You can build these with Terraform and have them pushed out to all your accounts in the org.

https://docs.aws.amazon.com/organizations/latest/userguide/services-that-can-integrate-cloudformation.html

@chrisk-tbot

@claytonolley I have not, but will take a look thanks. I did reduce the pain a bit by having multiple module calls. For anyone else who finds this helpful...

  • define provider aliases for all accounts and all regions
  • pass those providers through to the first module for each account
  • then in that module make calls to the secondary module for each region

This way the actual resource code is defined just once, and the only duplication is in the nested module calls, since providers can't be allocated dynamically.
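
A rough sketch of that layering (module paths, account IDs, and role ARNs are hypothetical); the per-account module accepts the aliases via configuration_aliases and fans out to a per-region module:

# Root: one aliased provider per account x region.
provider "aws" {
  alias  = "acct_a_use1"
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::111111111111:role/guardduty-admin" # hypothetical
  }
}

provider "aws" {
  alias  = "acct_a_usw2"
  region = "us-west-2"
  assume_role {
    role_arn = "arn:aws:iam::111111111111:role/guardduty-admin" # hypothetical
  }
}

module "guardduty_account_a" {
  source = "./modules/guardduty-account" # hypothetical
  providers = {
    aws.use1 = aws.acct_a_use1
    aws.usw2 = aws.acct_a_usw2
  }
}

# modules/guardduty-account/main.tf: declare the expected aliases, then call
# the per-region module once per region.
terraform {
  required_providers {
    aws = {
      source                = "hashicorp/aws"
      configuration_aliases = [aws.use1, aws.usw2]
    }
  }
}

module "use1" {
  source    = "../guardduty-region" # hypothetical
  providers = { aws = aws.use1 }
}

module "usw2" {
  source    = "../guardduty-region" # hypothetical
  providers = { aws = aws.usw2 }
}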

@s1mark

s1mark commented Jan 20, 2024

Maybe terraform stacks can help to solve this abstraction.
https://www.hashicorp.com/blog/terraform-stacks-explained

@favoretti

I'll beat a dead horse here, I guess. After a couple of years of working on providers, I get that, given how it's built now, it won't be an easy change to implement. But considering that this has been a highly requested feature throughout the ecosystem community for a number of years - is this purely a "we won't do it cause we can" thing, or..? @mitchellh @armon - we've all read the rules for suggestions/additions to the core. Why does everything but this get priority? I can't imagine you don't see a dev-side benefit to this. Thank you for reading and hopefully providing an up-to-date viewpoint.

@pccowboy

While it is not yet resolved, I'd say there is a much better chance of a solution here: opentofu/opentofu#1113.

fwiw, I've tried to get PRs accepted into terraform for other issues, and been met with an unacceptable amount of resistance for simple fixes. This issue is as dead a horse as they get.

@argium

argium commented Feb 27, 2024

Hello, provisioning a resource and then requiring a provider to configure said resource is not an uncommon pattern. For example, creating a Databricks workspace and then using its ID to configure the Databricks provider. I was surprised to learn today that this is not supported. This is a glaring oversight. I'm shocked that this is still unresolved after 4 years.

@melissaberooks

Hello, I wanted to add my Azure use case to further illustrate that this is not an AWS specific need, nor is it specific to iterating over regions. There are many use cases in Azure where you would want to iterate over multiple subscriptions, while maintaining a default "Management" subscription. Here is my latest use case...

System Topics and System Topic Subscriptions have to reside in their respective subscriptions (the scope they are monitoring), but the endpoint that the system topic subscription calls does not.
While I can decouple the azurerm_eventgrid_system_topic deployment, the system topic's event subscription (azurerm_eventgrid_system_topic_event_subscription) cannot be decoupled from its endpoint deployment.

All endpoints (webhooks in this example) need to be deployed to my management subscription, while all associated system topics need to be deployed to their respective subscriptions.

# Default provider
provider "azurerm" {
  alias = "management_subscription"
  subscription_id = "xxx-xxx-xxx"
  features {}
}

# Other providers
provider "azurerm" {
  alias = "subscription_1"
  subscription_id = "xxx-xxx-xxx"
  features {}
}

provider "azurerm" {
  alias = "subscription_2"
  subscription_id = "xxx-xxx-xxx"
  features {}
}

locals {
  scopes = {
    "subscription_1" = {
      display_name = "subscription_1"
      runbook      = "subscription_1_runbook"
      resource_id  = "/subscriptions/xxx-xxx-xxx"
    },
    "subscription_2" = {
      display_name = "subscription_2"
      runbook      = "subscription_2_runbook"
      resource_id  = "/subscriptions/xxx-xxx-xxx"
    }
  }
}

resource "azurerm_automation_runbook" "runbook_example" {
  for_each                = local.scopes
  name                    = each.value.runbook
  location                = var.location
  resource_group_name     = module.rgp.name
  automation_account_name = "automationaccount"
  log_verbose             = "true"
  log_progress            = "true"
  runbook_type            = "PowerShell72"
  content                 = file("${path.module}/foldername/${each.value.runbook}.ps1")

}

# The endpoint is an automation webhook, and all reside on the same automation account in my management subscription. 
resource "azurerm_automation_webhook" "webhook_example" {
  for_each                = local.scopes
  name                    = each.value.display_name
  resource_group_name     = module.rgp.name
  automation_account_name = "automationaccount"
  expiry_time             = "2025-12-31T00:00:00Z"
  enabled                 = true
  runbook_name            = each.value.runbook

  depends_on = [
    azurerm_automation_runbook.runbook_example
  ]
}

resource "azurerm_eventgrid_system_topic_event_subscription" "system_topic_subscription_example" {
  for_each             = azurerm_automation_webhook.webhook_example
  provider             = azurerm.${each.value.display_name}  ### Problem line
  name                 = "${each.value.display_name}-stsub"
  system_topic         = "${each.value.display_name}-st"
  resource_group_name  = "system-topic-rgp"
  included_event_types = ["Microsoft.Resources.ResourceWriteSuccess", "Microsoft.Resources.ResourceDeleteSuccess", "Microsoft.Resources.ResourceActionSuccess"]

## I cannot decouple the azurerm_eventgrid_system_topic_event_subscription deployment from the webhook deployment, because I need to input the URI. 
  webhook_endpoint {
    url = each.value.uri
  }
}

Looking forward to any updates on this issue. Thank you!

@DavidGamba

provisioning a resource and then requiring a provider to configure said resource is not an uncommon pattern

This can be accomplished by splitting into two separate stack components, with the outputs of one feeding into the inputs of the other. It's the same pattern used when provisioning a k8s cluster in one component and configuring it in the next.
I have an open source take at stacks here[1] that can easily run this in a single command while keeping the 2 state files.
To destroy without orphaning resources, you just need to reverse the graph.
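
As a minimal sketch of that split (hypothetical paths, a local backend, and a cluster component that exports host and cluster_ca_certificate outputs), the second component reads the first one's outputs and only then configures the dependent provider:

# Component 2: read component 1's outputs from its state, then configure
# the provider that depends on the cluster existing.
data "terraform_remote_state" "cluster" {
  backend = "local"
  config = {
    path = "../cluster/terraform.tfstate" # hypothetical
  }
}

provider "kubernetes" {
  host                   = data.terraform_remote_state.cluster.outputs.host
  cluster_ca_certificate = data.terraform_remote_state.cluster.outputs.cluster_ca_certificate
}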

I also described using a provider declaration inside a module here [2] with its caveats.

There are many use cases in Azure where you would want to iterate over multiple subscriptions, while maintaining a default "Management" subscription.

I would probably leverage workspaces here: each workspace has a single subscription plus the management one.
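
A minimal sketch of that workspace approach, with hypothetical subscription IDs keyed by workspace name:

locals {
  subscription_ids = {
    dev  = "11111111-2222-3333-4444-555555555555" # hypothetical
    prod = "66666666-7777-8888-9999-000000000000" # hypothetical
  }
}

# The per-workspace subscription is selected by the workspace name.
provider "azurerm" {
  features {}
  subscription_id = local.subscription_ids[terraform.workspace]
}

# The shared management subscription stays a fixed alias in every workspace.
provider "azurerm" {
  alias           = "management"
  features {}
  subscription_id = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" # hypothetical
}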

I have been able to use "workarounds" for many of these issues, but one that affects me to this day is the inability to upgrade the provider version in stages.
Even after splitting my code into multiple stack components and workspaces, all workspaces in a component use the same provider version, and you can't pass a local or a variable to the version string; so if you want to validate a newer version of the provider in a given workspace, you can't (unless, of course, you do a local plan/apply, but that is not sustainable).

If HashiCorp is planning to implement this feature, please consider allowing different provider versions, even if that means there is a single version for the entire workspace; in other words, allow variables in the version string.
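
To illustrate the constraint: version constraints live in required_providers and must be literal strings (the version below is just an example), so a var.* or local.* reference in that position is rejected.

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.100" # must be a literal; var.* or local.* is not allowed here
    }
  }
}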

[1] https://github.com/DavidGamba/dgtools/blob/master/bt/README.adoc
[2] #24476 (comment)

@npenin

npenin commented Apr 2, 2024

Really an edge (no pun intended) case: if you use AWS Lambda@Edge and want to configure the log groups in CloudWatch, you need to do it for each region where the Lambda might run (aws-lambda#265). In such a case, it would be great to loop over a list of regions and just populate all the log groups.

In addition, there is the issue of not being able to create providers in submodule files. Thus, we have to populate a list of providers in the parent module and pass them all to the submodule, and we still have to repeat ourselves for each provider variation!

@JHunsGlobal

Hi @apparentlymart,

The practical upshot of this is that further work on this feature couldn't begin until at least after Terraform 1.0 is released. Being realistic about what other work likely confronts us even after the 1.0 release, I'm going to hazard a guess that it will be at least a year before we'd be able to begin detailed design and implementation work for features in this family.

Based on your "at least a year" message from 2020, and given that the v1.0 release is complete, is there any chance this feature has worked its way up the priority list yet?

@apparentlymart
Member

This feature is blocked by there being no known design with which to implement it, not by prioritization.

We have attempted to solve it many times since before this issue was even open. There are significant design challenges to supporting both this and the related request of declaring provider configuration instances dynamically using something like for_each.

Although it's not the only problem, the most significant problem to be solved is what must happen when something exists in the remote system but no longer exists in the configuration. Terraform needs a valid and functioning provider configuration to plan and apply the "destroy" for those objects, and even in today's Terraform people often get themselves "trapped" by removing even a statically-declared-and-assigned provider configuration at the same time as removing the resources that belong to it.
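
To make that trap concrete, a small sketch with hypothetical names: deleting both of the blocks below in the same change leaves the bucket in state with no aws.eu configuration left to plan its destroy, so the safe sequence is to remove the resource first, apply, and only then drop the provider block.

provider "aws" {
  alias  = "eu"
  region = "eu-west-1"
}

resource "aws_s3_bucket" "eu_logs" {
  provider = aws.eu
  bucket   = "example-eu-logs" # hypothetical
}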

Every time we've tried a prototype of making it more dynamic, that problem arises almost immediately, because being able to dynamically remove provider instances and resource instances together is an unspoken assumption in this feature request.

The new language for Terraform Stacks will include a fresh attempt at solving this that benefits from the fact that the Stacks language is a bit of a fresh start not so constrained by existing Terraform behavior, but even there we don't yet have a completely satisfactory solution to the above problem. It is an area of ongoing research.

If the design for Stacks is ultimately successful then some form of it might make it into the traditional Terraform language, but with what we know right now it seems likely to require at least some small breaking changes to the language to make it work.

@bcsgh

bcsgh commented Jun 18, 2024

I would suggest for consideration a slightly weakened set of requirements:

  1. It must be possible to define a configuration that allows adding and removing modules and providers using for_each (even if how a user can implement that is somewhat constrained).
  2. An "incorrect" implementation should refuse to apply if that would modify the state into a condition that requires "out of band" intervention to escape. ("That doesn't work, revert the .tf files and try something else" should always be an option.)
  3. Don't require that removing a provider always be accomplished in a single apply. (People might be mildly annoyed at having to decouple the set of modules and the providers they use so they can apply removing a module and then later apply removing the providers, but they are going to be more annoyed (and more likely to make errors) if they can't use for_each at all and basically have to do the same sequence of applies but by hand.)

As long as the user has reasonable confidence they aren't going to walk off a cliff (no. 2), I suspect dealing with a less-than-ideal implementation (i.e. no. 1 and no. 3) is something users would likely prefer over no implementation at all. (Heck, no. 1 is already a fact of life in some parts of TF, in that the system forces the user to do things TF's way rather than the way the user wants.)

Regarding no. 3, the set of things that TF can do expands significantly if you are willing to accept "apply in a loop until you reach a fixed point" as normal practice. (I wonder what fraction of people already do that anyway, just to be sure things worked correctly?) For example, my current workaround for this missing functionality is to use a local_file resource to generate part of the .tf config that is then evaluated on the next apply.

@stevehipwell

Why can't the removal of a provider follow one of these two behaviours?

  1. Behave as if the resources had been referenced by a removed block (see the sketch after this list)
  2. Keep the resources in state but marked as orphaned, where they show up and can be resumed if the provider is re-added (or they can be explicitly removed)
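
For reference on option 1, a minimal sketch of the removed block introduced in Terraform 1.7 (the resource address is hypothetical):

removed {
  from = aws_s3_bucket.eu_logs # hypothetical address

  lifecycle {
    # true plans a destroy for the remote object; false only forgets it
    # from state without destroying it.
    destroy = false
  }
}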

In addition to this it'd be great to be able to configure a provider (static or in a for_each) as disabled and to then either delete all resources or remove all resources (or orphan all resources if option 2 above was implemented). This would make the whole provider removal process significantly simpler and would allow the provider block to remain but not be loaded once there are no longer any resources in the state (single apply removal).

provider "random" {
  disabled {
    mode = "delete"
  }
}

@tmccombs
Contributor

What if the alias mappings were stored in the state? Then, if you remove a provider from the for_each, Terraform can still figure out which provider was used to create the resources, as long as that provider still exists at the top level.

@bcsgh

bcsgh commented Jun 21, 2024

... as long as that provider still exists at the top level.

I suspect that the cases that don't already work and would naturally fall under your "as long as" are rare. For one thing, I think some form of what you proposed would already be needed to support the case where someone deletes a module that uses aliases.

That said, I can think of a few ways to contort configs that would normally lose the provider before the resources are deleted so that they pin it for just long enough to satisfy that "as long as" requirement.

@tmccombs
Contributor

tmccombs commented Jul 2, 2024

I suspect that the cases that don't already work and would naturally fall under your "as long as" are rare

Well, you could temporarily keep the provider definition, but remove its usage from the module.

@stevehipwell

In addition to my previous comment, some of this pain was introduced or increased by the decision to artificially separate the provider architecture from the reality of how identities can be used. For example, an AWS provider can be authenticated to access multiple regions, but the provider artificially constrains this to a single region and requires multiple provider instances to be created.

So in addition to being able to use for_each for completely separate provider instances, a significant number of use cases (e.g. AWS) would be solved if providers had something along the lines of an override for aliases, where there is still a single provider configuration but with fewer constraints (see below).

locals {
  aws_provider_regions = [
    "us-east-1"
    "eu-west-1"
  ]
}

provider "aws" {
  override = [ for region in local.aws_provider_regions : { alias = region, region = region}]
}

resource "aws_s3_bucket" "example" {
  for_each = toset(local.aws_provider_regions)
  provider = aws.default[each.value]

  bucket = "my-tf-test-bucket-{each.value}"
}

As this is still a single provider configuration, which should be tied to a single identity, it shouldn't require any architectural changes as the provider removal pattern doesn't change (the overrides would need to be stored in state to support removal). All it's intended to do is to support replacing a finite number of provider values which can override values used in each resource call.

This could also be implemented to work without the upcoming changes to support known values in the provider config, by exposing the override aliases on the provider (e.g. for_each = aws.default.override_aliases), but I think the pattern I showed above would be more ergonomic to use.

@apparentlymart
Member

I remember that the AWS provider team was at one point investigating the possibility of changing to a posture similar to some other providers where the provider configuration's region is taken as a default region to use for newly-created objects but that each resource type would also track the region separately for each object, and thus there would be the option to override the provider-config-selected default on a per-region basis.

I don't know what the outcome of that was, but I expect it would be a huge effort to retrofit every resource type in that provider and so unfortunately while I agree that it would have been better for the provider to have been designed that way in the first place -- and for the general rule to be that provider configurations focus on caller identity rather than object location -- we probably need to find a way to make this work with the provider we have.

Of course, we also don't yet know how to retrofit this idea into Terraform Core. We are using the relatively-more-greenfield new language for the forthcoming Terraform Stacks as a test bed for making this work -- in Stacks each instance of a component can have a separate provider configuration, which is the same idea as this issue but at a coarser level of abstraction. If that design proves successful then some form of it can probably be brought into the traditional Terraform language too, but it seems likely to require some technically-breaking changes to the language, so we need to make sure the design seems right in a less-constrained setting before embarking on that.

@stevehipwell

@apparentlymart my suggestion above wouldn't require any resource-level changes, and only minimal language changes for configuring and using the overridden providers; these should be non-breaking if not used and should require explicit opt-in on a per-provider basis. Terraform would then be responsible for duplicating providers based on the default provider configuration and the overrides; but once the providers were configured, everything would work as if the providers were explicitly defined.

@apparentlymart
Member

Hi @stevehipwell,

The unanswered questions here are not really about syntax but rather about how Terraform should react to that syntax and how to introduce it without breaking existing working modules.

There are various assumptions in Terraform's current design and implementation that make this difficult, including but not limited to:

  • Terraform needs to have a provider configuration in order to plan to destroy and destroy an object, and it currently solves that by remembering in the state the address of the provider configuration assigned to each resource.

    This is not tracked on a per-instance basis, but on a whole-resource basis. It could potentially be changed to track on a per-instance basis instead, but that is a significant change to both the state snapshot format and the various code in Terraform that works in terms of that format.

  • Related to the previous point, most intentions for this issue also include the ability to declare provider configurations themselves dynamically. For example: given a set of AWS regions provided dynamically, declare a separate AWS provider configuration instance and a separate resource instance and connect them together.

    This means that if an element is removed from that set of AWS regions, the provider configuration instance and the resource instance are effectively removed at the same time, and so Terraform no longer has the provider configuration to use to destroy the removed resource instance.

  • Since long ago Terraform has used a different syntax for referring to provider configurations than for referring to values, which works only because the two namespaces are currently totally separate: there's never any case where an expression includes both value expressions and provider references at the same time.

    Most hypothetical designs for this use-case want to make provider be treated as a value expression and to then introduce a new type of value that represents a reference to a provider configuration, which requires a change of syntax. provider.aws.foo to refer to a provider "aws" block with alias = "foo" is a plausible syntax for that -- and is the syntax that the stacks language uses -- but in today's Terraform provider.aws.foo means the foo attribute of the singleton instance of the resource "provider" "aws" block, and so redefining that is technically a breaking change.

    (I say "technically" because it does seem unlikely that someone has made a provider that's literally called "provider", offering a resource type that's also literally called "provider", but nonetheless we cannot see into every organization's private providers to know for sure, and so our compatibility promises call for us to be conservative.)

The most recent idea in comments above seems to hit all three of these challenges, and so it is not a sufficient solution.

I'm pretty sure I've already made variations of these points in various comments upthread here and in the related issues, but I'm restating this here just so that it's all aggregated together at the (current) end of the discussion. While I do appreciate the enthusiasm in proposing potential solutions, I'd ask participants to recognize that if this were a straightforward problem it would likely have been solved already, and that it remains unsolved because various previous attempts to solve it have been incomplete or unsatisfactory.

As I mentioned in my previous comment, we are using the relatively clean slate of the new stacks language to cancel out some of the existing constraints so that we can apply some of the earlier design ideas. Of course that doesn't free us from having to still navigate the same constraints in the traditional Terraform language, but since it seems like the only path forward is making a breaking change we'd prefer to get experience with the new design in a less risky setting first and then, if successful, figure out what is the smallest breaking change we could make to get there and what strategy we ought to follow to navigate that breaking change without splitting the ecosystem.

It's highly unlikely that any single comment on this issue will be sufficient to allow this issue to proceed immediately.

@tmccombs
Contributor

tmccombs commented Jul 4, 2024

There is an open PR to allow specifying the region per resource. But it hasn't gotten a lot of activity recently.

@stevehipwell
Copy link

@apparentlymart thanks for replying, I know there is a lot of nuance I'm missing as I'm only looking at this from an external position. For context, I have a requirement to configure an unknown number of different AWS accounts from a single Terraform workspace using a single IAM identity; the imperative (bash) version of this is trivial, but the declarative (Terraform) version is impossible.

I think supporting a better provider removal pattern (I suggested one here) would be beneficial in all cases, but I'm not sure it's a prerequisite for a pattern that overrides a single provider. The point of configuring a single provider with overrides is that at the abstract architectural level you've added capabilities without requiring new components. I understand that there are language limitations, and thank you for clarifying some of them above, but conceptually this pattern would be a generic version of the AWS PR that @tmccombs linked above but without requiring each supporting resource to be modified.

Would this updated language pattern be a better fit? The resource provider_override could be added to another block such as lifecycle if that was more idiomatic.

locals {
  aws_provider_regions = [
    "us-east-1"
    "eu-west-1"
  ]
}

provider "aws" {
  overrides = { for region in local.aws_provider_regions : region => { region = region } }
}

resource "aws_s3_bucket" "example" {
  for_each = toset(local.aws_provider_regions)

  provider_override = each.value

  bucket = "my-tf-test-bucket-{each.value}"
}

I think the actual implementation here could follow one of two patterns:

  1. Pre-process into multiple providers and then use the current behaviour as is. The overrides would need to be stored in state to allow for removal.
  2. Require the providers to implement override logic and add the override data to the generic resource (and data source) interface. The overrides would probably not need storing in the state as they'd be part of the resource (e.g. AWS region).

Option 1 looks to be the least impactful, assuming it is possible to pre-process the Terraform config, but option 2 would probably be a better long term pattern. I think the thing here is that they're not mutually exclusive, if the language support was added then all providers could easily opt in to option 1 (specify overridable config fields) while some providers could take further steps to use option 2 if it was implemented by Terraform at a future point in time.

@sbwise01

sbwise01 commented Jul 5, 2024

@apparentlymart

Per the assumption you provided:

Related to the previous point, most intentions for this issue also include the ability to declare provider configurations themselves dynamically.

I don't think that is accurate. Most solutions provided have recognized the need to statically declare providers in advance of using them to implement a resource or module. The underlying problem being discussed is how to dynamically reference an individual provider from a static set of them when implementing a resource, and the syntax currently does not provide for doing so.
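
In other words, something like the following, where the provider configurations are static but the reference inside the module block is driven by each.key, is what the syntax rejects today (the module path is hypothetical):

provider "aws" {
  alias  = "use1"
  region = "us-east-1"
}

provider "aws" {
  alias  = "euw1"
  region = "eu-west-1"
}

module "vpc" {
  for_each = toset(["use1", "euw1"])
  source   = "./modules/vpc" # hypothetical

  # Not valid today: the providers map only accepts static provider
  # references, not an expression built from each.key.
  providers = { aws = aws[each.key] }
}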

@bcsgh

bcsgh commented Jul 6, 2024

@sbwise01

Most solutions provided have recognized the need to statically declare providers in advance of using them to implement a resource or module. The underlying problem being discussed is how to dynamically reference an individual provider from a static set of them [...]

I beg to differ. In the case where I'd want to dynamically select a provider to pass to a module for_each, I'd basically always want to dynamically declare the providers using the same for_each source the modules come from.

In the case promoting my interest in this issue, that is exactly what I do, just via the back door of manually expanding the "for_each" I can't actually do and dumping it into the current directory as a .tf file via a local_file resource:

locals {
  projects = {
    "p1": "us-west-2",
    "p2": "us-west-2",
    "p3": "us-east-1"
  }
  //////
  zones = toset([for _, z in local.projects: z])
  packed = {
    for t in local.zones:
    t => [for p, z in local.projects: p if t == z]
  }
}

resource "local_file" "provider_for_each" {
  filename = "./gen.tf"
  file_permission = "0600"
  content  = join("\n", flatten([
    [for z in local.zones: <<EOT
provider "aws" {
  alias = "${z}"
  region = "${z}"
}
EOT
],

    [for z in local.zones: <<EOT
module "zone-${z}" {
  providers = { aws = aws.${z} }
  source = "../zone/"
}
EOT
],

    [for z, v in local.packed: <<EOF
module "app-${z}" {
  for_each = toset(["${join("\", \"", v)}"])
  providers = { aws = aws.${z} }
  source = "../app"
  aws_zone = "${z}"
}
EOF
]
]))
}
  • Yes, I know this is a nasty hack.
  • Yes, I know this doesn't converge in a single apply
  • Yes, I know this will demand some care when removing the last project in a given zone.
  • No, I don't care about those issues because it eventually converges to the correct state.
