Databricks Unity Catalog support #10115

Open
villepuntanen opened this issue Mar 15, 2023 · 13 comments
Labels: documentation (Improvements or additions to documentation), enhancement (New feature or request), Needs: Upvote (This issue requires more votes to be considered), story: extensibility

Comments

@villepuntanen

The recommended pattern for using Azure Databricks is to have Unity Catalog set up for management and governance of the Databricks environment, and as a prerequisite for several features.

Currently there is support for setting up Unity Catalog with Terraform (Link to docs), but the same would be needed in Bicep.

@leinoaa

leinoaa commented Mar 15, 2023

This would be a great addition!

@alex-frankel
Collaborator

It looks like these Databricks resources are not ARM control plane resources. In Terraform, they are managed through a dedicated Databricks provider, so this would only be possible if someone builds a Databricks provider for Bicep via Bicep extensibility.

@alex-frankel added the Needs: Upvote and story: extensibility labels and removed the Needs: Triage 🔍 label on Mar 16, 2023
@aucampia

Related: #9967

You are going to keep getting this question again and again until the inaccuracies in the documentation are fixed. Selling Bicep as "Day 0 resource provider support. Any Azure resource — whether in private or public preview or GA — can be provisioned using Bicep." and then clarifying later that "any Azure resource" actually means only some of them is not really an acceptable thing to be doing.

It is misleading and results in people making decisions based on false information.

@aucampia

> so this would only be possible if someone builds a databricks provider for bicep via bicep extensibility

Do you have any documentation on how to do something like this?

@alex-frankel
Collaborator

For this to happen in the short term, we would need the team that manages the Databricks RP to implement it since we are only supporting first-party maintained providers for now.

@villepuntanen
Author

> > so this would only be possible if someone builds a databricks provider for bicep via bicep extensibility
>
> Do you have any documentation on how to do something like this?

Hi, I've been looking into this a bit. Here's one example of a provider that uses the extensibility framework:
https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/bicep-extensibility-kubernetes-provider

@jikuja

jikuja commented Mar 21, 2023

> You are going to keep getting this question again and again until the inaccuracies in the documentation are fixed. Selling Bicep as "Day 0 resource provider support. Any Azure resource — whether in private or public preview or GA — can be provisioned using Bicep." and then clarifying later that "any Azure resource" actually means only some of them is not really an acceptable thing to be doing.
>
> It is misleading and results in people making decisions based on false information.

https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/control-plane-and-data-plane should help with understanding the difference between the Azure control plane and data plane.

@aucampia

aucampia commented Apr 17, 2023

> https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/control-plane-and-data-plane should help understanding Azure control and data plane differences

Can you clarify how this will help anyone do declarative management of Azure Databricks resources with Bicep? If this is not something that is in scope of Bicep it is best to just close this issue as wont-fix.

I don't think trying to reclassify the Azure Databricks control plane [ref] as a data plane is very productive or addresses any of the problems that anyone has. The core issue is that there are Azure Databricks resources that can be declaratively managed using other declarative resource management tools, that should be declaratively managed if modern engineering practices are followed, and that there is an expectation that the same is supported by Bicep, as Bicep says on the tin "Day 0 resource provider support. Any Azure resource — whether in private or public preview or GA — can be provisioned using Bicep".

Even if you somehow convinced people that the Azure Databricks control plane [ref] is not a control plane, they would still want to declaratively manage Azure Databricks resources, and they would still expect a tool that says "Day 0 resource provider support. Any Azure resource — whether in private or public preview or GA — can be provisioned using Bicep" to offer them declarative management of Azure Databricks resources.

@jikuja

jikuja commented Apr 17, 2023

> Can you clarify how this will help anyone do declarative management of Azure Databricks resources with Bicep?

That link is reference documentation for understanding what counts as control plane and what counts as data plane for ARM resources.

> I don't think trying to reclassify the Azure Databricks control plane [ref] as a data plane is very productive or addresses any of the problems that anyone has.

Well, it is a data plane from ARM's point of view. The Databricks documentation should probably mention that it is a Databricks-specific control plane that is not available via ARM.
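To make the distinction concrete, here is a minimal sketch. It assumes the public-cloud ARM endpoint `management.azure.com`; the subscription ID, resource group, workspace name, and workspace hostname are placeholders, and `is_arm_control_plane` is a hypothetical helper, not part of any Azure or Databricks SDK. The workspace itself is an ARM resource (`Microsoft.Databricks/workspaces`), while Unity Catalog objects are created through the workspace's own REST API, which ARM (and therefore Bicep without an extensibility provider) never sees.

```python
from urllib.parse import urlparse

# ARM endpoint for the Azure public cloud (sovereign clouds use other hosts).
ARM_HOST = "management.azure.com"


def is_arm_control_plane(url: str) -> bool:
    """Hypothetical helper: True when the request targets the ARM endpoint."""
    return urlparse(url).hostname == ARM_HOST


# Creating the workspace goes through ARM, so Bicep can deploy it.
# All IDs and names below are placeholders.
workspace_url = (
    "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/example-rg/providers/Microsoft.Databricks/workspaces/example-ws"
)

# Unity Catalog objects are created against the workspace's own REST API.
# The hostname is a placeholder for a per-workspace URL.
catalog_url = (
    "https://adb-0000000000000000.0.azuredatabricks.net"
    "/api/2.1/unity-catalog/catalogs"
)

for url in (workspace_url, catalog_url):
    plane = "ARM control plane" if is_arm_control_plane(url) else "outside ARM"
    print(f"{urlparse(url).hostname}: {plane}")
```

The second request is what a Databricks provider for Bicep would have to issue on the deployment's behalf.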

> they would still want to declaratively manage Azure Databricks resources

If you read @alex-frankel's message, you will have noticed that data planes will be handled at some point through Bicep providers.

> The core issue is that there are Azure Databricks resources that can be declaratively managed using other declarative resource management tools, that should be declaratively managed if modern engineering practices are followed,

I know, and I really hope it will be part of Bicep at some point. Setting up Databricks with scripts is not a good process.

> an expectation that the same is supported by Bicep, as Bicep says on the tin "Day 0 resource provider support. Any Azure resource — whether in private or public preview or GA — can be provisioned using Bicep".

I'm not sure how common this expectation is.

@aucampia

> > an expectation that the same is supported by Bicep, as Bicep says on the tin "Day 0 resource provider support. Any Azure resource — whether in private or public preview or GA — can be provisioned using Bicep".
>
> I'm not sure how common this expectation is.

It would be less common if your documentation was updated to clarify that it is talking about Azure RM resources, and that other resources are out of scope.

@stephaniezyen added the documentation label on Apr 17, 2023
@alex-frankel
Collaborator

We are going to get the docs updated this week to help clarify this more, but @jikuja is right: Databricks is a data plane from the perspective of ARM.

@aucampia, we would like to keep the issue open because the framework now exists to enable support for these scenarios. It is just a question of whether anyone has the capacity to implement it. If there are many others who want or need this, that will help us revisit the priority and get it done sooner. Right now, there is no commitment that the Databricks team will be able to get this done, so there are no ETAs we can provide.

@aucampia

> @aucampia, we would like to keep the issue open because the framework now exists to enable support for these scenarios.

Is there any documentation for this? I looked briefly at how the Kubernetes extension is implemented, but reverse engineering that to understand how to build one for Databricks would take more time than I have available. I would really like to get a better picture of what the capabilities are. For example, how is state managed? Do extensions require that state management is deferred? Will this integrate with deployment stacks at all?

@alex-frankel
Collaborator

There is no documentation because third parties cannot contribute their own providers at this point. The only teams that can resolve this for Databricks are the Databricks team or Microsoft.

There is no state management required and we do plan to integrate this with deployment stacks.
