⚠️ Beware: This is a community-maintained informal knowledge base.
- Refer to the Data Catalog documentation for the most up-to-date information.
- Is this repo useful? Please ⭑Star this repository and share the love.
- Curious about something? Open an issue; someone may be able to add it to the FAQ.
- Contribute if you learned something interesting about Data Catalog.
- Trouble using Data Catalog? Ask a question on Stack Overflow.
- Basics
- Open Source Connectors
- Cloud Data Loss Prevention (DLP)
- Filesets
- Samples
- Other Community Articles
Data Catalog is a fully managed and scalable metadata management service that empowers organizations to quickly discover, understand, and manage all their data. It offers a simple and easy-to-use search interface for data discovery, a flexible and powerful cataloging system for capturing both technical and business metadata, and a strong security and compliance foundation with Cloud Data Loss Prevention (DLP) and Cloud Identity and Access Management (IAM) integrations.
Official Docs - GA blog post Yes, Data Catalog became GA; check the blog post.
Community - Data Catalog Mental Model Yes, check this community blog post.
Official Docs - Search reference Check the official docs for search syntax.
Community - Data Catalog Search and lookup Check this community blog post for Python examples.
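For a quick feel of the search syntax before opening the links above, here is a minimal sketch, assuming the `google-cloud-datacatalog` Python client is installed and application default credentials are configured. The helper names `build_search_query` and `search_entries` are hypothetical and exist only for illustration; the qualifiers used (`name:`, `column:`, `type=`, `system=`) are a small subset of what the search reference describes.

```python
def build_search_query(name=None, column=None, entry_type=None, system=None):
    """Compose a Data Catalog search string from optional qualifiers.

    Hypothetical helper: it only concatenates qualifiers and does no escaping.
    """
    parts = []
    if name:
        parts.append(f"name:{name}")          # substring match on the entry name
    if column:
        parts.append(f"column:{column}")      # match on a column name
    if entry_type:
        parts.append(f"type={entry_type}")    # e.g. table, dataset
    if system:
        parts.append(f"system={system}")      # e.g. bigquery, pubsub
    # Whitespace-separated terms are combined with a logical AND.
    return " ".join(parts)


def search_entries(project_id, query):
    """Run the query against one project and return linked resource names."""
    # Imported lazily so the query builder works without the client library.
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()
    scope = datacatalog_v1.SearchCatalogRequest.Scope()
    scope.include_project_ids.append(project_id)
    results = client.search_catalog(request={"scope": scope, "query": query})
    return [result.linked_resource for result in results]
```

Something like `search_entries("my-project", build_search_query(name="orders", system="bigquery"))` would then list matching assets, where `my-project` is a placeholder project ID.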
Official Docs - Quickstart Tagging Check the official docs for a quickstart on how to Tag Assets.
Community - Data Catalog Templates and Tags Check this community blog post for Python examples.
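As a rough companion to the tagging links above, the sketch below follows the general shape of the official quickstart: create a Tag Template, look up a BigQuery table entry, and attach a Tag. It assumes the `google-cloud-datacatalog` Python client and valid credentials. The function names, the template ID `demo_tag_template`, and the `source` field are illustrative assumptions, not names taken from this repository.

```python
def bigquery_table_resource(project_id, dataset_id, table_id):
    """Build the linked resource name Data Catalog uses for a BigQuery table."""
    return (
        f"//bigquery.googleapis.com/projects/{project_id}"
        f"/datasets/{dataset_id}/tables/{table_id}"
    )


def create_template_and_tag(project_id, location, dataset_id, table_id, source_value):
    """Create a one-field Tag Template and tag a BigQuery table with it."""
    # Imported lazily so the pure helper above works without the client library.
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()

    # Define a template with a single string field (illustrative names).
    template = datacatalog_v1.TagTemplate()
    template.display_name = "Demo Tag Template"
    template.fields["source"] = datacatalog_v1.TagTemplateField()
    template.fields["source"].display_name = "Source of data asset"
    template.fields["source"].type_.primitive_type = (
        datacatalog_v1.FieldType.PrimitiveType.STRING
    )
    template = client.create_tag_template(
        parent=f"projects/{project_id}/locations/{location}",
        tag_template_id="demo_tag_template",
        tag_template=template,
    )

    # Find the catalog entry that corresponds to the BigQuery table.
    entry = client.lookup_entry(
        request={
            "linked_resource": bigquery_table_resource(project_id, dataset_id, table_id)
        }
    )

    # Attach a Tag based on the template to that entry.
    tag = datacatalog_v1.Tag()
    tag.template = template.name
    tag.fields["source"] = datacatalog_v1.TagField()
    tag.fields["source"].string_value = source_value
    return client.create_tag(parent=entry.name, tag=tag)
```

Creating the template fails if one with the same ID already exists in the project and location, so a real script would likely split template creation from tagging.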
Google Github - Datacatalog Connectors Yes, check the github repository for a list of open source connectors for non-GCP assets.
Google Github - Datacatalog Connectors BI Yes, check the github repository for sample code on how to ingest Looker assets.
Community - Google Cloud Data Catalog and Looker integration Check this community blog post for an overview of the Looker integration.
Google Github - Datacatalog Connectors BI Yes, check the github repository for sample code on how to ingest Tableau assets.
Community - Google Cloud Data Catalog and Tableau integration Check this community blog post for an overview of the Tableau integration.
Google Github - Datacatalog Connectors RDBMS Yes, check the github repository for sample code on how to ingest RDBMS assets.
Community - Google Cloud Data Catalog — Integrate Your On-Prem RDBMS Metadata Check this community blog post for an overview of the RDBMS integration.
Google Github - Datacatalog Connectors Hive Yes, check the github repository for sample code on how to ingest Hive assets.
Community - Google Cloud Data Catalog — Keep Up With Your On-Prem Hive Server Check this community blog post for an overview of the Hive integration.
Official Docs - Sending Cloud DLP scan results to Data Catalog Yes, check the official docs for instructions.
Community - Create Data Catalog tags by inspecting BigQuery data with Cloud Data Loss Prevention Yes, check this community tutorial for instructions.
Official Docs - Using Cloud Storage filesets Check the official docs.
Community - Google Cloud Data Catalog Filesets: unlock its full potential Check this community blog post if you want to enrich your filesets with Tags containing stats about your Cloud Storage files.
Community - Data Catalog Template examples Check this community GitHub repository if you are looking for guidance on which fields to use in your Templates. There's an option to create the sample Templates in your Project using the datacatalog-util Python package.
Community - Boosting the Data Governance journey with Google Cloud Data Catalog Check this community blog post on Data Governance and Data Catalog.
Community - Where is my data? The answer is Google Data Catalog Check this blog post, which explores the main features of Data Catalog.
Is your question not answered here? Open an issue and we will see if we can answer it.