Build Data Catalog configurations with data sources, harvest schedules, and glossary terms.
Last verified: May 2026
Required Fields
compartmentId
displayName
dataSources

Your team is building a data lake but has no central metadata view, so analysts hunt for data via Slack DMs and tribal knowledge. The builder generates a Data Catalog instance, automated harvesting from 5 Object Storage buckets and 3 Autonomous DBs, a business glossary with terms like 'customer', 'order', and 'transaction' that link technical assets to business concepts, and custom properties for sensitivity classification. After 3 months, the catalog has 10K+ assets cataloged and tagged. New analyst onboarding time drops from 2 weeks to 2 days because they search the catalog instead of asking around for 'where is the customer data'.
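The three required fields (compartmentId, displayName, dataSources) can be checked before any output is generated. The following is a minimal sketch of such a check, not the builder's actual code; the function name, the config shape, and the placeholder OCID are assumptions made for illustration.

```python
# Hypothetical pre-flight check for the builder's required fields.
REQUIRED_FIELDS = ("compartmentId", "displayName", "dataSources")


def missing_required_fields(config: dict) -> list[str]:
    """Return the required field names that are absent or empty in config."""
    return [name for name in REQUIRED_FIELDS if not config.get(name)]


config = {
    "compartmentId": "ocid1.compartment.oc1..example",  # placeholder, not a real OCID
    "displayName": "analytics-catalog",
    "dataSources": [],  # an empty list is treated as missing
}

print(missing_required_fields(config))  # ['dataSources']
```

Treating an empty dataSources list as missing is a design choice in this sketch: a catalog with nothing to harvest is valid in principle, but it is probably not what the person filling in the form intended.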
OCI Data Catalog is a metadata management service that helps you discover, organize, and govern data assets across OCI and external data sources. It automatically harvests metadata from Object Storage, Autonomous Database, MySQL HeatWave, and other data stores, creating a searchable catalog with business glossaries, tags, and data lineage. This builder helps you configure Data Catalog instances with data asset connections, harvesting schedules, custom metadata properties, and glossary structures.
The builder constructs OCI Data Catalog configurations from these pieces: a catalog instance resource (scoped to a compartment), data assets (connections to Object Storage, Autonomous Database, MySQL, or external sources), harvest jobs (with a schedule for incremental updates), folders for organization, business glossaries (terms, categories, relationships), custom properties (for classification, tagging, ownership), and IAM policies for catalog access. Output is generated as oci data-catalog CLI commands and as Terraform oci_datacatalog_catalog and oci_datacatalog_data_asset resources.
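To make the Terraform side of the output concrete, here is a rough sketch of how the two resource blocks named above might be rendered as text. The helper functions are hypothetical. The argument names (compartment_id, display_name, catalog_id, type_key) reflect my understanding of the OCI Terraform provider and should be checked against its documentation, and the type_key value is a placeholder that has to be replaced with the key of a real data asset type from your catalog.

```python
def render_catalog_hcl(name: str, compartment_id: str, display_name: str) -> str:
    """Render a minimal oci_datacatalog_catalog block as HCL text."""
    return (
        f'resource "oci_datacatalog_catalog" "{name}" {{\n'
        f'  compartment_id = "{compartment_id}"\n'
        f'  display_name   = "{display_name}"\n'
        f'}}\n'
    )


def render_data_asset_hcl(name: str, catalog_name: str,
                          display_name: str, type_key: str) -> str:
    """Render a minimal oci_datacatalog_data_asset block that references the catalog."""
    return (
        f'resource "oci_datacatalog_data_asset" "{name}" {{\n'
        f'  catalog_id   = oci_datacatalog_catalog.{catalog_name}.id\n'
        f'  display_name = "{display_name}"\n'
        f'  type_key     = "{type_key}"\n'  # placeholder: look up the real type key
        f'}}\n'
    )


catalog_hcl = render_catalog_hcl(
    "main", "ocid1.compartment.oc1..example", "analytics-catalog"
)
asset_hcl = render_data_asset_hcl(
    "lake_bucket", "main", "raw-data-lake", "REPLACE_WITH_TYPE_KEY"
)
print(catalog_hcl + "\n" + asset_hcl)
```

The data asset refers to the catalog through a Terraform resource reference rather than a hard-coded OCID, so Terraform can order the two resources correctly on apply.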
Always start with automated metadata harvesting from existing data sources — Object Storage, Autonomous Database, MySQL HeatWave. Manual catalog entry doesn't scale. The harvester discovers schemas, file formats, and column statistics automatically; you add business context (glossary terms, classifications) on top of the auto-discovered metadata.
Custom metadata properties + tags are how you operationalize data classification. Tag every asset with sensitivity (public, internal, confidential, restricted) — this becomes the foundation for IAM policies that restrict access based on classification. Without classification, you can't enforce 'restricted data access requires elevated approval'.
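The four sensitivity levels form an ordering, and that ordering is what makes a rule like "restricted data requires elevated approval" checkable. The sketch below illustrates only that logic. It is not OCI IAM policy syntax, and the Sensitivity enum and both helper functions are hypothetical names introduced for the example.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Classification levels from the text, ordered least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


def can_access(clearance: Sensitivity, asset_tag: Sensitivity) -> bool:
    """A reader may see an asset only if their clearance is at least the asset's tag."""
    return clearance >= asset_tag


def needs_elevated_approval(asset_tag: Sensitivity) -> bool:
    """Only the top classification triggers the extra approval step."""
    return asset_tag is Sensitivity.RESTRICTED


print(can_access(Sensitivity.INTERNAL, Sensitivity.CONFIDENTIAL))  # False
print(needs_elevated_approval(Sensitivity.RESTRICTED))             # True
```

An untagged asset has no place in this ordering, which is why the rule cannot be enforced until every asset carries a classification.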
Data lineage tracking is the killer feature for impact analysis. When a source schema changes, the catalog shows every downstream asset that depends on it. Without lineage, schema changes become 'hope nothing breaks' deployments. With it, you have a clear blast radius for any change.
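The "blast radius" idea amounts to a graph traversal: starting from the changed source, follow lineage edges until nothing new is reached. The following sketch assumes lineage has already been exported into a plain dictionary that maps each asset to its direct downstream consumers. The asset names are invented, and the code does not call the Data Catalog lineage API.

```python
from collections import deque


def downstream_assets(lineage: dict[str, list[str]], source: str) -> set[str]:
    """Breadth-first search for every asset that depends, directly or not, on source."""
    seen: set[str] = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            # Skip assets already visited (and the source itself) so cycles terminate.
            if child != source and child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical lineage: asset -> assets that read from it.
lineage = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["mart.revenue", "mart.customer_ltv"],
    "mart.revenue": ["dashboard.finance"],
}

print(sorted(downstream_assets(lineage, "raw.orders")))
# ['dashboard.finance', 'mart.customer_ltv', 'mart.revenue', 'staging.orders']
```

Here a change to raw.orders touches four downstream assets, including a dashboard two hops away that a manual review could easily miss.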
Disclaimer: This tool runs entirely in your browser. No data is sent to our servers. Always verify outputs before using them in production. AWS, Azure, and GCP are trademarks of their respective owners.