Study Databricks DE-ASSOC Medallion Architecture: key concepts, common traps, and exam decision cues.
This lesson covers one of the cleanest conceptual objectives on DE-ASSOC: the three layers of the medallion architecture. The exam uses these layers to test whether you know where raw landing ends, where cleaned and conformed data begins, and where business-ready outputs belong.
Bronze: Raw landing layer that prioritizes fidelity and replayability.
Silver: Cleaned, validated, joined, and conformed data layer used for reliable downstream work.
Gold: Curated business-facing or consumption-ready output layer.
Medallion questions are really about data responsibility, not just folder names. The exam wants to know whether you can tell which layer owns a given task:
| Layer | Main purpose | What usually belongs there |
|---|---|---|
| Bronze | Keep raw or lightly normalized intake durable and replayable | landed source data, minimal cleaning, ingestion metadata |
| Silver | Clean and standardize data for trustworthy reuse | deduped records, joins, conforming logic, quality checks |
| Gold | Serve business-facing outputs and aggregated views | marts, KPIs, analytics-ready tables, curated facts and dimensions |

| If the task is mainly about… | Strong layer |
|---|---|
| preserving source fidelity and reprocessing options | bronze |
| deduplicating, validating, and joining reusable records | silver |
| exposing curated metrics or dimensional outputs to consumers | gold |
| storing late raw-arrival metadata or ingestion audit columns | bronze or early silver, depending on whether the logic is still source-oriented |
The medallion model is not only about storage location. It is about responsibility:
If a stem asks where heavy business aggregation belongs, gold is often the lane. If it asks where you standardize and deduplicate records, silver is usually the lane. If it asks where you keep source fidelity and replay options, bronze is usually the lane.
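To make the bronze and silver responsibilities concrete, here is a minimal plain-Python sketch. It is a stand-in for illustration only, not Spark or Delta Lake code, and the function names (`land_bronze`, `build_silver`), field names, and sample payloads are all hypothetical.

```python
import json
from datetime import datetime, timezone


def land_bronze(raw_lines, source_name):
    """Bronze: keep each payload verbatim and attach ingestion metadata only."""
    ingested_at = datetime.now(timezone.utc).isoformat()
    return [
        {"raw_payload": line, "source": source_name, "ingested_at": ingested_at}
        for line in raw_lines
    ]


def build_silver(bronze_rows, normalize_id):
    """Silver: parse, validate, standardize, and deduplicate on order_id."""
    by_order_id = {}
    for row in bronze_rows:
        try:
            record = json.loads(row["raw_payload"])
        except json.JSONDecodeError:
            continue  # malformed input is skipped here but still kept in bronze
        if not isinstance(record, dict):
            continue
        if "order_id" not in record or "customer_id" not in record:
            continue  # fails the (hypothetical) required-field check
        record["customer_id"] = normalize_id(record["customer_id"])
        by_order_id[record["order_id"]] = record  # last occurrence wins
    return list(by_order_id.values())


# Hypothetical sample input: one duplicate event and one malformed line.
raw = [
    '{"order_id": 1, "customer_id": " c-7 "}',
    '{"order_id": 1, "customer_id": " c-7 "}',
    "not json",
]
bronze = land_bronze(raw, "orders_api")

# Two silver builds from the same bronze rows, using different cleaning rules.
silver_v1 = build_silver(bronze, str.strip)
silver_v2 = build_silver(bronze, lambda s: s.strip().upper())
```

Because bronze stores the payloads unchanged, a revised cleaning rule only requires rerunning `build_silver` over the same bronze rows. Nothing has to be fetched from the source again, which is the replayability the bronze layer is meant to protect.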
A team ingests raw order events, removes duplicates, standardizes customer identifiers, then builds daily revenue marts for finance dashboards. Which medallion path is strongest?
Correct answer: B. The workflow moves from raw fidelity, to reusable cleaned data, to business-facing consumption.
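The final gold step of this scenario might look like the following plain-Python sketch. It assumes the silver layer has already produced deduplicated orders with standardized customer identifiers. The function name `build_daily_revenue`, the field names, and the sample rows are hypothetical, and a real pipeline would use Spark or SQL rather than in-memory lists.

```python
from collections import defaultdict
from decimal import Decimal


def build_daily_revenue(silver_orders):
    """Gold: aggregate cleaned orders into one revenue row per day."""
    totals = defaultdict(Decimal)  # Decimal() starts each day at 0
    for order in silver_orders:
        totals[order["order_date"]] += Decimal(order["amount"])
    return [
        {"order_date": day, "revenue": totals[day]}
        for day in sorted(totals)
    ]


# Hypothetical silver rows: already deduplicated and standardized.
silver_orders = [
    {"order_id": 1, "customer_id": "C-7", "order_date": "2024-01-01", "amount": "19.99"},
    {"order_id": 2, "customer_id": "C-9", "order_date": "2024-01-01", "amount": "5.01"},
    {"order_id": 3, "customer_id": "C-7", "order_date": "2024-01-02", "amount": "12.50"},
]

daily_revenue = build_daily_revenue(silver_orders)
```

The aggregation contains no deduplication or identifier cleanup of its own. Keeping that logic in silver lets other gold outputs reuse the same trusted records.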