Study Databricks DA-ASSOC Aggregations and Federated Queries: key concepts, common traps, and exam decision cues.
This lesson is where many candidates win or lose the exam. Databricks is not testing fancy syntax for its own sake. It is testing whether you can produce the right analytical result with the right join, grouping, set operation, and data-access pattern.

| Need | Best first answer |
|---|---|
| one result row per grouped business grain | aggregate at that grain |
| preserve all rows from a fact table while attaching optional lookup data | LEFT JOIN |
| combine same-shaped result sets while keeping duplicates | UNION ALL |
| combine same-shaped result sets while removing duplicates | UNION |
| find whether related records exist without bringing columns | semi-join style logic |
| query Databricks data together with an external governed source | federated query pattern |
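
The set-operation, semi-join, and grain rows above can be checked on a toy dataset. The sketch below is only an illustration: it uses Python's built-in `sqlite3` as a local stand-in for a Databricks SQL warehouse, and every table and column name (`web_orders`, `store_orders`, `customers`) is hypothetical. The SQL itself is plain ANSI-style syntax.

```python
import sqlite3

# Assumption: SQLite stands in for Databricks SQL; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE web_orders   (customer_id INTEGER, amount INTEGER);
    CREATE TABLE store_orders (customer_id INTEGER, amount INTEGER);
    CREATE TABLE customers    (customer_id INTEGER, name TEXT);
    INSERT INTO web_orders   VALUES (1, 50), (2, 30);
    INSERT INTO store_orders VALUES (1, 50), (3, 20);
    INSERT INTO customers    VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy'), (4, 'Dee');
""")

combined = (
    "SELECT customer_id, amount FROM web_orders "
    "{op} SELECT customer_id, amount FROM store_orders"
)

def count_rows(sql):
    """Count the rows produced by an arbitrary SELECT statement."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]

# UNION ALL keeps both (1, 50) rows; UNION collapses them into one.
union_all_rows = count_rows(combined.format(op="UNION ALL"))
union_rows = count_rows(combined.format(op="UNION"))

# Aggregate at the customer grain: one result row per customer.
totals = conn.execute(
    f"SELECT customer_id, SUM(amount) FROM ({combined.format(op='UNION ALL')}) "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()

# Semi-join style logic: test for existence without pulling order columns.
web_customers = [row[0] for row in conn.execute("""
    SELECT c.name
    FROM customers c
    WHERE EXISTS (SELECT 1 FROM web_orders w WHERE w.customer_id = c.customer_id)
    ORDER BY c.name
""")]

print(union_all_rows, union_rows, totals, web_customers)
```

If the two `(1, 50)` rows are separate real purchases, only UNION ALL gives customer 1 the correct total of 100; UNION would silently report 50.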

| Trap | Better rule |
|---|---|
| filtering right-table columns in WHERE after a LEFT JOIN | check whether you accidentally converted outer-join logic into inner-join behavior |
| using UNION when duplicates are valid business rows | UNION removes duplicates, so it can change meaning |
| using DISTINCT after a bad join | repair the join shape first |

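
The first trap is easiest to see with two versions of the same query. This is a minimal sketch, again assuming SQLite as a stand-in for Databricks SQL and using hypothetical `orders` and `loyalty` tables. The only difference between the queries is whether the right-table filter sits in WHERE or in the ON clause.

```python
import sqlite3

# Assumption: SQLite stands in for Databricks SQL; tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders  (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE loyalty (customer_id INTEGER, tier TEXT);
    INSERT INTO orders  VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO loyalty VALUES (10, 'gold'), (20, 'silver');
""")

# Trap: the WHERE filter runs after the join. Unmatched orders carry a NULL
# tier, NULL = 'gold' is not true, and the LEFT JOIN now behaves like an
# INNER JOIN.
where_filtered = conn.execute("""
    SELECT o.order_id, l.tier
    FROM orders o
    LEFT JOIN loyalty l ON o.customer_id = l.customer_id
    WHERE l.tier = 'gold'
    ORDER BY o.order_id
""").fetchall()

# Fix: put the right-table condition in ON so every order row survives.
on_filtered = conn.execute("""
    SELECT o.order_id, l.tier
    FROM orders o
    LEFT JOIN loyalty l
      ON o.customer_id = l.customer_id AND l.tier = 'gold'
    ORDER BY o.order_id
""").fetchall()

print(where_filtered)  # [(1, 'gold')]
print(on_filtered)     # [(1, 'gold'), (2, None), (3, None)]
```

Neither form is always wrong. The WHERE version is the right choice when the stem really asks for gold-tier orders only; it is a trap when the stem says every order must appear.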
| If the stem is really about… | Strong reading |
|---|---|
| on-demand analysis across Databricks and an external source | federated query can fit |
| repeated production ingestion into Databricks | import path may fit better than federation |

Choose UNION versus UNION ALL based on duplicate meaning, not habit.

This lesson usually tests whether you can choose the right SQL relationship or combination primitive. If every left-side row must be preserved, think LEFT JOIN. If duplicates from both result sets should remain, think UNION ALL. DA-ASSOC often hides wrong totals inside join-shape mistakes, so correctness comes before optimization.
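
To make the "wrong totals inside join-shape mistakes" point concrete, here is a hedged sketch of a one-to-many join inflating a sum. It assumes SQLite as a stand-in for Databricks SQL and uses hypothetical `orders` and `shipments` tables. It also shows why patching with DISTINCT is not a real repair, while pre-aggregating the many side to the order grain is.

```python
import sqlite3

# Assumption: SQLite stands in for Databricks SQL; tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders    (order_id INTEGER, amount INTEGER);
    CREATE TABLE shipments (shipment_id INTEGER, order_id INTEGER);
    INSERT INTO orders    VALUES (1, 100), (2, 40), (3, 40);
    INSERT INTO shipments VALUES (11, 1), (12, 1), (13, 2), (14, 3);
""")

def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

true_total = scalar("SELECT SUM(amount) FROM orders")

# Order 1 has two shipments, so its amount is counted twice after the join.
inflated = scalar("""
    SELECT SUM(o.amount)
    FROM orders o JOIN shipments s ON s.order_id = o.order_id
""")

# DISTINCT hides the fan-out but also merges the two legitimate 40s.
distinct_patch = scalar("""
    SELECT SUM(DISTINCT o.amount)
    FROM orders o JOIN shipments s ON s.order_id = o.order_id
""")

# Repair the join shape: reduce shipments to one row per order, then join.
repaired = scalar("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN (SELECT order_id, COUNT(*) AS shipment_count
          FROM shipments GROUP BY order_id) s
      ON s.order_id = o.order_id
""")

print(true_total, inflated, distinct_patch, repaired)  # 180 280 140 180
```

Only the pre-aggregated join returns the true total, which is the sense in which the join shape has to be fixed before any deduplication or tuning.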