Databricks DE-ASSOC 30-, 60-, and 90-day study plan for ingestion, Delta, catalogs, review loops, and final-week priorities.
This page answers the question most candidates actually have: “How do I structure my DE-ASSOC prep?” The three schedules below follow the current Databricks section map and the lesson tree in this guide, so your labs, timed review, and miss logging stay tied to the live exam shape.
Use the plan that matches your available time, but anchor it to one small repeatable lab. Each week should include one notebook build, one review of a failed or weak run, one timed drill set, and one miss-log review.
Use the same loop every week:
```mermaid
flowchart LR
    Classify["Classify platform / ingestion / transformation / production / governance"] --> Rep["Run one small Databricks rep"]
    Rep --> Drill["Do short timed drill set"]
    Drill --> Review["Log the real miss pattern"]
    Review --> Classify
```
There is no single right number of study hours, but most candidates land in a range based on background:
| Your starting point | Typical total study time | Best-fit timeline |
|---|---|---|
| You build Databricks pipelines weekly | 25-40 hours | 30-60 days |
| You know Spark or SQL but are newer to Databricks operations | 40-70 hours | 60-90 days |
| You are new to Databricks and Unity Catalog | 70-100+ hours | 90 days |
Choose a plan based on hours per week:
| Time you can commit | Recommended plan | What it feels like |
|---|---|---|
| 8-10 hrs/week | 30-day intensive | Fast learning plus lots of practice |
| 5-7 hrs/week | 60-day balanced | Steady progress plus review time |
| 3-4 hrs/week | 90-day part-time | Slow and solid with repetition |
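
As a rough cross-check, the weekly ranges above can be multiplied by the week counts used in the schedules on this page (4, 8, and 12 weeks) to see how many total hours each plan yields. The sketch below is illustrative arithmetic only. `PLANS`, `total_hours`, and `pick_plan` are hypothetical names, and the 8-hour and 5-hour thresholds are an assumption read from the lower bounds in the table.

```python
# Illustrative arithmetic only. Weekly hour ranges come from the table above;
# week counts (4, 8, 12) come from the weekly schedules on this page.
PLANS = {
    "30-day intensive": {"weeks": 4, "hours_per_week": (8, 10)},
    "60-day balanced": {"weeks": 8, "hours_per_week": (5, 7)},
    "90-day part-time": {"weeks": 12, "hours_per_week": (3, 4)},
}


def total_hours(plan_name):
    """Return the (low, high) total study hours a plan yields."""
    plan = PLANS[plan_name]
    low, high = plan["hours_per_week"]
    return plan["weeks"] * low, plan["weeks"] * high


def pick_plan(hours_per_week):
    """Pick a plan from weekly hours (thresholds are an assumption)."""
    if hours_per_week >= 8:
        return "30-day intensive"
    if hours_per_week >= 5:
        return "60-day balanced"
    return "90-day part-time"


for name in PLANS:
    low, high = total_hours(name)
    print(f"{name}: {low}-{high} total hours")
```

Comparing these totals with the study-time range for your starting point can show whether the minimum weekly pace is enough or whether you should add hours.
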
| If you are… | Use the plan like this |
|---|---|
| already using Databricks weekly | move faster through platform orientation and spend more time on tricky Delta and operational failure patterns |
| strong in SQL but weaker in Spark behavior | spend extra time on execution triggers, shuffles, and DataFrame reasoning |
| strong in Spark but newer to Databricks governance | spend extra time on Unity Catalog, managed vs external tables, and sharing/federation |
| short on time | complete one pass through all five sections before chasing edge-case features |
The live public Databricks exam guide groups scope into these five sections:
| Section | What to get good at first |
|---|---|
| Platform | Workspace logic, default optimization behavior, and compute fit |
| Ingestion | Notebooks, Databricks Connect, Auto Loader, and debugging |
| Processing | Medallion design, Lakeflow pipelines, DDL and DML, and PySpark transformations |
| Pipelines | Asset Bundles, workflows, repair and rerun, serverless jobs, and Spark UI |
| Governance | Unity Catalog permissions, lineage, sharing, and federation |
If you want one rule: spend most of your time on Sections 2, 3, and 4, because that is where notebook workflow, data movement, transformation logic, and operational judgment compound.
Try to keep one small runnable Databricks workflow alive while you study. For DE-ASSOC, that baseline should include:
| Minutes | What to do | Why |
|---|---|---|
| 0-10 | review one exam section task or lesson | keeps the session tied to current scope |
| 10-20 | restate the behavior that matters most | prevents syntax-only study |
| 20-35 | run one notebook rep or short drill set | turns the topic into observable behavior |
| 35-45 | write one miss rule and route the weakness to a section | makes the next session targeted |
30-day intensive plan. Target pace: ~8-10 hours/week (1-1.5 hrs/day).
Goal: cover the full outline quickly, then use drills and mixed sets to harden judgment.
| Week | Focus | What to do | Links |
|---|---|---|---|
| 1 | Platform + Ingestion | Learn workspace behavior, compute fit, notebooks, Databricks Connect, Auto Loader, and basic debugging. Start a miss log immediately. | Resources • Glossary |
| 2 | Processing | Drill medallion logic, Lakeflow pipeline behavior, DDL and DML, DataFrame aggregations, and performance-sensitive transformations. | Cheat Sheet • Glossary |
| 3 | Pipelines | Focus on Asset Bundles, workflow deployment, repair and rerun, serverless jobs, and Spark UI evidence. End the week with one mixed timed set. | FAQ • Resources |
| 4 | Governance + final review | Drill Unity Catalog objects, permissions, lineage, sharing, and federation. Finish with 2-3 timed mixed runs and review every miss. | Cheat Sheet • FAQ |
60-day balanced plan. Target pace: ~5-7 hours/week. Goal: learn each section, then loop back with spaced repetition and mixed practice.
| Week | Focus | What to do | Links |
|---|---|---|---|
| 1 | 1.1 Platform Defaults + 1.2 Compute Fit | Learn what the platform is simplifying for you before you dive into pipeline syntax. | Resources |
| 2 | 2.1 Notebooks & Dev + 2.2 Auto Loader | Drill interactive development and ingestion judgment side by side. | Cheat Sheet |
| 3 | 2.3 Debugging & Triage + 3.1 Medallion | Practice failure classification and pipeline-shape thinking. | Glossary |
| 4 | 3.2 Lakeflow + 3.3 SQL & DataFrames | Focus on repeatable ETL and transformation correctness. | Resources |
| 5 | 3.4 Transform Performance + 4.1 Asset Bundles | Separate runtime tuning from deployment packaging. | Cheat Sheet |
| 6 | 4.2 Workflows & Jobs + 4.3 Spark UI & Tuning | Drill operational recovery and performance evidence. | FAQ |
| 7 | 5.1 Unity Catalog + 5.2 Permissions & Lineage | Build governance language that stays precise under time pressure. | Glossary |
| 8 | 5.3 Sharing & Federation + final review | Do two full mixed sets and re-drill whichever section keeps producing the same miss pattern. | FAQ • Resources |
90-day part-time plan. Target pace: ~3-4 hours/week. Goal: slow repetition and stronger retention while keeping one lab alive.
| Week | Focus | What to do | Links |
|---|---|---|---|
| 1 | Setup + Platform Defaults | Set your cadence, start a miss log, and learn how Databricks wants you to think about the platform. | Exam Root • Glossary |
| 2 | Compute Fit | Drill interactive vs scheduled vs SQL-serving compute choices. | Resources |
| 3 | Notebooks & Dev | Practice local-vs-remote workflow judgment. | Cheat Sheet |
| 4 | Auto Loader | Focus on ingestion source fit, incremental file discovery, and checkpoint thinking. | Resources |
| 5 | Debugging & Triage | Practice failure classification from logs, traces, and recent pipeline changes. | FAQ |
| 6 | Medallion + Lakeflow | Drill layer purpose, dependency order, and declarative pipeline reasoning. | Glossary |
| 7 | SQL & DataFrames | Focus on SQL verbs, safe writes, and PySpark aggregation logic. | Cheat Sheet |
| 8 | Transform Performance | Learn to spot memory, shuffle, skew, and interactive-vs-job compute stems. | Resources |
| 9 | Asset Bundles | Practice packaging and target-aware deployment reasoning. | FAQ |
| 10 | Workflows & Jobs | Drill job operations and failure recovery. | Cheat Sheet |
| 11 | Spark UI & Tuning + Unity Catalog | Blend runtime evidence with governance object-model decisions. | Glossary |
| 12 | Permissions & Lineage + Sharing & Federation | Close with governance and sharing, then move into full mixed sets. | FAQ • Resources |
Use timed practice to test judgment, not to replace the explanation layer:
Retry the missed items 48-72 hours later until the same failure mode stops repeating.

| Step | What to record |
|---|---|
| 1 | the weak section: platform, ingestion, transformations, production, or governance |
| 2 | the real failure mode: execution confusion, Delta write safety, workflow recovery, or object-boundary confusion |
| 3 | the one sentence rule you should have applied |
| 4 | the exact chapter or appendix page to revisit next |
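
One way to keep the miss log consistent is to store each entry with the four fields from the table and let a small script surface repeated failure modes and the 48-72 hour retry window mentioned above. This is a minimal sketch, not a prescribed tool. The `Miss` class, both helper functions, and all sample entries (including the rule text and dates) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Miss:
    """One miss-log entry, mirroring the four steps in the table."""
    section: str        # step 1: weak section
    failure_mode: str   # step 2: real failure mode
    rule: str           # step 3: one-sentence rule
    revisit: str        # step 4: page to revisit next
    logged_at: datetime


def redrill_window(miss, min_hours=48, max_hours=72):
    """Return the (earliest, latest) time to retry this miss."""
    return (miss.logged_at + timedelta(hours=min_hours),
            miss.logged_at + timedelta(hours=max_hours))


def repeated_failure_modes(log, threshold=2):
    """Return failure modes logged at least `threshold` times."""
    counts = Counter(m.failure_mode for m in log)
    return [mode for mode, n in counts.items() if n >= threshold]


# Hypothetical sample entries, for illustration only.
log = [
    Miss("transformations", "Delta write safety",
         "sample rule: check the write mode first",
         "Cheat Sheet", datetime(2030, 1, 6, 19, 0)),
    Miss("production", "workflow recovery",
         "sample rule: identify the failed task first",
         "FAQ", datetime(2030, 1, 7, 19, 0)),
    Miss("transformations", "Delta write safety",
         "sample rule: check the write mode first",
         "Cheat Sheet", datetime(2030, 1, 9, 19, 0)),
]

print(repeated_failure_modes(log))
earliest, latest = redrill_window(log[0])
print(earliest, latest)
```

A spreadsheet or plain text file works just as well; the point is that the same four fields are recorded every time so repeated patterns are easy to count.
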
| Day | Focus |
|---|---|
| 7 | ingestion and transformations only |
| 6 | productionizing pipelines only |
| 5 | governance and Unity Catalog only |
| 4 | one mixed timed set and a miss log |
| 3 | re-drill only repeated weak patterns |
| 2 | cheat-sheet and glossary refresh |
| 1 | official exam-guide check and light recall only |
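
If it helps to pin the countdown to real dates, the table can be mapped onto a calendar. The sketch below assumes that "Day N" means N days before the exam, which the table does not state explicitly. `FINAL_WEEK`, `final_week_schedule`, and the exam date are hypothetical.

```python
from datetime import date, timedelta

# Focus per countdown day, copied from the table above.
FINAL_WEEK = {
    7: "ingestion and transformations only",
    6: "productionizing pipelines only",
    5: "governance and Unity Catalog only",
    4: "one mixed timed set and a miss log",
    3: "re-drill only repeated weak patterns",
    2: "cheat-sheet and glossary refresh",
    1: "official exam-guide check and light recall only",
}


def final_week_schedule(exam_date):
    """Map each countdown day to a calendar date.

    Assumption: day N falls N days before the exam date.
    """
    return [
        (exam_date - timedelta(days=day), focus)
        for day, focus in sorted(FINAL_WEEK.items(), reverse=True)
    ]


# Hypothetical exam date, for illustration only.
schedule = final_week_schedule(date(2030, 1, 15))
for day, focus in schedule:
    print(day.isoformat(), "-", focus)
```
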