Fabric DP-700 cheat sheet for key facts, traps, service mappings, and final review.
Use this cheat sheet for Microsoft Certified: Fabric Data Engineer Associate (DP-700) when you need fast recall across Fabric data engineering decisions. The exam lane is about moving governed data from source to useful analytics, then proving the solution is secure, reliable, monitored, and optimized.
Read every Fabric scenario in this order
1. Identify the artifact: lakehouse, warehouse, pipeline, dataflow, notebook, eventstream, KQL database, semantic model, or workspace setting.
2. Identify the operation: ingest, transform, secure, orchestrate, monitor, optimize, or troubleshoot.
3. Choose the engine or tool that fits latency, volume, skill set, and transformation complexity.
4. Preserve governance: workspace roles, item permissions, lineage, sensitivity, endorsement, and data boundary.
5. Prove the result with quality checks, refresh history, monitoring, and performance evidence.
DP-700 answer sequence
Use this when the stem mixes Fabric artifacts, governance, refresh, monitoring, and performance.
flowchart TD
S["Scenario"] --> A["Identify the Fabric artifact"]
A --> O["Identify the operation"]
O --> T["Choose the engine or tool that fits"]
T --> G["Preserve governance and permissions"]
G --> V["Verify with refresh history, lineage, or performance evidence"]
Fabric artifact chooser
| Requirement | Strong starting point |
| --- | --- |
| store raw and curated files/tables in an analytics lake pattern | lakehouse and OneLake |
| SQL-first enterprise warehousing | warehouse |
| visual/low-code transformation | Dataflow Gen2 when transformation complexity fits |
| code-first Spark transformation | notebook or Spark job pattern |
| orchestration across steps | data pipeline with dependencies, parameters, and monitoring |
| real-time or log-style analysis | eventstream or KQL-oriented path when the source is streaming or operational telemetry |
| BI consumption | semantic model and downstream reporting path, while DP-700 still focuses on engineering inputs |
Ingestion and transformation map
| If the stem emphasizes… | Prefer thinking about… |
| --- | --- |
| repeatable scheduled load | pipeline orchestration, credentials, dependencies, failure path, and monitoring |
| complex transformations | notebook/Spark or SQL depending on data shape and team skills |
| source system abstraction | shortcuts, mirroring, or connector pattern where supported and appropriate |
| low-code data prep | Dataflow Gen2, refresh behavior, and ownership |
| streaming or near-real-time data | event ingestion, windowing/latency needs, and KQL or streaming pipeline fit |
| medallion architecture | raw/bronze, cleansed/silver, serving/gold separation with validation and lineage |
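The medallion row above can be made concrete with a small sketch. This is plain Python rather than Spark or Fabric code, and the sample rows, column names, and the `to_silver` and `to_gold` helpers are all hypothetical; it only illustrates how bronze, silver, and gold differ and where validation sits.

```python
from collections import defaultdict

# Hypothetical raw rows as they might land in a bronze layer:
# everything is a string, and duplicates and bad values are kept as received.
BRONZE = [
    {"order_id": "1", "region": "west", "amount": "10.50"},
    {"order_id": "2", "region": "East ", "amount": "4.00"},
    {"order_id": "2", "region": "East ", "amount": "4.00"},  # duplicate load
    {"order_id": "3", "region": "west", "amount": "n/a"},    # fails validation
]


def to_silver(bronze_rows):
    """Cleanse bronze rows: cast types, normalize text, drop duplicates.

    Rows that fail casting are returned separately so they can be
    quarantined and inspected instead of silently disappearing.
    """
    silver, rejected, seen = [], [], set()
    for row in bronze_rows:
        try:
            clean = {
                "order_id": int(row["order_id"]),
                "region": row["region"].strip().lower(),
                "amount": float(row["amount"]),
            }
        except (KeyError, ValueError):
            rejected.append(row)
            continue
        if clean["order_id"] in seen:
            continue  # keep the first copy of each order only
        seen.add(clean["order_id"])
        silver.append(clean)
    return silver, rejected


def to_gold(silver_rows):
    """Aggregate cleansed rows into a serving-layer total per region."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver, rejected = to_silver(BRONZE)
gold = to_gold(silver)
```

Keeping the rejected rows as a separate output is one way to get the "validation" part of the pattern: the silver layer stays clean while the evidence of bad input is preserved.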
OneLake, lakehouse, and warehouse rules
| Topic | Exam instinct |
| --- | --- |
| OneLake | Treat as the shared storage foundation, but do not ignore workspace/item governance. |
| Shortcut | Useful for referencing data without copying it, but permissions and source reliability still matter. |
| Lakehouse | Best when files, Delta tables, Spark, and open analytics patterns are central. |
| Warehouse | Best when relational SQL modeling, constraints, and SQL analytics are central. |
| Delta table | Think ACID-style reliability, schema evolution controls, and table optimization. |
| Semantic model | Consumption layer needs clean grain, relationships, measures, refresh, and security. |
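To illustrate what "schema evolution controls" means for the Delta table row, here is a conceptual sketch in plain Python. It is not Delta Lake's API: `resolve_schema`, the `allow_evolution` flag, and the sample schemas are hypothetical, and they only imitate the general idea that a write with unexpected columns is rejected unless evolution is explicitly enabled.

```python
def resolve_schema(table_schema, incoming_schema, allow_evolution=False):
    """Return the schema a write would use, or raise if the batch does not fit.

    Hypothetical helper: type mismatches are always rejected, and new
    columns are only accepted when the caller opts in to evolution.
    """
    for column, dtype in incoming_schema.items():
        expected = table_schema.get(column)
        if expected is not None and expected != dtype:
            raise TypeError(f"column {column!r}: expected {expected}, got {dtype}")
    new_columns = {c: t for c, t in incoming_schema.items() if c not in table_schema}
    if new_columns and not allow_evolution:
        raise ValueError(f"unexpected columns: {sorted(new_columns)}")
    return {**table_schema, **new_columns}


TABLE = {"order_id": "int", "amount": "double"}
BATCH = {"order_id": "int", "amount": "double", "coupon": "string"}

# Default behavior: the extra column is treated as schema drift and rejected.
try:
    resolve_schema(TABLE, BATCH)
    strict_outcome = "accepted"
except ValueError as exc:
    strict_outcome = f"rejected: {exc}"

# Explicit opt-in: the new column is merged into the table schema.
evolved = resolve_schema(TABLE, BATCH, allow_evolution=True)
```

The design point is that evolution is a deliberate decision by the writer, so schema drift from a source surfaces as an error rather than quietly changing the table.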
SQL, PySpark, and KQL chooser
| Tool/language | Use when… | Watch for… |
| --- | --- | --- |
| SQL | relational transformations, joins, aggregations, warehouse logic, and analyst-friendly review | poor query shape, missing filters, wrong grain, or expensive scans |
| PySpark | large-scale distributed transformations, complex data prep, notebooks, and code reuse | cluster/session cost, shuffle pressure, schema drift, and maintainability |
| KQL | telemetry, event, log, or time-series analysis | wrong time filter, wrong aggregation, and treating KQL like normal transactional SQL |
| Dataflow Gen2 | low-code cleaning and shaping | refresh dependencies, ownership, and transformation limits |
| Pipeline | orchestration rather than transformation logic itself | missing dependency, parameter, retry, alert, or failure path |
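The pipeline traps in the last row (dependency, retry, alert, failure path) can be sketched generically. The code below is plain Python, not the Fabric pipeline API; `run_activity`, `make_flaky`, and the result dictionaries are hypothetical names used only to show how the pieces relate.

```python
import time


def run_activity(activity, max_retries=2, delay_seconds=0.0, on_failure=None):
    """Run a callable with retries; call on_failure once retries are exhausted."""
    attempts = 0
    while True:
        attempts += 1
        try:
            output = activity()
            return {"status": "Succeeded", "attempts": attempts, "output": output}
        except Exception as exc:
            if attempts > max_retries:
                if on_failure is not None:
                    on_failure(exc)  # failure path: alert, log, or clean up
                return {"status": "Failed", "attempts": attempts, "error": str(exc)}
            time.sleep(delay_seconds)


def make_flaky(failures_before_success):
    """Build a fake activity that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def activity():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ConnectionError("source unavailable")
        return "42 rows copied"

    return activity


alerts = []

# Transient failure: two failed attempts are absorbed by the retry policy.
copy_result = run_activity(make_flaky(2), max_retries=2, on_failure=alerts.append)

# Dependency: the transform step runs only if the copy step succeeded.
if copy_result["status"] == "Succeeded":
    transform_result = run_activity(lambda: "transformed")
else:
    transform_result = {"status": "Skipped", "attempts": 0}

# Persistent failure: retries run out, so the failure path fires.
failed_result = run_activity(make_flaky(5), max_retries=1, on_failure=alerts.append)
```

In this sketch the orchestration layer contains no transformation logic at all, which matches the "orchestration rather than transformation" instinct in the table.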
Security and governance checklist
| Control | What the exam rewards |
| --- | --- |
| workspace roles | assign broad workspace responsibilities intentionally; do not use them as a substitute for item-level thought |
| item permissions | grant access to specific artifacts when that is the real boundary |
| sensitivity labels | classify and protect data for downstream use and compliance |
| lineage | prove where data came from and what consumed it |
| endorsement | signal which data products are promoted or certified for trusted use |
| credentials | secure connection credentials and avoid personal one-off ownership for production paths |
| tenant/workspace settings | use platform-level governance when the requirement is organization-wide |
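The difference between workspace roles and item permissions can be pictured with a deliberately simplified model. The role names Admin, Member, Contributor, and Viewer are Fabric's workspace roles, but the capability sets, the `effective_capabilities` helper, and the user and item names below are assumptions made for illustration; this is not how Fabric evaluates access internally.

```python
# Simplified, hypothetical mapping of workspace roles to capabilities.
ROLE_CAPABILITIES = {
    "Admin": {"read", "write", "share", "manage"},
    "Member": {"read", "write", "share"},
    "Contributor": {"read", "write"},
    "Viewer": {"read"},
}


def effective_capabilities(user, item, workspace_roles, item_grants):
    """Combine a user's workspace-wide role with grants on one specific item."""
    role = workspace_roles.get(user)
    capabilities = set(ROLE_CAPABILITIES.get(role, set()))
    capabilities |= item_grants.get((user, item), set())
    return capabilities


# Hypothetical users: an engineer with a workspace role, and an analyst
# who only has an item-level grant on one lakehouse.
workspace_roles = {"engineer@example.com": "Contributor"}
item_grants = {("analyst@example.com", "sales_lakehouse"): {"read"}}

engineer_sales = effective_capabilities(
    "engineer@example.com", "sales_lakehouse", workspace_roles, item_grants
)
analyst_sales = effective_capabilities(
    "analyst@example.com", "sales_lakehouse", workspace_roles, item_grants
)
analyst_hr = effective_capabilities(
    "analyst@example.com", "hr_lakehouse", workspace_roles, item_grants
)
```

Here the analyst can read one item and nothing else, which is the narrower boundary the checklist points to when a scenario says a user needs access to a single artifact rather than the whole workspace.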
Monitoring and troubleshooting
| Symptom | Check first |
| --- | --- |
| pipeline failed | activity error, dependency, credential, parameter, source availability, and retry policy |
| refresh is slow | source latency, transformation step, query shape, data volume, partitioning, and capacity pressure |
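Troubleshooting from evidence usually starts with run history. The sketch below assumes a hypothetical list of activity runs with ISO timestamps (the field names and values are invented, not a Fabric monitoring export) and shows how to surface the failed step and the step that dominates total duration.

```python
from datetime import datetime

# Hypothetical run history for one pipeline execution.
RUNS = [
    {"activity": "copy_orders", "start": "2024-01-01T02:00:00",
     "end": "2024-01-01T02:04:00", "status": "Succeeded"},
    {"activity": "transform_orders", "start": "2024-01-01T02:04:00",
     "end": "2024-01-01T02:31:00", "status": "Succeeded"},
    {"activity": "load_gold", "start": "2024-01-01T02:31:00",
     "end": "2024-01-01T02:33:00", "status": "Failed"},
]


def duration_seconds(run):
    """Elapsed time of one activity run, in seconds."""
    start = datetime.fromisoformat(run["start"])
    end = datetime.fromisoformat(run["end"])
    return (end - start).total_seconds()


def summarize(runs):
    """Report failed activities and the activity that took the largest share of time."""
    failed = [run["activity"] for run in runs if run["status"] != "Succeeded"]
    slowest = max(runs, key=duration_seconds)
    total = sum(duration_seconds(run) for run in runs)
    return {
        "failed": failed,
        "slowest": slowest["activity"],
        "slowest_share": round(duration_seconds(slowest) / total, 2),
    }


summary = summarize(RUNS)
```

With this sample data the transform step accounts for most of the elapsed time, so that is where query shape, data volume, and partitioning would be examined first, while the failed load step is checked for activity errors and credentials.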
DP-700 answers should preserve the data engineering chain: ingest reliably, transform with the right engine, govern access, validate quality, monitor operations, and optimize from evidence.