Azure DP-100 sample questions with explanations, traps, topic labels, and IT Mastery route links.
These original sample questions are designed to help you check how the exam topics appear in decision-style prompts. They are not taken from the live exam.
Use these sample questions as a guided self-assessment for Microsoft Certified: Azure Data Scientist Associate (DP-100) topics such as data preparation, experiment tracking, AutoML and custom training choices, model evaluation, responsible deployment, endpoints, and monitoring. The prompts emphasize data-science workflow decisions on Azure.
Work through each prompt before opening the explanation. Strong DP-100 answers preserve reproducibility, compare models with metrics suited to the problem, and deploy only when the model can be monitored after release.
Topic: Choosing an evaluation metric
A fraud model flags fewer than 2 percent of transactions as likely fraud. The team says overall accuracy is high, but investigators are missing too many fraudulent transactions. Which metric focus is most useful for model selection?
Best answer: B
Explanation: For an imbalanced classification problem, accuracy can look high while the model misses most of the minority class. Because investigators are missing fraudulent transactions, recall on the fraud class is the first thing to check, weighed against precision, F1, and how both change as the decision threshold moves.
Why the other choices are weaker:
What this tests: Selecting model metrics that match an imbalanced business problem.
Related topics: Classification; Imbalanced data; Recall; Thresholds
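To make the accuracy trap in the fraud question concrete, here is a minimal sketch in plain Python. The confusion counts, scores, and labels are invented for illustration and do not come from any real model; the helper names `classification_metrics` and `confusion_at_threshold` are hypothetical, not part of any Azure SDK.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def confusion_at_threshold(scores, labels, threshold):
    """Count (tp, fp, fn, tn) when scores >= threshold are flagged as fraud."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label:
            tp += 1
        elif flagged:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Assumed scenario: 10,000 transactions, 200 of them fraud (2 percent).
# The model catches 40 fraud cases, raises 20 false alarms, and misses 160.
metrics = classification_metrics(tp=40, fp=20, fn=160, tn=9780)
# Accuracy is 0.982, yet recall on the fraud class is only 0.2.

# Assumed toy scores: lowering the threshold trades precision for recall.
scores = [0.9, 0.6, 0.4, 0.2, 0.1]
labels = [1, 1, 0, 1, 0]
strict = confusion_at_threshold(scores, labels, threshold=0.5)
lenient = confusion_at_threshold(scores, labels, threshold=0.15)
```

In this made-up case the stricter threshold misses one of the three fraud cases, while the lenient one catches all three at the cost of one false alarm. That trade-off is what "threshold behavior" refers to.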
Topic: Experiment tracking
Several data scientists run experiments with different feature transformations and hyperparameters. The team needs to compare runs, reproduce the best model, and register the selected artifact. What should they capture?
Best answer: B
Explanation: Experiment tracking needs the context that produced each model: parameters, metrics, code, data, environment, and artifacts. That enables comparison and reproducibility.
Why the other choices are weaker:
What this tests: Using experiment tracking to support reproducible model development.
Related topics: Experiment tracking; Metrics; Artifacts; Reproducibility
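The run context listed in the experiment-tracking explanation can be sketched as a small record type. This is a standard-library illustration only, not the Azure Machine Learning or MLflow API; in a real workspace the platform's own run tracking would normally hold this information. `RunRecord`, `fingerprint`, `capture_environment`, and `best_run`, along with every value below, are hypothetical.

```python
import hashlib
import json
import platform
from dataclasses import asdict, dataclass, field


@dataclass
class RunRecord:
    """Everything needed to compare a run and reproduce its model."""
    run_name: str
    params: dict        # hyperparameters and feature-transformation choices
    metrics: dict       # evaluation results used to compare runs
    code_version: str   # e.g. a commit identifier (placeholder values below)
    data_sha256: str    # fingerprint of the training data snapshot
    environment: dict   # interpreter and platform details
    artifacts: list = field(default_factory=list)  # paths to model files


def fingerprint(data: bytes) -> str:
    """Hash the training data so a later run can confirm it used the same input."""
    return hashlib.sha256(data).hexdigest()


def capture_environment() -> dict:
    """Record the interpreter and OS; a fuller version would list package versions."""
    return {"python": platform.python_version(), "platform": platform.platform()}


def best_run(runs, metric):
    """Pick the run with the highest value of the chosen metric."""
    return max(runs, key=lambda run: run.metrics[metric])


# Assumed toy data and results, purely for illustration.
training_data = b"customer_id,spend\nc1,250\nc2,40\n"
runs = [
    RunRecord("run-a", {"scaler": "standard", "max_depth": 4}, {"recall": 0.61},
              "commit-placeholder-1", fingerprint(training_data),
              capture_environment(), ["outputs/run-a/model.pkl"]),
    RunRecord("run-b", {"scaler": "minmax", "max_depth": 8}, {"recall": 0.74},
              "commit-placeholder-2", fingerprint(training_data),
              capture_environment(), ["outputs/run-b/model.pkl"]),
]
selected = best_run(runs, "recall")
record_json = json.dumps(asdict(selected), sort_keys=True)  # what would be stored
```

Keeping the data fingerprint and code version next to the metrics is the design point: a metric alone lets the team rank runs, but not rebuild the winning model.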
Topic: Batch versus online inference
A marketing team needs nightly predictions for millions of customers and does not require an immediate response during a user session. The data arrives in files each evening. Which deployment pattern is the better fit?
Best answer: A
Explanation: The requirement is high-volume scheduled scoring, not low-latency interactive inference. Batch scoring fits nightly file-based processing and can be monitored as a pipeline.
Why the other choices are weaker:
What this tests: Choosing an inference pattern based on latency, volume, and data-arrival requirements.
Related topics: Batch inference; Online endpoint; Pipeline; Deployment choice
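The nightly, file-based scoring pattern from the batch question can be sketched with the standard library alone. This is not an Azure batch endpoint definition. `predict_batch` is a hypothetical stand-in for a trained model, and the column names `customer_id` and `spend` are assumptions made for the example.

```python
import csv
import io
from itertools import islice


def chunked(rows, size):
    """Yield lists of up to `size` rows so a large file never sits fully in memory."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def predict_batch(rows):
    """Hypothetical model: flag customers whose spend is at least 100."""
    return [1 if float(row["spend"]) >= 100.0 else 0 for row in rows]


def score_csv(source, sink, chunk_size=1000):
    """Read input rows, score them chunk by chunk, and write predictions.

    Returns the number of rows scored, a simple figure a pipeline could monitor.
    """
    reader = csv.DictReader(source)
    writer = csv.writer(sink)
    writer.writerow(["customer_id", "prediction"])
    scored = 0
    for chunk in chunked(reader, chunk_size):
        for row, prediction in zip(chunk, predict_batch(chunk)):
            writer.writerow([row["customer_id"], prediction])
        scored += len(chunk)
    return scored


# In-memory stand-ins for the evening input file and the output file.
nightly_file = io.StringIO("customer_id,spend\nc1,250\nc2,40\nc3,100\n")
output = io.StringIO()
rows_scored = score_csv(nightly_file, output, chunk_size=2)
```

Nothing here waits on a user session: the job runs when the file arrives, processes rows in bulk, and reports a row count. An online endpoint would instead score one request at a time under a latency budget.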
Tech Exam Lexicon and IT Mastery are independent study tools. They are not affiliated with, endorsed by, or sponsored by the exam vendor.