
Azure DP-100 Sample Questions with Explanations


These original sample questions are designed to help you check how the exam topics appear in decision-style prompts. They are not taken from the live exam.

Use these sample questions as a guided self-assessment for Microsoft Certified: Azure Data Scientist Associate (DP-100) topics such as data preparation, experiment tracking, AutoML and custom training choices, model evaluation, responsible deployment, endpoints, and monitoring. The prompts emphasize data-science workflow decisions on Azure.

DP-100 data science sample questions

Work through each prompt before opening the explanation. Strong DP-100 answers preserve reproducibility, compare models with metrics that match the business problem, and deploy only when the model can be monitored after release.


Question 1

Topic: Choosing an evaluation metric

A fraud model flags fewer than 2 percent of transactions as likely fraud. The team says overall accuracy is high, but investigators are missing too many fraudulent transactions. Which metric focus is most useful for model selection?

  • A. Accuracy only, because the majority class dominates the dataset.
  • B. Recall, precision, F1 score, and threshold analysis for the fraud class.
  • C. Training duration only.
  • D. The number of input columns, regardless of prediction quality.

Best answer: B

Explanation: For an imbalanced classification problem, accuracy can look high while the model misses the minority class. Fraud detection requires examining recall, precision, F1, and threshold behavior for the class that matters.
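
The effect described above can be checked with a small sketch. The toy labels, scores, and thresholds below are invented for illustration and do not come from any real fraud dataset. The helper names `confusion_counts` and `fraud_metrics` are hypothetical. The sketch uses only the Python standard library.

```python
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[int], scores: List[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Count TP, FP, FN, TN when scores >= threshold are flagged as fraud (1)."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def fraud_metrics(y_true: List[int], scores: List[float],
                  threshold: float) -> Dict[str, float]:
    """Accuracy plus precision, recall, and F1 for the fraud (positive) class."""
    tp, fp, fn, tn = confusion_counts(y_true, scores, threshold)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data (assumed): 100 transactions, only 2 of them fraudulent.
y_true = [1, 1] + [0] * 98
scores = [0.9, 0.3, 0.6] + [0.1] * 97

# Threshold analysis: the same model behaves very differently per threshold.
for threshold in (0.95, 0.5, 0.2):
    m = fraud_metrics(y_true, scores, threshold)
    print(threshold, {name: round(value, 2) for name, value in m.items()})
```

With this toy data, the 0.95 threshold flags nothing and still reaches 98 percent accuracy while fraud recall is zero. Lowering the threshold to 0.2 catches both fraud cases at the cost of one false positive.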

Why the other choices are weaker:

  • A hides minority-class failure.
  • C measures compute time, not model usefulness.
  • D says nothing about prediction behavior.

What this tests: Selecting model metrics that match an imbalanced business problem.

Related topics: Classification; Imbalanced data; Recall; Thresholds


Question 2

Topic: Experiment tracking

Several data scientists run experiments with different feature transformations and hyperparameters. The team needs to compare runs, reproduce the best model, and register the selected artifact. What should they capture?

  • A. Only the final model file name.
  • B. Parameters, metrics, code version, data inputs, environment details, and resulting model artifacts for each run.
  • C. Only screenshots of charts from each notebook.
  • D. Only the name of the data scientist who ran the experiment.

Best answer: B

Explanation: Experiment tracking needs the context that produced each model: parameters, metrics, code, data, environment, and artifacts. That enables comparison and reproducibility.
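
As a rough illustration of what "context" means here, the sketch below records one structured entry per run and then selects the best run by a chosen metric. It is library-free and is not the Azure Machine Learning API; in practice a tracking service such as MLflow, which Azure Machine Learning supports, would store this information. The `RunRecord` class, the field names, the commit hashes, and the artifact paths are all hypothetical.

```python
import hashlib
import platform
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RunRecord:
    """One experiment run with the context needed to compare and reproduce it."""
    run_id: str
    params: Dict[str, float]      # hyperparameters and feature settings
    metrics: Dict[str, float]     # evaluation results
    code_version: str             # for example, a Git commit hash
    data_fingerprint: str         # hash identifying the exact training data
    environment: Dict[str, str]   # interpreter and package versions
    artifact_path: str            # where the trained model file was written


def fingerprint(data: bytes) -> str:
    """Short SHA-256 digest so two runs can prove they used the same data."""
    return hashlib.sha256(data).hexdigest()[:12]


def best_run(runs: List[RunRecord], metric: str) -> RunRecord:
    """Return the run with the highest value for the given metric."""
    return max(runs, key=lambda run: run.metrics[metric])


# Assumed example values; none of these come from a real project.
training_data = b"customer_id,amount,label\n1,20.5,0\n2,990.0,1\n"
env = {"python": platform.python_version()}

runs = [
    RunRecord("run-001", {"max_depth": 3}, {"recall": 0.61, "precision": 0.80},
              "a1b2c3d", fingerprint(training_data), env, "outputs/run-001/model.pkl"),
    RunRecord("run-002", {"max_depth": 6}, {"recall": 0.74, "precision": 0.77},
              "a1b2c3d", fingerprint(training_data), env, "outputs/run-002/model.pkl"),
]

selected = best_run(runs, "recall")
print(selected.run_id, selected.params, selected.artifact_path)
```

Because each record carries the code version, data fingerprint, and environment, the selected run can be rebuilt and its artifact registered with a traceable origin. A file name or a screenshot alone cannot provide that.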

Why the other choices are weaker:

  • A identifies an artifact but not how it was produced.
  • C is not structured or reproducible.
  • D supports accountability but not model comparison.

What this tests: Using experiment tracking to support reproducible model development.

Related topics: Experiment tracking; Metrics; Artifacts; Reproducibility


Question 3

Topic: Batch versus online inference

A marketing team needs nightly predictions for millions of customers and does not require an immediate response during a user session. The data arrives in files each evening. Which deployment pattern is the better fit?

  • A. A batch scoring pipeline that processes the nightly files and writes predictions for downstream use.
  • B. A low-latency online endpoint for every prediction request.
  • C. A manual notebook run with no scheduling or output checks.
  • D. A dashboard that displays training metrics but does not generate predictions.

Best answer: A

Explanation: The requirement is high-volume scheduled scoring, not low-latency interactive inference. Batch scoring fits nightly file-based processing and can be monitored as a pipeline.
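
The batch pattern can be sketched as a function that reads a nightly file in chunks, scores each chunk, writes predictions, and checks that no rows were lost. This is a minimal standard-library sketch, not an Azure Machine Learning batch endpoint. The column names, the `toy_model` scoring rule, and the in-memory files are assumptions made for illustration.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List, TextIO

Row = Dict[str, str]


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Group rows into lists of at most `size` so memory use stays bounded."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def toy_model(batch: List[Row]) -> List[float]:
    """Hypothetical scorer: more purchases means a higher score, capped at 1.0."""
    return [min(1.0, int(row["purchases"]) / 10) for row in batch]


def run_batch_scoring(source: TextIO, sink: TextIO,
                      model: Callable[[List[Row]], List[float]] = toy_model,
                      batch_size: int = 2) -> int:
    """Score every row in `source`, write results to `sink`, return row count."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(sink, fieldnames=["customer_id", "score"])
    writer.writeheader()
    rows_in = rows_out = 0
    for batch in batches(reader, batch_size):
        rows_in += len(batch)
        for row, score in zip(batch, model(batch)):
            writer.writerow({"customer_id": row["customer_id"], "score": score})
            rows_out += 1
    # Output check: every input row must have produced exactly one prediction.
    if rows_in != rows_out:
        raise RuntimeError(f"scored {rows_out} of {rows_in} rows")
    return rows_out


# In-memory stand-ins for the nightly input file and the predictions file.
nightly_file = io.StringIO("customer_id,purchases\nc1,3\nc2,12\nc3,0\n")
predictions = io.StringIO()
count = run_batch_scoring(nightly_file, predictions)
print(count, "rows scored")
```

A scheduler would call a job like this once per evening. The row-count check is the kind of output validation that a manual notebook run typically lacks.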

Why the other choices are weaker:

  • B adds online serving complexity when immediate response is not needed.
  • C is not reliable production automation.
  • D reports training information but does not satisfy the scoring requirement.

What this tests: Choosing an inference pattern based on latency, volume, and data-arrival requirements.

Related topics: Batch inference; Online endpoint; Pipeline; Deployment choice

Tech Exam Lexicon and IT Mastery are independent study tools. They are not affiliated with, endorsed by, or sponsored by the exam vendor.

Revised on Sunday, May 10, 2026