| ML runtime | Databricks runtime environment optimized for machine learning tasks | Important platform-fit concept |
| AutoML | Databricks automation aid for model and feature exploration | High-yield Databricks ML feature |
| MLflow run | Logged experiment execution with parameters, metrics, and artifacts | Core experiment-tracking concept |
| Artifact | File output from a run such as a plot, model file, or dataset snapshot | Commonly confused with the model itself |
| Experiment | Collection of related MLflow runs | Comparison and organization layer |
| Registered model | Named model object tracked across versions and lifecycle stages | Core model-lifecycle concept |
| Alias | Stable name that points to a particular registered-model version | Important champion or challenger promotion concept |
| Feature engineering | Transforming raw data into model-usable inputs | High-yield feature-work concept |
| Feature table | Managed reusable feature storage object | Important Databricks ML platform concept |
| Online feature table | Feature storage designed for low-latency serving use | Commonly contrasted with offline feature tables |
| Offline feature table | Feature storage designed for training, analysis, or batch workflows | Commonly contrasted with online feature tables |
| Leakage | Information bleeding into training or evaluation from an invalid future or target-dependent source | One of the most tested evaluation failure modes |
| Baseline model | Simple comparison model used to judge whether a better approach adds value | Helps prevent “good-looking metric” mistakes |
| Precision | Share of predicted positives that are actually positive | Common classification metric |
| Recall | Share of actual positives captured by the model | Common classification metric |
| Cross-validation | Repeated training and validation across different data splits | Key validation concept |
| Hyperparameter | Tunable training setting that is chosen before or during model search | Common model-tuning term |
| Hyperopt | Hyperparameter optimization tool referenced in the exam outline | Important tuning tool concept |
| Estimator | ML component that learns from data and produces a model | Commonly contrasted with transformers |
| Transformer | Component that changes data shape or values without being the predictive model itself | Common pipeline concept |
| Inference | Using a trained model to produce predictions | Distinct from training and tracking |
| Train/validation/test split | Separation of data for fitting, tuning, and final evaluation | Core trustworthy-evaluation term |
| Artifact store | Backing storage for MLflow artifacts | Helps separate tracking metadata from stored outputs |
| Model version | Specific registered-model instance tracked through lifecycle changes | Common registry term |
| Reproducibility | Ability to rerun and explain the same experiment result reliably | Central operational ML concept |
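
The precision and recall definitions in the glossary are easy to mix up, so a small worked example may help. The following is a minimal sketch in plain Python, not taken from any Databricks or MLflow API; the function name `precision_recall` and the sample labels are hypothetical and chosen only for illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for one positive class (illustrative helper)."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # False negatives: actually positive but predicted negative.
    fn = sum(1 for t, p in pairs if p != positive and t == positive)

    # Precision: share of predicted positives that are actually positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives captured by the model.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 4 actual positives, 3 predicted positives.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this sample, two of the three predicted positives are correct (precision of about 0.667), while only two of the four actual positives are found (recall of 0.5), which shows how the two metrics can diverge on the same predictions.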