Google Cloud PMLE Cheat Sheet: Training, Serving, and MLOps
April 24, 2026
Google Cloud PMLE cheat sheet for training, serving, MLOps, traps, and final review.
Use this cheat sheet for Google Cloud Professional Machine Learning Engineer (PMLE) after you know the ML lifecycle and need faster scenario decisions. PMLE questions reward lifecycle discipline: frame the problem, protect data validity, choose the right model path, deploy safely, monitor behavior, and retrain only when evidence supports it.
Read every PMLE question in this order
1. Identify the task: classification, regression, recommendation, forecasting, computer vision, NLP, GenAI, or anomaly detection.
2. Choose the metric that matches the business cost of wrong predictions.
3. Check data validity: label quality, leakage, skew, imbalance, bias, privacy, and train/serve consistency.
4. Pick a managed, pretrained, AutoML, custom-training, or GenAI path only after the requirement is clear.
5. Add deployment, monitoring, explainability, rollback, and retraining triggers.
PMLE answer sequence
Use this when the stem mixes metric choice, data validity, model path, deployment, and monitoring.
flowchart TD
S["Scenario"] --> P["Identify the prediction task"]
P --> M["Pick the metric that matches the business cost"]
M --> D["Check data validity and train/serve consistency"]
D --> T["Choose managed, custom, or GenAI path"]
T --> O["Add serving, monitoring, and retraining controls"]
Problem and metric chooser
| Scenario | Better metric instinct |
| --- | --- |
| binary classification with costly false negatives | recall, sensitivity, or cost-weighted metric |
| binary classification with costly false positives | precision or precision-recall trade-off |
| imbalanced classes | precision/recall, F1, PR AUC, and class-specific analysis |
| regression | MAE, RMSE, residual analysis, and business error tolerance |
| ranking or recommendation | relevance, ranking metric, click/conversion impact, and bias monitoring |
| forecasting | time-aware validation and error by horizon or segment |
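
To make the imbalanced-class row concrete, here is a minimal sketch that computes accuracy, precision, recall, and F1 from confusion-matrix counts. The fraud numbers are hypothetical and chosen only to show how accuracy can look strong while recall is poor.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical data: 1,000 transactions, 10 of them fraudulent.
# The model flags 3 transactions and only 2 of those are real fraud.
metrics = classification_metrics(tp=2, fp=1, fn=8, tn=989)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy here is 0.991 while recall is 0.200, which is why a stem with costly false negatives should steer you away from accuracy.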
Data preparation traps
| Trap | Better instinct |
| --- | --- |
| random split on time series | use time-based split to avoid leakage |
| test data used in tuning | keep test set for final unbiased evaluation |
| production features differ | align training and serving transformations |
| class imbalance ignored | use sampling, weighting, threshold tuning, and segment metrics |
| sensitive features hidden but proxies remain | evaluate fairness and indirect leakage |
| poor labels | fix label process before over-optimizing model architecture |
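
The time-series trap can be sketched as a cutoff-based split: everything before the cutoff trains the model, everything at or after it evaluates the model. The row layout, the `event_date` key, and the cutoff below are illustrative assumptions, not part of any Google Cloud API.

```python
from datetime import date


def time_based_split(rows, cutoff, time_key="event_date"):
    """Train on rows strictly before the cutoff; evaluate on rows at or after it."""
    train = [r for r in rows if r[time_key] < cutoff]
    test = [r for r in rows if r[time_key] >= cutoff]
    return train, test


# Hypothetical monthly sales records for one year.
rows = [{"event_date": date(2025, month, 1), "sales": 100 + month} for month in range(1, 13)]

train, test = time_based_split(rows, cutoff=date(2025, 10, 1))

# No training row is newer than any test row, so future data cannot leak into training.
print(len(train), len(test))
```

A random shuffle of the same rows would let the model see months that come after some of its evaluation examples, which inflates offline metrics.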
Vertex AI and model path chooser
| Requirement | Start with |
| --- | --- |
| fastest baseline from tabular/image/text data | AutoML-style managed path |
| custom architecture or training loop | custom training |
| repeatable ML workflow | pipelines, metadata, artifacts, and versioned components |
| managed model registry and deployment | model registry, endpoints, versions, and traffic splitting |
| experiment comparison | tracked parameters, metrics, dataset version, and reproducibility |
| GenAI app | model selection, prompt design, grounding, safety, evaluation, and monitoring |
Deployment and serving
| Need | Better fit |
| --- | --- |
| low-latency user prediction | online endpoint |
| large offline scoring | batch prediction |
| safe release | canary, traffic split, shadow test, or rollback path |
| high availability | regional design, autoscaling, health, monitoring, and fallback |
| cost control | model size, accelerator use, batching, autoscaling, and traffic pattern |
| explainability need | feature attribution or explanation approach where supported and meaningful |
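
The safe-release row can be illustrated with a toy router that sends a fixed percentage of requests to a canary version. This is a conceptual sketch only: `pick_version`, the version names, and the hashing scheme are hypothetical and do not describe how a managed endpoint implements traffic splitting.

```python
import hashlib


def pick_version(request_id: str, traffic_split: dict) -> str:
    """Deterministically map a request to a model version by percentage weight."""
    if sum(traffic_split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    # Hash the request id into one of 100 buckets so the same id always routes the same way.
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    upper = 0
    for version, percent in traffic_split.items():
        upper += percent
        if bucket < upper:
            return version
    raise AssertionError("unreachable when percentages sum to 100")


canary_split = {"stable": 90, "canary": 10}   # hypothetical canary release
rollback_split = {"stable": 100, "canary": 0}  # rollback: all traffic back to stable

canary_hits = sum(pick_version(f"req-{i}", canary_split) == "canary" for i in range(10_000))
print(f"canary share: {canary_hits / 10_000:.1%}")
```

The point for exam scenarios is that rollback is just a change of weights, so the previous version must stay deployed until the canary has proven itself.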
Monitoring and MLOps
| Signal | What it tells you |
| --- | --- |
| training-serving skew | production features differ from training features |
| data drift | input distribution changed |
| concept drift | relationship between features and label changed |
| model performance | predictions no longer meet target metric |
| latency and errors | serving system health |
| cost | model and infrastructure efficiency |
| explainability shift | decision drivers changed across time or segment |
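
One way to quantify the data-drift signal is the Population Stability Index (PSI), which compares binned feature proportions between a baseline and recent traffic. PSI is a commonly used statistic swapped in here for illustration; this cheat sheet does not prescribe it, and the bin edges and sample data below are assumptions.

```python
import math


def psi(baseline, recent, bin_edges, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    if not baseline or not recent:
        raise ValueError("both samples must be non-empty")

    def proportions(values):
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            # Bin index = number of edges the value is at or above.
            counts[sum(v >= edge for edge in bin_edges)] += 1
        # Floor each proportion at eps so empty bins do not cause log(0).
        return [max(c / len(values), eps) for c in counts]

    base_p = proportions(baseline)
    recent_p = proportions(recent)
    return sum((r - b) * math.log(r / b) for b, r in zip(base_p, recent_p))


# Hypothetical feature values: training baseline versus two serving windows.
training = list(range(100))
serving_same = list(range(100))
serving_shifted = [v + 40 for v in range(100)]
edges = [25, 50, 75]

print(psi(training, serving_same, edges))
print(psi(training, serving_shifted, edges))
```

A commonly cited rule of thumb treats PSI above roughly 0.2 as a notable shift, but that threshold is a starting point to validate per feature. PSI measures input drift only; concept drift still needs labeled production data to detect.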
GenAI and responsible AI
| Scenario | Strong answer pattern |
| --- | --- |
| model hallucinates | grounding, retrieval quality, evaluation set, and human review |
| sensitive prompt data | classification, access, retention, redaction, and approved logging |
| unsafe generated output | safety filters, policy, review, and monitoring |
| prompt changes break quality | regression evaluation and versioned prompt/model config |
| GenAI versus predictive ML | choose GenAI for language/content tasks, not every prediction problem |
| high-impact decision | human oversight, explainability, fairness, and audit trail |
Common traps
| Trap | Better instinct |
| --- | --- |
| model-first thinking | start with business problem, metric, and data |
| accuracy as universal metric | choose metric based on error cost and class balance |
| deployment as finish line | production ML needs monitoring and retraining criteria |
| retraining on schedule only | retrain based on drift, performance, data changes, or business need |
| custom model by default | prefer managed or pretrained options when they meet requirements |
| GenAI without controls | add grounding, safety, privacy, and evaluation |
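
The "retraining on schedule only" trap can be sketched as an evidence check that returns the reasons, if any, to retrain. `MonitoringSnapshot`, its fields, and the default threshold are hypothetical names and values for illustration; the metric is assumed to be higher-is-better.

```python
from dataclasses import dataclass


@dataclass
class MonitoringSnapshot:
    drift_score: float     # drift statistic from feature monitoring (assumed scale)
    live_metric: float     # metric on recent labeled production data, higher is better
    target_metric: float   # minimum acceptable value agreed with the business
    schema_changed: bool   # upstream data contract changed


def retraining_reasons(snapshot: MonitoringSnapshot, drift_threshold: float = 0.2) -> list:
    """List the evidence-based reasons to retrain; an empty list means do not retrain yet."""
    reasons = []
    if snapshot.drift_score > drift_threshold:
        reasons.append("data drift")
    if snapshot.live_metric < snapshot.target_metric:
        reasons.append("performance below target")
    if snapshot.schema_changed:
        reasons.append("data change")
    return reasons


healthy = MonitoringSnapshot(drift_score=0.05, live_metric=0.91, target_metric=0.85, schema_changed=False)
degraded = MonitoringSnapshot(drift_score=0.31, live_metric=0.79, target_metric=0.85, schema_changed=False)

print(retraining_reasons(healthy))
print(retraining_reasons(degraded))
```

Returning reasons rather than a bare boolean keeps an audit trail of why a retraining run was triggered.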
Final 15-minute review
| If the stem says… | Start here |
| --- | --- |
| poor metric | error cost, imbalance, business objective, and threshold |
| data leakage | split, feature timing, transformation, and test-set isolation |
PMLE answers should protect the ML lifecycle: right metric, clean data boundaries, appropriate model path, safe deployment, continuous monitoring, and responsible controls.