
Linux Foundation PCA Sample Questions with Explanations


These original sample questions help you check how exam topics appear in decision-style prompts. They are not taken from the live exam.

Use these sample questions as a guided self-assessment for Prometheus Certified Associate (PCA) topics such as the metrics model, labels, scraping, service discovery, PromQL, alert rules, recording rules, histograms, dashboards, and operational scaling. The prompts focus on choosing the right query or control for an observability problem.

Where these questions fit in the PCA guide

The sample set below is part of the Linux Foundation / CNCF PCA guide path.

PCA observability sample questions

Work through each prompt before opening the explanation. PCA questions reward metric-shape discipline: labels, counter math, range vectors, alert quality, cardinality, and scrape troubleshooting.


Question 1

Topic: Missing scrape targets

A Kubernetes application exposes /metrics, but no time series appear for the new pods. Prometheus uses Kubernetes service discovery and relabeling. The pods are healthy, and the endpoint works when accessed directly from inside the cluster. What should the team check first?

  • A. Whether the Grafana dashboard has a dark-theme setting enabled.
  • B. Whether the application writes metrics to a local disk file before Prometheus starts.
  • C. Whether the ServiceMonitor or scrape configuration selects the correct namespace, labels, port name, and path after relabeling.
  • D. Whether the pod CPU request is higher than the namespace quota.

Best answer: C

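Explanation: If the endpoint works but Prometheus has no series, the strongest first check is target discovery and scrape selection. In Kubernetes setups, label selectors, namespace selectors, port names, paths, and relabeling determine whether Prometheus actually scrapes the target. The Targets and Service Discovery status pages in the Prometheus UI show whether the pods were discovered at all and whether relabeling dropped them.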
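As a rough sketch of what to compare against the running workload, assuming the Prometheus Operator is installed, a ServiceMonitor like the one below only yields targets when every selector lines up. The names example-app, apps, and monitoring, the release label, and the port name metrics are hypothetical and not taken from the question.

```yaml
# Hypothetical ServiceMonitor; every name and label value here is an illustrative assumption.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
  labels:
    release: prometheus   # must match the Prometheus resource's serviceMonitorSelector, if one is set
spec:
  namespaceSelector:
    matchNames:
      - apps              # namespace where the Service lives
  selector:
    matchLabels:
      app: example-app    # must match labels on the Service, not only on the pods
  endpoints:
    - port: metrics       # name of the Service port, not a port number
      path: /metrics
      interval: 30s
```

A mismatch in any one of these fields (namespace, Service labels, port name, or path) is enough for the new pods to be missing from the target list even though the endpoint itself responds.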

Why the other choices are weaker:

  • A affects display preferences, not scrape target discovery.
  • B describes a model Prometheus does not normally use: it pulls metrics over HTTP from the endpoint rather than reading files the application writes to disk.
  • D may affect scheduling, but the pods are already healthy and reachable.

What this tests: Scrape configuration, Kubernetes discovery, selectors, relabeling, and target health troubleshooting.

Related topics: Scraping; Service discovery; ServiceMonitor; Relabeling; Targets


Question 2

Topic: Alert quality

A team pages the on-call engineer whenever average CPU usage is above 70 percent for five minutes. The alert fires often during safe batch windows and misses user-visible API failures. What is the best improvement?

  • A. Build alerts around user-impacting symptoms and service objectives, then use resource alerts as supporting diagnostic signals.
  • B. Remove all alerts and rely only on dashboards during incidents.
  • C. Alert on every pod restart regardless of workload, environment, or impact.
  • D. Lower the CPU threshold to 50 percent so the team gets earlier warning.

Best answer: A

Explanation: Good alerting focuses on actionable symptoms and service health, not every noisy internal signal. CPU can be useful context, but paging should usually be tied to user impact, error rates, latency, availability, saturation, or an SLO-style condition with clear ownership.
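One way a symptom-based page might look, as a minimal sketch rather than a recommended threshold: the rule below pages on the ratio of failed API requests instead of CPU. The metric http_requests_total, the job and code labels, the 5 percent threshold, the 10-minute hold, and the runbook URL are all assumptions made for illustration.

```yaml
# Hypothetical alerting rule; metric names, labels, thresholds, and the URL are placeholders.
groups:
  - name: api-symptom-alerts
    rules:
      - alert: ApiHighErrorRatio
        # Fraction of requests answered with a 5xx status over the last five minutes.
        expr: |
          sum(rate(http_requests_total{job="api", code=~"5.."}[5m]))
            /
          sum(rate(http_requests_total{job="api"}[5m]))
            > 0.05
        for: 10m            # condition must hold this long before the alert fires
        labels:
          severity: page
        annotations:
          summary: "More than 5% of API requests are failing"
          runbook_url: "https://example.com/runbooks/api-high-error-ratio"
```

CPU usage can still appear on the dashboard linked from the runbook, where it helps diagnose the cause without paging anyone during a batch window.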

Why the other choices are weaker:

  • B removes proactive notification and slows response.
  • C creates noise because restarts are not equally important in every workload.
  • D increases noise without addressing the mismatch between CPU and user impact.

What this tests: Alert design, actionable paging, symptom-based monitoring, and the difference between diagnostic signals and page-worthy conditions.

Related topics: Alerting; SLOs; Runbooks; Resource metrics; Noise reduction


Question 3

Topic: Repeated expensive PromQL

A dashboard and several alerts repeatedly calculate the same complex request-rate aggregation across hundreds of services. The query is correct, but it is slow and expensive. What is the strongest Prometheus-native optimization?

  • A. Duplicate the query into every dashboard panel so each panel can tune it independently.
  • B. Replace Prometheus with application logs because logs are always cheaper to aggregate.
  • C. Increase label cardinality so the query has more dimensions to filter.
  • D. Create a recording rule that precomputes the aggregation into a new time series used by dashboards and alerts.

Best answer: D

Explanation: Recording rules precompute frequently used or expensive PromQL expressions and store the result as time series. This reduces repeated query cost and makes dashboards and alerts more consistent.
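A minimal sketch of such a rule, assuming a counter named http_requests_total with a service label (both hypothetical here), could look like this. The record name follows the common level:metric:operations naming convention.

```yaml
# Hypothetical recording rule; the metric name, label, and interval are illustrative assumptions.
groups:
  - name: request-rate-recording
    interval: 1m            # how often the expression is evaluated and stored
    rules:
      - record: service:http_requests:rate5m
        expr: sum by (service) (rate(http_requests_total[5m]))
```

Dashboards and alerts would then query service:http_requests:rate5m directly, so the expensive aggregation runs once per evaluation interval instead of once per panel refresh.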

Why the other choices are weaker:

  • A repeats the same expensive work and increases maintenance risk.
  • B changes signal type and does not solve the Prometheus query-design problem.
  • C usually makes performance and storage pressure worse by increasing cardinality.

What this tests: Recording rules, query performance, aggregation reuse, and operational Prometheus design.

Related topics: Recording rules; PromQL; Aggregation; Dashboards; Alert rules


Question 4

Topic: Latency percentiles

An API exposes request-duration histograms with buckets. The SRE team wants to graph an approximate p95 latency over a five-minute window by route. Which PromQL pattern is most appropriate?

  • A. Use avg(http_request_duration_seconds_sum) grouped by route.
  • B. Use histogram_quantile() over a rate of the histogram bucket series, grouped by le and route.
  • C. Use the raw counter value from http_request_duration_seconds_count without a range function.
  • D. Use up == 1 because target availability and request latency are the same signal.

Best answer: B

Explanation: Prometheus histograms expose bucket counters. To estimate a percentile such as p95 over a time window, calculate the rate of bucket observations, keep the le bucket label during aggregation, and pass the result to histogram_quantile().
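The pattern described above can be sketched as follows, assuming a classic histogram whose bucket series is named http_request_duration_seconds_bucket and carries a route label. The expression works unchanged as a dashboard query; it is wrapped in a rule file here only to show it in a complete, checkable form, and the record name is hypothetical.

```yaml
# Hypothetical rule file; the metric name, route label, and record name are assumptions.
groups:
  - name: latency-quantiles
    rules:
      - record: route:http_request_duration_seconds:p95_5m
        # rate() turns cumulative bucket counters into per-second rates over 5m,
        # sum by (le, route) aggregates instances while keeping the bucket boundary label,
        # and histogram_quantile() estimates the 0.95 quantile within each route.
        expr: |
          histogram_quantile(
            0.95,
            sum by (le, route) (rate(http_request_duration_seconds_bucket[5m]))
          )
```

Dropping le from the aggregation removes the bucket boundaries the function needs, and the result is only as precise as the bucket layout because values are interpolated within a bucket.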

Why the other choices are weaker:

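  • A does not compute a percentile; averaging the cumulative _sum series on its own yields neither a latency distribution nor even a mean, because it is never divided by the request count.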
  • C uses a monotonically increasing count rather than a windowed latency distribution.
  • D checks target scrape availability, not request latency.

What this tests: Histograms, bucket labels, range functions, rates, and percentile estimation in PromQL.

Related topics: Histograms; PromQL; histogram_quantile; Range vectors; Latency

Independent study note

Tech Exam Lexicon and IT Mastery are independent study tools. They are not affiliated with, endorsed by, or sponsored by the Linux Foundation, CNCF, or any certification body.

Revised on Sunday, May 10, 2026