Python Institute PCEI Sample Questions with Explanations

These original sample questions are designed to help you check how the exam topics appear in decision-style prompts. They are not taken from the live exam.

Use these sample questions as a guided self-assessment for Certified Entry-Level Python Programmer for AI (PCEI) topics such as Python foundations for AI scripts, features and labels, train/test evaluation, prompts, embeddings, retrieval, model limitations, privacy, and responsible AI checks. The prompts focus on safe implementation choices, not hype terms.

PCEI Python and AI sample questions

Work through each prompt before opening the explanation. PCEI questions combine basic code reasoning with AI workflow judgment: data preparation, task type, validation, grounding, privacy, and human oversight.


Question 1

Topic: Features and labels

A beginner is preparing a supervised learning dataset to predict whether support tickets will miss the service-level target. Each row describes a ticket before the deadline passes. Which column should be treated as the label?

  • A. The ticket category selected when the ticket was opened.
  • B. The number of words in the first customer message.
  • C. Whether the ticket eventually missed the service-level target.
  • D. The agent team assigned at ticket creation.

Best answer: C

Explanation: In supervised learning, the label is the outcome the model is trained to predict. Here, the target outcome is whether the ticket missed the service-level target. The other columns may be useful features if they are known before prediction time.
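
As a minimal sketch of the split described in the explanation, the snippet below separates the label from the features in plain Python. The rows and the column names (`category`, `first_message_words`, `team`, `missed_sla`) are hypothetical and only mirror the answer choices; they do not come from a real dataset.

```python
# Hypothetical ticket rows; every column name here is an assumption for illustration.
tickets = [
    {"category": "billing", "first_message_words": 42, "team": "tier1", "missed_sla": True},
    {"category": "login", "first_message_words": 15, "team": "tier2", "missed_sla": False},
]

# The label is the outcome we want to predict (choice C in the question).
LABEL_COLUMN = "missed_sla"


def split_features_and_label(row, label_column=LABEL_COLUMN):
    """Return (features, label) for a single row."""
    # Everything except the label column is treated as a candidate feature.
    features = {key: value for key, value in row.items() if key != label_column}
    return features, row[label_column]


X = []  # feature dictionaries, one per ticket
y = []  # labels, aligned with X by position
for row in tickets:
    features, label = split_features_and_label(row)
    X.append(features)
    y.append(label)

print(X[0])
print(y)
```

Keeping the label out of `X` matters: if the outcome column leaked into the features, the model would be given the answer it is supposed to predict.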

Why the other choices are weaker:

  • A, B, and D describe input information, not the outcome being predicted.

What this tests: Distinguishing features from labels in a supervised learning problem.

Related topics: Supervised learning; Features; Labels; Prediction target; Data preparation


Question 2

Topic: Retrieval before generation

A Python chatbot should answer questions from a company’s internal policy documents. The team wants answers to stay grounded in the documents and avoid inventing policy details. Which design is strongest?

  • A. Put every policy document into one very long system prompt and hope the model remembers the relevant paragraph.
  • B. Use retrieval to select relevant policy passages, pass those passages into the prompt, and ask the model to answer from that context.
  • C. Ask the model to generate a new policy from general internet knowledge.
  • D. Remove policy documents from the workflow so the model is not constrained.

Best answer: B

Explanation: Retrieval helps the application find relevant source text before generation. Passing selected passages into the prompt gives the model grounding material and makes it easier to validate whether the answer follows the available policy context.
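
The retrieve-then-prompt pattern can be sketched roughly as follows. This is an illustration under stated assumptions, not a production design: the policy passages are invented, simple word overlap stands in for embedding-based retrieval, and no model is actually called. The function names `retrieve` and `build_prompt` are hypothetical.

```python
import re

# Hypothetical policy passages; a real application would load these from documents.
POLICY_PASSAGES = [
    "Employees may work remotely up to three days per week with manager approval.",
    "Expense reports must be submitted within 30 days of purchase.",
    "Company laptops must use full-disk encryption.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, top_k=1):
    """Rank passages by word overlap with the question (a stand-in for embeddings)."""
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda passage: len(question_words & tokenize(passage)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, passages):
    """Place only the retrieved passages into the prompt as grounding context."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer using only the policy context below. "
        "If the context does not cover the question, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


question = "How many days do I have to submit expense reports?"
selected = retrieve(question, POLICY_PASSAGES)
prompt = build_prompt(question, selected)
print(prompt)
```

Because the prompt contains only the selected passages, a reviewer can check the generated answer against a small, known piece of source text.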

Why the other choices are weaker:

  • A is brittle, hard to maintain, and can exceed context limits.
  • C does not use the company’s actual policy source.
  • D increases the chance of unsupported answers.

What this tests: Retrieval-augmented generation, grounding, source selection, and prompt context design.

Related topics: Retrieval; Grounding; Prompts; Embeddings; AI applications


Question 3

Topic: Privacy in AI data preparation

A learner wants to test a summarization script using real customer support tickets that contain names, addresses, and account numbers. What is the safest first step?

  • A. Upload the tickets directly because summarization is not model training.
  • B. Keep only the longest tickets because short tickets contain less useful data.
  • C. Replace the model with a larger model so it can handle sensitive data safely.
  • D. Remove or mask personal data, confirm the permitted use of the dataset, and test with the minimum data needed.

Best answer: D

Explanation: Responsible AI work starts before the model call. The learner should reduce exposure by masking or removing personal data, checking whether the data is permitted for the use case, and using only the data needed for the test.
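
A rough sketch of masking before any model call is shown below. It assumes account numbers are runs of 8 to 12 digits and that customer names are available from a known list; both are assumptions made for illustration. Real tickets would also need address handling and a human review, which this sketch does not attempt.

```python
import re

# Assumed account-number format: 8 to 12 consecutive digits (illustrative only).
ACCOUNT_PATTERN = re.compile(r"\b\d{8,12}\b")


def mask_ticket(text, known_names=()):
    """Replace account numbers and known customer names with placeholders."""
    masked = ACCOUNT_PATTERN.sub("[ACCOUNT]", text)
    for name in known_names:
        # re.escape keeps characters in the name from being read as regex syntax.
        masked = re.sub(re.escape(name), "[NAME]", masked, flags=re.IGNORECASE)
    return masked


# Hypothetical ticket text, not real customer data.
ticket = "Jane Doe reports a duplicate charge on account 1234567890."
safe_ticket = mask_ticket(ticket, known_names=["Jane Doe"])
print(safe_ticket)
```

Pattern-based masking reduces exposure but can miss values in unexpected formats, so it complements rather than replaces confirming the permitted use of the data and testing with the minimum data needed.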

Why the other choices are weaker:

  • A ignores privacy and data-use constraints.
  • B changes which examples are tested but does not address sensitive data.
  • C does not make sensitive data safe by itself.

What this tests: Privacy, data minimization, responsible AI workflow, and safe test-data handling.

Related topics: Responsible AI; Privacy; Data minimization; PII; Testing


Question 4

Topic: Evaluating generated output

A Python script sends prompts to a generative model and receives fluent answers. The learner wants to know whether the answers are correct for a known set of examples. What should they add?

  • A. A repeatable evaluation set with expected answers or scoring criteria, then compare model responses against those criteria.
  • B. A random sleep between requests so answers vary more.
  • C. A print statement that displays only the first ten characters of each response.
  • D. A larger font in the terminal output so mistakes are easier to see.

Best answer: A

Explanation: Fluent output is not the same as correct output. A repeatable evaluation set gives the learner a controlled way to compare responses against expected behavior, identify regressions, and improve prompts or workflow design.
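
One possible shape for such an evaluation loop is sketched here. The `fake_model` function is a hypothetical stand-in that returns canned text so the example runs without any external service, and the substring check is a deliberately crude scoring criterion chosen for brevity.

```python
def fake_model(prompt):
    """Hypothetical stand-in for a real model call; returns canned responses."""
    canned = {
        "What is 2 + 2?": "The answer is 4.",
        "What is the capital of France?": "The capital of France is Paris.",
        "How many days are in a week?": "There are five days in a week.",
    }
    return canned.get(prompt, "")


# A small, fixed evaluation set: the same prompts and expectations on every run.
EVAL_SET = [
    {"prompt": "What is 2 + 2?", "expected": "4"},
    {"prompt": "What is the capital of France?", "expected": "Paris"},
    {"prompt": "How many days are in a week?", "expected": "seven"},
]


def evaluate(model, eval_set):
    """Score each response by checking whether it contains the expected text."""
    results = []
    for case in eval_set:
        response = model(case["prompt"])
        passed = case["expected"].lower() in response.lower()
        results.append({"prompt": case["prompt"], "passed": passed})
    return results


results = evaluate(fake_model, EVAL_SET)
accuracy = sum(result["passed"] for result in results) / len(results)

for result in results:
    print(result["prompt"], "->", "pass" if result["passed"] else "fail")
print(f"accuracy: {accuracy:.2f}")
```

The third canned response reads fluently but is wrong, and the fixed evaluation set catches it. Rerunning the same set after a prompt change shows whether results improved or regressed.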

Why the other choices are weaker:

  • B changes timing, not correctness measurement.
  • C hides most of the output and does not define correctness.
  • D may improve readability but does not create an evaluation method.

What this tests: Evaluation discipline for AI outputs, test sets, scoring criteria, and repeatable validation.

Related topics: Evaluation; Generative AI; Test sets; Prompt quality; Validation

Independent study note

Tech Exam Lexicon and IT Mastery are independent study tools. They are not affiliated with, endorsed by, or sponsored by Python Institute or any certification body.

Revised on Sunday, May 10, 2026