Python Institute PCED glossary of data cleaning, transformation, visualization, and analysis terms.
Use this glossary when terms from the PCED (Certified Entry-Level Data Analyst with Python) exam start to blur together. The goal is practical recognition, not encyclopedic coverage.
| Term | Exam meaning |
|---|---|
| DataFrame | Two-dimensional, labeled tabular data structure (the core pandas object) with rows and columns. |
| Feature | Input variable used by a model. |
| Label | Target value a supervised model learns to predict. |
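| Overfitting | Model fits the training data too closely, including its noise, and generalizes poorly to new data. |
| Train/test split | Dividing a dataset into a training portion the model learns from and a held-out test portion used only for evaluation. |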
| Imputation | Replacing missing values with chosen estimates or defaults. |
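
The terms in the table above can be tied together in one small sketch. It uses only the standard library, with a list of dictionaries standing in for a DataFrame; the records, the column names (`hours`, `sleep`, `passed`), and the helper functions are all hypothetical and chosen for illustration. It splits first and then imputes with the training mean, which is one reasonable choice among several.

```python
import random
from statistics import mean

# Hypothetical toy records (a stand-in for DataFrame rows); None marks a missing value.
rows = [
    {"hours": 1.0, "sleep": 8.0, "passed": 0},
    {"hours": 2.0, "sleep": None, "passed": 0},
    {"hours": 3.0, "sleep": 6.0, "passed": 0},
    {"hours": 4.0, "sleep": 7.0, "passed": 1},
    {"hours": None, "sleep": 5.0, "passed": 1},
    {"hours": 6.0, "sleep": 7.0, "passed": 1},
    {"hours": 7.0, "sleep": 6.0, "passed": 1},
    {"hours": 8.0, "sleep": 9.0, "passed": 1},
]

FEATURES = ["hours", "sleep"]  # features: the input variables
LABEL = "passed"               # label: the target to predict


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and return (train, test)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def column_mean(records, column):
    """Mean of the observed (non-missing) values in one column."""
    return mean(r[column] for r in records if r[column] is not None)


def fill_missing(records, column, value):
    """Imputation: return copies with None in `column` replaced by `value`."""
    return [{**r, column: value if r[column] is None else r[column]} for r in records]


# Train/test split: hold some rows back for evaluation only.
train, test = train_test_split(rows)

# Impute with statistics computed from the training rows only,
# so no information from the test rows leaks into training.
for col in FEATURES:
    fill = column_mean(train, col)
    train = fill_missing(train, col, fill)
    test = fill_missing(test, col, fill)

# Separate features (X) from labels (y).
X_train = [[r[c] for c in FEATURES] for r in train]
y_train = [r[LABEL] for r in train]
X_test = [[r[c] for c in FEATURES] for r in test]
y_test = [r[LABEL] for r in test]

print(len(train), len(test))  # 6 2
```

Comparing a model's score on the training rows with its score on the test rows is the usual way to spot overfitting: a large gap in favor of the training score suggests the model memorized rather than generalized.
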

Some terms are easiest to learn as contrasting pairs:

| Pair | How to separate them |
|---|---|
| Python foundations vs data handling | Ask which layer the scenario is testing, core language behavior or work on a dataset, then match the answer to that layer only. |
| Control vs evidence | A control changes behavior; evidence proves behavior or supports investigation. |
| Managed service vs custom build | Managed services win for lower operational effort unless the requirement needs unsupported customization. |
| Prevention vs detection | Prevention blocks or reduces a bad event; detection finds or reports that it happened. |
Do not memorize terms in isolation. For each term, write one scenario where it is the best answer, one scenario where it is a distractor, and one signal that proves it worked.