Study Azure AI-900 Vision Tasks: key concepts, common traps, and exam decision cues.
AI-900 computer-vision questions are easier when you identify the expected output. Is the system assigning a label to the whole image, locating objects, reading text, or working with faces? That output usually tells you the task.
| Task | What the output looks like | Common clue |
|---|---|---|
| image classification | one or more labels for the whole image | “categorize this image” |
| object detection | labels plus object locations | “find the cars in the image” |
| OCR | extracted text from a visual source | “read the sign” |
| facial detection | location of each face found in the image | “determine whether a face is present, and where” |
| facial analysis | attributes or information derived from a detected face | “analyze the detected face” |
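
One way to internalize the table above is to treat each task as a distinct output shape. The sketch below is a study aid only, not an Azure API: `OutputShape` and `task_from_output` are hypothetical names, and the check order (most specific output first) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputShape:
    """Hypothetical description of what a scenario expects back."""
    whole_image_labels: bool = False   # labels for the image as a whole
    object_locations: bool = False     # labels plus where each object is
    extracted_text: bool = False       # text read from the image
    face_location: bool = False        # where a face is
    face_attributes: bool = False      # information derived from a face


def task_from_output(shape: OutputShape) -> str:
    """Map an expected output to the vision task, most specific first."""
    if shape.face_attributes:
        return "facial analysis"
    if shape.face_location:
        return "facial detection"
    if shape.extracted_text:
        return "OCR"
    if shape.object_locations:
        return "object detection"
    if shape.whole_image_labels:
        return "image classification"
    return "unknown"


# "Find the cars in the image": labels AND locations are expected.
cars = OutputShape(whole_image_labels=True, object_locations=True)
print(task_from_output(cars))
```

Object detection is checked before image classification because a detection result also carries labels; the location is what separates the two.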
OCR is about extracting text from an image or document. Document processing often goes one step further and cares about document structure such as fields, tables, or form layout. For example, reading the words printed on a receipt is OCR, while returning the receipt total as a named field is document processing. AI-900 can test this boundary even when the official computer-vision objectives stay centered on visual capabilities.
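
The difference in output can be sketched with plain Python. This is not how any Azure service works internally; `extract_fields` is a hypothetical helper and the receipt lines are made-up sample data, used only to contrast flat OCR text with structured fields.

```python
from typing import Dict, List


def extract_fields(ocr_lines: List[str]) -> Dict[str, str]:
    """Turn 'Key: value' lines into named fields (toy document processing)."""
    fields = {}
    for line in ocr_lines:
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


# OCR-style output: just the text that was read (hypothetical sample data).
ocr_lines = ["Contoso Store", "Date: 2024-01-15", "Total: 42.50"]

# Document-processing-style output: the same text, organized into fields.
fields = extract_fields(ocr_lines)
print(fields)
```

If a question only asks for the text, OCR is enough; if it asks for a specific field or table from a form, the structured output is the clue.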
| Scenario clue | Strongest first answer |
|---|---|
| “Which category best describes this image?” | image classification |
| “Detect each package in the warehouse photo” | object detection |
| “Read the street sign text” | OCR |
| “Locate faces in uploaded photos” | facial detection |
| “Analyze information from a detected face” | facial analysis |
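
The scenario clues above can also be drilled as a small keyword matcher. This is a minimal sketch for self-testing, assuming simple substring cues; `RULES` and `first_answer` are hypothetical names, and real exam wording will not always match these keywords.

```python
from typing import List, Tuple

# Ordered rules: the first rule whose keywords ALL appear wins.
# Face rules come first so "detected face" is not mistaken for object detection.
RULES: List[Tuple[str, Tuple[str, ...]]] = [
    ("facial analysis", ("face", "analy")),
    ("facial detection", ("face",)),
    ("OCR", ("read",)),
    ("object detection", ("detect",)),
    ("image classification", ("categor",)),
]


def first_answer(scenario: str) -> str:
    """Return the strongest first answer for a scenario clue, or 'unknown'."""
    text = scenario.lower()
    for task, keywords in RULES:
        if all(keyword in text for keyword in keywords):
            return task
    return "unknown"


for clue in [
    "Which category best describes this image?",
    "Detect each package in the warehouse photo",
    "Read the street sign text",
    "Locate faces in uploaded photos",
    "Analyze information from a detected face",
]:
    print(clue, "->", first_answer(clue))
```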