Microsoft Azure AI Fundamentals (AI-900) for Non-Technical Pros

AI Certification Exam Prep — Beginner

Learn AI-900 essentials fast with practice, scenarios, and a full mock exam.

Level: Beginner · Tags: ai-900, microsoft, azure, ai-fundamentals

Course overview

This beginner-friendly course prepares you to pass the Microsoft Azure AI Fundamentals exam (AI-900), even if you’re a non-technical professional. You’ll learn how to recognize common AI workloads, understand core machine learning concepts, and confidently choose the right Azure AI approach in real-world scenarios—exactly the kind of decision-making the AI-900 exam rewards.

The Microsoft AI-900 exam is designed to validate foundational knowledge across key areas of Azure AI. Instead of coding, you’ll be tested on concepts, terminology, and scenario-based choices: What type of AI workload is this? Which capability best fits the requirement? What outcomes and limitations should you expect? This course is structured as a 6-chapter “book” so you always know what to study next and how it maps to official objectives.

Aligned to the official AI-900 domains

Each chapter directly targets the exam domains:

  • Describe AI workloads
  • Fundamental principles of ML on Azure
  • Computer vision workloads on Azure
  • NLP workloads on Azure
  • Generative AI workloads on Azure

Chapters 2–5 provide clear explanations at the right depth for AI-900, followed by exam-style practice so you can learn the concepts and immediately apply them the way Microsoft asks.

How the 6 chapters work

Chapter 1 gets you exam-ready operationally: exam registration and scheduling, what the score means, typical question patterns, and a simple study strategy you can follow whether you have 2 weeks or a month.

Chapters 2–5 each cover one (or two) official domains. You’ll focus on identifying AI workload types, understanding ML fundamentals (training vs inference, evaluation, and responsible AI), and choosing between vision, language, and generative solutions depending on the scenario. Throughout, you’ll practice with questions written in the style you’ll see on the AI-900 exam.

Chapter 6 is your full mock exam and final review. You’ll complete two timed mock sections, analyze weak spots by domain, and finish with an exam-day checklist that helps you avoid common mistakes and manage time confidently.

What makes this course effective for non-technical learners

  • Plain-language explanations that assume no prior certification experience
  • Scenario-first learning: focus on how to choose the right AI approach
  • Practice questions in exam style, plus review guidance to improve quickly
  • Responsible AI concepts integrated across all domains (not treated as an afterthought)

Get started

If you’re ready to begin, create your learning account and start the first chapter today: Register free. Prefer to compare options first? You can also browse all courses on the Edu AI platform.

What You Will Learn

  • Describe AI workloads and core AI concepts tested in the AI-900 exam
  • Explain fundamental principles of machine learning on Azure (training, evaluation, and responsible AI)
  • Identify Azure computer vision workloads and select the right vision capability for a scenario
  • Identify Azure natural language processing (NLP) workloads and select the right language capability for a scenario
  • Describe generative AI workloads on Azure, including model concepts, use cases, and responsible deployment

Requirements

  • Basic IT literacy (web apps, cloud concepts, and common business software)
  • No prior Microsoft certification experience required
  • Willingness to practice with scenario-based, multiple-choice questions

Chapter 1: AI-900 Exam Orientation and Study Plan

  • Understand the AI-900 exam format, domains, and question styles
  • Register, schedule, and choose exam delivery (online vs test center)
  • Build a 2-week and 4-week study strategy for beginners
  • Set up your Azure learning environment and free resources

Chapter 2: Describe AI Workloads (Domain) and Core Concepts

  • Recognize AI workloads and match them to real business scenarios
  • Differentiate ML, computer vision, NLP, and generative AI at a high level
  • Apply responsible AI concepts to common workplace use cases
  • Practice: AI workload identification (exam-style set)

Chapter 3: Fundamental Principles of Machine Learning on Azure (Domain)

  • Explain supervised, unsupervised, and reinforcement learning at exam depth
  • Understand training, validation, testing, and evaluation metrics
  • Choose the right ML approach for a scenario (regression vs classification vs clustering)
  • Practice: ML fundamentals on Azure (exam-style set)

Chapter 4: Computer Vision Workloads on Azure (Domain)

  • Identify key computer vision tasks and how they appear on the exam
  • Select Azure vision capabilities for OCR, detection, and image understanding scenarios
  • Understand responsible vision considerations and limitations
  • Practice: computer vision scenario questions (exam-style set)

Chapter 5: NLP and Generative AI Workloads on Azure (Domains)

  • Identify NLP tasks and choose the right capability for language scenarios
  • Explain generative AI concepts (LLMs, prompts, grounding) at AI-900 depth
  • Apply responsible AI and safety concepts to language and generative use cases
  • Practice: NLP + generative AI mixed scenarios (exam-style set)

Chapter 6: Full Mock Exam and Final Review

  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist

Jordan Whitaker

Microsoft Certified Trainer (MCT) — Azure AI Fundamentals

Jordan Whitaker is a Microsoft Certified Trainer who specializes in helping beginners pass Microsoft fundamentals exams on the first attempt. He has coached professionals across business, operations, and support teams through AI-900 by translating Azure AI concepts into clear decision-making frameworks and exam-ready practice.

Chapter 1: AI-900 Exam Orientation and Study Plan

AI-900 (Microsoft Azure AI Fundamentals) is designed to validate that you understand what AI is, where it fits in business scenarios, and how Microsoft Azure delivers AI capabilities—without requiring you to code. For non-technical professionals, this exam is less about building models and more about choosing the right AI workload, interpreting high-level machine learning concepts, and explaining responsible AI considerations in plain language.

This chapter orients you to the exam’s format, what it measures, and how to build an efficient study routine in either 2 weeks or 4 weeks. You’ll also set up a lightweight learning environment so you can recognize Azure AI services by name, purpose, and best-fit scenarios—exactly what the exam tests.

Exam Tip: AI-900 answers are often “most appropriate,” not merely “possible.” Your job is to pick the option that best matches the scenario constraints (data type, desired output, latency, cost, and governance), even if multiple choices sound plausible.

  • Understand exam domains and how they map to skills
  • Know how to register and choose delivery mode
  • Plan your study time (2-week or 4-week track)
  • Prepare for question styles and manage time
  • Set up free learning resources and an Azure environment

Use this chapter as your “operating manual” for the rest of the course. The remaining chapters will dig into AI workloads, Azure services, and responsible AI—but your outcomes depend on how well you navigate the exam and practice intelligently.

Practice note: for each objective in this chapter (exam format and question styles, registration and delivery choice, the 2-week vs 4-week study strategy, and environment setup), document your goal, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 1.1: What AI-900 measures and how domains map to skills
Section 1.2: Registration steps, exam policies, and accommodations
Section 1.3: Scoring model, passing expectations, and retake strategy
Section 1.4: Question formats (MCQ, scenario, drag-drop) and time management
Section 1.5: Study plan templates, spaced repetition, and note strategy
Section 1.6: Practice approach: baselines, weak-spot tracking, and review loops

Section 1.1: What AI-900 measures and how domains map to skills

AI-900 measures your ability to describe AI workloads and identify Azure services that solve common business problems. Microsoft organizes the exam into domains that broadly align to: core AI concepts (including machine learning principles), computer vision, natural language processing (NLP), and generative AI—plus responsible AI concepts that cut across all areas.

In practical terms, the exam is testing whether you can hear a scenario and correctly classify the workload: “Is this prediction or classification?” “Is this extracting text from images?” “Is this summarizing language or understanding intent?” “Is this generating content using a foundation model?” Your answers should demonstrate correct matching of problem type → Azure capability.

  • Core AI/ML concepts: training vs inference, features/labels, evaluation metrics, overfitting, and model lifecycle.
  • Vision: image classification, object detection, OCR, and related Azure services/capabilities.
  • NLP: key phrase extraction, sentiment, entity recognition, translation, and conversational bots.
  • Generative AI: prompts, grounding, embeddings, and responsible deployment patterns on Azure.
  • Responsible AI: fairness, reliability, safety, privacy/security, inclusiveness, transparency, accountability.

Exam Tip: When two answers look similar, choose the one that matches the output the scenario asks for. “Detect objects and return bounding boxes” points to object detection, while “categorize the image” points to classification. The exam rewards precise interpretation of what the business wants produced.

Common trap: treating “AI service names” as the goal. The goal is workload selection. Learn the names as labels for capabilities, but practice translating scenarios into the right workload category first, then select the Azure service that implements it.

Section 1.2: Registration steps, exam policies, and accommodations

Registering correctly prevents avoidable stress and last-minute issues. AI-900 exams are scheduled through Microsoft’s certification portal and delivered by an exam provider (commonly Pearson VUE). You’ll pick an exam delivery method: online proctored (remote) or test center. Both test the same objectives; your choice should be based on your environment and comfort with proctoring rules.

High-level registration flow: sign in with your Microsoft account → select AI-900 → choose language and delivery → select date/time and location (or online) → complete identity verification steps required by the provider. Plan this at least a week in advance so you have time to resolve account name mismatches or ID concerns.

  • Online proctored: requires a quiet room, stable internet, and strict desk/room rules. Expect check-in steps, camera verification, and restrictions on breaks.
  • Test center: reduces home-tech risk; centers handle the environment. Travel time is the main drawback.

Exam Tip: If your home environment is unpredictable (noise, shared space, unstable Wi‑Fi), choose a test center. Online cancellations due to technical or environmental issues are one of the most common “preventable failures” for otherwise-prepared candidates.

Review exam policies early: ID requirements, rescheduling windows, and prohibited items. If you need accommodations (for example, extra time due to a documented need), start the request process as early as possible since approvals can take time. A common trap is assuming accommodations can be added after scheduling—often you must secure them first, then schedule under the approved conditions.

Section 1.3: Scoring model, passing expectations, and retake strategy

Microsoft exams use a scaled scoring model: results are reported on a 1–1000 scale, and the passing score is 700. Question weighting can vary, and not every question necessarily contributes equally to the score. Your strategy should therefore focus on consistent coverage across all domains rather than trying to "game" the score.

What passing really requires: not perfection, but competence across the blueprint. AI-900 is fundamentals-level, yet it includes subtle distinctions (for example, when to use a vision OCR capability vs document analysis, or when a generative AI solution needs grounding to reduce hallucinations). Expect a mix of straightforward definitions and scenario interpretation.

Exam Tip: Build “minimum viable competence” in every domain before you try to optimize any single area. Candidates who over-study one favorite topic (often generative AI) and ignore another (often classic ML evaluation concepts) risk missing easy points.

Plan a retake strategy even if you don’t intend to use it. Knowing your fallback reduces anxiety and improves performance. If you fail, use the score report by skill area to identify where to focus. Avoid the common trap of immediately rebooking without changing your study process—your next attempt should include more targeted practice, not just more reading.

  • After a failed attempt, allocate 60–70% of your time to the weakest domains.
  • Redo scenario-based practice to retrain decision-making, not memorization.
  • Confirm any policy-based waiting periods and fees before scheduling a retake.

For non-technical professionals, confidence comes from repetition: “If I see X business need, I map it to Y workload and Z Azure capability.” That mapping skill is what the scoring model ultimately rewards.

Section 1.4: Question formats (MCQ, scenario, drag-drop) and time management

AI-900 questions commonly appear as multiple choice (single answer) and multi-select (“choose two/three”), plus scenario-based items that require you to interpret business context. Some exam forms include drag-and-drop matching (for example, matching a workload to a service or matching steps in an ML process). Your preparation should mirror these formats so you practice the skill the exam measures: quick, accurate mapping under time pressure.

Time management is less about rushing and more about avoiding traps. Scenario questions often contain extra details meant to distract you. Train yourself to underline (mentally) three elements: input data type (text, image, audio, tabular), desired output (label, score, bounding box, summary), and constraints (real-time vs batch, explainability, privacy).

  • MCQ trap: picking a “real” Azure product name that does not match the task.
  • Multi-select trap: choosing only one option when the prompt clearly says “choose two.”
  • Drag-drop trap: mixing up training vs inference steps, or evaluation vs deployment steps.

Exam Tip: For multi-select items, treat each option as a true/false statement against the scenario. Don’t search for “the best pair” first—verify each candidate option independently and then select all that satisfy the requirement.

Build a pacing rule: if you can’t confidently decide within a short window, mark it for review (if your exam interface allows) and move on. Many candidates lose points by spending too long on one confusing scenario and then rushing simpler questions at the end.

Section 1.5: Study plan templates, spaced repetition, and note strategy

Your study plan should match your schedule and your starting point. For beginners, two practical tracks work well: a focused 2-week plan (for those who can study daily) and a steadier 4-week plan (for those balancing work and family). In both tracks, the priority is to rotate through domains and revisit them repeatedly—this is spaced repetition, and it is especially effective for service-to-scenario mapping.

2-week template (high intensity): Days 1–3 core AI/ML concepts; Days 4–5 vision; Days 6–7 NLP; Days 8–9 generative AI and responsible AI; Days 10–12 mixed review + targeted weak spots; Days 13–14 full practice + final revision. This works if you can do 60–90 minutes per day.

4-week template (steady): Week 1 core concepts + responsible AI; Week 2 vision; Week 3 NLP; Week 4 generative AI + full-domain review and practice. This works if you can do 30–60 minutes most days.

Exam Tip: Use “two-layer notes.” Layer 1 is a one-page map: workload → output → Azure capability. Layer 2 is short clarifiers: key definitions (feature/label), what a metric indicates, and common scenario keywords (e.g., “extract printed text” → OCR).

  • Write notes as decision rules (“If the goal is X, prefer Y”) rather than paragraphs.
  • Review notes on a schedule: same day, next day, 3 days later, 7 days later.
  • Keep a running list of confusing terms and resolve them within 24 hours.

Common trap: passive reading of documentation. Fundamentals exams still require recall under pressure. Convert reading into prompts you can answer: “What does this service do?” “What output does it return?” “When would it be the wrong choice?”
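One way to keep Layer 1 of the two-layer notes honest is to store it as a literal lookup table you quiz yourself against. The entries below are illustrative personal study notes of the workload → output → capability pattern, not an official service catalog; verify capability names against current Microsoft documentation:

```python
# Layer 1 "one-page map": workload -> (expected output, capability).
# Entries are hypothetical study notes, not an authoritative mapping.
LAYER1 = {
    "extract printed text":       ("recognized text", "OCR (Read capability)"),
    "categorize an image":        ("class label", "image classification"),
    "locate objects in an image": ("bounding boxes", "object detection"),
    "judge customer sentiment":   ("sentiment label/score", "sentiment analysis"),
    "summarize a document":       ("short summary", "summarization"),
}

def recall_prompt(workload: str) -> str:
    """Layer-2 style recall check: what output and capability fit this workload?"""
    output, capability = LAYER1[workload]
    return f"{workload} -> {output} -> {capability}"

print(recall_prompt("extract printed text"))
```

Writing the map as data (rather than prose) forces each note into the decision-rule shape the tip recommends.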

Section 1.6: Practice approach: baselines, weak-spot tracking, and review loops

Practice is where non-technical candidates separate “I’ve read it” from “I can answer it.” Start by taking a baseline assessment early (after you’ve skimmed the domains) to identify weak spots. Your baseline is not a judgment—it’s a diagnostic that tells you where to invest time. Then run a weekly review loop: practice → analyze errors → update notes → re-practice.

Weak-spot tracking should be specific. Don’t write “NLP” as a weakness; write “confusing key phrase extraction vs sentiment,” or “forgetting when to use embeddings/grounding in generative AI.” Specificity creates targeted fixes and faster improvement.

  • Baseline: 20–30 questions across all domains; tag each miss with a reason (concept gap, misread prompt, guessed between two).
  • Fix: revisit the concept, then create a one-sentence decision rule in your notes.
  • Reinforce: redo similar items 48–72 hours later to confirm retention.
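Tagging each miss with a domain and a reason, as in the list above, makes the weekly analysis mechanical rather than impressionistic. A small sketch with hypothetical baseline data:

```python
from collections import Counter

# Hypothetical baseline results: each miss tagged (domain, reason),
# using the three reason tags suggested above.
misses = [
    ("NLP", "concept gap"),
    ("NLP", "guessed between two"),
    ("ML fundamentals", "misread prompt"),
    ("Computer vision", "concept gap"),
    ("NLP", "concept gap"),
]

by_domain = Counter(domain for domain, _ in misses)
by_reason = Counter(reason for _, reason in misses)

# The domain with the most misses is where to invest study time first.
weakest_domain, miss_count = by_domain.most_common(1)[0]
print(f"Weakest domain: {weakest_domain} ({miss_count} misses)")
print("Misses by reason:", dict(by_reason))
```

Even a spreadsheet works; the point is that tagged misses turn "I'm bad at NLP" into a countable, fixable list.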

Exam Tip: Track “near misses” (questions you got right but weren’t sure about). These are high-risk on exam day because stress can flip them to wrong. Treat them like incorrect answers and strengthen the rule behind the choice.

Set up your Azure learning environment with free resources so concepts feel concrete. Use Microsoft Learn modules, product documentation overview pages, and—if available to you—a free Azure account or sandbox. You don’t need to build production solutions, but clicking through Azure AI service pages, seeing what inputs/outputs look like, and reading a few sample scenarios will reduce ambiguity on test day. The goal of practice is not volume; it’s closing loops until your decision-making becomes automatic.

Chapter milestones
  • Understand the AI-900 exam format, domains, and question styles
  • Register, schedule, and choose exam delivery (online vs test center)
  • Build a 2-week and 4-week study strategy for beginners
  • Set up your Azure learning environment and free resources
Chapter quiz

1. You are advising a non-technical colleague who is starting the AI-900 exam prep. They ask what the exam is primarily designed to validate. Which statement best describes the AI-900 focus?

Correct answer: Ability to identify AI workloads and Azure AI capabilities, and explain concepts like responsible AI without writing code
AI-900 (Azure AI Fundamentals) validates foundational understanding of AI concepts and how Azure provides AI services, commonly at a scenario/selection level rather than hands-on coding. Option B is more aligned with role-based data science/engineering exams where building/training models and coding is expected. Option C aligns with Azure infrastructure/security exams, not the AI fundamentals domains.

2. A candidate notices that several answer choices on practice questions seem plausible. What approach best matches how AI-900 questions are commonly scored and written?

Correct answer: Select the option that is most appropriate given the scenario constraints, even if other options could also work
AI-900 questions often ask for the 'best' or 'most appropriate' choice based on constraints such as data type, desired output, cost, latency, and governance/responsible AI considerations. Option B is incorrect because question type (single vs multi-select) is explicitly indicated; you cannot assume multi-select. Option C is incorrect because exams frequently prefer the simplest service that meets requirements, not the most complex.

3. A busy professional can study only 30–45 minutes on weekdays and 1–2 hours on weekends. They want a realistic plan that reduces burnout while still building familiarity with Azure AI services and question styles. Which study strategy is most appropriate?

Correct answer: Follow a 4-week plan with steady daily practice, time-boxed review, and periodic practice questions
A 4-week plan better fits limited daily availability and supports consistent exposure to exam domains, terminology, and scenario-style decision making. Option B is less appropriate because compressed timelines generally require more daily hours and increases burnout risk. Option C is weak because practice tests without structured learning can lead to memorizing answers without understanding exam objectives and service-fit reasoning.

4. A company’s HR team wants to schedule the AI-900 exam for several employees and asks about delivery options. Which statement correctly reflects typical Microsoft exam delivery choices that candidates must decide during scheduling?

Correct answer: They can choose between online proctored delivery and taking the exam at a test center
Microsoft certification exams commonly offer both online proctored and test center delivery options during registration/scheduling. Option B is incorrect because online proctoring is generally available for many fundamentals exams. Option C is incorrect because these exams are not offered as open-book; exam policies and proctoring rules apply regardless of delivery mode.

5. You are helping a beginner set up a learning environment for AI-900 preparation. They want to spend as little as possible while becoming familiar with Azure AI services by name and typical use cases. What is the most appropriate first step?

Correct answer: Use free Microsoft learning resources and set up an Azure account (free tier/trial) to explore the portal and service categories
AI-900 preparation typically benefits from Microsoft Learn and a lightweight Azure environment (often via free tier/trial) to recognize services and scenarios without heavy cost or setup. Option B is unnecessary because AI-900 does not require building/training models or advanced compute setup. Option C is inappropriate because it adds cost and complexity beyond what the exam domains require for fundamentals-level service selection and concepts.

Chapter 2: Describe AI Workloads (Domain) and Core Concepts

This chapter maps directly to a major AI-900 skill area: recognizing common AI workload patterns and explaining the core vocabulary that shows up in nearly every question. For non-technical professionals, the exam is not trying to turn you into a data scientist—it’s checking whether you can identify the right type of AI for a business scenario, describe what a model does, and show awareness of responsible AI principles.

You’ll see scenario-based items that sound like workplace requests (“we want to predict churn,” “we need to read receipts,” “summarize support tickets,” “detect defects in images,” “draft marketing copy”). Your job is to classify the request into the correct workload family (machine learning, computer vision, NLP, or generative AI), then choose an appropriate capability (often an Azure AI service vs building custom ML).

As you read, keep this exam habit: translate business language into AI workload language. Words like "forecast," "estimate," "score," and "risk" often indicate prediction; "categorize," "approve/deny," and "route" hint at classification; "find objects" or "locate issues in images" suggest detection; "shorten," "key points," and "digest" indicate summarization. This chapter also introduces responsible AI concepts because AI-900 expects you to identify risks and mitigation themes at a high level.

Practice note: for each objective in this chapter (recognizing AI workloads in business scenarios, differentiating ML, computer vision, NLP, and generative AI, applying responsible AI concepts, and the workload-identification practice set), document your goal, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 2.1: Describe AI workloads vs traditional software approaches
Section 2.2: Common AI workload types: prediction, classification, detection, summarization

Section 2.1: Describe AI workloads vs traditional software approaches

AI-900 frequently tests whether you can tell when AI is appropriate versus when traditional rules-based software is enough. Traditional software is deterministic: you define explicit rules (if-then logic), and the program follows them exactly. AI workloads are probabilistic: they learn patterns from data and return the most likely output (often with a confidence score).

Business cue: if the problem is stable, well-defined, and easy to encode (“apply a fixed discount rule,” “validate a 10-digit ID checksum”), it’s usually traditional software. If the problem involves perception, language, or messy real-world variation (“recognize products in photos,” “detect fraud,” “summarize emails”), AI is typically a better fit because writing complete rules is impractical.

Exam Tip: Watch for phrasing like “cannot describe all rules” or “patterns change over time.” That is the exam’s hint that machine learning (or another AI workload) is needed. If the scenario says “must be 100% correct” with no tolerance for error, the best answer may be non-AI controls or a human-in-the-loop approach.

Common trap: assuming anything with the word "automation" requires AI. Many automation problems are workflow and integration tasks (Power Automate, Azure Logic Apps, scripts). The exam expects you to pick AI only when there is learning, perception, or language understanding involved. Another trap is equating "chatbot" with generative AI: many bots are retrieval/FAQ systems (NLP intent detection plus a knowledge base) that never generate novel text.

How to identify correct answers: first decide “rules vs learning.” Then identify the modality: numbers/tables (ML), images/video (vision), text/speech (NLP), content creation (generative AI). This quick triage is an exam-winning habit.

Section 2.2: Common AI workload types: prediction, classification, detection, summarization

AI-900 uses a small set of workload archetypes repeatedly. Your score improves when you can map scenario verbs to these archetypes.

  • Prediction (regression/forecasting): output is a number. Examples: forecast monthly sales, predict delivery time, estimate home price, compute risk score. Look for “how much” or “how many” or “what will the value be.”
  • Classification: output is a category/label. Examples: approve/deny a loan, classify email as spam or not spam, route tickets to “billing/technical,” label sentiment as positive/negative/neutral. Look for “which bucket does this belong to.”
  • Detection: output is “something is present” and often “where.” In computer vision, detection commonly means locating objects (bounding boxes) or identifying anomalies/defects. In security or ops, detection can also mean “find unusual behavior.” Look for “find,” “locate,” “identify objects,” “detect defects.”
  • Summarization: output is shorter text capturing key points. This is an NLP capability and also a common generative AI use case. Look for “condense,” “key points,” “executive summary.”

Exam Tip: Don’t confuse detection with classification. If the question says “is there a defect?” it could be classification. If it says “draw a box around the defect” or “count and locate items,” that’s detection. The exam often uses these subtle cues.

Another trap: summarization versus extraction. Summarization produces a paraphrased condensed version; extraction pulls existing fields (names, dates, invoice totals). On AI-900, extraction often maps to NLP information extraction (entity recognition) or document processing capabilities, while summarization maps to NLP/generative capabilities.

To pick the best answer, focus on the expected output type: number (prediction), label (classification), presence/location (detection), shorter text (summarization). Then choose the corresponding AI domain (ML, vision, NLP, generative AI) based on data type and scenario.
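This triage can be written down as a lookup table, which makes a handy self-test while studying (the key wording below is informal, not official Microsoft terminology):

```python
# Study aid: map the expected output to a workload archetype,
# and the input data type to an AI domain. Key wording is informal.
ARCHETYPES = {
    "number": "prediction (regression/forecasting)",
    "label": "classification",
    "presence/location": "detection",
    "shorter text": "summarization",
}
DOMAINS = {
    "numbers/tables": "machine learning",
    "images/video": "computer vision",
    "text/speech": "NLP",
    "content creation": "generative AI",
}

def triage(expected_output: str, data_type: str) -> str:
    archetype = ARCHETYPES.get(expected_output, "unknown: re-read the scenario")
    domain = DOMAINS.get(data_type, "unknown: re-read the scenario")
    return f"{archetype} via {domain}"

print(triage("presence/location", "images/video"))  # detection via computer vision
```

Quizzing yourself by calling `triage` with scenario cues is a fast way to build the output-first habit before the domain even comes into play.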

Section 2.3: AI terms for the exam: model, training, inference, features, labels

The AI-900 exam uses core machine learning vocabulary even in non-technical scenarios. You don’t need math, but you must know what the terms mean and how they relate to an ML lifecycle.

  • Model: the learned “logic” created from data. It takes inputs and produces outputs (predictions, classes, detections, etc.).
  • Training: the process of feeding historical data to an algorithm so it can learn patterns. Training produces the model.
  • Inference: using a trained model to make predictions on new data (sometimes called “scoring”).
  • Features: the input variables used to make a prediction (e.g., age, tenure, purchase frequency, image pixels, or text tokens depending on the domain).
  • Labels: the correct answers in supervised learning used during training (e.g., “churned” yes/no; defect type; sentiment category).
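The five terms hang together in one tiny, non-Azure sketch. The “model” below is just a learned threshold (the midpoint between the two class averages), far simpler than any real algorithm, but it shows where each term lives: features and labels go into training, training produces the model, and inference applies the model to new data.

```python
# Features: months since a customer's last purchase. Labels: 1 = churned, 0 = stayed.
features = [1, 2, 3, 10, 12, 15]
labels = [0, 0, 0, 1, 1, 1]

def train(features, labels):
    # Training: learn from labeled historical data. The resulting "model"
    # is just the midpoint between each class's average (a toy choice).
    stayed = [f for f, y in zip(features, labels) if y == 0]
    churned = [f for f, y in zip(features, labels) if y == 1]
    midpoint = (sum(stayed) / len(stayed) + sum(churned) / len(churned)) / 2
    return {"threshold": midpoint}

def predict(model, new_feature):
    # Inference (scoring): apply the trained model to new, unlabeled input.
    return 1 if new_feature >= model["threshold"] else 0

model = train(features, labels)  # happens once, before deployment
print(predict(model, 11))        # 1: predicted to churn
print(predict(model, 2))         # 0: predicted to stay
```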

Exam Tip: If the scenario mentions “known outcomes in the past,” it’s hinting at labeled data and supervised learning. If it says “no labels available,” the exam may be pushing you toward unsupervised approaches (like clustering) or toward using prebuilt AI services rather than custom training.

Common trap: mixing up training and inference. Training happens before deployment and is compute-heavy; inference happens in production and must be fast and reliable. Another trap: thinking features are only numeric columns. For AI-900, features are simply the model inputs—text, images, and audio can all be features depending on the model.

What the exam tests: your ability to read a business description and identify where the model comes from (training), what you do with it in real time (inference), and what kind of data is required (features and labels). When answer choices include these terms, select the one that matches the correct stage of the ML process.

Section 2.4: Responsible AI fundamentals: fairness, reliability, privacy, transparency

AI-900 includes responsible AI because organizations must manage risk, compliance, and trust. The exam expects conceptual understanding of key principles and how they apply to workplace scenarios.

  • Fairness: similar individuals should receive similar outcomes; avoid unjust bias across groups. Exam scenarios often involve hiring, lending, insurance, or customer prioritization—areas where biased data can create discriminatory outcomes.
  • Reliability & safety: systems should perform consistently under expected conditions and fail safely. Think of edge cases: poor lighting for vision systems, slang/typos for NLP, or data drift over time.
  • Privacy & security: protect personal data and sensitive content. Consider data minimization, access control, encryption, and whether prompts or documents contain confidential information.
  • Transparency: stakeholders should understand when AI is used and what it is intended to do; provide explanations appropriate to the audience. For some decisions, users must know why a recommendation was made.

Exam Tip: When a question mentions protected characteristics (age, gender, ethnicity) or high-impact decisions, fairness is usually central. When it mentions “audit,” “explain,” or “regulators,” transparency is the likely theme. When it mentions “PII,” “HIPAA,” “GDPR,” or “customer data,” privacy is the theme.

Common trap: treating responsible AI as only a policy statement. The exam often wants practical mitigations: diverse training data, monitoring model performance, human review for sensitive decisions, and clear user disclosures. Another trap is assuming “more data” always helps; collecting unnecessary personal data can increase privacy risk.

Generative AI adds extra responsible AI concerns (hallucinations, unsafe content, data leakage), but the principles above still apply. On AI-900, you mainly need to connect scenario risk to the correct principle and identify high-level controls like content filtering and human oversight.

Section 2.5: Azure AI ecosystem overview: when to use services vs custom ML

AI-900 does not require hands-on building, but it does test whether you understand the difference between using prebuilt AI services and building custom machine learning solutions. The key decision is: do you need a general capability that Microsoft already provides, or do you need a model tuned to your specific data and labels?

Use Azure AI services (prebuilt) when the task is common and broadly applicable—OCR, image tagging, speech-to-text, translation, key phrase extraction, sentiment analysis, or content moderation. These services are optimized, scalable, and reduce the need for ML expertise and training pipelines.

Use custom ML (e.g., Azure Machine Learning) when your organization needs to predict something unique (churn risk for your specific customer base), when accuracy depends on your proprietary data, or when you must control features, training, evaluation, and deployment. Custom ML also makes sense when you need specialized labels or domain-specific outcomes.

Exam Tip: If the scenario mentions “we have historical data with outcomes” and the outcome is specific to the business, that’s a strong signal for custom ML. If it mentions “extract text from receipts,” “detect faces,” or “translate,” that’s typically a prebuilt service scenario.

Common trap: assuming generative AI replaces all NLP. Many language tasks are still best served by classic NLP (entity extraction, language detection) because they are cheaper, more deterministic, and easier to evaluate. Another trap is thinking you must always train a model—prebuilt services can be the correct answer when speed-to-value and standard tasks are emphasized.

What the exam tests here is your ability to select the right approach, not to memorize product names. Focus on the decision criteria: data availability (labels), specificity of the task, required customization, and risk/controls (especially for generative AI use).

Section 2.6: Practice questions: scenario mapping and terminology checks

This section prepares you for the exam’s most common item style: short business scenarios with multiple plausible AI options. Your advantage comes from using a consistent mapping method rather than “gut feel.”

Step 1: Identify the input data type. Tables of numbers and attributes suggest ML. Images/video suggest computer vision. Large volumes of text suggest NLP. Requests to draft, rewrite, or brainstorm content suggest generative AI.

Step 2: Identify the output type using the workload archetypes: number (prediction), category (classification), presence/location (detection), shorter text (summarization). This often eliminates half the choices immediately.

Step 3: Check for training vs inference clues. If the scenario says “build a model using past outcomes,” it’s training. If it says “use the model to score new applications,” it’s inference. If answer options misuse these terms, that’s a classic AI-900 distractor.

Step 4: Apply a responsible AI lens. If the scenario is high impact or involves personal data, consider fairness, privacy, reliability, and transparency. The exam may include a “best next step” choice like adding human review, monitoring performance, or explaining outcomes.

Exam Tip: When two answers both seem feasible, choose the one that most directly matches the scenario’s required output and least assumes unnecessary complexity. AI-900 typically rewards “fit-for-purpose” solutions (prebuilt service for common tasks; custom ML for proprietary predictions).

Common traps to avoid: confusing summarization with data extraction; confusing detection with classification; selecting generative AI when the task is simple sentiment or key phrase extraction; and ignoring that responsible AI may be the primary objective of the question even when AI technology is mentioned.

Chapter milestones
  • Recognize AI workloads and match them to real business scenarios
  • Differentiate ML, computer vision, NLP, and generative AI at a high level
  • Apply responsible AI concepts to common workplace use cases
  • Practice: AI workload identification (exam-style set)
Chapter quiz

1. A retail company wants to predict which customers are most likely to stop buying in the next 30 days so the sales team can target retention offers. Which AI workload is this scenario describing?

Show answer
Correct answer: Machine learning (prediction/classification)
Predicting future churn is a machine learning workload because it uses historical data to produce a prediction score or class (likely to churn vs not). Computer vision is used for analyzing images/video, which is not part of the scenario. NLP focuses on understanding or processing text (e.g., sentiment, extraction, summarization), which is not the primary requirement here.

2. A manufacturing plant wants to automatically detect scratches and dents in product photos taken on an assembly line. Which AI workload should you use?

Show answer
Correct answer: Computer vision (image detection/classification)
Detecting defects in photos is a computer vision workload because it involves analyzing images to find and classify visual issues. Generative AI is for creating new content (text/images), not inspecting existing images. Time-series forecasting is a machine learning pattern for predicting values over time (e.g., demand), which does not match the image-based inspection requirement.

3. A support manager wants to automatically summarize long customer support tickets into a few bullet points for faster triage. Which type of AI capability best fits this request?

Show answer
Correct answer: Natural language processing (summarization)
Summarizing tickets is an NLP task because it involves understanding and condensing text. OCR (computer vision) is used to extract text from images/scanned documents; the scenario already has ticket text. Clustering groups similar items without predefined labels, which may help categorize tickets but does not directly produce a summary.

4. A marketing team wants an AI tool that can draft multiple variations of product descriptions in a specific tone based on a short prompt. Which AI workload is being requested?

Show answer
Correct answer: Generative AI (text generation)
Drafting new product description variations from a prompt is generative AI because it creates original text content. Entity recognition is NLP for extracting items like names, dates, or organizations from existing text, not generating new copy. Image tagging is computer vision for labeling images and is unrelated to drafting text.

5. A bank uses an AI model to help decide whether loan applications should be approved or denied. During review, the team discovers the model rejects a higher percentage of applicants from a particular demographic group. Which responsible AI principle is most directly impacted?

Show answer
Correct answer: Fairness
A systematic difference in outcomes for a demographic group is primarily a fairness concern (bias and equitable treatment). Reliability and safety focuses on consistent performance and avoiding harmful failures (e.g., unstable predictions), not demographic disparity. Privacy and security relates to protecting sensitive data and preventing data leakage, which is not what the scenario describes.

Chapter 3: Fundamental Principles of Machine Learning on Azure (Domain)

This chapter maps to the AI-900 domain that asks you to explain core machine learning (ML) principles and how Azure supports them. The exam is not testing coding. It is testing whether you can interpret a scenario, identify the ML workload (regression, classification, clustering), understand the ML lifecycle (training/evaluation/deployment), and recognize responsible AI considerations. If you can reliably answer “What is the label?”, “What is the prediction type?”, and “How do we evaluate success?”, you can eliminate most wrong options quickly.

On AI-900, you’ll also see Azure Machine Learning (Azure ML) vocabulary at a high level: workspace, compute, datasets, pipelines, and AutoML. The most common trap is confusing where something happens (training vs inference) or choosing an evaluation metric that doesn’t match the business goal (for example, optimizing accuracy when false negatives are the real risk). Another trap: mixing up validation and test sets, or assuming unsupervised learning uses labels.

Use this chapter to build a practical mental checklist: (1) define the problem type; (2) confirm data/labels; (3) select training and evaluation approach; (4) understand deployment/inference; (5) apply responsible ML guardrails. Those steps align with how exam questions are written: short scenarios with multiple plausible answers, where one fits the lifecycle and objective best.

Practice note for Explain supervised, unsupervised, and reinforcement learning at exam depth: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Understand training, validation, testing, and evaluation metrics: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Choose the right ML approach for a scenario (regression vs classification vs clustering): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Practice: ML fundamentals on Azure (exam-style set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.


Sections in this chapter
Section 3.1: ML lifecycle: data collection, preparation, training, deployment

AI-900 expects you to understand the end-to-end ML lifecycle, not just “train a model.” A typical lifecycle is: collect data, prepare data, train, evaluate, deploy, and monitor. In exam scenarios, read for clues about where the team is in this lifecycle—because many answer options describe the right action, but at the wrong stage.

Data collection focuses on gathering representative data that reflects real-world conditions. If the scenario mentions “new region,” “new customer group,” or “changing trends,” think about whether the collected data still represents the problem. Data preparation includes cleaning missing values, removing duplicates, feature engineering (creating useful input columns), and splitting data into training/validation/test sets.

Training is when the algorithm learns patterns from the training set. Deployment is when you publish the trained model so it can produce predictions (inference) on new data—often as a web service endpoint. The exam frequently tests the distinction between training and inference: training uses historical labeled data; inference uses new unlabeled inputs to produce outputs.

Exam Tip: If the question mentions “endpoint,” “real-time predictions,” or “integrate into an app,” you are in deployment/inference territory, not training. If it mentions “tune hyperparameters,” “select features,” or “improve performance,” you are in training/validation territory.

  • Training set: used to fit the model.
  • Validation set: used to tune and compare models during development.
  • Test set: used at the end for an unbiased final estimate.

Common trap: calling the validation set the “test set.” On the exam, the test set is reserved for final evaluation, not iterative tuning.
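A minimal sketch of the three-way split (pure Python, with illustrative 70/15/15 fractions; real projects would shuffle first and typically use a library):

```python
def three_way_split(rows, train_frac=0.70, val_frac=0.15):
    # Assumes rows are already shuffled; the fractions are illustrative.
    n_train = int(len(rows) * train_frac)
    n_val = int(len(rows) * val_frac)
    train = rows[:n_train]                       # fit the model
    validation = rows[n_train:n_train + n_val]   # tune and compare candidates
    test = rows[n_train + n_val:]                # final, untouched evaluation
    return train, validation, test

train, validation, test = three_way_split(list(range(100)))
print(len(train), len(validation), len(test))  # 70 15 15
```

The comment on each slice mirrors the rule above: iterate against the validation slice, and touch the test slice only once at the end.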

Section 3.2: Learning types and common algorithms (conceptual, not coding)

The exam wants you to distinguish supervised, unsupervised, and reinforcement learning at a conceptual level and match them to scenarios. The quickest way is to ask: “Do we have labels?” If yes, it’s usually supervised learning. If no and we’re discovering structure, it’s unsupervised. If an agent learns by trial-and-error rewards, it’s reinforcement learning.

Supervised learning uses labeled data (inputs with known outputs). Two key supervised problem types appear constantly on AI-900:

  • Regression: predicts a numeric value (house price, energy usage, time to deliver).
  • Classification: predicts a category/label (fraud vs not fraud, churn vs not churn, disease present vs absent). It can be binary or multi-class.

Unsupervised learning uses unlabeled data. The most tested concept is clustering, which groups similar items (customer segmentation, grouping documents by topic, anomaly detection as “far from clusters”). If the scenario says “we don’t know the groups in advance” or “segment customers,” clustering is usually the best fit.
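To make “no labels” concrete, here is one assignment-and-update step of a toy 2-means clustering over monthly spend values. Nothing tells the code which customers are “low” or “high” spenders; the groups emerge from distances alone (the starting centers 0 and 50 are arbitrary picks for the sketch):

```python
def two_means_step(values, center_a, center_b):
    # One k-means iteration for k=2: assign each value to the nearer center,
    # then recompute each center as its cluster's mean.
    # Assumes neither cluster ends up empty.
    cluster_a = [v for v in values if abs(v - center_a) <= abs(v - center_b)]
    cluster_b = [v for v in values if abs(v - center_a) > abs(v - center_b)]
    return cluster_a, cluster_b, sum(cluster_a) / len(cluster_a), sum(cluster_b) / len(cluster_b)

spend = [10, 12, 11, 95, 100, 98]  # unlabeled monthly spend per customer
low, high, _, _ = two_means_step(spend, center_a=0, center_b=50)
print(low, high)  # [10, 12, 11] [95, 100, 98]
```

Real clustering repeats this step until the centers stop moving, but even a single pass shows how segments appear without any label column.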

Reinforcement learning is less common on AI-900, but you should recognize the pattern: an agent takes actions, receives rewards/penalties, and learns a policy (robot navigation, game playing, dynamic pricing with feedback). Reinforcement learning is not simply “retraining with new data”; it’s learning through interaction.

Exam Tip: The word “predict” alone doesn’t guarantee supervised learning. If the scenario is about “discovering segments” or “finding patterns” without labeled outcomes, the correct choice is often unsupervised clustering—even though the business may call it “predictive insights.”

Section 3.3: Model evaluation basics: accuracy, precision/recall, overfitting

Evaluation questions on AI-900 focus on choosing and interpreting basic metrics and recognizing overfitting. Start by identifying what “good” means for the business: is it worse to miss a positive case (false negative) or to raise false alarms (false positive)? Then select metrics accordingly.

Accuracy is the proportion of correct predictions overall. It can be misleading on imbalanced datasets (for example, only 1% fraud). A model that always predicts “not fraud” can be 99% accurate but useless.

Precision answers: “When the model predicts positive, how often is it correct?” It matters when false positives are costly (flagging legitimate transactions, sending unnecessary alerts). Recall answers: “Of all true positives, how many did we find?” It matters when missing positives is costly (failing to detect fraud, missing a medical condition).
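A worked example with invented counts shows why these distinctions matter. Suppose 1,000 transactions contain 10 real frauds, and a model catches 6 of them while raising 4 false alarms:

```python
def metrics(tp, fp, fn, tn):
    # Accuracy, precision, and recall from confusion-matrix counts.
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0  # predicted positives that were right
    recall = tp / (tp + fn) if tp + fn else 0.0     # actual positives that were found
    return accuracy, precision, recall

# Invented counts: 1,000 transactions, 10 actual frauds.
print(metrics(tp=6, fp=4, fn=4, tn=986))   # (0.992, 0.6, 0.6)

# A useless model that never flags fraud is nearly as "accurate":
print(metrics(tp=0, fp=0, fn=10, tn=990))  # (0.99, 0.0, 0.0)
```

Accuracy barely moves between the two models, while recall collapses from 0.6 to 0.0, which is exactly the imbalanced-data trap described above.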

Overfitting is when a model performs very well on training data but poorly on new data. Typical causes include overly complex models, leakage (using features that won’t exist at inference), or insufficient data. Underfitting is the opposite: the model is too simple to capture patterns, leading to poor performance on both training and test data.

Exam Tip: If you see “high training accuracy, low test accuracy,” think overfitting. If both are low, think underfitting or poor features/data quality. If performance drops after deployment, think data drift and monitoring (covered in responsible ML).

Also know the role of validation vs test: you tune using validation metrics (for example, compare models), and you report final performance on the test set. A common exam trap is selecting “test set” as the place to iterate; that introduces bias.

Section 3.4: Azure ML concepts: workspace, compute, pipelines, AutoML (high level)

AI-900 includes high-level Azure Machine Learning concepts to ensure you can describe how ML work is organized in Azure. You are not expected to configure them in detail, but you should know what each component is for and how they relate to the lifecycle.

An Azure ML workspace is the top-level container that organizes assets such as datasets, models, experiments, and endpoints. If a question asks where models and runs are tracked and managed, the workspace is usually the answer.

Compute refers to the resources used for training or inference (for example, compute instances for development, compute clusters for scalable training). If the scenario needs “scale out training jobs” or “run experiments in parallel,” compute clusters fit. Deployment can also use compute (managed endpoints), but exam wording often distinguishes “training compute” from “serving predictions.”

Pipelines represent repeatable workflows (data preparation → training → evaluation → deployment). On the exam, pipelines are associated with automation, repeatability, and operationalizing ML processes.

AutoML automates model selection and hyperparameter tuning for a given task (classification/regression/time series forecasting). It’s most appropriate when you want a strong baseline quickly, you have labeled data, and you want Azure to try multiple algorithms/parameter combinations.

Exam Tip: If the question says “no code” or “quickly compare algorithms,” AutoML is a strong candidate. If it says “repeatable, scheduled training and deployment,” pipelines are the better match.

Section 3.5: Responsible ML on Azure: data quality, bias, and monitoring basics

Responsible AI is explicitly tested on AI-900, and in ML questions it often appears as “what should you do to reduce risk?” The exam expects baseline understanding: improve data quality, reduce bias, explain outcomes when appropriate, and monitor models after deployment.

Data quality issues (missing values, noisy labels, outdated data) directly impact model reliability. In scenarios where model performance is inconsistent, the best next step may be to review label accuracy and data representativeness rather than changing algorithms.

Bias occurs when a model systematically disadvantages a group. Bias can come from imbalanced or unrepresentative training data, historical inequities, or proxy features (variables that indirectly encode sensitive attributes). The exam often tests recognition: if the scenario mentions different error rates across groups, fairness and bias mitigation steps are relevant (collect better data, evaluate fairness metrics, adjust features, or use fairness tools).

Monitoring is essential because real-world data changes (data drift) and model performance can degrade over time. Post-deployment monitoring looks for changes in input data distribution, prediction distribution, and outcome-based performance (when labels become available later).

Exam Tip: If the scenario says “the model worked during testing but is worse in production,” don’t jump straight to “retrain with more data” as the only fix. The more complete responsible ML answer includes monitoring for drift, validating data pipelines, and then retraining as needed.

At AI-900 depth, you should also connect responsible ML to human oversight: high-impact decisions (finance, healthcare, hiring) often require interpretability and review processes, not just high accuracy.

Section 3.6: Practice questions: select the ML approach and interpret outcomes

This section prepares you for exam-style prompts without listing full quiz items. AI-900 questions typically present a short business scenario and ask you to choose the ML approach (regression/classification/clustering) or interpret a training outcome. Use a consistent decision flow to avoid common traps.

Step 1: Identify the target output. If the output is a number (cost, time, temperature), choose regression. If it’s a category (approve/deny, yes/no, type A/B/C), choose classification. If there is no target output and you are grouping, choose clustering.

Step 2: Check for labels. If the dataset includes a known “answer column” (like “Churned: Yes/No”), it’s supervised. If the scenario says “we don’t know categories yet,” it’s likely unsupervised.

Step 3: Match the metric to the risk. If false negatives are dangerous (missing fraud, missing a defect), prioritize recall. If false positives are costly (too many customers incorrectly flagged), prioritize precision. If classes are balanced and costs are similar, accuracy may be acceptable.

Step 4: Interpret train vs test results. Big gap (train good, test poor) implies overfitting; similar poor results imply underfitting or weak features/data. If performance drops after deployment, suspect drift and the need for monitoring/retraining.
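Steps 1 and 2 condense into a small triage helper you can use as a flash-card check (the return strings are informal study phrasing, not exam answer text):

```python
def pick_approach(output_type: str, has_labels: bool) -> str:
    # Step 1: identify the output type; Step 2: check for labels.
    if output_type == "number":
        return "regression (supervised)" if has_labels else "collect labeled outcomes first"
    if output_type == "category":
        return "classification (supervised)" if has_labels else "consider clustering or label the data"
    if output_type == "groups":
        return "clustering (unsupervised)"
    return "re-read the scenario"

print(pick_approach("number", has_labels=True))    # regression (supervised)
print(pick_approach("groups", has_labels=False))   # clustering (unsupervised)
```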

Exam Tip: Many wrong answers are “technically related” but not the best fit. For example, choosing clustering when you actually have labeled outcomes, or choosing accuracy when the scenario emphasizes rare events. Anchor on the output type and business risk; then select the most aligned approach and metric.

Chapter milestones
  • Explain supervised, unsupervised, and reinforcement learning at exam depth
  • Understand training, validation, testing, and evaluation metrics
  • Choose the right ML approach for a scenario (regression vs classification vs clustering)
  • Practice: ML fundamentals on Azure (exam-style set)
Chapter quiz

1. A retail company wants to predict the total sales amount for each store next month based on historical sales, promotions, and local events. Which machine learning approach best fits this requirement?

Show answer
Correct answer: Regression (supervised learning)
Predicting a numeric value (sales amount) is a regression problem and typically uses supervised learning with labeled historical outcomes. Classification is for predicting categories (for example, high/medium/low), not a continuous amount. Clustering groups similar items without labels and does not directly predict a future numeric target.

2. A healthcare organization builds a model to flag patients who are likely to miss an appointment. Missing a high-risk patient (false negative) is much more costly than incorrectly flagging a patient (false positive). Which evaluation metric is most appropriate to prioritize during model selection?

Show answer
Correct answer: Recall
When false negatives are the biggest risk, you prioritize recall (the true positive rate) to catch as many actual high-risk cases as possible. Accuracy can look good even if the model misses many high-risk patients, especially with imbalanced data. MAE (mean absolute error) is a regression metric and is not appropriate for a yes/no (classification) outcome.

3. You create a model and split your labeled dataset into training, validation, and test sets. Which statement best describes the purpose of the test set?

Show answer
Correct answer: To provide an unbiased final evaluation of the selected model on unseen data
The test set is held back to provide an unbiased estimate of how well the finalized model generalizes to new data. Training data is used to fit model parameters, and validation data is used for model selection and hyperparameter tuning; neither can give an unbiased final evaluation.

4. A telecommunications provider wants to segment customers into groups with similar usage patterns to create targeted plans. They do not have predefined group labels. Which approach should you use?

Show answer
Correct answer: Clustering (unsupervised learning)
Creating groups from unlabeled data is a clustering task, which is an unsupervised learning approach. Classification requires labeled categories (for example, churn vs not churn), which are not provided. Reinforcement learning is used for learning actions via rewards over time (for example, optimizing decisions), not for static customer segmentation.

5. You deploy a trained model to a REST endpoint in Azure Machine Learning. New data is sent to the endpoint to generate predictions in real time. Which stage of the ML lifecycle is occurring when the endpoint returns a prediction?

Show answer
Correct answer: Inference
Generating predictions from a deployed model on new data is inference (scoring). Training is the earlier stage, when the algorithm learns patterns from labeled historical data to produce the model. Validation is used during development to compare models or tune hyperparameters; it is not what happens when a production endpoint is called.

Chapter 4: Computer Vision Workloads on Azure (Domain)

In AI-900, “computer vision” means extracting meaning from images and video: reading text, detecting objects, describing scenes, and (conceptually) dealing with people-related imagery. The exam does not expect you to build neural networks from scratch; it expects you to recognize common vision tasks, map them to Azure capabilities, and describe what outputs you get (labels, bounding boxes, extracted text, confidence scores, etc.).

This chapter organizes the vision domain the way AI-900 questions are typically written: a short scenario (e.g., “scan receipts” or “detect safety gear”), followed by multiple services that sound similar. Your job is to spot the workload type (classification vs detection vs OCR), then pick the Azure capability that matches the required output and constraints (speed, customization, document complexity, responsible AI).

Exam Tip: When a scenario says “where is the item in the image?” you need object detection (bounding boxes). When it says “what is in the image?” you can often use image analysis (tags/captions). When it says “read the text,” think OCR. That single keyword-to-output mapping eliminates many distractor answers.

Finally, AI-900 also tests responsible AI basics. In vision, this often appears as privacy/consent, demographic performance differences, and the risk of using face-related features incorrectly. Expect conceptual questions about safe/appropriate use rather than deep implementation details.

Practice note for Identify key computer vision tasks and how they appear on the exam: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Select Azure vision capabilities for OCR, detection, and image understanding scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Understand responsible vision considerations and limitations: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Practice: computer vision scenario questions (exam-style set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 4.1: Vision workload types: image classification, object detection, OCR

AI-900 vision questions frequently start by testing whether you can identify the task type. The three most common tasks are image classification, object detection, and OCR (optical character recognition). Although they can be combined in real solutions, the exam often forces you to choose the primary workload based on the desired output.

Image classification answers: “Which category best describes this image?” The output is typically a label (or multiple labels) with confidence scores. This is used for scenarios like classifying product photos (shoe vs shirt), or sorting images into known categories. A common trap is to confuse classification with tagging; classification implies you have defined classes you care about.

Object detection answers: “Which objects are present and where are they?” The key output is a set of bounding boxes (coordinates) plus labels and confidence. If the scenario mentions counting items, locating them, drawing rectangles, or triggering an alert when an object appears in a specific region, it’s detection. Exam Tip: If the question mentions “identify a logo in an image” or “detect damaged parts,” you’re still in detection territory because location matters, even if the object is small.

OCR answers: “What text appears in the image?” The output is extracted text plus positional information (lines/words) and confidence. OCR is used for reading signs, labels, receipts, and scanned documents. A frequent exam distractor is proposing “NLP” for text extraction; remember OCR is still a vision task because the input is an image. NLP comes after OCR if you then need to interpret meaning.

  • Classification: label(s) for the whole image (no boxes).
  • Detection: label(s) + bounding boxes (where).
  • OCR: text extraction + layout/position (read).

When you see multiple-choice answers that include “Custom Vision,” “Vision (image analysis),” “Read/OCR,” or “Document Intelligence,” pause and ask: is the scenario about category, location, or text? That one decision usually determines the correct path.
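That decision can be captured as a simple lookup — a study aid, not an Azure API — mirroring the category/location/text question:

```python
# Study aid: map the scenario's requested output to the vision workload.
def vision_workload(required_output: str) -> str:
    cues = {
        "category": "image classification",
        "location": "object detection (bounding boxes)",
        "text": "OCR (Read)",
        "fields": "document intelligence (key-value extraction)",
        "description": "image analysis (tags/captions)",
    }
    return cues[required_output]

print(vision_workload("location"))  # object detection (bounding boxes)
print(vision_workload("fields"))    # document intelligence (key-value extraction)
```

If an answer option's natural output doesn't match the scenario's required output, eliminate it.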

Section 4.2: Image analysis concepts: tags, captions, and content understanding

Many AI-900 questions use “image understanding” language such as “describe the photo” or “identify what’s happening.” In Azure, this maps to prebuilt image analysis capabilities that return structured descriptions without you training a custom model. The exam expects you to recognize three common outputs: tags, captions, and dense descriptions/content metadata.

Tags are keywords (often multiple) that summarize what the service sees: “person,” “outdoor,” “car,” “sky.” Tags are useful for search, indexing, and basic filtering. They are not “classes you trained,” so a trap is choosing “custom classification” when the scenario simply wants auto-generated keywords for many unknown images.

Captions are short natural-language sentences describing the image (for example, “a person riding a bicycle on a street”). Captions are commonly used for accessibility (alt text) and quick summaries. If a question mentions “generate a description for screen readers,” captions are a strong clue.

Content understanding can include recognizing common objects and scene context beyond a single label. Exam questions sometimes phrase this as “extract insights from an image” or “understand the scene.” The key is that you are not asked to locate every object with bounding boxes; you are asked to derive a meaningful summary. Exam Tip: If the requirement is “identify every instance and location of a specific object,” you’ve crossed into detection; if it’s “describe or tag,” image analysis is the better match.

  • Choose image analysis when you need general-purpose tags/captions without training.
  • Expect outputs like labels/tags, confidence scores, and natural-language descriptions.
  • Don’t over-select custom models for simple “describe and tag” tasks.

A common exam trap is mixing up “tags” with “classification.” Classification implies a constrained set of categories you define (often for a business process). Tags are broad and descriptive, driven by the prebuilt model’s vocabulary. Read the scenario carefully: if the company wants to standardize into their categories (e.g., “acceptable” vs “defective”), it leans custom; if they want generic metadata, it leans prebuilt image analysis.
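A hypothetical response shape (illustrative only — field names are assumptions, not a documented API contract) shows the outputs the exam expects from prebuilt image analysis:

```python
# Hypothetical image-analysis result: a caption, tags, and confidence scores.
result = {
    "caption": {"text": "a person riding a bicycle on a street",
                "confidence": 0.83},
    "tags": [{"name": "person", "confidence": 0.99},
             {"name": "bicycle", "confidence": 0.95},
             {"name": "outdoor", "confidence": 0.91}],
}

alt_text = result["caption"]["text"]                      # accessibility use
keywords = [t["name"] for t in result["tags"] if t["confidence"] > 0.9]
print(alt_text)
print(keywords)  # ['person', 'bicycle', 'outdoor']
```

Note what is absent: no bounding boxes and no user-defined classes — the two signals that would instead point to detection or custom classification.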

Section 4.3: Document and text extraction concepts: OCR and form-like scenarios

Text in images appears on the AI-900 exam in two main flavors: simple OCR (read text) and document/form-like extraction (understand structured fields). Both start with vision, but the best Azure choice depends on whether the document has a predictable layout and whether you need key-value pairs.

OCR (Read) is ideal when the question says “extract the text” from photos or scans: street signs, product labels, screenshots, or basic scanned pages. Outputs commonly include lines/words, their coordinates, and confidence. In exam scenarios, OCR is often sufficient when there’s no mention of fields like “total,” “invoice number,” or “date.”

When the prompt implies forms or semi-structured documents—for example invoices, receipts, IDs, or purchase orders—the exam expects you to think beyond raw text and toward field extraction. In Azure, that is typically described as document analysis/document intelligence capabilities that can return structured results (e.g., vendor name, total amount) rather than just a text blob.

Exam Tip: Spot the word “extract fields” or “key-value pairs.” If the scenario needs “Total,” “Tax,” “Customer name,” or “Line items,” choose a document/form capability rather than plain OCR. OCR alone would force the developer to parse the text manually, which the exam will treat as the less appropriate choice.

  • Use OCR for general text extraction from images where layout isn’t the focus.
  • Use form/document extraction for invoices/receipts where you need named fields and structure.
  • Common trap: selecting NLP because the output is text. Remember: the step of turning pixels into characters is vision/OCR.

Also watch for constraints: mobile camera photos can have skew, glare, and low resolution. The exam may hint that you need a service designed for real-world document capture. Your job is not to design preprocessing pipelines, but to pick the capability that best matches “document + fields” vs “image + text.”
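The difference between the two result shapes (both hypothetical, for illustration) makes the exam's preference concrete — OCR hands you text to parse, while document extraction hands you named fields:

```python
# Hypothetical result shapes: plain OCR vs. a prebuilt receipt-style model.
ocr_result = {"lines": ["CONTOSO MARKET", "2024-05-01", "TOTAL  $23.40"]}

receipt_result = {                    # shape a document model might return
    "merchant_name": "CONTOSO MARKET",
    "transaction_date": "2024-05-01",
    "total": 23.40,
}

# With OCR alone, extracting the total means brittle manual parsing:
total_from_ocr = float(ocr_result["lines"][-1].split("$")[-1])
# With document extraction, the field is already structured:
total_from_fields = receipt_result["total"]
print(total_from_ocr, total_from_fields)  # 23.4 either way
```

Both paths reach the number, but only one survives a receipt whose "TOTAL" line moves or changes format — which is why "needs named fields" points to document extraction.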

Section 4.4: Face-related capabilities and consent/privacy considerations (conceptual)

AI-900 treats face-related topics primarily as a responsible AI and appropriate-use concept area. You may see scenarios about detecting whether a face is present, or comparing a face to a known user for access. The exam is less about implementation details and more about what is appropriate, what requires consent, and what limitations to acknowledge.

Conceptually, face-related capabilities can include detecting faces in an image and extracting face-region attributes (such as position). Exam items may also describe “identify who the person is,” “verify the same person,” or “group photos of the same person.” Treat these as higher-risk use cases that require careful governance, transparency, and compliance.

Privacy/consent is a recurring theme. If a scenario involves customers, employees, or public footage, assume you must consider notice and consent, data minimization (store only what you need), and secure handling. The exam often rewards answers that include explicit consent, clear purpose limitation, and an option for users to opt out where appropriate.

Exam Tip: If two answers both “work,” the exam often prefers the one that demonstrates responsible AI: consent, least-privilege access, auditing, and a human review step for consequential decisions. This is especially true for face-related scenarios.

  • Common trap: assuming “because it’s technically possible, it’s automatically appropriate.”
  • Common trap: ignoring fairness/performance differences across demographics and lighting conditions.
  • Good signals: anonymization, limiting retention, and documenting intended use.

Also remember limitations: image quality, occlusion (masks, glasses), and poor lighting can reduce accuracy. The exam may ask what to communicate to stakeholders; the safe, correct approach is to describe confidence thresholds, manual review for borderline cases, and continuous monitoring for drift or performance gaps.

Section 4.5: How to choose a vision approach: prebuilt service vs custom model

A high-value AI-900 skill is choosing between a prebuilt vision capability and a custom model. Prebuilt services are designed for common tasks (tags, captions, OCR, generic object recognition) and are best when requirements are general and time-to-value matters. Custom models are best when the business has domain-specific categories or objects that prebuilt models won’t reliably recognize (specific defects, proprietary parts, brand-specific labeling rules).

Prebuilt approach: choose this when you can describe the requirement in everyday terms (“read text,” “describe the image,” “detect common objects”) and you don’t have labeled training data. The exam expects you to recognize that prebuilt services reduce effort and are typically the default recommendation for common workloads.

Custom model approach: choose this when the scenario says “train,” “labeled images,” “our products,” “our defect types,” or when accuracy must be optimized for a narrow domain. This often maps to custom vision-style training for classification or detection. Exam Tip: The keyword “custom” in the scenario is not enough; look for the real driver: a unique label set or objects not covered by general models.

  • Pick prebuilt for general tags/captions/OCR and quick deployment.
  • Pick custom for specialized categories, bespoke objects, or controlled business labels.
  • Common trap: choosing custom when you have no data, no labeling budget, and only generic needs.

Decision cues the exam loves: (1) Do you need bounding boxes? That pushes you toward detection (custom or prebuilt depending on specificity). (2) Do you need fields from receipts/invoices? That pushes you toward document analysis rather than generic OCR. (3) Do you need your own categories? That pushes you toward custom classification. Write these cues on your mental checklist and use them to eliminate distractors quickly.
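Those cues can be encoded as a small decision helper (a study aid only, not an Azure API):

```python
# Study aid: the exam's prebuilt-vs-custom decision in code form.
def choose_vision_approach(needs_own_categories: bool,
                           has_labeled_data: bool) -> str:
    if needs_own_categories and has_labeled_data:
        return "custom model (train on your labeled images)"
    if needs_own_categories:
        return "collect and label data first, or reconsider prebuilt"
    return "prebuilt service (tags, captions, OCR, common objects)"

print(choose_vision_approach(False, False))  # prebuilt service (...)
print(choose_vision_approach(True, True))    # custom model (...)
```

The middle branch captures the common trap in the bullet list above: "custom" without labeled data is not a workable answer.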

Section 4.6: Practice questions: match scenario to vision capability and outputs

This lesson is about building the “scenario-to-service” reflex the AI-900 exam demands. You are not being graded on memorizing product names alone; you’re being graded on mapping requirements to the correct capability and knowing the typical outputs so you can validate your choice.

When you read a scenario, underline the requested output: (a) category label, (b) object location, (c) extracted text, (d) structured fields, (e) descriptive summary. Then choose the capability that naturally produces that output. If the answer option provides an output that doesn’t match (for example, tags when the scenario needs bounding boxes), eliminate it.

Exam Tip: Force yourself to say out loud (mentally): “The input is an image; the required output is X; therefore I need Y.” This prevents a common trap where test-takers pick a familiar-sounding service name without checking outputs.

  • If the scenario needs alt text or “describe the scene,” expect captions as output.
  • If it needs inventory counting or “highlight items,” expect bounding boxes (detection).
  • If it needs to read a label or “capture a serial number,” expect OCR text + coordinates.
  • If it needs invoice totals and “vendor name,” expect structured fields from document analysis.

Also practice identifying “red herring” requirements. For example, a prompt might mention “store the extracted text for analytics” (tempting you toward NLP), but the core ask is still OCR. Or it may mention “detect whether a person is present” (image analysis may work) but then add “and draw a box around each person” (now it’s detection). AI-900 questions often hinge on that one extra phrase that changes the workload type.

Finally, keep responsible AI in view even in technical matching scenarios. If a scenario involves people, identity, or surveillance-like contexts, the best answer may include consent, data minimization, and human oversight—even if the technical capability is correct. The exam rewards solutions that are both functional and responsible.

Chapter milestones
  • Identify key computer vision tasks and how they appear on the exam
  • Select Azure vision capabilities for OCR, detection, and image understanding scenarios
  • Understand responsible vision considerations and limitations
  • Practice: computer vision scenario questions (exam-style set)
Chapter quiz

1. A retail company wants to automatically extract the merchant name, date, and total amount from photos of customer receipts taken on mobile devices. Which Azure capability should they use?

Show answer
Correct answer: Document/form extraction (Azure AI Document Intelligence, prebuilt receipt model)
The scenario asks for named fields (merchant name, date, total), not just raw text, so a document/form capability that returns key-value pairs is the best fit. Plain OCR would return unstructured text that a developer must parse manually, which the exam treats as the less appropriate choice. Object detection locates items with bounding boxes (where something is), not text fields, and image captioning produces a natural-language description/tags (what is in the image) rather than structured values like totals and dates.

2. A manufacturing plant needs to verify whether workers are wearing hard hats in live camera images. The system must identify where the hard hat appears in the image to support auditing. Which computer vision task is required?

Show answer
Correct answer: Object detection
The requirement 'identify where the hard hat appears' indicates bounding boxes, which is object detection. Image classification/tagging can tell you what is present but not the location of the item. OCR is used for extracting text from images and is unrelated to locating safety gear.

3. A real estate company wants to automatically generate a short description for each property photo, such as "a living room with a sofa and large window." Which Azure vision capability best fits this requirement?

Show answer
Correct answer: Image analysis to generate captions/tags
Generating a sentence describing the scene aligns with image analysis that returns captions/tags (what is in the image). Object detection would provide locations of specific items (where items are) but would not generate a coherent scene description by itself. OCR is for reading text present in the image, which is not the goal.

4. A company plans to analyze images of customers in a store to infer emotions and target ads in real time. Which responsible AI concern is most relevant to raise for this scenario in the context of AI-900 computer vision?

Show answer
Correct answer: Potential privacy/consent issues and inappropriate use of people-related analysis
AI-900 emphasizes responsible vision considerations for people-related imagery: privacy, consent, and the risk of harmful or inappropriate inferences about individuals, along with potential demographic performance differences. Object detection vs OCR are technical capability choices and do not address the core responsible AI concern in this scenario.

5. You are reviewing requirements for an Azure vision solution. The solution must return (1) the text found in an image and (2) the confidence score for the extracted text. Which workload type should you select?

Show answer
Correct answer: OCR
Extracting text from an image with confidence scores is OCR. Image classification returns labels/tags for the overall image (what is in the image) but not extracted text. Object detection returns bounding boxes and labels (where items are) and is not designed to read printed or handwritten text.

Chapter 5: NLP and Generative AI Workloads on Azure (Domains)

This chapter maps directly to the AI-900 exam domain that asks you to identify natural language processing (NLP) workloads and describe generative AI workloads on Azure. You are not expected to code, but you are expected to recognize common language scenarios, choose an appropriate Azure capability, and describe how responsible AI applies—especially for generative AI. The exam often tests your ability to separate “classic NLP” (extracting signals from text) from “generative AI” (creating new text) and to spot when a scenario needs both.

As you read, keep the exam mindset: focus on the verbs in the question (analyze, extract, classify, translate, generate, summarize, answer) and the constraints (latency, privacy, safety, grounding). Many wrong options look plausible because they are “AI-ish” but mismatched to the task. You will practice that selection logic in Section 5.6.

Practice note for Identify NLP tasks and choose the right capability for language scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Explain generative AI concepts (LLMs, prompts, grounding) at AI-900 depth: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Apply responsible AI and safety concepts to language and generative use cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Practice: NLP + generative AI mixed scenarios (exam-style set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 5.1: NLP workloads: sentiment, key phrases, entity recognition, translation

On AI-900, “NLP workloads” usually mean analyzing text rather than generating it. The exam commonly frames these as four bread-and-butter tasks: sentiment analysis, key phrase extraction, entity recognition, and translation. Your job is to match the scenario’s goal to the correct capability.

Sentiment analysis is about classifying opinion or emotional tone (positive/negative/neutral, sometimes with confidence). Typical exam scenario: “Analyze customer reviews to measure satisfaction trends.” The trap: choosing a chatbot or generative model because the input is text. If the output is a score or label rather than new prose, you’re in classic NLP territory.

Key phrase extraction pulls out the main terms from a document (e.g., “shipping delay,” “refund policy”). This is often tested in scenarios like “summarize topics from thousands of support tickets.” The common trap is confusing it with summarization. Key phrases are not sentences; they’re short terms you can use for tagging, search, or dashboards.

Entity recognition (often including “named entity recognition”) identifies structured items in text such as people, organizations, locations, dates, or product names. Exam questions frequently combine this with compliance: “Detect names and addresses in emails.” Be careful: detecting personal data may overlap with privacy requirements, but the task itself is still entity extraction.

Translation converts text from one language to another. The exam will often include a multi-language customer support scenario. Exam Tip: If the scenario explicitly says “convert between languages,” don’t overthink it—translation is the intended capability even if other tasks (like sentiment) could be added later.

  • Choose NLP analysis when the output is labels, extracted terms, entities, or translated text.
  • Watch for “extract”/“identify” verbs (NLP) vs “write”/“create” verbs (generative).

How to identify the correct answer: find the noun that represents the output. “Score,” “entities,” “key phrases,” and “translated text” strongly indicate these NLP workloads.
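Hypothetical output shapes for the four workloads (field names are assumptions for illustration, not a documented API) make that "output noun" test concrete — every result is a label, a term, an entity, or translated text, never newly generated prose:

```python
# Hypothetical NLP analysis outputs for the four classic workloads.
sentiment = {"label": "negative", "confidence": 0.92}     # opinion score
key_phrases = ["shipping delay", "refund policy"]         # short terms, not sentences
entities = [{"text": "Contoso", "category": "Organization"},
            {"text": "May 1", "category": "Date"}]        # structured items in text
translation = {"from": "fr", "to": "en",
               "text": "My order has not arrived"}        # converted language

outputs = [sentiment["label"], key_phrases[0],
           entities[0]["category"], translation["text"]]
print(outputs)
```

If a scenario instead wanted a written apology email to this customer, none of these shapes would fit — that verb ("write") is the handoff to generative AI.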

Section 5.2: Conversational AI basics: chatbots, intents, and orchestration concepts

Conversational AI on AI-900 is about understanding what a chatbot does, how user requests are interpreted, and how the bot connects to other services. You’ll see terms like “bot,” “intents,” “utterances,” and “orchestration.” You do not need to implement conversation flows, but you must recognize these components and pick them in scenarios.

A chatbot provides a conversational interface (web chat, Teams, SMS) to answer questions, guide users, or trigger actions. The exam typically describes goals like “handle common HR questions” or “help users reset passwords.” The key idea is that the bot is the front door; it may call other services behind the scenes.

Intents represent what the user wants to do (e.g., “CheckOrderStatus,” “BookAppointment”). Utterances are example phrases users might say that map to an intent. A common trap is confusing intent classification with sentiment analysis. Sentiment is opinion; intent is purpose.

Orchestration is the “traffic controller” concept: routing an incoming message to the right handler—perhaps a FAQ knowledge base, an action workflow, or a human agent. On AI-900, orchestration is tested as a conceptual design decision: when you have multiple sources (structured data, knowledge articles, generative model responses), you need a plan for selecting and combining them safely.

Exam Tip: If the scenario mentions “handoff to human,” “connect to ticketing,” or “trigger a workflow,” that’s a conversational solution with orchestration, not just a language model generating text.

  • Chatbot = channel + dialogue experience.
  • Intent = user goal; utterance = user phrasing.
  • Orchestration = routing to tools/knowledge/actions.

How to spot correct answers: look for “conversation,” “multi-turn,” “support agent,” “guided interaction,” or “integrate with business systems.” Those point to conversational AI rather than standalone NLP analysis.
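The intent/utterance relationship can be sketched with a toy keyword matcher (a conceptual illustration, not how any specific Azure service implements it):

```python
# Conceptual sketch: intents defined by example utterances, plus a fallback.
intents = {
    "CheckOrderStatus": ["where is my order", "track my package"],
    "BookAppointment": ["schedule a visit", "book an appointment"],
}

def match_intent(utterance: str) -> str:
    for intent, examples in intents.items():
        if any(example in utterance.lower() for example in examples):
            return intent
    return "None"   # no match: hand off to a human or a knowledge base

print(match_intent("Hi, where is my order?"))     # CheckOrderStatus
print(match_intent("Can I book an appointment"))  # BookAppointment
```

The "None" fallback is the orchestration decision in miniature: when no intent matches, route the message elsewhere rather than guessing.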

Section 5.3: Generative AI workloads: content generation, summarization, Q&A

Generative AI workloads differ from classic NLP because the system produces new content. At AI-900 depth, focus on three use cases: content generation, summarization, and question answering (Q&A). The exam expects you to describe what these are, when to use them, and what risks they introduce.

Content generation includes drafting emails, marketing copy, job descriptions, or product FAQs. The prompt asks for an output that did not exist before. The common trap is assuming generation is always appropriate; exam scenarios often include compliance constraints where free-form generation must be controlled.

Summarization compresses long text (meeting notes, incident reports) into shorter text while retaining key points. This looks like “key phrases,” but the output is coherent sentences and bullet summaries. Exam Tip: If the scenario asks for “a short paragraph summary” or “executive summary,” that’s summarization; if it asks for “tags/topics,” that’s key phrase extraction.

Q&A can mean answering user questions based on a defined set of content (policies, manuals) or answering more broadly. For AI-900, you should recognize that higher-quality enterprise Q&A often requires grounding (covered in Section 5.4) so the model answers from provided sources rather than guessing. The trap is choosing a generative model without grounding for high-stakes factual domains (HR policy, medical guidance).

Azure generative AI solutions are commonly described at a high level as using large language models (LLMs) via managed services, with options to connect data sources. The exam doesn’t require SKU memorization, but it does require knowing what a generative model is good at (natural language creation) and what it is not inherently guaranteed to do (always be factual).

  • Generation: create new drafts and variations.
  • Summarization: shorten while preserving meaning.
  • Q&A: answer questions, ideally grounded in trusted content.

How to choose correctly: identify whether the user wants a new narrative response versus an extracted field/label. New narrative response → generative AI workload.

Section 5.4: Prompting fundamentals: instructions, context, examples, constraints

Prompting is testable on AI-900 because it is the main “control surface” for LLM behavior. You should be able to explain, at a practical level, how prompts influence outputs and what elements improve reliability. Think of a prompt as a mini spec: the clearer the spec, the more predictable the result.

Instructions are the direct task request (e.g., “Summarize this report for an executive audience”). Instructions should specify output format, tone, and length when needed. A common trap is vague prompts that cause the model to invent details. The exam may describe inconsistent results and hint that better instructions are required.

Context is the information the model should use: the document to summarize, the policy excerpt to answer from, or the customer’s ticket history. Context is also where grounding begins: you supply authoritative text so the answer is anchored to it. Exam Tip: When you need factual answers about your organization, adding trusted context is often the difference between “chat” and “enterprise Q&A.”

Examples (sometimes called “few-shot” examples) show the model what good outputs look like. On the exam, examples matter when the question mentions consistent formatting or classification-like responses (e.g., always return JSON fields or always follow a template).

Constraints limit behavior: “Use only the provided sources,” “If the answer is not in the context, say you don’t know,” “Do not include personal data,” “Return exactly three bullet points.” Constraints reduce risk and improve evaluation. The trap: assuming the model will naturally follow policies without explicit constraints.

  • Instructions define the task.
  • Context supplies the facts.
  • Examples standardize outputs.
  • Constraints reduce variability and risk.
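The four prompt elements above can be sketched as a simple template assembler. This is a minimal illustration, not an Azure API: the `build_prompt` function and its parameter names are invented for this example, and AI-900 does not require you to write code like this.

```python
# Illustrative sketch: assembling a prompt from instructions, context,
# examples, and constraints. All names here are hypothetical.

def build_prompt(instructions, context, examples=None, constraints=None):
    """Combine the four prompt elements into one prompt string."""
    parts = [f"Task: {instructions}"]
    if examples:
        parts.append("Examples of good output:")
        parts.extend(f"- {ex}" for ex in examples)
    if constraints:
        parts.append("Rules:")
        parts.extend(f"- {c}" for c in constraints)
    # Context goes last so the model reads the task and rules first.
    parts.append(f"Context:\n{context}")
    return "\n".join(parts)

prompt = build_prompt(
    instructions="Summarize the policy excerpt in two sentences for an executive audience.",
    context="Employees may carry over up to five unused vacation days into the next year.",
    constraints=["Use only the provided context.",
                 "If the answer is not in the context, say you don't know."],
)
print(prompt)
```

Notice how the constraints double as safety rails: the "say you don't know" rule is exactly the fallback behavior the exam expects for grounded Q&A scenarios.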

How to identify correct answers: when a scenario describes “unreliable responses,” “wrong format,” or “answers not aligned to our documents,” the best fix is often prompt improvement plus grounding, not switching to a different AI workload.

Section 5.5: Safety and responsible generative AI: hallucinations, data protection, evaluation

Responsible AI is not a separate “ethics lecture” on AI-900; it’s embedded in scenario choices. For language and generative use cases, three risks appear repeatedly: hallucinations, data protection, and evaluation/monitoring.

Hallucinations are plausible-sounding but incorrect outputs. The exam tests whether you understand that LLMs can generate fluent answers that are not guaranteed to be true. Mitigations at AI-900 depth include grounding the model on trusted sources, constraining prompts to “use only provided content,” and designing the system to say “I don’t know” when context is missing. Exam Tip: If the scenario is high-stakes (legal, medical, finance), expect the correct answer to include grounding and human review.

Data protection focuses on preventing leakage of sensitive or personal data. Common scenario: “Summarize customer chats that include addresses and payment details.” You should recognize the need to limit data exposure (send only necessary text), apply access controls, and consider data minimization. The trap is assuming you can freely paste internal documents into a model without governance.

Evaluation means measuring output quality and safety over time. Unlike classic ML where you track accuracy, generative evaluation can involve checking groundedness, factuality, toxicity, and adherence to format. The exam may describe a pilot that works “sometimes” and ask what you should do before production: evaluate outputs with representative test cases, define acceptance criteria, and monitor.

  • Reduce hallucinations: grounding + constraints + fallback behavior.
  • Protect data: least privilege, minimize sensitive inputs, governance.
  • Evaluate: test sets, human review loops, continuous monitoring.
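The "fallback behavior" bullet above can be made concrete with a toy sketch: answer only when the trusted context appears to cover the question, otherwise refuse. The word-overlap check is a deliberately crude stand-in for real retrieval/grounding and is not an Azure feature; the function name and threshold are invented for illustration.

```python
# Toy sketch of grounded answering with a refusal fallback.
# A real system would use retrieval and an LLM; the overlap
# heuristic below is only a stand-in for "does the context cover this?"

def grounded_answer(question, context, min_overlap=2):
    """Answer from context only if enough question words appear in it."""
    q_words = {w.lower().strip("?.,") for w in question.split()}
    c_words = {w.lower().strip("?.,") for w in context.split()}
    if len(q_words & c_words) >= min_overlap:
        return f"Based on the provided sources: {context}"
    # Fallback: refuse rather than guess (hallucination mitigation).
    return "I don't know based on the provided sources."

ctx = "Remote employees must connect through the corporate VPN."
print(grounded_answer("Must remote employees use the VPN?", ctx))
print(grounded_answer("What is the travel reimbursement rate?", ctx))
```

The second call refuses because the context says nothing about reimbursement: that refusal, not a fluent guess, is the responsible behavior the exam rewards in high-stakes scenarios.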

How to choose correct answers: look for risk keywords—“compliance,” “PII,” “incorrect answers,” “harmful content.” Those questions want responsible design choices, not just a capability name.

Section 5.6: Practice questions: scenario selection for NLP vs generative AI solutions

This section prepares you for the most common AI-900 question style: short business scenarios where multiple AI options seem plausible. Your goal is to pick the best fit by identifying (1) the desired output, (2) whether the task is extract/label vs generate, and (3) what safety constraints are implied.

Start by underlining the output artifact. If the scenario wants a score (customer happiness), think sentiment analysis. If it wants tags (main topics), think key phrase extraction. If it wants structured fields (names, locations, dates), think entity recognition. If it wants the same message in another language, think translation. These are "NLP analysis" selections.

If the scenario wants a draft, summary paragraph, email rewrite, or natural language answer, that’s a generative AI workload. Then ask: does it need to be factual based on company data? If yes, the scenario is signaling grounding and orchestration—retrieve trusted content and constrain the model to it. Exam Tip: “Answer using our policy documents” is code for grounded Q&A, not open-ended chat.

Mixed scenarios are common: a support center might use entity recognition to extract order numbers, sentiment to prioritize angry customers, and generative AI to draft a response. The trap is picking only one tool when the scenario clearly needs a pipeline. On AI-900, you won’t design the full architecture, but you should recognize when multiple capabilities are complementary.

  • Extract/label/translate → NLP workloads.
  • Draft/summarize/answer in prose → generative AI workloads.
  • High-stakes facts + internal docs → grounding + constraints + evaluation.

When stuck between two options, choose the one that produces the required output with the least “extra.” The exam rewards selecting the most direct capability rather than the fanciest.

Chapter milestones
  • Identify NLP tasks and choose the right capability for language scenarios
  • Explain generative AI concepts (LLMs, prompts, grounding) at AI-900 depth
  • Apply responsible AI and safety concepts to language and generative use cases
  • Practice: NLP + generative AI mixed scenarios (exam-style set)
Chapter quiz

1. A support team wants to automatically route incoming customer emails into categories such as Billing, Technical Issue, and Account Management. The solution should identify the correct category for each email. Which Azure AI capability best fits this requirement?

Show answer
Correct answer: Text classification with Azure AI Language
This is a classic NLP workload: assigning a label to text (classification). Azure AI Language supports text classification scenarios. Speech to text is for converting audio to text, not categorizing written emails. OCR extracts text from images or scanned documents and does not perform email intent/category classification.

2. A company wants a chat experience that answers employee questions about internal HR policies. To reduce hallucinations, answers must be based only on the company’s approved policy documents. At AI-900 depth, which approach best meets the requirement?

Show answer
Correct answer: Use a large language model with grounding on the HR documents (for example, retrieval-augmented generation)
The requirement is generative Q&A with reduced hallucinations by constraining outputs to approved content, which maps to grounding (often implemented via retrieval-augmented generation). Sentiment analysis is an NLP task for tone/emotion, not factual answering. OCR is only relevant if you need to extract text from images; converting PDFs to images does not address grounding or answer accuracy.

3. A marketing team wants to generate multiple variations of a product description from a short list of bullet points, while keeping the tone professional. Which workload is this an example of?

Show answer
Correct answer: Generative AI text generation using a large language model and prompt instructions
Creating new text variations from input content is a generative AI workload (text generation) and is commonly controlled via prompts (tone/style constraints). Named entity recognition extracts entities from text rather than producing new prose. Translation changes language but does not inherently generate multiple new marketing variants.

4. A healthcare organization plans to summarize patient messages using a generative AI model. They want to reduce the risk of generating disallowed or inappropriate content and ensure oversight for high-impact use. Which responsible AI measure best aligns to this goal?

Show answer
Correct answer: Implement content safety filtering and human review/approval for sensitive outputs
For AI-900 responsible AI, a key concept is applying safety controls such as content filtering and having humans in the loop, especially in sensitive domains like healthcare. Increasing temperature increases randomness/creativity and can raise risk, not reduce it. Language detection can be useful operationally but does not address harmful content generation or oversight requirements.

5. A global retailer wants to analyze social media posts to determine whether customers are happy or unhappy with a recent product launch. They do not need the system to generate responses—only to score the tone of each post. Which NLP task should you choose?

Show answer
Correct answer: Sentiment analysis
Determining happy vs unhappy tone is sentiment analysis (classic NLP signal extraction). Text generation creates new text and is not required here. Grounded question answering is used to answer questions from a known knowledge source; it does not directly measure sentiment of arbitrary social posts.

Chapter 6: Full Mock Exam and Final Review

This chapter is your “dress rehearsal” for AI-900. The goal is not to prove you’re smart—it’s to build exam instincts: pacing, eliminating distractors, recognizing service names, and spotting the test’s favorite traps. AI-900 is designed for non-technical professionals, but it is still precise: the exam rewards clear mapping from scenario language (what the business needs) to Azure AI workload categories (what capability fits) and to responsible AI considerations (what constraints apply).

You will complete two mock exam passes (Part 1 and Part 2), then perform a weak-spot analysis using an answer-review method that mirrors how exam writers think. Finally, you’ll use a domain-by-domain checklist aligned to the official objectives and finish with a practical exam-day plan. Treat this chapter as a workflow: run the mock, review with rigor, refresh by domain, then execute your exam strategy.

Practice note (applies to Mock Exam Part 1 and 2, the Weak Spot Analysis, and the Exam Day Checklist): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.


Section 6.1: Mock exam instructions, pacing, and how to mark questions

Before you start any mock exam, decide how you will manage time and uncertainty. AI-900 questions are typically short, but the traps are in the wording: “best,” “most cost-effective,” “minimize development,” “requires training data,” or “must be explainable.” Your job is to translate those constraints into the correct Azure AI workload and discard options that violate the constraint.

Use a two-pass approach. Pass 1 is speed and confidence: answer what you can quickly and accurately. Pass 2 is for flagged questions. Mark questions (mentally or on your scratch notes) in three levels: (1) "Unsure between two," (2) "Concept gap," (3) "Wording trap." This classification matters because each type needs a different fix during review.

Exam Tip: If a question includes a clear requirement like “no coding” or “prebuilt model,” immediately suspect Azure AI Services (prebuilt) rather than Azure Machine Learning (custom training). If it says “custom model,” “train,” “features,” or “evaluate,” your center of gravity shifts toward Azure Machine Learning concepts.

Pacing: aim for a steady rhythm rather than racing early. If you find yourself rereading, stop and extract the nouns and verbs: what is the input (text, image, audio, tabular data) and what is the desired output (classify, extract, summarize, generate, detect anomalies)? This “input/output” framing prevents drifting into irrelevant options.

  • Pass 1 target: answer quickly; flag anything that takes longer than ~60–90 seconds.
  • Pass 2 target: resolve flagged items by matching requirements to the correct workload/service.
  • Do not change answers without a reason tied to a requirement (avoid “second-guess” changes).

Mock exams are most valuable when you treat them like the real test: quiet environment, no notes, and strict timing. The point is to reveal your decision patterns under pressure.

Section 6.2: Mock Exam Part 1 (mixed domains, exam-style sets)

Mock Exam Part 1 should be a mixed-domain set that forces you to switch contexts the way the real AI-900 does. Expect interleaving across: AI workloads and core concepts, machine learning lifecycle, computer vision, NLP, and generative AI on Azure. Your mindset is “classify the problem first, then pick the capability.”

As you work through Part 1, anchor each scenario to an objective. If the scenario describes predicting a numeric value (sales forecast, demand), think regression. If it describes choosing a category (spam/not spam, churn/retain), think classification. If it describes grouping without labels, think clustering. If it describes outliers (fraud spikes), think anomaly detection. These are the exam’s foundational ML patterns, and the test often checks whether you can identify them from business language.

For Azure selection, watch for the recurring distinction: prebuilt AI Services vs custom ML. In vision, typical exam-tested capabilities include OCR (read text), image analysis (tags, captions, object detection), face detection (not identification unless explicitly allowed and supported), and document intelligence for structured extraction. In NLP, identify whether the task is sentiment analysis, key phrase extraction, language detection, entity recognition, or conversational bot scenarios.

Exam Tip: If a scenario asks for “extract text from scanned receipts or forms,” the strongest instinct is Document Intelligence (form/receipt processing) rather than basic OCR alone, because the requirement implies structure (fields) not just raw text.

Generative AI questions may test concepts rather than deep engineering: what a prompt is, what grounding/augmentation is (bringing in enterprise data), and responsible deployment (content filters, human-in-the-loop, privacy). When you feel uncertain, return to constraints: “needs citations,” “must use company data,” “avoid hallucinations,” “restrict unsafe content.” Those keywords point toward retrieval-augmented generation patterns and safety tooling rather than raw model selection.

After Part 1, do not immediately look at answers. First, write down which domains felt slow or guessy. This note becomes the input to your weak-spot analysis later.

Section 6.3: Mock Exam Part 2 (mixed domains, exam-style sets)

Mock Exam Part 2 should be taken after a short break, because fatigue changes your accuracy. This set should include more “best option” and “choose the correct service” style prompts, which are common in AI-900. Here, the exam is testing whether you can reject plausible-but-wrong options that are adjacent in the Azure ecosystem.

A common trap is confusing what requires training. Prebuilt services (Azure AI Vision, Azure AI Language, Azure AI Speech) typically do not require you to bring labeled training data; you configure and call an API. By contrast, Azure Machine Learning is about building, training, and managing custom models (including data prep, training runs, and evaluation metrics). If the scenario says “no training data available” yet offers “train a model,” that option is likely a distractor.

Another trap is mixing up evaluation metrics. Classification tends to emphasize accuracy, precision, recall, and F1. Regression is often about MAE/MSE/RMSE. Clustering is more about grouping quality and interpretability than “accuracy.” The exam won’t usually require heavy math, but it will test that you pick metrics that match the task. If the task is “catch fraud,” recall might matter more than accuracy because missing fraud is costly.

Exam Tip: When you see “imbalanced data” (rare positives like fraud), expect distractors that recommend accuracy. Accuracy can look high while failing the business goal. Look for precision/recall framing or thresholds.
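The tip above is easy to verify with arithmetic. In this invented example, 10 of 1,000 transactions are fraud; a naive model that never flags fraud still scores 99% accuracy while catching none of the fraud (0% recall). The numbers are illustrative only.

```python
# Why accuracy misleads on imbalanced data: a "never fraud" model.
actual    = [1] * 10 + [0] * 990   # 10 real fraud cases out of 1,000
predicted = [0] * 1000             # naive model: never flags fraud

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)   # 990 correct "not fraud" calls

true_positives = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
recall = true_positives / sum(actual)  # fraction of real fraud caught

print(f"accuracy = {accuracy:.2%}, recall = {recall:.0%}")
# accuracy = 99.00%, recall = 0%
```

This is exactly the distractor pattern to watch for: an option citing "high accuracy" on a rare-positive task, when the business goal (catch fraud) demands recall.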

Responsible AI is also frequently embedded: fairness, reliability/safety, privacy/security, inclusiveness, transparency, and accountability. If an option suggests using sensitive attributes (race, health) without justification, or deploying without monitoring, it is often incorrect. Likewise, generative AI scenarios may test whether you apply content moderation, data protection, and human oversight.

Finish Part 2 by tallying your confidence levels. The point is to see whether you’re consistently missing one domain (for example, confusing Vision vs Document Intelligence, or mixing Language features) or whether errors are mostly from rushing and not reading constraints.

Section 6.4: Answer review method: why each option is right/wrong

Your score improves fastest not by doing more questions, but by reviewing in a way that removes repeat mistakes. Use a “three-column review” for every missed or uncertain item: (1) the requirement keywords, (2) the correct capability and why it fits, (3) why each wrong option fails a requirement.

Start by rewriting the question in your own words as an input/output statement. Example format: “Input = customer emails (text). Output = detect sentiment + extract key topics. Constraints = no custom training.” Then map: NLP prebuilt features are a match; custom ML is likely unnecessary. This approach forces clarity and reduces the chance you’ll be seduced by brand-name distractors.

For wrong options, don’t write “not right.” Write the specific mismatch: “requires labeled training data,” “outputs tags not structured fields,” “detects language but not entities,” “does image captioning but not OCR.” Over time you build a personal “distractor dictionary,” which is exactly what AI-900 is testing: your ability to distinguish neighboring capabilities.

Exam Tip: If you can’t explain why three options are wrong, you don’t fully own the concept yet—even if you guessed the right letter. The exam punishes shallow recognition because distractors are designed to look familiar.

Also review any question you got right but felt uncertain about. Many candidates plateau because they only study incorrect answers and ignore weak knowledge that happened to result in a correct guess. Tag these as “lucky correct” and include them in your refresh checklist.

Finally, identify the failure mode: (a) concept gap, (b) service confusion, (c) didn’t read constraints, (d) changed answer without evidence. Your remediation differs: concept gaps need study, service confusion needs comparison tables, constraint errors need a reading protocol, and answer-changing needs discipline.

Section 6.5: Final domain-by-domain refresh checklist aligned to objectives

Use this final checklist to align your knowledge with AI-900 objectives. This is not a content dump—treat it as a verification routine. If you cannot explain an item in one or two sentences, flag it for a quick revisit.

  • AI workloads & core concepts: Identify when a scenario is prediction vs recognition vs generation; distinguish supervised/unsupervised; know classification vs regression vs clustering vs anomaly detection; understand what “features” and “labels” mean.
  • Machine learning on Azure: Training vs inference; train/test split and why it matters; evaluation metrics matched to task (precision/recall vs RMSE); basic model management concepts; when to use Azure Machine Learning vs prebuilt AI Services.
  • Responsible AI: The six principles (fairness, reliability/safety, privacy/security, inclusiveness, transparency, accountability); recognize mitigation actions (data governance, monitoring, human oversight, documentation).
  • Computer vision workloads: OCR vs image analysis vs object detection; when Document Intelligence is the better fit (structured forms/receipts/invoices); know the “shape” of outputs (text, bounding boxes, fields).
  • NLP workloads: Sentiment, key phrases, entities, language detection, summarization; conversational scenarios and intent/utterances at a conceptual level; distinguish extracting information from generating content.
  • Generative AI on Azure: What prompts are; why grounding with enterprise data reduces hallucinations; responsible deployment (content filtering, privacy, data boundaries); common use cases (drafting, summarizing, Q&A, copilots) and the risks (fabrication, sensitive data exposure).

Exam Tip: The fastest points come from clean service matching. Make sure you can confidently answer: “Is this best solved with a prebuilt service, custom ML, or generative AI?” before you worry about anything else.

After running this checklist, pick only the top 2–3 weak areas to refresh. Over-studying everything at once increases confusion and reduces recall on exam day.

Section 6.6: Exam day plan: environment, time strategy, and last-24-hours review

Exam day performance is a product of preparation plus execution. Your plan should reduce avoidable errors: stress, rushing, and misreading constraints. Start with environment: stable internet, quiet room, cleared desk, and a simple way to track flagged questions. If you are testing online, complete the system check early and eliminate interruptions.

Time strategy: use the same two-pass method you practiced. Pass 1: answer confidently, flag anything uncertain. Pass 2: resolve flags by mapping requirements to capabilities. If you still cannot decide, eliminate options that violate constraints (needs training, wrong modality, wrong output type) and choose the remaining best fit. The exam often rewards elimination more than perfect recall.

Exam Tip: Read the last line of the question twice—this is where “best,” “most cost-effective,” “minimize effort,” or “must be responsible” often appears, and it changes the correct answer.

Last 24 hours: do not attempt to learn brand-new material. Re-run your domain checklist, review your “distractor dictionary,” and revisit only the questions you flagged as “concept gap.” Sleep matters because AI-900 is heavy on recognition and careful reading; fatigue increases careless mistakes.

Right before you start, set a simple rule: only change an answer if you can point to a specific requirement that your new choice satisfies better. This protects you from the common trap of switching from a correct answer to a tempting distractor.

When you finish, use any remaining time to review flagged questions first, then any "lucky correct" areas you recall. End with a final scan for unanswered items. Your goal is a calm, systematic finish: exactly the mindset this chapter trained.

Chapter milestones
  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist
Chapter quiz

1. A retail company wants to analyze 50,000 customer reviews to identify overall sentiment and extract key phrases. The company does not want to build or train a machine learning model. Which Azure AI service should you recommend?

Show answer
Correct answer: Azure AI Language
Azure AI Language (Text Analytics capabilities) supports sentiment analysis and key phrase extraction as prebuilt NLP features, aligning to the AI-900 Natural Language Processing workload. Azure AI Vision is for image/video analysis, not text. Azure Machine Learning is used to build and train custom models; it’s unnecessary when prebuilt language features meet the requirement.

2. A healthcare organization plans to deploy an AI model that helps prioritize incoming patient messages. The organization is concerned that the model might treat certain age groups unfairly. Which Responsible AI principle is most directly related to this concern?

Show answer
Correct answer: Fairness
Fairness addresses whether an AI system produces equitable outcomes across groups (for example, different age ranges). Reliability and safety focus on consistent performance and avoiding harm due to failures, not bias across groups. Transparency is about understanding how the system works and explaining outcomes, which can help diagnose bias but is not the principle most directly tied to unfair treatment.

3. You are reviewing a practice exam and see the requirement: "Convert recorded customer service calls into text and create a searchable transcript." Which Azure AI capability best matches this requirement?

Show answer
Correct answer: Speech to text in Azure AI Speech
Speech to text in Azure AI Speech converts spoken audio into written text, matching the scenario. Azure AI Translator translates text between languages but does not transcribe audio. OCR in Azure AI Vision extracts text from images and scanned documents, not from audio recordings.

4. During a mock exam, you encounter a scenario describing a manufacturing company that wants to predict equipment failures using historical sensor data. The company has data scientists who can train models and wants full control over the training process. Which Azure service is the best fit?

Show answer
Correct answer: Azure Machine Learning
Azure Machine Learning is designed for building, training, and deploying custom machine learning models, which fits predictive maintenance with sensor data and a need for training control. Azure AI Language is focused on NLP workloads rather than time-series sensor predictions. Azure AI Services (prebuilt) provides ready-made capabilities (vision, speech, language) but does not provide the same end-to-end custom training workflow as Azure Machine Learning for this scenario.

5. You are creating an exam-day checklist for AI-900. Which action best aligns with the exam strategy of mapping scenario language to the correct Azure AI workload category?

Show answer
Correct answer: Identify keywords that indicate vision, speech, language, or decision workloads before selecting a service
AI-900 questions commonly reward quickly mapping scenario cues (for example, images/video, audio, text, recommendations/anomaly detection) to workload categories, then selecting the appropriate Azure AI service. Picking the most advanced-sounding service is a trap and can lead to incorrect choices (for example, choosing Azure Machine Learning when a prebuilt service suffices). Predictive requirements do not always mean deep learning or custom training; many scenarios use prebuilt or simpler ML approaches, and AI-900 emphasizes selecting fit-for-purpose services.