AI Certification Exam Prep — Beginner
Everything you need to pass AI-900—clear concepts, Azure services, real exam practice.
This course is a complete exam-prep blueprint for the Microsoft AI-900: Azure AI Fundamentals certification. It’s designed for beginners with basic IT literacy who want a clear, confidence-building path through the official objectives. You’ll learn how to recognize AI workloads, understand core machine learning concepts on Azure, and select the right Azure AI services for computer vision, natural language processing (NLP), and generative AI scenarios.
AI-900 questions often test practical judgment: “Which workload is this?” and “Which Azure service fits best?” This course focuses on those decision points, using objective-aligned explanations plus exam-style practice to help you avoid common traps and recognize the wording patterns Microsoft uses.
Chapter 1 gets you exam-ready before you start: registration steps, what to expect on test day, how scoring works at a high level, and a study strategy that fits a 2-week or 4-week timeline. Chapters 2 through 5 each map directly to one or two official domains, building from simple definitions to scenario-based thinking. Each of those chapters ends with exam-style practice so you can immediately validate your understanding and identify weak areas early.
Chapter 6 is your capstone: a full mock exam split into two parts, followed by a structured review process so you can turn mistakes into points. You’ll finish with a final service-selection refresh and an exam-day checklist to reduce surprises and improve performance under time pressure.
This course is for anyone preparing for Microsoft AI-900—students, career changers, IT professionals, and business stakeholders who want to understand Azure AI capabilities. No prior Azure or certification experience is required; the course assumes only basic familiarity with using a computer and navigating web tools.
If you’re new to the platform, begin here: Register free. Want to compare options first? You can also browse all courses. Then follow the chapters in order, complete the practice sets, and use the mock exam to confirm you’re ready for AI-900.
Microsoft Certified Trainer (MCT) | Azure AI Fundamentals Specialist
Nadia Richardson is a Microsoft Certified Trainer who helps learners pass Microsoft Azure certification exams through practical, objective-aligned instruction. She specializes in AI-900 and Azure AI services, translating exam domains into clear study paths with targeted practice.
The AI-900: Microsoft Azure AI Fundamentals exam is designed to validate that you understand what AI can do, how common AI workloads are described, and which Azure services are typically used for those workloads. This chapter helps you orient yourself to the exam’s goals, how the test is structured, how to register for and sit the exam, and, most importantly, how to study with discipline using a two-week or four-week plan. The AI-900 is not a “coding exam,” but it is not a vocabulary quiz either. You will be asked to recognize scenarios, match them to the right AI approach (classification vs. regression vs. clustering, etc.), and select the best Azure AI service (for vision, language, and generative AI).
As you work through this course, keep a running objective map: for every concept you learn, connect it to (1) a workload type, (2) a business scenario, and (3) the Azure service family that best fits. That is how AI-900 tests you—by requiring you to translate between plain-language needs and correct solution choices, while avoiding misleading distractors.
Exam Tip: When a question sounds like it’s asking “Which service do I use?” first identify the workload (vision, NLP, ML, or generative AI). Only then choose the service. Many wrong answers are “real Azure products” that simply don’t match the workload.
Practice note for Understand AI-900 exam goals and who it’s for: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Register, schedule, and take the exam (online or test center): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Know the question formats and scoring basics: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Build a 2-week and 4-week study strategy with checkpoints: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 is an entry-level fundamentals exam aimed at learners who want to understand AI concepts and Azure AI services without needing deep data science experience. It is appropriate for technical and non-technical roles: students, business analysts, product owners, early-career developers, and IT professionals who need to speak confidently about AI solution scenarios. The certification signals you can describe common AI workloads and map them to Microsoft’s Azure AI offerings—an increasingly important skill when organizations evaluate computer vision, language, and generative AI projects.
What the exam is for: establishing baseline fluency in AI vocabulary, core machine learning ideas (training vs. inference, evaluation metrics, overfitting), and Azure service selection. What it is not for: implementing custom neural networks from scratch, writing extensive code, or performing advanced statistics. Expect scenario-based questions that test whether you understand what the technology does and how Azure positions services to deliver it.
In this course, the outcomes align tightly with the AI-900 intent: describe AI workloads, explain ML fundamentals on Azure, identify computer vision and NLP workloads and the right services, and describe generative AI workloads plus responsible AI considerations. You should leave Chapter 1 knowing (1) what you’re preparing for, (2) how to take the exam, and (3) how to study efficiently under a fixed timeline.
Exam Tip: AI-900 rewards clarity over complexity. If you find yourself choosing an answer because it “sounds advanced,” pause—fundamentals questions usually prefer the simplest service/approach that satisfies the scenario.
To study efficiently, you must map every lesson to an exam objective. AI-900 generally covers: (1) AI workloads and considerations, (2) machine learning principles on Azure, (3) computer vision workloads, (4) natural language processing workloads, and (5) generative AI workloads and responsible AI. Your job is not to memorize product names in isolation; it’s to recognize patterns in requirements and select the correct approach and service family.
Use an “objective map” sheet with five columns: Domain, Workload keywords, Typical tasks, Azure services, and Common traps. For example: “Computer vision” → keywords like image classification, object detection, OCR → tasks like labeling, extracting text → services like Azure AI Vision (and OCR features) → traps like confusing OCR (text extraction) with translation (language). For “NLP” map sentiment analysis, key phrase extraction, entity recognition, summarization, and conversational bots to the appropriate language services and scenarios.
Generative AI appears as both capability and governance: understand what large language models (LLMs) are used for (drafting, summarizing, reasoning assistance), but also what responsible AI asks for (fairness, reliability/safety, privacy/security, inclusiveness, transparency, accountability). Expect questions that test whether you can choose a responsible mitigation (human-in-the-loop, content filters, grounding, data minimization) rather than simply selecting a model.
Exam Tip: When two answers both “could work,” the exam often prefers the one most directly aligned to the domain objective. If the question is about extracting text from images, don’t pick a general ML platform just because it’s flexible—choose the vision/OCR capability that matches the task.
Plan the logistics early so exam day is calm. You typically register through Microsoft’s certification portal and select either online proctoring or a test center. Online is convenient but requires a compliant environment: a quiet room, clear desk, stable internet, and the ability to complete a system check. Test centers reduce the risk of connectivity issues but require travel and fixed schedules. Choose the format that minimizes uncertainty for you, not the one that seems “easier.”
Accommodations should be requested well in advance if needed (for example, extra time). Read the exam policies carefully: identification requirements, permitted items, breaks, and rules on leaving the webcam view for online sessions. A common failure mode is not content knowledge—it’s a disrupted session due to policy violations (phone visible, notes on desk, background noise, or leaving the room).
Also understand rescheduling and cancellation rules. Build your study plan around a target date with a buffer window for unexpected life events. For a two-week sprint, schedule near the end of week two; for a four-week plan, schedule near the end of week four with a “decision checkpoint” at the end of week two to confirm you are on track.
Exam Tip: If you test online, treat your workspace like a clean-room: remove extra monitors, cover whiteboards, clear papers, and silence notifications. Proctor interruptions can break focus and cost you time.
AI-900 questions typically include multiple choice and scenario-based items. You may see case-study style prompts or sets of questions tied to a short scenario. Your advantage comes from a consistent approach: identify the workload, identify the task (classification vs. prediction vs. extraction vs. generation), then map to the most appropriate Azure service or concept.
Timing strategy: pace yourself so you do not over-invest in one question. When stuck between two options, eliminate wrong choices using the “scope test”: does the answer address the full requirement with minimal assumptions? Many distractors are partially correct (e.g., a general analytics service when an AI service is explicitly required). Use a flag/review approach if available, but don’t depend on having large review time at the end—aim for accuracy on the first pass.
Scoring basics: Microsoft exams use a scaled scoring model (typically a 1–1,000 scale with 700 required to pass); your job is to maximize correct answers and avoid careless mistakes. Retake strategy matters: if you don’t pass, treat the score report as a diagnostic. Retake only after you can explain the correct answer for each missed objective in plain language and can consistently select services based on scenarios.
Exam Tip: Watch for “tell words” that indicate the task: “predict a number” (regression), “choose a category” (classification), “group similar items” (clustering), “extract printed text” (OCR), “detect objects” (object detection), “summarize or draft” (generative AI). These keywords often unlock the correct answer quickly.
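If you want to drill these tell words actively, a tiny self-test script helps. A minimal sketch in Python (the keyword-to-task pairs simply restate the tip above; the helper name and phrase list are illustrative, so extend them with your own rules):

```python
# Study aid: map AI-900 "tell words" to the task they usually signal.
TELL_WORDS = {
    "predict a number": "regression",
    "forecast": "regression",
    "choose a category": "classification",
    "categorize": "classification",
    "group similar items": "clustering",
    "extract printed text": "OCR",
    "detect objects": "object detection",
    "summarize": "generative AI",
    "draft": "generative AI",
}

def guess_task(scenario: str) -> str:
    """Return the first task whose tell word appears in the scenario."""
    text = scenario.lower()
    for phrase, task in TELL_WORDS.items():
        if phrase in text:
            return task
    return "unknown - reread the scenario"

print(guess_task("Forecast daily delivery requests for next week"))  # regression
```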
Common Trap: Confusing training vs. inference. Training is when the model learns from labeled/known data; inference is when you use the trained model to make predictions. Questions may mix these terms to see if you notice.
Use three resource layers in the right order. First, Microsoft Learn paths for AI-900 give you structured coverage aligned to exam objectives. Second, hands-on labs and sandbox exercises help you turn definitions into recognition—when you see a service in action, you remember what it does and when to use it. Third, official documentation fills gaps and resolves confusion (for example, differences between vision analysis features, language analysis tasks, and generative AI workflows).
Your note-taking method should be optimized for exam decisions, not for writing a textbook. Create a one-page “service decision table” with rows for common workloads and columns for “What it does,” “Input/Output,” “When to choose,” and “What it is not.” Update it after each study session. For machine learning, maintain a compact sheet of key terms: features, labels, training/validation/test sets, overfitting, evaluation metrics, and model deployment. For responsible AI, keep a checklist of principles and mitigations.
Two-week plan (high intensity): Days 1–3: AI workloads + ML fundamentals; Days 4–6: vision + NLP service mapping; Days 7–9: generative AI + responsible AI; Days 10–12: mixed review and targeted reading; Days 13–14: full practice review and final consolidation. Four-week plan (lower intensity): spend one week per major domain, then a final week for review and practice, with checkpoints at the end of weeks 1, 2, and 3 to verify you can map scenarios to services without notes.
Exam Tip: Don’t just read—rewrite. After a lesson, write two or three “If you see X, choose Y” rules (e.g., “If the requirement is OCR from images, choose Azure AI Vision OCR capability”). These rules are exactly how you will think under time pressure.
Practice is only valuable if you review mistakes correctly. Treat every missed question as a signal about a specific gap: concept misunderstanding (e.g., confusing precision vs. recall), service confusion (choosing a general platform instead of a task-specific AI service), or reading error (missing a key constraint like “real-time” or “extract text”). Your review workflow should force you to articulate the correct reasoning, not just memorize the right option.
Use a “mistake log” with four fields: (1) objective/domain, (2) why your choice seemed plausible, (3) why it is wrong, (4) the rule you will apply next time. Then schedule “spaced reviews” of the log every 2–3 days. This is especially effective for AI-900 because many questions are pattern recognition: once you repeatedly correct the same type of confusion, your accuracy climbs quickly.
Track weak areas with a simple dashboard: percentage correct by domain (AI workloads, ML, vision, NLP, generative/responsible AI). Your checkpoint targets: by the midpoint of a two-week plan, you should be able to identify the workload and service family correctly for most scenarios; by the final days, you should be refining edge cases and eliminating traps.
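One lightweight way to keep the mistake log and the domain dashboard together is a short script you update after each practice set. A minimal sketch (field names and sample entries are illustrative, not an official template):

```python
# Mistake log: the four fields described above, plus a domain tag.
mistake_log = [
    {"domain": "NLP",
     "plausible": "Translation also processes text",
     "wrong_because": "The task was sentiment, not translation",
     "rule": "Sentiment/key phrases -> language analysis, not translation"},
    {"domain": "ML",
     "plausible": "Accuracy was very high",
     "wrong_because": "The classes were imbalanced",
     "rule": "Rare events -> check precision/recall, not accuracy"},
]

# Dashboard: percentage correct by domain from your practice attempts.
attempts = {"AI workloads": (18, 20), "ML": (14, 20), "Vision": (16, 20),
            "NLP": (17, 20), "Generative/Responsible AI": (15, 20)}
for domain, (correct, total) in attempts.items():
    print(f"{domain:>25}: {correct / total:.0%}")
```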
Exam Tip: When reviewing, practice explaining the answer out loud in one sentence. If you can’t explain it simply (e.g., “This is OCR, so we use a vision service that extracts text from images”), you likely don’t own the concept yet.
Common Trap: Over-generalizing “Azure Machine Learning” as the answer to everything. The exam often expects you to choose specialized prebuilt AI services (vision/language/generative offerings) when the scenario is a standard cognitive task rather than custom model development.
1. You are advising a project manager who is new to AI-900. They believe the exam is primarily a hands-on coding test. Which statement best describes what AI-900 is designed to validate?
2. A question on the AI-900 exam asks: "Which Azure service should you use to extract key phrases and detect sentiment from customer feedback?" What is the best first step to avoid misleading distractors?
3. A company is planning to take AI-900 and wants to reduce the risk of missing topics. Which approach best aligns with the chapter’s recommended study method?
4. You are 10 days away from your planned AI-900 exam date and can study about 1 hour per day. Which plan best fits the chapter’s guidance on study strategies and checkpoints?
5. You are registering for AI-900 and deciding how to take the exam. Which statement reflects the chapter’s guidance on scheduling and taking the exam?
Domain 1 of AI-900 is about recognizing “what kind of problem is this?” and mapping it to the right AI workload and (at a high level) the right Azure AI service. The exam is not testing that you can build models; it tests whether you can describe AI workloads, choose sensible solution patterns, and spot when AI is unnecessary.
This chapter connects four exam behaviors you must master: (1) recognize when AI adds value versus when traditional software is sufficient, (2) match business scenarios to workload types (vision, language, prediction, generative), (3) explain responsible AI basics and the risk areas that show up on the test, and (4) practice making the “best next step” decision without overengineering.
Exam Tip: When you read a scenario, first underline the input type (text, image, audio, tabular data) and the desired output (label, score, description, generated content). That simple mapping eliminates many distractor answers.
Practice note for Recognize when to use AI vs traditional software: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Match business scenarios to AI workload types: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Explain responsible AI basics and risk areas: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Answer domain-focused practice questions and review rationales: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
For AI-900, “AI” is best understood as systems that learn patterns from data (or use pretrained models) to perform tasks that are hard to solve with explicit rules. Traditional software excels when logic is stable and you can precisely specify steps (for example, calculating tax, validating a checksum, or routing based on fixed thresholds). AI is a good fit when the rules are ambiguous, change over time, or would take too many rules to maintain—think of recognizing objects in photos, extracting meaning from free-form text, or predicting demand from messy historical data.
Common AI workload patterns tested in Domain 1 include: (1) prediction/scoring (produce a numeric value or probability), (2) classification (assign one of a set of categories), (3) detection/extraction (locate entities/objects or pull structured info from unstructured input), and (4) generation (create new text, images, or code from prompts). These patterns show up across machine learning, computer vision, natural language processing (NLP), and generative AI.
Exam Tip: If the scenario says “determine whether,” “categorize,” or “route,” you’re usually in classification. If it says “forecast,” “estimate,” “probability,” or “risk score,” you’re usually in prediction. If it says “find,” “identify where,” “extract fields,” or “detect,” you’re in detection/extraction.
Recognizing when not to use AI is equally important. If a business rule is explicit (“if temperature > 100 then alert”), if the organization lacks data or cannot label it, or if the cost of wrong predictions is unacceptable without oversight, a deterministic approach may be preferable. The exam often includes distractors that push AI when a simple rule-based solution suffices.
AI-900 expects you to differentiate machine learning (ML), deep learning (DL), and non-AI “classical” approaches, mainly so you can choose appropriate solution families. ML is a broad set of methods that learn from examples (data) to make predictions or decisions. Deep learning is a subset of ML that uses neural networks with multiple layers and is especially effective for images, speech, and complex language tasks—often requiring more data and compute, but reducing manual feature engineering.
Classical approaches include rule-based logic, deterministic algorithms, and traditional statistics. These are not “wrong”; they are often faster to implement, easier to explain, and more reliable when conditions are stable. On the exam, a common trap is assuming “AI” is always required for anything involving data. If the problem statement provides clear thresholds, fixed mappings, or a small set of rules, classical software is likely the best answer.
From an Azure perspective (high level), you may implement ML by training a model on historical labeled data (supervised learning) or by finding patterns without labels (unsupervised learning). The exam frequently frames supervised learning as “we have past examples with the correct outcome,” such as customer churn labels or loan default outcomes. Reinforcement learning is less emphasized at the fundamentals level but can appear conceptually as “learning through rewards” in dynamic environments.
Exam Tip: When the scenario mentions “labeled training data” (for example, emails marked as spam/not spam), choose a supervised ML framing. When it mentions “group similar items” without labels (for example, segment customers by behavior), that aligns with clustering (unsupervised).
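To make the labeled-versus-unlabeled distinction concrete, here is a minimal scikit-learn sketch on synthetic data (the exam will not ask for code; this is purely illustrative). Note that the classifier cannot be trained without the label vector y, while the clustering model is fitted on the features alone:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))             # features only

# Supervised: training requires known outcomes (labels).
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # synthetic "spam / not spam" labels
clf = LogisticRegression().fit(X, y)
print("classified:", clf.predict(X[:3]))

# Unsupervised: no labels; the model discovers the groups itself.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster ids:", km.labels_[:3])
```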
Deep learning is often implied when the input is unstructured and high-dimensional (images, audio, natural language at scale). However, many Azure AI services provide pretrained models, so the decision is not “build a neural net from scratch” but “use an existing cognitive capability” versus “train a custom model.”
Domain 1 questions are usually solved by mapping scenario language to one of four workload categories. Prediction outputs a number or probability (for example, forecast sales next month, estimate delivery time, compute risk). Classification outputs a discrete label (approve/deny, spam/ham, topic A/B/C). Detection/extraction finds something within data (detect a face in an image, extract invoice fields, identify entities like names and dates in text). Generation creates new content (summaries, answers, marketing copy, code, images).
Computer vision scenarios often involve detection (locating objects), classification (labeling an image), or extraction (OCR and document understanding). NLP scenarios frequently involve classification (sentiment, intent), extraction (key phrases, entities), and generation (summarization, Q&A, chat). ML on tabular data is commonly prediction or classification. Generative AI is usually generation, but it can be combined with extraction (retrieve facts) and classification (moderation, routing) in real solutions.
Exam Tip: Watch for “where” versus “what.” “What is in the image?” suggests classification. “Where is the object/face?” suggests detection (bounding boxes). “Read the text from the image/document” suggests OCR and extraction.
A frequent exam trap is confusing “classification” with “detection.” If the scenario needs bounding boxes, coordinates, or identifying multiple instances (three products on a shelf), it is detection. If it just needs a single label for the whole input (“contains a dog”), it is classification.
Another trap is treating generation as always a chatbot. Generation includes summarizing long documents, drafting emails, generating product descriptions, and creating code snippets. In these cases, the output is new content, not just a label or numeric score.
AI-900 expects service-level recognition, not implementation detail. At a high level, choose Azure Machine Learning when you need to train, evaluate, and deploy custom ML models—especially for tabular prediction/classification, end-to-end model lifecycle, and MLOps. Choose Azure AI services (prebuilt) when you want common capabilities without building your own model, such as vision, speech, language, and document processing.
For computer vision, Azure AI Vision typically covers image analysis, OCR, and object/scene understanding. For document-centric extraction (invoices, receipts, forms), Azure AI Document Intelligence is designed for structured field extraction from documents. For NLP, Azure AI Language supports tasks like sentiment analysis, key phrase extraction, entity recognition, and custom text classification. For speech, Azure AI Speech handles speech-to-text, text-to-speech, translation, and speaker-related scenarios.
For generative AI, Azure OpenAI Service is the core option for large language models and generative use cases such as summarization, chat, and content generation, typically combined with your data and safety controls. The exam may describe scenarios like “generate answers from company policies” or “draft responses,” which generally map to Azure OpenAI at a high level.
Exam Tip: If the scenario says “custom model from our historical data” and mentions training, features, evaluation, or pipelines, lean toward Azure Machine Learning. If it says “detect text, analyze images, extract entities” and implies a prebuilt capability, lean toward Azure AI services.
A common trap is selecting Azure Machine Learning for basic OCR or sentiment analysis. Those are classic “use the service” cases. Another trap is assuming a generative model is needed for any text problem; if the task is simply to identify sentiment or extract names/dates, a language analytics capability is more appropriate than generation.
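A simple way to internalize this mapping is to keep it as a literal lookup table. The sketch below only restates the service guidance from this section as a Python dictionary; it is a study aid, not an exhaustive Azure catalog:

```python
# Workload -> typical Azure service family, as summarized above.
SERVICE_MAP = {
    "custom model training / MLOps": "Azure Machine Learning",
    "image analysis / OCR": "Azure AI Vision",
    "invoice and form field extraction": "Azure AI Document Intelligence",
    "sentiment / key phrases / entities": "Azure AI Language",
    "speech-to-text / text-to-speech": "Azure AI Speech",
    "summarization / chat / content generation": "Azure OpenAI Service",
}

for workload, service in SERVICE_MAP.items():
    print(f"{workload:<42} -> {service}")
```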
Responsible AI is tested as concepts and risk recognition. You should be able to describe core principles and identify what could go wrong. Fairness means the system’s outcomes should not systematically disadvantage groups (for example, biased hiring recommendations). Reliability and safety means the system performs consistently under expected conditions and fails gracefully (for example, an inspection model should be monitored for drift and validated on new camera lighting). Privacy and security involve protecting sensitive data, minimizing collection, and controlling access. Transparency includes being clear that users are interacting with AI, what data is used, and providing understandable explanations where possible. Inclusiveness means designing systems that work for people of all abilities and backgrounds, and accountability means that people and organizations, not the model, remain answerable for the system’s outcomes.
For Domain 1, focus on recognizing risk areas: biased training data, unrepresentative samples, label errors, and feedback loops (where model outputs influence future data). Generative AI introduces additional risks like hallucinations (plausible but incorrect outputs), prompt injection, data leakage, and harmful content generation. The exam often frames these as “what should you consider?” rather than “how do you code it?”
Exam Tip: If a scenario impacts people’s opportunities (loans, hiring, healthcare), fairness and transparency are key. If it involves personal data (audio recordings, customer chats, IDs), privacy and security considerations are central. If it’s safety-critical (manufacturing defects, medical triage), reliability and human oversight matter most.
Another common trap is treating responsible AI as a one-time checklist. Expect wording that implies ongoing monitoring: model drift, periodic evaluation, and incident response processes. For generative solutions, emphasize guardrails (content filtering, grounding with trusted sources, and clear user disclosures) rather than “the model will always be correct.”
This domain rewards disciplined scenario parsing. In practice, you should be able to read a short prompt and quickly decide: (1) Is AI needed? (2) If yes, which workload category fits—prediction, classification, detection/extraction, or generation? (3) Which Azure family is most appropriate at a high level—Azure Machine Learning, Azure AI services (Vision/Language/Speech/Document), or Azure OpenAI?
When reviewing your answers, look for keywords that signal the correct mapping. “Forecast,” “estimate,” and “probability” point to prediction. “Choose one of these categories” points to classification. “Locate,” “extract fields,” “read text from documents,” and “find objects” point to detection/extraction. “Draft,” “summarize,” “answer questions,” and “create” point to generation. Your job is to ignore irrelevant story details (industry, location) unless they affect responsible AI (for example, regulated data, high-stakes decisions).
Exam Tip: If two answer choices both sound plausible, pick the one that requires the least custom training and the most direct fit to the input/output. Fundamentals exams favor the simplest correct architecture.
Finally, bake responsible AI into your reasoning. If the scenario touches customer data, ensure privacy controls and minimal data exposure. If it impacts eligibility or decisioning, think fairness and transparency. If it requires trustworthy outputs (especially for generative), assume validation, grounding, and human review are necessary. Many Domain 1 items include a “best practice” angle, and the safest, most responsible option is often the correct one even when multiple technical options exist.
1. A manufacturing company wants to automatically route customer support emails into one of five issue categories (billing, shipping, returns, product defect, other). They have historical emails labeled with the correct category. Which AI workload type best fits this requirement?
2. A retailer needs to apply a fixed discount rule: If a customer has spent more than $1,000 in the past 12 months, apply a 10% discount at checkout; otherwise, apply no discount. The business asks whether they should "use AI" to implement this. What is the best response?
3. A logistics company wants to forecast the number of delivery requests it will receive each day next week using historical daily order counts, promotions, and weather data. Which workload type best matches the goal?
4. A bank is evaluating an AI model that helps approve or deny loan applications. Which responsible AI risk area is most directly implicated if the model produces systematically lower approval rates for a protected group, even when applicants have similar financial profiles?
5. A travel company wants a chatbot that can draft personalized itinerary suggestions and rewrite them in different tones (formal, friendly) based on a user prompt. Which workload type best fits this solution?
Domain 2 of AI-900 tests whether you can recognize core machine learning (ML) ideas and connect them to Azure ML concepts without getting lost in data science math. Expect questions that use everyday language (“predict,” “classify,” “cluster,” “recommend,” “learn from feedback”) and ask you to identify the ML approach, the lifecycle step, or the Azure capability that fits.
This chapter aligns to the exam outcomes: you’ll distinguish learning types (supervised/unsupervised/reinforcement), walk through the ML lifecycle and data concepts, interpret evaluation metrics, and connect the ideas to Azure Machine Learning (Azure ML) and AutoML fundamentals. The exam typically rewards clear definitions and correct mapping of scenario → learning type → metric → Azure tooling.
As you read, keep a “test lens”: AI-900 isn’t trying to make you build a perfect model; it’s testing whether you can choose the right ML framing, understand why metrics can be misleading, and identify common pitfalls like data leakage, imbalanced classes, and overfitting.
Practice note for Explain supervised, unsupervised, and reinforcement learning: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Walk through the ML lifecycle and data concepts: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Interpret evaluation metrics and avoid common pitfalls: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Practice AI-900 ML questions (Azure ML + AutoML concepts): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
At AI-900 level, the exam expects you to speak the language of ML: features are the input variables (columns) used to make a prediction; a label is the known correct output you want the model to learn (the target column). A model is the learned function that maps features to predictions. Training is the process of fitting that model using historical data; inference is using the trained model to make predictions on new data.
A common scenario question gives you a dataset description and asks what is a feature vs. label. Example: predicting house price using size, number of bedrooms, and zip code. Those inputs are features; price is the label. If the prompt says “predict whether a transaction is fraudulent,” then “fraudulent (yes/no)” is the label and the transaction attributes are features.
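A minimal scikit-learn sketch makes the vocabulary concrete (toy numbers, illustrative only): the matrix X holds the features, y holds the label, calling fit() is training, and calling predict() on an unseen row is inference:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Features: size (sq ft) and bedrooms. Label: price (the known outcome).
X = np.array([[1400, 3], [1600, 3], [1700, 4], [2000, 4]])
y = np.array([240_000, 270_000, 300_000, 340_000])

model = LinearRegression().fit(X, y)   # training: learn from historical data
new_house = np.array([[1800, 4]])
print(model.predict(new_house))        # inference: score unseen data
```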
Exam Tip: Watch for “known outcomes” language. If historical data includes the correct answer (e.g., “was this email spam?”), you have labels and you’re likely in supervised learning. If the question says “group similar customers without predefined categories,” there are no labels.
Also know the difference between training and inference environments. Training is usually compute-heavy and may run in batch; inference is often optimized for low latency (real-time) or cost-efficient batch scoring. AI-900 can test this with phrasing like “deploy model to score new requests” (inference) vs. “fit the model using past data” (training).
Common trap: confusing “algorithm” with “model.” The algorithm is the training method (e.g., logistic regression); the model is the trained result. The exam may use either term loosely, so focus on the intent: “trained” implies model; “method used to train” implies algorithm.
AI-900 frequently asks you to pick the learning type based on a scenario. Use the fastest discriminator: are labels present, and what is the goal?
Supervised learning uses labeled data. It includes classification (predict a category like yes/no, A/B/C) and regression (predict a numeric value). Typical workloads: fraud detection (classification), demand forecasting (regression), churn prediction (classification). The exam expects you to recognize the label type: categories → classification; numbers → regression.
Unsupervised learning uses unlabeled data to find structure. Typical tasks: clustering (grouping similar items), anomaly detection (sometimes framed as unsupervised), and dimensionality reduction (less common on AI-900). Scenario cues: “segment customers into groups” or “discover patterns without predefined classes.”
Reinforcement learning is learning by trial-and-error using rewards/penalties. It is not as common on AI-900 as supervised/unsupervised, but it appears as a conceptual question. Scenario cues: “agent,” “environment,” “actions,” “reward,” “maximize long-term return.” Examples: robotics navigation, game playing, dynamic pricing in a simulated environment.
Exam Tip: If the prompt includes “feedback loop” but also provides historical labeled examples, do not automatically choose reinforcement learning. Many real systems have feedback, but the learning type depends on how training data is structured (labeled examples vs. reward signals).
Common trap: mixing up “clustering” and “classification.” If you already know the categories and have labeled examples, it’s classification. If you’re discovering the groups, it’s clustering. Another trap: treating “recommendations” as always unsupervised; recommendations can be supervised or unsupervised depending on the approach, but AI-900 typically frames them as pattern discovery/collaborative filtering, leaning unsupervised.
The ML lifecycle on the exam is usually simplified: collect data → prepare data → train → evaluate → deploy → monitor. The evaluation step is where data splits matter. A standard split is training (fit the model), validation (tune settings), and test (final unbiased check). You may also see “train/test split” in simpler questions.
Overfitting means the model learns noise or memorizes training examples and performs poorly on new data. Symptoms: very high training performance but significantly worse validation/test performance. Underfitting means the model is too simple or not trained enough to capture the pattern; it performs poorly on both training and test.
AI-900 doesn’t require equations, but you should have the intuition: overfitting relates to high variance; underfitting relates to high bias. If a scenario says “the model performs well in development but fails in production,” suspect overfitting or data drift. If it says “the model is inaccurate even on training data,” suspect underfitting or poor features.
Exam Tip: Be alert for data leakage, a frequent hidden pitfall. Leakage happens when training includes information that would not be available at inference time (e.g., using a “refund issued” field to predict fraud). Leakage can create unrealistically high validation scores and is a classic exam trap.
Another tested concept is class imbalance. If 99% of transactions are legitimate, a model that always predicts “legitimate” can achieve 99% accuracy while being useless. This sets up metric questions in the next section. Finally, remember that the test set should represent future data; if the prompt hints that the split wasn’t random or that the data is time-based, a random split may be inappropriate and could inflate results.
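The imbalance trap is easy to demonstrate: a “model” that always predicts the majority class scores 99% accuracy while catching zero fraud. A minimal sketch with synthetic labels:

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# 1,000 transactions: 1% fraud (label 1), 99% legitimate (label 0).
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1

y_pred = np.zeros(1000, dtype=int)     # naive rule: always "legitimate"

print("accuracy:", accuracy_score(y_true, y_pred))  # 0.99 - looks great
print("recall:", recall_score(y_true, y_pred))      # 0.0  - catches no fraud
```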
AI-900 expects you to choose metrics that match the problem type and understand when a metric is misleading. For classification, the basic metric is accuracy (percent correct), but accuracy fails with imbalanced data. When false positives and false negatives have different costs, focus on precision (of the items the model flagged as positive, how many truly are) and recall (of the actual positives, how many the model caught); the F1 score balances the two.
AUC (Area Under the ROC Curve) measures how well the model separates classes across all thresholds. It’s helpful when you will choose a threshold later or want threshold-independent comparison.
For regression (numeric prediction), common metrics include RMSE (root mean squared error) and R². RMSE is in the same units as the label (e.g., dollars), so it’s often easier to interpret; it also penalizes large errors more strongly. R² is a relative measure of variance explained (higher is better), but it can be misleading if used without context (e.g., narrow target range).
Exam Tip: If the question describes “predicting a number,” eliminate accuracy/precision/recall/F1/AUC and look for RMSE or R². If it describes “yes/no” or categories, eliminate RMSE/R² and choose classification metrics.
Common traps: (1) picking accuracy for fraud/rare-event detection without considering imbalance; (2) confusing precision vs. recall—use the “cost framing” to decide; (3) assuming AUC is only for multi-class (AI-900 typically frames AUC with binary classification). Also, remember that improving one metric can worsen another depending on threshold; that’s why AUC and F1 are often used for balanced comparisons.
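To connect the two metric families, here is a short sketch computing both on toy values. RMSE is just the square root of the mean squared error, which is why it stays in the label’s units:

```python
import numpy as np
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             mean_squared_error, r2_score)

# Classification: categories -> precision / recall / F1.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]
print("precision:", precision_score(y_true, y_pred))  # correct among flagged
print("recall:", recall_score(y_true, y_pred))        # found among actual
print("F1:", f1_score(y_true, y_pred))                # balance of the two

# Regression: numbers -> RMSE (label units) and R^2 (variance explained).
y_true_r = np.array([250_000, 310_000, 280_000])
y_pred_r = np.array([260_000, 300_000, 275_000])
rmse = np.sqrt(mean_squared_error(y_true_r, y_pred_r))
print("RMSE:", rmse, "R^2:", r2_score(y_true_r, y_pred_r))
```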
Domain 2 also checks whether you can recognize core Azure Machine Learning components and what they’re for. Think in terms of “what does this object manage?” and “where does work run?”
An Azure ML workspace is the top-level resource that organizes assets: experiments, models, endpoints, data connections, and compute. If the question asks where you manage models, runs, and deployments, the workspace is the container.
Compute is where training or inference runs. You might see compute instances (often used for interactive development) and compute clusters (scale-out training). AI-900 won’t test deep sizing details, but it may test the idea that training requires compute and can scale.
Datasets / data assets represent registered data references used for training and evaluation. The exam goal is to ensure you can identify that Azure ML helps track data and model lineage, which supports reproducibility.
Pipelines are repeatable workflows that orchestrate steps such as data prep, training, and evaluation. If a scenario says “automate and rerun the training process consistently,” pipeline is the best match.
AutoML (Automated Machine Learning) is a capability that helps select algorithms and hyperparameters automatically for a given task (classification/regression/time series forecasting). AI-900 typically tests what AutoML is good for: speeding up baseline model creation and reducing manual experimentation, not replacing the need for good data or clear objectives.
Exam Tip: If the scenario emphasizes “quickly find the best model” or “automatically try multiple algorithms,” choose AutoML. If it emphasizes “repeatable steps” and “operationalizing the workflow,” choose pipelines. If it emphasizes “central place to manage artifacts,” choose workspace.
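Conceptually, AutoML automates a search you could write by hand. The sketch below is not the Azure AutoML API; it is a hand-rolled scikit-learn illustration of “fit several candidate algorithms, score them consistently, keep the best,” which is the loop AutoML (plus featurization, run tracking, and much more) performs for you:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, random_state=0)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
}

# Fit each candidate, score it the same way, and surface the best run.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "-> best:", best)
```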
Common trap: confusing Azure ML with Azure AI services (prebuilt). Azure ML is for building/training your own models; Azure AI services are for using pretrained APIs. AI-900 includes both across the full exam, so always read whether the prompt implies custom training or prebuilt inference.
This section consolidates how AI-900 frames Domain 2 questions—without drills or rote memorization. When you face an exam item, apply a consistent decision tree: (1) Identify the prediction goal (category vs. number vs. grouping vs. reward-driven). (2) Determine whether labels exist. (3) Choose the metric aligned to the goal and data characteristics. (4) Map to Azure ML components if asked.
For learning type identification, highlight verbs: “predict/classify/forecast” usually indicates supervised; “group/segment/discover patterns” indicates unsupervised; “agent learns by taking actions with rewards” indicates reinforcement. If the prompt mentions “historical outcomes,” that is a label signal.
For lifecycle questions, look for where you are in the flow: “clean/transform” implies preparation; “fit/train” implies training; “compare on held-out data” implies evaluation; “publish endpoint/score requests” implies deployment/inference; “track performance drift” implies monitoring. Be especially careful with leakage: if a feature is only known after the event, it does not belong in training for real inference.
For metrics, treat them as tools: accuracy is fine for balanced classification; precision/recall/F1 for imbalanced or asymmetric cost; AUC for threshold-independent comparison; RMSE/R² for regression. If the scenario is safety- or compliance-critical, expect recall to matter (catch all positives) unless the prompt explicitly says false alarms are the bigger problem.
Finally, for Azure ML/AutoML, anchor on intent. Workspace organizes, compute runs jobs, datasets/data assets track data inputs, pipelines automate repeatable workflows, and AutoML explores algorithms/hyperparameters. Exam Tip: When two answers both sound plausible, pick the one that directly addresses the action in the scenario (organize vs. run vs. automate vs. optimize). Misreading the scenario verb is the most common Domain 2 mistake.
1. A retail company wants to predict the number of units it will sell next week for each product based on historical sales, promotions, and seasonality. Which type of machine learning should you use?
2. You have customer data (age, region, spend, browsing behavior) but no existing categories. You want to group customers into segments for targeted marketing. What approach best fits the requirement?
3. A team reports 99% accuracy for a model that detects fraudulent transactions. Only 1% of transactions are actually fraud. Which evaluation metric is most appropriate to review to avoid being misled by class imbalance?
4. You are preparing data for a churn prediction model. You accidentally include a column named "ChurnDate" that is only populated after the customer has already churned. The model performs extremely well in testing but poorly in production. What is the most likely cause?
5. A data scientist wants Azure to automatically try multiple algorithms and feature engineering approaches to find the best model for a labeled dataset, while tracking runs and comparing metrics. Which Azure capability best fits this requirement?
Domain 3 of AI-900 expects you to recognize common computer vision tasks, understand what their outputs look like, and select the Azure service that best fits a scenario. The exam is less about coding details and more about mapping a business need (for example, “extract text from receipts” or “detect defects on a production line”) to the right Azure AI capability. In this chapter you’ll practice differentiating key vision tasks and outputs, choosing the right Azure vision service, and understanding the fundamentals of OCR, image analysis, and custom vision. You’ll also learn the patterns the exam uses to hide the correct answer behind near-miss distractors.
A strong strategy for this domain is to read the scenario, underline the “output” the user wants (labels, bounding boxes, polygons, text, identity match), then match that output to the task (classification, detection, segmentation, OCR). Only after you identify the task should you pick the service. Exam Tip: Many wrong answers are “good AI services” that solve a different output type; if you anchor on the output first, you avoid those traps.
Finally, remember AI-900 tests fundamentals: service purpose, typical use cases, and high-level workflow (input → processing → output). It rarely tests specific SDK calls, but it frequently tests the difference between prebuilt capabilities (quick to use) and custom models (when you must train).
Practice note for Differentiate key vision tasks and outputs: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Choose the right Azure vision service for a scenario: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Understand OCR, image analysis, and custom vision fundamentals: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Practice computer vision questions with service selection drills: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Computer vision scenarios can look similar on the surface (“analyze an image”), so AI-900 emphasizes clear task boundaries. Start by mapping the required output to the workload type. Classification answers “What is in this image?” with one or more labels (for example, “cat,” “car,” “damaged”). The output is typically a label plus confidence, and it does not tell you where the object is located. Object detection answers “Where are the objects and what are they?” with bounding boxes and labels; the output includes coordinates (x/y/width/height) plus confidence.
Segmentation goes beyond detection by identifying pixels belonging to an object. In practice, segmentation outputs a mask or polygon/outline rather than a rectangle. On exams, segmentation is the right choice when the scenario mentions “precise shape,” “area,” “count pixels,” or “separate foreground from background.” OCR (optical character recognition) is different: it extracts text from images, typically returning recognized strings, layout information, and sometimes bounding boxes for words/lines.
Exam Tip: Watch for the “where” keyword. If the scenario wants location (boxes or masks), classification alone is wrong. Another common trap is confusing OCR with “image description.” If the scenario wants to read printed or handwritten text, it’s OCR; if it wants a caption like “a person riding a bicycle,” that’s image analysis/captioning, not OCR.
Service-selection questions also hinge on the fact that Azure offers both prebuilt and custom options. If the scenario says “new product types,” “your own classes,” or “domain-specific defects,” expect a custom vision approach. If it says “generic objects,” “common tags,” or “extract text,” expect a prebuilt vision capability.
Azure AI Vision (often referenced on the exam as the service for image analysis) provides prebuilt capabilities to extract meaning from images without training your own model. AI-900 expects you to recognize common outputs: tags (keywords), captions (natural-language description), object listings, and sometimes scene-level insights (for example, “outdoor,” “person,” “vehicle”). The exam typically frames this as “analyze an image and return a description” or “generate tags to support search.”
Think of Azure AI Vision image analysis as the right answer when you need quick, general-purpose understanding of images. Common use cases include: auto-tagging a photo library, generating alt-text/captions for accessibility, content indexing for search, and basic object identification in retail or media workflows.
Exam Tip: Prebuilt image analysis is best when the scenario does not require the organization to define custom classes. If the prompt mentions “our internal product SKUs,” “custom categories,” or “defect types unique to our factory,” then a custom model is more likely. Another trap is mixing “captioning” with “text extraction.” Captions describe what’s happening; OCR reads characters.
When you see “detect objects” in a scenario, confirm whether the scenario needs bounding boxes or just labels/tags. Many prebuilt analyzers can return objects, but if the scenario emphasizes precise location for downstream automation (for example, robotic picking), that requirement pushes you toward detection/segmentation-capable solutions rather than simple tags. For AI-900, you are usually expected to pick the service family correctly rather than a niche feature flag.
Also, be aware of responsible AI cues: if the scenario includes sensitive domains (people, identity, surveillance), the test may steer you toward high-level principles and safe scenario mapping rather than detailed biometric claims.
OCR questions are common because the “output type” is unambiguous: the customer wants text. On AI-900, OCR scenarios include receipts, invoices, IDs, screenshots, scanned PDFs, and photos of signage. The fundamental pipeline is: input image/document → detect text regions → recognize characters → return text plus layout metadata (lines/words and often bounding boxes). If the prompt mentions “handwritten notes,” OCR is still the core workload, though accuracy may vary by implementation and quality.
Document-style scenarios may go beyond raw text into structure (key-value pairs, tables, form fields). On the exam, watch how the requirement is phrased. If it says “extract the invoice number, total, and vendor,” that implies structured document extraction rather than simply reading all text. If it says “convert this scanned page to editable text,” that’s classic OCR output.
Exam Tip: Don’t over-select custom vision for document extraction just because the organization has “their own forms.” The exam often expects that you choose a document/OCR capability first when the primary need is text and layout. Custom vision is more appropriate when the task is visual classification/detection of objects, not when the core requirement is reading characters.
Common traps include selecting image captioning (because it “describes the image”) or selecting translation services (because the business wants the text in another language). Translation is a separate NLP workload; the correct chain is OCR first (extract text), then translation second (convert language). AI-900 questions may test whether you recognize that sequencing even when you are only asked to pick the vision service.
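The OCR-then-translate sequencing is worth internalizing. Below is a minimal sketch under stated assumptions: both functions are mock stand-ins (not real service calls), and the sample sign text is invented. The structure shows why the vision step must come first.

```python
# A sketch of the two-step chain: OCR first (vision workload),
# translation second (language workload). Mock stand-ins only.
def extract_text(image_name: str) -> str:
    # Vision step: pretend OCR has read the sign in the image.
    return "Sortie de secours"          # sample recognized text (French)

def translate(text: str, to_lang: str) -> str:
    # Language step: pretend translation of the extracted text.
    samples = {"Sortie de secours": "Emergency exit"}
    return samples.get(text, text)

sign_text = extract_text("street_sign.jpg")   # step 1: OCR extracts text
english = translate(sign_text, to_lang="en")  # step 2: translation converts it
print(english)  # -> "Emergency exit"
```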
When reviewing a scenario, look for nouns like “receipt,” “contract,” “form,” “statement,” and verbs like “extract,” “read,” “digitize,” “index,” and “search within documents.” These are strong OCR/document cues.
AI-900 treats face-related and spatial concepts at a high level, focusing on what the workload does and how to map it to safe, compliant scenarios. Face detection typically means finding human faces in an image and returning their location (bounding boxes) and possibly basic attributes depending on policy and capability. Face verification/identification concepts can appear as “match this person to a known profile” or “confirm the same person appears in two images.”
For spatial concepts, the exam may describe understanding relationships in a scene: where objects are relative to each other, counting people, or tracking movement across frames. At the fundamentals level, you should interpret these as extensions of detection/tracking rather than as custom ML from scratch. If the scenario emphasizes location and counting (for example, occupancy estimation), your mental model should be detection outputs (boxes) aggregated into counts.
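To illustrate “detection outputs aggregated into counts,” here is a small sketch. The detections list is made-up sample data in a simplified format, not a real service response; the aggregation logic is the idea being shown.

```python
# Occupancy estimation as aggregation over detection outputs.
# Sample data only; the dict format is a hypothetical simplification.
detections = [
    {"label": "person", "confidence": 0.92, "box": (10, 20, 50, 120)},
    {"label": "person", "confidence": 0.41, "box": (80, 25, 48, 115)},
    {"label": "chair",  "confidence": 0.88, "box": (150, 90, 60, 60)},
]

CONFIDENCE_THRESHOLD = 0.5  # discard low-confidence detections

occupancy = sum(
    1 for d in detections
    if d["label"] == "person" and d["confidence"] >= CONFIDENCE_THRESHOLD
)
print(f"Estimated occupancy: {occupancy}")  # -> 1 (low-confidence hit dropped)
```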
Exam Tip: When a scenario includes people, identity, or surveillance-like requirements, the test may evaluate whether you choose an appropriate capability and avoid overstating what the service does. Read carefully: “detect faces to blur them for privacy” is different from “identify individuals for security.” The former is generally detection (location for redaction). The latter implies identity matching and has stronger governance expectations.
Another trap is confusing “face detection” with “object detection.” Face detection is a specialized detection task focused on faces, while object detection covers general classes (cars, bottles, etc.). If the scenario explicitly mentions faces (selfies, ID photos, group photos), choose a face-capable option rather than generic object detection.
In exam-safe scenario mapping, align the user’s intent with the minimal required capability: if they only need to anonymize images, detection and blurring is sufficient; if they need to authenticate a user, verification is implied; if they need to group similar faces in a photo set, similarity/verification concepts apply. The exam typically rewards conservative mapping to requirements, not feature shopping.
Custom vision is the right mental bucket when the organization needs to recognize things that prebuilt models don’t cover well: proprietary products, manufacturing defects, unique logos, or domain-specific categories. AI-900 focuses on the basic lifecycle: collect images, label them, train a model, evaluate it, then deploy for predictions. The exam may describe this as “build a model to detect scratches on parts” or “classify images into our internal categories.”
Labeling is central. For classification, labels usually apply to the whole image (for example, “acceptable” vs. “defective”). For object detection, labels are attached to regions (bounding boxes) so the model learns location as well as class. Testing/evaluation typically means validating the model on images it has not seen before and using metrics (precision/recall concepts may appear at a high level). Deployment means making the trained model available as an endpoint so applications can send new images and get predictions.
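Here is a conceptual sketch of that lifecycle (label, train, evaluate, deploy). Every function is a hypothetical stand-in, not an Azure SDK call; the shape of the loop is what matters for the exam.

```python
# Conceptual custom-vision lifecycle: label -> train -> evaluate.
# All functions are mock stand-ins for illustration only.
def label_images(images):
    # Attach a class label to each whole image (classification); for
    # object detection, labels would attach to bounding-box regions.
    return [(img, "defective" if "scratch" in img else "acceptable")
            for img in images]

def train(labeled):
    # Stand-in for model training; returns a trivial "model".
    return {"classes": sorted({lbl for _, lbl in labeled})}

def evaluate(model, held_out):
    # Validate on images the model has not seen; real workflows report
    # metrics such as precision and recall here.
    return {"num_test_images": len(held_out), "classes": model["classes"]}

images = ["part_001.jpg", "part_002_scratch.jpg", "part_003.jpg"]
labeled = label_images(images)
model = train(labeled[:2])             # train on most of the data...
report = evaluate(model, labeled[2:])  # ...hold some back for testing
print(report)
```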
Exam Tip: If the scenario says “we have only a small dataset” and implies frequent updates, the exam may be probing whether you understand the ongoing iteration loop: label → train → test → improve. A common trap is choosing prebuilt image analysis when the business needs custom categories, or choosing custom vision when the business just needs generic tagging.
Another selection trap: “detect objects in our warehouse images” could be solved by prebuilt detection if objects are common (boxes, forklifts), but if it says “our custom package types” or “our branded items,” that is a strong signal for custom. Also note the output requirement: if the app needs to draw rectangles around defects, that’s custom object detection, not classification.
On AI-900, you are not expected to memorize portal clicks, but you should understand the conceptual workflow and why labeling quality matters (inconsistent labels produce inconsistent predictions).
This section is a service-selection drill without explicit questions: you practice the reasoning pattern the exam expects. Step 1: identify the desired output (text, labels, boxes, masks, identity match). Step 2: map to the workload type (OCR, classification, detection, segmentation, face). Step 3: choose prebuilt versus custom based on whether the categories are generic or organization-specific.
When the scenario mentions “extract text from receipts,” the output is text and layout, so OCR/document reading is the core requirement. When it mentions “auto-generate tags and captions for a photo website,” the output is generic tags/captions, so prebuilt image analysis fits. When it mentions “detect and draw boxes around defects on a manufactured part,” the output is bounding boxes around domain-specific items, so custom vision object detection is the match. When it mentions “separate the background from the foreground for precise editing,” the output implies pixel-level masks, which is segmentation.
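The three-step drill can be captured as a lookup, as in the sketch below. The keyword table and function are illustrative study aids under my own assumptions, not an official exam rubric.

```python
# The drill: required output -> workload type -> prebuilt vs custom.
# Mapping is an illustrative study aid, not an official rubric.
OUTPUT_TO_WORKLOAD = {
    "text":           "OCR / document reading",
    "labels":         "image classification",
    "boxes":          "object detection",
    "masks":          "segmentation",
    "identity match": "face verification/identification",
}

def pick_workload(required_output: str, custom_categories: bool) -> str:
    workload = OUTPUT_TO_WORKLOAD.get(required_output, "unknown")
    flavor = ("custom (train your own model)" if custom_categories
              else "prebuilt (no training needed)")
    return f"{workload} -> {flavor}"

# "Detect and draw boxes around defects on a manufactured part"
print(pick_workload("boxes", custom_categories=True))
# "Auto-generate tags for a photo website"
print(pick_workload("labels", custom_categories=False))
```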
Exam Tip: The fastest elimination technique is to reject any option that produces the wrong output shape. If you need bounding boxes, eliminate pure classification. If you need text, eliminate captioning/tagging. If you need custom classes, eliminate generic-only solutions. This is especially useful when two answers sound plausible because they are both “vision services.”
Also practice spotting “two-step” problems. If the business wants to translate text from a sign, the vision step is OCR and the language step is translation. If they want to index videos by what is said and what appears on screen, that combines speech/NLP with vision—AI-900 often tests whether you can identify the vision component correctly even when the overall solution is multi-modal.
Finally, watch for distractors that name powerful tools but don’t match the scenario. A general ML service may be offered as an option, but AI-900 typically prefers a dedicated AI service when one exists. Choose the simplest service that meets requirements, and don’t assume you must train a model unless the scenario explicitly demands custom categories or specialized recognition.
1. A retailer wants to automatically extract the merchant name, date, and total amount from photos of receipts uploaded by customers. Which Azure AI capability should you use?
2. You are designing a solution for a factory. The business needs to locate and draw bounding boxes around damaged parts on a conveyor belt in real time. Which computer vision task and Azure service best fit the requirement?
3. A media site wants to automatically generate a short caption like "a person riding a bike on a street" and a list of tags for user-uploaded images. The site does not want to train a custom model. Which service should you choose?
4. A company wants to identify whether an uploaded photo contains a hard hat. They only need a yes/no result per image (no location coordinates). The environment and helmet styles vary by site, and the model must be tuned over time. Which approach should you recommend?
5. You are reviewing requirements for an AI-900 study project. The stakeholder says, "We need the service to return the recognized text plus where it appears in the image so we can highlight it to users." Which output format best matches this requirement, and what capability provides it?
This chapter maps directly to the AI-900 objectives for Natural Language Processing (Domain 4) and Generative AI (Domain 5). The exam is less about implementing code and more about recognizing which workload you have and selecting the correct Azure AI service (or capability) to solve it. You’ll repeatedly be tested on scenario phrasing: “extract,” “classify,” “summarize,” “converse,” “transcribe,” “generate,” “ground,” and “filter.” Each word points to a specific service family and feature set.
As you read, practice turning business needs into technical tasks. For example, “monitor customer satisfaction in reviews” becomes sentiment analysis; “find product names and locations” becomes entity recognition; “create a support assistant that answers from internal documents” becomes retrieval-augmented generation (grounded GenAI). When you can do that translation quickly, most AI-900 questions become straightforward.
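The translation habit is mechanical enough to write down. This tiny sketch just encodes the three examples above as a lookup; the mapping is illustrative, not exhaustive.

```python
# Business need -> technical task, using the examples from this section.
NEED_TO_TASK = {
    "monitor customer satisfaction in reviews": "sentiment analysis",
    "find product names and locations": "entity recognition",
    "create a support assistant that answers from internal documents":
        "retrieval-augmented generation (grounded GenAI)",
}

for need, task in NEED_TO_TASK.items():
    print(f"{need!r} -> {task}")
```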
Exam Tip: When two answers look plausible, look for the one that is a managed Azure AI service matching the workload (Language vs Speech vs OpenAI). The exam rarely expects you to assemble a complex architecture unless it’s clearly described (e.g., “ground the model on company data”).
Practice note for this chapter’s objectives (identify NLP tasks and map them to Azure services; explain conversational AI and speech scenarios; describe generative AI concepts, use cases, and safety basics; practice combined NLP + GenAI scenario questions): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
NLP workloads turn text into structured information or decisions. AI-900 expects you to recognize core task types and the outputs they produce. Entity recognition identifies and labels spans of text (people, organizations, locations, dates, products). If the scenario says “extract names, addresses, invoice numbers,” the intent is entity extraction, not translation or summarization.
Sentiment analysis classifies opinion polarity (positive/negative/neutral) and often includes confidence scores; some solutions also identify opinions about aspects (e.g., “battery life is great, screen is bad”). Key phrase extraction pulls the most important terms to help with tagging or indexing. These are lightweight enrichment tasks commonly used before search or reporting.
Summarization compresses longer text into a shorter form (extractive vs abstractive). On the exam, summarization is distinct from key phrases: key phrases are tokens/phrases; summarization is coherent sentences. Text classification assigns labels (spam vs not spam, topic categories, priority routing). The critical exam nuance: classification can be prebuilt (e.g., language detection) or custom (your labels). If the question mentions “your own categories,” expect custom classification.
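As with vision, the output shapes separate these tasks. The sketch below uses simplified stand-in values, not actual Azure AI Language response schemas, to show how each task’s output differs for the same input sentence.

```python
# Illustrative output shapes for common text-analysis tasks.
# Simplified stand-ins only, not Azure AI Language schemas.
sample = "The battery life is great, but the screen is disappointing."

entity_output = [                                   # spans + labels
    {"text": "battery life", "category": "ProductFeature"},
    {"text": "screen", "category": "ProductFeature"},
]
sentiment_output = {"label": "mixed", "confidence": 0.78}  # polarity + score
key_phrases_output = ["battery life", "screen"]            # terms, not sentences
summary_output = "Positive on battery life; negative on screen."  # coherent prose
classification_output = {"label": "product-review", "confidence": 0.93}

for name, value in [
    ("entities", entity_output),
    ("sentiment", sentiment_output),
    ("key phrases", key_phrases_output),
    ("summary", summary_output),
    ("classification", classification_output),
]:
    print(f"{name}: {value}")
```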
Common trap: Confusing “summarize meeting transcript” (summarization) with “generate meeting notes” (GenAI). If the scenario implies creative rewriting, action items, or drafting new text beyond compression, it’s drifting into generative AI.
Exam Tip: Watch for “identify the language” and “translate.” Language detection is an NLP analysis task; translation is typically a dedicated translation feature/service. Don’t force everything into “NLP” just because it’s text.
For AI-900, the centerpiece for NLP analysis is Azure AI Language. Your job is to map requirements to the right capability. If the scenario asks to extract entities, key phrases, sentiment, or summarize documents, Azure AI Language is the default choice. If the scenario calls out “custom labels,” “train with your own examples,” or “domain-specific extraction,” then you’re looking at custom text classification or custom named entity recognition under Azure AI Language.
Another frequent scenario is information retrieval over documents. If the goal is “search and retrieve relevant passages,” that leans toward Azure AI Search (indexing and retrieval). If the goal is “answer questions conversationally from documents,” that’s usually Azure OpenAI plus grounding (often using Azure AI Search as the retrieval layer). The exam won’t require deep implementation detail, but it does test whether you can differentiate retrieval vs generation.
Common trap: Selecting Azure OpenAI for every text problem. If the question is strictly “detect sentiment” or “extract key phrases,” choose Azure AI Language. Using a large language model is overkill and not the “fundamentals” best answer the exam expects.
Exam Tip: Look for verbs: “extract,” “detect,” “classify,” “summarize” → Language. “Find documents” → Search. “Draft, generate, rewrite, answer in natural language” → OpenAI (often grounded).
Conversational AI questions focus on how systems interpret user input and manage dialog. A bot is the application that interacts with users across channels (web, Teams, etc.). The exam commonly uses the terms utterance (what the user says) and intent (what the user wants). For example, “I need to reset my password” is an utterance; the intent might be ResetPassword. Entities can also appear here (e.g., username, device type) to fill slots needed to complete a task.
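Here is a minimal sketch of the utterance → intent → entities flow. The intent name follows the ResetPassword example above; the extraction rules are deliberately fake (a real system would use a trained NLU model).

```python
# Utterance -> intent -> entities (slot filling), as a toy sketch.
# Rules are made up; a real system uses a trained language model.
def interpret(utterance: str) -> dict:
    intent = "Unknown"
    entities = {}
    if "reset my password" in utterance.lower():
        intent = "ResetPassword"
        # Slot filling: extract the entities needed to complete the task.
        if "laptop" in utterance.lower():
            entities["device_type"] = "laptop"
    return {"utterance": utterance, "intent": intent, "entities": entities}

print(interpret("I need to reset my password on my laptop"))
# -> intent 'ResetPassword' with entity {'device_type': 'laptop'}
```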
AI-900 also expects basic awareness of orchestration: deciding what component handles the user request (FAQ lookup, backend transaction, or a generative answer). In real solutions, you might route between scripted dialog, knowledge base responses, and an LLM. On the exam, orchestration is often implied by phrases like “handoff to an agent,” “trigger a workflow,” or “use multiple skills.”
Common trap: Treating a bot as “just chat.” If the scenario emphasizes completing actions (booking, resetting, checking status), you should think intent/entity extraction and dialog flow, not freeform generation.
Exam Tip: If the question says “users ask questions in natural language and the system replies from company policies,” that may be a Q&A/knowledge scenario (often retrieval + response). If it says “users request actions,” that is intent-based conversational design.
Speech workloads are distinct from text NLP because the input/output is audio. The AI-900 exam expects you to match scenarios to the correct speech capability. Speech-to-text (transcription) converts spoken audio into text—think call center transcripts, meeting captions, or voice notes. Text-to-speech synthesizes spoken audio from text—think voice assistants, IVR systems, or reading accessibility features.
Speech translation converts spoken audio in one language to text (or speech) in another. The exam often uses phrasing like “real-time translation during a call” (speech translation) versus “detect language of a document” (NLP). Don’t mix these up: language detection is text analysis; translation of spoken audio is a speech workload.
Common trap: Choosing a text translation capability when the scenario clearly involves microphone input, captions, or audio streams. The key is the modality: if it’s audio, start with Speech.
Exam Tip: Many scenarios are multi-step: speech-to-text first, then NLP (sentiment, summarization), then possibly GenAI (drafting a follow-up email). If the question asks for “the first step” or “which service transcribes,” pick the speech option even if later steps use Language/OpenAI.
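The multi-step ordering is easiest to remember as a pipeline. In this sketch, every function is a mock stand-in for the corresponding service family (Speech, then Language, then generation); the transcript and sentiment rule are invented sample data.

```python
# Multi-step ordering: speech-to-text -> text analysis -> generation.
# Each step is a mock stand-in, not a real service call.
def speech_to_text(audio_file: str) -> str:
    return "the product arrived late and the box was damaged"  # mock transcript

def analyze_sentiment(text: str) -> str:
    negatives = ("late", "damaged", "broken")
    return "negative" if any(w in text for w in negatives) else "positive"

def draft_reply(text: str, sentiment: str) -> str:
    # Generation step: in practice this would call a large language model.
    return f"[draft reply to a {sentiment} message about: {text[:30]}...]"

transcript = speech_to_text("call_recording.wav")  # Speech family first
sentiment = analyze_sentiment(transcript)          # Language family second
print(draft_reply(transcript, sentiment))          # Generative AI last
```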
Generative AI workloads create new content: text, summaries with rephrasing, code, or structured outputs (JSON) generated from a prompt. On Azure, the key service family for these workloads is Azure OpenAI (models such as GPT-style LLMs). AI-900 focuses on concepts: what an LLM is, how prompts guide outputs, and why grounding is critical for enterprise scenarios.
Prompting includes providing instructions, context, and examples. The exam may reference “system instructions” vs “user prompt,” or ask how to improve reliability—clear constraints, desired format, and relevant context generally win. Grounding (often via retrieval-augmented generation) reduces hallucinations by supplying authoritative data (documents, product catalog, policy pages) at answer time. This is where Azure AI Search commonly complements Azure OpenAI.
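The grounding pattern is worth seeing end to end. Below is a minimal sketch assuming a toy in-memory document store and naive keyword retrieval; real solutions would use a search index as the retrieval layer, and the prompt wording is my own illustration.

```python
# Retrieval-augmented generation (grounding), reduced to its skeleton:
# retrieve relevant passages first, then constrain the prompt to them.
DOCUMENTS = {
    "returns-policy": "Items may be returned within 30 days with a receipt.",
    "shipping-policy": "Standard shipping takes 3-5 business days.",
}

def retrieve(question: str) -> list:
    # Stand-in for a retrieval layer (e.g., a search index): naive keyword match.
    return [text for text in DOCUMENTS.values()
            if any(word in text.lower() for word in question.lower().split())]

def build_prompt(question: str) -> str:
    context = "\n".join(retrieve(question))
    return (
        "Answer ONLY from the context below. If the answer is not in the "
        "context, say you don't know.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

print(build_prompt("How many days do customers have to return items?"))
```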
Copilots are assistant-style experiences embedded in apps (support, HR, developer tools). The exam tests whether you can identify when a “copilot” is appropriate: repetitive knowledge work, drafting, summarizing, Q&A over internal data, and natural-language interfaces to systems.
Responsible AI is explicitly in scope. You should know the basic risk categories and mitigations: content safety filtering, grounding, access control to data, logging/monitoring, and human review for high-impact decisions. Azure provides safety tooling (e.g., content filtering) and guidance, but the exam mainly tests that you understand why guardrails matter (toxicity, bias, privacy leakage, hallucinations).
Common trap: Assuming an LLM “knows” your company’s private data. Unless data is provided in the prompt or retrieved via grounding, the model won’t have it. If the scenario says “must answer using our internal manuals,” the best design includes grounding, not just a generic chat prompt.
Exam Tip: If a question emphasizes “reduce hallucinations” or “ensure answers come from approved sources,” grounding is the keyword. If it emphasizes “prevent harmful outputs,” think content safety filtering and responsible AI controls.
On AI-900, combined scenarios are common: audio becomes text, text is analyzed, then a generative system produces a customer-ready response. Your exam strategy is to identify the primary workload the question is asking about (transcription vs analysis vs generation) and select the service aligned to that step.
When you see customer reviews, tickets, emails, or chat logs, first decide: is the output a label/score (NLP analysis) or newly written content (GenAI)? Routing and dashboards (sentiment trends, key topics, extracted entities) are classic Azure AI Language outcomes. Drafting replies, rewriting tone, producing a summary with action items, or creating a knowledge assistant are classic Azure OpenAI outcomes—often improved by grounding on enterprise content.
Common trap: Over-selecting the most advanced tool. AI-900 rewards correct fundamentals: use Language for deterministic enrichment, Speech for audio conversion, and OpenAI for generation. Another trap is missing the word “custom”—custom classification/entity recognition implies training with labeled examples under Azure AI Language rather than only prebuilt analysis.
Exam Tip: In multi-service answer choices, pick the option that cleanly matches the workflow order (Speech → Language → OpenAI) and explicitly mentions grounding or safety controls when the scenario stresses compliance, policy alignment, or reduced hallucinations.
1. A retail company wants to analyze thousands of customer product reviews to determine whether each review expresses a positive, negative, or neutral opinion. Which Azure AI capability should you use?
2. A healthcare provider receives unstructured clinical notes and wants to automatically extract medication names, diagnoses, and patient locations from the text. Which Azure AI service is the best fit?
3. A call center wants to transcribe live customer phone calls into text in near real time to support agents during conversations. Which Azure AI service should you use?
4. A company wants a support assistant that answers employee questions using only information from internal policy documents, and it must cite the source content. Which approach best matches the requirement?
5. A marketing team uses an Azure OpenAI model to draft customer-facing emails. The company wants to reduce the likelihood of generating hateful, sexual, or violent content. What should you implement?
This chapter is your last structured pass before the AI-900 exam. You are not learning “new” content here—you are converting knowledge into points. The AI-900 is designed to test whether you can recognize AI workload types, match them to the right Azure AI services, and describe core ML and responsible AI concepts at a fundamental level. The most common reason candidates miss questions is not lack of knowledge, but misreading the scenario or choosing a service that is “close” but not the best fit.
You will complete two mixed-domain mock exam parts, then run a weak-spot analysis using objective mapping (the same way exam creators blueprint the test). Finally, you’ll run a fast refresh of the key services and a decision tree you can replay mentally during the exam.
Exam Tip: In AI-900, the “best answer” is usually the one that most directly satisfies the requirement with the least extra infrastructure. When two options can technically work, the exam favors the managed, purpose-built Azure AI service over a build-it-yourself approach.
Practice note for this chapter’s components (Mock Exam Part 1, Mock Exam Part 2, Weak Spot Analysis, Exam Day Checklist): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Before you start Mock Exam Part 1 and Part 2, set up realistic test conditions: one sitting, no notes, and a strict time box. AI-900 questions reward careful reading more than speed, but you still need a pacing plan to avoid spending too long on one tricky item. A practical strategy is to do an initial pass to capture “easy points,” mark uncertain items, then return for a second pass where you slow down and confirm.
Timing strategy: aim to finish your first pass with at least 25–30% of the time remaining for review. If a question requires you to re-interpret a scenario multiple times, it’s a candidate to mark and move on. On review, focus on questions that are (a) possibly wrong despite high confidence because of a detail you missed, or (b) low confidence but likely decidable by one key concept (for example, classification vs regression, OCR vs object detection, sentiment analysis vs key phrase extraction).
Exam Tip: If the scenario says “no ML expertise required” or “prebuilt,” expect Azure AI services (Vision, Language, Document Intelligence). If it says “custom model,” “training,” or “evaluate,” expect Azure Machine Learning or custom models within Azure AI services (like Custom Vision where applicable).
Common trap: confusing service families. “Azure AI services” are prebuilt or customizable APIs; “Azure Machine Learning” is the end-to-end platform for building, training, and deploying ML models. Another trap is mixing up “Azure OpenAI” (generative) with “Language” (classical NLP features like NER, sentiment, key phrases). The exam expects you to pick the simplest service that meets the requirement while staying within the scenario constraints.
Mock Exam Part 1 should feel like the first half of a real blueprint: broad coverage across AI workloads, ML fundamentals, and service selection. Your scoring plan matters as much as the questions themselves because it turns results into an improvement roadmap. Use a two-level score: (1) overall percent correct and (2) objective-based score by domain (AI workloads, ML on Azure, Vision, NLP, Generative AI/Responsible AI).
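The two-level scoring plan takes only a few lines to run on your own results. This sketch uses invented sample data and domain names matching the list above; adapt the structure to however you record your answers.

```python
# Two-level mock-exam scoring: overall percent plus per-domain breakdown.
# The results list is made-up sample data.
results = [
    {"domain": "AI workloads", "correct": True},
    {"domain": "ML on Azure",  "correct": False},
    {"domain": "Vision",       "correct": True},
    {"domain": "Vision",       "correct": False},
    {"domain": "NLP",          "correct": True},
    {"domain": "GenAI/RAI",    "correct": True},
]

overall = sum(r["correct"] for r in results) / len(results)
print(f"Overall: {overall:.0%}")

by_domain = {}
for r in results:
    stats = by_domain.setdefault(r["domain"], [0, 0])  # [correct, total]
    stats[0] += r["correct"]
    stats[1] += 1

# The per-domain view is your improvement roadmap: drill the weakest first.
for domain, (correct, total) in sorted(by_domain.items()):
    print(f"{domain}: {correct}/{total} ({correct / total:.0%})")
```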
As you answer, force yourself to name the workload type in your head before you look at the options. Example mental labels: “classification,” “object detection,” “OCR,” “entity extraction,” “chat completion,” “responsible AI risk.” This prevents you from being lured by a familiar service name that is not the best match.
Exam Tip: When two services sound plausible, decide by the “input/output” of the scenario. If the input is documents and the output is structured fields, think Document Intelligence; if the input is free-form text and the output is entities/sentiment, think Azure AI Language; if the output is generated text or code, think Azure OpenAI.
Common traps that Part 1 tends to expose: (1) confusing feature types in vision—image classification vs object detection vs OCR; (2) confusing supervised learning tasks—classification vs regression; (3) mixing training concepts—training/validation/test splits and what evaluation metrics indicate; and (4) treating generative AI as a drop-in for classical NLP. After Part 1, do not immediately “move on.” Capture patterns: did you miss items because of one word like “predict a numeric value” (regression) or “extract printed text from images” (OCR)? Those are high-yield fixes.
Mock Exam Part 2 should be treated as your endurance and consistency test. Many candidates improve from Part 1 simply by reading more carefully; Part 2 checks whether that improvement holds when you encounter similar-but-not-identical scenarios. Use the same scoring plan, but add a “confidence score” for each answer (high/medium/low). The goal is not only correctness, but calibrated confidence—on exam day you must know which questions to double-check.
In mixed-domain sets, the exam commonly blends a scenario requirement (business need) with a technical constraint (latency, cost, data type, or responsible AI). Your job is to identify the constraint that rules out distractors. For example, if the scenario needs prebuilt extraction with minimal code, that constraint rules out building a custom ML model from scratch. If the scenario highlights sensitive data and governance, that pushes you toward responsible AI practices and proper deployment controls rather than “just call an API.”
Exam Tip: If a scenario describes conversational answers, summarization, rewriting, or creative generation, treat it as generative AI. If it describes extracting insights (sentiment, entities, classification) from existing text, treat it as NLP analytics. The distractors often swap these two.
Common traps in Part 2: over-selecting Azure Machine Learning when a prebuilt Azure AI service is sufficient; assuming Azure AI Search (formerly Cognitive Search) is the right answer whenever documents are mentioned (search is for indexing/querying, not extraction itself); and misunderstanding responsible AI vocabulary (e.g., fairness vs reliability/safety vs privacy/security). Use Part 2 to prove you can decide quickly and justify your decision with one decisive capability statement.
Your review process is where score improvements are created. Don’t just read the correct answer—write (mentally or on a sheet) a one-sentence rationale that references the requirement and the service capability. Then write a one-sentence reason each distractor is wrong. This is how you immunize yourself against “close enough” options on the real exam.
Map every missed question to an exam objective area. AI-900 is fundamentals-focused, so your mapping should be simple and repeatable: (1) AI workloads and considerations, (2) ML principles on Azure, (3) computer vision workloads, (4) NLP workloads, (5) generative AI workloads and responsible AI. If you missed a vision item because you confused OCR and object detection, that’s not “random”—it’s a specific sub-skill you can drill.
Exam Tip: If you can’t explain why each wrong option is wrong, you don’t fully own the concept yet. On AI-900, distractors are frequently “valid Azure services” used in the wrong scenario.
Common traps uncovered in review: confusing evaluation metrics (accuracy vs precision/recall) and when they matter; misunderstanding the purpose of a validation set; and mixing responsible AI principles. For example, fairness relates to disparate impact across groups, reliability/safety relates to consistent performance and avoiding harmful outputs, privacy/security relates to data protection and access control, and transparency relates to explainability and disclosure. The exam expects you to recognize these concepts at a scenario level, not to calculate metrics.
This final refresh is your pre-exam “service map.” The exam is largely a matching exercise: given a scenario, pick the right workload type and then the right Azure capability. Start with the data type: image/video, text, documents, tabular signals, or “prompt-to-generate.” Then decide whether you need prebuilt inference or custom training.
Use a simple decision tree during the exam: (1) What is the input? (2) What is the output? (3) Is it prediction/extraction or generation? (4) Do we need custom training or prebuilt? (5) Which managed service most directly meets that requirement? This mental flow prevents you from picking “famous” services that don’t match the output.
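The five questions can be encoded as a single function you replay mentally. This is a study aid under my own assumptions, not an official selection algorithm; the service-family names follow the mapping described in this chapter.

```python
# The exam-day decision tree as a sketch: input -> output -> generation?
# -> custom training? -> service family. A study aid, not a formal rule.
def choose_service_family(input_type, output_type, is_generation, needs_custom):
    if is_generation:
        return "Azure OpenAI (generate/compose/converse)"
    if needs_custom:
        return "Azure Machine Learning, or a custom model in Azure AI services"
    if input_type == "audio":
        return "Azure AI Speech (transcribe/translate/synthesize)"
    if input_type == "document" and output_type == "structured fields":
        return "Azure AI Document Intelligence"
    if input_type == "image":
        return "Azure AI Vision (tags, captions, objects, OCR)"
    if input_type == "text":
        return "Azure AI Language (entities, sentiment, key phrases)"
    return "re-read the scenario: no clear match"

# "Extract the invoice number, total, and vendor from scanned invoices."
print(choose_service_family("document", "structured fields", False, False))
# "Draft customer-facing emails from bullet points."
print(choose_service_family("text", "draft email", True, False))
```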
Exam Tip: If the scenario is “extract,” “detect,” or “classify” from existing content, think analytics services (Vision/Language/Document Intelligence). If the scenario is “create,” “compose,” “summarize,” or “converse,” think Azure OpenAI. If it emphasizes building and evaluating a model lifecycle, think Azure Machine Learning.
Responsible AI refresh: be ready to identify actions that reduce harm—human review, content filtering, grounding with trusted data, access controls, monitoring, and clear user disclosures. A common exam trap is choosing a purely technical improvement (like bigger models) when the scenario is asking for a governance or safety mitigation.
Exam day performance is process-driven. Prepare the night before: confirm your exam appointment time, test delivery method (online proctored vs test center), and required identification. For online proctoring, your environment matters: clear desk, stable internet, and a quiet room. Remove extra monitors or ensure they are disconnected per provider rules, and keep your phone out of reach unless explicitly allowed.
Exam Tip: If you feel stuck, re-read the last sentence of the scenario—exam writers often place the true requirement there (for example, “without building a custom model” or “extract text from scanned forms”).
Plan for recovery: if you do not pass, schedule a retake window immediately while the experience is fresh. Your retake plan should be objective-based: review your weakest domain first, redo your rationale/distractor analysis, and rerun at least one mock part under timed conditions. Candidates often gain the most points by fixing a small number of recurring confusions (Vision task types, Language vs OpenAI, and ML evaluation basics). Walk into the exam with a process you’ve practiced—not just knowledge you’ve read.
1. A retail company wants to add a feature to its mobile app that detects the presence of a company logo in photos taken by users. The team wants the least amount of custom model training and infrastructure management. Which Azure AI service should you recommend?
2. A healthcare provider wants to transcribe doctor-patient conversations into text in near real time. The solution must handle audio input and return a text transcript. Which Azure service best meets this requirement?
3. A company wants to build a customer support chatbot that answers questions grounded in internal policy documents. They want to minimize hallucinations by ensuring responses are based on their content. Which approach best aligns with this requirement?
4. You build a binary classification model to predict whether a transaction is fraudulent. Fraud cases are rare, and missing a fraudulent transaction is very costly. Which evaluation metric is most important to focus on in this scenario?
5. A company is deploying an AI system used to screen job applicants. They need to check whether the model’s predictions differ significantly across demographic groups and document steps taken to mitigate issues. Which responsible AI principle is most directly being addressed?