AI Certification Exam Prep — Beginner
Learn AI-900 essentials fast with practice, scenarios, and a full mock exam.
This beginner-friendly course prepares you to pass the Microsoft Azure AI Fundamentals exam (AI-900), even if you’re a non-technical professional. You’ll learn how to recognize common AI workloads, understand core machine learning concepts, and confidently choose the right Azure AI approach in real-world scenarios—exactly the kind of decision-making the AI-900 exam rewards.
The Microsoft AI-900 exam is designed to validate foundational knowledge across key areas of Azure AI. Instead of coding, you’ll be tested on concepts, terminology, and scenario-based choices: What type of AI workload is this? Which capability best fits the requirement? What outcomes and limitations should you expect? This course is structured as a 6-chapter “book” so you always know what to study next and how it maps to official objectives.
Each chapter directly targets the exam domains:
Chapter 1 gets you exam-ready operationally: exam registration and scheduling, what the score means, typical question patterns, and a simple study strategy you can follow whether you have 2 weeks or a month.
Chapters 2–5 each cover one or two official domains, with clear explanations at the right depth for AI-900 followed by exam-style practice so you can learn the concepts and immediately apply them the way Microsoft asks. You’ll focus on identifying AI workload types, understanding ML fundamentals (training vs inference, evaluation, and responsible AI), and choosing between vision, language, and generative solutions depending on the scenario.
Chapter 6 is your full mock exam and final review. You’ll complete two timed mock sections, analyze weak spots by domain, and finish with an exam-day checklist that helps you avoid common mistakes and manage time confidently.
If you’re ready to begin, create your learning account and start the first chapter today: Register free. Prefer to compare options first? You can also browse all courses on the Edu AI platform.
Microsoft Certified Trainer (MCT) — Azure AI Fundamentals
Jordan Whitaker is a Microsoft Certified Trainer who specializes in helping beginners pass Microsoft fundamentals exams on the first attempt. He has coached professionals across business, operations, and support teams through AI-900 by translating Azure AI concepts into clear decision-making frameworks and exam-ready practice.
AI-900 (Microsoft Azure AI Fundamentals) is designed to validate that you understand what AI is, where it fits in business scenarios, and how Microsoft Azure delivers AI capabilities—without requiring you to code. For non-technical professionals, this exam is less about building models and more about choosing the right AI workload, interpreting high-level machine learning concepts, and explaining responsible AI considerations in plain language.
This chapter orients you to the exam’s format, what it measures, and how to build an efficient study routine in either 2 weeks or 4 weeks. You’ll also set up a lightweight learning environment so you can recognize Azure AI services by name, purpose, and best-fit scenarios—exactly what the exam tests.
Exam Tip: AI-900 answers are often “most appropriate,” not merely “possible.” Your job is to pick the option that best matches the scenario constraints (data type, desired output, latency, cost, and governance), even if multiple choices sound plausible.
Use this chapter as your “operating manual” for the rest of the course. The remaining chapters will dig into AI workloads, Azure services, and responsible AI—but your outcomes depend on how well you navigate the exam and practice intelligently.
Practice note for Understand the AI-900 exam format, domains, and question styles: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Register, schedule, and choose exam delivery (online vs test center): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Build a 2-week and 4-week study strategy for beginners: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Set up your Azure learning environment and free resources: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 measures your ability to describe AI workloads and identify Azure services that solve common business problems. Microsoft organizes the exam into domains that broadly align to: core AI concepts (including machine learning principles), computer vision, natural language processing (NLP), and generative AI—plus responsible AI concepts that cut across all areas.
In practical terms, the exam is testing whether you can hear a scenario and correctly classify the workload: “Is this prediction or classification?” “Is this extracting text from images?” “Is this summarizing language or understanding intent?” “Is this generating content using a foundation model?” Your answers should demonstrate correct matching of problem type → Azure capability.
Exam Tip: When two answers look similar, choose the one that matches the output the scenario asks for. “Detect objects and return bounding boxes” points to object detection, while “categorize the image” points to classification. The exam rewards precise interpretation of what the business wants produced.
Common trap: treating “AI service names” as the goal. The goal is workload selection. Learn the names as labels for capabilities, but practice translating scenarios into the right workload category first, then select the Azure service that implements it.
Registering correctly prevents avoidable stress and last-minute issues. AI-900 exams are scheduled through Microsoft’s certification portal and delivered by an exam provider (commonly Pearson VUE). You’ll pick an exam delivery method: online proctored (remote) or test center. Both test the same objectives; your choice should be based on your environment and comfort with proctoring rules.
High-level registration flow: sign in with your Microsoft account → select AI-900 → choose language and delivery → select date/time and location (or online) → complete identity verification steps required by the provider. Plan this at least a week in advance so you have time to resolve account name mismatches or ID concerns.
Exam Tip: If your home environment is unpredictable (noise, shared space, unstable Wi‑Fi), choose a test center. Online cancellations due to technical or environmental issues are one of the most common “preventable failures” for otherwise-prepared candidates.
Review exam policies early: ID requirements, rescheduling windows, and prohibited items. If you need accommodations (for example, extra time due to a documented need), start the request process as early as possible since approvals can take time. A common trap is assuming accommodations can be added after scheduling—often you must secure them first, then schedule under the approved conditions.
Microsoft exams typically use a scaled scoring model, and candidates often aim for a “passing score” that is commonly presented as 700 on a 1–1000 scale. The exact number and weighting can vary, and not every question necessarily contributes equally. Your strategy should focus on consistent coverage across domains rather than “gaming” the score.
What passing really requires: not perfection, but competence across the blueprint. AI-900 is fundamentals-level, yet it includes subtle distinctions (for example, when to use a vision OCR capability vs document analysis, or when a generative AI solution needs grounding to reduce hallucinations). Expect a mix of straightforward definitions and scenario interpretation.
Exam Tip: Build “minimum viable competence” in every domain before you try to optimize any single area. Candidates who over-study one favorite topic (often generative AI) and ignore another (often classic ML evaluation concepts) risk missing easy points.
Plan a retake strategy even if you don’t intend to use it. Knowing your fallback reduces anxiety and improves performance. If you fail, use the score report by skill area to identify where to focus. Avoid the common trap of immediately rebooking without changing your study process—your next attempt should include more targeted practice, not just more reading.
For non-technical professionals, confidence comes from repetition: “If I see X business need, I map it to Y workload and Z Azure capability.” That mapping skill is what the scoring model ultimately rewards.
AI-900 questions commonly appear as multiple choice (single answer) and multi-select (“choose two/three”), plus scenario-based items that require you to interpret business context. Some exam forms include drag-and-drop matching (for example, matching a workload to a service or matching steps in an ML process). Your preparation should mirror these formats so you practice the skill the exam measures: quick, accurate mapping under time pressure.
Time management is less about rushing and more about avoiding traps. Scenario questions often contain extra details meant to distract you. Train yourself to underline (mentally) three elements: input data type (text, image, audio, tabular), desired output (label, score, bounding box, summary), and constraints (real-time vs batch, explainability, privacy).
Exam Tip: For multi-select items, treat each option as a true/false statement against the scenario. Don’t search for “the best pair” first—verify each candidate option independently and then select all that satisfy the requirement.
Build a pacing rule: if you can’t confidently decide within a short window, mark it for review (if your exam interface allows) and move on. Many candidates lose points by spending too long on one confusing scenario and then rushing simpler questions at the end.
Your study plan should match your schedule and your starting point. For beginners, two practical tracks work well: a focused 2-week plan (for those who can study daily) and a steadier 4-week plan (for those balancing work and family). In both tracks, the priority is to rotate through domains and revisit them repeatedly—this is spaced repetition, and it is especially effective for service-to-scenario mapping.
2-week template (high intensity): Days 1–3 core AI/ML concepts; Days 4–5 vision; Days 6–7 NLP; Days 8–9 generative AI and responsible AI; Days 10–12 mixed review + targeted weak spots; Days 13–14 full practice + final revision. This works if you can do 60–90 minutes per day.
4-week template (steady): Week 1 core concepts + responsible AI; Week 2 vision; Week 3 NLP; Week 4 generative AI + full-domain review and practice. This works if you can do 30–60 minutes most days.
Exam Tip: Use “two-layer notes.” Layer 1 is a one-page map: workload → output → Azure capability. Layer 2 is short clarifiers: key definitions (feature/label), what a metric indicates, and common scenario keywords (e.g., “extract printed text” → OCR).
Common trap: passive reading of documentation. Fundamentals exams still require recall under pressure. Convert reading into prompts you can answer: “What does this service do?” “What output does it return?” “When would it be the wrong choice?”
Practice is where non-technical candidates separate “I’ve read it” from “I can answer it.” Start by taking a baseline assessment early (after you’ve skimmed the domains) to identify weak spots. Your baseline is not a judgment—it’s a diagnostic that tells you where to invest time. Then run a weekly review loop: practice → analyze errors → update notes → re-practice.
Weak-spot tracking should be specific. Don’t write “NLP” as a weakness; write “confusing key phrase extraction vs sentiment,” or “forgetting when to use embeddings/grounding in generative AI.” Specificity creates targeted fixes and faster improvement.
Exam Tip: Track “near misses” (questions you got right but weren’t sure about). These are high-risk on exam day because stress can flip them to wrong. Treat them like incorrect answers and strengthen the rule behind the choice.
Set up your Azure learning environment with free resources so concepts feel concrete. Use Microsoft Learn modules, product documentation overview pages, and—if available to you—a free Azure account or sandbox. You don’t need to build production solutions, but clicking through Azure AI service pages, seeing what inputs/outputs look like, and reading a few sample scenarios will reduce ambiguity on test day. The goal of practice is not volume; it’s closing loops until your decision-making becomes automatic.
1. You are advising a non-technical colleague who is starting the AI-900 exam prep. They ask what the exam is primarily designed to validate. Which statement best describes the AI-900 focus?
2. A candidate notices that several answer choices on practice questions seem plausible. What approach best matches how AI-900 questions are commonly scored and written?
3. A busy professional can study only 30–45 minutes on weekdays and 1–2 hours on weekends. They want a realistic plan that reduces burnout while still building familiarity with Azure AI services and question styles. Which study strategy is most appropriate?
4. A company’s HR team wants to schedule the AI-900 exam for several employees and asks about delivery options. Which statement correctly reflects typical Microsoft exam delivery choices that candidates must decide during scheduling?
5. You are helping a beginner set up a learning environment for AI-900 preparation. They want to spend as little as possible while becoming familiar with Azure AI services by name and typical use cases. What is the most appropriate first step?
This chapter maps directly to a major AI-900 skill area: recognizing common AI workload patterns and explaining the core vocabulary that shows up in nearly every question. For non-technical professionals, the exam is not trying to turn you into a data scientist—it’s checking whether you can identify the right type of AI for a business scenario, describe what a model does, and show awareness of responsible AI principles.
You’ll see scenario-based items that sound like workplace requests (“we want to predict churn,” “we need to read receipts,” “summarize support tickets,” “detect defects in images,” “draft marketing copy”). Your job is to classify the request into the correct workload family (machine learning, computer vision, NLP, or generative AI), then choose an appropriate capability (often an Azure AI service vs building custom ML).
As you read, keep this exam habit: translate business language into AI workload language. Words like “forecast,” “estimate,” “score,” and “risk” often indicate prediction; “categorize,” “approve/deny,” and “route” hint at classification; “find objects” or “locate issues in images” suggests detection; “shorten,” “key points,” and “digest” indicate summarization. This chapter also introduces responsible AI concepts because AI-900 expects you to identify risks and mitigation themes at a high level.
Practice note for Recognize AI workloads and match them to real business scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Differentiate ML, computer vision, NLP, and generative AI at a high level: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Apply responsible AI concepts to common workplace use cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Practice: AI workload identification (exam-style set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 frequently tests whether you can tell when AI is appropriate versus when traditional rules-based software is enough. Traditional software is deterministic: you define explicit rules (if-then logic), and the program follows them exactly. AI workloads are probabilistic: they learn patterns from data and return the most likely output (often with a confidence score).
Business cue: if the problem is stable, well-defined, and easy to encode (“apply a fixed discount rule,” “validate a 10-digit ID checksum”), it’s usually traditional software. If the problem involves perception, language, or messy real-world variation (“recognize products in photos,” “detect fraud,” “summarize emails”), AI is typically a better fit because writing complete rules is impractical.
Exam Tip: Watch for phrasing like “cannot describe all rules” or “patterns change over time.” That is the exam’s hint that machine learning (or another AI workload) is needed. If the scenario says “must be 100% correct” with no tolerance for error, the best answer may be non-AI controls or a human-in-the-loop approach.
Common trap: assuming anything with the word “automation” requires AI. Many automation problems are workflow and integration tasks (Power Automate, logic apps, scripts). The exam expects you to pick AI only when there is learning, perception, or language understanding involved. Another trap is equating “chatbot” with generative AI. Many bots are retrieval/FAQ systems (NLP intent detection + knowledge base) without generating novel text.
How to identify correct answers: first decide “rules vs learning.” Then identify the modality: numbers/tables (ML), images/video (vision), text/speech (NLP), content creation (generative AI). This quick triage is an exam-winning habit.
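To make the rules-vs-learning split concrete, here is a minimal Python sketch, assuming toy data and made-up field names: a checksum-style ID check is deterministic, while a fraud score is a probabilistic prediction with a confidence value.

from sklearn.linear_model import LogisticRegression

def is_valid_id(id_number: str) -> bool:
    # Deterministic rule: explicit logic, same answer every time.
    return len(id_number) == 10 and id_number.isdigit()

# Probabilistic prediction: a model learns from (toy) historical examples
# and returns the most likely outcome with a confidence score.
X_train = [[120, 1], [3500, 0], [80, 1], [9000, 0]]  # [amount, known_device]
y_train = [0, 1, 0, 1]                               # 0 = legitimate, 1 = fraud
model = LogisticRegression().fit(X_train, y_train)

fraud_probability = model.predict_proba([[4200, 0]])[0][1]
print(is_valid_id("1234567890"), round(fraud_probability, 2))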
AI-900 uses a small set of workload archetypes repeatedly. Your score improves when you can map scenario verbs to these archetypes.
Exam Tip: Don’t confuse detection with classification. If the question says “is there a defect?” it could be classification. If it says “draw a box around the defect” or “count and locate items,” that’s detection. The exam often uses these subtle cues.
Another trap: summarization versus extraction. Summarization produces a paraphrased condensed version; extraction pulls existing fields (names, dates, invoice totals). On AI-900, extraction often maps to NLP information extraction (entity recognition) or document processing capabilities, while summarization maps to NLP/generative capabilities.
To pick the best answer, focus on the expected output type: number (prediction), label (classification), presence/location (detection), shorter text (summarization). Then choose the corresponding AI domain (ML, vision, NLP, generative AI) based on data type and scenario.
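One way to lock in that triage is to write it down as a tiny lookup, as in the sketch below; the category names are study labels only, not Azure product names.

# Study aid: map the output a scenario asks for to the workload archetype
# and the AI domain most likely to produce it.
OUTPUT_TO_WORKLOAD = {
    "number": ("prediction/regression", "machine learning"),
    "category label": ("classification", "machine learning, vision, or NLP"),
    "presence and location": ("object detection", "computer vision"),
    "extracted text": ("OCR", "computer vision"),
    "shorter text": ("summarization", "NLP / generative AI"),
    "new content": ("content generation", "generative AI"),
}

def triage(output_type: str) -> str:
    workload, domain = OUTPUT_TO_WORKLOAD[output_type]
    return f"Workload: {workload}; domain: {domain}"

print(triage("presence and location"))  # Workload: object detection; domain: computer vision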
The AI-900 exam uses core machine learning vocabulary even in non-technical scenarios. You don’t need math, but you must know what the terms mean and how they relate to an ML lifecycle.
Exam Tip: If the scenario mentions “known outcomes in the past,” it’s hinting at labeled data and supervised learning. If it says “no labels available,” the exam may be pushing you toward unsupervised approaches (like clustering) or toward using prebuilt AI services rather than custom training.
Common trap: mixing up training and inference. Training happens before deployment and is compute-heavy; inference happens in production and must be fast and reliable. Another trap: thinking features are only numeric columns. For AI-900, features are simply the model inputs—text, images, and audio can all be features depending on the model.
What the exam tests: your ability to read a business description and identify where the model comes from (training), what you do with it in real time (inference), and what kind of data is required (features and labels). When answer choices include these terms, select the one that matches the correct stage of the ML process.
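A minimal scikit-learn sketch, assuming toy data and hypothetical feature names, makes the stages concrete: fit() is training on historical labeled data, and predict() is inference on a new, unlabeled record.

from sklearn.tree import DecisionTreeClassifier

# Training: historical records with known outcomes (features + labels).
features = [[5, 1], [48, 0], [2, 1], [36, 0]]  # e.g., [months_as_customer, has_support_plan]
labels = [1, 0, 1, 0]                          # 1 = churned, 0 = stayed (the label column)
model = DecisionTreeClassifier().fit(features, labels)

# Inference: the deployed model scores a new, unlabeled customer.
new_customer = [[12, 0]]
print(model.predict(new_customer))             # e.g., [1] -> predicted to churn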
AI-900 includes responsible AI because organizations must manage risk, compliance, and trust. The exam expects conceptual understanding of key principles and how they apply to workplace scenarios.
Exam Tip: When a question mentions protected characteristics (age, gender, ethnicity) or high-impact decisions, fairness is usually central. When it mentions “audit,” “explain,” or “regulators,” transparency is the likely theme. When it mentions “PII,” “HIPAA,” “GDPR,” or “customer data,” privacy is the theme.
Common trap: treating responsible AI as only a policy statement. The exam often wants practical mitigations: diverse training data, monitoring model performance, human review for sensitive decisions, and clear user disclosures. Another trap is assuming “more data” always helps; collecting unnecessary personal data can increase privacy risk.
Generative AI adds extra responsible AI concerns (hallucinations, unsafe content, data leakage), but the principles above still apply. On AI-900, you mainly need to connect scenario risk to the correct principle and identify high-level controls like content filtering and human oversight.
AI-900 does not require hands-on building, but it does test whether you understand the difference between using prebuilt AI services and building custom machine learning solutions. The key decision is: do you need a general capability that Microsoft already provides, or do you need a model tuned to your specific data and labels?
Use Azure AI services (prebuilt) when the task is common and broadly applicable—OCR, image tagging, speech-to-text, translation, key phrase extraction, sentiment analysis, or content moderation. These services are optimized, scalable, and reduce the need for ML expertise and training pipelines.
Use custom ML (e.g., Azure Machine Learning) when your organization needs to predict something unique (churn risk for your specific customer base), when accuracy depends on your proprietary data, or when you must control features, training, evaluation, and deployment. Custom ML also makes sense when you need specialized labels or domain-specific outcomes.
Exam Tip: If the scenario mentions “we have historical data with outcomes” and the outcome is specific to the business, that’s a strong signal for custom ML. If it mentions “extract text from receipts,” “detect faces,” or “translate,” that’s typically a prebuilt service scenario.
Common trap: assuming generative AI replaces all NLP. Many language tasks are still best served by classic NLP (entity extraction, language detection) because they are cheaper, more deterministic, and easier to evaluate. Another trap is thinking you must always train a model—prebuilt services can be the correct answer when speed-to-value and standard tasks are emphasized.
What the exam tests here is your ability to select the right approach, not to memorize product names. Focus on the decision criteria: data availability (labels), specificity of the task, required customization, and risk/controls (especially for generative AI use).
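If it helps to rehearse that decision, here is a toy checklist sketched as code; the criteria and return strings are study shorthand, not official guidance.

def prebuilt_or_custom(task_is_common: bool, has_labeled_proprietary_data: bool) -> str:
    # Study aid: "extract text / translate / detect objects" -> prebuilt service;
    # "predict OUR outcome from OUR historical labels" -> custom ML.
    if task_is_common and not has_labeled_proprietary_data:
        return "prebuilt Azure AI service (fast time-to-value, no training pipeline)"
    if has_labeled_proprietary_data:
        return "custom ML (e.g., Azure Machine Learning) trained on your own labels"
    return "clarify the requirement before choosing"

print(prebuilt_or_custom(task_is_common=True, has_labeled_proprietary_data=False))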
This section prepares you for the exam’s most common item style: short business scenarios with multiple plausible AI options. Your advantage comes from using a consistent mapping method rather than “gut feel.”
Step 1: Identify the input data type. Tables of numbers and attributes suggest ML. Images/video suggest computer vision. Large volumes of text suggest NLP. Requests to draft, rewrite, or brainstorm content suggest generative AI.
Step 2: Identify the output type using the workload archetypes: number (prediction), category (classification), presence/location (detection), shorter text (summarization). This often eliminates half the choices immediately.
Step 3: Check for training vs inference clues. If the scenario says “build a model using past outcomes,” it’s training. If it says “use the model to score new applications,” it’s inference. If answer options misuse these terms, that’s a classic AI-900 distractor.
Step 4: Apply a responsible AI lens. If the scenario is high impact or involves personal data, consider fairness, privacy, reliability, and transparency. The exam may include a “best next step” choice like adding human review, monitoring performance, or explaining outcomes.
Exam Tip: When two answers both seem feasible, choose the one that most directly matches the scenario’s required output and least assumes unnecessary complexity. AI-900 typically rewards “fit-for-purpose” solutions (prebuilt service for common tasks; custom ML for proprietary predictions).
Common traps to avoid: confusing summarization with data extraction; confusing detection with classification; selecting generative AI when the task is simple sentiment or key phrase extraction; and ignoring that responsible AI may be the primary objective of the question even when AI technology is mentioned.
1. A retail company wants to predict which customers are most likely to stop buying in the next 30 days so the sales team can target retention offers. Which AI workload is this scenario describing?
2. A manufacturing plant wants to automatically detect scratches and dents in product photos taken on an assembly line. Which AI workload should you use?
3. A support manager wants to automatically summarize long customer support tickets into a few bullet points for faster triage. Which type of AI capability best fits this request?
4. A marketing team wants an AI tool that can draft multiple variations of product descriptions in a specific tone based on a short prompt. Which AI workload is being requested?
5. A bank uses an AI model to help decide whether loan applications should be approved or denied. During review, the team discovers the model rejects a higher percentage of applicants from a particular demographic group. Which responsible AI principle is most directly impacted?
This chapter maps to the AI-900 domain that asks you to explain core machine learning (ML) principles and how Azure supports them. The exam is not testing coding. It is testing whether you can interpret a scenario, identify the ML workload (regression, classification, clustering), understand the ML lifecycle (training/evaluation/deployment), and recognize responsible AI considerations. If you can reliably answer “What is the label?”, “What is the prediction type?”, and “How do we evaluate success?”, you can eliminate most wrong options quickly.
On AI-900, you’ll also see Azure Machine Learning (Azure ML) vocabulary at a high level: workspace, compute, datasets, pipelines, and AutoML. The most common trap is confusing where something happens (training vs inference) or choosing an evaluation metric that doesn’t match the business goal (for example, optimizing accuracy when false negatives are the real risk). Another trap: mixing up validation and test sets, or assuming unsupervised learning uses labels.
Use this chapter to build a practical mental checklist: (1) define the problem type; (2) confirm data/labels; (3) select training and evaluation approach; (4) understand deployment/inference; (5) apply responsible ML guardrails. Those steps align with how exam questions are written: short scenarios with multiple plausible answers, where one fits the lifecycle and objective best.
Practice note for Explain supervised, unsupervised, and reinforcement learning at exam depth: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Understand training, validation, testing, and evaluation metrics: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Choose the right ML approach for a scenario (regression vs classification vs clustering): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Practice: ML fundamentals on Azure (exam-style set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 expects you to understand the end-to-end ML lifecycle, not just “train a model.” A typical lifecycle is: collect data, prepare data, train, evaluate, deploy, and monitor. In exam scenarios, read for clues about where the team is in this lifecycle—because many answer options describe the right action, but at the wrong stage.
Data collection focuses on gathering representative data that reflects real-world conditions. If the scenario mentions “new region,” “new customer group,” or “changing trends,” think about whether the collected data still represents the problem. Data preparation includes cleaning missing values, removing duplicates, feature engineering (creating useful input columns), and splitting data into training/validation/test sets.
Training is when the algorithm learns patterns from the training set. Deployment is when you publish the trained model so it can produce predictions (inference) on new data—often as a web service endpoint. The exam frequently tests the distinction between training and inference: training uses historical labeled data; inference uses new unlabeled inputs to produce outputs.
Exam Tip: If the question mentions “endpoint,” “real-time predictions,” or “integrate into an app,” you are in deployment/inference territory, not training. If it mentions “tune hyperparameters,” “select features,” or “improve performance,” you are in training/validation territory.
Common trap: calling the validation set the “test set.” On the exam, the test set is reserved for final evaluation, not iterative tuning.
The exam wants you to distinguish supervised, unsupervised, and reinforcement learning at a conceptual level and match them to scenarios. The quickest way is to ask: “Do we have labels?” If yes, it’s usually supervised learning. If no and we’re discovering structure, it’s unsupervised. If an agent learns by trial-and-error rewards, it’s reinforcement learning.
Supervised learning uses labeled data (inputs with known outputs). Two key supervised problem types appear constantly on AI-900: regression, which predicts a numeric value (sales amount, delivery time, cost), and classification, which predicts a category (approve/deny, churn/no churn, spam/not spam).
Unsupervised learning uses unlabeled data. The most tested concept is clustering, which groups similar items (customer segmentation, grouping documents by topic, anomaly detection as “far from clusters”). If the scenario says “we don’t know the groups in advance” or “segment customers,” clustering is usually the best fit.
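A minimal clustering sketch, assuming toy usage data and arbitrary feature names: KMeans groups similar customers without any labels, which is exactly the “we don’t know the groups in advance” situation.

from sklearn.cluster import KMeans

# Unlabeled usage data: [monthly_minutes, monthly_data_gb] per customer (toy values).
usage = [[50, 1], [60, 2], [800, 30], [750, 28], [300, 10], [320, 12]]

# Ask for 3 segments; there is no answer column to learn from.
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(usage)
print(segments)  # e.g., [0 0 1 1 2 2] -> three usage-based customer groups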
Reinforcement learning is less common on AI-900, but you should recognize the pattern: an agent takes actions, receives rewards/penalties, and learns a policy (robot navigation, game playing, dynamic pricing with feedback). Reinforcement learning is not simply “retraining with new data”; it’s learning through interaction.
Exam Tip: The word “predict” alone doesn’t guarantee supervised learning. If the scenario is about “discovering segments” or “finding patterns” without labeled outcomes, the correct choice is often unsupervised clustering—even though the business may call it “predictive insights.”
Evaluation questions on AI-900 focus on choosing and interpreting basic metrics and recognizing overfitting. Start by identifying what “good” means for the business: is it worse to miss a positive case (false negative) or to raise false alarms (false positive)? Then select metrics accordingly.
Accuracy is the proportion of correct predictions overall. It can be misleading on imbalanced datasets (for example, only 1% fraud). A model that always predicts “not fraud” can be 99% accurate but useless.
Precision answers: “When the model predicts positive, how often is it correct?” It matters when false positives are costly (flagging legitimate transactions, sending unnecessary alerts). Recall answers: “Of all true positives, how many did we find?” It matters when missing positives is costly (failing to detect fraud, missing a medical condition).
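To see why accuracy misleads on imbalanced data, here is a small worked example with made-up counts for a 1%-fraud dataset: the model that always predicts “not fraud” is 99% accurate but catches nothing.

# 1,000 transactions, 10 of which are fraud (made-up numbers).
# Model A always predicts "not fraud": tp=0, fp=0, fn=10, tn=990.
accuracy_a = (0 + 990) / 1000          # 0.99 -> looks great
recall_a = 0 / (0 + 10)                # 0.00 -> misses every fraud case

# Model B flags fraud: catches 8 of 10, at the cost of 20 false alarms.
tp, fp, fn, tn = 8, 20, 2, 970
accuracy_b = (tp + tn) / 1000          # 0.978 -> slightly "worse" accuracy
precision_b = tp / (tp + fp)           # ~0.29 -> many alerts are false alarms
recall_b = tp / (tp + fn)              # 0.80 -> most fraud is caught
print(accuracy_a, recall_a, accuracy_b, precision_b, recall_b)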
Overfitting is when a model performs very well on training data but poorly on new data. Typical causes include overly complex models, leakage (using features that won’t exist at inference), or insufficient data. Underfitting is the opposite: the model is too simple to capture patterns, leading to poor performance on both training and test data.
Exam Tip: If you see “high training accuracy, low test accuracy,” think overfitting. If both are low, think underfitting or poor features/data quality. If performance drops after deployment, think data drift and monitoring (covered in responsible ML).
Also know the role of validation vs test: you tune using validation metrics (for example, compare models), and you report final performance on the test set. A common exam trap is selecting “test set” as the place to iterate; that introduces bias.
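Here is a hedged sketch of that split-and-compare habit, using a generic scikit-learn sample dataset: tune with the validation set, report once on the test set, and read a large train-test gap as overfitting.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Hold out the final test set first, then carve a validation set out of the rest.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train:", model.score(X_train, y_train))  # near 1.0 for an unpruned tree
print("val:  ", model.score(X_val, y_val))      # used to compare and tune candidate models
print("test: ", model.score(X_test, y_test))    # reported once; a big train-test gap = overfitting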
AI-900 includes high-level Azure Machine Learning concepts to ensure you can describe how ML work is organized in Azure. You are not expected to configure them in detail, but you should know what each component is for and how they relate to the lifecycle.
An Azure ML workspace is the top-level container that organizes assets such as datasets, models, experiments, and endpoints. If a question asks where models and runs are tracked and managed, the workspace is usually the answer.
Compute refers to the resources used for training or inference (for example, compute instances for development, compute clusters for scalable training). If the scenario needs “scale out training jobs” or “run experiments in parallel,” compute clusters fit. Deployment can also use compute (managed endpoints), but exam wording often distinguishes “training compute” from “serving predictions.”
Pipelines represent repeatable workflows (data preparation → training → evaluation → deployment). On the exam, pipelines are associated with automation, repeatability, and operationalizing ML processes.
AutoML automates model selection and hyperparameter tuning for a given task (classification/regression/time series forecasting). It’s most appropriate when you want a strong baseline quickly, you have labeled data, and you want Azure to try multiple algorithms/parameter combinations.
Exam Tip: If the question says “no code” or “quickly compare algorithms,” AutoML is a strong candidate. If it says “repeatable, scheduled training and deployment,” pipelines are the better match.
Responsible AI is explicitly tested on AI-900, and in ML questions it often appears as “what should you do to reduce risk?” The exam expects baseline understanding: improve data quality, reduce bias, explain outcomes when appropriate, and monitor models after deployment.
Data quality issues (missing values, noisy labels, outdated data) directly impact model reliability. In scenarios where model performance is inconsistent, the best next step may be to review label accuracy and data representativeness rather than changing algorithms.
Bias occurs when a model systematically disadvantages a group. Bias can come from imbalanced or unrepresentative training data, historical inequities, or proxy features (variables that indirectly encode sensitive attributes). The exam often tests recognition: if the scenario mentions different error rates across groups, fairness and bias mitigation steps are relevant (collect better data, evaluate fairness metrics, adjust features, or use fairness tools).
Monitoring is essential because real-world data changes (data drift) and model performance can degrade over time. Post-deployment monitoring looks for changes in input data distribution, prediction distribution, and outcome-based performance (when labels become available later).
Exam Tip: If the scenario says “the model worked during testing but is worse in production,” don’t jump straight to “retrain with more data” as the only fix. The more complete responsible ML answer includes monitoring for drift, validating data pipelines, and then retraining as needed.
At AI-900 depth, you should also connect responsible ML to human oversight: high-impact decisions (finance, healthcare, hiring) often require interpretability and review processes, not just high accuracy.
This section prepares you for exam-style prompts without listing full quiz items. AI-900 questions typically present a short business scenario and ask you to choose the ML approach (regression/classification/clustering) or interpret a training outcome. Use a consistent decision flow to avoid common traps.
Step 1: Identify the target output. If the output is a number (cost, time, temperature), choose regression. If it’s a category (approve/deny, yes/no, type A/B/C), choose classification. If there is no target output and you are grouping, choose clustering.
Step 2: Check for labels. If the dataset includes a known “answer column” (like “Churned: Yes/No”), it’s supervised. If the scenario says “we don’t know categories yet,” it’s likely unsupervised.
Step 3: Match the metric to the risk. If false negatives are dangerous (missing fraud, missing a defect), prioritize recall. If false positives are costly (too many customers incorrectly flagged), prioritize precision. If classes are balanced and costs are similar, accuracy may be acceptable.
Step 4: Interpret train vs test results. Big gap (train good, test poor) implies overfitting; similar poor results imply underfitting or weak features/data. If performance drops after deployment, suspect drift and the need for monitoring/retraining.
Exam Tip: Many wrong answers are “technically related” but not the best fit. For example, choosing clustering when you actually have labeled outcomes, or choosing accuracy when the scenario emphasizes rare events. Anchor on the output type and business risk; then select the most aligned approach and metric.
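The four-step flow compresses into a small helper you can rehearse against practice questions; the function and argument names below are hypothetical study shorthand.

def pick_ml_approach(has_labels: bool, output_is_number: bool) -> str:
    # Study aid only: maps scenario cues to the AI-900 answer families.
    if not has_labels:
        return "clustering (unsupervised: no answer column, discover groups)"
    if output_is_number:
        return "regression (supervised: predict a numeric value)"
    return "classification (supervised: predict a category)"

# "Segment customers with similar usage, no predefined groups" -> clustering
print(pick_ml_approach(has_labels=False, output_is_number=False))
# "Predict next month's sales amount from historical data" -> regression
print(pick_ml_approach(has_labels=True, output_is_number=True))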
1. A retail company wants to predict the total sales amount for each store next month based on historical sales, promotions, and local events. Which machine learning approach best fits this requirement?
2. A healthcare organization builds a model to flag patients who are likely to miss an appointment. Missing a high-risk patient (false negative) is much more costly than incorrectly flagging a patient (false positive). Which evaluation metric is most appropriate to prioritize during model selection?
3. You create a model and split your labeled dataset into training, validation, and test sets. Which statement best describes the purpose of the test set?
4. A telecommunications provider wants to segment customers into groups with similar usage patterns to create targeted plans. They do not have predefined group labels. Which approach should you use?
5. You deploy a trained model to a REST endpoint in Azure Machine Learning. New data is sent to the endpoint to generate predictions in real time. Which stage of the ML lifecycle is occurring when the endpoint returns a prediction?
In AI-900, “computer vision” means extracting meaning from images and video: reading text, detecting objects, describing scenes, and (conceptually) dealing with people-related imagery. The exam does not expect you to build neural networks from scratch; it expects you to recognize common vision tasks, map them to Azure capabilities, and describe what outputs you get (labels, bounding boxes, extracted text, confidence scores, etc.).
This chapter organizes the vision domain the way AI-900 questions are typically written: a short scenario (e.g., “scan receipts” or “detect safety gear”), followed by multiple services that sound similar. Your job is to spot the workload type (classification vs detection vs OCR), then pick the Azure capability that matches the required output and constraints (speed, customization, document complexity, responsible AI).
Exam Tip: When a scenario says “where is the item in the image?” you need object detection (bounding boxes). When it says “what is in the image?” you can often use image analysis (tags/captions). When it says “read the text,” think OCR. That single keyword-to-output mapping eliminates many distractor answers.
Finally, AI-900 also tests responsible AI basics. In vision, this often appears as privacy/consent, demographic performance differences, and the risk of using face-related features incorrectly. Expect conceptual questions about safe/appropriate use rather than deep implementation details.
Practice note for Identify key computer vision tasks and how they appear on the exam: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Select Azure vision capabilities for OCR, detection, and image understanding scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Understand responsible vision considerations and limitations: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Practice: computer vision scenario questions (exam-style set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 vision questions frequently start by testing whether you can identify the task type. The three most common tasks are image classification, object detection, and OCR (optical character recognition). Although they can be combined in real solutions, the exam often forces you to choose the primary workload based on the desired output.
Image classification answers: “Which category best describes this image?” The output is typically a label (or multiple labels) with confidence scores. This is used for scenarios like classifying product photos (shoe vs shirt), or sorting images into known categories. A common trap is to confuse classification with tagging; classification implies you have defined classes you care about.
Object detection answers: “Which objects are present and where are they?” The key output is a set of bounding boxes (coordinates) plus labels and confidence. If the scenario mentions counting items, locating them, drawing rectangles, or triggering an alert when an object appears in a specific region, it’s detection. Exam Tip: If the question mentions “identify a logo in an image” or “detect damaged parts,” you’re still in detection territory because location matters, even if the object is small.
OCR answers: “What text appears in the image?” The output is extracted text plus positional information (lines/words) and confidence. OCR is used for reading signs, labels, receipts, and scanned documents. A frequent exam distractor is proposing “NLP” for text extraction; remember OCR is still a vision task because the input is an image. NLP comes after OCR if you then need to interpret meaning.
When you see multiple-choice answers that include “Custom Vision,” “Vision (image analysis),” “Read/OCR,” or “Document Intelligence,” pause and ask: is the scenario about category, location, or text? That one decision usually determines the correct path.
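As a concrete, non-Azure illustration of “vision first, language after,” the sketch below uses the open-source pytesseract library (which assumes the Tesseract OCR engine is installed) to extract text from an image and only then applies a trivial language step; the file name is a placeholder.

from PIL import Image
import pytesseract  # open-source OCR, used here purely for illustration

# Vision step: the input is an image, the output is extracted text.
text = pytesseract.image_to_string(Image.open("receipt.jpg"))

# Language step: meaning is interpreted only after OCR has produced text.
if "total" in text.lower():
    print("Looks like a receipt; route it to expense processing.")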
Many AI-900 questions use “image understanding” language such as “describe the photo” or “identify what’s happening.” In Azure, this maps to prebuilt image analysis capabilities that return structured descriptions without you training a custom model. The exam expects you to recognize three common outputs: tags, captions, and dense descriptions/content metadata.
Tags are keywords (often multiple) that summarize what the service sees: “person,” “outdoor,” “car,” “sky.” Tags are useful for search, indexing, and basic filtering. They are not “classes you trained,” so a trap is choosing “custom classification” when the scenario simply wants auto-generated keywords for many unknown images.
Captions are short natural-language sentences describing the image (for example, “a person riding a bicycle on a street”). Captions are commonly used for accessibility (alt text) and quick summaries. If a question mentions “generate a description for screen readers,” captions are a strong clue.
Content understanding can include recognizing common objects and scene context beyond a single label. Exam questions sometimes phrase this as “extract insights from an image” or “understand the scene.” The key is that you are not asked to locate every object with bounding boxes; you are asked to derive a meaningful summary. Exam Tip: If the requirement is “identify every instance and location of a specific object,” you’ve crossed into detection; if it’s “describe or tag,” image analysis is the better match.
A common exam trap is mixing up “tags” with “classification.” Classification implies a constrained set of categories you define (often for a business process). Tags are broad and descriptive, driven by the prebuilt model’s vocabulary. Read the scenario carefully: if the company wants to standardize into their categories (e.g., “acceptable” vs “defective”), it leans custom; if they want generic metadata, it leans prebuilt image analysis.
Text in images appears on the AI-900 exam in two main flavors: simple OCR (read text) and document/form-like extraction (understand structured fields). Both start with vision, but the best Azure choice depends on whether the document has a predictable layout and whether you need key-value pairs.
OCR (Read) is ideal when the question says “extract the text” from photos or scans: street signs, product labels, screenshots, or basic scanned pages. Outputs commonly include lines/words, their coordinates, and confidence. In exam scenarios, OCR is often sufficient when there’s no mention of fields like “total,” “invoice number,” or “date.”
When the prompt implies forms or semi-structured documents—for example invoices, receipts, IDs, or purchase orders—the exam expects you to think beyond raw text and toward field extraction. In Azure, that is typically described as document analysis/document intelligence capabilities that can return structured results (e.g., vendor name, total amount) rather than just a text blob.
Exam Tip: Spot the word “extract fields” or “key-value pairs.” If the scenario needs “Total,” “Tax,” “Customer name,” or “Line items,” choose a document/form capability rather than plain OCR. OCR alone would force the developer to parse the text manually, which the exam will treat as the less appropriate choice.
Also watch for constraints: mobile camera photos can have skew, glare, and low resolution. The exam may hint that you need a service designed for real-world document capture. Your job is not to design preprocessing pipelines, but to pick the capability that best matches “document + fields” vs “image + text.”
AI-900 treats face-related topics primarily as a responsible AI and appropriate-use concept area. You may see scenarios about detecting whether a face is present, or comparing a face to a known user for access. The exam is less about implementation details and more about what is appropriate, what requires consent, and what limitations to acknowledge.
Conceptually, face-related capabilities can include detecting faces in an image and extracting face-region attributes (such as position). Exam items may also describe “identify who the person is,” “verify the same person,” or “group photos of the same person.” Treat these as higher-risk use cases that require careful governance, transparency, and compliance.
Privacy/consent is a recurring theme. If a scenario involves customers, employees, or public footage, assume you must consider notice and consent, data minimization (store only what you need), and secure handling. The exam often rewards answers that include explicit consent, clear purpose limitation, and an option for users to opt out where appropriate.
Exam Tip: If two answers both “work,” the exam often prefers the one that demonstrates responsible AI: consent, least-privilege access, auditing, and a human review step for consequential decisions. This is especially true for face-related scenarios.
Also remember limitations: image quality, occlusion (masks, glasses), and poor lighting can reduce accuracy. The exam may ask what to communicate to stakeholders; the safe, correct approach is to describe confidence thresholds, manual review for borderline cases, and continuous monitoring for drift or performance gaps.
A high-value AI-900 skill is choosing between a prebuilt vision capability and a custom model. Prebuilt services are designed for common tasks (tags, captions, OCR, generic object recognition) and are best when requirements are general and time-to-value matters. Custom models are best when the business has domain-specific categories or objects that prebuilt models won’t reliably recognize (specific defects, proprietary parts, brand-specific labeling rules).
Prebuilt approach: choose this when you can describe the requirement in everyday terms (“read text,” “describe the image,” “detect common objects”) and you don’t have labeled training data. The exam expects you to recognize that prebuilt services reduce effort and are typically the default recommendation for common workloads.
Custom model approach: choose this when the scenario says “train,” “labeled images,” “our products,” “our defect types,” or when accuracy must be optimized for a narrow domain. This often maps to custom vision-style training for classification or detection. Exam Tip: The keyword “custom” in the scenario is not enough; look for the real driver: a unique label set or objects not covered by general models.
Decision cues the exam loves: (1) Do you need bounding boxes? That pushes you toward detection (custom or prebuilt depending on specificity). (2) Do you need fields from receipts/invoices? That pushes you toward document analysis rather than generic OCR. (3) Do you need your own categories? That pushes you toward custom classification. Write these cues on your mental checklist and use them to eliminate distractors quickly.
This lesson is about building the “scenario-to-service” reflex the AI-900 exam demands. You are not being graded on memorizing product names alone; you’re being graded on mapping requirements to the correct capability and knowing the typical outputs so you can validate your choice.
When you read a scenario, underline the requested output: (a) category label, (b) object location, (c) extracted text, (d) structured fields, (e) descriptive summary. Then choose the capability that naturally produces that output. If the answer option provides an output that doesn’t match (for example, tags when the scenario needs bounding boxes), eliminate it.
Exam Tip: Force yourself to say out loud (mentally): “The input is an image; the required output is X; therefore I need Y.” This prevents a common trap where test-takers pick a familiar-sounding service name without checking outputs.
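If it helps to rehearse that habit, here is a tiny, illustrative lookup. The categories mirror this lesson and the wording is my own; nothing here is an Azure API.

```python
# Illustrative only: a self-check mapping "required output" to the workload you
# would name on the exam. The categories follow this lesson, not a product list.
OUTPUT_TO_WORKLOAD = {
    "category label": "image classification (custom if the labels are yours)",
    "object location / bounding boxes": "object detection",
    "extracted text": "OCR (Read)",
    "structured fields / key-value pairs": "document intelligence (form/receipt analysis)",
    "descriptive summary or tags": "image analysis (captions, tags)",
}

def pick_workload(required_output: str) -> str:
    """Return the capability that naturally produces the requested output."""
    return OUTPUT_TO_WORKLOAD.get(required_output, "re-read the scenario; the output is unclear")

print(pick_workload("structured fields / key-value pairs"))
```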
Also practice identifying “red herring” requirements. For example, a prompt might mention “store the extracted text for analytics” (tempting you toward NLP), but the core ask is still OCR. Or it may mention “detect whether a person is present” (image analysis may work) but then add “and draw a box around each person” (now it’s detection). AI-900 questions often hinge on that one extra phrase that changes the workload type.
Finally, keep responsible AI in view even in technical matching scenarios. If a scenario involves people, identity, or surveillance-like contexts, the best answer may include consent, data minimization, and human oversight—even if the technical capability is correct. The exam rewards solutions that are both functional and responsible.
1. A retail company wants to automatically extract the merchant name, date, and total amount from photos of customer receipts taken on mobile devices. Which Azure capability should they use?
2. A manufacturing plant needs to verify whether workers are wearing hard hats in live camera images. The system must identify where the hard hat appears in the image to support auditing. Which computer vision task is required?
3. A real estate company wants to automatically generate a short description for each property photo, such as "a living room with a sofa and large window." Which Azure vision capability best fits this requirement?
4. A company plans to analyze images of customers in a store to infer emotions and target ads in real time. Which responsible AI concern is most relevant to raise for this scenario in the context of AI-900 computer vision?
5. You are reviewing requirements for an Azure vision solution. The solution must return (1) the text found in an image and (2) the confidence score for the extracted text. Which workload type should you select?
This chapter maps directly to the AI-900 exam domain that asks you to identify natural language processing (NLP) workloads and describe generative AI workloads on Azure. You are not expected to code, but you are expected to recognize common language scenarios, choose an appropriate Azure capability, and describe how responsible AI applies—especially for generative AI. The exam often tests your ability to separate “classic NLP” (extracting signals from text) from “generative AI” (creating new text) and to spot when a scenario needs both.
As you read, keep the exam mindset: focus on the verbs in the question (analyze, extract, classify, translate, generate, summarize, answer) and the constraints (latency, privacy, safety, grounding). Many wrong options look plausible because they are “AI-ish” but mismatched to the task. You will practice that selection logic in Section 5.6.
Practice note: for each lesson in this chapter (identifying NLP tasks and choosing the right capability for language scenarios, explaining generative AI concepts such as LLMs, prompts, and grounding at AI-900 depth, applying responsible AI and safety concepts to language and generative use cases, and the mixed NLP + generative AI exam-style practice set), document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
On AI-900, “NLP workloads” usually mean analyzing text rather than generating it. The exam commonly frames these as four bread-and-butter tasks: sentiment analysis, key phrase extraction, entity recognition, and translation. Your job is to match the scenario’s goal to the correct capability.
Sentiment analysis is about classifying opinion or emotional tone (positive/negative/neutral, sometimes with confidence). Typical exam scenario: “Analyze customer reviews to measure satisfaction trends.” The trap: choosing a chatbot or generative model because the input is text. If the output is a score or label rather than new prose, you’re in classic NLP territory.
Key phrase extraction pulls out the main terms from a document (e.g., “shipping delay,” “refund policy”). This is often tested in scenarios like “summarize topics from thousands of support tickets.” The common trap is confusing it with summarization. Key phrases are not sentences; they’re short terms you can use for tagging, search, or dashboards.
Entity recognition (often including “named entity recognition”) identifies structured items in text such as people, organizations, locations, dates, or product names. Exam questions frequently combine this with compliance: “Detect names and addresses in emails.” Be careful: detecting personal data may overlap with privacy requirements, but the task itself is still entity extraction.
Translation converts text from one language to another. The exam will often include a multi-language customer support scenario. Exam Tip: If the scenario explicitly says “convert between languages,” don’t overthink it—translation is the intended capability even if other tasks (like sentiment) could be added later.
How to identify the correct answer: find the noun that represents the output. “Score,” “entities,” “key phrases,” and “translated text” strongly indicate these NLP workloads.
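You will not be asked to write this on AI-900, but a short sketch shows how each task returns a different output artifact: a sentiment label with scores, a list of key phrases, or a list of entities. It assumes the azure-ai-textanalytics Python package for Azure AI Language; the endpoint and key are placeholders.

```python
# pip install azure-ai-textanalytics  (assumed package for Azure AI Language)
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",  # placeholder endpoint
    credential=AzureKeyCredential("<key>"),                            # placeholder key
)

reviews = ["The delivery was late and the box was damaged, but support was helpful."]

# Sentiment: the output is a label plus confidence scores, not new prose
sentiment = client.analyze_sentiment(reviews)[0]
print(sentiment.sentiment, sentiment.confidence_scores)

# Key phrases: short terms for tagging, search, or dashboards, not a summary paragraph
phrases = client.extract_key_phrases(reviews)[0]
print(phrases.key_phrases)

# Entities: structured items such as organizations, locations, and dates
entities = client.recognize_entities(reviews)[0]
print([(e.text, e.category) for e in entities.entities])
```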
Conversational AI on AI-900 is about understanding what a chatbot does, how user requests are interpreted, and how the bot connects to other services. You’ll see terms like “bot,” “intents,” “utterances,” and “orchestration.” You do not need to implement conversation flows, but you must recognize these components and pick them in scenarios.
A chatbot provides a conversational interface (web chat, Teams, SMS) to answer questions, guide users, or trigger actions. The exam typically describes goals like “handle common HR questions” or “help users reset passwords.” The key idea is that the bot is the front door; it may call other services behind the scenes.
Intents represent what the user wants to do (e.g., “CheckOrderStatus,” “BookAppointment”). Utterances are example phrases users might say that map to an intent. A common trap is confusing intent classification with sentiment analysis. Sentiment is opinion; intent is purpose.
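You will not implement this on the exam, but a toy sketch can make the vocabulary concrete: intents are named purposes, utterances are example phrases that map to them, and the matcher below is a deliberately naive stand-in for a trained language-understanding model. All names and phrases are hypothetical.

```python
# Illustrative only: intents and example utterances as plain data, plus a naive
# keyword matcher. This is NOT how Azure services work internally; it only shows
# how intent (purpose) differs from sentiment (opinion).
INTENTS = {
    "CheckOrderStatus": ["where is my order", "track my package", "order status"],
    "BookAppointment": ["schedule a visit", "book an appointment", "see a technician"],
}

def match_intent(utterance: str) -> str:
    """Return the intent whose example utterances best overlap the user's words."""
    words = set(utterance.lower().split())
    best_intent, best_overlap = "None (fallback / handoff to a human)", 0
    for intent, examples in INTENTS.items():
        overlap = max(len(words & set(example.split())) for example in examples)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent

print(match_intent("Can you tell me where my order is?"))  # -> CheckOrderStatus
```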
Orchestration is the “traffic controller” concept: routing an incoming message to the right handler—perhaps a FAQ knowledge base, an action workflow, or a human agent. On AI-900, orchestration is tested as a conceptual design decision: when you have multiple sources (structured data, knowledge articles, generative model responses), you need a plan for selecting and combining them safely.
Exam Tip: If the scenario mentions “handoff to human,” “connect to ticketing,” or “trigger a workflow,” that’s a conversational solution with orchestration, not just a language model generating text.
How to spot correct answers: look for “conversation,” “multi-turn,” “support agent,” “guided interaction,” or “integrate with business systems.” Those point to conversational AI rather than standalone NLP analysis.
Generative AI workloads differ from classic NLP because the system produces new content. At AI-900 depth, focus on three use cases: content generation, summarization, and question answering (Q&A). The exam expects you to describe what these are, when to use them, and what risks they introduce.
Content generation includes drafting emails, marketing copy, job descriptions, or product FAQs. The prompt asks for an output that did not exist before. The common trap is assuming generation is always appropriate; exam scenarios often include compliance constraints where free-form generation must be controlled.
Summarization compresses long text (meeting notes, incident reports) into shorter text while retaining key points. This looks like “key phrases,” but the output is coherent sentences and bullet summaries. Exam Tip: If the scenario asks for “a short paragraph summary” or “executive summary,” that’s summarization; if it asks for “tags/topics,” that’s key phrase extraction.
Q&A can mean answering user questions based on a defined set of content (policies, manuals) or answering more broadly. For AI-900, you should recognize that higher-quality enterprise Q&A often requires grounding (covered in Section 5.4) so the model answers from provided sources rather than guessing. The trap is choosing a generative model without grounding for high-stakes factual domains (HR policy, medical guidance).
Azure generative AI solutions are commonly described at a high level as using large language models (LLMs) via managed services, with options to connect data sources. The exam doesn’t require SKU memorization, but it does require knowing what a generative model is good at (natural language creation) and what it is not inherently guaranteed to do (always be factual).
How to choose correctly: identify whether the user wants a new narrative response versus an extracted field/label. New narrative response → generative AI workload.
Prompting is testable on AI-900 because it is the main “control surface” for LLM behavior. You should be able to explain, at a practical level, how prompts influence outputs and what elements improve reliability. Think of a prompt as a mini spec: the clearer the spec, the more predictable the result.
Instructions are the direct task request (e.g., “Summarize this report for an executive audience”). Instructions should specify output format, tone, and length when needed. A common trap is vague prompts that cause the model to invent details. The exam may describe inconsistent results and hint that better instructions are required.
Context is the information the model should use: the document to summarize, the policy excerpt to answer from, or the customer’s ticket history. Context is also where grounding begins: you supply authoritative text so the answer is anchored to it. Exam Tip: When you need factual answers about your organization, adding trusted context is often the difference between “chat” and “enterprise Q&A.”
Examples (sometimes called “few-shot” examples) show the model what good outputs look like. On the exam, examples matter when the question mentions consistent formatting or classification-like responses (e.g., always return JSON fields or always follow a template).
Constraints limit behavior: “Use only the provided sources,” “If the answer is not in the context, say you don’t know,” “Do not include personal data,” “Return exactly three bullet points.” Constraints reduce risk and improve evaluation. The trap: assuming the model will naturally follow policies without explicit constraints.
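A minimal sketch can show how these four elements combine into one prompt. The ticket text, example output, and rules below are hypothetical; the point is that each element is explicit rather than implied.

```python
# Illustrative only: a prompt treated as a "mini spec" with the four elements
# from this lesson (instructions, context, examples, constraints).
instructions = "Summarize the support ticket for an executive audience in three bullet points."

context = "Ticket #4821: Customer reports the replacement part arrived damaged..."  # trusted source text (hypothetical)

examples = (
    "Example output:\n"
    "- Issue: damaged replacement part\n"
    "- Impact: production line idle for 2 days\n"
    "- Next step: expedite a new part\n"
)

constraints = (
    "Use only the provided ticket text. "
    "If information is missing, say 'Not stated in the ticket.' "
    "Do not include personal data. Return exactly three bullet points."
)

# Assemble the final prompt string; this is what you would send to the model
prompt = f"{instructions}\n\n{examples}\nTicket:\n{context}\n\nRules:\n{constraints}"
print(prompt)
```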
How to identify correct answers: when a scenario describes “unreliable responses,” “wrong format,” or “answers not aligned to our documents,” the best fix is often prompt improvement plus grounding, not switching to a different AI workload.
Responsible AI is not a separate “ethics lecture” on AI-900; it’s embedded in scenario choices. For language and generative use cases, three risks appear repeatedly: hallucinations, data protection, and evaluation/monitoring.
Hallucinations are plausible-sounding but incorrect outputs. The exam tests whether you understand that LLMs can generate fluent answers that are not guaranteed to be true. Mitigations at AI-900 depth include grounding the model on trusted sources, constraining prompts to “use only provided content,” and designing the system to say “I don’t know” when context is missing. Exam Tip: If the scenario is high-stakes (legal, medical, finance), expect the correct answer to include grounding and human review.
Data protection focuses on preventing leakage of sensitive or personal data. Common scenario: “Summarize customer chats that include addresses and payment details.” You should recognize the need to limit data exposure (send only necessary text), apply access controls, and consider data minimization. The trap is assuming you can freely paste internal documents into a model without governance.
Evaluation means measuring output quality and safety over time. Unlike classic ML where you track accuracy, generative evaluation can involve checking groundedness, factuality, toxicity, and adherence to format. The exam may describe a pilot that works “sometimes” and ask what you should do before production: evaluate outputs with representative test cases, define acceptance criteria, and monitor.
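As a purely illustrative sketch, here is what a tiny acceptance check might look like before production: one function that verifies format adherence and computes a crude word-overlap signal as a stand-in for groundedness. Real evaluation uses representative test sets and stronger measures; the helper, inputs, and thresholds here are assumptions made for illustration.

```python
# Illustrative only: naive pre-production checks for two qualities this lesson
# names -- format adherence and groundedness.
def check_output(answer: str, context: str, required_bullets: int = 3) -> dict:
    """Run simple acceptance checks on one generated answer."""
    bullets = [line for line in answer.splitlines() if line.strip().startswith("-")]
    context_words = set(context.lower().split())
    answer_words = set(answer.lower().split())
    overlap = len(answer_words & context_words) / max(len(answer_words), 1)
    return {
        "format_ok": len(bullets) == required_bullets,   # adherence to the required format
        "grounding_overlap": round(overlap, 2),           # crude groundedness signal (0 to 1)
    }

context = "The refund policy allows returns within 30 days with a receipt."
answer = (
    "- Returns are allowed within 30 days\n"
    "- A receipt is required\n"
    "- Refunds go to the original payment method"
)
print(check_output(answer, context))
```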
How to choose correct answers: look for risk keywords—“compliance,” “PII,” “incorrect answers,” “harmful content.” Those questions want responsible design choices, not just a capability name.
This section prepares you for the most common AI-900 question style: short business scenarios where multiple AI options seem plausible. Your goal is to pick the best fit by identifying (1) the desired output, (2) whether the task is extract/label vs generate, and (3) what safety constraints are implied.
Start by underlining the output artifact. If the scenario wants a score (customer happiness), think sentiment analysis. If it wants tags (main topics), think key phrases. If it wants structured fields (names, locations, dates), think entity recognition. If it wants the same message in another language, translation. These are “NLP analysis” selections.
If the scenario wants a draft, summary paragraph, email rewrite, or natural language answer, that’s a generative AI workload. Then ask: does it need to be factual based on company data? If yes, the scenario is signaling grounding and orchestration—retrieve trusted content and constrain the model to it. Exam Tip: “Answer using our policy documents” is code for grounded Q&A, not open-ended chat.
Mixed scenarios are common: a support center might use entity recognition to extract order numbers, sentiment to prioritize angry customers, and generative AI to draft a response. The trap is picking only one tool when the scenario clearly needs a pipeline. On AI-900, you won’t design the full architecture, but you should recognize when multiple capabilities are complementary.
When stuck between two options, choose the one that produces the required output with the least “extra.” The exam rewards selecting the most direct capability rather than the fanciest.
1. A support team wants to automatically route incoming customer emails into categories such as Billing, Technical Issue, and Account Management. The solution should identify the correct category for each email. Which Azure AI capability best fits this requirement?
2. A company wants a chat experience that answers employee questions about internal HR policies. To reduce hallucinations, answers must be based only on the company’s approved policy documents. At AI-900 depth, which approach best meets the requirement?
3. A marketing team wants to generate multiple variations of a product description from a short list of bullet points, while keeping the tone professional. Which workload is this an example of?
4. A healthcare organization plans to summarize patient messages using a generative AI model. They want to reduce the risk of generating disallowed or inappropriate content and ensure oversight for high-impact use. Which responsible AI measure best aligns to this goal?
5. A global retailer wants to analyze social media posts to determine whether customers are happy or unhappy with a recent product launch. They do not need the system to generate responses—only to score the tone of each post. Which NLP task should you choose?
This chapter is your “dress rehearsal” for AI-900. The goal is not to prove you’re smart—it’s to build exam instincts: pacing, eliminating distractors, recognizing service names, and spotting the test’s favorite traps. AI-900 is designed for non-technical professionals, but it is still precise: the exam rewards clear mapping from scenario language (what the business needs) to Azure AI workload categories (what capability fits) and to responsible AI considerations (what constraints apply).
You will complete two mock exam passes (Part 1 and Part 2), then perform a weak-spot analysis using an answer-review method that mirrors how exam writers think. Finally, you’ll use a domain-by-domain checklist aligned to the official objectives and finish with a practical exam-day plan. Treat this chapter as a workflow: run the mock, review with rigor, refresh by domain, then execute your exam strategy.
Practice note: for each part of this chapter (Mock Exam Part 1, Mock Exam Part 2, the Weak Spot Analysis, and the Exam Day Checklist), document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Before you start any mock exam, decide how you will manage time and uncertainty. AI-900 questions are typically short, but the traps are in the wording: “best,” “most cost-effective,” “minimize development,” “requires training data,” or “must be explainable.” Your job is to translate those constraints into the correct Azure AI workload and discard options that violate the constraint.
Use a two-pass approach. Pass 1 is speed and confidence: answer what you can quickly and accurately. Pass 2 is for flagged questions. Mark questions (mentally or on scratch paper) in three levels: (1) “Unsure between two,” (2) “Concept gap,” (3) “Wording trap.” This classification matters because each type needs a different fix during review.
Exam Tip: If a question includes a clear requirement like “no coding” or “prebuilt model,” immediately suspect Azure AI Services (prebuilt) rather than Azure Machine Learning (custom training). If it says “custom model,” “train,” “features,” or “evaluate,” your center of gravity shifts toward Azure Machine Learning concepts.
Pacing: aim for a steady rhythm rather than racing early. If you find yourself rereading, stop and extract the nouns and verbs: what is the input (text, image, audio, tabular data) and what is the desired output (classify, extract, summarize, generate, detect anomalies)? This “input/output” framing prevents drifting into irrelevant options.
Mock exams are most valuable when you treat them like the real test: quiet environment, no notes, and strict timing. The point is to reveal your decision patterns under pressure.
Mock Exam Part 1 should be a mixed-domain set that forces you to switch contexts the way the real AI-900 does. Expect interleaving across: AI workloads and core concepts, machine learning lifecycle, computer vision, NLP, and generative AI on Azure. Your mindset is “classify the problem first, then pick the capability.”
As you work through Part 1, anchor each scenario to an objective. If the scenario describes predicting a numeric value (sales forecast, demand), think regression. If it describes choosing a category (spam/not spam, churn/retain), think classification. If it describes grouping without labels, think clustering. If it describes outliers (fraud spikes), think anomaly detection. These are the exam’s foundational ML patterns, and the test often checks whether you can identify them from business language.
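If it helps to rehearse these cues, here is a tiny, illustrative lookup. The phrases are examples written for this lesson, not exam wording or an Azure API.

```python
# Illustrative only: "business language to ML task" cues from this section.
TASK_CUES = {
    "predict a numeric value (sales forecast, demand)": "regression",
    "choose a category (spam/not spam, churn/retain)": "classification",
    "group similar items without labels": "clustering",
    "flag unusual patterns (fraud spikes, sensor faults)": "anomaly detection",
}

for cue, task in TASK_CUES.items():
    print(f"{cue:55s} -> {task}")
```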
For Azure selection, watch for the recurring distinction: prebuilt AI Services vs custom ML. In vision, typical exam-tested capabilities include OCR (read text), image analysis (tags, captions, object detection), face detection (not identification unless explicitly allowed and supported), and document intelligence for structured extraction. In NLP, identify whether the task is sentiment analysis, key phrase extraction, language detection, entity recognition, or conversational bot scenarios.
Exam Tip: If a scenario asks for “extract text from scanned receipts or forms,” the strongest instinct is Document Intelligence (form/receipt processing) rather than basic OCR alone, because the requirement implies structure (fields) not just raw text.
Generative AI questions may test concepts rather than deep engineering: what a prompt is, what grounding/augmentation is (bringing in enterprise data), and responsible deployment (content filters, human-in-the-loop, privacy). When you feel uncertain, return to constraints: “needs citations,” “must use company data,” “avoid hallucinations,” “restrict unsafe content.” Those keywords point toward retrieval-augmented generation patterns and safety tooling rather than raw model selection.
After Part 1, do not immediately look at answers. First, write down which domains felt slow or guessy. This note becomes the input to your weak-spot analysis later.
Mock Exam Part 2 should be taken after a short break, because fatigue changes your accuracy. This set should include more “best option” and “choose the correct service” style prompts, which are common in AI-900. Here, the exam is testing whether you can reject plausible-but-wrong options that are adjacent in the Azure ecosystem.
A common trap is confusing what requires training. Prebuilt services (Azure AI Vision, Azure AI Language, Azure AI Speech) typically do not require you to bring labeled training data; you configure and call an API. By contrast, Azure Machine Learning is about building, training, and managing custom models (including data prep, training runs, and evaluation metrics). If the scenario says “no training data available” yet offers “train a model,” that option is likely a distractor.
Another trap is mixing up evaluation metrics. Classification tends to emphasize accuracy, precision, recall, and F1. Regression is often about error metrics such as MAE, MSE, and RMSE (mean absolute error, mean squared error, and root mean squared error). Clustering is more about grouping quality and interpretability than “accuracy.” The exam won’t usually require heavy math, but it will test that you pick metrics that match the task. If the task is “catch fraud,” recall might matter more than accuracy because missing fraud is costly.
Exam Tip: When you see “imbalanced data” (rare positives like fraud), expect distractors that recommend accuracy. Accuracy can look high while failing the business goal. Look for precision/recall framing or thresholds.
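A quick worked example makes that tip concrete. The numbers below are invented for illustration: 1,000 transactions with 10 fraud cases.

```python
# Illustrative only: why accuracy can mislead on imbalanced data.
total, fraud = 1000, 10

# A model that predicts "not fraud" for everything:
correct = total - fraud          # 990 correct predictions
accuracy = correct / total       # 0.99 -- looks excellent
recall = 0 / fraud               # 0.00 -- it caught none of the fraud
print(f"always-negative model: accuracy={accuracy:.2f}, recall={recall:.2f}")

# A model that catches 8 of the 10 frauds but raises 20 false alarms:
true_pos, false_neg, false_pos = 8, 2, 20
precision = true_pos / (true_pos + false_pos)   # 8 / 28 ~= 0.29
recall = true_pos / (true_pos + false_neg)      # 8 / 10  = 0.80
print(f"fraud-catching model: precision={precision:.2f}, recall={recall:.2f}")
```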
Responsible AI is also frequently embedded: fairness, reliability/safety, privacy/security, inclusiveness, transparency, and accountability. If an option suggests using sensitive attributes (race, health) without justification, or deploying without monitoring, it is often incorrect. Likewise, generative AI scenarios may test whether you apply content moderation, data protection, and human oversight.
Finish Part 2 by tallying your confidence levels. The point is to see whether you’re consistently missing one domain (for example, confusing Vision vs Document Intelligence, or mixing Language features) or whether errors are mostly from rushing and not reading constraints.
Your score improves fastest not by doing more questions, but by reviewing in a way that removes repeat mistakes. Use a “three-column review” for every missed or uncertain item: (1) the requirement keywords, (2) the correct capability and why it fits, (3) why each wrong option fails a requirement.
Start by rewriting the question in your own words as an input/output statement. Example format: “Input = customer emails (text). Output = detect sentiment + extract key topics. Constraints = no custom training.” Then map: NLP prebuilt features are a match; custom ML is likely unnecessary. This approach forces clarity and reduces the chance you’ll be seduced by brand-name distractors.
For wrong options, don’t write “not right.” Write the specific mismatch: “requires labeled training data,” “outputs tags not structured fields,” “detects language but not entities,” “does image captioning but not OCR.” Over time you build a personal “distractor dictionary,” which is exactly what AI-900 is testing: your ability to distinguish neighboring capabilities.
Exam Tip: If you can’t explain why three options are wrong, you don’t fully own the concept yet—even if you guessed the right letter. The exam punishes shallow recognition because distractors are designed to look familiar.
Also review any question you got right but felt uncertain about. Many candidates plateau because they only study incorrect answers and ignore weak knowledge that happened to result in a correct guess. Tag these as “lucky correct” and include them in your refresh checklist.
Finally, identify the failure mode: (a) concept gap, (b) service confusion, (c) didn’t read constraints, (d) changed answer without evidence. Your remediation differs: concept gaps need study, service confusion needs comparison tables, constraint errors need a reading protocol, and answer-changing needs discipline.
Use this final checklist to align your knowledge with AI-900 objectives. This is not a content dump—treat it as a verification routine. If you cannot explain an item in one or two sentences, flag it for a quick revisit.
Exam Tip: The fastest points come from clean service matching. Make sure you can confidently answer: “Is this best solved with a prebuilt service, custom ML, or generative AI?” before you worry about anything else.
After running this checklist, pick only the top 2–3 weak areas to refresh. Over-studying everything at once increases confusion and reduces recall on exam day.
Exam day performance is a product of preparation plus execution. Your plan should reduce avoidable errors: stress, rushing, and misreading constraints. Start with environment: stable internet, quiet room, cleared desk, and a simple way to track flagged questions. If you are testing online, complete the system check early and eliminate interruptions.
Time strategy: use the same two-pass method you practiced. Pass 1: answer confidently, flag anything uncertain. Pass 2: resolve flags by mapping requirements to capabilities. If you still cannot decide, eliminate options that violate constraints (needs training, wrong modality, wrong output type) and choose the remaining best fit. The exam often rewards elimination more than perfect recall.
Exam Tip: Read the last line of the question twice—this is where “best,” “most cost-effective,” “minimize effort,” or “must be responsible” often appears, and it changes the correct answer.
Last 24 hours: do not attempt to learn brand-new material. Re-run your domain checklist, review your “distractor dictionary,” and revisit only the questions you flagged as “concept gap.” Sleep matters because AI-900 is heavy on recognition and careful reading; fatigue increases careless mistakes.
Right before you start, set a simple rule: only change an answer if you can point to a specific requirement that your new choice satisfies better. This protects you from the common trap of switching from a correct answer to a tempting distractor.
When you finish, use any remaining time to review flagged questions first, then any “lucky correct” areas you recall. End with a final scan for unanswered items. Your goal is a calm, systematic finish—exactly the mindset this chapter trained.
1. A retail company wants to analyze 50,000 customer reviews to identify overall sentiment and extract key phrases. The company does not want to build or train a machine learning model. Which Azure AI service should you recommend?
2. A healthcare organization plans to deploy an AI model that helps prioritize incoming patient messages. The organization is concerned that the model might treat certain age groups unfairly. Which Responsible AI principle is most directly related to this concern?
3. You are reviewing a practice exam and see the requirement: "Convert recorded customer service calls into text and create a searchable transcript." Which Azure AI capability best matches this requirement?
4. During a mock exam, you encounter a scenario describing a manufacturing company that wants to predict equipment failures using historical sensor data. The company has data scientists who can train models and wants full control over the training process. Which Azure service is the best fit?
5. You are creating an exam-day checklist for AI-900. Which action best aligns with the exam strategy of mapping scenario language to the correct Azure AI workload category?