AI Certification Exam Prep — Beginner
A clear, non-technical path to pass Microsoft AI-900 on the first try.
Microsoft Azure AI Fundamentals (AI-900) is designed for learners who want to understand what AI can do, how common AI solutions are described, and how Microsoft Azure delivers AI capabilities—without requiring a technical background. This course blueprint is built specifically for non-technical professionals (business, operations, sales, project management, HR, finance) who need a clear exam-aligned learning path and enough practice to walk into the test confident.
This course follows the official AI-900 exam objectives and organizes them into a 6-chapter “book” that progressively builds understanding and exam readiness across the chapters below.
Chapter 1 starts with exam orientation: how to register, how scoring works, what to expect from question formats, and how to study efficiently as a beginner. You’ll set a practical study plan and learn how to avoid common exam traps like overthinking scenarios or choosing tools that don’t match the workload.
Chapters 2–5 map directly to the exam domains. Each chapter includes clear explanations at the depth the exam expects—focused on understanding, vocabulary, and scenario recognition rather than coding. You’ll repeatedly practice the key skill the AI-900 rewards: selecting the right AI approach or Azure capability for a business problem.
Chapter 6 finishes with a full mock exam split into two parts plus a structured review. You’ll learn how to analyze mistakes, spot patterns in distractor answers, and tighten your weakest domain areas quickly before exam day.
AI-900 is not a build-and-deploy exam; it’s a fundamentals and concepts exam. Success comes from understanding the language of AI workloads, the difference between ML and AI services, and how Azure groups capabilities into practical offerings. This course emphasizes exactly those skills: workload vocabulary, service-selection judgment, and scenario practice rather than hands-on building.
If you’re new to certification prep, start by setting up your learning routine and taking the first diagnostic-style practice in Chapter 1. Then move through the domain chapters and finish with the mock exam to validate readiness.
By the end, you’ll be able to confidently describe AI workloads, explain ML fundamentals on Azure, and identify when to use computer vision, NLP, and generative AI capabilities—exactly in the way Microsoft AI-900 questions expect.
Microsoft Certified Trainer (MCT)
Jordan Reyes is a Microsoft Certified Trainer specializing in Azure fundamentals and AI certification pathways. Jordan has coached beginners through Microsoft exams with a focus on practical exam strategy, domain mapping, and confidence-building practice.
AI-900 is designed to validate foundational AI literacy in the Microsoft ecosystem—especially how common AI workloads map to Azure services and responsible AI considerations. This chapter orients you to the exam’s purpose, logistics, and a practical 14-day plan tailored for non-technical professionals who want confidence with AI concepts without becoming data scientists.
The exam rewards clear workload identification (vision vs. NLP vs. generative AI vs. classic ML), recognizing what Azure service fits a use case, and applying responsible AI principles. It also tests your ability to interpret scenario-style prompts where multiple answers sound plausible. Your job is to learn the patterns: keywords, constraints, and “most appropriate” logic.
Exam Tip: When two answers both “could work,” the correct choice is usually the one that best matches the workload type with the least required complexity. AI-900 frequently favors managed Azure AI services over custom model-building unless the scenario explicitly requires custom training.
Practice note for “Understand the AI-900 exam format and domain weighting”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Register for the exam and set up your testing environment”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Build a 14-day study plan for non-technical learners”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Learn the question styles and how to avoid common traps”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 (Microsoft Azure AI Fundamentals) measures whether you can describe AI workloads and select appropriate Azure capabilities—without requiring coding or advanced math. It is ideal for business stakeholders, project managers, sales, analysts, and anyone who needs to communicate accurately about AI solutions. In exam terms, you’re expected to recognize “what kind of AI problem is this?” and “which Azure service family fits?”
The exam aligns to major domains you’ll cover in this course: (1) describing AI workloads and considerations, (2) fundamental principles of machine learning on Azure, (3) computer vision workloads, (4) natural language processing workloads, and (5) generative AI workloads on Azure (including Azure OpenAI and responsible AI). Domain weighting can change over time, but the practical takeaway is consistent: you must be comfortable distinguishing classical ML (predicting labels/values), vision (images/video), NLP (text/speech/language understanding), and generative AI (creating new text/images/code).
Non-technical learners often over-focus on definitions and under-focus on service selection. AI-900 tests applied understanding: given a scenario, identify the workload and the best-fit service category. You should be able to explain the difference between “training a custom model” vs. “using a prebuilt model/API,” and when each is appropriate.
Exam Tip: If the prompt emphasizes “quickly add AI,” “no ML expertise,” or “minimal effort,” expect Azure AI services (prebuilt). If it emphasizes “proprietary data,” “custom features,” or “control over training,” expect Azure Machine Learning or custom model workflows.
Register through Microsoft Learn, which routes scheduling to Pearson VUE. Your first practical step is ensuring your legal name matches your government-issued ID exactly—middle names and hyphens matter. Mismatches are one of the easiest ways to lose exam day time (or be turned away). From the AI-900 exam page on Microsoft Learn, select the exam, choose your language, and schedule either online proctored or at a test center.
For online proctoring, plan your testing environment early. You will need a stable internet connection, a quiet room, and a cleared desk. Many candidates fail check-in due to prohibited items in view (extra monitors, papers, phones, smartwatches) or unstable Wi-Fi. For test centers, arrive early and know parking and building entry rules. In both formats, you may be asked to complete a system test; do it at least 24–48 hours before.
Exam Tip: Schedule your exam for a time when you are mentally sharp and your environment is predictable. For online exams, avoid shared networks and times when household traffic is high. A “perfect study plan” won’t save a disrupted exam session.
Finally, lock your study plan to your scheduled date. AI-900 rewards steady exposure and repetition more than cramming, so scheduling is not just logistics—it’s the anchor for your practice cycle.
Microsoft certification exams typically use a scaled score. You’ll see a score report that indicates pass/fail and performance by skill area rather than a raw percentage. The important practical implication is that you should aim for balanced readiness across domains, not perfection in one area and weakness in another. Because domain weighting can vary, a “favorite topic” strategy is risky.
Understand retake policies and waiting periods before you sit the exam, so you can plan calmly. Most candidates don’t need a retake if they prepare with targeted practice and review, but anxiety often comes from uncertainty about what happens if you don’t pass. Read the current Microsoft policy pages before exam day (policies can change). Also be aware of exam security rules: no capturing questions, no discussing items, and no external materials during the exam.
Exam-day policies matter in subtle ways: breaks are limited and may not be allowed without impacting the session; leaving camera view (online) or accessing personal items (test center) can invalidate the exam. Plan hydration, snacks, and comfort beforehand.
Exam Tip: Treat policies as part of preparation. A calm, compliant check-in prevents avoidable cognitive load. If you spend the first 10 minutes stressed about rules, you lose attention you need for scenario questions.
In this course, we’ll map every module back to the AI-900 objectives so your study time directly improves your scoring potential.
AI-900 uses multiple question styles that test the same core ability: identify the workload and choose the best match. You will see standard multiple-choice questions, multi-select items, scenario-based prompts, and occasionally ordering or matching formats. The format matters because each requires a different reading strategy.
For MCQ and multi-select, the trap is assuming there is only one “true” statement. Read carefully for qualifiers like “best,” “most cost-effective,” “least administrative effort,” “requires custom training,” or “must run on edge devices.” Those constraints often eliminate otherwise-valid options. For scenario questions, the key is extracting the workload type and constraints: data type (text/image/audio), desired output (label, summary, generation), and operational needs (real-time, offline, compliance).
Ordering questions commonly test process understanding rather than service names—for example, the logical steps in a machine learning workflow (collect data, train, evaluate, deploy, monitor) or a responsible AI approach (identify risk, mitigate, test, monitor). Don’t overthink: choose the simplest end-to-end flow that matches the prompt’s goal.
Exam Tip: Use a two-pass approach. Pass 1: identify workload and constraint keywords. Pass 2: compare remaining options and select the “most appropriate” based on managed service fit. If an answer adds unnecessary complexity (custom model, extra infrastructure) and the prompt didn’t ask for it, it’s usually wrong.
Your goal is not memorization of every product name—it’s pattern recognition: input type, desired output, and service family alignment.
A strong AI-900 plan for non-technical learners is short, consistent, and feedback-driven. Use a 14-day schedule that mixes learning, recall, and practice. Start by mapping the five domains to your calendar so each area gets multiple exposures. Then add spaced repetition: revisit key ideas after 1 day, 3 days, and 7 days. This combats the “I understood it yesterday” illusion that disappears under exam pressure.
Here is a practical 14-day structure you can follow (adjust pacing to your availability):
Days 1–2: AI workloads + responsible AI foundations.
Days 3–5: ML principles on Azure (training vs. inference, classification/regression/clustering, model evaluation).
Days 6–7: Computer vision workloads (image classification, object detection, OCR).
Days 8–9: NLP workloads (sentiment, key phrase extraction, translation, speech basics).
Days 10–11: Generative AI + Azure OpenAI concepts (prompting, grounding, safety, use cases).
Days 12–13: Mixed-domain practice with review of misses.
Day 14: Light review, focus on traps, exam logistics check.
Exam Tip: Practice questions are only useful if you review why wrong options are wrong. Create a “trap log” with three columns: keyword you missed, correct workload/service family, and the distractor pattern (e.g., “custom training bait”).
Non-technical learners often benefit from “explain it out loud” rehearsal. If you can explain why a scenario is NLP rather than generative AI in 20 seconds, you’re building the exact skill the exam measures.
Readiness for AI-900 is less about total hours and more about coverage and consistency. Build a resource checklist that aligns to the domains and then set a readiness benchmark you must meet before exam day. Your checklist should include official learning paths, at least one reliable practice source, and a lightweight way to track mistakes (notes app or spreadsheet).
Use Microsoft Learn modules for the authoritative baseline vocabulary and service positioning. Supplement with hands-on familiarity where possible: even viewing Azure service pages and seeing how capabilities are described helps you recognize exam wording. For generative AI, ensure you understand what Azure OpenAI is (a managed Azure service that provides access to OpenAI models) and how responsible AI is applied (content filters, safety system messages, human-in-the-loop, and governance).
Exam Tip: Don’t treat “responsible AI” as a theory-only topic. Expect scenario phrasing that asks how to reduce bias, improve transparency, or protect privacy. The best answer usually combines policy/process (governance) with technical mitigations (monitoring, filtering, access control) appropriate to the service.
A practical readiness benchmark: you should be able to (1) classify a scenario into ML vs. vision vs. NLP vs. generative in under 30 seconds, (2) name the most appropriate Azure service family for that workload, and (3) state one responsible AI consideration relevant to the scenario. If you can do that reliably, you are preparing in the way AI-900 is designed to assess.
1. You are planning your AI-900 preparation and want to allocate study time efficiently. Which approach best aligns with how Microsoft certification exams are typically designed and weighted?
2. A non-technical professional is scheduling the AI-900 exam and wants to reduce the risk of test-day issues. Which action is MOST appropriate when preparing for an online proctored exam?
3. You are mentoring a learner who struggles with technical depth and has 14 days to prepare for AI-900. Which study plan best matches the exam’s intent for non-technical learners?
4. A company wants to add a feature that extracts key phrases and detects sentiment from customer emails. The team prefers the least complex solution and does NOT want to train a custom model. Which Azure approach is MOST appropriate for AI-900-style guidance?
5. You encounter an AI-900 question where two options both seem feasible. The prompt asks for the 'most appropriate' solution and mentions minimal setup and operational overhead. What is the BEST test-taking strategy based on common AI-900 traps?
This chapter targets AI-900 Domain 1: recognizing common AI workload types, matching them to business outcomes, and selecting the right Azure approach. The exam does not expect you to code models; it expects you to identify what kind of AI problem you have (prediction vs. clustering vs. vision vs. language), what success looks like, and what Azure service family best fits the constraints (time, data, interpretability, compliance, and cost).
As you study, focus on the “shape” of the question: does it describe labeled outcomes, numeric targets, groups with no labels, text understanding, images, or conversations? AI-900 often tests whether you can map a scenario to an AI workload category and then to an Azure capability. You’ll also see responsible AI concepts embedded in scenarios—especially around privacy, fairness, and transparency—so you can’t treat ethics as an afterthought.
By the end of this chapter, you should be able to recognize the most common workloads, explain why responsible AI matters, choose between Azure AI services and custom machine learning, and apply option-elimination strategies on scenario questions.
Practice note for “Recognize common AI workload types and when to use them”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Explain responsible AI principles and why they matter”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Choose the right Azure AI approach for a business scenario”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Practice: domain quiz set + mini case questions”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Most AI-900 workload questions boil down to whether you have labeled data and what type of outcome you want. If the scenario includes past examples with known outcomes (for example, “customers who churned” or “emails labeled spam/not spam”), you’re likely in supervised learning. If the scenario asks you to discover structure without labels (for example, “group customers into segments”), you’re likely in unsupervised learning.
Classification predicts a category. Typical exam phrases include “yes/no,” “A/B/C,” “fraud or not,” “which product category,” or “sentiment: positive/neutral/negative.” Regression predicts a number (continuous value), such as “price,” “demand,” “temperature,” or “time to failure.” The umbrella term prediction is broader and can include both classification and regression—watch for questions that use “predict” but require you to specify which kind.
Clustering groups items by similarity with no predefined labels. You’ll see wording like “segment,” “group,” “discover patterns,” or “identify similar customers.” Clustering is not the same as classification: classification uses labeled training data; clustering does not.
Exam Tip: If the answer choices include both “classification” and “clustering,” look for the presence of labels. A phrase like “historical data includes whether the customer churned” strongly implies classification, not clustering.
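To make the labels-versus-no-labels distinction concrete, here is a minimal Python sketch using scikit-learn (the exam itself requires no code; the data and feature meanings are invented for illustration):

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Labeled history -> supervised classification.
X = [[12, 1], [45, 0], [30, 1], [60, 0]]  # invented features, e.g., [monthly_visits, open_tickets]
y = [0, 1, 0, 1]                          # known outcomes: churned (1) or stayed (0)
clf = LogisticRegression().fit(X, y)
print(clf.predict([[50, 0]]))             # predicts a category for a new customer

# The same features with NO labels -> unsupervised clustering.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)                         # discovered groups, not predefined classes
```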
Common trap: Forecasting is often mistaken for generic regression. Forecasting is still numeric prediction, but it is specifically time series (values indexed by time). In AI-900, when time and seasonality are central to the scenario, expect forecasting terminology to appear (covered in Section 2.2).
Three workload types show up repeatedly on AI-900 because they are common business needs and have recognizable “signal words”: anomaly detection, recommendation, and forecasting. Your goal is to map the scenario to the right workload quickly and avoid confusing similar categories.
Anomaly detection identifies unusual behavior compared to a baseline. Scenarios: “detect fraudulent transactions,” “alert when sensor readings are abnormal,” or “flag unusual login activity.” These questions often emphasize rare events and deviations. Anomaly detection can be supervised (labeled anomalies) but is frequently semi-supervised/unsupervised due to limited labeled anomalies—AI-900 mainly tests the concept, not the training strategy.
Recommendation suggests items a user may like based on similarity, history, and patterns (“customers who bought X also bought Y”). It’s distinct from classification: you are not assigning one of a few categories; you are ranking or selecting likely items. The scenario often includes click history, purchases, ratings, or “personalized suggestions.”
Forecasting predicts future values over time (sales next month, staffing needs next week). The key is the time component and frequently seasonality or trends. While forecasting can be implemented with regression-like methods, the exam expects you to identify it explicitly when time series is central.
Exam Tip: If the question mentions “real-time alerts” on streaming sensor data, anomaly detection is a strong fit. If it mentions “personalized results for each user,” recommendation is usually the best match.
Common trap: Confusing anomaly detection with classification because both can output “flag/not flag.” The differentiator is the baseline behavior and rarity: anomalies are deviations from normal patterns, not just membership in a common class.
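As a concrete illustration of “deviation from a baseline,” the sketch below applies scikit-learn’s IsolationForest (one common anomaly-detection technique) to invented sensor readings; AI-900 tests only the concept, not the algorithm:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Mostly normal temperature readings plus two extreme values (invented data).
readings = np.array([[20.1], [19.8], [20.4], [20.0], [19.9], [35.7], [20.2], [4.2]])

# The detector learns what "normal" looks like and flags deviations from it.
detector = IsolationForest(contamination=0.25, random_state=0).fit(readings)
print(detector.predict(readings))  # 1 = normal, -1 = anomaly; the outliers return -1
```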
Conversational AI questions test whether you understand the end-to-end pieces required to build an agent that interacts with users in natural language. On AI-900, you are typically identifying the workload and the Azure capability category rather than designing a full architecture.
Chatbots handle common questions, triage requests, and guide users through flows (“reset password,” “check order status”). “Virtual agent” is a broader business term for automated assistants across channels (web, mobile, voice). “Copilot” usually implies an assistant embedded in a tool to help users create, summarize, or act—often powered by large language models (LLMs) and grounded on enterprise data.
Key concepts the exam may hint at include: intent (what the user wants), entities (important details like dates or order numbers), and dialog management (how the bot decides the next step). Even when generative AI is involved, many scenarios still require reliable task completion, handoff to humans, and safety controls.
Exam Tip: When the scenario emphasizes “answer questions from company policies and documents,” think beyond generic chat—look for clues that the solution must be grounded in enterprise knowledge (often paired with search/retrieval). If the scenario emphasizes “book an appointment” or “update a record,” it’s a task-oriented conversational workload.
Common trap: Assuming conversational AI always means “speech.” Many bots are text-only. If the question explicitly mentions voice calls or transcribing spoken input, then speech capabilities are part of the workload; otherwise, treat it as an NLP/conversational scenario.
AI-900 includes Responsible AI because real-world solutions can create harm if deployed without safeguards. Expect scenario language about protected attributes, regulatory requirements, explanations for decisions, or user consent. You should be able to describe core principles and recognize what a scenario is asking you to address.
Fairness means AI systems should treat people equitably. A common exam framing: “a loan approval model denies a higher percentage of applicants from a demographic group.” The question may ask what principle is impacted (fairness) or what to consider (bias detection/mitigation, representative data, evaluation across groups).
Reliability and safety means the system performs consistently under expected conditions and fails safely. Watch for scenarios involving edge cases (poor lighting in vision, unusual accents in speech, noisy sensor input) or high-stakes use (healthcare, finance). Reliability is not just uptime; it’s prediction stability and robustness.
Privacy and security involves protecting personal data, limiting access, and using data appropriately. Scenario cues include “PII,” “customer addresses,” “medical records,” “consent,” “data retention,” or “data residency.”
Transparency means stakeholders understand how and why the system behaves as it does—especially for automated decisions. This includes explainability, documentation, and being clear to users that they are interacting with an AI system.
Exam Tip: If an answer option mentions “provide explanations for model output,” that maps to transparency. If it mentions “protect personal data,” map it to privacy. Don’t overthink the tooling; AI-900 typically tests principle-to-scenario matching.
Common trap: Treating transparency as “open source the model.” Transparency in Responsible AI is about understandable behavior and communication, not publishing proprietary code.
AI-900 expects you to choose an Azure approach that fits the scenario: use prebuilt Azure AI services when they meet the need, and use custom machine learning when you need domain-specific predictions or control over features, training, and evaluation.
Use Azure AI services when the task is common and well-supported (OCR, object detection, speech-to-text, translation, sentiment, key phrase extraction). These services are optimized, require less data science expertise, and are faster to deploy. They also often include built-in considerations for scaling and security.
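As a hedged sketch of how “prebuilt, no custom training” looks in practice, here is a minimal call to the Azure AI Language service via the azure-ai-textanalytics Python package. The endpoint and key are placeholders you would take from your own resource; verify names against the current SDK docs:

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

# Placeholders: copy the endpoint and key from your own Azure AI Language resource.
client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

docs = ["Checkout was fast, but delivery took two weeks."]
print(client.analyze_sentiment(docs)[0].sentiment)      # e.g., "mixed"
print(client.extract_key_phrases(docs)[0].key_phrases)  # e.g., ["Checkout", "delivery"]
```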
Use custom ML (Azure Machine Learning) when you must predict a business-specific label or number that Azure AI services can’t provide out-of-the-box (churn risk for your product, equipment failure probability for your sensors, custom pricing). Custom ML is also preferred when you need to train on proprietary features, control the training process, or meet specific evaluation/interpretability requirements.
Exam Tip: If the scenario describes a task that sounds like a general human sense (see, read, hear, translate, recognize common entities), it often maps to Azure AI services. If it describes a unique business outcome derived from your organization’s historical records, it often maps to custom ML.
Common trap: Picking custom ML “because it’s more powerful.” The exam rewards choosing the simplest approach that meets requirements. If a prebuilt service solves it with minimal training data and faster time-to-value, that is frequently the intended answer.
This chapter’s practice focus is not memorizing definitions in isolation; it’s learning a repeatable method to decode scenarios. AI-900 scenarios are short but dense: they include just enough detail to indicate labels vs. no labels, numeric vs. categorical output, and whether the input is text, images, speech, or time series.
Step 1: Identify the input modality. If the scenario is about images/video, think computer vision workloads. If it’s about text documents, emails, chats, think NLP. If it’s audio, think speech. If it’s structured tables (transactions, customer records), think ML prediction/clustering/anomaly detection.
Step 2: Identify the output type. Category → classification. Number → regression. Groups without labels → clustering. Unusual events → anomaly detection. Time-indexed future values → forecasting. Personalized ranked items → recommendation.
Step 3: Choose the Azure approach. If it’s a common perception/language task, lean toward Azure AI services. If it’s a business-specific score trained on your historical outcomes, lean toward custom ML on Azure Machine Learning. Then cross-check for Responsible AI requirements: privacy constraints, fairness impacts, need for explanations, and reliability expectations.
Exam Tip: Use “option elimination” aggressively. If the scenario says “predict next month’s sales,” eliminate clustering and classification immediately. If it says “group similar customers,” eliminate regression and forecasting. Your speed and accuracy improve when you remove mismatched workload types first.
Common trap: Overfitting to a single keyword. For example, “predict” appears everywhere, but “predict whether” implies classification, “predict how many” implies regression/forecasting, and “predict which products to show” implies recommendation. Always anchor on the output format and the business action that follows.
Finally, keep Responsible AI in your elimination toolkit: if a scenario includes sensitive decisions (employment, lending, healthcare), prioritize answers that reflect fairness, transparency, and privacy considerations as part of the workload selection—not as optional add-ons after deployment.
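The three steps can be summarized as a simple lookup. The sketch below is a study aid only, not an Azure API; the categories come directly from this section:

```python
def triage(modality: str, output: str) -> str:
    """Map input modality and desired output to a workload category from this chapter."""
    if modality in ("image", "video"):
        return "computer vision"
    if modality in ("text", "chat"):
        return "NLP"
    if modality == "audio":
        return "speech"
    # Structured/tabular data: decide by the shape of the output.
    return {
        "category": "classification",
        "number": "regression",
        "groups without labels": "clustering",
        "unusual events": "anomaly detection",
        "future values over time": "forecasting",
        "ranked items per user": "recommendation",
    }.get(output, "re-read the scenario for more clues")

print(triage("tabular", "future values over time"))  # forecasting
print(triage("image", "category"))                   # computer vision
```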
1. A retail company wants to forecast next month’s sales for each store based on historical sales, promotions, and local events. The company has labeled historical data (sales totals). Which AI workload type best fits this scenario?
2. A bank wants to automatically group new customers into segments based on similarities in spending patterns, without any predefined segment labels. Which AI workload type should the bank use?
3. A call center wants a chatbot that can answer questions about store hours and return policies and escalate complex issues to a human agent. Which Azure AI capability is the best fit?
4. A healthcare provider is deploying an AI model that helps prioritize patient follow-ups. During review, stakeholders ask for clear reasons why the model recommends one patient over another. Which responsible AI principle is being emphasized?
5. A manufacturing company wants to detect defects in product images on an assembly line. They need a quick solution and do not have a large in-house data science team. Which approach is most appropriate?
This chapter maps directly to AI-900 Domain 2: you are expected to recognize core machine learning (ML) terminology, understand the difference between training and inference, and connect those ideas to how Azure delivers ML through Azure Machine Learning. The exam does not ask you to code models, but it does test whether you can read a scenario, identify the type of ML problem, choose the right approach at a conceptual level, and interpret basic evaluation outcomes.
As a non-technical professional, your advantage is thinking in outcomes: “What decision is the model supporting?” and “What evidence shows it works?” The exam rewards clear problem framing and correct vocabulary more than deep math. You will see many questions where multiple answers seem plausible; your job is to spot the one that matches the workload, the data you have (labeled or unlabeled), and the metric that fits the business risk.
Exam Tip: When a scenario describes a model being used in production (scoring, predicting, classifying new inputs), that is inference. When it describes learning from historical data, tuning, or evaluating, that is training. Many AI-900 distractors swap these terms.
Throughout this chapter, we will build: (1) core ML terminology used on the exam, (2) training vs. inference and model lifecycle basics, (3) how those map to Azure Machine Learning capabilities, and (4) practice-style reasoning for model choice and metric interpretation.
Practice note for “Master core ML terminology used on the exam”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Understand training vs inference and model lifecycle basics”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Connect ML concepts to Azure Machine Learning capabilities”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Practice: ML fundamentals question set + scenario items”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 expects you to speak the “language of ML.” Start with the dataset: a collection of examples (rows) with attributes (columns). In supervised learning, each example includes features (inputs) and a label (the correct output). If you’re predicting house price, features might include square footage and location; the label is the price. If you’re classifying email as spam/not spam, features come from the email content; the label is the spam flag.
Next is the split into training, validation, and test datasets. Training data teaches the model. Validation data is used during model development to tune choices (hyperparameters, model selection). Test data is held back until the end to estimate how the model will perform on new, unseen data. The exam often checks whether you understand that the test set should not influence training decisions.
Exam Tip: If a question says “evaluate the final model on unseen data,” it’s pointing to the test set. If it says “tune the model during training,” it’s pointing to validation.
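A minimal scikit-learn sketch of the two-stage split (sizes invented; the point is that the test set is carved off first and never used for tuning):

```python
from sklearn.model_selection import train_test_split

X = list(range(100))       # stand-in features
y = [i % 2 for i in X]     # stand-in labels

# Hold out the test set first; training and tuning must never see it.
X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Split the remainder into training and validation sets for tuning decisions.
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```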
A common trap is mixing up “model training” with “data preparation.” The exam may describe cleaning data, handling missing values, or encoding categories; these are data preprocessing steps that happen before or as part of training, but they are not inference. Another trap: assuming more data always solves problems. More data helps only if it is relevant, representative, and labeled correctly for the target task.
Finally, remember the model lifecycle at exam level: define the problem and success metric, prepare data, train, validate/tune, test, deploy for inference, and monitor. Monitoring matters because data can drift over time, reducing performance in production even if test results looked strong.
The AI-900 exam expects you to decide whether a scenario is best solved with supervised or unsupervised learning. The simplest discriminator is labels: if you have known outcomes to learn from, it’s supervised; if you only have raw data and want to discover structure, it’s unsupervised.
Supervised learning includes classification (predicting a category, such as spam/not spam) and regression (predicting a numeric value, such as a price).
Unsupervised learning includes clustering (discovering natural groupings in unlabeled data, such as customer segments).
Exam Tip: Watch for phrasing like “we don’t know the categories in advance” or “find natural groupings”—that signals clustering/unsupervised. Phrases like “predict whether” or “predict the value” usually imply supervised learning.
Common exam trap: confusing classification with clustering. Classification requires predefined labeled classes (even if more than two). Clustering outputs groups that may not map to known business categories until a human interprets them.
Another trap is assuming deep learning is always required. AI-900 scenarios often include distractors implying “neural networks” are the default. The better answer is usually the approach that matches the output type (category vs number vs grouping) rather than the most advanced-sounding model.
Evaluation metrics appear on AI-900 at a practical, decision-making level: you must choose a metric aligned to the task and interpret what “good” means. Begin by matching the metric to the model type. For classification, you commonly see accuracy, precision, recall, and F1-score. For regression, you commonly see MAE (mean absolute error), MSE (mean squared error), and RMSE (root mean squared error).
Accuracy is the fraction of correct predictions overall. It can be misleading when classes are imbalanced (e.g., fraud is rare). That is where precision and recall matter: precision is the fraction of positive predictions that are actually positive (how often a flag is correct), and recall is the fraction of actual positives the model finds (how few true cases it misses).
Exam Tip: If false positives are expensive (e.g., blocking legitimate payments), prioritize precision. If false negatives are dangerous (e.g., missing cancer detection), prioritize recall. The exam often embeds cost/risk language to guide you.
For regression, RMSE summarizes prediction error magnitude; lower is better. Because errors are squared before averaging, RMSE penalizes large errors more heavily than MAE. In business terms, RMSE is useful when large mistakes are disproportionately harmful (e.g., under-forecasting inventory by a lot).
A common trap is picking “accuracy” for every classification scenario. If the data is imbalanced, a model can have high accuracy by always predicting the majority class. AI-900 may hint at imbalance by stating “only 1% of transactions are fraudulent” or similar—this is your signal to consider precision/recall.
Another trap is mixing metrics across problem types: RMSE is not a classification metric, and accuracy is not a regression metric. On the exam, eliminate answers that don’t match the workload first, then select the metric that reflects the stated business risk.
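A short worked example of both traps, using invented numbers (scikit-learn for the classification metrics, plain NumPy for MAE/RMSE):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Imbalanced toy data: only 2 of 20 cases are fraud (label 1).
y_true = [0] * 18 + [1] * 2
y_pred = [0] * 20  # a lazy model that always predicts "not fraud"

print(accuracy_score(y_true, y_pred))                    # 0.90 -- looks strong...
print(precision_score(y_true, y_pred, zero_division=0))  # 0.0
print(recall_score(y_true, y_pred, zero_division=0))     # 0.0 -- misses every fraud case

# Regression: one large miss moves RMSE much more than MAE.
actual = np.array([100, 100, 100, 100])
pred = np.array([101, 99, 100, 140])
print(np.mean(np.abs(actual - pred)))          # MAE  = 10.5
print(np.sqrt(np.mean((actual - pred) ** 2)))  # RMSE ~ 20.0
```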
AI-900 tests overfitting and underfitting as conceptual failure modes you can recognize from training vs. validation/test behavior. Overfitting occurs when a model learns the training data too well, including noise, and performs poorly on new data. Typical symptom: very strong training performance but noticeably weaker validation/test performance.
Underfitting occurs when the model is too simple to capture the underlying pattern. Symptom: poor performance on both training and validation/test sets. The exam frequently uses “the model performs poorly even on training data” to point you toward underfitting.
These concepts relate to bias and variance in the ML sense: high bias means the model is too simple and systematically misses the underlying pattern (underfitting), while high variance means the model is overly sensitive to the training data and fails to generalize (overfitting).
Exam Tip: When you see “generalize” or “performs well on new data,” the question is about overfitting/underfitting. Compare training vs. test results in the stem; that comparison is usually the entire point.
At AI-900 level, know common mitigations without going deep: to reduce overfitting, you can use more representative data, simplify the model, add regularization, or use early stopping; to reduce underfitting, you may use a more expressive model, add informative features, or train longer (when appropriate).
Common trap: confusing statistical “bias” (systematic error in an estimator) with “bias” in responsible AI/fairness. In this domain (Fundamental principles of ML), bias/variance refers to model behavior and generalization, not demographic fairness (though both topics exist elsewhere in AI-900). If the question mentions groups, fairness, or protected attributes, you’re likely in a responsible AI context, not bias/variance tradeoff.
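You can see both failure modes by comparing a too-simple and a too-flexible model on the same synthetic data (a minimal sketch, assuming scikit-learn):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, depth in [("underfit (depth=1)", 1), ("overfit (unbounded depth)", None)]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    # Underfitting: weak on BOTH sets. Overfitting: near-perfect train, weaker test.
    print(name,
          "train:", round(model.score(X_tr, y_tr), 2),
          "test:", round(model.score(X_te, y_te), 2))
```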
Azure Machine Learning (Azure ML) is the core Azure service for building, training, and deploying ML models. AI-900 expects you to recognize its major components and how they map to the model lifecycle: organize work, run training, track results, and deploy for inference.
A workspace is the top-level container for Azure ML resources—think of it as the “project home” that stores connections, compute, datasets, models, endpoints, and logs. An experiment is a logical grouping of training runs; each run captures parameters, metrics, and artifacts so you can compare approaches and reproduce results. If a scenario describes “tracking runs” or “comparing models,” it is hinting at experiments and run history.
Pipelines orchestrate repeatable ML workflows: data prep, training, evaluation, and deployment steps chained together. Pipelines support automation and MLOps-style consistency (repeatability is a frequent exam keyword). Use pipelines when the question emphasizes repeatable processes, scheduled retraining, or standardized steps across teams.
AutoML (Automated ML) helps select algorithms and tune hyperparameters automatically for tasks like classification, regression, and time-series forecasting. The exam angle is when you want to build a good baseline quickly without deep ML expertise—AutoML is a strong match. However, avoid the trap of selecting AutoML for everything: if the stem emphasizes “full control,” “custom architecture,” or specialized deep learning, AutoML may not be the best conceptual fit.
Exam Tip: Map the scenario verbs to Azure ML components: “organize and govern” → workspace; “compare runs/metrics” → experiment; “repeatable workflow” → pipeline; “automatic model selection/tuning” → AutoML; “real-time scoring” → endpoint/inference.
Finally, connect training vs inference to Azure deliverables: training jobs produce a model artifact; deployment publishes that model behind an endpoint for applications to call. If a question asks what is required to “use the model in an app,” look for deployment/endpoint language rather than training resources.
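A hedged sketch of what “calling the deployed model” looks like from an application. The URL, key, and input schema below are hypothetical; a real Azure ML online endpoint gives you its own scoring URI and key:

```python
import json
import requests

# Hypothetical values; copy the real scoring URI and key from your deployed endpoint.
SCORING_URI = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"
API_KEY = "<your-endpoint-key>"

payload = {"data": [{"tenure_months": 18, "monthly_spend": 42.5}]}  # invented schema
response = requests.post(
    SCORING_URI,
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    data=json.dumps(payload),
)
print(response.json())  # the model's prediction for the new record
```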
This section prepares you for the two most common AI-900 reasoning tasks in Domain 2: selecting an appropriate model type (classification vs regression vs clustering) and interpreting which metric best matches the business goal. The exam rarely wants an algorithm name; it wants the correct category of approach and the correct evaluation lens.
For “choose-the-model” scenarios, use a quick elimination checklist:
- Is the output a category? Classification.
- Is the output a number? Regression.
- Are there no labels, with a goal of grouping similar items? Clustering.
- Is the prediction a future value indexed by time? Forecasting.
Exam Tip: Beware of distractors that mention “prediction” as a synonym for everything. In ML terms, classification and regression both “predict,” but the exam expects you to anchor on output type (categorical vs numeric) and label availability.
For metric interpretation, pair the metric to the risk described in the scenario. If the scenario emphasizes avoiding false alarms (e.g., too many customers incorrectly flagged), precision is usually more important. If it emphasizes not missing true cases (e.g., missing fraud), recall rises in priority. If it emphasizes overall correctness with balanced classes, accuracy can be reasonable. For regression, if the scenario emphasizes “large errors are especially bad,” RMSE is often favored over MAE.
Common trap: interpreting a “good” metric without context. The exam may present a high accuracy model in an imbalanced scenario—your job is to recognize that accuracy alone can hide poor detection of the minority class. Another trap is assuming one metric must be maximized in isolation; in real systems there is a tradeoff, but the exam will usually provide a risk cue that clearly points to the intended metric.
As you review practice items, discipline yourself to underline three clues in every stem: (1) what the output looks like, (2) whether labels exist, and (3) what kind of error is most costly. Those three clues typically determine the correct answer faster than reading the options repeatedly.
1. A retail company has a trained model that predicts whether a customer will churn. The company deploys the model and uses it to score new customer records each night. Which phase of the ML lifecycle is the nightly scoring activity?
2. A bank wants to categorize loan applications as Approved or Denied based on historical applications that already include the final decision. What type of machine learning is this?
3. You are reviewing an ML project in Azure Machine Learning. The team says, "We split our labeled data into training and test sets to measure how well the model generalizes." What is the primary purpose of the test set?
4. A manufacturer wants to detect defective parts by identifying items that are unusually different from typical production output. The company has many measurements but no labels indicating which parts were defective. Which approach is most appropriate?
5. A team uses Azure Machine Learning to manage experiments. They compare two models and select the one that achieves higher accuracy on the validation data. Which Azure Machine Learning capability best aligns with tracking these runs and their metrics over time?
AI-900 expects you to recognize common computer vision tasks, the typical outputs they produce, and which Azure services are designed for those tasks. As a non-technical professional, you won’t be asked to build models or write code—but you will be tested on identifying the right workload and the right service from a short scenario, plus understanding what the service returns (tags, bounding boxes, extracted text, confidence scores) and what limitations apply.
This chapter maps directly to AI-900 Domain 3 (computer vision workloads on Azure). You’ll practice three core task types—image classification, object detection, and OCR—then connect them to Azure AI Vision and related services. You’ll also cover document-focused extraction (where “vision” overlaps with documents), and you’ll learn exam-grade cues for selecting the correct Azure offering quickly. Exam Tip: On AI-900, the fastest way to the right answer is to name the task output: “label” (classification), “box + label” (detection), “text” (OCR), or “key-value tables/fields” (document intelligence).
Practice note for “Identify key computer vision tasks and typical outputs”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Match scenarios to Azure vision services”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Understand OCR, detection, and classification at exam depth”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Practice: vision services question set + mini case”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Most AI-900 vision questions are really “workload identification” questions. Start by classifying the request into one of three core task types that appear repeatedly on the exam: image classification, object detection, and OCR (optical character recognition). Each has a distinct output, which the exam uses as a clue.
Image classification assigns one or more labels to an entire image (or a cropped region if the scenario mentions it). The output is typically a set of categories/tags with confidence scores (for example, “dog: 0.92”). This is ideal when the question is “What is this image?” rather than “Where is the object?” Common scenarios: categorizing product photos, identifying whether an image contains a type of content, or adding searchable tags to media.
Object detection identifies and locates objects within an image. The output includes bounding boxes (coordinates) plus labels and confidence. If the scenario mentions “counting,” “locating,” “drawing boxes,” “tracking items,” or “finding where,” think detection. Exam Tip: “Multiple objects in the same image” often signals detection; classification can be multi-label, but it doesn’t tell you where the objects are.
OCR extracts printed or handwritten text from images, returning recognized text and often positional information. Scenario cues include scanned documents, street signs, serial numbers, containers, invoices, screenshots, and “extract text from images.” A frequent trap is confusing OCR with full document extraction: OCR gives you text; document intelligence gives structured fields (like vendor, total, date) derived from both text and layout.
Common trap: “Describe what’s in the picture” can mean tags/captions (image analysis) rather than custom model training. In AI-900, default to prebuilt analysis unless the scenario explicitly demands a custom taxonomy unique to the business.
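To anchor the “name the output” habit, here are illustrative result shapes for the three task types (invented values, not exact Azure response schemas):

```python
# Classification: labels + confidence for the whole image.
classification = {"tags": [{"name": "dog", "confidence": 0.92},
                           {"name": "outdoor", "confidence": 0.81}]}

# Object detection: labels + confidence + bounding boxes (where each object is).
detection = {"objects": [{"label": "person", "confidence": 0.88,
                          "box": {"x": 34, "y": 50, "w": 120, "h": 240}}]}

# OCR: the recognized text itself, usually with position information.
ocr = {"lines": [{"text": "LOT 4821-A",
                  "polygon": [10, 10, 150, 10, 150, 30, 10, 30]}]}

for name, result in [("classification", classification),
                     ("detection", detection), ("ocr", ocr)]:
    print(name, "->", list(result.keys()))
```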
Azure AI Vision (often referred to as Vision) provides prebuilt image analysis capabilities that map cleanly to AI-900 objectives. When the scenario needs “understand an image” without training a model, think Azure AI Vision first. At exam depth, focus on what it can return: tags, captions/descriptions, detected objects, image metadata, and OCR-style text extraction for images.
Typical use cases include auto-tagging a photo library, generating captions for accessibility, detecting common objects in retail or manufacturing images, and extracting text from signs or labels. The exam frequently uses “quickly add AI to an app” wording—this points to prebuilt Vision rather than a custom ML build. Exam Tip: “Prebuilt,” “no training,” “easy integration,” and “common objects” are strong hints to choose Azure AI Vision.
Understand how to map phrasing to features: if the requirement is “generate a sentence describing the image,” think captioning/description; if it’s “identify objects and where they are,” think object detection; if it’s “return keywords,” think tagging. In service-selection questions, the wrong choices often include unrelated workloads (NLP services for text sentiment, forecasting services, or generic “Azure Machine Learning” when no custom training is needed).
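If it helps to see that mapping as logic, here is a toy Python helper that encodes it. The keyword lists are study shortcuts, not an official rubric.

```python
def pick_vision_feature(requirement: str) -> str:
    """Map exam-style phrasing to the Vision feature it implies.
    Keywords are illustrative study cues, not exhaustive."""
    req = requirement.lower()
    if "sentence describing" in req or "caption" in req:
        return "captioning/description"
    if "where" in req or "locate" in req or "count" in req:
        return "object detection"
    if "text" in req or "read" in req:
        return "OCR / text extraction"
    return "tagging (keywords)"

print(pick_vision_feature("generate a sentence describing the image"))
# -> captioning/description
print(pick_vision_feature("identify objects and where they are"))
# -> object detection
```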
Common trap: Don’t assume “computer vision” means you must build a model. AI-900 leans toward recognizing when a managed, prebuilt API is the intended answer, especially for standard tasks like describing, tagging, and reading text.
Face-related scenarios are designed to test two things: (1) you can distinguish face detection/analysis concepts from general image analysis, and (2) you understand responsible AI and safety boundaries. Conceptually, “face detection” means finding faces in an image (often returning a rectangle/bounding box). “Face verification” typically means confirming whether two images are the same person (1:1). “Face identification” means matching a face against a known set of people (1:many). On AI-900, watch for these ratios in wording—verification vs identification is a classic exam discriminator.
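A minimal sketch can pin down the 1:1 vs 1:many distinction. Everything here is hypothetical: similarity() is a stub standing in for a real face-template comparison, and the 0.8 threshold is illustrative.

```python
# Toy sketch of verification (1:1) vs identification (1:many).

def similarity(a, b) -> float:
    # Stand-in score so the sketch runs; real systems compare numeric
    # face templates under strict governance, never raw labels.
    return 0.9 if a == b else 0.3

def verify(face_a, face_b) -> bool:
    """Verification: are these two images the SAME person? (1:1)"""
    return similarity(face_a, face_b) > 0.8

def identify(face, enrolled: dict):
    """Identification: WHO is this, among enrolled people? (1:many)"""
    scores = {name: ref_score for name, ref_score in
              ((name, similarity(face, ref)) for name, ref in enrolled.items())}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.8 else None  # no confident match

print(verify("img_a", "img_a"))                              # True  (1:1)
print(identify("img_b", {"Ana": "img_b", "Raj": "img_c"}))   # 'Ana' (1:many)
```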
Just as important, Microsoft places restrictions on some face-related capabilities, especially those that infer sensitive attributes or enable risky identification uses. The exam may include policy-aware phrasing like “appropriate use,” “consent,” “privacy,” and “bias.” Your best answer will acknowledge that face solutions should be used responsibly, with clear user consent, data minimization, and transparency.
Exam Tip: If a scenario sounds like surveillance, covert tracking, or identifying people without consent, expect the question to steer you away from it (or test your awareness that such uses are restricted or require strict governance). In AI-900, “responsible AI” is not just theory—it’s a service-selection clue.
Common trap: Confusing face detection with object detection. A face is an object, but the exam expects you to recognize face-specific scenarios and the associated privacy and compliance considerations.
When the input is a document (forms, receipts, invoices, IDs, contracts), the exam often wants you to choose document-focused extraction rather than generic OCR. Azure AI Document Intelligence is designed to extract structured information—fields, key-value pairs, tables, line items—not just raw text. This is where “vision” crosses into business automation: turning messy document images into data you can store in a database.
At exam depth, know the difference between these levels of output: (1) raw recognized text, which plain OCR returns and which you still have to parse; (2) text with positional and layout information (lines, words, coordinates); and (3) structured fields such as key-value pairs, tables, and line items (vendor, total, date), which Document Intelligence returns ready for a database.
Scenario cues for Document Intelligence include “extract total from receipts,” “populate a form automatically,” “capture invoice line items,” “digitize paper forms,” and “reduce manual data entry.” If the scenario says “structured extraction,” “key-value pairs,” or “tables,” treat that as an unmistakable signal.
Exam Tip: If the business outcome is “send extracted fields to an ERP/CRM,” Document Intelligence is usually the best match. OCR is a component; Document Intelligence is the end-to-end document-to-data tool.
Common trap: Selecting Vision OCR for receipts/invoices because “it reads text.” The exam expects you to choose the service that matches the business requirement (structured fields), not just the technical sub-step (text recognition).
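Seen side by side, the distinction is hard to miss. The field names below are hypothetical, not actual service schemas:

```python
# Illustrative contrast between OCR output and document extraction output.

ocr_output = {
    "text": "ACME Corp Invoice INV-10293 Total: $1,250.00 Due: 2024-07-01"
}   # raw recognized text -- you still have to parse it yourself

document_intelligence_output = {      # structured, database-ready fields
    "vendor": {"value": "ACME Corp", "confidence": 0.97},
    "invoice_number": {"value": "INV-10293", "confidence": 0.99},
    "total": {"value": 1250.00, "confidence": 0.96},
    "due_date": {"value": "2024-07-01", "confidence": 0.95},
    "line_items": [
        {"description": "Widgets", "quantity": 10, "amount": 1250.00},
    ],
}
```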
AI-900 doesn’t test coding, but it does test whether you understand how vision solutions behave in production: what goes in, what comes out, and what factors affect reliability. Inputs are commonly images (JPEG/PNG), scanned PDFs, camera frames, or stored blobs. Outputs are structured JSON-like results such as tags, captions, bounding boxes, extracted text, and extracted fields—with confidence scores attached.
Confidence scores matter because the “right” design often includes a threshold or a human review step. Low-confidence OCR results or ambiguous classifications can route to manual verification. Exam Tip: When you see “human in the loop,” “review,” or “approve before posting,” think confidence thresholds and fallback workflows—this aligns with responsible, reliable AI design.
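A minimal routing sketch, assuming an illustrative 0.85 threshold that you would tune per workload:

```python
# Human-in-the-loop routing based on confidence; threshold is an assumption.

REVIEW_THRESHOLD = 0.85

def route(extracted_field: dict) -> str:
    if extracted_field["confidence"] >= REVIEW_THRESHOLD:
        return "auto-post"            # high confidence: straight through
    return "manual-review"            # low confidence: a human verifies

print(route({"value": "INV-10293", "confidence": 0.99}))  # auto-post
print(route({"value": "INV-I0293", "confidence": 0.41}))  # manual-review
```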
Latency and throughput show up as practical constraints in scenarios: real-time camera use implies low latency; back-office document processing can be batch. If a scenario says “near real-time,” “live video,” or “on a production line,” prefer services and patterns that support fast inference and consider edge processing concepts (even if the exam stays high level). If it says “process thousands of receipts nightly,” think batch processing and cost-efficient scaling.
Common trap: Treating AI outputs as “facts.” The exam expects you to view results as probabilistic and to design for uncertainty using confidence scores and validation steps.
AI-900 vision questions are commonly scenario-based: you are given a short business goal and must choose the correct computer vision task and Azure service. Your method should be consistent: (1) underline the desired output (text, labels, boxes, fields), (2) decide the workload type (classification, detection, OCR, document extraction), and (3) map to the most direct Azure service (Vision for image analysis/OCR, Document Intelligence for structured document extraction, face-related capabilities only when explicitly needed and appropriate).
In mini-case style items, expect extra details that are distractors (device type, industry, or storage location). Focus on what the system must return. For example, “detect defects” often implies object detection (boxes) or classification (pass/fail) depending on whether location is required. “Auto-fill expense reports from receipts” strongly implies Document Intelligence rather than plain OCR.
Exam Tip: When two answers both seem plausible, pick the one that is more “purpose-built.” The exam generally rewards choosing a specialized service (Document Intelligence for forms, Vision for image analysis) over a general platform option (Azure Machine Learning) unless custom training is explicitly stated.
Common trap: Over-indexing on keywords like “AI” or “machine learning” and choosing Azure Machine Learning by default. AI-900 expects you to recognize that many vision solutions are delivered as managed cognitive services with prebuilt models, not custom ML projects.
1. A retail company wants to count how many people enter a store from a security camera image. The solution must return the location of each person in the image. Which computer vision task best fits this requirement?
2. A company wants to extract the text from photos of street signs taken by employees using mobile phones. Which Azure service capability is the best match?
3. You are reviewing outputs from an Azure computer vision solution. The result includes labels such as "car" and "person" with confidence scores, and each label includes a rectangular region within the image. What type of workload is being performed?
4. A manufacturing company wants to categorize product photos into predefined categories (for example, "shoe", "shirt", "hat") for an online catalog. The solution does not need to identify where items appear in the image. Which output is most associated with the correct task?
5. A company wants to process scanned invoices and extract structured data such as vendor name, invoice number, dates, and totals. Which Azure service is the best fit for this requirement at AI-900 exam depth?
This chapter targets the AI-900 skills measured for NLP workloads on Azure and Generative AI workloads on Azure. As a non-technical professional, your exam success comes from mapping a business need ("What are we trying to do with text?") to the correct Azure service and the correct type of AI approach (classic NLP vs. generative AI). Expect questions that describe scenarios in plain language (customer feedback, multilingual support, internal knowledge search, or content generation) and ask you to choose the most appropriate capability or service.
The exam also checks that you can explain, at a high level, how generative AI works (prompts, tokens, embeddings, retrieval) and how to apply responsible AI concepts (safety, privacy, human review). A common trap is picking a “bigger” tool (Azure OpenAI) when a simpler NLP feature is asked for (sentiment analysis, entity extraction, translation). Another trap is confusing service families: Azure AI Language vs. Azure AI Translator vs. bot services vs. Azure OpenAI.
In the sections that follow, you’ll build a mental decision tree: (1) Identify the NLP task, (2) choose the right Azure service, (3) recognize when generative AI is appropriate, and (4) layer responsible AI controls.
Practice note for this chapter's objectives (understand NLP tasks and map them to Azure services, explain generative AI and foundation model basics for non-technical roles, apply responsible AI and safety concepts to GenAI scenarios, and work through the NLP + GenAI mixed question set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 commonly frames NLP as “extract meaning from text” or “understand and transform language.” The exam expects you to recognize core NLP tasks by their business outputs. Sentiment analysis classifies text as positive/negative/neutral (often with confidence scores). Scenario cues: “measure customer satisfaction from reviews,” “track brand perception,” or “monitor social posts.” Key phrase extraction pulls representative phrases (themes) from text, useful for “what are people talking about?” Entity recognition (NER) identifies “things” such as people, organizations, locations, dates, or custom entities; cues include “extract product names from tickets” or “detect medical terms.”
Summarization reduces long text into shorter form; AI-900 may mention "summarize call center transcripts" or "create an executive summary of documents." Summarization is available as a classic NLP feature in Azure AI Language and can also be done with generative AI; your job is to pick the service that best matches the capability the question asks for (see the traps below). Translation converts text between languages; cues include "support multilingual users" and "translate UI strings or messages."
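The exam won't ask you to code, but if you're curious what these tasks look like in practice, here is a minimal sketch assuming the azure-ai-textanalytics SDK; the endpoint and key are placeholders, and exact response fields can vary by SDK version.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

# Placeholders -- substitute your own Language resource values.
client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

docs = ["The checkout was fast, but delivery to Seattle took two weeks."]

sentiment = client.analyze_sentiment(docs)[0]
print(sentiment.sentiment, sentiment.confidence_scores)   # e.g. 'mixed' + scores

phrases = client.extract_key_phrases(docs)[0].key_phrases
print(phrases)                                            # themes in the text

for ent in client.recognize_entities(docs)[0].entities:
    print(ent.text, ent.category)                         # e.g. 'Seattle' Location
```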
Exam Tip: When the scenario asks for “identify” or “extract” specific information from text (entities, phrases), choose an NLP extraction feature—not a chatbot or generative model. Generative AI is more likely when the scenario requires creating new text, rewriting, or synthesizing across multiple sources.
Common trap: “Summarize documents” can tempt you to pick Azure OpenAI automatically. If the question focuses on an out-of-the-box NLP analysis feature (summarization as a language analytics task), Azure AI Language is usually the intended answer. If it emphasizes conversational drafting, style, or flexible natural language generation, Azure OpenAI may be appropriate.
On AI-900, the “right service” questions are often about picking between Azure AI Language and Azure AI Translator. The simplest way to differentiate: Azure AI Language is for analyzing and understanding text (sentiment, entities, key phrases, summarization, classification, and conversational language understanding). Azure AI Translator is for converting text from one language to another (plus related linguistic transforms, such as language detection).
Look for action verbs. If the scenario says: “detect sentiment,” “extract entities,” “classify emails,” “summarize text,” or “identify key topics,” that maps to Azure AI Language. If it says: “translate support tickets into English,” “localize content,” “provide real-time chat translation,” or “translate product descriptions for international markets,” that maps to Azure AI Translator.
Exam Tip: Many items hide the service choice behind business wording. Translate = “make it available in multiple languages.” Sentiment = “measure tone.” Entities = “pull out names/places.” If the output is still in the same language but enriched with insights (labels, scores, extracted items), think Azure AI Language.
Common trap: Confusing “language detection” with “language understanding.” Translator can detect the language for translation workflows, but intent/entity extraction for chat or commands is a Language capability. Another trap is overthinking implementation (SDKs, REST, pricing tiers). AI-900 stays at the capability/use-case level: choose the service family that matches the task.
As a non-technical pro, you should also be ready to explain the value proposition: Azure AI Language helps organizations analyze customer feedback, automate document processing, and categorize communications. Azure AI Translator enables global reach by removing language barriers in apps, websites, and support operations.
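The one-line discriminator can even be written as toy logic. The keywords below are illustrative cues for study, not an official decision rule:

```python
def pick_language_service(scenario: str) -> str:
    """Toy discriminator: enrich text with insights -> Language;
    change the text's language -> Translator."""
    s = scenario.lower()
    if any(k in s for k in ("translate", "localize", "multiple languages")):
        return "Azure AI Translator"
    return "Azure AI Language"

print(pick_language_service("translate support tickets into English"))
# -> Azure AI Translator
print(pick_language_service("detect sentiment in product reviews"))
# -> Azure AI Language
```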
Conversational NLP questions test whether you can distinguish between (1) a chat experience (the “bot”) and (2) the language intelligence behind it (intent recognition, entity extraction, or Q&A retrieval). Conceptually, a bot is the interface that manages conversation flow (greetings, follow-ups, handoff). Under the hood, conversational NLP can be implemented in two common patterns: intent recognition and QnA/knowledge base.
Intent recognition is used when users express goals (“reset my password,” “check order status”). The system predicts the intent and extracts entities (“order number 12345”). In Azure, this maps to conversational language understanding capabilities within Azure AI Language. QnA patterns are used when users ask factual questions (“What are your store hours?”) and you want the best matching answer from a set of approved content. In Azure exam language, this aligns with question answering features (a curated knowledge base that returns the most relevant answer).
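The contrast between the two patterns is clearest in their outputs. A sketch with hypothetical field names:

```python
# Illustrative (hypothetical) result shapes for the two conversational patterns.

utterance = "check the status of order number 12345"

intent_result = {
    "topIntent": "CheckOrderStatus",                   # what the user wants to DO
    "confidence": 0.94,
    "entities": [
        {"category": "OrderNumber", "text": "12345"},  # extracted parameters
    ],
}

# A QnA-style result, by contrast, returns the best-matching APPROVED answer:
qna_result = {
    "answer": "Our stores are open 9am-9pm, Monday through Saturday.",
    "confidence": 0.88,
}
```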
Exam Tip: Identify whether the user is “asking for a fact” (QnA) or “trying to do an action” (intent). If the question mentions “knowledge base,” “FAQ,” or “return the best answer,” it’s QnA. If it mentions “detect intent,” “extract parameters,” or “trigger a workflow,” it’s intent recognition.
Common trap: Assuming “chat” always means generative AI. AI-900 includes classic bots and QnA solutions that do not generate new text beyond approved answers. If the scenario stresses “approved responses,” “consistent answers,” or “compliance,” the intended approach is often QnA retrieval rather than free-form generation.
For exam readiness, keep the mental model: the bot is the channel experience; Azure AI Language provides the understanding layer. Your job is to map the scenario to the correct conversational pattern.
AI-900 generative AI questions focus on foundational concepts rather than deep model training. Generative AI creates new content (text, images, code) based on patterns learned from large datasets. You’ll see references to foundation models (large pre-trained models) that are adapted through prompting and (sometimes) fine-tuning.
Know the vocabulary: a prompt is the input instruction and context you give the model. Models process text in tokens (chunks of text), and token limits affect how much input/output you can include. Many exam scenarios imply this constraint indirectly: “long documents,” “multi-turn chat,” or “include policy text.” Token limits can drive design decisions such as summarizing, chunking, or retrieval rather than pasting everything into the prompt.
Embeddings are numeric vector representations of text that capture semantic meaning. They enable “similarity search” (find related passages even if wording differs). This is the foundation for retrieval-based solutions and is heavily associated with RAG (Retrieval-Augmented Generation): retrieve relevant content from your data store, then provide it to the model to generate an answer grounded in that content.
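Similarity search reduces to comparing vectors. A toy sketch with made-up 3-dimensional "embeddings" (real embeddings have hundreds or thousands of dimensions and come from a model):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Higher score = closer meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

refund_policy = [0.9, 0.1, 0.2]   # pretend embedding of "refund policy"
money_back    = [0.8, 0.2, 0.3]   # pretend embedding of "money-back rules"
lunch_menu    = [0.1, 0.9, 0.7]   # pretend embedding of "cafeteria menu"

print(cosine_similarity(refund_policy, money_back))  # ~0.98: similar meaning
print(cosine_similarity(refund_policy, lunch_menu))  # ~0.30: unrelated
```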
Exam Tip: If the scenario says “use company documents to answer questions” or “reduce hallucinations by citing internal sources,” the pattern is RAG: retrieve + generate. If it says “find similar tickets,” “semantic search,” or “deduplicate,” embeddings are the likely concept.
Common trap: Confusing RAG with fine-tuning. RAG does not change the model’s weights; it changes what context you provide at runtime. Fine-tuning adapts model behavior by training on examples. On AI-900, when the goal is “use latest internal data” or “keep answers aligned to current policies,” RAG is usually the better match because it can update data without retraining.
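A minimal sketch of the RAG flow makes the distinction visible: retrieve() and generate() below are hypothetical placeholders for a vector search and a hosted model call, and the model's weights are never touched.

```python
# Minimal RAG flow sketch; both helpers are stand-ins so the sketch runs.

def retrieve(question: str) -> list[str]:
    # Stand-in: a real system embeds the question and searches an index.
    return ["Policy 4.2: Remote work requires manager approval."]

def generate(prompt: str) -> str:
    # Stand-in: a real system calls a hosted foundation model.
    return "Per Policy 4.2, remote work requires manager approval."

def answer_with_rag(question: str) -> str:
    context = "\n".join(retrieve(question))        # ground with YOUR data...
    prompt = (f"Answer using only the sources below.\n"
              f"Sources:\n{context}\n\nQuestion: {question}")
    return generate(prompt)                        # ...no retraining involved

print(answer_with_rag("Can I work remotely?"))
```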
Azure OpenAI Service provides access to OpenAI models hosted on Azure with enterprise controls. For AI-900, you should recognize the core capability categories: text generation and chat (drafting, summarizing, Q&A style responses), embeddings (semantic similarity), and image generation (where applicable). Most exam scenarios focus on text/chat and embeddings because they map directly to business workflows: customer support assistance, internal copilots, document drafting, and knowledge discovery.
Common solution patterns the exam likes: (1) Chat assistant for employees or customers, (2) Document summarization and email drafting, (3) RAG-based knowledge assistant that answers using internal documents, and (4) Semantic search using embeddings to find related content.
Exam Tip: When the scenario mentions “generate,” “draft,” “rewrite,” “create variations,” or “natural conversation,” Azure OpenAI is a strong candidate. When it mentions “compare meaning,” “find similar,” or “retrieve the most relevant passages,” look for embeddings as part of an Azure OpenAI pattern.
Common trap: Picking Azure OpenAI for straightforward translation or sentiment. Translation is typically Azure AI Translator; sentiment/entities/key phrases are typically Azure AI Language. Azure OpenAI can do these tasks, but the exam usually rewards selecting the purpose-built service unless the scenario explicitly asks for a generative model or flexible natural language generation.
Also be ready for “capabilities vs. responsibilities” phrasing: Azure OpenAI provides the model endpoint, but you still design prompts, define grounding data sources, and apply safety controls. AI-900 expects you to describe the solution at this architectural level, not code.
Responsible AI is not an optional extra on AI-900—expect at least one scenario that tests risk awareness and mitigations for generative AI. Start with the key problem: generative models can produce plausible but incorrect output (“hallucinations”), may include harmful content, and can expose sensitive data if prompts or outputs are mishandled.
Grounding is the practice of basing model outputs on trusted sources (often via RAG), so answers align with approved content. This reduces hallucinations and improves traceability. Pair grounding with citations or “show your sources” behaviors when possible. Safety involves content filtering and policy controls to reduce hateful, violent, sexual, or self-harm content, and to manage jailbreak attempts. Privacy includes minimizing sensitive data in prompts, applying access controls to retrieved documents, and following data governance rules (who can see what content).
Human-in-the-loop is a frequent exam theme: for high-impact decisions (finance, HR, medical, legal), the model should assist rather than decide. You route low-confidence or high-risk outputs to human reviewers, and you log and monitor outputs for quality and compliance.
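A sketch of those layered controls, with a hypothetical placeholder for the safety filter and illustrative risk rules:

```python
# Layered GenAI release controls; is_flagged_by_safety_filter() is a
# stand-in for a real content-safety check, and the rules are illustrative.

HIGH_RISK_TOPICS = {"medical", "legal", "finance", "hr"}

def is_flagged_by_safety_filter(text: str) -> bool:
    return False          # placeholder so the sketch runs

def release(output: str, topic: str, confidence: float) -> str:
    if is_flagged_by_safety_filter(output):
        return "blocked"                       # safety filter catches it
    if topic in HIGH_RISK_TOPICS or confidence < 0.8:
        return "human-review"                  # assist, don't decide
    return "published"                         # low-risk, high-confidence

print(release("Draft reply...", topic="hr", confidence=0.95))  # human-review
```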
Exam Tip: If a question asks how to reduce hallucinations, the best answer is usually “ground the model using your data (RAG)” rather than “increase temperature” or “ask it to be accurate.” If the question asks about harmful content, think “safety filters, monitoring, and policy enforcement.” If it asks about sensitive information, think “data minimization, access control, and human review.”
Common trap: Treating responsible AI as only a legal/compliance step. The exam frames it as a design requirement: you choose architectures (RAG), processes (human review), and controls (filters, permissions) to mitigate risk. In mixed NLP + GenAI scenarios, responsible AI can be the deciding factor between two plausible answers—choose the option that includes grounding and oversight.
1. A company receives thousands of customer survey comments each day and wants to automatically identify whether each comment is positive, negative, or neutral. Which Azure service is the best fit for this requirement?
2. A global support team wants to translate live chat messages between English and Japanese. The requirement is translation only (no summarization or content creation). Which Azure service should you choose?
3. A company wants a chatbot that answers employee questions using content from internal policy documents. The bot must ground responses in those documents rather than inventing answers. Which approach best meets the requirement?
4. A marketing team uses a generative AI model to draft product descriptions. They want to reduce the risk of generating unsafe or inappropriate content and ensure outputs are reviewed before publishing. Which set of actions best aligns with Responsible AI practices for generative AI?
5. You are explaining how a generative AI solution finds relevant internal content for answering questions. Which concept describes converting text into numeric vectors so the system can find semantically similar information?
This chapter is your capstone: you will run a full, timed mock exam, diagnose weak spots, and execute a final review plan aligned to the AI-900 objectives. AI-900 is designed for non-technical professionals, but it still tests precision: you must pick the best option based on Azure service fit, workload type, and responsible AI considerations—not just recognize buzzwords.
We’ll integrate the chapter lessons in a practical flow: first, you’ll set up the mock exam environment and timing strategy (Mock Exam Part 1 and Part 2). Next, you’ll do a structured weak spot analysis using an answer-review method that explains why the correct option wins. Finally, you’ll use a domain checklist and an exam-day checklist to reduce avoidable errors.
Exam Tip: Treat the mock exam like the real exam: one sitting, no notes, no pausing. The skill you’re training is decision-making under time pressure using service-selection cues.
Practice note for this chapter's activities (Mock Exam Part 1, Mock Exam Part 2, Weak Spot Analysis, and the Exam Day Checklist): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Set up your mock exam conditions to mirror the real AI-900 experience: quiet environment, single sitting, and a strict timebox. Even though the exam is approachable for non-technical candidates, the challenge is consistency across domains: recognizing workload types (AI vs. ML vs. GenAI), choosing the right Azure capability, and spotting governance/responsible AI requirements.
Timing strategy: plan two passes. Pass 1 is for confident questions—answer and move on. Pass 2 is for flagged items where you must compare two plausible options. Your goal is to avoid spending too long early and rushing later. Use a simple rule: if you can’t justify an answer in 45–60 seconds, flag it and proceed.
Scoring rubric for your mock: (1) overall score, (2) per-domain score, and (3) “decision quality” notes. Decision quality is the key metric: when you miss, identify whether it was (a) service confusion (e.g., mixing Azure Machine Learning with Azure AI services), (b) workload confusion (classification vs. regression vs. clustering), or (c) governance confusion (privacy, fairness, transparency).
Exam Tip: Don’t only track right/wrong. Track “guess vs. known.” If you guessed correctly, it’s still a weak spot that can fail you under different wording.
Mock Exam Part 1 should emphasize Domains 1–2: describing AI workloads and considerations, and the fundamental principles of machine learning on Azure. This is where AI-900 often tests vocabulary accuracy and “workload mapping” rather than deep math. Expect decisions like whether a scenario is best solved with classification vs. regression, or whether the need is a conversational AI experience vs. an ML prediction model.
Domain 1 focuses on identifying workloads and considerations: anomaly detection, forecasting, personalization, vision, language, and conversational agents. The exam often includes governance cues: if the scenario mentions regulated data, user transparency, or minimizing bias, that points to responsible AI practices such as fairness assessment, explainability, human oversight, and privacy/security controls.
Domain 2 tests ML fundamentals and Azure delivery. You must distinguish training vs. inference, features vs. labels, and supervised vs. unsupervised learning. It also tests how ML is delivered on Azure: Azure Machine Learning for building, training, and deploying models, and managed options where you consume models as APIs. Be careful with "AutoML"-style statements: automation helps, but you still choose the target metric, data splits, and evaluation approach.
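If that vocabulary feels abstract, this toy scikit-learn sketch (with made-up data) pins down features, labels, training, and inference in a few lines:

```python
# Toy supervised-learning sketch, only to anchor the vocabulary.
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4]]       # features: e.g., weeks of ad spend
y = [10, 20, 30, 40]           # labels: e.g., observed sales

model = LinearRegression().fit(X, y)   # TRAINING: learn from labeled data
print(model.predict([[5]]))            # INFERENCE: predict for new input (~50)
```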
Exam Tip: When an option mentions “build, train, tune, and deploy your own model,” it’s usually pointing to Azure Machine Learning. When it mentions “prebuilt capability via API,” it’s likely an Azure AI service.
After finishing Part 1, record your per-domain accuracy and note the top three concepts that caused hesitation. Those become your “weak spot” inputs for Section 6.4 and Section 6.5.
Mock Exam Part 2 should emphasize Domains 3–5: computer vision workloads on Azure, NLP workloads on Azure, and generative AI workloads on Azure (including responsible AI). This section rewards candidates who can match a scenario to the correct family of capabilities and avoid near-miss service names.
For Computer Vision, focus on what is being extracted: images to tags/captions, OCR text extraction, object detection, or facial analysis. The exam frequently uses “read text from images” cues (OCR) versus “describe the scene” cues (captioning) versus “find and locate items” cues (detection with bounding boxes). Azure offerings evolve, but the exam intent remains: recognize the workload type and select the appropriate managed vision capability when you’re not training a custom model.
For NLP, separate: sentiment/key phrase extraction, language detection, entity recognition, and conversational solutions. Pay attention to whether the problem is understanding text (NLP analytics) or managing a dialogue experience (chatbot). Also watch for translation cues. The exam may test that you can use managed language services for common tasks without building a model from scratch.
For Generative AI, focus on concepts: prompts, tokens, grounding, hallucinations, and safety. Azure OpenAI capability selection often hinges on whether the task is text generation/summarization, code generation, embeddings for semantic search, or image generation. Responsible AI is not optional in this domain: you must recognize safety mechanisms (content filtering, system messages, access control) and human-in-the-loop review for high-stakes outputs.
Exam Tip: If the scenario includes “retrieve company policy documents and answer based on them,” think of retrieval-augmented generation (RAG): embeddings + search + grounded responses, not “train the model on internal data” as the default.
Finish Part 2 with a short reflection: which domain felt “same-y” (vision vs. OCR vs. detection; NLP analytics vs. conversational)? Those confusions are common and fixable with a mapping table in your final review.
Your goal in answer review is not to memorize facts—it’s to improve your selection logic. Use a consistent method for every missed or guessed item: (1) restate the scenario in one sentence, (2) identify the workload type, (3) list the decision clues, (4) explain why the correct option fits, and (5) explain why each distractor fails.
Why the correct option wins usually comes down to “best fit” rather than “could work.” AI-900 distractors are often technically possible but not appropriate. For example, a custom ML approach might work, but if the scenario screams “standard capability with minimal ML expertise,” a managed Azure AI service is the best fit. Similarly, “build a chatbot” and “extract sentiment” are both language-related, but they solve different workload types (dialog vs. analytics).
Use a two-column elimination technique: in the left column, list what the scenario requires (output type, constraints, governance cues); in the right column, test each answer option against those requirements and cross it out the moment it violates one. The option that survives every requirement is your answer, and the crossed-out notes become review material.
Exam Tip: When two options both sound correct, pick the one that matches the scenario’s constraints most directly (speed to implement, managed vs. custom, governance). The exam rewards constraint matching more than feature lists.
Turn misses into “if you see X, think Y” rules. Examples of high-yield rules: numeric prediction → regression; unlabeled grouping → clustering; extracting printed text from images → OCR; need to search and answer from internal documents → embeddings + retrieval + grounded generation; high-stakes decisions → transparency and human review. This transforms review time into score gains.
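You can even keep the rules as a literal lookup that you grow after each review session, for example:

```python
# Personal "if you see X, think Y" map -- extend it with your own misses.
CUE_RULES = {
    "predict a numeric value":          "regression",
    "group unlabeled records":          "clustering",
    "extract printed text from images": "OCR",
    "answer from internal documents":   "embeddings + retrieval (RAG)",
    "high-stakes decision":             "transparency + human review",
}

for cue, rule in CUE_RULES.items():
    print(f"If you see '{cue}', think '{rule}'.")
```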
Your final review should be checklist-driven, aligned to the AI-900 outcomes. The last 24 hours is about reducing cognitive load and avoiding last-minute topic sprawl. Re-read only the concepts that repeatedly appeared in your weak spot analysis and re-practice the decision cues.
Domain checklist (use it as a self-audit): can you (1) name the AI workload types and the responsible AI principles that govern them, (2) explain training vs. inference, features vs. labels, and supervised vs. unsupervised learning, (3) match vision scenarios to classification, detection, OCR, or document extraction, (4) match text scenarios to Azure AI Language, Azure AI Translator, or a conversational pattern, and (5) describe prompts, tokens, embeddings, grounding, and safety controls for generative AI? Anything you cannot answer in one sentence goes back into review.
Last-24-hours plan: do one light, timed set of questions for rhythm (not volume), then review only flagged rules and your “if X then Y” map. Sleep and logistics matter more than cramming marginal details.
Exam Tip: If you’re still mixing up two services, don’t chase new documentation. Instead, write a one-line discriminator (e.g., “Azure ML = build/train/deploy your model; Azure AI services = prebuilt APIs for common tasks”). That is what the exam expects you to apply.
Exam day performance is partly logistics. Plan to remove friction so you can focus on interpreting scenarios and choosing best-fit answers. Whether you test online or at a center, your preparation should include environment checks, identity requirements, and a contingency plan.
Online proctoring checklist: confirm system compatibility, stable internet, allowed workspace rules, and ID readiness. Clear your desk, silence devices, and close background apps. Expect check-in steps that take time; schedule a buffer so you don’t start stressed. If a proctor interrupts, stay calm and follow instructions—panic leads to misreads in scenario-based questions.
Test center tips: arrive early, bring required identification, and know locker policies. Use the tutorial time to settle in and set your pacing intention (two-pass strategy). During the exam, if you feel stuck, re-anchor to workload type and constraints—those cues are deliberate.
Exam Tip: The fastest way to recover from a difficult item is to flag it and move on. A single stubborn question can steal time from multiple easy wins later.
Retake plan (if needed): treat your score report as a domain-level diagnostic. Rebuild only the weak domains using your review method from Section 6.4, then do another timed mock. Most retake improvements come from reducing service confusion (Azure ML vs. AI services vs. Azure OpenAI) and improving distractor elimination, not from learning advanced math.
Finally, commit to a calm routine: hydration, a brief mental warm-up (review your “if X then Y” rules), and a pacing plan. AI-900 is designed to validate foundational judgment—on exam day, your job is to apply that judgment consistently.
1. You are taking an AI-900 practice exam. A question asks you to identify the best Azure service to extract key phrases and detect sentiment from customer emails with minimal code. Which service should you choose?
2. A retail company wants to forecast weekly product demand using historical sales data stored in Azure. They do not have data scientists and want a guided experience to train a model. Which approach best matches the requirement?
3. During a timed mock exam, you notice you are repeatedly missing questions that ask you to choose between prebuilt AI services and custom machine learning. What is the most effective next step to perform a weak spot analysis aligned to certification exam strategy?
4. A healthcare organization wants to use an AI model to help prioritize patient follow-ups. Which action best reflects Responsible AI principles that could appear on AI-900?
5. On exam day, you want to reduce avoidable errors during the AI-900 test. Which checklist item is the best example of an exam-day tactic for time and accuracy?