AI Certification Exam Prep — Beginner
Timed AI-900 simulations plus targeted drills to fix weak domains fast.
This course is built for learners preparing for the Microsoft AI-900 (Azure AI Fundamentals) exam who want more than passive reading. You’ll train the way the exam feels: timed simulations, fast scenario decisions, and a repeatable method to repair weak spots by official objective area. If you’re new to certifications, Chapter 1 walks you through registration, proctoring options, scoring expectations, and a realistic study strategy that fits a beginner schedule.
The blueprint follows the exam objectives exactly, using the domain names as the organizing framework: Describe AI workloads; Fundamental principles of ML on Azure; Computer vision workloads on Azure; NLP workloads on Azure; and Generative AI workloads on Azure.
Chapters 2–5 each focus on one or two domains with clear explanations and exam-style practice that mirrors Microsoft’s scenario phrasing. You’ll learn how to identify what a question is really testing (workload type, service selection, ML lifecycle concept, or responsible AI principle) and avoid common distractors.
Chapter 1 sets the foundation: how the AI-900 exam is delivered, what question types to expect (multiple choice, multi-select, matching, and scenario prompts), and how to build a plan that includes timed drills. You’ll also set up a simple error log to track misses by domain and concept so your practice time stays efficient.
In Chapter 2, you’ll master Describe AI workloads by learning how to separate AI solutions from traditional software, map common business problems to AI approaches (vision, language, prediction, generative), and recognize responsible AI themes that appear frequently on the exam.
Chapter 3 targets Fundamental principles of ML on Azure—training vs inference, supervised vs unsupervised learning, basic evaluation thinking, and the Azure Machine Learning concepts you must recognize (workspaces, compute, pipelines, and endpoints) at a fundamentals level.
Chapter 4 covers Computer vision workloads on Azure, focusing on what each capability does (OCR, image analysis, detection vs classification), how results are typically represented, and how to choose the right service when a scenario emphasizes documents, images, or visual features.
Chapter 5 combines NLP workloads on Azure with Generative AI workloads on Azure. You’ll practice selecting language capabilities (sentiment, entities, summarization, translation), understand conversational AI patterns, and learn the core concepts behind Azure OpenAI and prompt basics—plus when safety and responsible use matter in scenario questions.
Chapter 6 is a full mock exam experience split into two timed parts, followed by a structured review workflow. You’ll learn how to categorize every missed question by objective, fix the underlying concept, and re-test quickly so improvements show up in your next attempt. You’ll also get an exam-day checklist to reduce avoidable errors caused by timing, fatigue, or misreading.
If you’re ready to start, register for free and begin with the baseline diagnostic. Or explore other options and learning paths on Edu AI: browse all courses.
Microsoft Certified Trainer (MCT)
Jordan McAllister is a Microsoft Certified Trainer specializing in Azure and AI fundamentals certification prep. He has helped new learners build exam-ready intuition through scenario-based practice aligned to official Microsoft objectives.
This course is a “mock exam marathon” by design: you will learn the AI-900 fundamentals while training the exact behaviors that raise scores—reading precision, time control, and systematic weak-spot repair. AI-900 (Microsoft Azure AI Fundamentals) rewards candidates who can map a scenario to the correct AI workload and choose the right Azure service or concept without overthinking implementation details.
In this chapter you will set up your exam logistics, understand how scoring works, and build a short, high-yield study plan (2-week or 4-week) centered on timed simulations. You’ll also create a tracking system: a diagnostic baseline, domain targets, and an error log that turns every missed question into a predictable improvement.
Exam Tip: Treat AI-900 like a language test as much as a tech test. The fastest score gains come from recognizing keywords (e.g., “classify,” “extract entities,” “detect objects,” “generate text,” “responsible AI”) and immediately mapping them to the correct workload and Azure capability.
Practice note for each lesson in this chapter (Understand the AI-900 exam format and question styles; Register for the exam and choose test center vs online proctoring; Build your 2-week and 4-week study plans with timed practice; Baseline diagnostic quiz and weak-domain tracking setup): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 measures whether you can describe common AI workloads and pick appropriate Azure AI services and concepts at a fundamentals level. You are not expected to design production architectures or write code, but you are expected to distinguish solution types (machine learning vs. computer vision vs. NLP vs. generative AI) and apply responsible AI principles.
Map your study to the exam objectives (and keep your notes organized the same way). Use these domains as your “bins” for practice and review: Describe AI workloads; Fundamental principles of ML on Azure; Computer vision workloads on Azure; NLP workloads on Azure; and Generative AI workloads on Azure.
Common trap: confusing “what the model does” with “how it is deployed.” AI-900 questions often ask you to select the correct workload or service, not the hosting option. If a question emphasizes understanding or extracting information from text, it’s usually NLP; if it emphasizes creating new content (text/images/code), it’s generative AI; if it emphasizes predicting a label/value from structured data, it’s machine learning.
Exam Tip: Build a one-page mapping sheet: keywords → workload → likely Azure service family. In practice exams, annotate each miss with “keyword I missed” to stop repeating the same interpretation error.
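A minimal sketch of what that one-page mapping sheet can look like as a lookup table. The keywords, workload names, and service-family labels below are illustrative examples, not an official list.

```python
# Illustrative keyword -> (workload, likely service family) mapping sheet.
# Keywords and service groupings are examples only, not exhaustive or official.
KEYWORD_MAP = {
    "extract text from images": ("computer vision (OCR)", "Azure AI Vision"),
    "detect objects and draw boxes": ("computer vision (object detection)", "Azure AI Vision / custom vision"),
    "classify photo as cat or dog": ("computer vision (classification)", "Azure AI Vision / custom vision"),
    "analyze sentiment": ("NLP (text analytics)", "Azure AI Language"),
    "extract entities": ("NLP (entity recognition)", "Azure AI Language"),
    "translate text": ("NLP (translation)", "Azure AI Translator"),
    "generate text or summaries": ("generative AI", "Azure OpenAI"),
    "forecast a numeric value": ("machine learning (regression)", "Azure Machine Learning"),
}

def lookup(phrase: str):
    """Return (workload, likely service family) for a scenario keyword, if mapped."""
    return KEYWORD_MAP.get(phrase)

print(lookup("extract text from images"))  # ('computer vision (OCR)', 'Azure AI Vision')
```

During error-log review, adding one row per missed keyword keeps the sheet growing from your own mistakes rather than from generic lists.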
Registering correctly prevents last-minute problems that waste study momentum. AI-900 is scheduled through Microsoft’s certification portal and delivered via an exam provider (commonly Pearson VUE). The workflow is predictable: choose the exam, select delivery method, pick date/time, confirm personal details, and complete payment or apply a voucher.
Choose test center if you want a controlled environment and fewer technical variables. Choose online proctoring if scheduling flexibility is more important and you can meet strict room and system requirements. Online testing typically requires a clean desk, private room, stable internet, and a system check prior to exam day.
Common trap: underestimating the friction of online proctoring—background applications, notifications, dual monitors, or corporate security tools can trigger issues. If you must test online, do a “dry run” the same time of day on the same network you’ll use for the exam.
Exam Tip: Schedule your exam now, then build your study plan backward from that date. A fixed deadline improves retention and makes timed practice feel purposeful rather than optional.
AI-900 uses a scaled scoring model. You don’t need to calculate the scale; you need to manage performance by domain and by question type. The key behavior: don’t let one confusing question consume the time needed to secure easier points later.
Microsoft exams typically report a score on a scale (commonly up to 1000) with a published passing standard (commonly 700). Treat “passing” as a minimum threshold, not a target. Your practice goal should be higher to create buffer for exam-day variance (new question phrasing, stress, or a domain you personally find weaker).
Retake policies can vary and are governed by Microsoft’s current rules. The practical takeaway: don’t plan to “use the first attempt as a practice.” Treat attempt one as the real run, but reduce pressure by knowing a retake is possible if needed.
Common trap: interpreting a practice score as “I’m ready” without checking whether the score came from familiar question banks. Readiness is demonstrated by stable performance across multiple timed sets and by being able to explain why each wrong option is wrong.
Exam Tip: Track two numbers separately: (1) raw score, and (2) confidence score (how many you guessed). A high raw score with heavy guessing is unstable—fix that before exam day.
AI-900 tests fundamentals through several item formats. The content is “intro level,” but the question design often rewards careful reading and punishes assumptions. Your job is to identify what the question is truly asking: workload identification, service selection, model lifecycle concept, or responsible AI principle.
Common trap: mixing up similar capabilities. For example, OCR vs. object detection vs. image classification; entity recognition vs. key phrase extraction; traditional ML predictions vs. generative AI creation. When you see “extract text from images,” think OCR; when you see “find people/cars and draw boxes,” think object detection; when you see “decide if photo is a cat or dog,” think classification.
Exam Tip: Before reading answer choices, paraphrase the ask in 5–7 words (“choose NLP service for language detection”). This prevents the options from steering your interpretation.
As you do timed sims in later chapters, label each mistake by format. Many candidates discover their misses cluster in multi-select due to incomplete constraint checking—an easy fix once you see the pattern.
Your score is a function of knowledge and pacing. Timed simulations are not just practice—they are training for decision-making under a clock. Build a consistent approach you will use on the real exam.
Time management rules that work well for AI-900: answer the questions you know on a first pass, flag anything that takes more than a minute or two of deliberation and return to it later, and protect a few minutes at the end for flagged items and review. Never let a single confusing scenario burn the time you need to secure easier points.
Elimination tactics are where fundamentals become points. Identify wrong answers fast by checking for “mismatch types”: a modality mismatch (a vision service offered for purely text or tabular data), a layer mismatch (a deployment or hosting detail when the ask is a workload or concept), and a scope mismatch (custom training when a prebuilt capability already satisfies the requirement).
Guessing strategy should be deliberate, not random. If you must guess, guess after eliminating at least one option, and record the topic in your error log regardless of whether you guessed correctly—lucky points hide weak areas.
Exam Tip: Multi-select questions: treat each option as a true/false statement against the scenario. Don’t “collect” related answers; select only what is necessary and supported.
Common trap: over-reading into Azure implementation. If the question asks what workload is appropriate, don’t choose a deployment detail. Match the ask to the simplest correct concept.
This course improves scores through a tight feedback loop: baseline → targeted drills → timed simulation → error log review → retest. Start with a diagnostic (baseline quiz or first timed sim) to identify weak domains. Your goal is not to “study everything equally,” but to raise your lowest domain to an acceptable floor while keeping strengths sharp.
Set up a tracking sheet with columns that force learning from every miss: the objective domain, the specific concept tested, the keyword you missed or misread, why the wrong option looked right, whether you guessed, the type of avoidable error (misread, changed a correct answer, missed a “choose two”), and the date you will retest the concept.
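A minimal sketch of one error-log entry as a simple record. The field names mirror the columns above but are suggestions, not a required format; any spreadsheet with the same columns works.

```python
# One illustrative error-log record; field names and values are examples only.
error_log_entry = {
    "domain": "Describe AI workloads",
    "concept": "NLP analysis vs generative AI",
    "keyword_missed": "extract entities",
    "why_wrong_looked_right": "chose generative AI because the scenario mentioned text",
    "guessed": False,
    "avoidable_error": "misread",   # misread / changed answer / missed 'choose two' / none
    "retest_by": "in 2 days",
}
```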
Use score targets by domain to guide your 2-week vs. 4-week plan. In a 2-week plan, prioritize daily timed sets (short but strict) and rapid error-log cycles; you’re optimizing for speed and recognition. In a 4-week plan, add deeper concept reinforcement days between timed sets, focusing on the areas your baseline exposes (for many candidates: responsible AI principles, evaluation metrics, and distinguishing NLP tasks from generative tasks).
Exam Tip: Track “avoidable errors” separately (misread, changed correct to wrong, missed ‘choose two’). Reducing avoidable errors often boosts scores faster than learning new content.
By the end of this chapter, you should have: an exam date, a delivery method chosen, a baseline score recorded, and a living error log. From here, every lesson and timed sim in the marathon feeds that system so your weak domains shrink predictably instead of randomly.
1. You are preparing for the AI-900 exam. During practice, you notice you often choose the wrong answer because you focus on implementation details instead of the scenario keywords. Which approach best aligns with the AI-900 exam style described in this chapter?
2. A candidate has two weeks until the AI-900 exam and can study 60–90 minutes per day. They want the fastest score improvements. Based on the chapter’s study strategy, what should they prioritize?
3. You create a baseline diagnostic quiz and find your lowest performance is in identifying which AI workload matches a scenario. What is the best next step according to the chapter’s weak-domain tracking approach?
4. A company requires a quiet, controlled environment and does not allow personal devices in secure rooms. The candidate must take the AI-900 exam soon. Which exam logistics choice is most appropriate to evaluate first?
5. During timed practice, you frequently miss questions that include words like "classify" and "extract entities." What is the most exam-aligned interpretation of this pattern?
AI-900 tests whether you can recognize common AI workload types, describe the basic building blocks of machine learning, and select the right Azure service for a scenario—without over-engineering. This chapter builds a mental “taxonomy” you can use under time pressure: prediction and classification (structured data), computer vision (images/video), natural language processing (text/speech), and generative AI (creating new content). You’ll also see how responsible AI requirements show up in real governance conversations and in exam distractors.
On this exam, many wrong answers are “too specific” (a niche service when a broad capability fits) or “too custom” (Azure Machine Learning when an Azure AI service is sufficient). You’ll practice narrowing a scenario to: (1) the workload type, (2) the data modality, (3) the output type (label, number, text, embedding, image), and (4) the right platform choice (prebuilt API vs custom training).
Exam Tip: If the scenario already has a clear target output (like “approve/deny,” “detect objects,” “summarize”), start from the workload taxonomy first. Only then choose the service. Service-first thinking is a common trap.
Practice note for each lesson in this chapter (AI workload taxonomy: prediction, classification, vision, language, generative; Responsible AI fundamentals and governance scenarios; Azure AI services overview and when to use each; Timed domain drill set: Describe AI workloads): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI workloads are used when the rules are hard to write explicitly, change over time, or depend on patterns in data. Traditional software excels when deterministic rules are stable and can be fully specified (for example, calculating sales tax from a table, validating a checksum, or routing a request based on a fixed policy). AI-900 expects you to distinguish these quickly and justify “why AI” in plain language.
The core workload taxonomy you should recognize: prediction (forecasting a number like demand next week), classification (assigning categories like fraud/not fraud), vision (understanding images or video), language (extracting meaning from text or speech), and generative (creating new text/images/code). Exam scenarios often describe the business goal rather than naming the workload, so look for clues in the verbs: “forecast,” “categorize,” “detect,” “extract,” “translate,” “chat,” “summarize,” “generate.”
Exam Tip: If the problem statement includes “rules are complex,” “patterns in historical data,” “unstructured data,” or “improves over time,” it is signaling an AI workload. If it includes “must be 100% correct,” “same result every time,” or “compliance requires explicit rules,” it may be better as traditional logic (or a hybrid with human review).
Common trap: assuming any automation implies AI. A workflow engine, RPA, or a simple keyword search is not automatically AI. Another trap is thinking generative AI is the default for all text tasks. If the scenario is about extracting entities, sentiment, or language detection, that is typically NLP analysis rather than generation.
This section maps directly to the “fundamentals of machine learning” objective: you must be comfortable with the vocabulary used in questions and answers. Features are the input variables used to make a decision (for a house price model: square footage, location, number of bedrooms). A label is the known correct output in supervised learning (the sale price, or “spam/not spam”). A model is the learned function produced after training; it generalizes patterns from training data to new data.
Training is the process of fitting the model to labeled (or sometimes unlabeled) data. Inference is using the trained model to produce predictions on new inputs. Many exam distractors confuse training with inference—watch for wording like “build,” “train,” “fit” vs “predict,” “score,” “classify.”
AI systems often output a confidence score or probability (for example, 0.92 that an email is spam). Confidence is not a guarantee of correctness; it is a model’s internal estimate. This matters for responsible AI and for system design: you may set thresholds (auto-approve if confidence > 0.98; otherwise route to human review).
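A minimal sketch of that threshold pattern. The 0.98 cutoff and the routing labels are illustrative choices, not recommendations for any specific system.

```python
# Route a prediction based on the model's confidence score (illustrative threshold).
def route(prediction: str, confidence: float, threshold: float = 0.98) -> str:
    """Auto-approve high-confidence predictions; send the rest to human review."""
    if confidence >= threshold:
        return f"auto: {prediction}"
    return f"human review: {prediction} (confidence {confidence:.2f})"

print(route("spam", 0.992))  # auto: spam
print(route("spam", 0.74))   # human review: spam (confidence 0.74)
```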
Exam Tip: When you see “requires labeled data,” think supervised learning. When you see “find patterns or groups without labels,” think unsupervised learning (like clustering). When you see “optimize actions by trial and feedback,” think reinforcement learning—even if AI-900 keeps this high-level.
Common trap: mixing up classification vs regression. Classification predicts a category (discrete label). Regression predicts a numeric value (continuous). “Predict tomorrow’s temperature” is regression; “predict whether it will rain” is classification.
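To make the contrast concrete, here is a small sketch using scikit-learn (assumed to be installed) with invented toy data: the same single feature is used once with a numeric target (regression) and once with a categorical target (classification).

```python
# Regression predicts a continuous value; classification predicts a discrete label.
from sklearn.linear_model import LinearRegression, LogisticRegression

X = [[10], [15], [20], [25], [30]]        # feature: humidity (illustrative)
y_temp = [21.0, 22.5, 24.0, 26.0, 27.5]   # regression target: tomorrow's temperature
y_rain = [0, 0, 0, 1, 1]                  # classification target: will it rain (0/1)

reg = LinearRegression().fit(X, y_temp)
clf = LogisticRegression().fit(X, y_rain)

print(reg.predict([[22]]))   # a continuous value (roughly 24.9 for this toy data)
print(clf.predict([[22]]))   # a discrete label, 0 or 1
```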
Responsible AI appears on AI-900 both as direct knowledge (principles) and as scenario judgment (governance). The exam draws on Microsoft’s six responsible AI principles: fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability. You should be able to match a concern to the principle.
Fairness addresses disparate impact across groups (for example, a loan model rejecting a protected group more often). Reliability and safety is about consistent behavior under normal and edge conditions (for example, a vision system failing in low light). Privacy concerns data minimization, consent, and proper handling of personal data; security covers threats like data exfiltration, prompt injection (in generative scenarios), and unauthorized access. Inclusiveness means solutions should work for and empower the full range of users, including people with disabilities. Transparency is about explainability and communicating limitations: users should know when they are interacting with an AI system and what it can and cannot do. Accountability means people remain answerable for how the system is designed, deployed, and monitored—governance, oversight, and audit trails all support it.
Exam Tip: If a question mentions “explain why the model made a decision,” “interpretability,” or “communicate limitations,” it is almost always testing transparency. If it mentions “PII,” “GDPR,” “consent,” or “data retention,” it is testing privacy.
Governance scenarios often involve policies: human-in-the-loop review for high-impact decisions, audit trails, access controls, and model monitoring for drift. A frequent trap is picking “more data” as the fix for a fairness issue—more data can help, but only if it addresses representation and bias; the safer exam-aligned answer is to evaluate bias, use fairness metrics, and adjust the process (data, model, and thresholds) with oversight.
AI-900 expects you to distinguish two primary paths on Azure: (1) Azure AI services (prebuilt models exposed via APIs) and (2) Azure Machine Learning (a platform to build, train, deploy, and manage custom models). The “right” choice depends on whether the task is common and supported out-of-the-box or whether you need custom training, control, or MLOps.
Azure AI services are designed for quick adoption: typical examples include vision analysis, OCR, face/object detection (where applicable), speech-to-text, language detection, sentiment analysis, key phrase extraction, translation, and conversational interfaces. You usually provide data at inference time, not training time. These services are ideal when the scenario fits a standard capability and you want speed and simplicity.
Azure Machine Learning is used when you must train your own model on your own labeled dataset, manage experiments, track metrics, deploy endpoints, and monitor performance. It supports a range of frameworks and includes concepts like datasets, compute targets, pipelines, and model registries (AI-900 level: know that it is the “build your own ML” option).
Exam Tip: If a scenario says “we have historical labeled data and need a custom model for our business,” lean toward Azure Machine Learning. If it says “extract text from images,” “translate,” or “analyze sentiment,” lean toward Azure AI services.
Common trap: choosing Azure Machine Learning for a scenario that clearly matches a prebuilt AI capability. The exam often rewards the simplest service that satisfies requirements. Another trap is confusing Azure AI services with Azure OpenAI: generative AI tasks (summarization, drafting, Q&A with prompts) may point to Azure OpenAI concepts rather than classic text analytics.
To score well on “describe AI workloads,” you need a repeatable selection method. Step 1: identify the modality (tabular data, images/video, audio, text). Step 2: identify the output (number, category, extracted entities, translation, generated content). Step 3: decide prebuilt vs custom. Step 4: apply responsible AI constraints (safety, privacy, transparency) that could change the design (for example, adding human review or redaction).
Examples of clean mapping: demand forecasting → regression/prediction on structured data (often custom ML). Email routing into “billing/support/sales” → classification (could be custom ML; could also be an NLP classification capability depending on constraints). Detecting defects in manufacturing images → computer vision (often custom if domain-specific, otherwise prebuilt vision analysis for generic detection). Extracting entities from customer feedback → NLP text analytics. Real-time call transcription → speech. Drafting support responses or summarizing long tickets → generative AI with prompts.
Exam Tip: Watch for wording that implies the need for ground truth labels (“we know which transactions were fraud”). That pushes you toward supervised learning. Wording like “no labels” or “discover segments” pushes you toward clustering/unsupervised approaches.
Common traps: (1) mixing “chatbot” with “Q&A over documents.” A chatbot can be rule-based, retrieval-based, or generative; the exam wants you to identify the underlying need (conversation management vs knowledge retrieval vs content generation). (2) Choosing generative AI for extraction tasks; extraction is usually deterministic NLP analysis unless the scenario explicitly asks for drafting or summarizing in natural language.
Use the following practice routine during timed sims: read the scenario once for modality and verbs, then scan the answer choices for the “family” (Azure AI services vs Azure Machine Learning vs Azure OpenAI), and finally eliminate choices that are mismatched by modality (for example, a vision service for a purely tabular prediction). You are training recognition speed, not memorization of every product name.
Mini-case reasoning patterns that frequently appear: when a hospital wants to “predict no-show appointments,” that is prediction/regression/classification on historical scheduling data; the main risks are privacy and fairness. When a retailer wants to “detect items on shelves from camera feeds,” that is vision; reliability (lighting, angles) is a key responsible AI concern. When a company wants to “monitor social media for brand sentiment and key topics,” that is NLP analytics; transparency matters if results drive decisions. When an enterprise wants to “generate a first draft of a policy document,” that is generative AI; governance should include human review, security controls, and clear disclosure of AI assistance.
Exam Tip: If an answer choice adds unnecessary complexity (custom training, advanced pipelines) but the scenario is a standard capability (OCR, translation, sentiment), it is likely a distractor. Choose the simplest service that meets the requirement.
Rapid-fire checks to do mentally as you answer: Is the output a label or a number? Is there labeled data available? Is the input unstructured (images/text) or structured (tables)? Does the scenario require explainability, auditability, or privacy controls that imply human oversight? These checks prevent the most common AI-900 error: selecting a technology because it sounds “more AI,” rather than because it fits the workload.
1. A retail company wants to estimate next month’s sales revenue for each store based on historical sales, promotions, and local events. Which AI workload type best fits this requirement?
2. You need to build a solution that identifies whether an uploaded image contains a company logo and returns a yes/no result. You want to avoid custom model training if possible. Which Azure service is the best fit?
3. A support team wants an application that summarizes long customer emails into a few bullet points. Which workload type is being described?
4. A financial services company is deploying an AI model that recommends whether to approve or deny loan applications. During governance review, they ask for the ability to explain decisions to applicants and to monitor for bias across demographic groups. Which Responsible AI principle is most directly addressed by this requirement?
5. A marketing team wants to generate multiple variations of product descriptions and social media posts from a short set of bullet points. Which workload type best matches the scenario?
This chapter targets the AI-900 objective area that checks whether you can (1) recognize core machine learning problem types, (2) describe the end-to-end training and evaluation flow, and (3) map those fundamentals to Azure Machine Learning (Azure ML) concepts like workspaces, compute, pipelines, and endpoints. The exam rarely asks for deep math; it does test whether you can identify the right approach from a scenario, avoid classic pitfalls (like data leakage), and choose Azure components that fit the lifecycle step.
Expect scenario-based stems: a short business need, the kind of data available, and a request to pick the ML type, the metric, or the Azure ML capability. Your job is to translate narrative clues into the correct “bucket.” Read for keywords: “predict a numeric value” (regression), “assign categories” (classification), “find groups” (clustering), “learn from labeled data” (supervised), and “no labels available” (unsupervised).
Exam Tip: When two answers both sound plausible, anchor on what the question is truly asking: problem type vs. service/component vs. evaluation metric. AI-900 distractors often mix these layers (e.g., offering an Azure service choice when the prompt asks for an ML concept).
Practice note for each lesson in this chapter (ML basics: supervised, unsupervised, regression, classification, clustering; Training pipeline: data prep, split, train, evaluate, deploy; Azure ML concepts: workspaces, compute, datasets, pipelines, endpoints; Timed domain drill set: Fundamental principles of ML on Azure): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 expects you to classify scenarios into the major ML problem types: supervised learning, unsupervised learning, regression, classification, and clustering. Supervised learning means you have labeled examples (inputs plus correct outputs). Unsupervised learning means you have inputs only and you want the model to discover structure. Regression and classification are the two most common supervised patterns, while clustering is the most common unsupervised pattern mentioned at this level.
Regression predicts a continuous numeric value: sales next month, temperature, demand forecast, house price. The key clue is that the output is a number on a scale, not a category. Classification predicts a discrete label: “fraud/not fraud,” “premium/basic,” “cat/dog,” “churn/retain.” Many exam stems hide classification behind “recommend an action” or “route a ticket,” but if the output is one of a fixed set of classes, it’s classification.
Clustering groups similar items without pre-defined labels: customer segmentation, grouping products by buying patterns, identifying similar documents. Clustering is not “predicting” a known target; it is discovering groups. A common trap is confusing clustering with classification because both involve “groups.” If the groups are known ahead of time (like predefined customer tiers), it’s classification; if the groups must be discovered from the data, it’s clustering.
Exam Tip: Watch for phrasing like “predict probability of…”—that still usually indicates classification (probability of class membership), even though the model outputs a number.
The exam checks whether you understand the standard training pipeline: data preparation, splitting data, training, evaluation, and deployment. In Azure terms, you should be able to describe this as a repeatable workflow rather than a one-time activity. Data prep includes cleaning, handling missing values, encoding categories, and feature engineering. Then you split data into at least training and test sets (often also a validation set). You train a model on the training set, tune it (often using validation), and evaluate final performance on the test set.
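The flow above can be summarized in a few lines of code. This is a minimal sketch using scikit-learn with synthetic data; the dataset, model choice, and metric are illustrative only.

```python
# prepare -> split -> train -> evaluate, kept deliberately small.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Split before any fitting so the evaluation data stays unseen during training.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)    # train
preds = model.predict(X_test)                       # inference on held-out data
print("MAE:", mean_absolute_error(y_test, preds))   # evaluate (lower is better)
```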
Metrics basics appear in AI-900 as “which metric would you use?” or “which outcome indicates a better model?” For classification, common metrics include accuracy, precision, recall, and F1 score. Accuracy is simplest but can be misleading with imbalanced data (e.g., 99% non-fraud). Precision measures how often predicted positives are correct; recall measures how many actual positives you found. F1 balances precision and recall. For regression, common metrics include MAE (mean absolute error), MSE (mean squared error), and RMSE (root mean squared error). These measure prediction error magnitude; lower is better.
A frequent exam trap is metric-direction confusion: for most “error” metrics (MAE/MSE/RMSE), lower is better. For most “score” metrics (accuracy, precision, recall, F1), higher is better. Another trap is selecting accuracy for rare-event detection. If the stem mentions “false negatives are costly” (missing fraud, missing disease), recall often matters more; if “false positives are costly” (unnecessary reviews, blocking good customers), precision often matters more.
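A short worked example of the imbalanced-data trap, using invented numbers: with 1,000 transactions of which 10 are fraud, a model that always predicts “not fraud” still scores 99% accuracy while catching zero fraud cases.

```python
# Confusion-matrix counts for a model that predicts "not fraud" for everything.
tp, fp, fn, tn = 0, 0, 10, 990   # 10 fraud cases, all missed

accuracy = (tp + tn) / (tp + tn + fp + fn)          # 0.99 -- looks great
precision = tp / (tp + fp) if (tp + fp) else 0.0    # 0.0 -- no positives predicted
recall = tp / (tp + fn) if (tp + fn) else 0.0       # 0.0 -- every fraud case missed

print(accuracy, precision, recall)  # 0.99 0.0 0.0
```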
Exam Tip: If the stem mentions “imbalanced classes” or “rare event,” mentally down-rank accuracy as the primary metric unless the question explicitly frames it as acceptable.
AI-900 questions often test whether you can identify why a model performs well in training but poorly in real life. Overfitting occurs when the model learns noise or patterns specific to the training set—high training performance, low test performance. Underfitting occurs when the model is too simple to capture the underlying relationship—poor performance on both training and test. The bias/variance intuition is helpful: high bias tends to underfit (model is too rigid), high variance tends to overfit (model is too sensitive to training data quirks).
Typical mitigations are concept-level only: to reduce overfitting, you can use more training data, simplify the model, regularize, or use early stopping; to reduce underfitting, you can increase model complexity, add better features, or train longer. The exam won’t demand hyperparameter details, but it will expect you to select the correct general direction (simplify vs. make more expressive).
Data leakage is a favorite trap because it can produce “too good to be true” evaluation results. Leakage happens when information that would not be available at prediction time accidentally enters training or evaluation. Examples include using post-outcome fields (e.g., “resolution date” to predict “will a ticket be resolved”), calculating normalization statistics on the full dataset before splitting, or having near-duplicate records across train and test. Leakage can make test metrics look excellent, but real-world performance collapses.
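The normalization example is easy to show in code. This is a minimal sketch with random data: the “leaky” version fits the scaler on all rows (so test rows influence the statistics), while the correct version fits only on the training split.

```python
# Leaky vs correct preprocessing around a train/test split.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = np.random.default_rng(0).normal(size=(100, 3))
X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

# Leaky: statistics computed with test data included.
leaky_scaler = StandardScaler().fit(X)
X_train_leaky = leaky_scaler.transform(X_train)

# Correct: fit on training data only, then apply the same transform to test data.
scaler = StandardScaler().fit(X_train)
X_train_ok = scaler.transform(X_train)
X_test_ok = scaler.transform(X_test)
```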
Exam Tip: If a scenario describes unusually high accuracy during evaluation and then poor production outcomes, suspect leakage first, then overfitting. Leakage is about improper data separation; overfitting is about excessive memorization.
Another common confusion: “bias” in bias/variance is not the same as fairness/bias in Responsible AI. In this section, bias means systematic error from overly simplistic assumptions. On AI-900, read the question carefully to know which meaning is intended.
Azure ML is the platform layer the exam expects you to recognize for building, training, and managing ML models. The workspace is the top-level container: it organizes experiments, models, data references, compute, and deployment artifacts. Many stems ask what you “create first” to start using Azure ML—often the workspace is the correct foundational object.
Compute is where training or inference runs. You may see compute instances (developer/workbench style), compute clusters (scalable training), and inference compute for endpoints. At AI-900 depth, it’s enough to know that you choose compute based on workload needs: development vs. scalable training vs. deployment.
Datasets (or data assets) represent managed references to data used for training and scoring. Pipelines orchestrate steps such as data prep, training, and evaluation into a repeatable workflow. This is key to “MLOps thinking”: reproducibility, automation, and traceability. When a stem emphasizes repeatable training, scheduled runs, or consistent preprocessing, pipelines are a strong match.
Endpoints are how models are exposed for use after training (real-time or batch, covered next). Also know that Azure ML supports experiments and run tracking—if a question talks about comparing model runs, reviewing metrics over time, or tracking parameters, that’s the experiment/run tracking capability within the workspace.
Exam Tip: Distinguish “where artifacts live” (workspace) from “where code runs” (compute) and from “how steps are chained” (pipeline). Distractors often swap these roles.
Deployment questions on AI-900 focus on choosing between real-time inference and batch scoring, and recognizing the role of endpoints. Real-time (online) inference is used when an application needs an immediate prediction per request—fraud checks at checkout, personalization on page load, routing a support ticket as it arrives. This is commonly delivered through a real-time endpoint that applications call via an API.
Batch inference is used when you can score many records at once on a schedule—nightly churn predictions, weekly lead scoring, periodic inventory forecasting. Batch is usually more cost-efficient for large volumes when low latency is not required. If the stem includes “every night,” “end of day,” “process a file,” or “score millions of rows,” batch is the likely match.
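For intuition about what “an application calls a real-time endpoint via an API” means in practice, here is a minimal sketch of an HTTPS scoring call. The URL, key, and payload shape are placeholders: a real Azure ML online endpoint defines its own scoring URI, authentication, and input schema.

```python
# Calling a real-time (online) scoring endpoint over HTTPS -- placeholders throughout.
import json
import requests

SCORING_URI = "https://<your-endpoint>.inference.ml.azure.com/score"  # placeholder
API_KEY = "<endpoint-key>"                                            # placeholder

payload = {"data": [[34.2, 1, 0, 129.99]]}  # one record, illustrative features
headers = {"Content-Type": "application/json", "Authorization": f"Bearer {API_KEY}"}

response = requests.post(SCORING_URI, data=json.dumps(payload), headers=headers, timeout=10)
print(response.json())  # an immediate prediction for this single request
```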
Monitoring is included because the exam wants you to know models can degrade over time due to data drift or changing conditions. You monitor input distributions, prediction quality (when ground truth becomes available), and operational metrics like latency and error rates. If a question mentions “model performance dropping after deployment,” the correct concept is often drift/monitoring rather than “retrain once and forget.”
Exam Tip: Latency requirement is the fastest discriminator. If the business process can tolerate waiting (hours/days), don’t pick real-time endpoints just because they sound modern.
Common trap: confusing “training pipeline” with “inference pipeline.” Training pipelines produce a model artifact; inference (real-time endpoint or batch job) uses that model to generate predictions. The exam may phrase both as “pipelines,” so anchor on whether the output is a model or predictions.
This timed domain drill set is about speed and precision: read a scenario, identify the ML type, then choose the metric or Azure ML concept that best fits. Your goal is to avoid “keyword overfitting”—don’t jump at the first familiar term. Instead, apply a quick decision tree: (1) Is there a labeled target? (2) Is the target numeric or categorical? (3) Is the need immediate or scheduled? (4) Which component is being described: workspace, compute, dataset, pipeline, endpoint?
For metrics selection, train yourself to map business risk to precision/recall: missing a positive is a false negative (recall-sensitive); flagging a negative as positive is a false positive (precision-sensitive). For regression, map the stem’s pain to error magnitude: “average error” aligns with MAE; “penalize large errors more” points toward MSE/RMSE conceptually. Remember: AI-900 typically stays at the level of “which metric is appropriate,” not “calculate the metric.”
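A quick worked example (invented errors) of why “penalize large errors more” points toward MSE/RMSE: two models with the same average error but different error spread get the same MAE and different RMSE.

```python
# MAE treats all errors equally; RMSE penalizes large misses more heavily.
import math

errors_small = [2, 2, 2, 2]   # consistent small errors
errors_spiky = [0, 0, 0, 8]   # mostly perfect, one large miss

def mae(errs):  return sum(abs(e) for e in errs) / len(errs)
def rmse(errs): return math.sqrt(sum(e * e for e in errs) / len(errs))

print(mae(errors_small), rmse(errors_small))  # 2.0 2.0
print(mae(errors_spiky), rmse(errors_spiky))  # 2.0 4.0  <- same MAE, higher RMSE
```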
For Azure ML scenario matching, focus on nouns in the prompt. If it’s about organizing resources, governance, and a central place to manage ML assets, choose workspace. If it’s about scaling training jobs, choose compute cluster. If it’s about automating a multi-step workflow (prep → train → evaluate), choose pipeline. If it’s about exposing a model to applications, choose endpoint. If it’s about recurring scoring of large datasets, think batch inference rather than real-time.
Exam Tip: When two answers are both “true statements,” pick the one that directly satisfies the constraint in the stem (latency, labels, scale, repeatability). Constraints are the exam’s way of telling you which concept they’re testing.
Finally, watch for mismatch traps: clustering offered when the problem is classification, accuracy offered when classes are imbalanced, or “deploy” choices presented when the question is actually about evaluation. In a timed sim, your edge comes from consistently separating problem type, lifecycle stage, and Azure component.
1. A retail company wants to predict the number of units it will sell next week for each store based on historical sales, promotions, and weather forecasts. Which machine learning problem type should you use?
2. A team is building a model and reports unusually high accuracy during evaluation. You discover that a feature in the training data is populated using the future outcome (for example, a field that is filled in after a customer cancels). What is the most likely issue?
3. You are designing an ML workflow in Azure Machine Learning. You need a cloud resource to run model training jobs and scale the CPU/GPU capacity as needed. Which Azure ML concept should you use?
4. A company wants to segment customers into groups based on purchasing behavior, but there are no existing labels describing the groups. Which approach best fits the requirement?
5. You have trained a model in Azure Machine Learning and want to make it available to an application for real-time predictions through a web-accessible interface. What should you create?
AI-900 expects you to recognize computer vision workload types, map them to the right Azure AI capability, and interpret typical outputs (tags, captions, bounding boxes, confidence scores). This chapter targets the “pick the right tool” skill the exam rewards: given a scenario (retail shelf monitoring, invoice processing, safety PPE detection, photo search), you must identify whether the workload is OCR, image analysis, detection, classification, or a document-centric extraction flow—and then choose an Azure service family that fits.
You’ll also see common test traps: confusing OCR with image captioning, thinking “object detection” means “image classification,” or choosing a custom model when prebuilt APIs already meet requirements. Keep your answers anchored to what the workload must produce (text vs labels vs locations) and what constraints matter (real-time vs batch, printed vs handwriting, structured forms vs unstructured images).
Exam Tip: When stuck between two options, restate the required output in one noun phrase: “I need text,” “I need a label,” “I need boxes,” or “I need to extract fields.” That phrase usually maps directly to the correct capability category.
Practice note for each lesson in this chapter (Vision capability map: OCR, image analysis, detection, classification; Azure AI Vision scenarios and constraints; Document and image processing decision flow; Timed domain drill set: Computer vision workloads on Azure): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 assesses recognition more than implementation: you’re expected to identify which vision workload fits a business problem. Start with the capability map you’ll use throughout the exam: OCR extracts machine-readable text from images and documents; image analysis describes a scene with tags, captions, and categories; object detection locates items and returns bounding boxes; image classification assigns a label (or labels) to the whole image.
Now tie these to common scenarios the exam likes: retail (count items → detection; classify product type → classification), manufacturing QA (detect defect location → detection; classify pass/fail → classification), document processing (extract text/fields → OCR + document extraction), and media search (tagging/captioning → image analysis). The core exam skill is to read a scenario and infer the minimum required output: if the business wants “where” in the image, you need detection; if they want “what text,” you need OCR; if they want “what’s in the scene,” you need image analysis.
Exam Tip: The word “identify” in a scenario is ambiguous—look for hints. “Identify and locate” implies detection; “identify the type” implies classification; “identify text” implies OCR.
Common trap: choosing OCR when the scenario mentions “documents,” even if the task is actually to recognize objects in photos (e.g., “detect safety vests in site images”). Always separate “document image” (text-centric) from “photo/video” (scene-centric).
Azure AI Vision (the vision service family) is frequently tested at the concept level: what it can return and when you should use it. For “image analysis” style questions, think in outputs rather than APIs: captions (natural language description), tags (keywords), categories (broad scene groupings), and confidence scores (probabilities that help you decide thresholds).
The exam often gives you an output snippet (for example, a list of tags with confidence) and asks what kind of workload produced it or what business action should follow. Your job is to interpret confidence as “likelihood,” not certainty. A high-confidence tag can be used for automated routing; lower confidence suggests human review, additional signals, or a different model approach.
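A minimal sketch of acting on that kind of output. The response shape below is a simplified, hypothetical example (real services define their own JSON structure), and the 0.80 cutoff is an illustrative threshold.

```python
# Filter image-analysis style tags by confidence before automating a decision.
analysis = {
    "caption": {"text": "a forklift in a warehouse aisle", "confidence": 0.91},
    "tags": [
        {"name": "forklift", "confidence": 0.97},
        {"name": "warehouse", "confidence": 0.88},
        {"name": "person", "confidence": 0.41},
    ],
}

THRESHOLD = 0.80  # illustrative cutoff for automated routing
confident_tags = [t["name"] for t in analysis["tags"] if t["confidence"] >= THRESHOLD]
needs_review = [t["name"] for t in analysis["tags"] if t["confidence"] < THRESHOLD]

print(confident_tags)  # ['forklift', 'warehouse'] -> safe to automate
print(needs_review)    # ['person'] -> low confidence, route to human review
```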
Exam Tip: If the question emphasizes “describe the image” or “generate a caption,” that’s image analysis. OCR is about returning text strings with positions, not prose descriptions.
Constraints matter at selection time. Image analysis is best when you can accept general-purpose descriptions and broad attributes. It’s not intended to enforce a custom label taxonomy for a niche domain (e.g., 50 internal defect codes). That’s where custom training comes in (covered later). Also watch for trap wording like “predict the brand” or “recognize our proprietary parts”—that typically exceeds generic tagging and points to custom models.
Finally, remember the exam’s “minimum viable capability” bias: if generic tags/captions satisfy the business outcome, pick the managed, prebuilt option rather than a custom approach.
OCR questions on AI-900 usually test whether you understand what OCR returns and the practical constraints. OCR output is machine-readable text, often with structure such as lines/words and their coordinates (bounding polygons/boxes) plus confidence. This is distinct from “image analysis” tags. If the scenario needs to store, search, translate, or validate text (invoice number, VIN, street sign), OCR is the backbone.
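A minimal sketch of what “text plus position plus confidence” looks like downstream. The dictionary layout is a simplified, hypothetical shape rather than a specific API contract; the point is that OCR yields structured text you can store, search, or route for review.

```python
# Working with OCR-style output: recovered text, coordinates, and confidence per line.
ocr_result = {
    "lines": [
        {"text": "INVOICE #10492", "bounding_box": [40, 32, 310, 60], "confidence": 0.99},
        {"text": "Total: $1,284.00", "bounding_box": [40, 540, 280, 570], "confidence": 0.95},
        {"text": "handwritten note", "bounding_box": [40, 600, 260, 640], "confidence": 0.62},
    ]
}

full_text = " ".join(line["text"] for line in ocr_result["lines"])
low_confidence = [line for line in ocr_result["lines"] if line["confidence"] < 0.80]

print(full_text)                              # searchable/storable text from the image
print([l["text"] for l in low_confidence])    # candidates for human review (e.g., handwriting)
```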
A key exam distinction is printed text vs handwriting. Printed text is generally easier and more reliable; handwriting introduces variability and tends to be a stated constraint in the scenario. When the prompt explicitly says “handwritten notes,” “forms filled in by hand,” or “cursive,” treat that as a signal to select an OCR capability that supports handwriting recognition.
Exam Tip: If the scenario says “extract fields from invoices/receipts/forms,” don’t stop at OCR. OCR gets you text; the business likely needs key-value pairs (like Total, Date, Vendor). That’s a document extraction flow, which may involve prebuilt document models rather than only raw OCR.
Use a simple decision flow: (1) Do you need text? If no, don’t choose OCR. (2) If yes, is it unstructured text (signage) or structured documents (forms/invoices)? Structured documents typically require extracting labeled fields, tables, and layout—OCR alone may not meet the requirement. (3) Is handwriting present? If yes, ensure the chosen approach supports it and plan for lower confidence and review steps.
Common trap: choosing “translation” or “NLP” directly for scanned text. The exam expects OCR first, then downstream language processing.
The AI-900 exam frequently tests your ability to distinguish classification, object detection, and segmentation at a conceptual level. Use the "what + where + how much of the image" rule: classification answers "what" (one or more labels for the whole image), object detection answers "what + where" (labels plus bounding-box coordinates for each object), and segmentation answers "what + where + how much of the image" (pixel-level masks that outline exact shapes and areas).
On AI-900, segmentation may appear as a distractor: many business problems only need boxes (detection) or labels (classification). If the scenario needs precise shape/area (paint spill coverage, tumor boundary, road lane pixels), segmentation becomes the best conceptual match. If it needs counting items or finding their positions, detection is the better fit.
Exam Tip: If the prompt mentions “count,” “locate,” “bounding box,” “coordinates,” or “draw rectangles,” that’s detection. If it mentions “outline,” “mask,” “pixel-level,” or “background removal,” think segmentation.
Common trap: seeing multiple objects and assuming classification can do it. Classification can be multi-label, but it still doesn’t tell you where objects are; the exam will reward detection when location is required.
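If it helps to see the distinction as data, the following sketch contrasts assumed output shapes for the three workload types; the structures are invented for illustration, and notice that only the detection output supports counting and locating.

```python
# Illustrative output shapes only (assumed, not a specific service response):
# the exam skill is recognizing which shape the business actually needs.
classification_output = {"label": "defective", "confidence": 0.93}        # what
detection_output = [                                                       # what + where
    {"label": "car", "confidence": 0.91, "box": [34, 50, 210, 180]},
    {"label": "car", "confidence": 0.88, "box": [240, 60, 400, 190]},
]
segmentation_output = {"label": "spill", "mask_shape": (480, 640)}         # what + where + how much (pixel mask)

print(f"count of detected cars: {len(detection_output)}")  # counting/locating needs detection, not classification
```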
AI-900 is not a deep architecture exam, but it does test whether you can choose between prebuilt Azure AI capabilities and custom model approaches. The guiding principle: choose the simplest managed service that meets requirements with acceptable accuracy and constraints.
Use this high-level decision flow for document and image processing: (1) identify the required output (tags/captions, raw text, structured fields, boxes, or masks); (2) check whether a prebuilt capability already produces that output for generic content; (3) move to a custom model only when the scenario demands your own labels, domain-specific categories, or accuracy that generic models cannot deliver.
Constraints that push you toward custom: unique classes not covered by prebuilt models, controlled taxonomy, special imaging conditions, or required accuracy beyond what generic models deliver. Constraints that push you toward prebuilt: fast time-to-value, minimal ML expertise, and broad categories.
Exam Tip: Watch for “train with your own images” language—this is a strong hint toward custom models. Watch for “detect common objects” or “generate captions”—this points to prebuilt vision.
Common trap: selecting custom for every scenario “to be safe.” The exam penalizes over-engineering. If the prompt doesn’t require custom categories, default to the managed prebuilt capability.
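The same prebuilt-vs-custom logic can be drilled with a quick sketch; the flags are hypothetical scenario signals rather than real service options.

```python
# Hedged sketch of the prebuilt-vs-custom decision; flag names are assumptions
# standing in for wording you would spot in the scenario.
def choose_vision_approach(custom_taxonomy: bool, proprietary_classes: bool,
                           generic_tags_suffice: bool) -> str:
    if generic_tags_suffice and not (custom_taxonomy or proprietary_classes):
        return "prebuilt vision (tags, captions, common objects)"
    return "custom model trained on your own labeled images"

print(choose_vision_approach(False, False, True))   # "detect common objects"
print(choose_vision_approach(True, True, False))    # "recognize our proprietary parts"
```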
This chapter’s timed drill mindset is about rapid interpretation. AI-900 frequently provides a scenario plus a partial “result” (or describes what the output should look like) and asks you to pick the workload/tool. Train yourself to map outputs to capabilities: text strings with positions point to OCR; key-value fields and tables point to document extraction; tags, captions, and categories point to image analysis; a single label points to classification; labels with bounding boxes point to object detection; pixel-level masks point to segmentation.
In timed conditions, apply the “two-pass” strategy. Pass 1: identify the output type (text vs tags vs boxes vs masks). Pass 2: validate with one constraint word from the scenario (handwritten, real-time, proprietary, structured form). This reduces second-guessing and keeps you from falling for distractors that sound sophisticated.
Exam Tip: Don’t treat confidence scores as pass/fail. On the exam, confidence primarily signals that ML outputs are probabilistic and may require thresholds or human review. If a question mentions “low confidence,” the best action is often to route for review or collect more/better training data (if custom) rather than claiming the model is “wrong.”
Common traps in this domain drill set: (1) mixing up OCR with image captions, (2) choosing classification when the task requires counting/locating, and (3) picking custom training when prebuilt extraction already matches the requirement. Your goal is to answer based on the minimal required output and the simplest Azure AI capability that produces it.
1. A retail company wants to monitor store shelves using cameras. The system must identify each product on a shelf and return the location of each detected item in the image so the company can measure facings and detect out-of-stocks. Which computer vision workload type best fits this requirement?
2. A logistics team scans photos of shipping labels and needs to extract the tracking number and recipient address as text. Which Azure AI capability should you use?
3. You are designing a solution to process invoices. The business needs to extract specific fields such as invoice number, vendor name, and total amount from many documents. Which decision is most appropriate for this workload?
4. A construction company wants to confirm whether workers are wearing safety helmets in site photos. The solution must indicate which workers are missing helmets by returning the helmet locations in the image. Which workload type should you choose?
5. A photo management app wants to enable search for "beach" or "city skyline" across a user's photo library. The app does not need bounding boxes, only descriptive labels or a caption for each photo. Which capability best matches this requirement?
This chapter targets the AI-900 objective area where candidates must recognize Natural Language Processing (NLP) and generative AI workloads and select the right Azure service for a scenario. On the exam, you are rarely asked to build anything; you are tested on workload identification, service choice, and responsible AI considerations. Your job is to map a plain-English requirement (for example, “extract key points from customer emails” or “create a chat experience that answers from internal docs”) to the correct capability: Azure AI Language, Azure AI Translator, Azure AI Speech, Azure Bot Service, or Azure OpenAI.
Expect “near-miss” options that are technically related but wrong for the requirement. The exam writers like to swap terms such as sentiment vs. opinion mining, entity extraction vs. key phrase extraction, or chatbot vs. Q&A knowledge base. Another frequent trap is choosing a generative model when a deterministic NLP feature is safer, cheaper, and easier to justify from a compliance perspective.
Exam Tip: When the scenario asks to “identify language,” “translate,” “extract entities,” or “classify text,” default to Azure AI Language / Translator. When it asks to “generate,” “summarize creatively,” “write,” or “chat in natural language over broad topics,” consider Azure OpenAI—but confirm you also need grounding and safety controls.
Practice note for "NLP fundamentals (sentiment, key phrases, entities, summarization, translation)": document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for "Conversational AI concepts (bots, intent, utterances, knowledge grounding)": document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for "Generative AI on Azure (Azure OpenAI concepts and responsible usage)": document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for "Timed mixed-domain drill set (NLP + Generative AI workloads on Azure)": document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
NLP workloads on AI-900 are typically framed as “What do you want to do with text?” The exam expects you to recognize common outputs: sentiment labels, extracted phrases, recognized entities, summary sentences, or translated text. Each output implies a different underlying task and service capability. In Azure, many of these tasks are offered through Azure AI Language features, while translation is handled by Azure AI Translator.
Sentiment analysis returns a polarity assessment (often positive/negative/neutral) and sometimes confidence scores. A more detailed variant is opinion mining, which ties sentiment to specific aspects (for example, “battery life” is negative, “screen” is positive). Key phrase extraction outputs representative terms or short phrases that capture the main topics in a document. Entity recognition returns structured items such as people, locations, organizations, dates, and sometimes linked references (depending on capability). Summarization typically outputs key sentences (extractive summarization) or a condensed version (abstractive, often associated with generative AI).
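The sketch below lists assumed output shapes for these tasks so you can practice recognizing them at a glance; the structures are illustrative rather than exact Azure AI Language responses.

```python
# Illustrative output shapes for common NLP tasks (assumed structures).
sentiment = {"label": "negative", "scores": {"positive": 0.03, "neutral": 0.12, "negative": 0.85}}
opinion_mining = [{"aspect": "battery life", "sentiment": "negative"},
                  {"aspect": "screen", "sentiment": "positive"}]
key_phrases = ["battery life", "fast charging", "customer support"]
entities = [{"text": "Contoso", "category": "Organization"},
            {"text": "March 3", "category": "DateTime"}]
extractive_summary = ["The battery drains quickly under normal use.",
                      "Support resolved the issue within two days."]

print(sentiment["label"], "|", key_phrases[:2], "|", entities[0]["category"])
```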
Common trap: Treating key phrases as “tags” that you design. Key phrase extraction is automated and may not align to a controlled taxonomy. If the scenario requires categorizing into specific predefined labels (for example, Billing, Technical Support, Cancellation), that is classification, not key phrase extraction.
Exam Tip: If the output needs to be structured fields (customer name, invoice number, city), think extraction (entities). If it needs a single label from a known set, think classification. If it needs new natural language, think generative AI (but check if an extractive summary would satisfy the requirement more safely).
Azure AI Language is a core service family for AI-900 NLP questions. The exam tends to group features into three mental buckets: classification, extraction, and question answering patterns. Classification includes assigning a document to a category, detecting the language, or determining sentiment. Extraction includes entities, key phrases, and sometimes personally identifiable information (PII) detection as a compliance-focused task.
Question answering patterns are often tested indirectly: the requirement sounds like a chatbot, but the real need is retrieving an answer from a known body of content (FAQ, policy docs, product manuals). In those cases, the right building block is a Q&A capability (knowledge base) rather than a free-form generative model. The exam also tests whether you can distinguish “bot framework to host a conversation” from “language capability that finds answers.” The bot is the container; the language feature provides the brain for specific tasks.
Common trap: Selecting Azure OpenAI for “answer questions from a policy document” when the scenario emphasizes consistency, citations, and minimizing hallucinations. Q&A over curated content is often the better first choice unless the question explicitly calls for generative reasoning or rephrasing. Conversely, if the scenario needs conversational, multi-turn paraphrasing or drafting responses, generative AI may be required.
Exam Tip: Look for clues like “from a set of FAQs,” “approved answers,” “knowledge base,” “company policy,” “must be consistent.” Those cues push you toward Q&A patterns and away from unconstrained generation.
AI-900 frequently mixes modalities: text, speech, and translation can appear in the same scenario. Your task is to choose the capability that matches the input/output. If the input is audio and the output is text, you are in speech-to-text territory. If the input is text and the output is audio, that is text-to-speech. If the requirement is “translate text between languages,” choose Azure AI Translator. If the requirement is “translate spoken conversations,” you may need a combination: speech recognition plus translation plus (optionally) speech synthesis.
Translation scenarios often include language detection. If the user may submit text in unknown languages, translation services can detect the source language before translating. The exam may try to distract you with general “language understanding” options, but translation is its own distinct workload.
Common trap: Assuming “NLP” automatically includes speech. Speech services are specialized for audio. Another trap is choosing translation when the real requirement is localization of meaning and tone across marketing content (where generative rewriting might be requested). If the prompt says “translate accurately,” pick Translator; if it says “rewrite naturally for a local audience,” that begins to look like generative AI, and you should consider responsible review workflows.
Exam Tip: Match the data type: audio implies Speech; text implies Language/Translator. Then confirm whether the output is deterministic (translation) or creative (generative rewrite).
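A small sketch of the modality-matching rule can help during drills; the mapping mirrors this section and is deliberately simplified (translating spoken conversations would chain speech recognition with translation).

```python
# Hedged sketch of "match the data type"; simplified for drill purposes.
def pick_language_service(input_type: str, output_type: str, translate: bool) -> str:
    if input_type == "audio" and output_type == "text":
        return "Speech-to-text"
    if input_type == "text" and output_type == "audio":
        return "Text-to-speech"
    if input_type == "text" and output_type == "text" and translate:
        return "Azure AI Translator"
    return "Azure AI Language (text analytics)"

print(pick_language_service("audio", "text", translate=False))  # transcribe recorded calls
print(pick_language_service("text", "text", translate=True))    # translate support tickets
```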
Conversational AI questions on AI-900 test your understanding of bots, intents, utterances, and grounding. A bot is an application that manages conversation flow across channels (web chat, Teams, etc.). Intents represent what the user is trying to accomplish (for example, “reset password”), and utterances are example user phrases that map to intents. Grounding means the bot’s responses are anchored in approved data sources (FAQs, docs, transaction systems) rather than invented content.
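To see intents and utterances as data, here is a toy sketch with made-up intent names and phrases; real conversational AI services learn this mapping from example utterances rather than from keyword overlap.

```python
# Toy intent/utterance example; names and phrases are invented for illustration.
intents = {
    "reset_password": ["I forgot my password", "can't log in", "reset my credentials"],
    "check_order":    ["where is my order", "track package 12345", "delivery status"],
}

def match_intent(user_text: str) -> str:
    # Crude keyword overlap; real services learn this mapping from example utterances.
    text = user_text.lower()
    best_intent, best_overlap = "fallback (route to knowledge base or human)", 0
    for intent, utterances in intents.items():
        overlap = sum(1 for u in utterances if any(w in text for w in u.lower().split()))
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent

print(match_intent("I can't log in to my account"))  # expected: reset_password
```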
On the exam, recognize three common bot patterns. First, the FAQ/Q&A bot, which retrieves answers from a knowledge base. Second, the task-oriented bot, which collects slots/parameters (order number, date) and calls back-end systems. Third, the generative chat assistant, which drafts responses or explains concepts, often requiring stronger safety controls and grounding.
Common trap: Overusing “intent/utterance” language when the scenario is simply knowledge retrieval. If the user’s question is answered by one paragraph from a policy page, you may not need complex intent modeling at all—retrieval patterns can be enough. Another trap is forgetting that bots are not the same as language models: the hosting and orchestration layer (bot) is separate from the NLP or generative capability.
Exam Tip: If the scenario mentions “multi-turn,” “collect user info,” or “trigger a workflow,” it’s likely a task bot. If it mentions “answer from documentation,” it’s knowledge grounding. If it mentions “draft,” “compose,” or “brainstorm,” it is generative—then immediately think about safety, content filtering, and human review.
Generative AI on Azure (commonly via Azure OpenAI) is tested at a conceptual level: what it is good for, what risks it introduces, and how to apply basic prompt practices. Generative models can produce new text, summarize, classify, extract in flexible ways, and engage in conversational responses. They are powerful, but the exam expects you to know they can hallucinate, reflect biases, or leak sensitive information if not controlled.
Prompts are instructions plus context. Strong prompts specify the role, task, constraints, and expected format (for example, bullet list, JSON, or short answer). Many failures come from vague prompts that do not define boundaries. Grounding is how you reduce hallucination by providing relevant source content (for example, internal policy excerpts) and instructing the model to answer only from that context. On AI-900, grounding appears as “use your company documents” or “ensure answers are based on approved content.”
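Here is a hedged example of a grounded prompt that specifies role, task, constraints, format, and source context; the wording and the policy excerpt are invented for illustration, not an official template.

```python
# Sketch of a grounded prompt: role, task, constraints, format, and source context.
# The policy excerpt and wording are made up for illustration.
policy_excerpt = "Employees may work remotely up to three days per week with manager approval."

prompt = f"""You are an HR assistant.
Task: answer the employee's question using ONLY the policy excerpt below.
If the answer is not in the excerpt, reply exactly: "I don't know."
Format: one short sentence.

Policy excerpt:
{policy_excerpt}

Question: How many remote days are allowed per week?"""

print(prompt)
```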
Common trap: Assuming generative AI is always the right tool for summarization or extraction. If the scenario demands deterministic extraction (exact invoice number, guaranteed schema), classic NLP extraction or document processing may be more appropriate. Another trap is ignoring responsible AI: if the scenario involves sensitive data, you must consider privacy, access control, and content filtering.
Exam Tip: If the requirement includes “must not fabricate,” “must be auditable,” or “must comply with policy,” mention grounding, limiting responses to sources, and adding a fallback like “I don’t know” when the answer is not present.
Your timed drill strategy for mixed NLP + generative questions should be consistent: (1) identify the input type (text vs audio), (2) identify the required output (label, fields, translation, summary, new text), (3) determine whether the answer must be grounded in known content, and (4) select the service that provides that capability with the least risk. AI-900 punishes “cool tool bias”—choosing generative AI because it can do everything—when a simpler service is more correct.
When you see a prompt scenario (for example, “write a response to a customer complaint”), pause and decide whether the task is generation or analysis. If you must detect sentiment, extract key issues, and route tickets, that is primarily Azure AI Language. If you must draft a personalized reply, that is a generative workload (Azure OpenAI), but you should pair it with safety guidance such as tone constraints, refusal rules, and review steps. When you see “translate support tickets from Japanese to English,” that is Translator. When you see “transcribe recorded calls,” that is Speech-to-text.
Safety and responsible usage are not optional details: the exam expects you to recognize common mitigations. These include limiting prompts to necessary data, redacting PII before sending to models, using content filtering where applicable, setting clear system instructions, and adding human-in-the-loop review for high-impact outputs. You do not need deep implementation details, but you must be able to choose the safer design in a scenario.
Exam Tip: If two answers both seem plausible, choose the one that best matches the required output format and reduces risk (deterministic service when possible; grounded generation when needed). A frequent “weak spot” is confusing knowledge retrieval with free-form chat—train yourself to look for cues like “approved answers” vs. “creative drafting.”
1. A support team wants to automatically extract product names, customer names, and locations from incoming email text so the data can be stored in a database. Which Azure service/capability should you use?
2. You need to translate customer chat transcripts from French to English while preserving meaning. The solution should be purpose-built for translation rather than general text generation. What should you choose?
3. A company wants to classify each product review as positive, negative, or neutral and display an overall satisfaction trend. Which NLP capability is the best match?
4. You are designing a chat experience that answers employee questions using internal policy documents. The company wants responses grounded in those documents, not open-ended general knowledge. What should you use?
5. A marketing team wants an app that generates multiple draft product descriptions from short bullet points. The organization also requires responsible AI controls to reduce harmful outputs. Which Azure service is most appropriate?
This chapter is your capstone: you will run two full timed simulations, diagnose weak spots with a repeatable method, and finish with a high-yield review mapped to what AI-900 actually tests. The goal is not “more studying”—it’s exam performance under constraints. AI-900 rewards candidates who can (1) recognize common Azure AI workload patterns, (2) choose the right service family, and (3) explain responsible AI basics without overthinking implementation details.
You will notice a recurring theme: many questions are designed to test service selection and terminology boundaries (e.g., Azure Machine Learning vs Azure AI services; Azure AI Vision vs Document Intelligence; “classification” vs “object detection”). Your job is to read for intent, match to the correct workload type, and eliminate distractors that add unnecessary complexity.
Use the lessons in this chapter in sequence: first lock pacing, then complete Mock Exam Part 1 and Part 2, then run the weak spot analysis, and finally apply the exam-day checklist. Treat each piece as a system—if you skip the diagnostics, you will repeat the same mistakes.
Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Weak Spot Analysis: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Exam Day Checklist: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Your timed sims must feel like the real exam: uninterrupted, single sitting, no notes, no searching. Start by setting a pacing plan that prevents time loss on “maybe” questions. The AI-900 format can vary, but the skill is stable: quick comprehension, service mapping, and distractor elimination.
Adopt a two-pass rule. Pass 1: answer everything you can in one read. If you need to reread more than once or you are debating two close choices, mark it and move. Pass 2: return to marked items with the remaining time and resolve using objective anchors (workload type, output type, and service boundary). Exam Tip: If you are still torn after 60–90 seconds on a marked item, choose the option that best matches the workload’s “noun” (vision, language, ML, generative) and the scenario’s deliverable (labels, embeddings, translation, extracted fields, predictions).
Scoring targets should be realistic and diagnostic. Aim for a first-sim target that reveals gaps (e.g., 75–80% equivalent performance), then raise to a comfort buffer (e.g., 85%+) by the second sim. The purpose is not to chase perfection; it’s to ensure you are not losing points to avoidable traps like confusing training vs inferencing, mixing up classification and regression, or assuming you must build custom ML when a prebuilt Azure AI service fits.
Finally, treat responsible AI questions as “definition + implication” items. The exam often tests whether you know what fairness, reliability/safety, privacy/security, inclusiveness, transparency, and accountability mean in practice—not how to code them.
Mock Exam Part 1 is designed to sample the full blueprint: AI workloads, machine learning fundamentals, computer vision, NLP, and generative AI. Expect scenario prompts that are “one paragraph long” and end with a direct requirement such as identifying a workload type, selecting an Azure service, or recognizing an evaluation metric. Your performance improves when you translate the scenario into a small checklist: input modality (text/image/tabular), desired output (label/score/coordinates/summary), and whether training is required.
Common traps in Part 1 revolve around choosing services that are too general or too custom. For example, candidates over-select Azure Machine Learning for tasks that are classic prebuilt Azure AI services (like OCR, key phrase extraction, translation, or sentiment). Remember what the exam wants: choose the simplest correct Azure capability that meets requirements. Exam Tip: If the scenario describes “extract text,” “detect language,” “translate,” “recognize faces,” or “analyze sentiment,” default to Azure AI services unless it explicitly demands custom model training on proprietary data.
When machine learning concepts appear, the exam usually tests foundational terms: training vs inference, features vs labels, supervised vs unsupervised learning, and evaluation basics. Watch for distractors that swap classification and regression. If the output is a category (fraud/not fraud, churn/no churn), it’s classification. If the output is a number (price, demand), it’s regression. If the task is grouping without labels, it’s clustering.
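The output-type rule can be rehearsed with a short sketch; the descriptions passed in are made-up drill inputs, not exam wording.

```python
# Hedged sketch of the "look at the output" rule for ML task identification.
def ml_task_for(output_description: str, has_labels: bool) -> str:
    if not has_labels:
        return "clustering (unsupervised grouping)"
    if output_description == "category":
        return "classification (e.g., fraud / not fraud)"
    if output_description == "number":
        return "regression (e.g., predicted price)"
    return "re-read the scenario"

print(ml_task_for("category", has_labels=True))   # churn / no churn
print(ml_task_for("number", has_labels=True))     # demand forecast value
print(ml_task_for("category", has_labels=False))  # group customers with no labels
```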
For vision, quickly separate image classification (single label), object detection (labels + bounding boxes), and OCR/document extraction (text and structured fields). For NLP, separate text analytics (sentiment, entities, key phrases), translation, and conversational AI (bots, question answering). Part 1 also introduces generative AI selection: identify when you need a model that generates new text/code versus extracting or classifying existing text. Avoid the trap of calling everything “generative.” Summarization and drafting are generative; entity extraction and sentiment are not.
Mock Exam Part 2 raises difficulty by using closer distractors and multi-constraint scenarios. Here, the exam tests whether you can pick the best fit, not just a plausible tool. You’ll see options that are all “Azure-sounding” (e.g., Azure AI Vision vs Azure AI Document Intelligence; Azure OpenAI vs generic NLP; Azure Machine Learning vs AutoML vs designer). Your advantage comes from mapping the requirement to the service’s primary output.
A high-frequency distractor pattern is “custom vs prebuilt.” If the prompt emphasizes domain-specific documents with varied layouts and a need to extract structured fields, Document Intelligence (prebuilt or custom) is typically the anchor. If it emphasizes general image understanding, tags, captions, or object location, Vision fits better. For language, if the requirement is “conversational interface,” think Azure AI Bot Service plus language capabilities; if it’s “extract entities/sentiment from text,” think Azure AI Language text analytics features. If it is “generate a draft response” or “summarize long content,” generative AI is likely, and Azure OpenAI concepts apply.
Harder distractors also show up in ML evaluation. The exam may test when accuracy is misleading (imbalanced classes) and when you’d consider precision/recall tradeoffs. You don’t need to compute metrics, but you must recognize the intent: if false positives are costly, prioritize precision; if false negatives are costly, prioritize recall. Exam Tip: In healthcare screening and safety scenarios, missing a true case (false negative) is often the bigger risk—lean recall. In fraud blocking or moderation where wrongful flags harm users, precision becomes more important.
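A quick worked example with invented counts shows how the two metrics diverge and why the costlier error type should drive your choice.

```python
# Worked numbers for the precision/recall intuition above; counts are invented.
tp, fp, fn = 80, 5, 40   # true positives, false positives, false negatives

precision = tp / (tp + fp)   # of the cases we flagged, how many were right?
recall    = tp / (tp + fn)   # of the real cases, how many did we catch?

print(f"precision = {precision:.2f}")  # 0.94 -> few wrongful flags (matters for fraud blocking, moderation)
print(f"recall    = {recall:.2f}")     # 0.67 -> many missed cases (bad for healthcare/safety screening)
```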
For responsible AI, Part 2 may embed ethics inside technical scenarios. If a model behaves differently across groups, that’s fairness. If users can’t understand why a decision was made, that’s transparency. If data includes sensitive information, that’s privacy/security. Don’t over-rotate into governance jargon—answer based on the principle named and the risk described.
After both mocks, your score is less important than your miss pattern. Build a simple diagnostic grid with the official domains: (1) AI workloads and considerations, (2) fundamental principles of machine learning on Azure, (3) computer vision workloads on Azure, (4) NLP workloads on Azure, (5) generative AI workloads on Azure. For every missed or guessed item, classify it twice: domain and “failure mode.”
Use these failure modes to make your remediation efficient: misread requirement (you answered a different question than the one asked), terminology or concept gap (you did not know what a term or service does), service boundary confusion (you knew the concepts but picked the neighboring service), over-engineering (you chose custom or generative when prebuilt was enough), and timing pressure (you rushed or changed a correct answer). A minimal logging sketch follows.
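The error-log structure below is an assumption (use whatever tracker you already keep); the point is to tally misses by domain and failure mode so the biggest cluster gets fixed first.

```python
# Minimal error-log sketch (assumed structure): tally each miss by domain and failure mode.
from collections import Counter

misses = [
    ("Computer vision",  "service boundary confusion"),
    ("Generative AI",    "over-engineering"),
    ("Computer vision",  "service boundary confusion"),
    ("Machine learning", "terminology gap"),
]

by_cell = Counter(misses)
for (domain, failure_mode), count in by_cell.most_common():
    print(f"{domain:18} | {failure_mode:28} | {count}")
```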
Then apply a “one-sentence rule” to each miss: write one sentence stating why the correct option is correct in terms of input → output → service. Example format: “Because the input is scanned forms and the output is structured fields, the best fit is Document Intelligence.” This forces you to internalize the exam’s decision logic.
Exam Tip: If more than half your misses are “service boundary confusion,” stop doing more questions and instead do targeted comparison drills (Vision vs Document Intelligence; Language vs OpenAI; Azure ML vs prebuilt AI services). Boundary clarity produces the fastest score gains.
This final review aligns to the official objective domains and focuses on “high-yield” recognition points that frequently decide borderline scores.
Describe AI workloads and identify common AI solution types and responsible AI considerations: Know workload categories (prediction, classification, anomaly detection, vision, language, generative). Be fluent in responsible AI pillars: fairness, reliability & safety, privacy & security, inclusiveness, transparency, accountability. Common trap: mixing fairness with inclusiveness—fairness is equitable outcomes across groups; inclusiveness is designing for broad accessibility and varied user needs.
Explain fundamental principles of machine learning on Azure: Supervised vs unsupervised, training vs inference, features vs labels. Basic evaluation intent (accuracy vs precision/recall) and why data quality matters. Recognize where Azure Machine Learning fits (build/train/manage models) versus where prebuilt AI services fit (no training required). Trap: assuming ML always requires deep learning; the exam is concept-first, not algorithm-first.
Describe computer vision workloads on Azure: Classification vs object detection vs OCR/document processing. Understand that OCR-like requirements often point to Document Intelligence (especially for forms) rather than generic Vision. Trap: choosing “vision” for a scenario that clearly needs field extraction from invoices/receipts.
Describe NLP workloads on Azure: Text analytics tasks (sentiment, key phrases, entities), translation, and conversational solutions. Trap: treating chatbots as only “language models”—the exam expects you to recognize the workload (conversation + orchestration) not just the text processing.
Describe generative AI workloads on Azure: When generation is needed (drafting, summarizing, rewriting, code generation) versus analysis/extraction. Prompt basics: instructions, context, examples, and output formatting. Trap: calling entity extraction “generative” or assuming a generative model is required for simple classification.
Your exam-day plan should reduce cognitive load. Prepare your environment (quiet space, stable internet if remote, required identification, and a clean desk). If in a test center, arrive early enough to settle; if remote, complete system checks well before the appointment. Eliminate avoidable stress so your attention stays on reading carefully and mapping scenarios to services.
Time strategy is simple: execute the two-pass method from Section 6.1. Do not “fight” a single question. AI-900 questions are often short, and the biggest time sink is second-guessing. Exam Tip: When two answers seem correct, choose the one that directly satisfies the requirement with the least added assumption. The exam favors direct alignment over creative architecture.
In the last hour before the exam, avoid heavy new content. Run a compact drill that reinforces boundaries and definitions: restate each responsible AI principle in one line, contrast classification vs object detection vs OCR, contrast Azure AI Language vs Azure OpenAI, and contrast prebuilt Azure AI services vs Azure Machine Learning.
Finally, commit to a calm finishing routine: if you have time at the end, review only marked items and any with obvious misreads. Don’t change answers without a clear reason grounded in the objective. Your goal is consistent decision-making—exactly what the timed sims have trained.
1. You are taking a timed AI-900 mock exam. A question describes this requirement: “Build a solution that labels images with tags such as ‘beach’, ‘mountain’, and ‘food’ without training a custom model.” Which Azure service is the best fit?
2. A company wants to process thousands of scanned invoices and extract the invoice number, vendor name, and total amount into a database. The solution should use a prebuilt model where possible. Which Azure service should you recommend?
3. During weak-spot analysis, you review a missed question: “Detect all cars in an image and return bounding box coordinates.” Which workload type should you associate this requirement with?
4. A support team wants to automatically route incoming customer emails into categories such as “billing,” “technical issue,” and “cancellation.” They also want an explanation of which AI capability is being used, without focusing on implementation details. Which description best matches the workload?
5. You are finalizing your exam-day checklist. You want a repeatable approach to reduce mistakes on service-selection questions (for example, Azure Machine Learning vs Azure AI services) under time pressure. Which action is MOST effective?