AI Certification Exam Prep — Beginner
Master Azure AI fundamentals and walk into the AI-900 exam ready.
This course blueprint is built to take a true beginner from “zero” to confident for the Microsoft AI-900: Azure AI Fundamentals exam. You’ll learn the exact concepts Microsoft tests—without requiring prior certification experience or coding. Each chapter aligns directly to the official exam domains: describe AI workloads and considerations, fundamental principles of machine learning on Azure, computer vision workloads on Azure, natural language processing (NLP) workloads on Azure, and generative AI workloads on Azure.
AI-900 questions are scenario-driven: you’re asked to pick the right AI approach, identify which Azure AI capability fits a requirement, or recognize core machine learning ideas such as training vs inference and model evaluation. This course is designed around that reality: you’ll first understand the concept, then practice the exam-style decision-making that Microsoft expects.
The course is organized as a structured 6-chapter book so you can progress logically and track mastery by domain.
You’ll learn to describe AI workloads and identify when to use prebuilt Azure AI services versus when custom machine learning is appropriate. You’ll also cover the fundamentals of machine learning on Azure, including common task types (classification, regression, clustering, anomaly detection), the ML lifecycle, and the purpose of key evaluation metrics.
From there, you’ll master the service-selection mindset for applied AI workloads. For computer vision workloads on Azure, you’ll learn how to choose capabilities like image analysis and OCR, and how to reason about document extraction use cases. For NLP workloads on Azure, you’ll learn how language and speech solutions map to problems such as sentiment, entity extraction, transcription, and translation. Finally, for generative AI workloads on Azure, you’ll learn the foundational concepts (prompts, tokens, embeddings, retrieval-augmented generation) and how responsible AI considerations apply to generative solutions.
Most AI-900 misses come from confusion between similar services, misunderstanding what a metric implies, or overlooking a scenario constraint like privacy, safety, or “no custom training.” This course emphasizes exactly those skills: telling similar services apart, interpreting metrics correctly, and reading scenario constraints before answering.
If you’re new to certification prep, start by planning your schedule and building momentum with the early milestones. Track your progress chapter by chapter, and consider pairing this course with additional Azure fundamentals practice.
By the end, you won’t just know definitions—you’ll be able to choose the right Azure AI approach under exam pressure, manage your time, and walk into the AI-900 test with a proven strategy.
Microsoft Certified Trainer (MCT)
Jordan Whitaker is a Microsoft Certified Trainer who helps beginners earn Microsoft cloud and AI certifications through exam-aligned learning paths. He specializes in translating Azure AI services and core ML concepts into test-ready skills with realistic practice questions and review plans.
AI-900 is designed to validate foundational literacy—not deep engineering ability—in AI concepts and Azure AI services. That sounds simple, but many candidates miss points because they study “AI in general” instead of what the exam actually measures: recognizing workloads, choosing the right Azure service family, and applying core machine learning (ML) terms correctly (training vs. inference, evaluation metrics, and responsible AI). This chapter orients you to the exam format and builds a plan that makes your study efficient and repeatable.
Expect questions that reward precise vocabulary and service recognition. If you can reliably map a scenario (for example, “extract text from receipts” or “classify sentiment in customer reviews”) to the correct Azure offering and explain key ML lifecycle concepts, you’re on track. Your goal over the next 2–4 weeks is not to memorize product marketing names, but to practice a consistent decision process: identify the workload, identify constraints (latency, cost, customization, data sensitivity), then pick the service category and capability that fits.
Exam Tip: When you feel torn between two answers, ask: “Which option matches the workload category the exam is testing?” AI-900 often includes distractors from the right ecosystem but the wrong workload (e.g., mixing computer vision with NLP, or mixing training with inference).
Practice note for “Understand the AI-900 exam format, domains, and question types”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Set up registration, scheduling, and test-day requirements (online or test center)”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Learn scoring, passing expectations, and how to avoid common exam traps”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Build a 2-week and 4-week study plan aligned to exam domains”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Use practice-question strategy: eliminate distractors and manage time”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 measures whether you can describe common AI workloads and select appropriate Azure AI solutions at a high level. Think of the exam as a “workload-to-service mapping” test plus foundational ML literacy. Microsoft periodically updates the skill outline, but the exam consistently centers on five domains: (1) AI workloads and considerations, (2) machine learning principles on Azure, (3) computer vision workloads, (4) natural language processing (NLP) and speech workloads, and (5) generative AI workloads, with responsible AI woven throughout.
For workload identification, you need to distinguish supervised learning (labeled outcomes like “fraud/not fraud”), unsupervised learning (grouping without labels), and anomaly detection (rare events). For ML fundamentals, the exam focuses on training vs. inference, features vs. labels, model evaluation (accuracy, precision/recall), and common overfitting/underfitting intuition. For vision and language, the focus is selecting the right Azure AI service family (Azure AI Vision, Azure AI Language, Azure AI Speech) and understanding what they do.
Generative AI appears as concepts (prompts, tokens, embeddings, grounding) and responsible AI (privacy, bias, transparency, content safety). You don’t need to build a model from scratch, but you must recognize when a scenario needs Azure OpenAI versus a classic classifier, and when “use a prebuilt model” is preferable to “train a custom model.”
Exam Tip: The exam rarely rewards the most complex solution. If a scenario can be solved with a prebuilt capability (OCR, sentiment analysis, key phrase extraction), that is usually the intended answer over “train a custom model,” unless the prompt explicitly demands customization or domain-specific labels.
Registering correctly reduces test-day stress and prevents avoidable issues. AI-900 is delivered through Microsoft’s exam provider (often Pearson VUE). Create or confirm your Microsoft Certification profile and ensure your legal name matches your government ID exactly—mismatches are a common reason candidates lose time or are turned away. Choose delivery mode early: online proctored (convenient, but strict environment rules) or test center (more predictable, but travel required).
Pricing varies by region and discounts may apply (student pricing, employer vouchers, or event vouchers). Treat vouchers like perishable inventory—confirm expiration dates and whether the voucher restricts exam type. For accommodations, request them well ahead of scheduling; approval can take time, and you may need documentation. If you need extra time or specific arrangements, do not wait until the week of the exam.
Reschedule and cancellation rules matter because “life happens.” Know the cutoff window in your region (often 24 hours, sometimes more). If you miss the window, you may forfeit the fee. For online exams, run the system test in advance, confirm webcam/microphone permissions, and plan a clean desk and private room. For test centers, arrive early; check-in and security procedures can take longer than expected.
Exam Tip: For online proctoring, reduce risk: use a wired connection if possible, close all background apps, disable VPN, and remove secondary monitors if required. Many candidates are ready academically but lose time to preventable technical friction.
AI-900 uses a mix of item types: traditional multiple-choice, multiple-response (“choose all that apply”), drag-and-drop matching, and scenario-based items. The interface typically allows flagging items for review, but some sections (like case-style blocks, if present) can have constraints such as reviewing within that block only. Your job is to control tempo and avoid “time sink” items.
Build a time plan before you start. A practical approach is a two-pass strategy: on pass 1, answer everything you can within roughly 60–75 seconds per question; on pass 2, return to flagged items. If a question is taking longer because you’re debating two services, step back and identify what the exam is truly testing: workload category (vision vs. language vs. ML), task type (classification vs. extraction vs. generation), and whether the scenario describes training or inference.
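To make the tempo concrete, here is a quick back-of-the-envelope budget. The question count and duration below are illustrative assumptions (actual AI-900 exam forms vary), so treat the numbers as a template, not exam facts.

```python
# Rough two-pass time budget. QUESTIONS and MINUTES are assumed values;
# adjust them to whatever your exam form actually shows.
QUESTIONS = 50
MINUTES = 65

seconds_per_question = MINUTES * 60 / QUESTIONS
pass1_minutes = 0.75 * MINUTES   # pass 1: answer everything you can quickly
pass2_minutes = 0.25 * MINUTES   # pass 2: revisit flagged items

print(f"Average budget: {seconds_per_question:.0f} seconds per question")
print(f"Pass 1: {pass1_minutes:.0f} min, Pass 2: {pass2_minutes:.0f} min")
```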
Exam Tip: Watch for “keyword bait.” Words like “prediction” can appear in non-ML contexts, and “model” can refer to language models or ML models. Always anchor on the input and output: What data goes in (image, text, audio)? What comes out (labels, entities, transcript, generated text)?
Microsoft exams use a scaled scoring model (scores are reported on a scale up to 1,000, with 700 typically required to pass). You’ll see a score report that highlights performance by skill area rather than giving you a detailed list of missed items. Do not over-interpret a single attempt—use the domain-level breakdown to target your next round of study. Treat “passing” as a byproduct of solid domain coverage rather than chasing a specific score target.
A common trap is assuming that if you “feel good” about ML theory you’ll pass. AI-900 rewards accurate service selection and correct interpretation of the scenario’s constraints. Your score report helps you identify where your mental mapping is weak: for example, you might know what sentiment analysis is, but still confuse which Azure service family provides it.
If you need a retake, plan it strategically: (1) revisit the official skill outline, (2) redo the weakest domain using Microsoft Learn modules, (3) reattempt practice items only after you can explain why the correct answer is correct and why the distractors are wrong. Avoid same-day or next-day retakes without learning changes; that typically repeats the same mistakes.
Exam Tip: Your goal after an attempt is not “more practice questions.” It’s “fewer unknowns.” Turn every miss into a rule you can restate in one sentence (e.g., “Training builds the model; inference uses the model to predict on new data”). Those rules are what you carry into the next attempt.
For AI-900, prioritize official resources that mirror exam language and service boundaries. Start with Microsoft Learn learning paths aligned to AI-900 because they teach the exact workload categories and Azure service groupings the exam expects. Use the official documentation selectively—docs are deep, but you only need the “what it does,” “when to use it,” and “key limitations/inputs/outputs” for this exam.
In your first week, aim for breadth: cover AI workloads, ML basics, computer vision, NLP/speech, and an overview of generative AI and responsible AI. In week two (or weeks three and four if you’re on a longer plan), shift to depth through targeted labs or sandbox exercises. You do not need to become an Azure ML engineer, but hands-on exposure makes concepts like training vs. inference, dataset splits, and evaluation metrics feel concrete.
Exam Tip: Don’t study by product names alone. Study by “capability verbs” (detect, extract, classify, transcribe, translate, summarize, generate). The exam describes what the user wants; your job is to match the verb to the service capability.
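As a study aid, you can turn that verb-first habit into a simple lookup. The mapping below is a simplified revision sketch condensed from this course’s own summaries, not an official Microsoft table.

```python
# Study aid: map "capability verbs" from a scenario to the Azure AI service
# family that typically provides them. Simplified for revision purposes.
VERB_TO_SERVICE = {
    "detect": "Azure AI Vision (object detection)",
    "extract text": "Azure AI Vision OCR / Read",
    "extract fields": "Azure AI Document Intelligence",
    "classify": "Custom classification (vision or language)",
    "transcribe": "Azure AI Speech (speech to text)",
    "translate": "Azure AI Translator",
    "summarize": "Azure AI Language or Azure OpenAI",
    "generate": "Azure OpenAI",
}

def suggest_service(requirement: str) -> list[str]:
    """Return candidate service families whose verb appears in the text."""
    text = requirement.lower()
    return [svc for verb, svc in VERB_TO_SERVICE.items() if verb in text]

print(suggest_service("Transcribe call audio and summarize each call"))
```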
Practice is most effective when it builds recall and decision-making, not just familiarity. Use practice sets to diagnose gaps in (1) workload identification, (2) Azure service selection, and (3) ML lifecycle vocabulary. After each set, categorize every miss: was it a concept error (e.g., confusion about precision/recall), a service-mapping error (choosing the wrong Azure AI family), or a reading error (missing “custom,” “real time,” or “multilingual”)? Your remediation depends on the category.
Adopt an “eliminate distractors” routine. Many incorrect options are plausible technologies but wrong for the scenario’s data type or outcome. Train yourself to reject answers quickly by asking: Does this service accept the input in the question (image/text/audio)? Does it produce the requested output (transcript/entities/labels/generated text)? Is the scenario asking for training a model or using an existing model for inference?
Exam Tip: Avoid the trap of memorizing the correct letter choice. Your review should end with a statement you could teach: “This is computer vision because the input is an image and the output is extracted text, so OCR in Azure AI Vision is the best fit.” If you can’t explain it, you don’t own it yet.
1. You are preparing for the AI-900 exam. Which study approach is MOST aligned with how the exam measures skills?
2. A company needs employees to take AI-900. Some will test online and others at a test center. Which preparation step is MOST important to avoid test-day issues across both delivery options?
3. During practice questions, you frequently get stuck between two plausible Azure options (for example, an NLP service vs. a computer vision service). What is the BEST decision rule to apply, based on AI-900 exam strategy?
4. You are building a 2-week AI-900 study plan for a colleague. They have limited time and want the highest score impact. Which plan is MOST aligned to AI-900 domain expectations?
5. A candidate consistently misses questions that use ML lifecycle terms (for example, confusing when a model is trained vs. when it is used to make predictions). Which correction MOST directly targets the exam objective and reduces this trap?
This chapter maps directly to the AI-900 “Describe AI workloads” domain. The exam is less about building models and more about choosing the right AI workload and the right Azure capability for a given business scenario. Expect scenario-based questions that describe inputs (images, text, sensor readings), outputs (labels, numbers, summaries, recommendations), and constraints (latency, explainability, data availability), then ask what type of AI workload it is and which Azure service fits.
You’ll repeatedly see exam terms like AI, machine learning, deep learning, and generative AI. In exam language: AI is the umbrella for systems that exhibit “intelligent” behavior; ML is AI that learns patterns from data; deep learning is ML using multi-layer neural networks (common for vision and language); and generative AI creates new content (text, images, code) rather than only predicting labels or numbers. These definitions aren’t academic—Microsoft uses them to differentiate workload types and product choices on the test.
As you read the sections, keep a decision checklist in mind: (1) What’s the modality—image, text/audio, tabular data, or mixed? (2) Is the goal prediction, understanding, searching, or generation? (3) Do you need a prebuilt capability (fastest path) or a custom model (domain-specific)? (4) What Responsible AI considerations must be addressed? Those four steps will eliminate most wrong answers.
Practice note for “Define AI, ML, deep learning, and generative AI in exam terms”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Match business scenarios to AI workload types (vision, language, prediction, decision)”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Choose between custom models vs prebuilt Azure AI services”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Apply Responsible AI basics: fairness, reliability, privacy, transparency, accountability”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Practice: exam-style questions for Describe AI workloads”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 commonly groups scenarios into workload categories you can recognize quickly: vision, language, prediction, and decision/support (often conversational or recommendation-like). Vision workloads include image classification (“is this a damaged part?”), object detection (“where are the pedestrians?”), OCR (“extract text from invoices”), and video analysis (“track people count over time”). Language workloads include sentiment analysis, entity extraction, translation, summarization, and speech-to-text/text-to-speech. Prediction workloads usually mean machine learning over structured data: forecasting demand, estimating risk, or predicting churn. Decision workloads often look like “recommend next action,” “route to the right agent,” or “assist a user,” which can be rules-based, ML-based, or generative.
Generative AI appears in modern scenarios where the output is new content: drafting an email reply, generating a product description, summarizing a policy with citations, or creating code snippets. On the exam, recognize generative phrasing: “compose,” “draft,” “create,” “generate,” “summarize,” “chat,” “Q&A over documents.” That usually points to Azure OpenAI (or an orchestration pattern around it).
Exam Tip: When a scenario asks to “extract text” from images, that’s not NLP first—it’s typically a vision/OCR workload. NLP may come after OCR if you then need entities, key phrases, or sentiment. Many test-takers jump straight to language services and miss the image-to-text step.
Common trap: confusing “prediction” with “generation.” Predictive ML outputs a label/number (e.g., “probability of default = 0.18”). Generative AI outputs free-form content (e.g., “write a credit risk explanation”). The exam will often include both needs; choose the option that matches the primary requirement in the question.
The exam expects you to know when ML is appropriate versus a deterministic, rules-based approach. Rules-based systems follow explicit logic (“IF customer is gold AND order > $500 THEN free shipping”). They are best when rules are stable, explainability is paramount, and you can enumerate conditions without ambiguity. They’re also easier to validate and audit because outcomes are predictable.
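As a minimal sketch, the shipping rule from the paragraph above looks like this in code: fully deterministic, trivially auditable, and requiring no training data.

```python
# The rules-based example from the text, as deterministic code: explicit,
# auditable logic with no model or training data involved.
def free_shipping(customer_tier: str, order_total: float) -> bool:
    # IF customer is gold AND order > $500 THEN free shipping
    return customer_tier == "gold" and order_total > 500

assert free_shipping("gold", 650.0) is True
assert free_shipping("silver", 650.0) is False
```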
Machine learning is appropriate when rules are hard to write because patterns are complex, data-driven, or change over time—fraud detection, image recognition, demand forecasting, or triaging support tickets. In ML, you train on historical examples (features and labels for supervised learning) to learn a mapping from inputs to outputs. On Azure, training happens in a managed environment (for example, Azure Machine Learning), while inference is using the trained model to score new data in production.
Exam Tip: If the scenario says “the criteria changes frequently” or “too many combinations to list,” it’s a strong signal for ML. If it says “must be 100% consistent with policy” or “decisions must be explainable as business rules,” rules-based may be the better answer.
Common trap: assuming ML always beats rules. In reality, a simple threshold or rule is often the correct choice when data is limited or labels are unavailable. Another trap is mixing up training and inference: training is compute-heavy and offline/periodic; inference must meet latency and scalability needs (real-time API, batch scoring, edge deployment). The exam often tests whether you can separate those phases conceptually.
AI-900 frequently asks you to identify the type of ML task described. Start by identifying the expected output. Classification outputs a category or class label (spam vs not spam; defect type A/B/C). Binary and multiclass are both classification. Regression outputs a continuous number (price, demand, temperature, time-to-failure). Clustering groups items based on similarity when you don’t have labels (customer segmentation). Anomaly detection flags rare or unusual patterns (unexpected network traffic, sensor spikes, fraudulent transactions).
Deep learning is not a task type; it’s a technique. In exam terms, deep learning is often associated with unstructured data (images, audio, natural language) and with higher accuracy at the cost of more data and compute. Don’t pick “deep learning” when the question is asking for “classification vs regression.”
Exam Tip: Use a one-line rule: “If the answer is a word from a fixed set, it’s classification; if it’s a number on a scale, it’s regression.” Clustering is a common distractor—if labels exist (historical examples with known outcomes), it’s usually not clustering.
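A tiny scikit-learn sketch makes the one-line rule concrete; the data here is invented purely for illustration.

```python
# One-line rule in code: a label from a fixed set is classification;
# a number on a scale is regression.
from sklearn.linear_model import LinearRegression, LogisticRegression

X = [[1], [2], [3], [4], [5], [6]]          # one feature per example

# Classification: the target is a category from a fixed set.
spam_labels = [0, 0, 0, 1, 1, 1]            # 0 = not spam, 1 = spam
clf = LogisticRegression().fit(X, spam_labels)
print("class:", clf.predict([[2.5]]))        # -> a label (0 or 1)

# Regression: the target is a continuous number.
prices = [100.0, 150.0, 210.0, 240.0, 310.0, 360.0]
reg = LinearRegression().fit(X, prices)
print("price:", reg.predict([[2.5]]))        # -> a number on a scale
```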
Model evaluation appears lightly in AI-900 but is important. For classification, you’ll see concepts like accuracy and confusion matrices; for regression, error measures such as mean absolute error. The exam’s main intent is to ensure you know evaluation is needed and that performance must be measured against requirements (for example, false negatives in fraud may matter more than overall accuracy).
Common trap: treating anomaly detection as “just classification.” It can be, but many anomaly methods are unsupervised or semi-supervised. If the scenario emphasizes “rare events” or “unknown types of issues,” anomaly detection is the better match.
Knowledge mining is about turning large volumes of unstructured or semi-structured content (PDFs, documents, images, call transcripts) into something you can search and analyze. On Azure, this is commonly associated with Azure AI Search patterns: ingestion of content, enrichment using AI (OCR, language detection, key phrase extraction, entity recognition), then indexing into a searchable structure, and finally retrieval (querying, filtering, ranking) via an application.
Think of enrichment as “AI that adds metadata,” such as recognized entities (people, places), extracted text from images, detected language, or custom tags. Indexing is building a structure optimized for fast query and ranking. Retrieval is the act of finding relevant items in response to a query. In many enterprise scenarios, this pipeline supports Q&A systems, internal document portals, and “find the right policy” use cases.
Exam Tip: If a scenario says “search across PDFs and images,” look for a solution that includes OCR plus search indexing. A pure language model answer is often incomplete if the documents aren’t already text or searchable.
Common trap: equating “search” with “web search.” In enterprise knowledge mining, you’re searching your own content and relying on enrichment to make it searchable. Another trap is missing the order of operations: you generally enrich before indexing so the index contains the enriched fields you want to query (entities, key phrases, tags).
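Here is a conceptual sketch of that order of operations in plain Python. The enrichment helper is a hypothetical placeholder (a real pipeline would call OCR and language skills, for example via Azure AI Search skillsets); the point is that enrichment happens before indexing so the enriched fields are queryable.

```python
# Knowledge-mining order of operations: ingest -> enrich -> index -> retrieve.
def enrich(doc: dict) -> dict:
    # Placeholder "AI that adds metadata": OCR text, key phrases, tags, etc.
    doc["text"] = doc.get("text") or "(text extracted via OCR here)"
    doc["key_phrases"] = ["placeholder", "phrases"]
    return doc

index: list[dict] = []

def ingest_and_index(raw_docs: list[dict]) -> None:
    for doc in raw_docs:
        index.append(enrich(doc))      # enrich FIRST, then index

def retrieve(query: str) -> list[dict]:
    q = query.lower()
    return [d for d in index if q in d["text"].lower()
            or any(q in p for p in d["key_phrases"])]

ingest_and_index([{"id": "policy-1", "text": "Remote work policy details"}])
print(retrieve("policy"))
```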
A core AI-900 skill is choosing between prebuilt Azure AI services and custom models. Prebuilt services (for example, Azure AI Vision, Azure AI Language, Azure AI Speech, Azure AI Translator) provide ready-to-use APIs for common tasks—OCR, object detection, sentiment analysis, entity recognition, speech transcription—without needing you to train a model. These are ideal when the task is common, time-to-value matters, and you don’t have labeled data.
Custom models are appropriate when the domain is specialized: unique product defects, industry-specific document layouts, or organization-specific intent classification. In Azure, custom work often points to Azure Machine Learning for training/deployment, and in some service families, “custom” options exist (for example, training a model for your domain). The exam often frames this choice as: “Do you need to recognize your categories or use standard ones?” If it’s your categories, custom is likely required.
Exam Tip: Watch for phrases like “no data science team,” “minimal training data,” or “quickly add AI to an app.” Those are strong signals to pick a prebuilt Azure AI service. Conversely, “must meet a specific accuracy target on domain-specific images” hints at custom training.
Generative AI use cases typically map to Azure OpenAI for chat, summarization, content generation, and code assistance. A common scenario is combining retrieval (search) with generation to answer questions grounded in your organization’s data. Even when the exam doesn’t use the term “RAG,” it may describe “use company documents to answer questions.” The correct approach is usually retrieval + generation, not generation alone.
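A minimal sketch of the retrieval + generation pattern, with hypothetical `search_documents` and `call_llm` stand-ins for a real search index (such as Azure AI Search) and a chat model endpoint (such as an Azure OpenAI deployment):

```python
# Minimal retrieval-augmented generation (RAG) sketch.
def search_documents(question: str, top: int = 3) -> list[str]:
    # In a real system this queries an index of your organization's content.
    return ["Doc snippet 1 about the topic", "Doc snippet 2 about the topic"]

def call_llm(prompt: str) -> str:
    # In a real system this calls a generative model endpoint.
    return "(grounded answer generated here)"

def answer_with_rag(question: str) -> str:
    context = "\n".join(search_documents(question))    # 1) retrieve
    prompt = (
        "Answer using ONLY the context below. If the context is "
        f"insufficient, say so.\n\nContext:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)                            # 2) generate

print(answer_with_rag("What is our parental leave policy?"))
```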
Common trap: choosing Azure Machine Learning for everything. AML is for building and managing models; it’s not the simplest path for standard OCR or sentiment analysis. The exam rewards selecting the simplest service that meets requirements.
Responsible AI is explicitly tested in AI-900. You must be able to describe the principles and connect them to practical actions. The commonly tested principles are: fairness (avoid bias across groups), reliability and safety (perform consistently and avoid harmful outcomes), privacy and security (protect data and control access), inclusiveness (design so the system works for all users), transparency (communicate system limitations and how it’s used), and accountability (humans are responsible for outcomes; governance and oversight exist).
In Azure contexts, governance basics include controlling access to AI resources (identity and role-based access), protecting data (encryption, network controls), monitoring and logging, and establishing review processes for model changes. For generative AI, Responsible AI often includes content filtering, prompt/response logging policies, grounding responses in approved data sources, and clear user disclosure that AI-generated output may be incorrect.
Exam Tip: If a scenario mentions protected attributes (age, gender, ethnicity) or different outcomes for different groups, the principle being tested is usually fairness. If it mentions “explain why the model made a decision,” think transparency and interpretability. If it mentions “sensitive customer data,” think privacy and security.
Common trap: treating Responsible AI as only a documentation task. The exam frames it as design-and-operations: test for bias, monitor drift and failures, restrict data access, and define who approves deployments. Another trap is assuming generative AI outputs are inherently trustworthy. The safe exam posture is: outputs must be validated, risks mitigated, and human oversight applied where impact is high.
1. A manufacturing company captures images of finished products on a conveyor belt and wants to automatically detect surface defects in near real time. Which AI workload type best fits this requirement?
2. A customer support team wants to extract key phrases and detect sentiment from incoming email messages without training a model. Which approach should you choose?
3. A retail company wants to predict next week’s demand for each store using historical sales and weather data. What type of AI workload is this?
4. A bank deploys an AI system to recommend whether to approve loan applications. Auditors require that the bank can explain why an applicant was declined. Which Responsible AI principle is most directly being addressed?
5. A marketing team wants an AI solution that can create multiple versions of product descriptions based on a short list of bullet points. Which workload type is most appropriate?
This chapter maps directly to the AI-900 objective area that tests whether you can explain core machine learning (ML) principles and recognize how Azure supports ML workflows end-to-end. The exam is not trying to turn you into a data scientist; it checks that you can (1) distinguish training from inference, (2) describe the ML lifecycle, (3) interpret common evaluation metrics, and (4) identify Azure Machine Learning (Azure ML) components used to build and operationalize solutions.
Expect scenario questions: “A team trained a model and now needs to…” or “A model’s accuracy looks high but users complain…” Your job is to read for clues—task type (classification vs regression), data balance, cost of false positives/negatives, and whether the team is building/training or deploying/scoring. You’ll also see Azure ML nouns (workspace, compute, job, endpoint) and must match them to their role in the lifecycle: data → training → validation → deployment → monitoring.
Practice note for “Understand ML lifecycle: data, training, validation, deployment, monitoring”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Differentiate supervised, unsupervised, and reinforcement learning at a high level”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Interpret evaluation metrics and choose what fits the scenario”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Identify Azure tools for ML workflows (Azure Machine Learning concepts)”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Practice: exam-style questions for ML fundamentals on Azure”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
On AI-900, “training” and “inference” are foundational terms that appear in many questions. Training is the process of learning model parameters from historical data; inference (also called scoring) is using the trained model to make predictions on new data. In Azure terms, training typically happens in an experiment or job, while inference happens through a deployed endpoint (real-time) or batch scoring pipeline.
Learn the vocabulary the exam expects: features are the input variables used for prediction (for example, bedrooms, square footage, and location). A label (or target) is what you want to predict (for example, house price). A model is the learned function mapping features to a predicted label. A dataset is your curated data, often split into training/validation/test sets (covered next). A prediction is the model’s output; for classification it may be a class plus a probability score.
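The same vocabulary in code, using scikit-learn and two of the features from the example above (the data is invented):

```python
# Exam vocabulary in code: features (inputs), label (target), model
# (learned mapping), training (fit) vs. inference (predict).
from sklearn.linear_model import LinearRegression

# features: bedrooms, square footage          label: house price
X_train = [[2, 800], [3, 1200], [4, 1600], [5, 2100]]
y_train = [180_000, 260_000, 340_000, 430_000]

model = LinearRegression().fit(X_train, y_train)   # training: learn parameters

new_house = [[3, 1400]]                            # unseen data at inference time
prediction = model.predict(new_house)              # inference (scoring)
print(f"Predicted price: ${prediction[0]:,.0f}")
```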
Exam Tip: If the scenario mentions “ground truth,” “known outcomes,” or “historical labeled data,” you are in training/evaluation territory. If it mentions “new customer,” “incoming request,” “predict now,” or “REST endpoint,” you are in inference territory.
Common trap: confusing “feature engineering” with “hyperparameter tuning.” Feature engineering changes inputs (columns, transformations). Hyperparameters are settings of the algorithm (learning rate, tree depth) chosen before/during training. Another trap: assuming all ML needs labels—unsupervised learning often has no labels, but still has features.
The exam frequently tests whether you understand why we split data and what overfitting looks like. Standard practice is to split data into training (fit the model), validation (tune settings and compare candidates), and test (final unbiased estimate). Not every scenario uses all three explicitly, but the concept is consistent: evaluate on data the model did not see during training.
Overfitting means the model learns noise and performs very well on training data but poorly on new data. Underfitting means the model is too simple (or not trained enough) and performs poorly even on training data. The bias-variance intuition helps: high bias is underfitting; high variance is overfitting. In practice, you reduce overfitting via more data, regularization, simpler models, early stopping, or better validation; you reduce underfitting via a more expressive model, better features, or longer training.
Exam Tip: Watch for wording like “training accuracy is 98% but test is 70%.” That is classic overfitting. If both are low, think underfitting or poor features/data quality.
In Azure ML scenarios, you may see automated splitting or a designer/pipeline step. The key is: the evaluation must use held-out data, and the test set should remain untouched until final selection.
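A short scikit-learn sketch of the held-out-data idea, using synthetic data: an unconstrained model typically scores near-perfectly on its own training data and noticeably worse on data it never saw, which is the overfitting signature described above.

```python
# Overfitting check: compare performance on training data vs. held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)   # hold out data the model never sees

model = DecisionTreeClassifier(max_depth=None).fit(X_train, y_train)  # very flexible
print("train accuracy:", model.score(X_train, y_train))  # often near 1.0
print("test accuracy: ", model.score(X_test, y_test))    # noticeably lower -> overfit
```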
AI-900 expects you to match metrics to problem types and business goals. Start by identifying the task: classification (predict a category) vs regression (predict a number). Then choose metrics accordingly.
For classification, accuracy is the proportion of correct predictions. It is easy but often misleading when classes are imbalanced (for example, fraud is rare). Precision answers: “Of the items predicted positive, how many were truly positive?” Recall answers: “Of the truly positive items, how many did we catch?” F1 balances precision and recall (harmonic mean), useful when you need a single score and positives are important.
ROC-AUC measures how well the model separates classes across all thresholds; higher is better and is less dependent on picking one probability cutoff. It’s commonly used when the threshold may change by scenario (for example, a bank tightens fraud rules during holidays).
For regression, common metrics include MAE (mean absolute error) and RMSE (root mean squared error). RMSE penalizes large errors more strongly than MAE, so it can be preferred when big misses are especially costly.
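The following scikit-learn sketch shows why accuracy misleads on imbalanced classes: a model that always predicts “not fraud” scores 95% accuracy on this invented data yet catches zero fraud.

```python
# Accuracy vs. precision/recall on imbalanced classes.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# 100 transactions: 5 are fraud (1), 95 are legitimate (0).
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100                     # lazy model: always predicts "not fraud"

print("accuracy: ", accuracy_score(y_true, y_pred))                     # 0.95
print("precision:", precision_score(y_true, y_pred, zero_division=0))   # 0.0
print("recall:   ", recall_score(y_true, y_pred, zero_division=0))      # 0.0
print("f1:       ", f1_score(y_true, y_pred, zero_division=0))          # 0.0
```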
Exam Tip: When the prompt mentions “false negatives are costly” (missed fraud, missed disease), prioritize recall. When “false positives are costly” (blocking good transactions, unnecessary follow-up), prioritize precision. If you see “overall correctness” with balanced classes, accuracy may be acceptable.
After you know how to measure performance, the next exam skill is understanding how we pick a better model. Two major levers are algorithm choice (for example, logistic regression vs decision tree) and hyperparameters (settings that control learning, such as regularization strength, number of trees, maximum depth, or learning rate). Hyperparameters are not learned from the data in the same way as model parameters; they are selected through experimentation.
Cross-validation is a robust evaluation method where you split the training data into multiple folds, train on some folds, validate on the remaining fold, and rotate. This reduces sensitivity to one lucky/unlucky split and is especially helpful when data is limited. In exam scenarios, cross-validation is often the “more reliable evaluation” choice compared to a single train/validation split.
Exam Tip: If the scenario says “small dataset” or “results vary a lot depending on split,” cross-validation is a strong answer. If the scenario says “need faster experimentation,” a single validation split may be chosen, but you accept less stable estimates.
On Azure, these ideas show up in features like sweeps/hyperparameter tuning and automated ML (AutoML). Even if you don’t memorize the UI, know the principle: run multiple training jobs with different hyperparameters, compare with consistent metrics on held-out data, then select the best candidate.
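A minimal scikit-learn sketch of that principle: sweep a hyperparameter, score each setting with cross-validation, and keep the best. (Azure ML sweeps and AutoML automate the same loop at scale.)

```python
# Hyperparameter sweep with cross-validation: try settings, compare with a
# consistent metric on held-out folds, select the best candidate.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 8, None]},   # hyperparameters to sweep
    cv=5,                                        # 5-fold cross-validation
    scoring="f1",                                # consistent metric for comparison
)
search.fit(X, y)
print("best hyperparameters:", search.best_params_)
print("best CV score:       ", round(search.best_score_, 3))
```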
The AI-900 exam expects you to recognize the core building blocks of Azure Machine Learning and where they fit in the lifecycle. The organizing container is the workspace: it centralizes assets like data references, models, experiments/jobs history, and endpoints. If a question asks where ML resources are managed and governed, “workspace” is usually the anchor concept.
Compute refers to the resources used for training and sometimes inference: compute instances (often for development), compute clusters (scalable training), and other attached compute. The exam commonly frames this as “the team needs scalable compute to train,” which points to clusters rather than a single always-on VM.
Jobs (training runs) are executions of scripts or pipelines that produce outputs such as metrics, logs, and registered models. You may see “track experiments” or “reproduce a run”—that’s job tracking in Azure ML. Endpoints are the deployment targets for inference, commonly real-time endpoints for low-latency scoring. Some scenarios mention batch scoring; conceptually, that is still “inference” but not a synchronous REST call.
Exam Tip: If the question mentions “deploy a model so applications can call it,” think endpoint. If it mentions “run training at scale” or “accelerate training,” think compute cluster. If it mentions “organize and manage ML assets,” think workspace.
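For orientation only, here is roughly how those building blocks appear in the Azure ML Python SDK v2 (azure-ai-ml). All names (subscription, resource group, workspace, compute cluster, and the curated environment reference) are placeholders you would replace; AI-900 does not require you to write this code.

```python
# Hedged sketch of the Azure ML building blocks: workspace, compute, job.
from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

# Workspace: the organizing container for data, jobs, models, and endpoints.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",          # placeholder
    resource_group_name="my-rg",                  # placeholder
    workspace_name="my-workspace",                # placeholder
)

# Job: a training run, executed on scalable compute (a compute cluster).
job = command(
    code="./src",                                 # folder containing train.py
    command="python train.py",
    environment="azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",  # placeholder curated env
    compute="cpu-cluster",                        # placeholder compute cluster
)
returned_job = ml_client.jobs.create_or_update(job)   # submit and track the run
print(returned_job.name)
```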
MLOps is the operational layer that keeps ML reliable after deployment. The exam focuses on fundamentals: versioning, repeatable deployments, and monitoring. Once a model is deployed, you must watch not just uptime/latency, but also data drift and model drift. Data drift occurs when input feature distributions change (for example, a retailer’s shopping patterns shift seasonally). Model drift occurs when the relationship between features and labels changes (concept drift), degrading performance even if inputs look similar.
Common deployment patterns include blue/green or canary releases (send a small percentage of traffic to a new model version), allowing safe comparison before full cutover. You may also see A/B testing language: two models in parallel to compare outcomes. The exam wants you to choose these patterns when the scenario emphasizes minimizing risk while updating a model.
Exam Tip: If a scenario says “performance degraded over time,” “customer behavior changed,” or “incoming data differs from training,” the best next step is monitoring for drift and triggering retraining, not just scaling compute.
In Azure ML terms, endpoints and pipelines can be integrated with monitoring and retraining workflows. Even if the question is high-level, anchor your answer to the lifecycle: deploy → monitor → detect drift → retrain → redeploy a new version with controlled rollout.
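A simple sketch of the data-drift idea using a two-sample Kolmogorov–Smirnov test from SciPy; production monitoring (for example, in Azure ML) is richer, but the underlying comparison is the same. The distributions below are simulated.

```python
# Data-drift check: compare a feature's distribution at training time vs.
# in production. A significant difference suggests retraining may be needed.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
training_feature = rng.normal(loc=50, scale=10, size=1_000)    # e.g., order value
production_feature = rng.normal(loc=65, scale=10, size=1_000)  # shifted by a campaign

stat, p_value = ks_2samp(training_feature, production_feature)
if p_value < 0.01:
    print(f"Drift detected (p={p_value:.2e}); consider retraining.")
else:
    print("No significant drift detected.")
```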
1. A retail company trains a model in Azure Machine Learning to predict whether a customer will churn. After training, the company wants to use the model from a web app to score individual customers in real time. Which step of the ML lifecycle is the company focusing on now?
2. A team has a dataset of product images where each image is labeled with one of 10 product categories. They want to train a model to predict the category for new images. Which type of machine learning is this?
3. A hospital builds a binary classification model to flag high-risk patients. Only 1% of patients are truly high-risk. The model reports 99% accuracy, but clinicians complain it misses most high-risk cases. Which metric should the team prioritize to evaluate whether the model is identifying high-risk patients?
4. A data science team needs a centralized place in Azure to manage datasets, runs, models, and deployments for multiple ML projects. Which Azure Machine Learning resource should they use?
5. A company deployed a model to an Azure ML online endpoint. Over time, the input data distribution changes due to a new marketing campaign, and prediction quality degrades. Which ML lifecycle activity addresses this situation?
This chapter maps the AI-900 “Computer Vision workloads on Azure” domain to the decisions the exam expects you to make. At this level, you are not writing model-training code; you are identifying the workload type (classification, detection, OCR, or broader image analysis) and selecting the correct Azure service or feature. Many wrong answers on AI-900 are “almost right” because they name a vision tool that sounds plausible but solves a different task (for example, choosing object detection when the requirement is just to tag a whole image).
Think in terms of inputs (images, multi-page documents, camera streams), outputs (tags, bounding boxes, recognized text, key-value pairs), and constraints (privacy, safety, restricted capabilities). You’ll also see Responsible AI expectations: use least-privilege access, avoid unnecessary data retention, and be careful with sensitive vision scenarios. Throughout, your job on the exam is to match the requirement statement to the correct capability name and output format, not to memorize SDK methods.
Exam Tip: When two options seem similar, look for the keyword that indicates the output shape: “label” (classification), “bounding box” (detection), “text” (OCR), “key-value pairs/tables” (document intelligence), or “tags/captions” (image analysis).
Practice note for “Identify vision workload types: classification, detection, OCR, and analysis”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Choose Azure AI Vision features for image analysis and OCR scenarios”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Understand document processing basics for forms and receipts in exam context”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Review security, privacy, and Responsible AI considerations for vision”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Practice: exam-style questions for Computer vision workloads on Azure”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 tests whether you can identify the type of vision problem before choosing an Azure capability. Two of the most commonly confused workload types are image classification and object detection. Image classification answers: “What is this image?” It typically outputs one or more labels for the whole image (for example, “dog,” “beach,” “construction site”). Object detection answers: “Where are the objects, and what are they?” It outputs labels plus locations (bounding boxes) for each detected object (for example, “person” at x/y/width/height, “car” at x/y/width/height).
On the exam, requirements language is your best clue. If the scenario says “identify whether an image contains a defect,” “categorize product photos,” or “route images to a folder,” you are in classification territory. If it says “count items,” “draw boxes around,” “locate,” “track,” or “find all instances,” you need detection. Classification can still return multiple tags, which tricks candidates into choosing detection—remember, tags do not imply coordinates.
Common trap: Confusing general “image analysis” (tags/captions) with custom trained classification/detection. AI-900 often expects you to choose built-in analysis if the requirement is generic (describe an image), and custom vision or detection if the requirement is specialized (identify a specific part defect unique to a factory).
Exam Tip: If the scenario needs both identification and location, detection is required even if the primary goal seems like categorization (for example, “count people in a room” demands detection because you must find each person instance).
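The contrast is easiest to see in the output shapes. The mock results below are illustrative, not real service responses:

```python
# Classification labels the whole image; detection adds a bounding box
# per instance. Values are invented for illustration.
classification_result = {
    "labels": [{"name": "construction site", "confidence": 0.93},
               {"name": "outdoor", "confidence": 0.88}],   # whole-image tags
}

detection_result = {
    "objects": [
        {"name": "person", "confidence": 0.91,
         "box": {"x": 40, "y": 60, "w": 80, "h": 200}},    # location included
        {"name": "person", "confidence": 0.87,
         "box": {"x": 300, "y": 55, "w": 75, "h": 190}},
    ],
}

# "Count people in a room" needs detection, because each instance is located:
print("people counted:",
      sum(o["name"] == "person" for o in detection_result["objects"]))
```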
Azure AI Vision (often referred to as “Vision” or “Image Analysis” in exam materials) is used for broad image understanding without you training a custom model. The exam frequently checks whether you know what outputs you can get from image analysis and when that is “good enough” compared to a custom model. Typical outputs include: tags (keywords), captions (natural language description), detected objects (with boxes in some tiers), image metadata, and sometimes smart cropping or background/foreground insights depending on the feature set referenced in the question.
Focus on interpreting the required output. If a prompt asks for “a short sentence describing the image for accessibility,” that points to captions. If it says “return a list of topics,” that points to tags. If it says “detect a brand logo” or a very domain-specific object, the built-in tags may not be reliable—expect the exam to guide you toward a custom approach instead. However, AI-900 usually stays at the level of “choose the Azure AI Vision capability,” not deep implementation details.
Also know that “analysis” can be applied to single images, and similar concepts extend to video via separate services. Candidates sometimes pick a video analytics option when the scenario only mentions a photo upload or static images. Read the data source carefully.
Common trap: Selecting OCR when the requirement is to describe an image that happens to contain text. OCR is for extracting text as structured output; captions/tags are for describing overall content. If the requirement says “extract the serial number,” that’s OCR; if it says “generate a description of this product photo,” that’s Image Analysis.
Exam Tip: Watch for the word “extract.” “Extract text” → OCR. “Extract insights/tags” → Image Analysis.
Optical Character Recognition (OCR) is a core AI-900 vision workload: converting text in images into machine-readable text. The exam expects you to recognize OCR scenarios (signs, screenshots, labels, receipts, handwritten notes) and understand that OCR output is more than “a string”—it often includes confidence scores, bounding regions, and reading order. Azure’s OCR capability is commonly described as the “Read” feature in Azure AI Vision contexts.
Handwriting is a typical twist. If the scenario explicitly mentions handwritten forms or notes, the correct choice is still OCR/Read (not general image analysis), but you should anticipate that the question may include distractors like “speech to text” or “text analytics.” OCR is vision; speech to text is audio; text analytics assumes you already have text.
Layout considerations matter in exam phrasing. If the requirement says “preserve line breaks,” “identify paragraphs,” or “capture reading order,” it’s still OCR, but you should think of it as OCR with layout output rather than just plain text. If the requirement escalates to extracting structured fields (like invoice number, total, vendor), that typically crosses into Document Intelligence (covered in Section 4.5), even though OCR is part of the pipeline.
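For intuition, OCR/Read-style output looks roughly like the mock below (simplified; not the exact Azure response schema): lines in reading order, each with a confidence and a bounding region.

```python
# Illustrative shape of OCR output: more than a flat string.
ocr_result = {
    "lines": [
        {"text": "INVOICE 2024-0117", "confidence": 0.98,
         "bounding_polygon": [(12, 10), (240, 10), (240, 34), (12, 34)]},
        {"text": "Total: $412.50", "confidence": 0.95,
         "bounding_polygon": [(12, 300), (160, 300), (160, 322), (12, 322)]},
    ]
}

# Plain text preserves reading order; turning "Total" into a structured
# field would need Document Intelligence rather than OCR alone.
print("\n".join(line["text"] for line in ocr_result["lines"]))
```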
Common trap: Picking translation when the text is in another language. Translation requires recognized text first; OCR is the first step. AI-900 questions often test sequencing logic: capture text (OCR) then translate (Language service), not the other way around.
Exam Tip: If the user story starts with “photo of…” or “scanned image of…,” the first service is usually vision (OCR/Image Analysis). If it starts with “a document (PDF) containing fields,” expect Document Intelligence.
Face-related scenarios are high-risk and therefore heavily constrained. AI-900 may test your awareness that certain face capabilities are restricted and that you must consider Responsible AI, privacy, and compliance. On the exam, the safest approach is to choose face-related capabilities only when the requirement clearly asks for them and to avoid suggesting identity or emotion inference unless explicitly supported and permitted in the scenario context.
From an ethics and governance standpoint, you should be ready to explain (in selection logic) why you would minimize data collection, use consent, and avoid storing images longer than necessary. Azure guidance emphasizes secure access (for example, managed identities where applicable), encryption at rest and in transit, and access control around sensitive biometric data.
The exam also likes “what should you do” style questions; answer them with Responsible AI principles such as fairness, transparency, reliability and safety, privacy and security, inclusiveness, and accountability. For sensitive vision, these principles translate into concrete practices: obtain consent, disclose usage, audit outcomes, and restrict access to outputs.
Common trap: Treating face detection as equivalent to face identification. Detecting a face (finding the presence/location) is different from verifying or identifying a person. If the scenario asks “blur faces for privacy,” that’s detection/location, not identity. If it asks “unlock a device for this user,” that implies verification/identity and triggers stricter scrutiny and constraints.
Exam Tip: When you see “biometrics,” “identify a person,” “surveillance,” or “public safety,” expect the question to be testing governance and constraints at least as much as the technical feature choice.
Document processing is where many candidates overuse OCR. AI-900 separates “read text from an image” (OCR) from “extract structured information from documents” (Document Intelligence). If the scenario mentions invoices, receipts, purchase orders, tax forms, or “extract fields,” the exam usually expects Document Intelligence because it returns structured outputs like key-value pairs and tables, not just lines of text.
Document Intelligence is designed for semi-structured documents where the meaning of text depends on its position (for example, “Total,” “Date,” “Vendor,” line-item tables). Under the hood, OCR is involved, but your service selection should match the business requirement: “I need the total and the tax from a receipt” is not merely OCR; it’s field extraction.
Also note the input formats: multi-page PDFs and scanned documents are common. If the question emphasizes multi-page extraction, table detection, or form field mapping, that’s another strong indicator for Document Intelligence.
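As a contrast with raw OCR, here is a minimal sketch of field extraction with a prebuilt invoice model, assuming the azure-ai-formrecognizer Python package; the endpoint, key, and file name are placeholders. The point to notice is the output shape: named fields with confidences, not lines of text.

```python
# Minimal sketch: structured field extraction from an invoice.
# Assumes the azure-ai-formrecognizer package and its prebuilt invoice model.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = DocumentAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com",  # placeholder
    credential=AzureKeyCredential("<your-key>"),                     # placeholder
)

with open("invoice.pdf", "rb") as f:  # multi-page PDFs are a common input
    poller = client.begin_analyze_document("prebuilt-invoice", f)
result = poller.result()

# Key-value output: the business fields, not just recognized text.
for doc in result.documents:
    for name in ("VendorName", "InvoiceId", "InvoiceDate", "InvoiceTotal"):
        field = doc.fields.get(name)
        if field:
            print(name, "=", field.content, "| confidence:", field.confidence)
```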
Common trap: Selecting a language/NLP service because the output is “text.” NLP services analyze text sentiment/entities, but they do not extract text from images or PDFs. The pipeline is: Document Intelligence/OCR first, then NLP if needed.
Exam Tip: Keywords like “key-value,” “fields,” “invoice number,” “line items,” and “tables” should immediately move you to Document Intelligence, even if the document is “an image of a form.”
This section trains the “service matching” reflex the AI-900 exam rewards. The exam is not asking you to architect a perfect system; it’s asking you to pick the most direct capability that satisfies the requirement. Start by underlining (mentally) the noun and verb in the requirement: “classify,” “detect,” “read,” “extract fields,” “describe,” “blur,” “count,” “identify.” Then match to the output type.
Use a quick decision path: (1) Is the input an image/document? (2) Do we need text? If yes, decide between OCR (raw text) vs Document Intelligence (structured fields). (3) If not primarily text, do we need a label for the whole image (classification/tags) or locations (detection)? (4) If it’s sensitive (faces/biometrics), add privacy and Responsible AI constraints to your choice.
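The decision path reads naturally as code. The sketch below is a purely hypothetical study aid (the keyword lists are illustrative, and nothing here is an Azure API), but writing the logic out this way is good rehearsal for the exam’s matching questions.

```python
# Hypothetical study aid: the vision decision path as plain Python.
# Keyword lists are illustrative, not exhaustive; this is not an Azure API.
def pick_vision_capability(requirement: str) -> str:
    req = requirement.lower()
    # Step 2: text needed? Structured fields -> Document Intelligence; raw text -> OCR.
    if any(k in req for k in ("invoice", "fields", "key-value", "line items", "tables")):
        return "Document Intelligence (structured field extraction)"
    if any(k in req for k in ("read text", "extract text", "serial number", "handwritten")):
        return "OCR / Read (raw text with layout)"
    # Step 4: sensitive biometrics triggers Responsible AI constraints.
    if any(k in req for k in ("face", "identify a person", "biometric")):
        return "Face capability plus privacy/Responsible AI constraints"
    # Step 3: whole-image label vs. locations.
    if any(k in req for k in ("locate", "bounding box", "count", "coordinates")):
        return "Object detection (locations in the image)"
    if any(k in req for k in ("describe", "caption", "accessibility")):
        return "Image Analysis caption (whole-image description)"
    return "Image Analysis tags/classification (whole-image label)"

print(pick_vision_capability("extract vendor name and line items from scanned invoices"))
```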
AI-900 distractors commonly include mixing modalities (speech vs vision) and mixing steps (translation before OCR, sentiment analysis before text extraction). Another common distractor is choosing a “more advanced” service than needed. If the requirement is “generate a caption for accessibility,” selecting a custom model is usually wrong because built-in image captioning is the intended match.
Exam Tip: If two answers both “work,” choose the one that produces the exact output the requirement asks for with the fewest extra steps. AI-900 scoring aligns with best-fit capability, not the most configurable option.
1. A retail company wants a solution that reads the text from product labels in photos taken by store employees. The output should be the recognized text string(s), not objects with bounding boxes. Which vision workload type is this?
2. A wildlife organization has images from trail cameras and wants to determine whether each image contains a bear, a deer, or neither. They do not need the location of the animal in the image—only an overall label per image. Which workload type best fits?
3. A manufacturing company wants to analyze photos of an assembly line and identify each defect location by returning coordinates around the defective parts. Which output is most associated with the correct workload choice?
4. An accounts payable team scans multi-page invoices and wants to extract vendor name, invoice number, dates, and line items into structured fields. Which Azure capability best matches this requirement in the AI-900 context?
5. A healthcare provider is building an app that analyzes patient-submitted images. They want to reduce privacy risk and follow Responsible AI guidance while using Azure vision services. Which approach best aligns with these considerations?
This chapter maps directly to the AI-900 exam domain covering Natural Language Processing (NLP), Speech, and Generative AI workloads on Azure. On the exam, you are rarely asked to implement code; instead, you must recognize the workload type (classification vs extraction vs summarization vs translation), then choose the correct Azure service family (Azure AI Language, Azure AI Speech, or Azure OpenAI) and describe the basic concepts (tokens, prompts, embeddings, RAG, and safety). Expect scenario-based questions where several options “sound AI-ish,” but only one matches the workload and data constraints.
A reliable decision approach: (1) Identify the input/output type (text, audio, or chat); (2) decide whether the task is deterministic analysis (NLP) or generative output (GenAI); (3) check if the scenario needs real-time speech, document-level extraction, multilingual translation, or grounded answers from enterprise data; (4) apply responsible AI requirements. The exam tests whether you can match these criteria quickly and avoid common traps like selecting Azure OpenAI for simple sentiment analysis, or choosing Speech services for pure text translation.
As you study, practice naming the workload first: “This is sentiment classification,” “This is entity extraction,” “This is summarization,” “This is speech-to-text,” “This is a RAG chatbot.” Once the workload is labeled, the service selection becomes much easier.
Practice note for every section in this chapter (identifying NLP workload types; selecting Azure AI Language and Speech capabilities from scenarios; explaining generative AI concepts such as prompts, tokens, embeddings, RAG, and safety; choosing Azure OpenAI versus other Azure services; and the exam-style practice questions): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-900 expects you to understand core NLP building blocks and the vocabulary used in Azure services. An utterance is a piece of text (often a user’s query) that you want to interpret. In conversational design, utterances are mapped to an intent (what the user wants) and entities (the important values inside the utterance). Example: “Book a flight to Seattle tomorrow” might have intent = BookFlight, entities = Destination: Seattle, Date: tomorrow.
Many exam scenarios focus on common workload types: classification (assigning labels like “positive/negative” sentiment or topic), extraction (pulling entities, key phrases, or personally identifiable information (PII)), summarization (condensing long text), and translation (text or speech). Sentiment analysis is a form of classification, typically returning polarity (positive/neutral/negative) and sometimes confidence scores. Key phrase extraction finds the most relevant terms in a document, which can feed search, tagging, or routing workflows.
Exam Tip: If the desired output is structured labels or spans of text from the original input (entities, key phrases), think “analysis/extraction,” not “generative.” This usually points to Azure AI Language rather than Azure OpenAI.
Common trap: confusing summarization with key phrase extraction. Summarization creates new sentences (abstractive) or selects key sentences (extractive). Key phrase extraction only returns phrases/terms, not a coherent summary. On the exam, read the deliverable carefully: “a short paragraph summary” implies summarization; “top keywords/tags” implies key phrases.
Azure AI Language is the go-to service for text-based NLP analysis workloads on AI-900. You’ll see scenarios asking you to detect sentiment in support tickets, extract entities from documents, classify text by topic, or summarize long articles. These are “predict/annotate text” problems where the output is typically labels, scores, extracted spans, or summaries—rather than open-ended creative generation.
Azure AI Language capabilities commonly referenced in exam objectives include sentiment analysis, key phrase extraction, named entity recognition (NER), and language detection. When a scenario mentions building an app that understands user requests across multiple intents, the exam often wants you to think in terms of orchestration: routing a user utterance to the right handling logic. In practice, orchestration can mean using an intent classifier plus entity extraction, then forwarding to downstream systems (ticketing, CRM, ordering). The exam is less about the exact API name and more about recognizing that Language handles the text understanding layer.
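A minimal sketch of these analysis calls, assuming the azure-ai-textanalytics Python package; the endpoint and key are placeholders. Notice that every output is a label, a score, or a span copied from the input, never newly generated prose.

```python
# Minimal sketch: classification and extraction with Azure AI Language.
# Assumes the azure-ai-textanalytics package; placeholders for endpoint/key.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-key>"),
)

docs = ["The checkout was fast, but delivery to Seattle took two weeks."]

# Classification-style output: a sentiment label plus confidence scores.
sentiment = client.analyze_sentiment(docs)[0]
print(sentiment.sentiment, sentiment.confidence_scores)

# Extraction-style output: phrases and entities taken from the original text.
print("Key phrases:", client.extract_key_phrases(docs)[0].key_phrases)
for entity in client.recognize_entities(docs)[0].entities:
    print("Entity:", entity.text, "|", entity.category, "|", entity.confidence_score)
```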
Exam Tip: Look for wording like “identify entities,” “classify feedback,” “detect the language,” “extract key phrases,” or “summarize a document.” Those verbs strongly map to Azure AI Language. If the prompt says “generate a new marketing email” or “write code,” that shifts you toward Azure OpenAI.
Common trap: selecting Speech services for text-only scenarios. If the input and output are text, Azure AI Language or Translator (for translation) is the better match. Another trap: picking Azure OpenAI for document extraction because it “can read anything.” The exam prefers purpose-built analysis services when the task is straightforward extraction/classification and you want predictable outputs.
Speech workloads are defined by audio input/output. AI-900 expects you to recognize three core patterns: speech-to-text (STT), text-to-speech (TTS), and speech translation. If a scenario says “transcribe call center audio,” that is STT. If it says “read this text aloud,” that is TTS. If it says “live captions in another language,” that is speech translation (audio in, translated text out) or a combination (STT → translation → TTS), depending on the deliverable.
Speech-to-text produces a text transcript from spoken audio. Typical exam clues: “captions,” “transcription,” “convert recorded meetings to text,” or “analyze voice calls.” Text-to-speech produces spoken audio from text, often used for accessibility or voice assistants. Translation scenarios require careful reading: translating text is different from translating speech. Speech translation is for real-time multilingual conversations or subtitles from live audio.
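A minimal sketch of the STT pattern, assuming the azure-cognitiveservices-speech Python package; the key, region, and audio file are placeholders.

```python
# Minimal sketch: speech-to-text from a recorded audio file.
# Assumes the azure-cognitiveservices-speech package; placeholders throughout.
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<your-key>", region="<your-region>")
audio_config = speechsdk.audio.AudioConfig(filename="call-recording.wav")  # placeholder

recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

# STT only turns audio into words; deciding what the words mean
# (intent/entities) is a Language workload, not a Speech one.
result = recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Transcript:", result.text)
```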
Exam Tip: Always anchor on the media type. If audio is involved anywhere in the pipeline, Speech is likely part of the answer. If there is no audio, Speech is usually a distractor.
Common trap: mixing up “speech recognition” (STT) with “language understanding.” Speech recognizes words; language understanding extracts meaning (intent/entities). Many solutions combine them, but the exam question usually highlights the primary need. If the scenario says “understand what the user wants,” that’s an NLP intent/entity problem (Language), even if speech is also used to capture the utterance.
Generative AI workloads differ from classic NLP because the system creates new content: answers, summaries, drafts, code, or dialog. On AI-900, you should be comfortable with the ideas of large language models (LLMs), prompts, and tokens. A prompt is the instruction plus any context you provide. Tokens are the chunks of text the model processes; longer prompts and longer outputs consume more tokens, which affects cost and limits.
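Token counts are easy to build intuition for. This sketch assumes the tiktoken package and its cl100k_base encoding (used by many recent OpenAI models); exact counts vary by model, so treat the numbers as illustrative.

```python
# Minimal sketch: counting tokens for a prompt.
# Assumes the tiktoken package; cl100k_base is one common encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
prompt = "Summarize this support ticket in two sentences."
tokens = enc.encode(prompt)

# Billing and context limits are measured in tokens, not words or characters.
print(len(prompt.split()), "words ->", len(tokens), "tokens")
```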
Prompting patterns show up indirectly in scenario questions. You may see: “Provide examples,” “Use a specific format,” “Answer in JSON,” or “Follow company tone.” These imply structured prompting and constraints. You might also see “few-shot” prompting (showing a couple of examples) to steer output style and accuracy.
Two tuning concepts often tested at a high level are temperature and top-p. Temperature controls randomness: lower values produce more deterministic, conservative responses; higher values allow more variation and creativity. Top-p (nucleus sampling) restricts token choices to a probability mass; lower top-p tends to be safer/more focused, higher top-p increases diversity. The exam doesn’t require math—just the directional effect.
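A minimal sketch of the directional effect, assuming the openai package’s AzureOpenAI client; the endpoint, key, API version, and deployment name are placeholders.

```python
# Minimal sketch: varying temperature on an Azure OpenAI chat deployment.
# Assumes the openai package (v1+); all angle-bracket values are placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-key>",
    api_version="2024-02-01",  # example API version
)

for temperature in (0.1, 1.0):  # low = more deterministic, high = more varied
    response = client.chat.completions.create(
        model="<your-deployment-name>",  # Azure uses the deployment name here
        messages=[{"role": "user", "content": "Suggest a name for a hiking app."}],
        temperature=temperature,
        top_p=1.0,  # hold top-p fixed while varying temperature
    )
    print(temperature, "->", response.choices[0].message.content)
```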
Exam Tip: If a scenario demands consistent, repeatable phrasing (compliance summaries, standard customer replies), choose lower temperature. If it demands brainstorming (names, slogans), higher temperature is reasonable.
Service selection clue: choose Azure OpenAI when the requirement is to generate natural language, reason over instructions, or produce creative/variable text. Don’t overuse GenAI: if the output must be a strict label or extracted field, classic NLP services are typically the exam’s intended solution.
Embeddings and retrieval are central to many real-world Azure OpenAI scenarios and are increasingly emphasized in the AI-900 generative AI domain. An embedding is a numeric representation of text (or other data) that captures meaning. Similar texts end up with vectors that are “close” to each other in vector space. This enables vector search: instead of keyword matching, you retrieve content that is semantically similar to a user’s question.
Retrieval-augmented generation (RAG) combines retrieval with an LLM. The workflow in plain terms: (1) user asks a question; (2) you search your own documents using embeddings to find the most relevant passages; (3) you place those passages into the prompt as grounded context; (4) the LLM generates an answer that cites or uses the retrieved content. The key value is reducing hallucinations and keeping answers aligned with your organization’s data—without necessarily fine-tuning a model.
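A minimal sketch of the retrieval step (items 1 through 3 of the workflow), assuming the openai package’s AzureOpenAI client and numpy; the passages and deployment names are placeholders.

```python
# Minimal sketch: the retrieval half of RAG with embeddings + cosine similarity.
# Assumes the openai package (v1+) and numpy; placeholders throughout.
import numpy as np
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-key>",
    api_version="2024-02-01",
)

def embed(texts):
    resp = client.embeddings.create(model="<embedding-deployment>", input=texts)
    return np.array([item.embedding for item in resp.data])

passages = [
    "Employees accrue 20 vacation days per year.",  # placeholder corpus
    "Expense reports are due by the 5th of each month.",
]
doc_vecs = embed(passages)
q_vec = embed(["How many vacation days do I get?"])[0]

# Cosine similarity: semantically similar texts have nearby vectors.
scores = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
best = passages[int(np.argmax(scores))]

# The best passage becomes grounded context in the prompt (step 3);
# the LLM then answers from it (step 4) instead of from memory alone.
print(f"Answer using only this context:\n{best}\n\nQuestion: How many vacation days do I get?")
```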
Exam Tip: If the scenario says “answer questions using our internal PDFs/knowledge base” and also says “do not make up answers,” RAG is the intended pattern. Look for choices mentioning embeddings, vector search, or “grounding data.”
Common trap: assuming you must train or fine-tune the model for company data. On AI-900, the preferred concept is often RAG because it uses retrieval rather than model retraining. Another trap is confusing embeddings with tokens: tokens are how text is chunked for processing; embeddings are meaning vectors used for similarity and retrieval.
AI-900 explicitly tests responsible AI principles in the context of generative AI. You must recognize risks (harmful content, bias, prompt injection, data leakage, overreliance) and the high-level mitigations on Azure. In Azure, responsible GenAI commonly includes content safety (filtering or moderating harmful outputs), privacy and security controls (protecting sensitive inputs/outputs), and human oversight (review processes for high-impact decisions).
Content safety means detecting and handling categories like hate, violence, sexual content, and self-harm, and applying policies (block, warn, or allow with logging). Privacy considerations include minimizing sensitive data in prompts, using access control for stored conversations, and avoiding exposing confidential documents through poorly designed retrieval. Human oversight is essential in domains like healthcare, finance, or HR: the model can assist, but a person should verify decisions, especially when outcomes affect individuals.
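A minimal sketch of the moderation step, assuming the azure-ai-contentsafety Python package; the endpoint and key are placeholders, and the block/warn/allow thresholds are policy decisions left out of the sketch.

```python
# Minimal sketch: screening text against harm categories.
# Assumes the azure-ai-contentsafety package; placeholders for endpoint/key.
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions
from azure.core.credentials import AzureKeyCredential

client = ContentSafetyClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-key>"),
)

result = client.analyze_text(AnalyzeTextOptions(text="<model output to check>"))

# Each category (hate, violence, sexual, self-harm) returns a severity;
# your policy decides whether to block, warn, or allow with logging.
for item in result.categories_analysis:
    print(item.category, "| severity:", item.severity)
```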
Exam Tip: If a scenario mentions “customer-facing chatbot,” “public website,” or “students,” assume safety controls and monitoring are required. If it mentions “PII,” “medical records,” or “confidential documents,” prioritize privacy, least-privilege access, and data handling controls.
Common trap: thinking responsible AI is optional or only a policy document. On the exam, responsible AI is a design requirement: choose answers that include moderation, auditing, and review loops. Another trap is assuming the model is a source of truth; good answers often include language like “assist,” “recommend,” or “draft,” paired with validation and human review.
1. A retail company wants to automatically label incoming customer emails as "complaint", "praise", or "billing question". The solution must return a category for each email and does not need to generate new text. Which workload type is this?
2. A company stores contracts as text and needs to extract specific fields such as organization names, dates, and monetary amounts to populate a database. Which Azure service capability best fits this requirement?
3. A support center wants near real-time transcription of phone calls and then to translate the transcribed text to English for supervisors. Which Azure service family should you use for the speech-to-text portion?
4. You are designing a chatbot that must answer questions using an internal policy manual and should avoid making up facts. Which approach best supports grounded answers from enterprise content?
5. Which statement best describes embeddings in the context of generative AI workloads on Azure?
This chapter is your conversion layer from “I understand the topics” to “I can pass AI-900 under timed conditions.” The exam rewards recognition and decision-making: matching an AI workload to the right Azure service, knowing when you’re training versus doing inference, and identifying evaluation or responsible AI concepts at a glance. Your goal here is to build repeatable habits: pacing, triage, and a review workflow that turns every missed item into a predictable point gain.
We’ll run two mock-exam blocks (to mimic the cognitive load and context switching AI-900 is known for), then do weak spot analysis and a targeted final review mapped directly to the course outcomes and the real exam objective language. You’ll finish with an exam-day checklist and a 24-hour plan that prioritizes recall, not cramming.
Exam Tip: Treat the mock exam as skills practice, not a score report. Your “pass probability” increases fastest when you improve your process: reading the stem carefully, eliminating distractors, and identifying which Azure AI capability is being described.
Practice note for every section in this chapter (the two mock exam parts, weak spot analysis, the exam day checklist, and the final 24-hour review plan): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
The AI-900 exam is designed to test breadth: you will bounce between machine learning fundamentals, computer vision, NLP, and generative AI concepts. Your mock exam must reproduce that switching cost. Set a timer, close notes, and take the exam in one sitting. If you’re practicing at home, remove “helpful” tools—no searching docs, no second screen, and no pausing the timer for interruptions. The point is to rehearse decision-making under constraints.
Use a two-pass pacing strategy. Pass 1: answer what you know quickly and flag anything that requires deeper comparison. Pass 2: return to flagged items with deliberate reasoning and elimination. You’re training your brain to avoid getting stuck early, which is the most common pacing failure for first-time test takers.
Exam Tip: Build an internal time budget: roughly 60–90 seconds per question on the first pass. If you cannot clearly identify the workload and the likely service family (Azure Machine Learning vs Azure AI Vision vs Azure AI Language/Speech vs Azure OpenAI) within that window, flag and move on.
After the mock, do not immediately review every detail. First, write a quick “error log” list: what domain felt slow, what keywords you missed, and whether your misses were concept gaps or reading mistakes. This log powers your weak spot analysis later in the chapter.
Mock Exam Part 1 should be a mixed-domain block that mirrors how AI-900 interleaves topics. The goal is not just to “know services,” but to recognize decision criteria. For example, when the scenario describes labeling, training, and evaluating a model, the exam is often probing machine learning lifecycle concepts (training vs inference, model evaluation, and responsible deployment). When it describes extracting text, tagging images, or detecting objects, you’re typically in Azure AI Vision territory. When it describes sentiment, key phrases, language detection, entity recognition, or speech-to-text, you’re in Azure AI Language/Speech.
Your review workflow matters as much as your initial answers. For each missed item, capture three elements in your notes: (1) the keyword(s) that should have triggered the right domain, (2) the incorrect assumption you made, and (3) the “decision rule” you’ll use next time. Decision rules are short: “If the prompt says ‘train a model on my data,’ think Azure Machine Learning; if it says ‘prebuilt OCR,’ think Azure AI Vision.”
Exam Tip: Watch for prompts that intentionally blend terms like “model,” “endpoint,” or “prediction.” In AI-900, “prediction” can refer to ML inference, but it can also describe prebuilt API output (vision/language). Your job is to detect whether you are expected to build/train or consume a prebuilt capability.
End Part 1 by categorizing errors into: service-selection errors, lifecycle errors (training vs inference, evaluation), and governance/responsible AI errors. This classification makes your next study block targeted rather than repetitive.
Mock Exam Part 2 should intentionally “ramp difficulty” by mixing near-neighbor services and concepts. This is where AI-900 often differentiates prepared candidates: the distractors become plausible because they are in the right general family but wrong for the workload. Expect trickier service boundaries (for example, using a general machine learning approach when a prebuilt Azure AI service is more appropriate, or mixing classical ML evaluation terms with generative AI language).
To simulate real stress, shorten your time budget slightly and keep the same two-pass strategy. The aim is to maintain clarity even when the questions feel similar. If you notice you’re answering based on familiarity (“I’ve seen this name before”), stop and force the workload-first approach: identify the input type (text, image, audio, tabular), the task (classification, regression, extraction, generation), and whether you need training/customization.
Exam Tip: A common difficulty-ramp pattern is “custom vs prebuilt.” If the scenario stresses domain-specific language, brand-specific entities, specialized image categories, or proprietary data, the exam may be nudging you toward customization (custom models or training) rather than a purely prebuilt endpoint.
After Part 2, run a “confidence calibration” check: identify questions you got right but felt unsure about. These are high risk on exam day. Put them in your weak spot list even if they were correct—AI-900 punishes hesitation through lost time.
Rationales are where you gain points quickly. For each item you review, you must be able to explain why the correct option is correct and why the others are wrong. AI-900 distractors often share a surface resemblance: they live in the same Azure ecosystem, they “use AI,” or they handle the same data type but with a different task.
Train yourself to spot keyword traps. “Training” signals building a model (often Azure Machine Learning) versus “analyze/extract/detect” which often signals prebuilt services. “Evaluate” may reference metrics (accuracy, precision/recall, confusion matrix) or validation concepts rather than deployment steps. “Real-time endpoint,” “batch scoring,” and “inference” indicate prediction time behaviors—don’t confuse these with the training pipeline.
Exam Tip: If two answers are both plausible Azure services, ask: “Does this require me to bring labeled data and train?” If yes, that leans toward Azure Machine Learning or custom model flows. If no, that leans toward Azure AI services (Vision/Language/Speech) or Azure OpenAI for generative tasks.
Another common trap is mixing responsible AI concepts. The exam may present fairness, reliability/safety, privacy/security, inclusiveness, transparency, and accountability in similar language. Anchor on definitions: fairness is about equitable treatment and bias mitigation; transparency is about understandability and explainability; privacy is data protection; reliability/safety is consistent performance and harm reduction.
Finally, practice elimination by mismatch: if the stem describes images but the option is clearly text-focused, eliminate instantly. This sounds obvious, but under time pressure candidates overthink and miss easy eliminations.
Use this section as your “final 24-hour review plan and confidence boosters” playbook, mapped to the course outcomes and the objective language Microsoft tends to test.
Describe AI workloads and key decision criteria for choosing Azure AI solutions: Be fluent in workload-to-service matching. The exam expects you to distinguish conversational AI, anomaly detection, classification/regression, vision analysis, NLP extraction, speech recognition, and generative AI. Decision criteria include: prebuilt vs custom, data type, latency needs, and governance requirements.
Explain fundamental principles of machine learning on Azure: Know training vs inference cold. Training: fit a model using data (often labeled), tune, validate. Inference: use the trained model to score new data via endpoints. Model evaluation: interpret metrics and understand that “better” depends on the business goal (precision vs recall tradeoffs). Don’t confuse data preparation steps with evaluation steps.
Identify computer vision workloads on Azure and select appropriate Azure AI Vision capabilities: Recognize common tasks: OCR/read text, image tagging, object detection, and spatial analysis language in stems. Avoid the trap of recommending custom training when the prompt clearly wants standard extraction or description.
Identify NLP workloads on Azure and select appropriate Azure AI Language and Speech capabilities: Map tasks: sentiment analysis, key phrase extraction, entity recognition, language detection, summarization, translation, speech-to-text and text-to-speech. A frequent trap is mixing “speech” with “language”—audio input typically points to Speech; plain text typically points to Language.
Describe generative AI workloads on Azure: Know core terms: prompt, completion, tokens, embeddings, grounding/retrieval augmentation, and responsible AI considerations. Be ready to choose Azure OpenAI for generation/summarization/chat-style tasks and to discuss safe, transparent use.
Exam Tip: Your confidence booster is repetition of decision rules, not rereading. Spend the last review block reciting: “data type → task → prebuilt vs custom → service family.”
Exam day is execution. Prepare your environment the night before: stable internet, quiet space, and a cleared desk if testing online. Confirm your exam appointment time zone. Have acceptable ID ready and ensure your name matches the registration details. If you’re using an online proctor, run the system check early and close background applications to avoid disqualification risk.
Use a simple time plan: start with a calm first pass, flagging uncertain items without spiraling. On the second pass, use elimination and keyword anchors. If you find yourself debating two similar options, go back to the workload definition: what input, what output, and does it require training? This prevents “Azure service name bias,” where you choose a familiar product rather than the correct capability.
Exam Tip: If you’re stuck, don’t reread the entire question repeatedly. Instead, extract the nouns (data type), verbs (task), and constraints (custom vs prebuilt, real-time vs batch, responsible AI). Then decide.
Retake strategy is part of preparedness, not pessimism. If you don’t pass, your error log becomes your study plan: reclassify misses by objective area, review only the decision rules you failed to apply, and retake quickly while recognition memory is still strong. Most candidates improve substantially by fixing process issues (reading precision, pacing, and service-family mapping) rather than “learning everything again.”
Finish with a confidence routine: review your top 10 decision rules, scan responsible AI principles, and sleep. Fatigue causes more wrong answers on AI-900 than lack of knowledge.
1. A company wants to run a timed AI-900 practice session and improve their score by focusing on recognition of Azure AI services. During review, they notice they often confuse "building a custom model" with "using a prebuilt model." Which pairing correctly matches a scenario to the most appropriate Azure service type? Scenario: Identify objects in images (no custom training required) and return bounding boxes.
2. You are doing weak-spot analysis after a mock exam. You missed several questions about when a system is performing training vs. inference. Which statement best describes inference in the context of Azure AI solutions?
3. A retail company built a binary classifier to predict whether a customer will churn. During the final review, you want to quickly validate how the model performs across different decision thresholds. Which evaluation artifact is best suited for this on AI-900?
4. During a mock exam, you see a question about Responsible AI. A bank uses an ML model to approve loans and wants to detect whether approval rates differ unfairly across demographic groups. Which Responsible AI concept does this most directly relate to?
5. You are advising a colleague on an exam-day checklist. They tend to overthink and pick complex solutions. Which approach best aligns with AI-900 exam strategy when choosing an Azure service in a scenario question?