AI Certification Exam Prep — Beginner
Objective-mapped AI-102 practice that builds exam-ready Azure AI confidence.
This course is built for learners preparing for the Microsoft AI-102 exam (Azure AI Engineer Associate). If you’re new to certification exams but have basic IT literacy, you’ll get a clear study path and a large bank of exam-style questions mapped directly to the official domains. The focus is practical exam readiness: understanding what Microsoft is testing, recognizing patterns in scenario questions, and practicing the decisions you’ll make as an Azure AI Engineer.
The AI-102 blueprint spans six major domains. This course mirrors those objectives and reinforces them through targeted practice:
Chapter 1 gets you oriented: exam registration, scoring expectations, common question formats, and a study strategy you can follow even if you’ve never taken a Microsoft exam. You’ll also run a baseline diagnostic to identify which domains need the most attention.
Chapters 2–5 follow the official exam objectives by name and group related skills the way they appear in real scenarios. You’ll review core concepts (service selection, deployment choices, security, and evaluation), then apply them using exam-style practice sets designed to reflect Microsoft’s wording and distractor patterns.
Chapter 6 finishes with a full mock exam experience split into two parts, followed by weak-spot analysis and an exam-day checklist. This structure is designed to help you transition from “I recognize the services” to “I can choose the best option under time pressure.”
AI-102 questions often require you to choose between similar options (for example, picking the correct service for OCR vs document extraction, selecting the right retrieval pattern for RAG, or applying the right identity and networking control). This course emphasizes:
If you’re ready to begin, you can Register free and start working through the chapters in order, or browse all courses to compare other Azure and AI certification prep options. By the end of this course, you’ll have a clear grasp of each AI-102 domain and the practice needed to approach the exam with a repeatable strategy.
Microsoft Certified Trainer (MCT) | Azure AI Engineer (AI-102)
Jordan Whitaker is a Microsoft Certified Trainer who specializes in Azure AI solution design and exam readiness for the Azure AI Engineer Associate track. He has coached learners through objective-based practice, scenario analysis, and hands-on Azure AI patterns aligned to Microsoft certification exams.
AI-102 is less about memorizing feature lists and more about demonstrating engineering judgment: choosing the right Azure AI capability, deploying it securely, monitoring it in production, and iterating based on evaluation signals. The exam repeatedly tests whether you can translate a scenario’s constraints (latency, privacy, language support, cost, and operational readiness) into the correct architecture and configuration choices.
This chapter orients you to the exam format and how to study efficiently. You will learn what the credential validates, how to register and prepare your environment, how scoring works, how Microsoft writes scenario-based items, and how to build a 2–4 week plan mapped to the official domains. You will also set expectations for a baseline diagnostic and a realistic target score so your practice-test time is purposeful.
Exam Tip: AI-102 questions often have more than one “technically possible” answer. The correct choice is the one that best fits the scenario’s non-functional requirements (security, monitoring, cost, governance) and aligns with Azure-first services (Azure AI Services, Azure OpenAI, Azure AI Search, Azure AI Document Intelligence) rather than DIY components unless explicitly required.
Use the rest of this chapter as your “operating manual” for the course: how to approach questions, how to plan your study blocks, and how to avoid the traps Microsoft builds into scenario wording.
Practice note for Understand the AI-102 exam format, question types, and time management: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Register for the exam and set up your test environment: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Build a 2–4 week study plan mapped to official domains: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Baseline diagnostic quiz and target score planning: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Understand the AI-102 exam format, question types, and time management: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Register for the exam and set up your test environment: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Build a 2–4 week study plan mapped to official domains: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Baseline diagnostic quiz and target score planning: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 validates the day-to-day work of an Azure AI Engineer: planning and managing an AI solution and implementing applied AI workloads using Azure services. In practice-test terms, your job is to recognize which domain is being tested and which decision the question is forcing you to make. The exam objectives commonly cluster into: (1) plan/manage resources (identity, networking, monitoring, cost), (2) generative AI (Azure OpenAI, prompt engineering, RAG, safety/evaluation), (3) agentic solutions (tool use, orchestration, state, guardrails), (4) computer vision (image analysis, OCR, document intelligence), (5) NLP (text analytics patterns), and (6) knowledge mining (Azure AI Search indexing, enrichment, chunking, retrieval).
What the exam is really validating is your ability to move from “prototype works” to “solution can be run.” That includes selecting SKUs and regions, applying RBAC and managed identities, choosing private endpoints when required, designing content filters and prompt/response handling, and knowing when to use Azure AI Search vs. custom storage/query approaches.
Exam Tip: Watch for wording that signals the exam domain. “Minimize administrative effort” often points to managed services and built-in features (managed identity, built-in monitoring). “Ensure data does not traverse the public internet” signals private endpoints/VNet integration. “Must cite sources” or “reduce hallucinations” signals RAG with retrieval grounding and evaluation.
Common trap: treating AI-102 as a pure developer exam. Microsoft will test deployment readiness (logging, quotas, throttling, key rotation, content safety) even in questions that look like simple API usage.
Register through Microsoft’s certification dashboard and schedule via the authorized delivery provider. Decide early whether you will test online proctored or at a test center, because your preparation checklist changes. For online proctoring, your “test environment” is part of your score: a stable connection, a quiet room, and a compliant workspace are required. For a test center, travel time and check-in rules become the risk to manage.
Set up your account details to match your government-issued ID exactly (name order and special characters matter). Have acceptable identification ready and unexpired. If your legal name does not match your profile, fix it before scheduling; last-minute changes can trigger delays.
Exam Tip: Do a full technical check 24–48 hours before exam day (not five minutes before). The most preventable failure mode is treating environment readiness as optional. If you plan to use a laptop, confirm power settings won’t sleep/hibernate mid-exam.
Scheduling strategy: pick a date that supports a 2–4 week plan with at least two full practice-test cycles (attempt → review → targeted drills → reattempt). Avoid scheduling immediately after major work deadlines. You want consistent daily study time and a final 48-hour window for consolidation and light review rather than cramming new services.
Common trap: booking too early without a baseline diagnostic. You should schedule after you can estimate how far you are from passing and how many objective areas need focused remediation.
Microsoft exams are scored on a scaled system. The number of questions can vary, and not every question contributes equally. You are not aiming to “get X questions right” so much as to demonstrate competency across objective areas. Passing requires meeting the scaled passing score, and your score report will show performance bands by skill domain—use this to drive your study plan.
Time management matters because AI-102 can include multi-step scenarios and case-study style items. Plan to make one pass through the exam: answer what you can confidently, mark hard items, and return with remaining time. Don’t spend disproportionate time on a single item early; that is how candidates run out of time on easier points later.
Exam Tip: When you miss practice questions, categorize the miss: (1) concept gap, (2) misread constraint, (3) Azure service confusion, (4) test-taking error (overthinking, changing correct answers). Only category (1) needs more reading; categories (2)–(4) need process fixes.
Retake policies can enforce waiting periods and attempt limits. Treat each attempt as a project with a post-mortem: identify the top two weak domains and fix them before reattempting. “More practice tests” only helps if you review deeply and extract decision rules (for example: when to prefer Azure AI Search for retrieval vs. custom vector database, or when private endpoints are required for compliance scenarios).
Target score planning: set a practice-test target above the passing bar to create margin. Your baseline diagnostic (early in week 1) gives you a starting score; your goal is to move that score reliably, not sporadically. Consistency is the signal that you understand the objectives rather than guessing.
AI-102 is dominated by scenario-based questions that simulate real engineering constraints. Microsoft typically provides a short scenario (sometimes with multiple requirements) and asks for the “best” action, service, configuration, or next step. Your task is to locate the constraint that eliminates distractors. Distractors are often plausible but violate one requirement such as network isolation, least privilege, or operational monitoring.
Develop a repeatable parsing method: first underline requirements (must/should), then identify the objective domain (planning, gen AI, agents, vision, NLP, search). Next, translate requirements into a decision rule. Example: “Must prevent data exfiltration and use corporate identity” points you to managed identity and RBAC, plus private endpoints where relevant. “Needs citations and up-to-date internal knowledge” points to a RAG pattern with Azure AI Search indexing and retrieval, with chunking strategy and grounding.
Exam Tip: Watch for “minimize cost” paired with “high availability” or “global users.” The cheapest option is rarely correct if it cannot meet SLA/latency requirements. Microsoft expects you to balance cost with reliability and monitoring, not optimize a single dimension.
Common traps include: confusing similar services (Azure AI Document Intelligence vs. OCR features inside vision), assuming client-side secrets are acceptable (they are not—use managed identity where possible), and ignoring service limits/quotas (rate limits, token limits, content filtering). Another frequent trap is choosing a tool because it “can do it,” not because it is the most operationally appropriate Azure-managed capability for the requirement.
How to identify the correct answer: eliminate options that violate a hard constraint, then choose the remaining option that best aligns with Azure best practices (secure by default, monitorable, scalable) and the exam objective wording.
Your highest-return workflow is cyclical: learn the objective, practice questions mapped to that objective, review misses to extract decision rules, then repeat with narrower focus. This course’s practice tests are most effective when you treat them as diagnostics rather than final exams. The goal is to shorten the time between seeing a scenario and recognizing the correct Azure pattern.
Build a 2–4 week plan aligned to official domains. Week 1 is orientation plus baseline diagnostic and foundational gaps (resource planning, identity, networking). Week 2 focuses on generative AI and RAG: Azure OpenAI deployments, prompt engineering patterns, evaluation, safety, and Azure AI Search retrieval grounding. Week 3 covers agents and applied AI workloads: tool use/orchestration, state handling, guardrails; plus vision, OCR, and document intelligence. Week 4 is consolidation: mixed sets, timed practice, and weak-area drills. If you only have 2 weeks, compress by prioritizing your weakest two domains from the baseline diagnostic.
Exam Tip: Review is where the score increases. For every wrong answer, write the “because” in one sentence (for example: “Private endpoint required because traffic must not traverse public internet” or “RAG needed because model must use internal documents and provide citations”). If you cannot write the because, you are guessing.
Time management practice: do some sets timed, not all. Early on, untimed practice helps you learn. Later, timed mixed sets train endurance and reduce careless mistakes. Aim for a target score buffer on practice tests and require yourself to hit it twice on different days before you consider a domain “stable.”
Common trap: re-reading Learn modules without converting them into answerable decisions. Always translate reading into “if scenario says X, choose Y” rules that match exam-style prompts.
Use Microsoft Learn as your primary reference for objective-aligned content, but avoid passive consumption. Pair each Learn topic with hands-on validation where possible: create resources, set permissions, test endpoints, and observe outputs. Hands-on work is especially valuable for areas the exam loves to probe indirectly: authentication flows (keys vs. Entra ID), network isolation (private endpoints), and operational readiness (logging, metrics, alerts).
Sandboxing options include Azure free resources, a dedicated subscription for study, and labs that let you experiment without polluting production. When practicing generative AI, include end-to-end patterns: prompt templates, system messages, content filtering expectations, and RAG retrieval with Azure AI Search (indexing, chunking, embeddings, hybrid retrieval). For vision and document intelligence, test real sample documents so you understand what the service returns and what post-processing is required.
Exam Tip: Build a “decision notebook” rather than a feature notebook. Organize notes by scenario trigger and best answer: security triggers (managed identity, RBAC, private endpoints), cost triggers (right-size SKUs, avoid over-provisioning), monitoring triggers (Application Insights, logging, alerts), and gen AI triggers (RAG vs. fine-tuning, grounding, evaluation).
Adopt a lightweight note system: one page per exam domain with: key services, common constraints, and 10–15 decision rules. Add a “mistake log” from practice tests with the exact misread phrase that fooled you. This trains you to catch Microsoft’s wording patterns. Keep your notes concise so you can review them in the final 48 hours.
Finally, treat your baseline diagnostic as a tool, not a judgment. Its purpose is to identify which objectives to study first and to set a target score plan. The combination of Learn + hands-on sandboxing + disciplined review of practice misses is the fastest path to passing AI-102.
1. You are creating a 3-week study plan for AI-102. You took a baseline diagnostic and scored 48%. Your goal is to pass efficiently with minimal rework. Which approach best aligns with how Microsoft typically designs AI-102 scenario questions?
2. You are practicing exam-style questions. You notice multiple answers would work technically, but only one is scored as correct. What should you prioritize to choose the best answer in AI-102 scenarios?
3. You have 120 minutes for the AI-102 exam and frequently get stuck on long scenarios. Your practice results show you miss easy questions near the end due to time pressure. What is the best time-management strategy during the exam?
4. A company is deciding between taking the AI-102 exam at a test center or online. They handle confidential customer data and want to reduce the risk of exam-day issues. Which preparation step is most aligned with setting up a reliable test environment?
5. You are advising a team on how to set a target score for practice tests over a 2–4 week preparation window. The team currently averages 55% on mixed-domain quizzes. What is the best way to use target score planning to improve pass likelihood?
AI-102 tests more than “can you call an API?” It expects you to plan a production-ready Azure AI workload: selecting the correct service and architecture, deploying it securely, governing it responsibly, monitoring it under load, and controlling cost. This chapter aligns to the exam’s planning and management objectives and prepares you for the domain practice set by teaching you how to reason like the test: pick the simplest service that meets requirements, prove you can secure it with least privilege and private networking, and show operational readiness with monitoring, resiliency, and cost controls.
On exam questions, the “right answer” usually matches a specific constraint in the scenario: data residency, network isolation, identity model (keys vs Entra ID), latency/throughput, or regulatory logging. A common trap is selecting a technically possible option that violates a governance rule (public endpoint exposure, hard-coded keys) or an operational requirement (no monitoring, no retry strategy). As you read the sections, practice mapping each requirement to a concrete Azure capability (resource type, configuration, or control plane feature) and eliminate options that don’t directly satisfy the constraint.
Practice note for Design solutions and choose the right Azure AI services: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Secure, govern, and manage identity for Azure AI workloads: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Deploy, monitor, and optimize reliability and cost: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: Planning & management (exam-style): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Design solutions and choose the right Azure AI services: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Secure, govern, and manage identity for Azure AI workloads: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Deploy, monitor, and optimize reliability and cost: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: Planning & management (exam-style): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Design solutions and choose the right Azure AI services: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Secure, govern, and manage identity for Azure AI workloads: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 frequently asks you to choose the “right” Azure AI service and deployment model for a scenario. Start by classifying the workload: prebuilt Azure AI Services (Vision, Language, Speech, Document Intelligence), Azure AI Search for retrieval, or Azure OpenAI for generative. Then decide whether to run in Azure (managed endpoints) or in containers (customer-managed runtime). Containers are commonly tested when the prompt mentions disconnected environments, edge locations, strict data sovereignty, or needing to run the model behind a customer-controlled network boundary. Managed cloud endpoints are typically correct for rapid time-to-value, managed scaling, and integrated monitoring.
Regions matter. The exam likes scenarios that mention “must remain in EU,” “use a specific geography,” or “minimize latency for users in APAC.” Your architecture answer should place resources in the required region and call out that not every AI capability is available in every region. If a service is unavailable in the required region, the correct design may involve choosing an alternative service or a compliant region pairing. Also consider multi-region resiliency: active/active for read-heavy endpoints, or active/passive for failover; the exam often rewards designs that reduce blast radius.
Exam Tip: When you see “on-premises,” “offline,” “air-gapped,” or “no data leaves the facility,” immediately evaluate containerized Azure AI Services or a pattern where only metadata leaves the network. If the scenario also demands managed identity, private endpoints, and Azure-native logging, a managed cloud deployment is usually expected.
Finally, recognize “architecture pattern” hints: if the question mentions retrieval, grounding, or “use enterprise documents,” it’s pushing you toward a RAG architecture using Azure AI Search plus an LLM endpoint. If it mentions “process many documents,” consider async/batch patterns with queues and event-driven compute to absorb spikes.
Provisioning is where exam questions test your familiarity with Azure resource concepts: endpoints, authentication, and role assignments. Most Azure AI Services expose an endpoint URL and require either API keys or Microsoft Entra ID (Azure AD) authentication. A classic trap: selecting “keys” when the scenario calls for enterprise governance (centralized identity, key rotation, or no secrets in code). If the prompt includes “least privilege,” “managed identity,” “no shared secrets,” or “rotate credentials,” prefer Entra ID + RBAC.
Know what must be configured after creation: selecting a pricing tier, enabling diagnostic settings, configuring allowed networks (public vs private), and assigning roles. RBAC is tested by asking who can “invoke,” “manage,” or “read” the resource. In practice, management-plane actions (create/update) require Contributor-like roles, while data-plane invocation is often controlled through specific “Cognitive Services User”/service-specific roles when Entra ID auth is enabled. If keys are used, anyone with the key effectively has invocation access—another reason keys are considered weaker for governance.
Exam Tip: If a question says “application runs in Azure” and “must not store secrets,” the best pattern is: system-assigned managed identity on the compute + RBAC assignment to the AI resource + Entra ID authentication. Eliminate answers that propose embedding keys in app settings without Key Vault.
Also watch for “multiple environments” (dev/test/prod). The exam expects separate resources (or at least separate keys/quotas) and consistent IaC deployment. When asked how to prepare for deployment readiness, describe repeatable provisioning via ARM/Bicep/Terraform and configuration through parameterization rather than manual portal changes.
Networking is a high-yield area because many wrong answers ignore isolation requirements. If the scenario says “no public internet access,” “internal-only,” or “exfiltration risk,” the correct solution usually includes private endpoints (Private Link) for the Azure AI resource plus corresponding DNS configuration (private DNS zones) so clients resolve the service endpoint to a private IP. A common trap is confusing IP firewall rules with true private connectivity; firewall allowlists still expose a public endpoint, while private endpoints eliminate the need for public exposure.
Managed identity appears frequently alongside private networking: your compute (App Service, Functions, AKS, VM) uses a system-assigned or user-assigned managed identity to authenticate to the AI resource via Entra ID. This avoids secrets and supports rotation-free credentials. If the question includes “rotate keys,” “remove hard-coded secrets,” or “audit access by user/app,” managed identity + RBAC is the cleanest path.
Exam Tip: When you see “must be accessible only from VNet,” look for an answer that combines: disable public network access (where supported) + private endpoint + private DNS + RBAC/managed identity. If any element is missing, it may be an incomplete design.
Finally, consider egress controls. In agentic or RAG scenarios, the solution might call external tools or retrieve from storage/search. The secure design restricts outbound calls (e.g., through controlled endpoints) and enforces that all dependencies (Storage, Search, OpenAI, Key Vault) are reachable privately when required. The exam often rewards answers that secure the whole chain, not just the model endpoint.
Responsible AI in AI-102 is usually tested through governance basics: content safety, prompt/response logging considerations, human oversight, and compliance alignment. Scenarios may mention “avoid harmful content,” “prevent leaking PII,” or “audit generated outputs.” Your design response should include safety controls (for generative AI, using Azure AI Content Safety or built-in safety configurations where applicable) and a policy for what you log and how you protect it.
A frequent exam trap is proposing to log “everything” (full prompts, full outputs, full documents) without addressing sensitive data handling. The better answer acknowledges that logs are data too: apply data minimization, redact or hash identifiers, use secure storage with RBAC, and set retention policies. If the prompt mentions regulations (HIPAA, GDPR, financial), the expected approach includes explicit consent/notice considerations, data residency, and access auditing rather than ad-hoc application logging.
Exam Tip: If the scenario demands “traceability” or “investigation of incidents,” choose Azure-native logging (Azure Monitor/Log Analytics) with diagnostic settings enabled on each resource. Avoid answers that rely only on local application logs without centralized retention and access controls.
Also be ready to identify where governance lives: policy and controls at the platform level (Azure Policy, RBAC, private networking), and safety controls at the application/AI level (input validation, content filtering, grounding constraints). The exam typically wants layered guardrails, not a single control.
Operational excellence is core to “plan and manage.” AI-102 scenarios often describe intermittent failures, latency spikes, or “requests are being rejected.” Translate those symptoms into: monitoring, throttling/quotas, and resiliency patterns. First, enable diagnostic settings on AI resources to send logs and metrics to Log Analytics, Event Hub, or Storage. Then define alerts on key signals: error rates (4xx/5xx), latency, saturation (throttling), and dependency failures (Search/Storage).
Throttling is a common exam topic: 429 responses indicate rate limits or capacity constraints. The best answer usually includes client-side retry with exponential backoff and jitter, request pacing, and batching where appropriate. A trap is choosing “increase timeout” as a fix for throttling—it doesn’t address rate limiting. Another trap is ignoring idempotency: retries must be safe, especially for operations that create resources or trigger side effects.
Exam Tip: When asked to “improve reliability,” prefer patterns that reduce single points of failure: queue-based load leveling, circuit breakers, and graceful degradation (e.g., fall back to a simpler model or cached result) rather than only “scale up.”
The exam may also include “deployment caused outage.” Correct responses include canary or blue/green deployments, feature flags for risky changes (like prompt template updates), and rollback plans. Monitoring must validate both infrastructure health and model/app behavior (latency, output quality signals when available).
Cost and capacity questions on AI-102 reward candidates who can name concrete levers: pricing tiers, throughput/capacity units, caching, batching, and controlling token usage for LLMs. If the scenario says “cost is too high,” avoid vague answers like “optimize code.” Instead, identify measurable drivers: request volume, payload size, model size, and retry storms. For generative workloads, token limits, prompt length, and response length are the first places to look; for OCR/document processing, page counts and batch frequency dominate.
Capacity planning on the exam usually appears as “prepare for peak traffic” or “avoid 429s.” The correct solution combines: estimating peak RPS, sizing quotas/capacity, pre-provisioning where applicable, and implementing load leveling with queues. A trap is relying solely on autoscale when the downstream AI service enforces fixed quotas—autoscale can amplify throttling if you scale callers faster than the service capacity.
Exam Tip: If you see “budget alert” or “chargeback,” the answer should mention Azure Cost Management budgets/alerts and resource tagging. If you see “predictable monthly spend,” look for reserved capacity/commitment options where applicable and tight quota controls.
For deployment readiness, think in checklists because the exam often describes a near-production state. A strong final design includes: IaC templates, environment separation, secret management (Key Vault or managed identity), network lockdown, monitoring/alerting, documented SLOs, and a tested failover/retry strategy. This mindset directly supports the chapter’s domain practice set: you’re not just building an AI feature—you’re operating an Azure AI service reliably and securely at scale.
1. You are designing an Azure AI solution for a healthcare provider. The provider requires that all inference traffic to Azure AI services stays on a private network and that no public endpoints are exposed. The solution will call an Azure AI Services resource (multi-service). What should you configure to meet the requirement with the least operational overhead?
2. A company is building multiple internal applications that call Azure OpenAI and other Azure AI services. Security policy prohibits storing long-lived secrets in application configuration and requires centralized identity and least-privilege access. Which authentication approach should you implement?
3. You operate a production AI workload that calls Azure AI services. During traffic spikes, requests occasionally receive HTTP 429 (Too Many Requests). The business requires high reliability and minimal user-visible errors. What should you implement first?
4. A team must meet a compliance requirement to retain audit logs of all access and configuration changes to Azure AI resources for at least one year and make them searchable for investigations. Which approach best meets the requirement?
5. You need to choose Azure AI services for a solution that: (1) extracts printed text from scanned invoices, (2) identifies key-value pairs like invoice number and total, and (3) minimizes custom model training time. Which service should you use?
This chapter maps directly to the AI-102 objective area that expects you to implement generative AI solutions—not just describe them. On the exam, you’ll be asked to choose configurations, SDK calls, architectural patterns, and safety controls that are correct for a given scenario. Your job is to translate business requirements (accuracy, latency, cost, and safety) into the right Azure OpenAI and Azure AI Search design.
The lessons in this chapter are intentionally practical: you’ll build Azure OpenAI chat/completions solutions aligned to objectives, apply prompt engineering and output control for exam scenarios, implement Retrieval Augmented Generation (RAG) with Azure AI Search plus grounding techniques, and then consolidate with a domain practice set mindset. As you read, focus on how the exam phrases requirements (for example, “must cite sources,” “must not store prompts,” “minimize token usage,” or “must prevent harmful content”). Those words are signals that point to specific features and settings.
Exam Tip: When a question provides both functional requirements (e.g., “answer from internal manuals only”) and non-functional requirements (e.g., “cost must be minimized”), the correct option typically addresses the functional requirement first (grounding/RAG) and then optimizes tokens/caching/model choice second. Don’t pick a cheaper model if it violates grounding or safety requirements.
Practice note for Build Azure OpenAI chat/completions solutions aligned to objectives: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Apply prompt engineering and output control for exam scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement RAG with Azure AI Search and grounding techniques: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: Generative AI (exam-style): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Build Azure OpenAI chat/completions solutions aligned to objectives: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Apply prompt engineering and output control for exam scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement RAG with Azure AI Search and grounding techniques: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: Generative AI (exam-style): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Build Azure OpenAI chat/completions solutions aligned to objectives: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Apply prompt engineering and output control for exam scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 commonly tests whether you understand the difference between a model and a deployment. In Azure OpenAI, you select a model family (for example, GPT-style chat models, embedding models, or image generation models), then create a deployment with a deployment name. Most SDK calls target the deployment name, not the underlying model identifier. A frequent exam trap is choosing an answer that references “call the model directly” rather than “call the deployment endpoint using the deployment name.”
Tokens drive both cost and model limits. The exam expects you to reason about prompt + completion tokens, maximum context window, and how conversation history accumulates. If a scenario says “multi-turn chat over long documents,” assume you must summarize, chunk, or use RAG rather than dumping entire documents into the prompt. Parameters such as temperature (creativity), top_p (nucleus sampling), and max_tokens (response cap) show up in scenario questions. In regulated scenarios, lower temperature and explicit output schemas reduce variance and help with repeatability.
Exam Tip: If a question mentions “reduce cost” or “avoid exceeding context limits,” the best answer often involves trimming chat history, using summaries, or moving knowledge into Azure AI Search + RAG rather than increasing max tokens.
In your “build chat/completions solutions” lesson, map the requirement to the right API style: use chat for multi-turn, role-based prompting; use completions (where applicable) for simpler single-turn generation. The exam may also check whether you can identify when embeddings are required (semantic search, similarity, RAG) versus when generation alone is sufficient (formatting, rewriting, summarization of short input).
Prompt engineering is tested in AI-102 as an implementation skill: selecting the correct role messages (system/developer/user), applying constraints, and producing outputs that downstream code can reliably parse. System messages are your “policy layer” inside the prompt: they define persona, boundaries, and output rules. A common trap is placing critical rules in the user message; the exam typically expects durable instructions (safety, citation format, JSON schema) to live in the system message so they persist across turns.
Use prompt patterns that align with scenario requirements:
Structured outputs are a recurring exam theme because they reduce post-processing errors. When a scenario says “must integrate with a workflow,” “must return fields,” or “must be machine-readable,” the correct solution usually includes JSON output constraints (and often schema-like guidance). Your “output control” lesson should emphasize: specify keys, allowed values, and fallback behavior (e.g., nulls) to avoid brittle parsing. Also restrict verbosity and include explicit instructions like “Do not include additional keys” when strict parsing is required.
Exam Tip: If the options include “use regex to parse free-form text” versus “constrain the model to JSON,” the exam almost always favors structured output prompting (and/or native structured output features when presented) because it is more reliable and test-aligned.
Finally, watch for prompt injection scenarios. If the question describes user-supplied content attempting to override instructions (“ignore previous directions”), the best answer combines role separation (system message rules), delimiter-based context isolation, and RAG grounding/citations rather than “just increase temperature” or “add more examples.”
AI-102 assesses whether you can design safety into a generative AI solution rather than treating it as an afterthought. Azure OpenAI content filtering and safety features are typically evaluated by scenario: public-facing chatbot, HR assistant, healthcare FAQ, or internal tool. The exam expects you to apply a policy-driven design: define what is allowed, detect and handle risky inputs/outputs, and ensure auditing/monitoring requirements are met.
Key implementation concepts include: applying content filters for both prompts and completions, choosing appropriate severity thresholds, and defining fallbacks (refusal messages, escalation to a human, or redirect to safe content). If the scenario says “must prevent hateful content,” “must block self-harm instructions,” or “must comply with responsible AI,” the correct answer typically includes explicit content filtering plus prompt-level guardrails (system message constraints) and logging for investigation (while respecting privacy requirements).
Exam Tip: When an option says “just add a disclaimer,” treat it as insufficient. The exam generally expects technical controls (filters, refusal behavior, constrained prompts) instead of purely informational mitigation.
Another frequent trap: confusing safety with privacy. Safety controls manage harmful content; privacy controls cover data handling (PII, retention, access). If the scenario emphasizes “do not store prompts” or “restrict access to keys,” that’s security/governance. If it emphasizes “avoid violent content,” that’s safety. Many correct answers require both, but the exam will test whether you can pick the feature that matches the stated requirement.
In practice, safety should be layered: prompt rules (system message), content filtering, and retrieval constraints (RAG limited to approved sources). That layered approach is often the “most correct” answer in multi-select or best-practice questions.
RAG is one of the highest-yield topics in the “Implement generative AI solutions” objective area. The exam doesn’t just ask what RAG is; it tests whether you can implement the pattern with Azure AI Search and Azure OpenAI: create embeddings, index chunks, retrieve relevant passages, and ground the final response with citations. Your “Implement RAG with Azure AI Search and grounding techniques” lesson should translate into a crisp architecture: ingest → chunk → embed → index → retrieve → generate.
Chunking is frequently tested because it impacts relevance, cost, and citation quality. If chunks are too large, you waste tokens and dilute relevance; too small, you lose context. Look for options that mention overlap (sliding windows) and metadata (document id, page, section) because citations require traceability. Embeddings are then stored in a vector field in Azure AI Search, enabling vector search and often hybrid search (keyword + vector) for better recall.
Exam Tip: If a question requires “answer must include sources” or “must only use internal documents,” RAG with citations is the intended solution—not fine-tuning. Fine-tuning changes style/behavior; it does not reliably inject up-to-date proprietary facts.
Grounding techniques on the exam include: (1) strict instructions to use only retrieved context, (2) “insufficient information” fallback, and (3) returning citations tied to chunk metadata. A common trap is retrieving content but failing to pass it to the model in a structured way (no delimiters, no source ids), which makes citations impossible and increases hallucination risk. Another trap: using the entire document as context instead of retrieved top-k chunks; that usually violates token/cost constraints and reduces answer quality.
The exam increasingly emphasizes that you must evaluate generative AI solutions, not just build them. Expect scenario questions that mention “model updates caused different answers,” “latency is too high,” “cost exceeded budget,” or “users report inconsistent outputs.” The correct responses usually combine prompt/model tuning with measurable evaluation.
Quality evaluation includes groundedness (did the answer use sources?), correctness, completeness, and safety. For RAG, evaluate retrieval metrics (are the right chunks being returned?) separately from generation (is the model using them well?). Latency and cost are typically driven by token usage (prompt size + response), retrieval time, and model selection. Practical tuning levers include reducing context size (better chunking, lower top-k), summarizing conversation history, caching frequent retrieval results, and constraining output length.
Exam Tip: If an option proposes “manual spot checks only,” it is usually incomplete. The exam favors repeatable evaluation: automated regression sets, tracked metrics, and clear acceptance thresholds (for example, citation presence, refusal correctness, or response time SLOs).
Regression testing is especially relevant for prompt changes and retrieval/index updates. Even small changes (chunk size, synonym maps, ranking profiles, or system message edits) can shift outputs. For exam scenarios about “recent changes broke behavior,” choose answers that add versioning, A/B testing, and automated evaluation runs rather than ad-hoc fixes.
This section is where many exam questions hide: not “how to build,” but “what went wrong” and “how to fix it.” Hallucinations often stem from missing or poor retrieval, ambiguous prompts, or high temperature. If a scenario says “model invents policies,” the exam expects you to add RAG grounding, require citations, lower temperature, and implement an “I don’t know” fallback when sources don’t support an answer.
Grounding failures can occur even with RAG when the wrong chunks are retrieved or when the prompt does not clearly instruct the model to use the provided context. Look for fixes like hybrid search, improved chunking/overlap, better metadata filters (department, region, date), and explicit delimiters separating sources from the user request. Another classic trap is passing multiple conflicting chunks; the model may merge them. The best exam answer typically includes ranking improvements and prompt instructions to prioritize the most recent or highest-authority source.
Rate limits and throttling appear in operational scenarios: “HTTP 429,” “too many requests,” or “spiky traffic.” Correct mitigations include request throttling, exponential backoff with retries, queue-based buffering, and caching. Simply “increase max tokens” or “use a larger model” won’t solve throughput limits and may worsen cost/latency.
Exam Tip: When you see “must be accurate and verifiable,” assume the exam wants citations and grounded answers; when you see “intermittent failures under load,” assume it wants retry/backoff and queueing—not prompt changes.
As you move into the domain practice set for Generative AI, keep a diagnostic mindset: identify whether the scenario is a generation problem, a retrieval problem, a safety problem, or a throughput problem. AI-102 questions are often easiest when you classify the failure mode first and then select the feature that directly addresses it.
1. A company is building a customer-support copilot that must answer ONLY from internal policy PDFs and must include citations for every answer. Which design best meets the requirement in Azure?
2. You are optimizing an Azure OpenAI chat solution for cost and latency. The application uses a long system prompt and repeats the same instructions in every request. What is the BEST action to reduce token usage while preserving behavior?
3. A healthcare app uses Azure OpenAI to draft responses. The solution must prevent harmful or unsafe content from being returned to users. Which control should you apply FIRST to meet the safety requirement in an exam scenario?
4. You implement RAG with Azure AI Search. Users report that answers sometimes include correct facts but also unrelated information. You want to improve grounding to retrieved documents. Which change is MOST likely to help?
5. A company needs to generate vector representations of product descriptions for semantic search and then use Azure AI Search to retrieve relevant items. Which Azure OpenAI capability is required for the vectorization step?
This chapter targets the AI-102 objective area that often feels “conceptual” on first read but becomes very testable when you map it to concrete design choices: when to use an agent (vs. a single prompt), how to connect tools safely, how to keep state, how to orchestrate multiple steps, and how to add guardrails and observability. Expect exam items that describe a business workflow (support ticket triage, claims processing, sales research, IT automation) and ask you to pick the best architecture components and controls.
An agentic solution in Azure is typically built with a chat or completion model (often Azure OpenAI) that can decide when to call tools (functions) and how to combine tool outputs into an answer. The exam is less about memorizing one SDK and more about recognizing patterns: planner/executor separation, tool/function calling, grounding with retrieval, state management, and operational controls (logging, evaluation, incident handling). You should be able to explain why a given design reduces hallucinations, improves reliability, or meets security requirements.
Exam Tip: When a scenario includes “perform actions” (create a ticket, update CRM, execute a query, send an email), assume you need tool use with strong validation/permissions. When it includes “use only company-approved content,” assume grounding (RAG) plus citations and refusal behavior.
This chapter also sets you up for the domain practice set by teaching how to identify correct answers: look for (1) a clear tool boundary, (2) a grounded knowledge source, (3) state scope (per-turn vs. durable), (4) orchestration strategy (single agent vs. routing vs. multi-agent), (5) guardrails, and (6) observability and failure handling.
Practice note for Understand agentic architectures and tool use patterns: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Design orchestration, memory/state, and grounding strategies: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Add guardrails, observability, and failure handling for agents: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: Agentic solutions (exam-style): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Understand agentic architectures and tool use patterns: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Design orchestration, memory/state, and grounding strategies: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Add guardrails, observability, and failure handling for agents: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: Agentic solutions (exam-style): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Understand agentic architectures and tool use patterns: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 expects you to distinguish “chatbot” from “agent.” A chatbot can answer questions; an agent can decide to take steps using tools and then report results. In exam scenarios, the trigger words are “book,” “create,” “update,” “search internal systems,” “run a query,” or “resolve the issue end-to-end.” That implies a planner/executor mindset: the model (planner) chooses actions; your application (executor) performs them with deterministic code and returns structured results.
Function calling is the testable mechanism that connects model reasoning to tool execution. You expose a tool schema (function name, parameters, types), and the model produces a tool call payload. Your code then validates parameters, enforces authZ, invokes the tool, and returns the tool output back to the model for synthesis. Tools are the concrete capabilities (Search, SQL query, REST API call, file read/write), while skills are reusable tool bundles or workflows (e.g., “CustomerLookupSkill” that calls CRM + billing + returns normalized profile).
Exam Tip: If an answer choice says “let the model call the API directly,” treat it as a trap. The correct architecture has your service call the API (with managed identity/keys), not the model. The model emits intent (function call), your code enforces policy.
Common traps include (1) over-trusting the model to select safe parameters, (2) embedding secrets in prompts, (3) skipping strict JSON/schema validation, and (4) conflating tool outputs with ground truth without verification. On the exam, prefer designs that keep a deterministic boundary: the model proposes; the app disposes. Also watch for “skills” vs. “tools”: skills typically imply composability and reuse; tools are individual actions. If the question asks for reducing prompt complexity across multiple tasks, a skills abstraction is usually the best fit.
Tool integration questions usually test whether you can choose the right tool type and boundary for reliability and security. Four high-frequency patterns appear in AI-102-style scenarios: (1) API tools for business actions (CRM, ServiceNow, Graph), (2) search tools for grounding (Azure AI Search retrieval), (3) code tools for deterministic computation/formatting, and (4) data actions (SQL/NoSQL queries, data lake lookups).
For API tools, design for idempotency, retries, and least privilege. If a model is allowed to “create order,” ensure the tool requires explicit fields (customerId, sku, quantity) and your service checks authorization and business rules. For search tools, return snippets plus metadata (source, URL, timestamp) so the model can cite and you can audit. For code tools, keep them constrained: use code to compute totals, validate formats, convert units, or apply policies—never to “decide” policy.
Exam Tip: When you see “must provide citations” or “must answer only from approved docs,” the best tool pattern is retrieval + response synthesis, not “fine-tune the model.” Fine-tuning is rarely the first-choice answer for enterprise grounding requirements.
Data actions introduce common exam pitfalls: SQL injection via tool parameters, overly broad query permissions, and ambiguous joins that produce misleading results. Prefer parameterized queries, stored procedures, or a data access layer that accepts only whitelisted fields. Another trap is latency: chaining too many tools can violate response time requirements. In those cases, prefer parallel calls (where safe) or precomputed indexes (e.g., Azure AI Search) rather than repeated database scans.
Finally, tool outputs should be structured. If an option suggests returning free-form text from tools, consider it weaker than JSON with explicit fields, because structured outputs enable validation, deterministic rendering, and easier incident triage.
State management is a frequent hidden requirement in agent questions. The exam tests whether you can separate short-term context (the last few turns and tool outputs) from long-term memory (user preferences, prior cases, durable facts). Short-term memory typically lives in the conversation transcript and must be truncated/summarized to fit token limits. Long-term memory should be stored externally (database, key-value store, vector store) and retrieved intentionally, not blindly appended to every prompt.
Short-term strategy: maintain a message history with role separation (system/developer/user/tool) and summarize older turns while preserving commitments, constraints, and unresolved tasks. Long-term strategy: store user profile facts (timezone, language, account IDs) and retrieve them only when needed. For grounding, store embeddings of relevant documents and retrieve per query rather than persisting entire documents into chat history.
Exam Tip: If a scenario mentions “must forget data after the session” or “data residency/privacy,” avoid designs that write raw transcripts to long-term stores. Prefer ephemeral session state or redacted logs with retention controls.
Common traps: (1) treating memory as “whatever the model remembers,” (2) leaking sensitive information by storing full prompts/responses, and (3) using vector memory as an authoritative source without verifying freshness or ownership. Correct answers usually include explicit scope: session cache for immediate context, durable store for approved fields only, and retrieval filters (userId, tenantId, document ACLs) to prevent cross-user leakage.
Conversation management also includes tool-result handling: tool outputs should be linked to a turn, persisted with correlation IDs, and optionally summarized for future steps. If the agent runs multi-step plans, you need a state machine concept (planned steps, completed steps, pending confirmations) so failures can be retried safely without duplicating actions.
Orchestration is where AI-102 questions become architecture-heavy. You should recognize when a single agent with tools is sufficient versus when you need routing or multi-agent collaboration. Routing patterns classify the user request and send it to a specialized agent (billing agent, technical agent, HR agent) with scoped tools and knowledge. Multi-agent workflows break a complex task into roles: a planner agent creates a plan, a researcher agent gathers sources, an executor agent performs actions, and a verifier agent checks constraints and outputs.
On the exam, choose multi-agent only when the scenario justifies separation of concerns: strict tool permissions, different data sources, or independent verification. Otherwise, a single orchestrator with clear tool boundaries is simpler and often preferred for cost and latency. If an option introduces multiple agents “to improve accuracy” without a governance reason, it may be a distractor.
Exam Tip: Look for requirements like “must not access HR data unless needed” or “separate systems per department.” That points to routing with scoped identities and tool sets per route, not one omniscient agent.
Workflow control is also testable: synchronous orchestration for interactive chat; asynchronous orchestration for long-running tasks (document review, batch enrichment). Correct designs include checkpoints, retries, and compensating actions (e.g., if step 3 fails after creating a ticket, update the ticket status rather than creating a duplicate). Another common trap is ignoring user confirmation. For destructive actions (delete, submit payment, send email externally), the best answer usually requires a “confirm intent” step before the executor tool runs.
Guardrails are highly exam-relevant because they connect safety, correctness, and compliance. In agentic systems, guardrails typically combine grounding controls (RAG), permissioning, jailbreak resistance, and validation. Grounding means the agent’s responses are tied to retrieved enterprise sources; permissioning ensures the agent can only call tools within the user’s rights; jailbreak resistance ensures the user cannot override policies; validation ensures tool parameters and final outputs meet constraints.
Grounding guardrails: use Azure AI Search with security trimming (ACL filters) and return citations. If a scenario demands “answer only from provided sources,” include a refusal rule when retrieval returns insufficient evidence. Permission guardrails: tools should run under managed identity or service principals, with per-tool RBAC and least privilege. Never rely on the model to “decide” whether access is allowed.
Exam Tip: If an answer choice says “add a stronger system prompt to prevent jailbreaks,” that is incomplete by itself. Prefer layered defenses: content filters/policies, tool allowlists, schema validation, and access control outside the model.
Jailbreak resistance and prompt injection: treat retrieved content as untrusted input. Strip or neutralize instructions found inside documents (“ignore previous instructions”). Prefer patterns where the model is instructed to treat retrieved text as data, not commands. Validation: enforce JSON schema for tool calls; validate ranges, formats, and business rules; sanitize user inputs before passing to data tools. For final responses, use post-processing checks (PII detection, profanity, policy compliance) when required by the scenario.
Common traps include over-grounding (dumping too many chunks, causing confusion) and under-grounding (no citations, vague claims). The “correct” option generally balances: retrieve top-k relevant chunks, include metadata, and instruct the model to cite and to say “I don’t know” when sources don’t support an answer.
Agentic solutions fail in more ways than simple Q&A: tool timeouts, partial plans, wrong parameter extraction, policy violations, and cascading errors across steps. The exam expects you to operationalize agents with tracing, auditing, evaluation, and incident response. Tracing means you can reconstruct what happened: user input, model output, tool calls, tool responses, and final answer—all linked by correlation IDs. Auditing means you can answer “who did what” and “why,” including which data sources were used and which actions were taken.
In Azure scenarios, look for integration with Application Insights/Azure Monitor (requests, dependencies, exceptions), plus structured logs of tool calls and model metadata (model version/deployment, prompt template version, retrieval query, document IDs). For regulated environments, store audit trails with retention policies and access controls; redact or tokenize sensitive fields before logging.
Exam Tip: If a requirement says “investigate hallucinations” or “prove the source of an answer,” the best design includes logging retrieved document identifiers and the exact passages returned, not just the final response.
Evaluation is both offline (test sets, regression) and online (quality metrics, human review sampling). Expect exam language like “continuous improvement” or “monitor response quality”; that points to systematic evaluation, not ad-hoc spot checks. Incident response includes runbooks: disable a risky tool, roll back a prompt version, rotate keys, throttle traffic, and notify owners. Failure handling should be explicit: timeouts with fallback, retries with backoff, circuit breakers for unstable dependencies, and safe degradation (answer with partial results plus transparency).
Common traps: logging secrets, storing full raw prompts without redaction, and missing per-step timing/latency metrics (which hides tool bottlenecks). Correct answers emphasize end-to-end traceability, least-privilege access to logs, and a feedback loop that ties incidents to evaluation and prompt/tool updates.
1. A company is building an AI agent that triages support emails and, when appropriate, creates an incident in ServiceNow. The security team requires that the model cannot directly execute actions without strict validation and least-privilege access. Which design best meets the requirement?
2. You are designing a chat-based agent for HR policy Q&A. The business requirement is: "Answer only using company-approved content and include citations." Which approach best reduces hallucinations and satisfies the requirement?
3. A sales research agent performs a multi-step workflow: (1) gather leads from an internal database, (2) summarize each lead, (3) draft outreach emails, and (4) hand off to a human for approval. You want the solution to be reliable and testable, with clear separation between deciding steps and executing tools. Which architecture pattern is most appropriate?
4. An IT automation agent helps employees reset passwords and update distribution lists. The agent must keep track of a user's identity and approvals across a conversation and across sessions (e.g., if the user returns later the same day). What is the best state management strategy?
5. A company deploys an agent that runs tool calls to execute database queries and send notification emails. In production, occasional tool timeouts cause the agent to return partial results without indicating failure. You need better reliability and operational control. Which change best aligns with guardrails, observability, and failure handling expectations?
This chapter maps to the AI-102 skills measured around implementing computer vision, natural language processing, and knowledge mining solutions. On the exam, you’re rarely asked to “define OCR” or “name an Azure service.” Instead, you’ll be given a scenario with constraints (accuracy, latency, cost, offline/online, language coverage, structured vs unstructured data, compliance) and you must pick the right service pattern and configuration. The four lessons in this chapter mirror that reality: you’ll implement computer vision scenarios (image/OCR/documents), implement NLP scenarios (analysis/extraction/conversation), build knowledge mining with Azure AI Search indexing and enrichment, and then synthesize them in a domain practice set mindset (without memorizing trivia).
As you read, keep a mental checklist for every scenario: (1) What is the input modality (image, PDF, text, audio)? (2) Is the output structured fields, free-form insights, or search experiences? (3) Does the solution require training/customization or is prebuilt sufficient? (4) What are the operational requirements (throughput, monitoring, security, region availability)? The test is designed to see whether you can match those constraints to an Azure AI service and a realistic implementation pattern.
Practice note for Implement computer vision scenarios (image, OCR, documents) for exam cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement NLP scenarios (analysis, extraction, conversation) for exam cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement knowledge mining with Azure AI Search pipelines and enrichment: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: Vision + NLP + Knowledge mining (exam-style): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement computer vision scenarios (image, OCR, documents) for exam cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement NLP scenarios (analysis, extraction, conversation) for exam cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement knowledge mining with Azure AI Search pipelines and enrichment: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: Vision + NLP + Knowledge mining (exam-style): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement computer vision scenarios (image, OCR, documents) for exam cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement NLP scenarios (analysis, extraction, conversation) for exam cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 expects you to select the right vision capability for a business scenario, not just “use Computer Vision.” The core decision is whether you need (a) general image understanding, (b) custom classification/detection, or (c) text extraction (OCR). For general image understanding—captions, tags, objects, people/scene descriptions—use Azure AI Vision image analysis features (prebuilt). For custom domain-specific categories (e.g., identifying specific product SKUs, detecting defects on a manufacturing line), you typically need a custom model pattern (custom image classification or object detection) rather than relying on generic tags.
OCR is its own branch: when the question says “extract printed/handwritten text from images” or “read serial numbers,” you’re in OCR land. OCR can be a feature within vision offerings, but exam scenarios often push you toward a document-centric workflow if the input is PDFs, scans, invoices, receipts, or multi-page forms (which leads to Document Intelligence in the next section). If the input is a photo with a short line of text, OCR in a vision service may be enough; if the input is a form with fields and tables, treat it as a document problem.
Exam Tip: Watch for wording like “need bounding boxes for objects” (object detection), “need categories you define” (custom classification), “need to locate text lines and return the text” (OCR), and “need key-value pairs/tables across many vendor layouts” (Document Intelligence).
Implementation-wise, be ready to reason about throughput and async patterns. Vision/document analysis calls can be synchronous for small inputs but often become asynchronous for large documents or batch processing. In case studies, also note security requirements: private endpoints/VNet integration can be a deciding factor when multiple services overlap in capability.
Document Intelligence is the go-to when the exam mentions invoices, receipts, IDs, contracts, multi-page PDFs, tables, and “extract fields into JSON.” The testable patterns typically include: (1) prebuilt models (invoice/receipt/ID), (2) layout extraction (text, tables, selection marks) when you don’t need semantic fields, and (3) custom extraction when documents vary or you need your own schema.
A common exam move is to give you a requirement like “documents come from many vendors with different layouts; extract total amount, invoice number, and date.” The correct direction is usually a prebuilt invoice model first (fastest time-to-value), and only then a custom model if prebuilt doesn’t meet accuracy or field coverage. Conversely, if the scenario says “proprietary form with stable layout used internally,” a custom model is often justified.
Confidence scores matter. AI-102 questions sometimes ask what to do when extraction confidence is low or when human review is required. Your design should include a threshold-based workflow: accept high-confidence fields automatically, route low-confidence fields to manual review, and store both the extracted value and confidence for auditability.
Exam Tip: If the requirement includes “tables/line items,” that strongly signals Document Intelligence rather than plain OCR. Layout extraction can find tables, but a prebuilt/custom model is usually needed to map tables to business concepts (e.g., line item description, quantity, unit price).
On the exam, you also need to recognize the difference between “extract everything” and “extract specific fields.” Layout is best for broad text/tables without semantic labeling. Prebuilt/custom models are best when your downstream system needs stable field names and types. In document-heavy pipelines, plan how outputs flow into storage and search (often feeding Azure AI Search as enriched content for retrieval).
NLP on AI-102 typically revolves around selecting the right capability: sentiment analysis, key phrase extraction, named entity recognition (NER), language detection, summarization, classification, or custom entity extraction. Many exam cases can be solved with prebuilt Text Analytics-style features when you need common insights (sentiment, entities, PII detection) and can tolerate generalized models. If the scenario demands domain-specific categories or labels (e.g., support tickets must be routed to 25 internal teams with organization-specific taxonomy), expect a custom classification pattern.
Start by identifying whether the task is analysis (derive insights from text), extraction (pull entities/fields), or generation (draft text). This chapter’s focus is analysis/extraction patterns; generative AI and RAG are covered elsewhere, but your solution may still feed extracted entities into search indexes or downstream automation.
Exam Tip: “PII,” “redaction,” “compliance,” and “sensitive data” keywords often signal a requirement for entity recognition focused on personal data, plus storage/telemetry controls. Don’t answer with “log raw text everywhere” patterns; assume audit and minimization requirements.
Operationally, expect to reason about batching (processing many documents efficiently), idempotency (retries without double-writing), and evaluation. The exam often rewards designs that include offline evaluation sets, thresholding, and monitoring drift for custom models—especially when routing or classification decisions affect business workflows.
Conversational scenarios on AI-102 test whether you can design an end-to-end language workflow, not merely “add a chatbot.” Begin by identifying the conversation type: FAQ-style retrieval, transactional bot (collect slots/fields), agent-assisted support, or voice-enabled assistant. Then map to components: intent recognition, entity extraction, dialog/state management, fallback/escalation, and evaluation.
Even when the exam mentions “conversation,” the real objective is often orchestration: how do you manage context, handle ambiguity, and integrate with external systems? Look for requirements like “confirm before submitting,” “handoff to human,” “maintain conversation state across channels,” or “log conversation transcripts safely.” These indicate you need explicit dialog management patterns and state persistence, plus guardrails around sensitive data.
Exam Tip: If the scenario includes “users ask follow-up questions referencing earlier messages,” prioritize solutions that maintain state and use a consistent conversation history strategy. If it includes “must not answer outside policy,” prioritize moderation/filters and deterministic fallback flows (e.g., provide a form, link, or escalation).
Design-wise, incorporate “happy path” and “unhappy path.” A strong answer includes: clarification questions when confidence is low, explicit escalation criteria, and safe logging practices. When conversation is paired with enterprise knowledge, it often feeds into Azure AI Search (hybrid retrieval) and uses extracted metadata (topic, product, region) to narrow results.
Knowledge mining questions often hinge on your Azure AI Search configuration choices: index schema, analyzers, filters/facets, and whether to use keyword search, vector search, or hybrid retrieval. The exam wants you to distinguish “search over documents” from “store documents.” Azure AI Search is for retrieval; raw content typically sits in Blob Storage, SQL, or Cosmos DB, while the search index stores searchable fields and metadata.
Index design is testable. Identify which fields need full-text search (e.g., content, title), which need filtering (e.g., category, security group, document type), and which need faceting/sorting. If a requirement says “users filter by department and date,” those must be filterable fields. If it says “support exact match for product codes,” choose an analyzer/field configuration that doesn’t over-tokenize identifiers.
Exam Tip: “Hybrid retrieval” is often the best answer when you need both semantic similarity and precise keyword constraints. Vector search improves relevance for conceptual queries, but keyword search is still superior for exact terms, codes, and compliance phrases. Hybrid combines both and is a frequent scenario-fit on AI-102.
Implementation-wise, be ready to recognize the ingestion pattern: data source → indexer → index. If the scenario emphasizes “near real-time updates,” you may need a push pattern (app writes to index) or frequent indexing schedules rather than a slow nightly batch. If it emphasizes “large PDFs and scans,” that signals skillsets/enrichment (next section) to make the content searchable.
Enrichment pipelines are the bridge between raw content and high-quality retrieval. AI-102 tests whether you understand how Azure AI Search skillsets can extract text, detect language, perform OCR, identify entities/key phrases, and attach metadata to documents during indexing. The key is that enrichment is not “nice to have”—it’s often the only way to make scanned PDFs searchable, to enable filters (entities as facets), or to prepare content for chunk-based retrieval.
Chunking is especially important in modern retrieval patterns. Large documents must be split into smaller passages to improve relevance and reduce noise. A typical design is: extract text → split into chunks (by page/heading/length) → store each chunk as a searchable item with parent document metadata (document ID, source URI, ACL, page number). This enables targeted retrieval and better grounding for downstream applications.
Exam Tip: If the requirement mentions “citations,” “show the paragraph that answered the question,” or “retrieve the most relevant section,” chunking plus metadata (page/section) is implied. Answers that index an entire 200-page PDF as one field are usually wrong.
From the chapter’s “domain practice set” perspective, the exam frequently combines all three domains: a user uploads images/PDFs (vision + Document Intelligence), you extract entities and PII (NLP), then index chunks with metadata (AI Search) to support secure retrieval. The correct solution is usually the one that shows a cohesive pipeline: ingestion, extraction/enrichment, indexing, and a query experience that respects access control and returns explainable results.
1. A healthcare provider needs to extract structured fields (patient name, DOB, policy number, totals) from scanned insurance claim forms. The forms vary by template, and the provider wants the model to improve as more samples are labeled. Which Azure AI approach should you use?
2. You are building a photo moderation feature for a social platform. The requirement is to detect adult content and generate a short caption and tags for each image with minimal custom training. Which service and feature set best meets the requirement?
3. A support team wants to analyze incoming customer emails to (1) identify the main issues customers mention and (2) extract product names and locations from the text. The team does not want to train a model. Which combination should you implement?
4. A company wants to create a searchable portal over thousands of PDFs and images. Users must be able to search by extracted text and also by entities (people, organizations, locations) found in the content. The company wants an automated ingestion pipeline. Which architecture best fits?
5. You are implementing a customer-service chatbot that must answer questions using company policy documents and remain grounded in that content. You need an approach that supports orchestrating prompts, retrieving relevant passages, and maintaining conversation state. Which option is most appropriate?
This chapter is your capstone: you will simulate the pressure and pacing of the real AI-102 exam, identify weak spots with evidence (not vibes), and lock in a repeatable review method. The AI-102 tests applied decision-making across Azure AI services, not memorized trivia. That means you must be able to read a scenario, extract constraints (security, latency, cost, data residency, safety), and choose the best-fit service and configuration—often where multiple answers are “technically possible” but only one aligns with the requirements.
We’ll work in four phases that map to the lessons in this chapter: two full mock exam passes (Part 1 and Part 2), a weak spot analysis process, and an exam-day checklist. Throughout, you’ll see how each major domain shows up in exam scenarios: planning and managing solutions; implementing generative AI (Azure OpenAI, RAG, safety, evaluation); implementing agentic solutions (tools, orchestration, state, grounding, guardrails); computer vision (image analysis, OCR, Document Intelligence); NLP (text analytics and conversational patterns); and knowledge mining (Azure AI Search indexing/enrichment/chunking/retrieval).
Exam Tip: Treat every practice session as “open-book now, closed-book later.” First attempt: allow brief lookups but record what you had to look up. Second attempt: no lookups. That delta is your real readiness metric.
Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Weak Spot Analysis: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Exam Day Checklist: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Weak Spot Analysis: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Exam Day Checklist: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Your goal is to simulate exam conditions closely enough that timing, fatigue, and uncertainty become familiar. AI-102 typically mixes standalone questions and case-study style sets. In your mock, use two blocks to mirror this: a first pass that emphasizes breadth and a second pass that stresses endurance and focus. Set a fixed time window and do not pause the clock for “quick research.” If you must look something up during your learning phase, mark it and move on.
Scoring should be objective and diagnostic. Track: (1) accuracy by domain (plan/manage; generative AI; agents; vision; NLP; knowledge mining), (2) error type—concept gap vs. misread vs. “two good answers,” and (3) time per item. Your improvement plan should target the highest-impact combination: frequent + costly errors.
Exam Tip: A “slow correct” today can become “fast correct” only if you practice the elimination logic. If you only practice by re-reading notes, you’ll stay slow under pressure.
Mock Exam Part 1 (Set A) should feel like a realistic cross-domain sprint. Expect scenarios that combine at least two objectives—e.g., an Azure OpenAI chatbot that must use enterprise documents (Azure AI Search) while meeting security and monitoring requirements. Your job is to identify “hard constraints” first: identity model (managed identity vs keys), network (private endpoints), compliance (data residency), and safety (content filtering, jailbreak resistance). Only then select services and patterns.
Key concept clusters that frequently appear together in Set A-style items:
Exam Tip: In scenario questions, the “best answer” is usually the one that satisfies all constraints with the fewest moving parts. Over-engineered options often hide a miss (extra service that doesn’t meet a stated requirement or adds avoidable risk).
As you work Set A, practice writing a 1–2 sentence “requirement extraction” before selecting an answer. This prevents the most common failure mode: noticing one keyword (like “OCR”) and choosing a service without checking for structured extraction, confidence scores, or downstream indexing needs.
Mock Exam Part 2 (Set B) is the endurance pass. You are testing whether you can keep quality high when you’re tired—exactly when misreads happen. This set should lean into agentic solutions, evaluation/safety, and “choose the best orchestration” items. Expect scenarios like: an agent that calls tools (Search, internal APIs), maintains state, must be grounded to enterprise data, and must refuse unsafe instructions.
Focus areas that commonly drive Set B errors:
Exam Tip: If an option promises “highest accuracy” but introduces a new dependency (extra service, custom training) without justification in the requirements, it’s often a distractor. The exam rewards alignment, not maximal capability.
Time boxing matters more in Set B. If you exceed your target time on an item, flag it and move on. The exam is designed so that some questions are inherently slower—don’t let them steal time from easier points.
After both mock parts, use a consistent review framework to convert mistakes into durable wins. The key is to analyze distractors—the wrong answers that look attractive—because that is what the real exam uses to differentiate prepared candidates. For every missed (or guessed) item, write: (1) the scenario’s top 2–3 constraints, (2) the correct approach, and (3) the specific reason each distractor fails.
Use these common distractor patterns to speed your analysis:
Exam Tip: When two options both “work,” look for the hidden requirement: auditing, least privilege, network isolation, evaluation repeatability, or cost control. The best answer usually mentions the operational control plane, not just the model.
Finally, categorize each error: knowledge gap (study), process gap (slow reading or missing constraints), or strategy gap (not eliminating). Your weak spot analysis in the next section should be built from these categories, not from raw score alone.
Use this checklist as your final review map. You are not trying to re-learn everything—only to confirm you can recognize the tested patterns and make correct choices under constraints.
Exam Tip: If you cannot explain “why this service, not the neighboring one,” you are not done. AI-102 frequently tests boundaries between similar services and overlapping capabilities.
Before exam day, do a targeted mini-drill: pick one weak domain and write a one-page “decision tree” (e.g., OCR vs Document Intelligence vs Vision; or keyword vs semantic vs vector search). This reduces hesitation and prevents second-guessing.
On exam day, strategy protects your score. Many candidates lose points not from lack of knowledge but from time mismanagement and overthinking. Your priorities: (1) bank easy points early, (2) prevent a few hard questions from consuming the entire session, and (3) avoid unforced errors in case studies.
For case-study style items, read in this order: business requirements → constraints (security/compliance, networking, latency, cost) → existing environment → only then the question. Build a quick “must/should” list. The most common trap is answering based on the architecture you would build in real life, rather than what the prompt explicitly requires.
Exam Tip: If an answer choice includes specific operational controls (private endpoint, RBAC, monitoring, citations/grounding) and the scenario mentions risk or compliance, that option often aligns with the exam’s intent.
Finally, do a last-minute mental checklist before submitting: Did you misread “must” vs “should”? Did you ignore a security requirement? Did you pick a service that can’t produce the requested output format (tables, key-value pairs, citations, structured JSON)? This 20-second scan catches the most expensive mistakes.
1. You are designing an AI-102 solution that answers employee questions using internal PDFs and SharePoint pages. Requirements: (1) responses must cite sources, (2) minimize hallucinations, (3) support hybrid search (keyword + vector), and (4) content changes daily. Which architecture best meets the requirements with the least custom code?
2. A customer-support bot uses Azure OpenAI and must comply with a policy: it must refuse self-harm instructions, redact personally identifiable information (PII) in outputs, and log safety events for review. Which approach best aligns with Azure AI-102 best practices for safety and responsible AI?
3. You are building an agentic solution that can (1) look up order status from an internal API, (2) call a refund tool only after user confirmation, and (3) keep conversational state across turns. Which design most directly meets the requirements?
4. A company receives scanned invoices (mixed layouts) and must extract vendor name, invoice number, and total with high accuracy. They also want to label low-confidence fields for human review. Which Azure service and feature is the best fit?
5. After completing a full mock exam, you notice you consistently miss questions involving service selection under constraints (data residency, latency, and cost). What is the most effective weak-spot analysis action to improve readiness for AI-102?