AI-102 Computer Vision and NLP on Azure: Domain Deep Dive

AI Certification Exam Prep — Beginner

Master AI-102 domains with guided labs, practice sets, and a full mock exam.

Level: Beginner · Tags: ai-102 · microsoft · azure · azure-ai

Course goal: pass Microsoft AI-102 with domain-by-domain mastery

This exam-prep blueprint is designed for learners preparing for the Microsoft AI-102 exam (Azure AI Engineer Associate). You’ll study the exact skills Microsoft tests—organized into a 6-chapter “book” that maps directly to the official exam domains. The focus is practical decision-making: which Azure AI service to choose, how to secure and monitor it, and how to implement end-to-end solutions for computer vision, NLP, generative AI, agentic workflows, and knowledge mining.

What the AI-102 exam covers (and how this course maps)

The AI-102 skills outline spans six domains. This course structure mirrors them so you always know what objective you’re practicing:

  • Plan and manage an Azure AI solution (architecture, security, monitoring, operations)
  • Implement generative AI solutions (Azure OpenAI, prompt patterns, RAG, governance)
  • Implement an agentic solution (tools/function calling, orchestration, reliability)
  • Implement computer vision solutions (image analysis, OCR, document processing)
  • Implement NLP solutions (classification, extraction, summarization, translation)
  • Implement knowledge mining and information extraction (Azure AI Search, indexing, enrichment)

How the 6 chapters work

Chapter 1 starts with the exam itself: registration, question styles, scoring expectations, and an actionable study strategy for beginners. You’ll set a plan that fits your schedule and learn how to avoid common pitfalls (like over-studying one domain and under-preparing for scenario questions).

Chapters 2–5 are the core learning path. Each chapter targets 1–2 official domains and follows a consistent pattern: concept clarity, implementation choices (the “why this service” logic), and exam-style practice. The internal sections are intentionally scenario-driven so you build the same mental model the exam expects—working from requirements (security, latency, compliance, cost) to the right Azure AI design.

Chapter 6 is a full mock exam experience split into two parts, followed by weak-spot analysis and a final checklist. You’ll practice pacing, review technique, and objective-by-objective remediation so your final days of prep are efficient.

Why this course helps you pass AI-102

  • Beginner-friendly on-ramp: assumes basic IT literacy and no prior certification experience.
  • Objective-aligned structure: every chapter maps to the official domain names so you can track readiness.
  • Exam-style practice: frequent scenario questions that reflect real Azure constraints and tradeoffs.
  • Full mock exam: trains endurance, time management, and review strategy—often the difference between near-pass and pass.

Get started on Edu AI

If you’re ready to begin, register for free and set your target exam date. You can also browse all courses to pair this deep dive with foundational Azure or data-prep learning paths.

What You Will Learn

  • Plan and manage an Azure AI solution: choose services, secure, monitor, and optimize deployments
  • Implement generative AI solutions: build with Azure OpenAI, prompt design, RAG patterns, and safety controls
  • Implement an agentic solution: orchestrate tools, function calling, and multi-step workflows with governance
  • Implement computer vision solutions: image analysis, OCR, video, and custom vision model lifecycle
  • Implement NLP solutions: classification, extraction, summarization, translation, and conversational patterns
  • Implement knowledge mining and information extraction: Azure AI Search indexing, enrichment, and document intelligence

Requirements

  • Basic IT literacy (web apps, APIs, files, networking basics)
  • Comfort navigating the Azure portal (helpful but not required)
  • No prior certification experience needed
  • A computer with a modern browser and reliable internet access

Chapter 1: AI-102 Exam Orientation and Study Strategy

  • Understand the AI-102 blueprint and domain weights
  • Register, schedule, and set up your exam environment
  • Scoring, question formats, and time management tactics
  • Build your 2-week and 4-week study plan with checkpoints

Chapter 2: Plan and Manage an Azure AI Solution

  • Design the right Azure AI architecture for a scenario
  • Secure identities, data, and endpoints for Azure AI workloads
  • Deploy, monitor, and troubleshoot Azure AI resources
  • Cost, performance, and reliability optimization practice set

Chapter 3: Implement Generative AI Solutions

  • Build Azure OpenAI chat and completion solutions end-to-end
  • Prompt engineering, evaluation, and prompt flow patterns
  • RAG implementation with Azure AI Search and embeddings
  • GenAI exam-style questions: safety, latency, and grounding

Chapter 4: Implement an Agentic Solution

  • Design agent architectures and tool ecosystems
  • Implement tool/function calling and multi-step planning
  • Add memory, state, and monitoring for agent reliability
  • Agentic exam-style questions: orchestration and guardrails

Chapter 5: Implement Computer Vision, NLP, and Knowledge Mining

  • Computer vision implementations: image analysis, OCR, and custom models
  • NLP implementations: classification, extraction, translation, and summarization
  • Knowledge mining pipelines with Azure AI Search and enrichment
  • Mixed-domain exam practice: CV + NLP + search scenarios

Chapter 6: Full Mock Exam and Final Review

  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist

Nadia Kline

Microsoft Certified Trainer (MCT) — Azure AI

Nadia is a Microsoft Certified Trainer who specializes in preparing learners for Azure role-based certifications, including AI-102. She designs exam-aligned learning paths that combine Azure best practices, scenario-based questions, and hands-on implementation guidance.

Chapter 1: AI-102 Exam Orientation and Study Strategy

AI-102 is not a “memorize-the-API” exam. It measures whether you can design and implement Azure AI solutions under real constraints: security, cost, latency, deployment, and responsible AI. This course is a domain deep dive focused on computer vision and NLP, but the exam still expects you to make correct service choices, wire up endpoints, monitor quality, and diagnose failures.

In this chapter you will align your preparation to the official skills outline, set up the exam logistics correctly, understand the question formats you’ll face, and build a 2-week or 4-week plan with checkpoints. The goal is to reduce surprises: exam day should feel like executing a plan, not discovering the test.

Exam Tip: Treat AI-102 as a decision-making exam. When two answers both “work,” Microsoft usually wants the one that is more secure-by-default, more maintainable, and more aligned to the scenario requirements (data residency, private networking, least privilege, evaluation/monitoring, and responsible AI).

Practice note: for each milestone in this chapter (understanding the AI-102 blueprint and domain weights; registering, scheduling, and setting up your exam environment; learning scoring, question formats, and time management tactics; and building your 2-week or 4-week study plan with checkpoints), document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 1.1: What AI-102 measures—skills outline and domains

Start by mapping your study to the AI-102 skills outline (the “blueprint”). The blueprint is your contract: anything not listed is unlikely to be tested; anything listed can appear in multiple forms (conceptual, scenario-based, or implementation detail). In practice, AI-102 clusters into six working domains that mirror how teams ship AI features in Azure.

First, plan and manage an Azure AI solution: selecting between Azure AI services, Azure OpenAI, Azure AI Search, and Document Intelligence; setting up resource groups, regions, networking, identities, keys, and monitoring. Second, implement generative AI solutions: prompt design, RAG patterns, grounding with search, safety controls, and evaluation. Third, implement agentic solutions: tool orchestration, function calling, multi-step workflows, and governance. Fourth and fifth, the domain focus of this course: computer vision (image analysis, OCR, video, custom model lifecycle) and NLP (classification, extraction, summarization, translation, conversational patterns). Sixth, knowledge mining: indexing and enrichment with Azure AI Search plus information extraction pipelines.

  • Service selection logic: The exam often asks “which service” and “which feature.” Know when to use Azure AI Vision vs. Document Intelligence vs. Azure OpenAI vs. Azure AI Search.
  • Deployment mechanics: Endpoints, keys vs. managed identity, private endpoints, quotas, and monitoring are common.
  • Lifecycle thinking: Data ingestion → model/inference → evaluation → monitoring → iteration. Many questions embed a broken link in this chain.
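The service-selection logic above can be captured as explicit decision rules. Below is a minimal study-aid sketch: the trigger keywords and service pairings are simplified illustrations of the decision rules discussed in this section, not an official Microsoft mapping.

```python
# Simplified study aid: map scenario requirement keywords to a likely Azure AI service.
# Triggers and choices are illustrative decision rules, not an official mapping.

def pick_service(requirements: set[str]) -> str:
    """Return a likely service choice for a set of scenario requirement keywords."""
    if {"structured_fields", "tables", "per_field_confidence"} & requirements:
        return "Document Intelligence"            # form/invoice field extraction
    if {"grounded_answers", "citations", "internal_docs"} & requirements:
        return "Azure AI Search + Azure OpenAI"   # RAG direction
    if {"generative", "chat", "summarize_with_instructions"} & requirements:
        return "Azure OpenAI"
    if {"ocr", "image_tags", "object_detection"} & requirements:
        return "Azure AI Vision"
    if {"classification", "key_phrases", "translation"} & requirements:
        return "Azure AI Language / Translator"
    return "Review requirements: no clear trigger matched"

print(pick_service({"internal_docs", "citations"}))      # RAG direction
print(pick_service({"tables", "per_field_confidence"}))  # structured extraction
```

Writing your own rules in this form (trigger set, service, reason) doubles as the "decision journal" practice recommended later in this chapter.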

Common trap: Over-optimizing for “newest” features. The correct choice is the one explicitly supported by the scenario and the service’s typical usage (for example, using Document Intelligence for structured document extraction rather than generic OCR when you need fields, tables, and confidence per field).

Exam Tip: When reading a scenario, underline the constraints: private data, must not store prompts, needs citations, latency SLO, multi-region, role separation. These constraints usually eliminate 2–3 answer options immediately.

Section 1.2: Registration, scheduling, accommodations, and exam policies

Registering correctly is part of exam readiness. Schedule through Microsoft’s exam provider (as linked from the certification page) and choose your delivery mode: test center or online proctoring. Online proctoring is convenient, but it has stricter environment rules; test centers reduce technical risk. Select the option that minimizes uncertainty for you.

Plan your exam environment like a deployment: remove variables. If you take the exam online, verify your ID requirements, system check, room rules, and allowed materials well in advance. Expect policies around personal items, breaks, and camera positioning. For accommodations, start the request process early; approvals can take time and you don’t want your study plan to outpace your scheduling window.

  • Identity and profile: Ensure your legal name matches your ID to avoid check-in failure.
  • Timing and time zones: Confirm your appointment time in local time; avoid scheduling right after travel.
  • Policy awareness: Understand what constitutes a policy violation (second monitor, phone within reach, talking aloud).

Common trap: Treating online proctoring like an open-book technical assessment. It’s not. Even if you know the content, a preventable check-in issue can derail the attempt.

Exam Tip: Schedule a “buffer day” before and after exam day for light review and rest. Cramming the night before increases careless errors on scenario questions, where a single missed requirement changes the correct answer.

Section 1.3: Exam format—case studies, labs, multiple choice, and hotspots

AI-102 questions are built to simulate job tasks. Expect a mix of traditional multiple choice, multiple response (“choose all that apply”), drag-and-drop ordering, and hotspot-style questions where you select a region in the UI or choose configuration elements. Some deliveries include case studies: longer scenarios with multiple questions sharing the same context. Treat case studies as requirements documents.

Labs are not always present, but you should prepare as if hands-on skills will be tested. This course emphasizes practical implementation because “I recognize the term” is not enough when the question asks what to configure, what endpoint to call, or what component to add to fix a failure. In case-study style items, you must manage information: you cannot reread everything for every question without losing time.

  • Case study tactic: Extract a mini spec: objectives, constraints, current state, and “must” statements. Then answer from that spec.
  • Hotspot tactic: Look for UI cues in the stem—identity options, networking, and resource settings are frequent targets.
  • Implementation detail: Know common patterns: RAG with Azure AI Search, OCR vs. document extraction, and how managed identity changes connection setup.
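The "how managed identity changes connection setup" detail can be sketched at the request-header level. The endpoint, key, and token values below are placeholders; in real code, the azure-identity library (for example, its DefaultAzureCredential) would acquire the Entra ID token for you when the app runs with a managed identity.

```python
# Contrast key-based auth with token-based (managed identity) auth for an
# Azure AI services endpoint. Secret/token values are placeholders.

def auth_headers(mode: str, secret: str) -> dict[str, str]:
    if mode == "key":
        # API key auth: a long-lived secret travels with every request;
        # it must be stored securely (e.g., Key Vault) and rotated.
        return {"Ocp-Apim-Subscription-Key": secret}
    if mode == "managed_identity":
        # Token auth: a short-lived Entra ID token, so no long-lived
        # secret lives in app configuration at all.
        return {"Authorization": f"Bearer {secret}"}
    raise ValueError(f"unknown auth mode: {mode}")

print(auth_headers("key", "<api-key>"))
print(auth_headers("managed_identity", "<access-token>"))
```

On the exam, "no secrets in configuration" or "least privilege" in a scenario usually points to the managed-identity path.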

Common trap: Answering based on how you would build it “in general” rather than what the scenario asks. If the prompt emphasizes governance and safety, the expected answer often includes content filtering, grounding/citations, logging, and least-privilege access—not just a model choice.

Exam Tip: For multi-select items, don’t hunt for “the best” single option. Instead, validate each choice against the scenario constraints and eliminate any option that breaks security, data residency, or maintainability—even if it could functionally work.

Section 1.4: Scoring model, passing guidance, and retake strategy

Microsoft exams use a scaled scoring model. Your final score is not simply “percent correct,” and question weighting can vary. The practical implication: don’t waste time trying to compute your score mid-exam. Focus on maximizing correct decisions on the highest-signal items—scenario-based questions that test architecture, security, and correct service usage.

Passing guidance is straightforward: aim for consistent competence across domains, not perfection in a single area. Because this course is a deep dive into computer vision and NLP, you must still protect time for the planning/management and generative/agentic portions. Many candidates fail not due to weak CV/NLP knowledge, but due to avoidable mistakes in identity, networking, monitoring, or selecting the correct Azure service for an extraction/search workflow.

  • Time management: Allocate time per question, and use “mark for review” strategically for long items.
  • Error type awareness: Most missed items come from misreading constraints, not lack of knowledge.
  • Confidence calibration: If you’re unsure between two answers, ask which one is more aligned with Azure best practices: managed identity, private endpoints, least privilege, and observability.
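The per-question time allocation is simple arithmetic worth doing once before exam day. The 100-minute, 50-question figures below are purely illustrative; check the details of your actual exam delivery.

```python
# Rough pacing calculator for a timed sitting. The 100-minute / 50-question
# figures are illustrative assumptions, not official exam parameters.

def pacing(total_minutes: int, questions: int, review_buffer_min: int = 10) -> float:
    """Minutes available per question after reserving a final review buffer."""
    return round((total_minutes - review_buffer_min) / questions, 2)

per_q = pacing(100, 50)
print(f"{per_q} min/question, with 10 min reserved for marked-for-review items")
```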

Retake strategy: If you don’t pass, treat the score report as a backlog. Rebuild your plan around the lowest domain(s) and re-run hands-on labs, especially around deployment and troubleshooting. Retakes should not be “another attempt”; they should reflect new evidence of mastery (completed labs, reviewed notes, corrected misconceptions).

Exam Tip: After each practice set, write down the reason your wrong option was wrong (e.g., “breaks least privilege,” “wrong service for structured fields,” “doesn’t support private networking”). That reasoning skill transfers directly to the exam.

Section 1.5: Building a study system—notes, labs, and spaced repetition

A good AI-102 study system produces two outcomes: (1) you can implement core patterns quickly, and (2) you can explain why a choice is correct under constraints. Build your system around three pillars: notes for decision rules, labs for muscle memory, and spaced repetition for retention.

Notes: Keep a running “decision journal” instead of copying documentation. For each topic—Vision OCR vs. Document Intelligence, RAG with Azure AI Search, content safety controls, agent tool calling—capture a small set of rules: when to use it, required inputs, common configuration, and failure modes. Your notes should read like: “If requirement is citations + enterprise docs → RAG with Search; add grounding and safety; use managed identity.”

Labs: Hands-on work is where you learn exam-critical details: endpoints, authentication methods, index schema implications, and debugging. Labs should include at least one end-to-end workflow per domain (vision pipeline, NLP extraction, RAG pipeline, and monitoring/alerts). Emphasize “break/fix” labs: intentionally misconfigure identity or networking and practice diagnosing.

  • Spaced repetition: Convert your decision rules into flashcards: triggers (“need table extraction”), correct service/feature, and the key reason.
  • Checkpoints: Weekly checkpoint = one timed practice set + one mini project review + one gap list.
  • Coverage: Rotate across domains so you don’t overfit to your favorite area.
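The flashcard rotation above can be run with a simple Leitner-style scheduler: correct answers promote a card to a longer-interval box, and misses send it back to daily review. The box intervals below are arbitrary choices for illustration.

```python
# Minimal Leitner-style spaced repetition for decision-rule flashcards.
# Box intervals are arbitrary illustrative choices.
from dataclasses import dataclass

BOX_INTERVAL_DAYS = {1: 1, 2: 3, 3: 7}  # review frequency per box

@dataclass
class Card:
    trigger: str   # e.g. "need table extraction with per-field confidence"
    answer: str    # e.g. "Document Intelligence"
    box: int = 1

def review(card: Card, correct: bool) -> int:
    """Update the card's box and return days until its next review."""
    card.box = min(card.box + 1, 3) if correct else 1
    return BOX_INTERVAL_DAYS[card.box]

card = Card("need table extraction", "Document Intelligence")
print(review(card, correct=True))   # promoted to box 2: review in 3 days
print(review(card, correct=False))  # demoted to box 1: review tomorrow
```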

Common trap: Watching content without producing artifacts. If you finish a video or reading without creating a decision rule, a lab outcome, or a flashcard, you likely won’t retain it under time pressure.

Exam Tip: Practice explaining your answer out loud in one sentence: “I choose X because it satisfies Y constraint and avoids Z risk.” If you can’t do that, you’re vulnerable to distractor options.

Section 1.6: How to use this course—chapter flow, practice sets, and mock exam

This course is organized to mirror how the exam thinks: start with foundations (service selection, security, monitoring), then go deep into computer vision and NLP implementations, and finally integrate with generative AI, agents, and knowledge mining patterns you’ll use to build complete solutions. Each chapter will connect technical features to exam objectives and to common scenario constraints.

Use a consistent loop per chapter:

  • Read for decision rules: As you read, extract “if/then” rules that help you choose services and configurations.
  • Implement the pattern: Complete at least one lab or walkthrough that forces you to configure identity, networking, and evaluation/monitoring—not just call an API once.
  • Practice sets: Do timed practice in small blocks to build speed and reduce misreads. Review mistakes by identifying the violated requirement.

You will also run a mock exam near the end of your plan. Treat it as a dress rehearsal: timed, no interruptions, and a strict review process afterward. The mock is not just to “see your score”; it is to reveal which domain decisions you still make inconsistently (for example, when to use Document Intelligence vs. Vision OCR, how to implement RAG with citations, or what governance controls belong in an agentic workflow).

  • 2-week plan checkpointing: Focus on high-frequency objectives and daily practice: one domain per day plus a mixed review block.
  • 4-week plan checkpointing: Add deeper labs, more break/fix troubleshooting, and a second mock exam.

In both plans, schedule a weekly “integration day” where you connect CV/NLP outputs into Azure AI Search and test end-to-end behavior.

Exam Tip: Don’t postpone the mock exam until you “feel ready.” Take it when you have baseline coverage, then let the results drive your final study sprints.

Chapter milestones
  • Understand the AI-102 blueprint and domain weights
  • Register, schedule, and set up your exam environment
  • Scoring, question formats, and time management tactics
  • Build your 2-week and 4-week study plan with checkpoints
Chapter quiz

1. You are starting AI-102 preparation for a team that is strong in coding but weak in solution design. The team keeps asking for a list of APIs to memorize. Which guidance best aligns to how AI-102 is assessed?

Correct answer: Focus on decision-making: choose the right Azure AI service and architecture under constraints (security, cost, latency, deployment, monitoring, and responsible AI).
AI-102 emphasizes designing and implementing Azure AI solutions in scenarios, not recalling parameter names or language syntax. Option A matches the exam’s scenario-driven focus on constraints (security, cost, latency, deployability, monitoring/evaluation, responsible AI). Option B is wrong because exam items rarely hinge on exact API schema trivia. Option C is wrong because the exam is vendor-solution oriented; implementation details are typically conceptual or service-choice focused rather than language-specific syntax.

2. Your exam is in 10 days. You want to minimize exam-day issues and ensure the test environment is ready. Which action should you complete first?

Correct answer: Review the official AI-102 skills outline (blueprint) and map study time to the weighted domains before building your schedule.
A study strategy aligned to the official skills outline and domain weights reduces surprises and matches how certification objectives are structured. Option A is correct because it anchors preparation to the blueprint and weighting. Option B is wrong because a single project may overfit to one approach and ignore other tested domains; also you cannot reference your project during the exam. Option C is wrong because relying on pattern recognition over objectives risks gaps and does not align to the certification’s domain coverage.

3. You are in a timed AI-102 exam. You encounter a long scenario with multiple requirements and three answers that all appear feasible. What is the most reliable tactic to select the best answer in Microsoft-style questions?

Correct answer: Choose the option that is most secure-by-default, least-privilege aligned, and maintainable while meeting stated constraints (for example, data residency, private networking, and monitoring).
When multiple solutions could work, Microsoft exams commonly reward the one best aligned to explicit constraints and good practices (secure-by-default, least privilege, maintainability, and monitoring/evaluation). Option A reflects this decision-making framing. Option B is wrong because 'newest' is not a consistent selection rule and may conflict with requirements like residency or networking. Option C is wrong because fewer steps is not inherently better; the exam often prefers correct governance, security, and operational readiness even if it requires additional configuration.

4. A company is planning an on-site proctored AI-102 exam for multiple employees. Some employees want to use their corporate laptops with strict security policies and VPN always-on. What is the best preparation step to reduce the risk of technical issues during the exam?

Correct answer: Run the proctoring system check and validate the exam environment (network, permissions, camera/mic, and security/VPN constraints) well in advance, and have a contingency plan.
Exam readiness includes confirming the test environment and avoiding surprises (connectivity, permissions, and proctoring requirements). Option A is correct because it validates constraints early and plans mitigation. Option B is wrong because environment issues are a common failure point and the candidate is responsible for readiness. Option C is wrong because it is not a realistic or compliant recommendation; certification guidance emphasizes preparing within policy constraints rather than removing controls.

5. You have two weeks to prepare for AI-102. You already know Azure basics but have limited time. Which study plan structure best matches the chapter’s recommended approach?

Correct answer: Build a 2-week plan with checkpoints aligned to the skills outline and domain weights, using targeted practice to identify gaps and revisiting weak areas before exam day.
A checkpoint-based plan aligned to the official outline reduces surprises and ensures coverage across domains. Option A is correct because it combines weighted objectives, gap-finding, and iteration. Option B is wrong because it defers feedback until too late and lacks checkpoints/time management. Option C is wrong because—even if the course focuses on vision and NLP—the AI-102 exam expects broader competence across the blueprint, including service choice, deployment, security, and monitoring.

Chapter 2: Plan and Manage an Azure AI Solution

AI-102 doesn’t only test whether you can call an API—it tests whether you can design a secure, operable, and cost-aware solution that can survive real production constraints. In this chapter, you’ll practice the mindset the exam rewards: map a scenario to the right Azure AI services, plan regions and quotas early, lock down identities and networks, and then prove the system is observable and supportable. The “gotcha” on many questions is that multiple choices can “work,” but only one meets the stated constraints (data residency, private connectivity, managed identity, budget, latency, or compliance).

As you read, keep two decision loops in mind. First: architecture fit (service selection, dependencies, networking). Second: operations fit (security, monitoring, incident response, and cost/performance tuning). The lessons in this chapter align to those loops and to how AI-102 frames “plan and manage” tasks: choose services, secure them, deploy and troubleshoot, then optimize.

Exam Tip: When a question includes words like “must not traverse the public internet,” “customer-managed keys,” “data must remain in region,” or “least privilege,” treat them as hard constraints. Eliminate any option that violates them, even if it is otherwise a good design.

Practice note: for each milestone in this chapter (designing the right Azure AI architecture for a scenario; securing identities, data, and endpoints for Azure AI workloads; deploying, monitoring, and troubleshooting Azure AI resources; and the cost, performance, and reliability optimization practice set), document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 2.1: Service selection—Azure AI services vs Azure OpenAI vs custom models

Service selection is a frequent exam objective because it drives everything else: security model, cost profile, latency, and even which monitoring signals you can capture. AI-102 commonly distinguishes between (1) Azure AI services (prebuilt vision/speech/language endpoints), (2) Azure OpenAI (generative models and embeddings), and (3) custom model paths (Azure Machine Learning, custom vision training, or fine-tuning where applicable).

Use Azure AI services when the task is a well-known capability with a managed API surface: OCR, image tagging, face blurring (where supported), language detection, key phrase extraction, translation, etc. These services usually minimize ML ops overhead and are strong answers when the scenario emphasizes speed-to-market and standard features. Use Azure OpenAI when the scenario explicitly needs generative responses, conversational behavior, summarization with instruction-following, or retrieval-augmented generation (RAG) using embeddings. Consider custom models when the scenario calls for domain-specific labels, unique visual categories, specialized terminology, or strict control over training/evaluation beyond what prebuilt models offer.

Exam Tip: If the scenario needs “grounded answers from internal documents” or “reduce hallucinations,” the intended direction is RAG: Azure AI Search + embeddings (Azure OpenAI) + prompt that cites sources. Don’t default to fine-tuning when the issue is missing knowledge rather than model behavior.

Common trap: picking generative AI for deterministic extraction. If the requirement is structured, repeatable extraction (invoice fields, IDs, form tables), the exam often prefers Document Intelligence or prebuilt extraction rather than a chat completion. Another trap is ignoring multimodal constraints: some workloads require image+text reasoning, but many enterprise designs still separate steps (vision OCR to text, then language processing). On AI-102, choose the simplest reliable composition that meets requirements.

  • Azure AI services: best for standard CV/NLP tasks, straightforward deployment, stable outputs.
  • Azure OpenAI: best for generation, summarization, Q&A, tool/function calling, embeddings for semantic retrieval.
  • Custom models: best for specialized domains, bespoke labels, controlled training lifecycle and evaluation.

To identify the correct answer, underline verbs in the scenario: “classify,” “extract fields,” “summarize,” “converse,” “search,” “ground,” “moderate,” “detect objects.” Then map each verb to the managed service with the least operational complexity that still satisfies constraints.
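The verb-to-service mapping above can be sketched as a small lookup table. The verb list and service pairings below are illustrative study aids, not an official Microsoft taxonomy, and the helper function is hypothetical:

```python
# Hypothetical decision helper: map scenario verbs to the managed Azure
# service with the least operational complexity. Pairings are illustrative.
VERB_TO_SERVICE = {
    "classify": "Azure AI Language (custom text classification)",
    "extract fields": "Azure AI Document Intelligence",
    "summarize": "Azure OpenAI (chat completion)",
    "converse": "Azure OpenAI (chat completion)",
    "search": "Azure AI Search",
    "ground": "Azure AI Search + Azure OpenAI (RAG)",
    "moderate": "Azure AI Content Safety",
    "detect objects": "Azure AI Vision (image analysis)",
}

def suggest_services(scenario: str) -> list[str]:
    """Return a candidate service for every requirement verb in the scenario."""
    text = scenario.lower()
    return [svc for verb, svc in VERB_TO_SERVICE.items() if verb in text]

print(suggest_services("Users converse with a bot that must ground answers in policy docs"))
# → ['Azure OpenAI (chat completion)', 'Azure AI Search + Azure OpenAI (RAG)']
```

On the exam you do this mapping mentally, but writing it out once makes the "verb first, service second" habit stick.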

Section 2.2: Resource planning—regions, quotas, SKUs, networking, and dependencies

Resource planning shows up on the exam as “why won’t this deploy?” or “why is it throttling?” questions. Start with region alignment: data residency, latency targets, and service availability differ by region. Many solutions fail in design because dependencies land in different regions or have incompatible networking options. For example, an end-to-end RAG architecture might require Azure OpenAI, Azure AI Search, Storage, and Key Vault; the best design places them in the same region when possible to reduce latency and simplify egress patterns.

Quotas and capacity are another test favorite. Azure OpenAI enforces quotas (e.g., tokens per minute) at subscription/region/model levels; Azure AI services can throttle per pricing tier and per resource. If a scenario mentions “429,” “rate limit,” or “spikes,” expect to choose a design that adds buffering (queues), retries with backoff, or requests quota increases. Also note SKU selection: higher tiers can provide higher throughput, larger document limits, or features like private networking and customer-managed keys in some services.

Exam Tip: When the requirement includes “private connectivity,” confirm each dependency supports private endpoints in the chosen region and tier. The correct answer is often the one that explicitly calls out Private Link plus correct DNS planning.

Networking planning is not optional for AI-102. Know the difference between service endpoints (for some PaaS) and private endpoints (Private Link). Private endpoints give a private IP in your VNet, but they introduce DNS requirements (private DNS zones) and can break clients if name resolution is not planned. Dependencies also include identity providers (Microsoft Entra ID), logging backends (Log Analytics workspace), and secrets stores (Key Vault). The exam expects you to recognize that “secure by default” includes wiring these foundational resources, not bolting them on later.

  • Plan regions for compliance and latency; verify service availability.
  • Validate quotas early (especially Azure OpenAI) and design for throttling.
  • Choose SKUs that meet features (throughput, private endpoints, CMK).
  • Model dependencies: Search, Storage, Key Vault, Monitor, networking.

Common trap: assuming all AI services can be deployed in any region and then proposing cross-region calls that violate “data must remain in region.” If a question emphasizes residency, avoid designs that move content to another region for processing, logging, or indexing unless explicitly allowed.

Section 2.3: Security—Managed Identities, Key Vault, RBAC, private endpoints, and content safety

AI-102 security questions usually hide the answer in “how should the app authenticate” and “how should traffic flow.” For authentication, prefer Microsoft Entra ID with Managed Identities for Azure-hosted workloads (Functions, App Service, Container Apps, AKS). Managed identity avoids secrets in code and integrates cleanly with RBAC. If the scenario describes “rotation overhead” or “no secrets,” the correct answer nearly always involves a managed identity plus role assignments on the target resource.

Key Vault is central for secret management and often for customer-managed keys (CMK) where supported. Use Key Vault to store API keys when you cannot use Entra-based auth (some services still require keys) and to manage certificates. Combine Key Vault access policies or RBAC (depending on vault configuration) with managed identity. A recurring trap: granting overly broad permissions (“Owner,” “Contributor”) when the question asks for least privilege. The exam expects you to choose specific roles (for example, Key Vault Secrets User vs Key Vault Administrator) that match the access need.

Private endpoints and network isolation are tested as architecture patterns: put AI resources behind Private Link, restrict public network access, and route traffic within VNets. This includes planning private DNS so the service FQDN resolves to the private IP. If you see “exfiltration risk,” “no public internet,” or “internal-only,” prioritize private endpoints and disable public access where possible.

Exam Tip: If both “Managed Identity” and “store the key in Key Vault” appear as options, choose Managed Identity for service-to-service auth when supported. Use Key Vault for secrets only when you must.

Content safety is part of security because it protects users and the organization. For generative scenarios, implement input/output moderation using Azure AI Content Safety (or built-in content filtering where applicable) and log moderation actions for audit. Don’t confuse “content safety” with “data encryption”; the exam can place both in the same scenario, and you must address each explicitly.

  • AuthN/AuthZ: Entra ID + Managed Identity + RBAC for least privilege.
  • Secrets/keys: Key Vault; prefer CMK if compliance requires it.
  • Network: Private Endpoints + private DNS + disable public access.
  • GenAI safety: content moderation, policy enforcement, auditing.
Section 2.4: Responsible AI—logging, evaluation, human-in-the-loop, and compliance considerations

Responsible AI is tested less as philosophy and more as implementation: what do you log, how do you evaluate, and how do you demonstrate controls. Logging should include prompts, system messages, model parameters, retrieval citations, safety filter outcomes, and user feedback signals—while still honoring privacy requirements. The exam may force you to balance observability with compliance: you might need to redact PII before logs are persisted, or store only hashes/metadata depending on policy.

Evaluation is a lifecycle requirement, not a one-time test. For GenAI, evaluate groundedness, relevance, toxicity, and policy adherence; for CV/NLP, evaluate accuracy, precision/recall, and error patterns by segment. If the scenario mentions “regression after update,” the correct architecture includes versioning (prompts, index schema, models), automated evaluation, and staged rollouts. Human-in-the-loop appears when the business impact is high (medical, financial approvals, identity verification) or when the confidence score is below a threshold. That design often includes an exception queue and an annotation/review workflow rather than trying to “prompt harder.”

Exam Tip: If the question states “must be auditable” or “must support investigations,” prioritize immutable logs (with retention policies), trace correlation IDs, and clear separation of duties for who can access sensitive logs.

Compliance considerations commonly include data retention, residency, encryption, and access controls. A trap is treating “we don’t store customer data” as meaning “we don’t need governance.” Even if you don’t persist content, you still need to manage access, monitor abuse, and document model behavior and limitations. Expect scenario constraints like “HIPAA,” “GDPR,” or “internal policy prohibits storing prompts.” In those cases, choose architectures that minimize data storage, apply redaction, and enforce strict RBAC, while still providing operational telemetry.

  • Log safely: redact/limit PII, store necessary metadata, enforce retention.
  • Evaluate continuously: offline test sets, regression checks, staged deployments.
  • Human-in-the-loop: thresholds, exception queues, reviewer access controls.
  • Compliance: residency, encryption, separation of duties, audit trails.
Section 2.5: Monitoring and operations—Azure Monitor, Application Insights, alerts, and runbooks

Operations is where “plan and manage” becomes real. AI-102 expects you to understand how to instrument and troubleshoot AI workloads using Azure Monitor, Log Analytics, and Application Insights. For an application layer (web app, Function, container), use Application Insights for request tracing, dependency calls, exceptions, and end-to-end transaction maps. For platform metrics (throttling, latency, failures), use Azure Monitor metrics and diagnostic settings to stream logs to a Log Analytics workspace or storage.

Alerting should be symptom-based and actionable: spikes in 4xx/5xx, increased latency, token usage anomalies, queue backlogs, search indexing failures, or content safety blocks exceeding thresholds. A common exam trap is choosing alerts without an action plan. AI-102 likes answers that include runbooks (Azure Automation, Logic Apps, or documented procedures) and clear ownership: who responds, what to check first, and how to roll back or mitigate (scale out, increase quota, switch deployment, reduce prompt size, or degrade gracefully).

Exam Tip: When you see “intermittent failures” plus “works in dev,” suspect throttling, networking/DNS (private endpoint), or identity/permissions differences. The best answer usually includes telemetry that can distinguish these quickly (dependency logs + platform metrics + correlation IDs).

Troubleshooting patterns the exam uses: 401/403 (RBAC/identity), 404 (wrong endpoint/deployment name), 429 (quota/throttle), timeouts (network path, large payloads, slow search queries), and indexing delays (skillset errors, document chunking, data source credentials). Your operational design should include retries with exponential backoff, circuit breakers, and idempotent processing for queued workloads.
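A circuit breaker, one of the operational patterns listed above, can be sketched as follows. The threshold and cooldown values are illustrative; production implementations add half-open probing and thread safety:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after `threshold` consecutive failures the
    circuit opens and calls fail fast for `cooldown` seconds, protecting a
    throttled or unhealthy downstream service from retry storms."""

    def __init__(self, threshold=3, cooldown=30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def call(self, operation):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # cooldown elapsed: allow a trial call
            self.failures = 0
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success resets the failure count
        return result

# Demo: two timeouts open the circuit; the next call fails fast.
breaker = CircuitBreaker(threshold=2, cooldown=60.0)

def failing_call():
    raise TimeoutError("upstream timeout")

for _ in range(2):
    try:
        breaker.call(failing_call)
    except TimeoutError:
        pass

try:
    breaker.call(lambda: "ok")
except RuntimeError as err:
    print(err)  # → circuit open: failing fast
```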

  • Instrument apps with Application Insights; correlate traces with IDs.
  • Enable diagnostic settings on AI resources to Log Analytics.
  • Create alerts tied to runbooks; avoid “alert fatigue.”
  • Design for throttling: queues, retries, backoff, and load shaping.

Cost and performance optimization is part of operations: monitor token consumption, cache hit rates (for embeddings or retrieved passages), and search query performance. On the exam, the "optimize" answer is often: reduce prompt tokens, chunk documents appropriately, cache embeddings, and use the smallest model that meets quality targets.
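The "cache embeddings" idea can be sketched as an in-memory cache keyed by a content hash, with hit-rate tracking for the monitoring signal mentioned above. `embed_fn` is a stand-in for a paid embeddings API call; a real deployment would use an external cache (e.g., Redis) so entries survive restarts:

```python
import hashlib

class EmbeddingCache:
    """Illustrative embedding cache: identical text never triggers a second
    embeddings call, and the hit rate is tracked as an ops metric."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, text: str):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = self.embed_fn(text)  # paid API call only here
        return self.store[key]

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Demo with a fake embedding function that counts invocations.
api_calls = {"n": 0}
def fake_embed(text):
    api_calls["n"] += 1
    return [float(len(text))]  # placeholder vector

cache = EmbeddingCache(fake_embed)
for _ in range(3):
    cache.get("What is our refund policy?")
print(api_calls["n"], round(cache.hit_rate, 2))  # → 1 0.67
```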

Section 2.6: Exam-style scenarios—tradeoffs, constraints, and architecture decision questions

Architecture decision questions reward disciplined elimination. Start by listing constraints: private networking, residency, latency, throughput, cost ceiling, and governance. Next, map the required capabilities to services (Section 2.1), then validate feasibility via region/SKU/quota (Section 2.2). Finally, check that security and operations are explicitly addressed (Sections 2.3–2.5). Many wrong answers are “partial designs” that meet the feature need but ignore identity, monitoring, or compliance.

Tradeoffs are often subtle. If the scenario emphasizes minimal ops and standard extraction, managed Azure AI services win over custom models. If it emphasizes “domain-specific accuracy” and controlled evaluation, custom training plus a lifecycle (data versioning, evaluation gates) becomes more appropriate. For GenAI, if the issue is factuality on internal content, RAG is the typical answer; if the issue is style or format consistency, prompt engineering and structured outputs may be enough; if the issue is consistent domain jargon across many tasks, then fine-tuning (where supported and justified) might be proposed—provided you still include safety controls and evaluation.

Exam Tip: The exam loves “choose the best option” where two are technically valid. The tiebreakers are usually (1) meeting a stated constraint, (2) least privilege/secretless auth, (3) private connectivity, and (4) operational readiness (alerts/runbooks).

Constraints also drive deployment choices. If an organization requires “no public endpoints,” ensure every hop supports private endpoints and that name resolution is planned. If the constraint is “must handle burst traffic,” prefer queue-based ingestion and asynchronous processing over synchronous fan-out calls. If the constraint is “predictable cost,” choose rate limiting, caching, and model selection that bounds token usage, plus budgets and alerts.

  • Eliminate answers that violate hard constraints (residency, private-only, least privilege).
  • Prefer managed services unless the scenario justifies custom training/ops.
  • Prove operability: logs, metrics, alerts, runbooks, rollback plan.
  • Optimize: right-size models, reduce tokens, cache embeddings, tune search.

By the time you finish a scenario, you should be able to say: “This design can be deployed in the right region, authenticated without embedded secrets, reached privately, observed end-to-end, and operated under cost and quota limits.” That is the exam’s definition of “plan and manage” for Azure AI solutions.

Chapter milestones
  • Design the right Azure AI architecture for a scenario
  • Secure identities, data, and endpoints for Azure AI workloads
  • Deploy, monitor, and troubleshoot Azure AI resources
  • Cost, performance, and reliability optimization practice set
Chapter quiz

1. A healthcare company is building an image analysis app using Azure AI services. The solution must ensure that inference traffic and keys never traverse the public internet, and the app runs in an Azure VNet. Which design best meets the requirements with least operational overhead?

Correct answer: Deploy the Azure AI service with a Private Endpoint, disable public network access, and have the app authenticate using a managed identity with RBAC.
Private Endpoint provides private connectivity from the VNet and disabling public network access enforces the 'must not traverse the public internet' constraint. Using managed identity + RBAC avoids key distribution and supports least-privilege. IP allowlists (B) still use public endpoints and storing keys in config is insecure. WAF/App Gateway (C) protects inbound HTTP to the app but does not make the outbound call to the AI service private; it still uses public internet and API keys.

2. A company must deploy an Azure AI solution that processes customer text data. The data must remain in Germany for compliance, and the team wants to minimize latency for users in Berlin. What should you do first during planning?

Correct answer: Select an Azure region in Germany that supports the required Azure AI service and validate service availability/quotas for that region before deployment.
Data residency is a hard constraint: you must choose a Germany region that supports the needed Azure AI capability and confirm availability/quotas early (AI-102 planning emphasis). Front Door (B) can reduce latency but does not satisfy the requirement that data remain in Germany if the service is hosted in West Europe. GRS (C) can replicate data to paired regions and can violate strict residency requirements; it also doesn’t control where the AI service processes data.

3. You deploy an Azure AI resource used by a production web app. You need to monitor request volume, throttling, and failed calls, and you must be able to troubleshoot individual end-to-end requests across services. Which approach best meets these requirements?

Correct answer: Enable diagnostic settings on the Azure AI resource to send logs/metrics to Log Analytics and instrument the application with Application Insights for distributed tracing.
Diagnostic settings + Log Analytics provide centralized metrics/logs (including request counts and failures where available), and Application Insights enables end-to-end request correlation/tracing across components—key for troubleshooting. Advisor (B) is optimization guidance, not request-level observability. Policy and Service Health (C) are useful for governance and platform incidents but don’t provide per-request tracing or detailed failure analysis.

4. A team experiences intermittent HTTP 429 (Too Many Requests) responses from an Azure AI endpoint during peak usage. They must reduce throttling while controlling costs. Which action is the best first step?

Correct answer: Review the service’s quota/throughput limits for the region and tier, then right-size by requesting quota increases or adjusting the pricing tier/throughput configuration based on measured demand.
429s commonly indicate throttling due to quotas/throughput limits. AI-102 expects validating quotas early and using monitoring to right-size capacity (including requesting quota increases where applicable) while balancing cost. Retrying without backoff (B) can amplify throttling and increase cost; proper exponential backoff is typical, but it doesn’t address sustained capacity limits. Storage redundancy/zone redundancy (C) relates to durability/availability, not service request throttling.

5. A financial services company requires customer-managed keys (CMK) for encryption at rest for an Azure AI workload. They also want to enforce least privilege for key access. Which design best meets the requirement?

Correct answer: Store keys in Azure Key Vault and configure the Azure AI resource to use CMK, granting the service access via a managed identity with only the required Key Vault key permissions.
CMK scenarios typically use Azure Key Vault (or Managed HSM where supported) and a managed identity for the service/resource to access the key with least-privilege permissions—aligning with exam emphasis on CMK and identity-based access. Client-side encryption with SAS (B) doesn’t satisfy CMK for encryption at rest on the Azure service and introduces key management risk. Microsoft-managed keys (C) violate the CMK requirement; IP allowlisting doesn’t address encryption key ownership or access control.

Chapter 3: Implement Generative AI Solutions

This chapter maps directly to the AI-102 skills measured around implementing generative AI with Azure OpenAI: building chat/completions end-to-end, designing prompts, implementing Retrieval-Augmented Generation (RAG) with Azure AI Search and embeddings, and applying safety controls. The exam typically frames these as scenario questions: you are given constraints (latency, cost, accuracy, data boundaries, compliance) and must choose the best design or configuration.

You should be able to explain the difference between model selection vs deployment configuration, how token limits affect prompt construction, and how grounding changes answer quality and risk. You should also recognize when a “clever prompt” is not enough and the correct answer is a RAG architecture, a safety control, or an evaluation loop.

Exam Tip: When a question mentions “hallucinations,” “must cite sources,” “use company documents,” or “keep answers up to date,” the expected direction is grounding/RAG—not more few-shot examples.

Practice note for this chapter's milestones (building Azure OpenAI chat and completion solutions end-to-end; prompt engineering, evaluation, and prompt flow patterns; RAG implementation with Azure AI Search and embeddings; and the exam-style question set on safety, latency, and grounding): for each milestone, document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.


Sections in this chapter
Section 3.1: Azure OpenAI fundamentals—models, deployments, tokens, and limits

AI-102 expects you to distinguish Azure OpenAI models (capabilities) from deployments (your provisioned endpoints/config). In Azure OpenAI, you select a model family suited to the task (chat, reasoning, embeddings) and create a deployment name that your application calls. On the exam, the “right answer” is often to create separate deployments for different workloads (e.g., low-latency chat vs back-office summarization) so you can scale, monitor, and apply quotas independently.

Token math is a frequent hidden constraint. Your prompt plus retrieved context plus the model’s output must fit the model’s context window. If a scenario includes long documents, multi-turn conversations, or verbose system prompts, look for designs that reduce tokens: summarizing chat history, chunking documents, retrieving fewer passages, or using structured outputs instead of verbose prose. Rate limits and throughput constraints are also commonly tested—especially when a system must support many concurrent users.
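The token math above reduces to one inequality: input (system message + history + retrieved context) plus the reserved output budget must fit the context window. A minimal check, assuming token counts come from a tokenizer such as tiktoken (the numbers below are illustrative):

```python
def fits_context(system_tokens, history_tokens, retrieved_tokens,
                 max_output_tokens, context_window):
    """True if the request fits: input tokens + reserved output tokens
    must not exceed the model's context window."""
    input_tokens = system_tokens + history_tokens + retrieved_tokens
    return input_tokens + max_output_tokens <= context_window

# An 8k-context model, a verbose prompt, and five 900-token retrieved chunks:
print(fits_context(system_tokens=600, history_tokens=2400,
                   retrieved_tokens=5 * 900, max_output_tokens=1000,
                   context_window=8192))  # → False (7500 in + 1000 out > 8192)
# Retrieving three chunks instead of five brings it under budget:
print(fits_context(600, 2400, 3 * 900, 1000, 8192))  # → True
```

This is why "retrieve fewer passages" and "summarize chat history" are valid exam answers: they shrink the left side of the inequality.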

  • Tokens: input + output; long system messages and excessive RAG context are the fastest way to blow the budget.
  • Deployments: provide isolation for quotas, scaling, and monitoring; the deployment name is what apps call.
  • Completions vs chat: most modern app patterns use chat-style messages; completions may still appear in legacy scenarios.

Exam Tip: If the scenario complains about “intermittent 429 responses,” suspect rate limiting/throughput. The best fix is usually capacity planning (quota/throughput, retry with backoff, caching, batching, or splitting workloads), not changing the prompt wording.

Common trap: confusing Azure OpenAI with OpenAI public endpoints. The exam generally wants Azure-native controls (resource governance, networking, monitoring) and Azure service integration (Azure AI Search, Azure Monitor, Key Vault) rather than generic API advice.

Section 3.2: Prompt design—system instructions, few-shot, tool hints, and structured outputs

Prompt engineering on AI-102 is less about “clever wording” and more about repeatability, control, and evaluation. The exam expects you to use system instructions to define role, policy boundaries, and output format. If a scenario requires consistent behavior across users, the system message is the correct place to put durable rules (tone, refusal behavior, citation requirements, formatting), while user messages remain task-specific.

Few-shot examples are best for teaching a specific transformation (e.g., extract fields, classify intents) when you cannot fine-tune or when the data pattern is narrow. However, few-shot examples add tokens, which raises cost, increases latency, and reduces throughput, so they are not always the best answer. For structured outputs (JSON, YAML, schema-based responses), the exam often rewards approaches that constrain the model: explicit schema instructions, "respond with only JSON," and validation in code. This reduces downstream parsing errors and is a common production requirement.
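The "validation in code" step for structured outputs can be sketched as follows. The required-keys check is a simplified stand-in for a full schema; production code might use jsonschema or Pydantic, and the example responses are hypothetical:

```python
import json

def parse_structured_output(raw: str, required_keys: set):
    """Validate a model response that was instructed to respond with only
    JSON. Return the parsed dict, or None so the caller can retry with a
    corrective prompt or fall back."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # model ignored the format instruction
    if not isinstance(data, dict) or not required_keys <= data.keys():
        return None  # wrong shape or missing required fields
    return data

good = '{"intent": "refund", "confidence": 0.93}'
bad = "Sure! Here is the JSON you asked for: {...}"
print(parse_structured_output(good, {"intent", "confidence"}))  # → parsed dict
print(parse_structured_output(bad, {"intent", "confidence"}))   # → None
```

Pairing this with a bounded retry loop (re-prompt once or twice on None, then escalate) is the repeatability the exam is looking for.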

  • System instructions: policy and invariant behavior; keep them concise to reduce token overhead.
  • Few-shot: improves consistency for narrow tasks; watch cost/latency and avoid overly long examples.
  • Tool hints: tell the model when to call tools or functions and what arguments to provide.
  • Structured outputs: reduce ambiguity; pair with programmatic validation and retries.

Exam Tip: When the question mentions “must integrate with downstream systems,” “must parse reliably,” or “must enforce a schema,” prefer structured outputs plus validation over free-form natural language.

Common trap: using the user message to enforce safety or policy. The exam typically expects policy enforcement to be in system instructions and platform controls (content filtering, access controls), not something the user can override.

Section 3.3: Grounding and RAG—embeddings, chunking, retrieval, citations, and freshness

Retrieval-Augmented Generation is a centerpiece objective: combine Azure OpenAI with Azure AI Search to ground answers in enterprise content. The exam looks for correct component roles: embeddings represent text as vectors; Azure AI Search stores and retrieves relevant chunks (vector and/or hybrid search); the chat model synthesizes an answer using retrieved passages; and citations are produced by tracking which chunks were used.

Chunking is a practical detail that shows up in “why is retrieval poor?” scenarios. If chunks are too large, you waste tokens and dilute relevance; too small, you lose context. A typical design uses overlapping chunks and stores metadata (source, page, URL, timestamp) to support citations and filtering (e.g., only show documents the user is permitted to access). Hybrid search (keyword + vector) is often the best answer when content includes identifiers, product codes, or exact phrases.
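Overlapping chunking can be sketched in a few lines. Chunk size and overlap below are illustrative and should be tuned per corpus; the function works on a pre-tokenized word list for clarity:

```python
def chunk_text(words, chunk_size=200, overlap=40):
    """Split a word/token list into overlapping chunks. The overlap carries
    context across boundaries so a sentence split between chunks still
    retrieves well."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # last chunk reached the end of the document
    return chunks

words = [f"w{i}" for i in range(500)]
parts = chunk_text(words, chunk_size=200, overlap=40)
print([len(p) for p in parts])  # → [200, 200, 180]
```

In a real indexer each chunk would also carry the metadata the paragraph above mentions (source, page, URL, timestamp) so citations and security filtering work downstream.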

  • Embeddings: use an embeddings model to create vectors for documents and queries.
  • Retrieval: top-k selection; apply filters for security trimming and recency.
  • Citations: return source links/IDs; ensure the model is instructed to cite retrieved sources only.
  • Freshness: re-indexing frequency, incremental indexing, and cache invalidation affect how current answers are.
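The retrieval step above (top-k selection over vectors) reduces to cosine similarity ranking. The toy 3-dimensional vectors below are illustrative; Azure AI Search replaces this brute-force loop with an approximate-nearest-neighbor index at scale:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, docs, k=2):
    """Rank (doc_id, vector) pairs by similarity to the query; return IDs."""
    scored = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

docs = [
    ("refund-policy", [0.9, 0.1, 0.0]),
    ("shipping-faq", [0.1, 0.9, 0.0]),
    ("brand-history", [0.0, 0.1, 0.9]),
]
print(top_k([0.8, 0.2, 0.0], docs, k=2))  # → ['refund-policy', 'shipping-faq']
```

Real embeddings have hundreds or thousands of dimensions, but the ranking logic is the same, which is why "embeddings do not contain the data" (trap 1 below) matters: only the vectors and stored metadata exist at query time.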

Exam Tip: If the scenario says “answers must be based only on internal docs” or “include references,” the correct design includes RAG plus an instruction to avoid unsupported claims, not just temperature changes.

Common traps include: (1) assuming embeddings “contain the data” (they do not; they are numeric representations), (2) forgetting that citations require you to store and pass source metadata, and (3) ignoring freshness—if the content changes daily, a one-time index build is not a valid solution.

Section 3.4: Orchestration patterns—prompt flow, chaining, and evaluation loops

Real solutions rarely succeed with a single prompt. The exam tests whether you can select orchestration patterns that improve reliability: chaining (multi-step prompts), routing (choose a path based on classification), and evaluation loops (automated checks for quality and safety). Azure AI Foundry prompt flow (often referenced as “prompt flow”) appears as a way to design, version, and evaluate multi-step GenAI workflows with repeatable datasets and metrics.

Chaining patterns include: (1) rewrite the user question for retrieval, (2) retrieve context, (3) generate an answer with citations, and (4) run a verification step that checks for missing citations or policy violations. For latency-sensitive apps, the “best” pattern may be a minimal chain with caching and smaller models for intermediate steps. For high-stakes outputs (legal, medical, finance), the best pattern often adds verification and human review checkpoints.
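The four-step chain above can be sketched with each step as a stand-in function. In a real flow, rewrite and generate would call a model and retrieve would call Azure AI Search; here they are pure functions (with a hypothetical one-document corpus) so the control flow is visible:

```python
def rewrite_query(question):
    return question.lower().rstrip("?")                # 1. rewrite for retrieval

def retrieve(query):
    corpus = {"refund policy": "Refunds are issued within 14 days. [doc1]"}
    return [v for k, v in corpus.items() if k in query]  # 2. retrieve context

def generate(question, passages):
    return f"Answer based on {len(passages)} source(s): {passages[0]}"  # 3. answer

def verify(answer):
    return "[doc" in answer                            # 4. check citations exist

def answer_with_verification(question):
    passages = retrieve(rewrite_query(question))
    if not passages:
        return "I don't know based on the available documents."
    answer = generate(question, passages)
    return answer if verify(answer) else "ESCALATE: missing citations"

print(answer_with_verification("What is the refund policy?"))
print(answer_with_verification("Tell me a joke"))  # → I don't know based on the available documents.
```

Note the two failure paths: no retrieved context yields an explicit "I don't know," and a missing citation escalates instead of shipping an ungrounded answer.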

  • Prompt flow: orchestrate steps, manage variants, run evaluations, and track outputs.
  • Chaining: decomposes tasks; reduces hallucination by separating retrieval and generation concerns.
  • Evaluation loops: regression testing on prompt/model changes; measure groundedness, relevance, and safety.
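
The four-step chain described above can be stubbed out to show where the verification gate sits. Every function here is a hypothetical stand-in; a real implementation would call Azure OpenAI for the rewrite and generate steps and Azure AI Search for retrieval.

```python
# Sketch of a rewrite -> retrieve -> generate -> verify chain.
# All step functions are stand-ins for model/search calls.

def rewrite_query(question):
    return question.lower().strip("?")   # stand-in for an LLM rewrite step

def retrieve(query, index):
    return [d for d in index if query in d["text"].lower()][:3]   # top-k = 3

def generate(question, passages):
    cites = ", ".join(p["id"] for p in passages)
    return f"Answer based on: {cites}" if passages else "I don't know."

def verify(answer, passages):
    # Gate: every generated answer must cite at least one retrieved source.
    return any(p["id"] in answer for p in passages)

index = [{"id": "doc1", "text": "Refund policy details"},
         {"id": "doc2", "text": "Shipping policy details"}]

q = "Refund policy?"
passages = retrieve(rewrite_query(q), index)
answer = generate(q, passages)
assert verify(answer, passages)
```

Separating retrieval from generation is what lets the final verification step check citations mechanically instead of trusting the model's output.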

Exam Tip: When a scenario highlights “prompt changes broke the app” or “need repeatable testing,” look for prompt flow/evaluation datasets, versioning, and automated metrics—not ad-hoc manual testing.

Common trap: over-orchestrating every scenario. If the question emphasizes low latency and high volume, extra chains may be penalized. Look for cues like “sub-second” or “thousands of requests per minute” and choose simpler flows with caching and careful token management.

Section 3.5: Safety and governance—content filtering, PII handling, and policy enforcement

Safety controls are not optional in AI-102 scenarios. You should be prepared to apply layered mitigations: platform content filtering, prompt-level policies, data handling controls, and monitoring/auditing. The exam often presents “the model sometimes returns disallowed content” or “users submit personal data” and expects you to combine preventive controls with detection and response.

For PII, the correct approach is typically to minimize what you send to the model (data minimization), redact or mask identifiers where feasible, apply access controls, and log carefully. In RAG scenarios, ensure “security trimming” so retrieval respects user permissions; otherwise the model may reveal content from documents the user should not access.

  • Content filtering: use Azure OpenAI safety features; tune thresholds to balance false positives vs risk.
  • PII handling: redact/mask, encrypt at rest/in transit, restrict logging, and apply retention policies.
  • Policy enforcement: system instructions for refusal behavior + application-side checks + monitoring.
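
Data minimization before the model call can be as simple as masking identifiers. This is a hedged sketch with hand-written regex patterns chosen for the illustration; production systems would typically rely on a dedicated PII-detection capability instead.

```python
import re

# Sketch of pre-call data minimization: mask obvious identifiers before the
# text reaches the model or the logs. Patterns are illustrative only.

EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact(text):
    text = EMAIL.sub("[EMAIL]", text)
    text = SSN.sub("[SSN]", text)
    return text

safe = redact("Contact jane.doe@contoso.com, SSN 123-45-6789.")
# -> "Contact [EMAIL], SSN [SSN]."
```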

Exam Tip: If the scenario is about compliance or regulated industries, the “best answer” is rarely “change temperature.” It is usually governance: access control, private networking options, logging/auditing strategy, and explicit safety policies.

Common trap: relying solely on a prompt to block sensitive output. The exam typically expects you to use platform controls and application logic as well, because prompts can be bypassed and can fail under adversarial input.

Section 3.6: Exam-style practice set—choose the right design for real-world GenAI constraints

This lesson is about reading the question like an architect under constraints. AI-102 scenarios frequently bundle multiple requirements; your job is to identify the primary driver and pick the design that satisfies it with the fewest trade-offs. Look for keywords that map to tested objectives: “grounded,” “citations,” “fresh,” “low latency,” “high throughput,” “PII,” “must not store prompts,” “tenant isolation,” “evaluations,” and “regression testing.”

For end-to-end chat and completion solutions, expect to justify: which model type, how to structure messages, how to manage conversation history, and how to handle retries/timeouts. For RAG, expect to justify: chunking strategy, top-k retrieval, hybrid vs vector search, metadata filters, and citation handling. For safety, expect layered controls and auditability.

  • Latency constraint: reduce tokens, cache retrieval results, use smaller models for intermediate steps, and avoid unnecessary chains.
  • Grounding constraint: implement RAG with Azure AI Search, include citations, and instruct the model to answer only from retrieved content.
  • Freshness constraint: incremental indexing, scheduled ingestion, and clear cache invalidation.
  • Governance constraint: content filtering, PII minimization, security trimming, and monitored deployments.
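
One concrete lever for the latency constraint is caching retrieval results. A minimal sketch, assuming an in-memory store and a fixed TTL; the class name and the numbers are illustrative.

```python
import time

# Sketch of a small TTL cache for retrieval results: repeated queries skip
# the search round trip while the cached entry is still fresh.

class RetrievalCache:
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, query):
        hit = self._store.get(query)
        if hit and time.monotonic() - hit[1] < self.ttl:
            return hit[0]       # fresh: return cached passages
        return None             # miss or expired: caller re-queries the index

    def put(self, query, passages):
        self._store[query] = (passages, time.monotonic())

cache = RetrievalCache()
cache.put("refund policy", ["doc1", "doc2"])
cache.get("refund policy")   # -> ["doc1", "doc2"] while fresh
```

Note the tension with the freshness constraint: a long TTL saves latency but delays new content, so the TTL should track how often the index is re-built.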

Exam Tip: When two answers both “work,” choose the one that best satisfies the explicit constraints while aligning with Azure-native services. The exam rewards service-appropriate patterns (Azure AI Search for retrieval, prompt flow for evaluation/orchestration, Azure OpenAI safety controls for content moderation) rather than generic or manual solutions.

Common trap: selecting a solution that improves quality but violates constraints (e.g., adding more context when the issue is token limit; storing full prompts when the requirement says not to retain personal data; retrieving all documents instead of top-k). Train yourself to eliminate options that fail a single must-have requirement, even if they sound technically impressive.

Chapter milestones
  • Build Azure OpenAI chat and completion solutions end-to-end
  • Prompt engineering, evaluation, and prompt flow patterns
  • RAG implementation with Azure AI Search and embeddings
  • GenAI exam-style questions: safety, latency, and grounding
Chapter quiz

1. You are building an internal support assistant using Azure OpenAI. Users report occasional hallucinations, and compliance requires answers to be grounded in the latest approved policy documents and to include citations. Which design best meets the requirement?

Show answer
Correct answer: Implement Retrieval-Augmented Generation (RAG) by embedding policy documents, storing them in Azure AI Search, retrieving top passages at query time, and prompting the model to answer using only retrieved content with citations
RAG with Azure AI Search provides grounding on current documents and enables citations by returning source passages at query time, aligning with AI-102 scenarios about "must cite sources" and "keep answers up to date." Few-shot prompting (B) can improve format but cannot guarantee factuality or use of the latest documents. Fine-tuning (C) bakes knowledge into weights, is harder to keep current, and still does not guarantee citations or strict use of approved sources.

2. A team deploys a chat solution with Azure OpenAI. Some user conversations exceed the model context window and responses become inconsistent because earlier requirements fall out of the prompt. You must preserve key instructions while controlling token usage. What is the best approach?

Show answer
Correct answer: Summarize older turns into a compact running summary, keep critical system/developer instructions stable, and include only the most relevant recent messages in the prompt
Token limits affect prompt construction; a common exam-aligned pattern is to maintain stable system instructions and compress older conversation into a summary while sending only relevant turns. Sending the full history (A) will continue to hit context limits and increase latency/cost. Temperature (C) affects randomness/creativity and does not restore missing context—raising it can worsen reliability.

3. You are implementing RAG. The company has PDFs with many tables and short policy clauses. You need higher retrieval precision so the model receives only the most relevant excerpts. Which change most directly improves retrieval quality before generation?

Show answer
Correct answer: Chunk documents into semantically meaningful sections (for example, headings/clauses), generate embeddings per chunk, and tune Azure AI Search ranking (topK/filters) for retrieval
Retrieval precision in RAG is primarily driven by how content is chunked, embedded, and retrieved (including metadata filters and topK). Poor chunking reduces relevance and increases noise in the context window. Temperature (B) changes generation behavior and does not improve retrieval. Using one large chunk per PDF (C) usually harms relevance because embeddings become less specific and retrieval returns overly broad context.

4. A customer-facing chat app must meet strict safety requirements: it should reduce the chance of returning harmful content and should handle user prompts attempting policy bypass. Which solution aligns best with Azure generative AI safety controls?

Show answer
Correct answer: Enable and configure Azure OpenAI content filtering/safety system for both prompts and completions, and implement refusal/redirect behavior in the app when safety events occur
Certification scenarios emphasize using platform safety controls (content filtering/safety systems) plus application logic to handle blocked/flagged content. Prompt-only approaches (A) are not sufficient against jailbreaks and do not provide enforceable filtering. Increasing max_tokens (C) affects output length and cost/latency, not safety enforcement.

5. Your team is troubleshooting latency in a RAG-based assistant. Average response time increased after adding grounding. You need the biggest latency reduction without removing grounding. Which change is most effective?

Show answer
Correct answer: Stream responses from the model to the client and reduce the number of retrieved passages (topK) and context size sent to the model
RAG latency commonly comes from retrieval and large prompts. Streaming improves perceived latency, and reducing topK/context size reduces tokens processed, lowering generation time and cost while keeping grounding. Increasing embedding dimensionality (B) can increase compute and query time. More few-shot examples (C) increases prompt tokens and does not eliminate the need for retrieval when grounding is required.

Chapter 4: Implement an Agentic Solution

AI-102 increasingly expects you to think beyond “single prompt in, single response out” and instead design systems that can plan, call tools, and complete multi-step work with measurable reliability. In this chapter, you’ll map agent concepts to exam objectives: tool ecosystems (functions/APIs/connectors), multi-step planning and orchestration, memory/state, and the governance controls that keep agentic systems compliant and safe in production.

On the exam, the best answer is rarely “use an agent framework.” Instead, you must choose which tools to expose, how to constrain them, where to store state, and what monitoring/auditing is required. Expect questions that describe failures (hallucinated actions, runaway tool loops, data leakage, missing approvals) and ask you to remediate with grounding, permission boundaries, and observable workflows.

Exam Tip: When you see “multi-step,” “needs to call internal systems,” “must be auditable,” or “cannot leak tenant data,” translate that into: tool schemas + routing/orchestration + state storage + logging + policy enforcement. Those are the scoring keywords.

Practice note (applies to every objective in this chapter: designing agent architectures and tool ecosystems, implementing tool/function calling and multi-step planning, adding memory, state, and monitoring for agent reliability, and working the agentic exam-style questions): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 4.1: Agent concepts—plans, tools, autonomy levels, and failure modes

An agentic solution is a model-driven controller that decides what to do next and uses external capabilities (tools) to do it. For AI-102, treat “agent” as an architecture pattern rather than a product: a loop of (1) interpret user goal, (2) plan steps, (3) select tools, (4) execute, (5) observe results, (6) continue or stop.

Autonomy levels are central to exam scenarios. Low autonomy means the model suggests actions but a workflow or human approves. High autonomy means the model executes tool calls automatically. The exam will test whether you can pick the right level for risk: production changes, payments, PII access, and regulated operations typically require approvals and strict permissions.

  • Plans: explicit step lists, sometimes with checkpoints (validation, approval, reconciliation).
  • Tools: APIs, database queries, search retrieval, document extraction, ticketing systems, and vision/NLP endpoints.
  • Stop conditions: max iterations, confidence thresholds, and “no tool available” fallbacks.

Common failure modes appear frequently in case studies: tool hallucination (calling a tool that doesn’t exist), schema mismatch (wrong argument names/types), runaway loops (repeating calls without progress), partial completion (returns before finishing), and ungrounded conclusions (model answers without verifying via tools). Reliability improvements usually involve better tool design (clear schemas), explicit planning prompts, constrained routing, and robust post-tool validation.
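Two of these failure modes, tool hallucination and runaway loops, can be guarded with a few lines of orchestrator code. This is a sketch under stated assumptions: the tool registry, the pre-planned call list, and the iteration cap are all invented for the illustration.

```python
# Sketch of a guarded agent loop: unknown tools are rejected rather than
# executed, and a hard iteration cap prevents runaway loops.

TOOLS = {"search_kb": lambda q: f"results for {q}"}

def run_agent(planned_calls, max_iterations=5):
    transcript = []
    for i, (tool, arg) in enumerate(planned_calls):
        if i >= max_iterations:                      # runaway-loop guard
            transcript.append(("stopped", "max iterations reached"))
            break
        if tool not in TOOLS:                        # tool-hallucination guard
            transcript.append(("error", f"unknown tool: {tool}"))
            continue
        transcript.append((tool, TOOLS[tool](arg)))
    return transcript

log = run_agent([("search_kb", "refunds"), ("delete_db", "*")])
# -> [('search_kb', 'results for refunds'), ('error', 'unknown tool: delete_db')]
```

The point is that the guards live in the orchestrator, not in the prompt: the model can propose any call it likes, but only registered tools ever execute.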

Exam Tip: If a prompt or design “lets the model browse everything” or “gives broad database access,” expect that to be wrong. The correct answer usually narrows scope: least privilege, deterministic boundaries, and verifiable outputs.

Section 4.2: Tool integration—functions, APIs, connectors, and schema design

Tool integration is where agentic solutions become real systems. On Azure, the model is not the system of record; tools are. Your job is to expose tools in a way that is callable, safe, and testable. Function calling (tool calling) requires you to define tool names, descriptions, and JSON schemas for arguments so the model can produce structured calls rather than free text.

In exam terms, a “tool” could be an Azure Function, an API Management endpoint, a Logic App connector, a database query service, or an Azure AI Search query. You’ll be evaluated on choosing integrations that fit enterprise requirements: API Management for throttling and auth, Managed Identities for service-to-service calls, and private endpoints/network isolation when needed.

  • Schema design: Use strict types, required fields, enums for constrained choices, and clear descriptions. This reduces invalid calls and improves determinism.
  • Idempotency: For “create/update” tools, include idempotency keys or request IDs to avoid duplicates when the agent retries.
  • Validation: Validate tool inputs server-side; never rely on the model to self-police.

A frequent trap: exposing a single “doAnything(query)” tool. That maximizes hallucination and makes it hard to audit. Instead, prefer multiple narrow tools (e.g., search_kb, get_order_status, create_ticket) with constrained parameters and well-defined outputs. Another trap is returning unstructured text from tools; structured outputs (JSON with explicit fields) make downstream reasoning and monitoring far easier.
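A narrow tool definition with server-side validation might look like the following sketch. The schema shape mirrors the JSON-schema style used for function calling, but the tool name, its fields, and the hand-rolled validator are assumptions for illustration.

```python
# Sketch of a narrow tool definition plus server-side argument validation.
# Field names and the validator are illustrative, not a library API.

GET_ORDER_STATUS = {
    "name": "get_order_status",
    "description": "Look up the status of a single order by ID.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "channel": {"type": "string", "enum": ["web", "store", "phone"]},
        },
        "required": ["order_id"],
    },
}

def validate_args(schema, args):
    """Never rely on the model: re-check required fields and enums server-side."""
    props = schema["parameters"]["properties"]
    for field in schema["parameters"]["required"]:
        if field not in args:
            return False, f"missing required field: {field}"
    for field, value in args.items():
        spec = props.get(field)
        if spec is None:
            return False, f"unexpected field: {field}"
        if "enum" in spec and value not in spec["enum"]:
            return False, f"invalid value for {field}: {value}"
    return True, "ok"

ok, msg = validate_args(GET_ORDER_STATUS, {"order_id": "A-42", "channel": "fax"})
# -> (False, 'invalid value for channel: fax')
```

Enums and required fields do double duty: they make the model's calls more deterministic and give the server an objective basis for rejecting a bad call.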

Exam Tip: When asked how to improve reliability of tool use, pick answers that tighten schemas, add validation, and use APIM/Functions with proper authentication. “Improve prompt wording” alone is rarely sufficient.

Section 4.3: State and memory—conversation history, retrieval memory, and session storage

Agents need state: what the user asked, what tools were called, and what was learned. AI-102 scenarios typically distinguish between (1) conversation history, (2) retrieval-based memory, and (3) durable session storage. Conversation history is the rolling transcript used for context; it’s short-lived and must be trimmed or summarized to fit token limits. Retrieval memory stores facts and prior interactions as embeddings and retrieves them when relevant (a RAG-style pattern for memory). Session storage persists structured state like user preferences, workflow progress, and tool results.

Design decisions hinge on privacy and relevance. Not everything should be stored as “memory.” PII, secrets, and regulated data may need redaction, encryption, and strict retention policies. The exam may ask how to prevent cross-user leakage: the correct approach is per-user or per-tenant partitioning and access control at the storage layer, not just prompt instructions.

  • Conversation management: summarize older turns; preserve key constraints and decisions; keep citations or tool outputs that justify actions.
  • Retrieval memory: store only vetted/allowed snippets; attach metadata (userId, tenantId, timestamps, sensitivity labels).
  • Session state: store workflow step, approvals, and correlation IDs for observability.

Common trap: putting all memory into the prompt without governance. That increases token cost, raises leakage risk, and makes it hard to honor deletion requests. A better approach is a tiered design: short-term context in the prompt, long-term memory in a store you can query and control, and minimal necessary state persisted with retention policies.
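Per-tenant partitioning at the storage layer can be sketched as a store whose key always includes the tenant, so there is no code path that returns one tenant's state to another. Class and field names here are hypothetical.

```python
# Sketch of tenant-partitioned session storage: every read and write is
# keyed by (tenant_id, user_id). The dict stands in for a durable store.

class SessionStore:
    def __init__(self):
        self._data = {}

    def save(self, tenant_id, user_id, state):
        self._data[(tenant_id, user_id)] = dict(state)

    def load(self, tenant_id, user_id):
        # The partition key is part of every lookup: no cross-tenant path.
        return self._data.get((tenant_id, user_id), {})

store = SessionStore()
store.save("tenant-a", "user-1", {"workflow_step": "awaiting_approval"})
store.load("tenant-b", "user-1")   # -> {} (no cross-tenant leakage)
```

This is the storage-layer enforcement the exam expects, as opposed to a prompt instruction asking the model not to mix tenants.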

Exam Tip: If a scenario requires “auditability” or “replay,” choose designs that persist tool call logs, inputs/outputs, and correlation IDs in storage that supports retention and access control—not just chat transcripts.

Section 4.4: Orchestration—routing, multi-agent patterns, and workflow control

Orchestration is how you control multi-step behavior. AI-102 will test whether you can choose between simple routing (one agent picks a tool), workflow orchestration (a defined sequence with gates), and multi-agent patterns (specialists coordinated by a controller). The key exam skill is identifying when you need deterministic control versus flexible reasoning.

Routing patterns include “intent classification to tool” (e.g., FAQ vs. order status), “retrieval first” (try grounded answer before tools), and “policy first” (check authorization before acting). Workflow control adds explicit steps: verify identity, retrieve data, compute decision, request approval, execute update, confirm outcome. Multi-agent patterns can separate duties: a planner agent proposes steps, an executor agent performs tool calls, and a reviewer agent validates outputs against policy and evidence.

  • Loop controls: max tool calls, timeouts, and “no progress” detection to prevent infinite retries.
  • Error handling: retry with backoff for transient failures; fail fast on validation/auth failures; produce user-safe errors.
  • Grounding checkpoints: require citations from search/tool outputs before final answers.
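
The "no progress" control above can be approximated by checking for repeated identical tool calls. This sketch uses exact-match comparison for simplicity; real agents often need fuzzier similarity checks, since loops tend to vary arguments slightly.

```python
# Sketch of "no progress" detection: stop the loop when the agent keeps
# issuing the same tool call instead of retrying forever.

def detect_stall(call_history, window=3):
    """True when the last `window` tool calls are identical."""
    if len(call_history) < window:
        return False
    recent = call_history[-window:]
    return all(call == recent[0] for call in recent)

history = [("search_kb", "refund"), ("search_kb", "refund"), ("search_kb", "refund")]
detect_stall(history)   # -> True: the orchestrator should stop or escalate
```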

A common exam trap is assuming multi-agent automatically improves quality. It can also multiply cost and complexity. Choose multi-agent only when responsibilities genuinely differ (e.g., compliance review, domain specialization) or when you need robust verification. For straightforward tasks, a single orchestrated agent with deterministic steps is typically more correct.

Exam Tip: If the prompt says “must follow a business process” or “requires approvals,” pick workflow orchestration (Logic Apps/Durable Functions/Step Functions-like control) rather than a free-running agent loop.

Section 4.5: Safety and governance—permissions, grounding, audit logs, and policy boundaries

Safety and governance are not optional in agentic systems because tools can change real systems. AI-102 questions often center on least privilege, grounding, and traceability. Permissions begin with identity: use Managed Identities for Azure resources, OAuth scopes for APIs, and role-based access (Azure RBAC) to restrict what the agent can do. Avoid embedding secrets in prompts or code; use Key Vault and secure configuration.

Grounding is your defense against hallucinations and unsupported claims. For knowledge tasks, use Azure AI Search to retrieve authoritative content and require citations. For action tasks, validate preconditions with tools (e.g., check account status before issuing refunds) and confirm outputs (e.g., read-after-write verification). Policy boundaries include content safety (filtering unsafe outputs), data boundaries (tenant isolation), and action boundaries (approval gates, allowlists for tools, and explicit deny rules).

  • Audit logs: store tool invocations, arguments, results, model responses, and user identity/correlation IDs.
  • Observability: capture latency, failure rates, token usage, and escalation events; monitor for anomalous tool usage.
  • Governed prompts: centralize system prompts and policies; version them; review changes.
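
An audit record along these lines might capture the invocation, the caller, and a correlation ID, with sensitive argument fields redacted before the record is persisted. The field names and the sensitive-field list are assumptions for the sketch.

```python
import uuid
import datetime

# Sketch of a structured audit record for one tool invocation, with
# redaction applied before persistence. Field names are illustrative.

SENSITIVE_FIELDS = {"ssn", "card_number"}

def audit_record(user_id, tool, args, result_status, correlation_id=None):
    safe_args = {k: ("[REDACTED]" if k in SENSITIVE_FIELDS else v)
                 for k, v in args.items()}
    return {
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user_id": user_id,
        "tool": tool,
        "args": safe_args,
        "result_status": result_status,
    }

rec = audit_record("user-1", "issue_refund",
                   {"order_id": "A-42", "card_number": "4111..."}, "success")
# rec["args"]["card_number"] == "[REDACTED]"
```

Redacting at write time, rather than at read time, is what reconciles "log everything" with PII retention requirements.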

Traps to avoid: “the model will refuse harmful requests” as a primary safeguard (insufficient), or “log everything” without addressing PII retention. The correct design balances traceability with compliance: redact sensitive fields, apply retention limits, and restrict log access.

Exam Tip: When you see “regulated,” “PII,” “SOX,” “HIPAA,” or “customer data,” expect answers featuring least privilege + tenant isolation + audit logging + retention controls. Prompts alone do not satisfy compliance requirements.

Section 4.6: Exam-style case study—choose designs that meet reliability and compliance needs

Case study questions typically give you a business scenario and constraints, then ask which design best meets reliability and compliance. Practice translating narrative requirements into architecture choices. Example scenario signals: “users can request account changes,” “must produce an audit trail,” “only approved knowledge sources,” “support handoff to humans,” “operate across multiple tenants,” and “minimize hallucinations.”

A strong agentic design for such a scenario would look like: an orchestrator that routes intents (information vs. action), uses retrieval grounding via Azure AI Search for policy/FAQ answers, and exposes narrow action tools through an API layer (API Management + Functions) protected by Managed Identity and RBAC. Workflow control enforces identity verification, eligibility checks, and approval gates for high-risk actions. State is stored per session (workflow step, correlation ID) and long-term memory is limited to allowed, non-sensitive preferences with metadata and partitioning.

  • Reliability choices: strict tool schemas, server-side validation, idempotent action tools, loop limits, and post-action verification.
  • Compliance choices: tenant-scoped storage, redaction in logs, retention policies, and explicit audit records of tool calls and approvals.
  • Operational choices: monitoring dashboards for tool failures and anomalous usage; alerting on policy violations or repeated retries.
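
The idempotency choice above can be sketched as a create tool that replays its prior result when it sees a repeated request ID, so agent retries never create duplicates. The in-memory dict stands in for a durable store, and all names are illustrative.

```python
# Sketch of an idempotent "create ticket" action tool: a repeated call with
# the same request ID returns the original result instead of a duplicate.

_processed = {}   # stand-in for a durable idempotency store

def create_ticket(request_id, payload):
    if request_id in _processed:        # agent retried: replay, don't duplicate
        return _processed[request_id]
    ticket = {"ticket_id": f"TKT-{len(_processed) + 1}", "payload": payload}
    _processed[request_id] = ticket
    return ticket

first = create_ticket("req-123", {"issue": "login failure"})
again = create_ticket("req-123", {"issue": "login failure"})
# first == again; only one ticket exists
```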

How to identify the correct answer: pick the option that (1) minimizes the agent’s authority by default, (2) grounds responses in approved sources, (3) makes tool calls observable and auditable, and (4) uses platform-native security controls rather than “trust the model.” If an option proposes a single, broad tool with unrestricted database access or no audit trail, it is almost certainly wrong.

Exam Tip: In “choose the best design” items, prioritize governance and determinism for action-taking agents. Flexibility is valuable only after you’ve satisfied security, compliance, and traceability constraints.

Chapter milestones
  • Design agent architectures and tool ecosystems
  • Implement tool/function calling and multi-step planning
  • Add memory, state, and monitoring for agent reliability
  • Agentic exam-style questions: orchestration and guardrails
Chapter quiz

1. A company is building an Azure OpenAI-based support agent that must create and update tickets in an internal ITSM system. During testing, the model occasionally invents ticket IDs and claims updates were applied even when the API call failed. You need to ensure the assistant only reports actions that were actually executed and can be audited end-to-end. What should you implement?
  A. Add tool/function calling with a strict JSON schema, execute actions outside the model, and return tool results back to the model; log tool inputs/outputs with correlation IDs
  B. Increase the system prompt length with detailed instructions to never fabricate ticket IDs and to always be truthful
  C. Enable a larger model deployment and raise max tokens so the model has more context to reason about the ticket lifecycle

Show answer
Correct answer: Add tool/function calling with a strict JSON schema, execute actions outside the model, and return tool results back to the model; log tool inputs/outputs with correlation IDs
A is correct: certification scenarios expect you to separate "decision" (model) from "execution" (your orchestrator), use structured tool schemas, and only let the model summarize confirmed tool results. Logging tool calls and results with correlation IDs enables auditability and traceability. B is insufficient because prompt-only controls cannot guarantee the model won’t hallucinate actions or tool outcomes. C may improve generation quality but does not provide reliability guarantees, execution confirmation, or auditable action traces.

2. You are designing an agent that performs multi-step work: (1) look up customer entitlements, (2) query relevant knowledge base articles, (3) draft a response, and (4) if required, submit an approval request to a manager before sending. The requirement is that approvals must be enforced even if the model tries to skip step 4. What is the best design?
  A. Implement a deterministic orchestration workflow (state machine) where the approval gate is enforced by the orchestrator before calling the send-email tool
  B. Rely on the agent framework’s default planner to decide when approvals are needed
  C. Add a system message that says the agent must always request approval for premium customers before sending emails

Show answer
Correct answer: Implement a deterministic orchestration workflow (state machine) where the approval gate is enforced by the orchestrator before calling the send-email tool
A is correct: exam guidance emphasizes guardrails and governance implemented outside the model—policy enforcement and approval gates belong in the orchestrator/workflow so they cannot be bypassed. B is incorrect because planner behavior is probabilistic and can skip steps; it also lacks a hard control boundary. C is incorrect because prompt instructions are not an enforcement mechanism; the model can still attempt to call the send tool or claim approval occurred without a verified state transition.

3. A healthcare organization is deploying a patient-facing agent that can retrieve appointment details and lab results. The agent must never leak one patient’s data to another, and it must be able to resume a conversation after a disconnect. Which approach best satisfies isolation and continuity requirements?
  A. Store conversation state and memory in a per-user scoped data store keyed by a verified patient identity, and enforce authorization in the tool layer for every data retrieval
  B. Store all conversation transcripts in a single shared vector index to maximize retrieval quality across patients
  C. Put the patient’s full record in the system prompt at the start of each session so the model does not need to call tools

Show answer
Correct answer: Store conversation state and memory in a per-user scoped data store keyed by a verified patient identity, and enforce authorization in the tool layer for every data retrieval
A is correct: the exam expects tenant/user isolation via identity-scoped state and authorization checks at data access boundaries (tools/APIs), plus durable state to resume sessions. B is wrong because a shared index across patients introduces cross-user retrieval risk and data leakage unless heavily partitioned and access-controlled; as stated, it violates isolation. C is wrong because embedding full records in prompts increases exposure risk, complicates auditing/minimization, and still doesn’t enforce authorization on access—prompts are not a secure data boundary.

4. An agent uses tool calling to search a document repository and then calls an internal 'UpdatePolicy' API. A red-team test shows prompt injection in a document can cause the model to call UpdatePolicy with attacker-controlled parameters. You need a mitigation aligned with agentic guardrails. What should you do?
  A. Implement allowlisted tools with parameter validation and policy checks in the orchestrator/tool layer; treat retrieved content as untrusted and require grounded citations for decisions
  B. Increase retrieval top-k and add more documents so malicious instructions are diluted by benign content
  C. Disable retrieval and rely solely on the model’s internal knowledge to avoid injection

Show answer
Correct answer: Implement allowlisted tools with parameter validation and policy checks in the orchestrator/tool layer; treat retrieved content as untrusted and require grounded citations for decisions
A is correct: exam scenarios about prompt injection and runaway actions are mitigated by hard permission boundaries—allowlisting tools, validating inputs, and enforcing policies outside the model. Retrieved text must be treated as untrusted, and grounding/citations help ensure actions are based on approved sources rather than instructions in content. B is incorrect because dilution is not a security control; malicious instructions can still be followed. C is incorrect because removing retrieval increases hallucination risk and does not address the core need for controlled tool execution and authorization.

5. A team reports their agent sometimes enters a loop: it repeatedly calls the same search tool with minor query variations and never completes the task. You must improve reliability and make failures observable for production operations. What should you implement? A. Add step limits/timeouts, loop detection based on repeated tool-call patterns, and end-to-end monitoring with structured traces of planning steps and tool calls B. Add more examples (few-shot) in the prompt demonstrating how to finish tasks quickly C. Switch to a model with a larger context window so it can remember previous searches without external state

Show answer
Correct answer: Add step limits/timeouts, loop detection based on repeated tool-call patterns, and end-to-end monitoring with structured traces of planning steps and tool calls
A is correct: agent reliability on the exam maps to orchestration controls (max steps, timeouts), stateful safeguards (detect repeated patterns), and observability (logs/traces/metrics for tool calls and decisions). B can reduce loops but does not guarantee termination or provide operational visibility. C may help with short-term context but does not enforce termination, does not provide monitoring/audit trails, and does not prevent repeated tool calls under uncertainty.
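The step-limit and loop-detection controls from answer A can be implemented as a small guard the orchestrator consults before each tool call. This is a minimal sketch; the class name, thresholds, and windowing strategy are illustrative choices, not a prescribed Azure pattern.

```python
from collections import deque
from typing import Optional

class LoopGuard:
    """Stop an agent that repeats the same tool-call pattern or exceeds a
    step budget. Thresholds are illustrative; tune them per workload."""
    def __init__(self, max_steps: int = 20, window: int = 4, max_repeats: int = 3):
        self.max_steps = max_steps
        self.max_repeats = max_repeats
        self.recent = deque(maxlen=window)  # sliding window of recent calls
        self.steps = 0

    def check(self, tool_name: str, params: dict) -> Optional[str]:
        """Return a termination reason, or None if the step may proceed."""
        self.steps += 1
        if self.steps > self.max_steps:
            return "step limit exceeded"
        signature = (tool_name, tuple(sorted(params.items())))
        self.recent.append(signature)
        if list(self.recent).count(signature) >= self.max_repeats:
            return "repeated tool-call pattern detected"
        return None
```

In production the termination reason would also be emitted as a structured trace event, giving operations the observability the question asks for.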

Chapter 5: Implement Computer Vision, NLP, and Knowledge Mining

This chapter maps directly to the AI-102 skills that show up repeatedly in case studies: choosing the right vision and language service for a workload, designing an extraction pipeline, and wiring outputs into Azure AI Search for retrieval and analytics. The exam is less interested in “can you call an API” and more interested in whether you can select a service, design the processing steps, and secure/operate the solution (identity, private networking, monitoring, cost, and latency). You’ll see this as decision-point questions: image captions vs OCR vs custom detection; document forms vs free-form text; classic NLP vs generative summaries; keyword vs vector vs hybrid search.

Across the lessons in this chapter, keep a mental model of an end-to-end pipeline: ingest (blob/storage/stream) → analyze (vision/OCR/document intelligence) → enrich (NLP + skills) → index (Azure AI Search with filters, facets, vectors) → serve (apps, bots, RAG). Most incorrect answers on AI-102 are “nearly right” but miss one critical requirement: handwriting, layout preservation, multilingual needs, PII handling, or low-latency constraints.

Exam Tip: When a question says “extract key-value pairs from invoices” or “tables from PDFs,” that’s a Document Intelligence pattern—not generic OCR—because the test expects you to recognize structured extraction and schema mapping requirements.

Practice note for Computer vision implementations: image analysis, OCR, and custom models: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for NLP implementations: classification, extraction, translation, and summarization: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Knowledge mining pipelines with Azure AI Search and enrichment: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Mixed-domain exam practice: CV + NLP + search scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 5.1: Computer vision solutions—image analysis, detection, and OCR decision points

AI-102 frequently tests whether you can distinguish between “general image understanding” and “text-in-image extraction.” For general image scenarios (tagging, captions, dense captions, object detection, smart crops), use Azure AI Vision. For reading text, use OCR/Read capabilities (often surfaced as part of Vision) and validate whether you need printed text, handwriting, or layout fidelity. The exam often embeds constraints like “photos taken at angles,” “low-light,” or “multiple languages,” which push you toward robust OCR and pre-processing choices.

A practical decision flow: (1) Is the primary output semantics about the scene (labels, objects, captions)? Choose Vision image analysis. (2) Is the primary output text? Choose Read/OCR. (3) Do you need domain-specific classes (e.g., detect a particular part, defect type, brand logo beyond built-in)? Choose custom training: Azure AI Custom Vision (classification/detection) or a bespoke model approach, then integrate results back to the app or search index.

  • Image analysis: best for tags/captions, object presence, brand detection (if supported), basic moderation signals; often used to enrich search metadata.
  • OCR/Read: best for extracting lines/words with bounding boxes; pick it when downstream steps rely on exact text (search, entity extraction, compliance).
  • Custom Vision: best when you must detect custom categories with training images; exam expects you to mention dataset labeling, iteration, precision/recall tradeoffs, and retraining.
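As a study aid, the three-step decision flow above can be written down as a tiny function. This is deliberately a toy encoding for memorization, not an API; the input labels and return strings are invented for illustration.

```python
def choose_vision_service(primary_output: str, needs_custom_classes: bool = False) -> str:
    """Toy encoding of the Section 5.1 decision flow (study aid, not an API)."""
    if needs_custom_classes:
        # Domain-specific classes require training on labeled images.
        return "Azure AI Custom Vision (train with labeled images)"
    if primary_output == "text":
        return "OCR / Read"
    if primary_output == "scene_semantics":
        return "Azure AI Vision image analysis"
    raise ValueError("primary_output must be 'text' or 'scene_semantics'")
```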

Common trap: Selecting Custom Vision for “extract text from receipts.” Custom Vision does not replace OCR; it classifies/detects visual objects. Receipts typically need OCR (and often Document Intelligence for structured fields).

Exam Tip: If the prompt mentions “bounding boxes for each word/line” or “overlay recognized text on the image,” that’s OCR/Read. If it mentions “count items on a shelf” or “detect defects,” that’s object detection (built-in or custom) with precision/recall tuning.

Section 5.2: Document processing—Document Intelligence patterns for forms and unstructured docs

Document Intelligence is the exam’s go-to for forms, invoices, receipts, IDs, and any PDF where layout matters (tables, key-value pairs, selection marks, fields with positions). The service is about document understanding, not just text recognition. On AI-102, expect design questions: which model type, how to validate confidence, and how to route low-confidence results to human review.

Pattern 1: Prebuilt models (invoice, receipt, ID, business card) when your documents match common formats. Pattern 2: Custom extraction when you need your own schema (e.g., “PolicyNumber,” “CoverageLimit,” “EffectiveDate”) across varying templates. Pattern 3: Layout-first extraction when you need paragraphs, tables, and reading order to preserve structure for downstream summarization or indexing.

  • Use confidence scores to branch logic (auto-approve vs manual queue).
  • Normalize output: dates/currencies/units before indexing.
  • Store the original document URI plus extracted JSON for traceability.
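The confidence-branching bullet above is worth internalizing as code. The sketch below assumes extraction output has already been flattened to `field -> (value, confidence)` pairs (the real Document Intelligence response is richer); the 0.85 threshold is an illustrative number, not a service default.

```python
def route_extraction(fields: dict, threshold: float = 0.85):
    """Split extracted fields into auto-approved values and a manual-review queue.
    `fields` maps field name -> (value, confidence); threshold is illustrative."""
    approved, review = {}, {}
    for name, (value, confidence) in fields.items():
        (approved if confidence >= threshold else review)[name] = value
    return approved, review
```

A document with any field in the review bucket would be routed to the human-review queue rather than auto-approved.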

Common trap: Treating PDFs as plain text. If a requirement mentions “table cells,” “line items,” “checkboxes,” or “signature blocks,” the exam expects Document Intelligence rather than generic OCR + regex. Another trap is ignoring multilingual forms—ensure the model and post-processing support the target languages and locale formats.

Exam Tip: Watch for wording like “key-value pairs” and “table extraction.” These are near-synonyms for Document Intelligence on AI-102. If the question also needs search, the correct architecture often becomes: Document Intelligence → enrichment (optional) → Azure AI Search index.

Section 5.3: NLP solutions—text analytics tasks, evaluation, and deployment considerations

NLP tasks on AI-102 commonly include language detection, sentiment/opinion mining, key phrase extraction, named entity recognition (NER), PII detection, summarization, and translation. The exam tests your ability to select the right feature and then operationalize it: batching, throughput, latency, authentication, and monitoring for drift.

When a scenario says “classify support tickets” or “route emails,” you should think of text classification (custom categories), plus entity extraction to capture product names, account IDs, or locations. For compliance, prioritize PII detection and redaction before storage/indexing. For cross-lingual intake, language detection + translation is a standard pattern: detect → translate to a pivot language → run analytics consistently.

  • Evaluation: know precision/recall tradeoffs for extraction and classification; avoid overfitting by testing on holdout sets.
  • Deployment: choose between synchronous calls for interactive UX and async/batch for large corpora (e.g., thousands of documents).
  • Security: managed identity where supported; store secrets in Key Vault; consider private endpoints for data exfiltration control.

Common trap: Using sentiment analysis to “summarize a document.” Sentiment answers “how positive/negative,” not “what happened.” For summaries, use summarization capabilities or generative patterns, and confirm whether extractive vs abstractive is required. Another trap is ignoring input size limits; long documents often require chunking with overlap to preserve context for entity extraction and summarization.
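Chunking with overlap, mentioned in the trap above, is a simple sliding window. This sketch uses character counts for clarity; production pipelines usually chunk by tokens, and the sizes below are illustrative.

```python
def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50):
    """Split long text into overlapping chunks so entities or context that
    span a boundary appear intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Each chunk repeats the last `overlap` characters of its predecessor, which is what preserves context for entity extraction and summarization across boundaries.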

Exam Tip: If the question requires “mask PII before indexing,” the correct order is: OCR/Document Intelligence → PII detection/redaction → store/index. Indexing first and “filtering later” is usually the wrong answer because it violates the requirement.
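The ordering requirement in the tip above can be enforced structurally: route every document through redaction before it can reach the index. The regex patterns below are a toy stand-in for Azure AI Language PII detection (real solutions use the service's recognized entity spans), but the pipeline shape is the point.

```python
import re

def redact_pii(text: str) -> str:
    """Toy redaction standing in for Azure AI Language PII detection."""
    text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", text)      # US SSN-shaped numbers
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)  # email addresses
    return text

def ingest(document_text: str, index: list) -> None:
    """Enforce the required order: nothing reaches the index unredacted."""
    index.append(redact_pii(document_text))
```

Because `ingest` is the only write path, "index first, filter later" is impossible by construction.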

Section 5.4: Conversational language—bot patterns, grounding, and conversation safety basics

Conversational scenarios on AI-102 blend NLP with orchestration. You’ll be tested on when to use a bot framework pattern, when to ground responses with enterprise data (RAG), and how to apply basic safety controls. Typical use cases: an employee assistant that answers policy questions, a customer support bot that escalates to a human, or a copilot that searches knowledge bases and cites sources.

Grounding is the core design point: the bot should retrieve relevant content (often via Azure AI Search) and use it as context for generation or response composition. If a question mentions “answers must be based only on internal documents” or “include citations,” you should think: retrieval step + constrained generation, not open-ended chat completion. If the scenario needs multi-step actions (create ticket, check order, update CRM), it becomes an agentic pattern with tool/function calling and clear authorization boundaries.

  • Conversation safety basics: content filtering, prompt-injection awareness, and data leakage prevention (don’t echo secrets; don’t allow arbitrary tool calls).
  • Operational: log prompts/responses responsibly (redact PII), measure helpfulness and grounding failures, implement fallback to “I don’t know” when retrieval is empty.
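The "fallback to I don't know" safeguard above can be made explicit in the response-composition step. This is a minimal sketch: `retrieved` is assumed to hold `(source_id, passage)` tuples, and the actual model call is omitted (represented only by a comment), so the shape of the guardrail is visible.

```python
def compose_grounded_answer(question: str, retrieved: list) -> str:
    """Refuse rather than answer from parametric memory when retrieval is empty."""
    if not retrieved:
        return "I don't know - no supporting documents were found."
    citations = ", ".join(source_id for source_id, _ in retrieved)
    context = "\n".join(passage for _, passage in retrieved)
    # In a real system, `context` would be passed to the model with instructions
    # to answer only from it and to cite the listed sources.
    return f"(answer grounded in: {citations})\ncontext used:\n{context}"
```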

Common trap: Assuming “chat” equals “knowledge.” The exam expects you to add a retrieval layer for enterprise QA. Another trap is skipping user authentication/authorization in tool actions—if the bot can call APIs, the question will often imply role-based access must apply.

Exam Tip: When you see “must not fabricate” or “must cite sources,” select designs that include retrieval + citations and guardrails (system instructions, grounding, and refusal behavior when evidence is missing).

Section 5.5: Knowledge mining—Azure AI Search indexing, skillsets, enrichments, and vector search

Knowledge mining is where CV and NLP outputs become searchable, filterable, and retrievable at scale. Azure AI Search concepts show up heavily on AI-102: index schema design, indexers/data sources, skillsets for enrichment, and query patterns (keyword, semantic, vector, hybrid). The exam looks for whether you can design an index that supports the app’s queries: filters, facets, sorting, security trimming, and relevance.

A typical pipeline: documents land in Blob Storage → indexer runs → skillset enriches (OCR, entity extraction, key phrases, language detection) → enriched fields are projected into the search index. For images/PDFs, you’ll often include OCR first, then NLP skills over the extracted text. For vector search scenarios, you add embeddings fields (and store chunked passages) to enable similarity queries; for best results, combine vector similarity with keyword filters (hybrid) and optionally semantic ranking.

  • Index design: separate raw text, chunks, and metadata fields; mark fields as searchable/filterable/facetable appropriately.
  • Enrichment: skillsets can call built-in skills or custom skills; plan for throughput and error handling.
  • Security: enforce access control via app-layer filters or per-document ACL metadata (security trimming pattern).
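Hybrid queries merge the keyword and vector result lists with Reciprocal Rank Fusion; the sketch below is an illustrative reimplementation of that merge (Azure AI Search performs it server-side, and `k=60` is the commonly cited RRF constant, not something you configure per query here).

```python
def reciprocal_rank_fusion(keyword_ranked, vector_ranked, k: int = 60):
    """Merge two ranked lists of document IDs: each appearance contributes
    1/(k + rank), so documents ranked well by both lists rise to the top."""
    scores = {}
    for ranking in (keyword_ranked, vector_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Note how a document that is merely good in both lists ("b" below) beats a document that tops only one list, which is why hybrid often outperforms either mode alone.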

Common trap: Putting everything into one giant “content” field. The exam expects structured fields for filtering/faceting (e.g., documentType, customerId, createdDate) and separate enriched fields (entities, keyPhrases). Another trap: forgetting chunking for vector search; embeddings on extremely long documents are ineffective and may exceed limits.

Exam Tip: If the scenario needs “search by meaning” and “exact filters by date/customer,” the best answer is usually hybrid search: vector similarity over chunks + structured filters + semantic ranking where applicable.

Section 5.6: End-to-end information extraction—design scenarios across CV, NLP, and search

This is where AI-102 case studies live: you must assemble multiple services into a coherent architecture with the right ordering and governance. A reliable blueprint for mixed-domain scenarios is: ingest → extract → normalize → enrich → index → retrieve/serve. The exam will vary the inputs (scanned PDFs, mobile photos, multilingual emails) and the outputs (dashboards, compliance flags, support routing, RAG assistants).

Example design decisions you should be ready to justify: If the input is scanned contracts, start with Document Intelligence (layout/text) rather than plain OCR; then run PII detection and entity extraction; then index into Azure AI Search with fields for parties, dates, obligations, and a vectorized chunk field for semantic retrieval. If the input includes images (e.g., damage photos), add Vision analysis for tags/objects and store those tags as searchable metadata alongside the claim text. For multilingual scenarios, detect language early and either translate or maintain per-language fields in the index.

  • Resilience: implement retries and dead-letter patterns for failed enrichments; don’t block the whole pipeline on one bad document.
  • Cost/latency: batch where possible; cache embeddings for unchanged content; avoid re-indexing entire corpora for small updates.
  • Governance: audit logs, redaction, private networking, and least-privilege identities for each component.
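The retry-plus-dead-letter bullet above reduces to a per-document error boundary. This is a minimal in-memory sketch (names are illustrative; a production version would persist the dead-letter queue, e.g. to a storage container, and back off between attempts).

```python
def enrich_with_dead_letter(documents, enrich, max_attempts: int = 3):
    """Process each document independently: retry transient failures, then
    dead-letter the document instead of blocking the whole pipeline.
    `enrich` is any callable that may raise."""
    succeeded, dead_letter = [], []
    for doc in documents:
        for attempt in range(1, max_attempts + 1):
            try:
                succeeded.append(enrich(doc))
                break
            except Exception as exc:
                if attempt == max_attempts:
                    dead_letter.append({"doc": doc, "error": str(exc)})
                # A production version would back off (e.g. sleep) between attempts.
    return succeeded, dead_letter
```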

Common trap: Indexing raw OCR text without normalization or metadata, then trying to “fix relevance” later. AI-102 expects you to engineer the index for the query experience upfront. Another trap is ignoring human-in-the-loop for low-confidence extraction on critical business processes (claims payments, compliance).

Exam Tip: In end-to-end questions, identify the “hard requirement” first (structured fields, citations, PII, handwritten text, latency). Then choose services that satisfy it and place them in the correct order. Many wrong options are valid services in the wrong sequence.

Chapter milestones
  • Computer vision implementations: image analysis, OCR, and custom models
  • NLP implementations: classification, extraction, translation, and summarization
  • Knowledge mining pipelines with Azure AI Search and enrichment
  • Mixed-domain exam practice: CV + NLP + search scenarios
Chapter quiz

1. A company receives thousands of PDF invoices from different vendors. Each invoice has varying layouts, and the company must extract key-value pairs (InvoiceNumber, TotalDue) and line-item tables. The solution must map the extracted fields into a consistent schema for indexing. Which Azure service should you use for extraction?

Show answer
Correct answer: Azure AI Document Intelligence (formerly Form Recognizer)
Azure AI Document Intelligence is designed for structured document extraction, including key-value pairs and tables, and supports mapping to a schema across varying layouts. Vision OCR can read text but does not reliably preserve structure or extract tables/fields as first-class outputs, so you would be left building your own parsing logic. Azure AI Language NER operates on plain text after extraction and cannot recover layout, tables, or key-value relationships from PDFs.

2. You are building an app that identifies whether a photo contains a company logo and returns the logo’s bounding box. The logo is proprietary, and you have labeled training images. Which approach best meets the requirement?

Show answer
Correct answer: Train an Azure AI Vision custom model for object detection
Custom object detection in Azure AI Vision is intended for detecting specific, custom classes and returning bounding boxes. Image captions are generic and not reliable for proprietary logos, and they typically don’t provide precise bounding boxes for your custom classes. Language classification on filenames/alt text does not analyze image pixels and will fail when metadata is missing or inaccurate.

3. A global support center stores chat transcripts in multiple languages. You need to: (1) detect the language, (2) translate content to English, and (3) produce a short extractive summary for an agent dashboard. Which set of services is the most appropriate?

Show answer
Correct answer: Azure AI Language (language detection + summarization) and Azure AI Translator
Azure AI Language provides language detection and text summarization capabilities, while Azure AI Translator handles translation. OCR and semantic ranking do not translate or summarize chat transcripts; OCR is for images/documents, and ranking improves search relevance rather than producing summaries. Document Intelligence and Image Analysis are designed for documents/images and do not address multilingual chat translation and summarization requirements.

4. You are designing a knowledge mining pipeline to index scanned contracts stored in Azure Blob Storage. Requirements: extract text from images, detect PII (names, addresses), and enable search with filters by contract type. Which pipeline best matches Azure AI Search enrichment patterns?

Show answer
Correct answer: Blob Storage data source → Azure AI Search skillset (OCR + PII detection) → Azure AI Search index with filterable fields
Azure AI Search supports cognitive skillsets for enrichment, including OCR for scanned images and PII detection (via language skills), and you can project enriched fields into an index with filterable fields such as contract type. Image captions are not a replacement for OCR text extraction and do not meet PII detection or structured filtering requirements. Uploading raw PDFs without enrichment limits searchability (especially for scanned images) and misses the built-in skillset pattern the exam expects for knowledge mining.

5. A retail company wants a search experience over product manuals that supports keyword filtering (brand, model), and also allows users to ask questions using natural language with improved relevance for conceptually similar results. The solution should minimize rework and use Azure-native capabilities. Which search configuration best fits the requirement?

Show answer
Correct answer: Azure AI Search with hybrid search (keyword + vector) and filterable facets for brand/model
Hybrid search in Azure AI Search combines lexical matching (good for exact terms, model numbers, and filters/facets) with vector similarity (good for natural-language, semantic similarity). Keyword-only search misses concept-based matching for Q&A-style queries. Vector-only search typically cannot replace structured filtering needs (brand/model) and can reduce precision for exact identifiers, so it does not meet the combined requirement.

Chapter 6: Full Mock Exam and Final Review

This chapter is your conversion layer from “I studied the services” to “I can pass AI-102 under exam pressure.” The exam rewards decisions: choosing the right Azure AI service for a requirement, applying security/governance constraints, and diagnosing deployment/quality issues quickly. You will complete two mock blocks (Part 1 and Part 2), then run a disciplined weak-spot analysis, and finish with an exam-day checklist that prevents avoidable losses.

As you work, keep the course outcomes in view: planning and managing Azure AI solutions (security, monitoring, optimization), implementing generative AI solutions (Azure OpenAI, prompt/RAG/safety), implementing agentic workflows (tools/function calling/governance), computer vision (analysis/OCR/video/custom model lifecycle), NLP (classification/extraction/summarization/translation/conversation), and knowledge mining (Azure AI Search, enrichment, Document Intelligence). The mock experience is designed to force you to integrate these outcomes the way the real exam does.

Exam Tip: Your goal is not to “get everything right on the first pass.” Your goal is to build a repeatable method: identify the objective being tested, eliminate distractors, and confirm the best-fit service/feature with governance and operational constraints.

Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Weak Spot Analysis: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Exam Day Checklist: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 6.1: Mock exam instructions—timing, marking, and review strategy

Treat the mock exam like the real AI-102: a time-boxed decision exercise. Use a two-pass approach. Pass 1 is for high-confidence items: answer and move. Pass 2 is for “almost there” items: you’ll revisit with calmer context and cross-question clues. If you get stuck, mark and advance—time pressure is one of the exam’s hidden objectives.

Allocate time intentionally: reserve a final review buffer for marked questions and sanity checks. Do not spend more than a couple of minutes on any single item in the first pass. The exam often includes distractors that are “technically true” but not the best solution for the stated constraints (latency, cost, data residency, managed identity, private endpoints, content safety, or model lifecycle). Your timing strategy protects you from over-investing in one ambiguous prompt.

Exam Tip: When you mark an item, add a one-line note about what you’re missing (e.g., “unsure between AI Vision vs Custom Vision,” “RAG vs fine-tune,” “key vault vs managed identity path”). That note becomes your targeted learning queue in Section 6.4.

  • Pass 1: answer only if you can justify the service and one key configuration detail.
  • Mark questions where the constraint is unclear; the wording usually reveals it on a second read.
  • During review, verify: identity/permissions, network isolation, logging/monitoring, and safety controls.

Finally, simulate exam conditions: no documentation, no “quick lookups,” and no multitasking. You are training recall and reasoning under constraints—exactly what AI-102 measures.

Section 6.2: Mock Exam Part 1—domain-mixed question set

Part 1 mixes domains the way AI-102 does: you may pivot from OCR design to RAG indexing to deployment security in consecutive items. The skill being tested is service selection plus “one level deeper” implementation knowledge. For example, it’s not enough to know Azure AI Search exists—you must know when to use an indexer + skillset enrichment pipeline versus pushing embeddings directly, and how filters/security trimming affect retrieval quality.

Expect frequent objective pairings: (1) vision/OCR + knowledge mining, (2) NLP + conversation patterns, (3) generative AI + safety + monitoring, (4) planning + identity/networking. Common distractors include picking a model-centric solution when the requirement is actually data pipeline governance, or selecting an overly complex service (e.g., a custom model lifecycle) when a prebuilt feature meets the need.

Exam Tip: Translate each prompt into a checklist: input modality (image/text/video), required output (entities, summaries, captions), constraints (PII, private networking, latency), and lifecycle (one-off vs continuous retraining). The “lifecycle” clue often determines whether you choose Custom Vision / Azure AI Language custom projects / standard prebuilt features.

  • Computer Vision: know when OCR belongs to Azure AI Vision versus when Document Intelligence is the better fit (structured forms, key-value pairs, tables, layout understanding).
  • NLP: differentiate classification/extraction/summarization/translation needs and the operational implications (batch vs real-time, confidence thresholds, human-in-the-loop review).
  • Generative AI: recognize when RAG is preferred over fine-tuning (fresh data, traceability, citations, lower risk) and where Content Safety fits in the request flow.
  • Operations: confirm that managed identity, Key Vault, private endpoints, and logging (Azure Monitor/App Insights) are accounted for when a prompt mentions compliance or isolation.

As you complete Part 1, practice eliminating answers that ignore a hard constraint. Many wrong answers are “good solutions” in general but fail a specific requirement like data egress limits, need for deterministic extraction, or the requirement to explain provenance.

Section 6.3: Mock Exam Part 2—case study + implementation scenarios

Part 2 shifts from isolated questions to case-study thinking: you’ll be asked to design end-to-end architectures and then choose implementation details. AI-102 case study prompts typically include constraints such as multi-tenant security, private networking, throughput spikes, human review workflows, and auditability. Your job is to map requirements to an Azure-native pattern that is secure and maintainable.

For a knowledge-mining scenario, you should be ready to describe the pipeline: ingest documents (Blob Storage), extract text/structure (Document Intelligence or OCR), enrich (skillset with entity extraction, language detection, custom skills where needed), index (Azure AI Search), and then serve retrieval to a generative model (Azure OpenAI) with grounding. The exam often tests whether you understand where embeddings live (in the search index) and how hybrid search + semantic ranking can improve recall/precision without overcomplicating the design.

Exam Tip: In case studies, pick the “boring but correct” architecture: managed services, least privilege, predictable scaling. Over-engineered answers (microservices everywhere, custom model training without need, bespoke vector DB when AI Search suffices) are common traps.

For agentic solutions, expect implementation scenarios involving tool/function calling, orchestrating multiple steps (retrieve → reason → act), and governance. The exam may hint at guardrails: constrain tool scopes, validate parameters, log tool calls, and enforce policy (content filtering, allowed domains, rate limits). If the scenario mentions “approval” or “audit,” incorporate human-in-the-loop checkpoints and durable traces (storage/logging) rather than ephemeral chat history only.

  • Security and compliance: use managed identity, avoid embedding secrets in code, and consider private endpoints where stated.
  • Reliability: plan for retries, idempotency, and monitoring—especially for multi-step agent workflows.
  • Quality: apply evaluation practices for RAG (grounding, citations, chunking strategy, and relevance thresholds).
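The reliability bullet can be made concrete with a small sketch of an idempotent, logged tool call: a request key prevents re-execution on replay, transient failures are retried a bounded number of times, and every attempt is logged. The tool, its failure mode, and the in-memory “durable” store are all hypothetical stand-ins.

```python
# Sketch: idempotency key + bounded retries + logging for an agent tool call.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

_completed = {}  # stand-in for a durable idempotency store

def call_tool(request_id, tool, arg, max_retries=3):
    """Run a tool at most once per request_id, retrying transient errors."""
    if request_id in _completed:          # idempotency: replay cached result
        return _completed[request_id]
    for attempt in range(1, max_retries + 1):
        try:
            result = tool(arg)
            _completed[request_id] = result
            log.info("tool ok request=%s attempt=%d", request_id, attempt)
            return result
        except TimeoutError:
            log.warning("transient failure request=%s attempt=%d", request_id, attempt)
    raise RuntimeError(f"tool failed after {max_retries} attempts")

calls = {"n": 0}
def flaky_lookup(arg):
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError()
    return f"result:{arg}"

first = call_tool("req-1", flaky_lookup, "invoice-42")
again = call_tool("req-1", flaky_lookup, "invoice-42")  # no re-execution
```

In a multi-step agent workflow, the same pattern keeps a retried or replayed step from performing a high-impact action twice.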

Finish Part 2 by validating that your design addresses every explicit constraint. Case studies penalize “almost complete” architectures that miss one requirement like tenant isolation or data retention.

Section 6.4: Answer review plan—how to analyze mistakes and fix knowledge gaps

Your score improves fastest when you classify mistakes correctly. After each mock part, review in three buckets: (A) knowledge gap (you didn’t know the feature), (B) misread constraint (you missed a keyword like “private,” “real-time,” “structured forms,” “multilingual,” “cost”), or (C) execution error (rushed elimination, second-guessing, or inconsistent method). Each bucket has a different fix.

For (A), create a micro-syllabus: one page per service area with “what it does,” “when not to use it,” and “one configuration detail the exam loves.” For (B), rewrite the prompt in your own words and underline constraints; then explain why the wrong options fail those constraints. For (C), enforce process: two-pass method, constraint checklist, and a final verification step (identity/networking/monitoring/safety).

Exam Tip: Don’t just note “I got it wrong.” Write: “I chose X because of Y; the correct answer is Z because it satisfies constraint Q.” That causal chain is what you need under time pressure.

  • Service confusion fix: build comparison tables (AI Vision OCR vs Document Intelligence; RAG vs fine-tuning; AI Search vector/hybrid vs pure keyword).
  • Governance fix: practice stating the minimum viable controls—managed identity, Key Vault, logging, Content Safety, and data access boundaries.
  • Lifecycle fix: confirm you know when custom model training is justified and what deployment/monitoring implies (drift, retraining cadence, evaluation).
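The comparison-table fix above can even be drilled as a tiny lookup: map a requirement keyword to the usual best-fit service. The mappings below restate this chapter's rules of thumb and are illustrative, not an exhaustive or official reference.

```python
# A decision matrix as a dict: requirement phrase -> usual best-fit choice.
DECISION_MATRIX = {
    "tables from scanned forms":  "Azure AI Document Intelligence",
    "text in photos/signage":     "Azure AI Vision OCR",
    "fresh, citable knowledge":   "RAG with Azure AI Search",
    "stable style/format tuning": "Fine-tuning",
    "pii detection":              "Azure AI Language",
}

def best_fit(requirement):
    return DECISION_MATRIX.get(requirement, "re-read the constraints")

print(best_fit("tables from scanned forms"))  # Azure AI Document Intelligence
```

Quizzing yourself from requirement to service (rather than the reverse) mirrors how the exam phrases its scenarios.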

End the review by selecting the top 5 weak objectives and doing a focused 60–90 minute remediation session, then reattempt a small set of similar items to confirm improvement.

Section 6.5: Final domain recap—must-know objectives and common traps

This final recap is aligned to what AI-102 repeatedly measures: correct service choice, correct architecture pattern, and secure/operationally sound implementation. Start with planning and management: identify when you need private endpoints, managed identity, Key Vault, role assignments, and how to monitor with Azure Monitor and Application Insights. Many candidates lose points by proposing correct AI features but omitting deployment controls and observability.

Generative AI must-knows: prompt design basics (system vs user instructions, grounding directives), RAG pattern components (chunking, embeddings, retrieval, citations), and safety controls (Azure AI Content Safety, filtering policies, and logging). A key trap is assuming fine-tuning is the default for domain knowledge; the exam usually prefers RAG for up-to-date, auditable, and lower-risk knowledge injection.

Exam Tip: If the question mentions “latest policies,” “frequently changing content,” “citations,” or “traceability,” default toward RAG + Azure AI Search rather than fine-tuning.

Agentic solutions: know how tool/function calling fits into orchestrations, and the governance angle—restrict tool access, validate inputs/outputs, log actions, and include approval gates for high-impact actions. Traps include giving the agent unrestricted tool access or skipping audit trails in regulated scenarios.

  • Computer vision: choose between prebuilt analysis, OCR, and custom model lifecycle; know when video analysis or document layout is implied.
  • NLP: match requirement to capability (classification vs entity extraction vs summarization vs translation) and note multilingual and latency constraints.
  • Knowledge mining: understand indexing pipelines, enrichment skillsets, and security trimming; avoid solutions that break access control in retrieval.
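Security trimming in retrieval, mentioned in the last bullet, comes down to filtering retrieved chunks by the caller's group memberships before anything reaches the model. In Azure AI Search this is typically done with a filter on a groups field in the index; the sketch below is a stdlib-only stand-in with invented field names and groups.

```python
# Stand-in for security trimming: drop chunks the caller may not see.
INDEX = [
    {"text": "Public benefits overview",      "allowed_groups": {"all"}},
    {"text": "Executive compensation detail", "allowed_groups": {"hr", "exec"}},
]

def retrieve_trimmed(index, user_groups):
    """Return only chunks whose allowed groups intersect the user's."""
    return [
        e["text"] for e in index
        if e["allowed_groups"] & (user_groups | {"all"})
    ]

visible = retrieve_trimmed(INDEX, {"engineering"})
# An engineering user sees only the public chunk.
```

The trap the bullet warns about is skipping this step: a RAG answer can leak a restricted document even when the chat front end itself enforces access control.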

Always validate the “nonfunctional” requirements: latency, throughput, cost, security, and maintainability. The correct answer is often the one that best balances functional correctness with these operational constraints.

Section 6.6: Exam day readiness—check-in steps, time strategy, and last-hour review

On exam day, aim for predictable execution. Complete check-in early, ensure a quiet testing environment, and remove any workflow friction (power, network stability, allowed ID). Once the exam begins, your goal is to protect time and accuracy: use the two-pass strategy from Section 6.1 and keep a steady pace. If you feel stuck, it’s usually because you’re debating between two services—mark it, move on, and return when you’ve collected more context from other items.

Exam Tip: When returning to marked questions, reread only the last sentence first. Many prompts reveal the deciding constraint at the end (e.g., “must run in a private network,” “must extract tables,” “must provide citations,” “must support multiple languages”).

  • First 5 minutes: settle your pace; don’t overthink early questions.
  • Middle block: maintain discipline—answer, justify mentally, move.
  • Final buffer: resolve marked items, then do a fast scan for misreads (negations like “NOT,” “least,” “best”).

In the last hour before the exam, avoid deep study. Do a lightweight review of your “must-know mappings”: which service for which requirement (Vision vs Document Intelligence; Language features; Search indexing/enrichment; Azure OpenAI with RAG; Content Safety; managed identity and private endpoints). Rehearse a simple mental template: requirement → constraints → service → key configuration → monitoring/safety. That template turns stress into routine, which is exactly what passing requires.

Chapter milestones
  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist
Chapter quiz

1. You are designing an AI-102 solution for a financial company. Requirements: (1) summarize customer call transcripts, (2) redact PII from the summaries, (3) log prompts and completions for auditing, and (4) ensure no customer data is used to train models. Which approach best meets the requirements with the least custom code?

Correct answer: Use Azure OpenAI for summarization with content filtering, store logs in Azure Monitor/Log Analytics, and use Azure AI Language PII detection before storing the final summary; configure Azure OpenAI with data privacy guarantees (no training on your data).
A matches typical AI-102 decision-making: Azure OpenAI is the best fit for abstractive summarization, and Azure AI Language provides built-in PII entity recognition for redaction. Azure OpenAI provides enterprise data privacy (customer prompts/completions not used for training) and can be integrated with Azure Monitor/diagnostic logging for audit requirements. B is wrong because OCR/Custom Vision are computer vision services and do not apply to already-digital transcripts; Custom Vision is not a PII redaction service and would require significant custom labeling and inference logic. C is wrong because Azure AI Search retrieval is not summarization, and security controls do not replace PII detection/redaction—PII could still be exposed in retrieved passages.
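The ordering in the correct answer (summarize, redact, then store) can be sketched with a stdlib-only stand-in. The regexes below are illustrative placeholders for Azure AI Language PII detection, which recognizes far more entity types and languages; the summary text is invented.

```python
# Sketch: redact PII from a summary BEFORE it is persisted.
import re

PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # SSN-like number (illustrative)
    re.compile(r"\b[\w.]+@[\w.]+\.\w+\b"),  # email address (illustrative)
]

def redact(text):
    for pattern in PII_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

summary = "Customer jane@contoso.com reported the issue; SSN 123-45-6789 on file."
safe = redact(summary)
# Only `safe` is persisted; prompts/completions go to diagnostic logs for
# auditing, and the service's data-privacy guarantee covers requirement (4).
```

The key exam point is the sequence: detection/redaction sits between generation and storage, not after the data has already been written.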

2. You deployed an Azure AI Search + Azure OpenAI RAG solution. Users report that answers are fluent but frequently cite outdated policy versions. You confirm the policy PDFs were updated in Blob Storage yesterday. Which action should you take first to restore answer freshness?

Correct answer: Run the indexer (or skillset pipeline) to re-ingest/re-enrich the updated documents and validate that the index contains the new content/metadata.
A is the first step because RAG freshness depends on the search index reflecting the latest source content; if the indexer/skillset hasn’t processed updates (or change detection isn’t configured correctly), retrieval will return old chunks regardless of model quality. B is wrong because temperature affects randomness, not data currency; it can increase hallucination risk. C can improve retrieval quality in some cases, but it will not fix stale content in the index; you must first ensure the updated documents are ingested and searchable.

3. A healthcare organization is building a document-processing pipeline to extract key-value pairs and tables from scanned lab reports and return structured JSON. The solution must handle variable layouts across different clinics and minimize training/maintenance. Which service should you choose?

Correct answer: Azure AI Document Intelligence prebuilt models (e.g., Read/Layout and relevant prebuilt extraction) and optionally a composed model if needed, returning structured outputs.
A aligns with AI-102 guidance: Azure AI Document Intelligence is designed for OCR + layout, table extraction, and key-value/document field extraction with minimal training compared to custom vision pipelines; it returns structured results suitable for JSON. B is wrong because Image Analysis is not optimized for robust document layout/table understanding and would require substantial custom parsing logic and ongoing maintenance for layout variability. C is wrong because using a generative model as the primary extractor for scanned documents is less deterministic, harder to govern for PHI, and not the recommended service for reliable table/key-value extraction compared to Document Intelligence.

4. You are troubleshooting a production Azure OpenAI integration. The app intermittently fails with HTTP 429 errors during peak hours. Business requirement: maintain throughput without increasing user-facing latency. Which mitigation is most appropriate?

Correct answer: Implement client-side retry with exponential backoff and request queuing, and request a quota/throughput increase or use multiple deployments to distribute load.
A reflects standard operational guidance: 429 indicates throttling due to rate/throughput limits; the correct response is resilient retry/backoff, smoothing bursts via queuing, and adjusting capacity (quota increase and/or load distribution across deployments/regions where applicable). B is wrong because content filtering is a safety/governance control and disabling it is not a valid throttling strategy; throttling is tied to capacity limits, not primarily safety features. C is wrong because reducing max_tokens can reduce per-request cost/latency, but forcing max_tokens=1 breaks the app’s functional requirements and does not guarantee elimination of throttling under high request rates.
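The retry-with-backoff half of the correct answer can be sketched in isolation. The throttled client call, its error type, and the retry parameters below are hypothetical stand-ins; a production client would also honor any `Retry-After` header the service returns.

```python
# Sketch: exponential backoff with jitter for a 429-style throttling error.
import random
import time

class ThrottledError(Exception):
    """Stand-in for an HTTP 429 response."""

def with_backoff(call, max_retries=5, base=0.5):
    for attempt in range(max_retries):
        try:
            return call()
        except ThrottledError:
            # 0.5s, 1s, 2s, ... plus jitter to avoid synchronized retries
            delay = base * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
    raise RuntimeError("still throttled after retries")

state = {"n": 0}
def fake_completion():
    state["n"] += 1
    if state["n"] < 3:
        raise ThrottledError()
    return "completion text"

result = with_backoff(fake_completion, base=0.01)
```

Backoff alone only smooths bursts; sustained 429s still require the capacity side of the answer (quota increases or load spread across deployments).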

5. After completing a mock exam, you identify a weak spot: you often choose the wrong service when a requirement combines OCR, entity extraction, and search-based retrieval. Which revision strategy most directly improves exam performance under time pressure?

Correct answer: Create a decision matrix mapping requirements to Azure AI services (Vision vs Document Intelligence vs Language vs Search vs OpenAI), then drill with timed scenario questions focusing on elimination of distractors and operational constraints (security, monitoring, cost).
A matches how AI-102 questions are designed: they test service selection and operational constraints. A decision matrix plus timed scenario drills builds a repeatable method (identify objective, eliminate distractors, confirm best-fit service/feature with governance/ops constraints). B is wrong because passive review doesn’t simulate exam pressure or strengthen decision-making; you need scenario practice and error-driven refinement. C is wrong because while quotas/regions can matter, most exam errors come from misaligned service choice and architecture tradeoffs rather than memorizing exhaustive SKU/pricing tables.