AI-102 Agentic AI & Knowledge Mining: Build and Deploy

AI Certification Exam Prep — Beginner

A complete, domain-mapped AI-102 prep path with practice and a mock exam.

Beginner · ai-102 · microsoft · azure · azure-ai

Prepare to pass Microsoft AI-102 with a domain-mapped blueprint

This course is a structured exam-prep blueprint for the Microsoft AI-102 exam (Azure AI Engineer Associate). It’s designed for beginners with basic IT literacy who want a clear, step-by-step path from “new to certification exams” to confident exam readiness. The focus is on the exact skills measured in the official domains, with an emphasis on how Microsoft tests design decisions, tradeoffs, and operational readiness—not just terminology.

What the AI-102 exam measures (and how we cover it)

The AI-102 exam spans six core domains. This course is organized as a 6-chapter “book,” with Chapters 2–5 mapping directly to the skill areas and Chapter 6 providing a full mock exam and final review cycle.

  • Plan and manage an Azure AI solution — governance, identity, security, cost, monitoring, and deployment readiness.
  • Implement generative AI solutions — Azure OpenAI design patterns, prompting, grounding/RAG, safety, and evaluation.
  • Implement an agentic solution — tool/function orchestration, memory/state, reliability controls, and safe automation.
  • Implement computer vision solutions — OCR and image analysis architecture choices and workflow integration.
  • Implement NLP solutions — extraction, classification, summarization, and conversational patterns.
  • Implement knowledge mining and information extraction — Azure AI Search indexing, enrichment pipelines, and structured extraction at scale.

Course structure (6 chapters that mirror how you’ll study)

Chapter 1 is your exam on-ramp: registration, scoring, question styles, and a beginner-friendly study strategy so you don’t waste time on low-yield topics. Chapters 2–5 each go deep into one or two official domains, emphasizing common scenario patterns (security constraints, quota limitations, latency vs cost decisions, and “best answer” architecture selection). Chapter 6 delivers a full mock exam split into two parts, followed by weak-spot analysis and an exam-day checklist.

Why this blueprint improves pass probability

AI-102 questions often reward applied reasoning: choosing the right Azure AI capability, securing access properly, selecting a retrieval strategy, or identifying the most supportable deployment approach. This course is designed to build that reasoning through structured coverage and exam-style practice in every major chapter, culminating in a full mock exam and a final review map by domain.

  • Beginner-first exam strategy: understand how Microsoft tests scenarios and tradeoffs.
  • Domain alignment: every chapter traces back to named official objectives.
  • Practice-driven: consistent exam-style questions to build speed and accuracy.
  • Mock exam + review loop: identify weak areas and fix them before exam day.

Get started

If you’re ready to build a reliable study routine and prep with a domain-mapped plan, start by setting up your learning account and bookmarking your study schedule. Register free to begin, or browse all courses to compare related Azure AI tracks.

What You Will Learn

  • Plan and manage an Azure AI solution (governance, security, cost, monitoring, deployment)
  • Implement generative AI solutions with Azure OpenAI (prompting, RAG patterns, safety, evaluation)
  • Implement an agentic solution (tools/functions, orchestration, memory, reliability and controls)
  • Implement computer vision solutions (image analysis, OCR, custom vision workflows, responsible AI)
  • Implement NLP solutions (classification, extraction, summarization, conversational patterns)
  • Implement knowledge mining and information extraction (Azure AI Search indexing, enrichment, skillsets)

Requirements

  • Basic IT literacy (networks, web apps, APIs, JSON)
  • Comfort using the Azure portal in a browser (no prior certification experience required)
  • Optional: basic Python or C# familiarity to understand SDK examples

Chapter 1: AI-102 Exam Orientation and Study Strategy

  • Exam format, registration, and scoring: what to expect
  • Domain-by-domain study plan and timeboxing for beginners
  • Lab environment setup strategy (Azure subscription, quotas, tools)
  • How to read AI-102 questions: keywords, distractors, and elimination
  • Baseline diagnostic quiz and goal setting

Chapter 2: Plan and Manage an Azure AI Solution

  • Design solution architecture across Azure AI services
  • Identity, network, and data security decisions for AI workloads
  • Cost management, quotas, capacity planning, and scaling
  • Deployment patterns: dev/test/prod and CI/CD concepts
  • Exam-style practice set: governance and operations scenarios

Chapter 3: Implement Generative AI Solutions

  • Model selection and prompt engineering fundamentals for Azure OpenAI
  • RAG design: grounding, retrieval, and context window management
  • Safety, content filters, and responsible AI for generative apps
  • Evaluation and optimization: latency, cost, quality, and regression testing
  • Exam-style practice set: generative solution design and troubleshooting

Chapter 4: Implement an Agentic Solution

  • Agent design: goals, plans, tools, and constraints
  • Orchestration and tool/function integration patterns
  • Memory, state, and knowledge access for multi-step tasks
  • Reliability engineering: guardrails, retries, and human-in-the-loop
  • Exam-style practice set: agent workflows and control scenarios

Chapter 5: Computer Vision, NLP, and Knowledge Mining

  • Computer vision solutions: image analysis and OCR design choices
  • NLP solutions: extraction, classification, summarization, and conversation
  • Knowledge mining with Azure AI Search: indexing and enrichment pipelines
  • Information extraction workflows: documents, entities, and metadata at scale
  • Exam-style practice set: CV/NLP/Search integrated scenarios

Chapter 6: Full Mock Exam and Final Review

  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist
  • Final domain review and rapid recall drills

Jordan Whitaker

Microsoft Certified Trainer (MCT)

Jordan Whitaker is a Microsoft Certified Trainer who designs exam-aligned learning paths for Azure and AI certifications. He has coached learners through Microsoft certification prep with a focus on practical architecture, governance, and deployment skills.

Chapter 1: AI-102 Exam Orientation and Study Strategy

AI-102 tests whether you can design and implement practical Azure AI solutions under real-world constraints: security, cost, reliability, and responsible AI. This chapter sets expectations for the exam experience, then gives you a repeatable study system that aligns to the domains you’ll be scored on. You’ll also learn how to set up a lab environment without burning time or money, how to read AI-102 questions for keywords and hidden constraints, and how to use diagnostics to set realistic score goals.

The biggest mistake candidates make is studying “by service” (e.g., only Azure OpenAI or only Azure AI Search) rather than “by objective.” The exam rewards integration: using Azure AI Search for retrieval-augmented generation (RAG), adding safety filters and evaluation, deploying with monitoring, and applying governance practices. Throughout this chapter, you’ll see how to translate an objective into (1) documentation to read, (2) a lab to build, and (3) question patterns to practice.

Exam Tip: Treat AI-102 as an implementation-and-operations exam, not a conceptual AI theory test. If your study notes don’t include deployment steps, quota constraints, authentication choices, and monitoring signals, you’re missing what the exam is measuring.

Practice note: for each topic in this chapter (exam format, registration, and scoring; the domain-by-domain study plan and timeboxing; lab environment setup; question-reading technique; and the baseline diagnostic), document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Section 1.1: AI-102 overview and Azure AI Engineer Associate role

AI-102 validates the skills of an Azure AI Engineer Associate: someone who plans, builds, integrates, and operates AI capabilities in Azure. The exam focuses on applied decision-making: choosing the right Azure AI services, wiring them together securely, and meeting non-functional requirements (latency, cost, governance, and reliability). Expect scenarios that combine multiple components—like Azure OpenAI + Azure AI Search + storage + managed identities + monitoring—because that reflects how solutions ship in production.

The course outcomes map to the job role you’re being tested on: planning and managing solutions (governance/security/cost/monitoring/deployment), implementing generative AI solutions with Azure OpenAI (prompting, RAG, safety, evaluation), implementing agentic solutions (tools/functions, orchestration, memory, reliability controls), plus computer vision, NLP, and knowledge mining with Azure AI Search indexing/enrichment.

On the exam, “role clarity” matters because it tells you what the question expects you to own. A common trap is over-indexing on model selection or data science details. AI-102 is not asking you to train foundation models; it’s asking you to deploy and integrate capabilities safely. When a question mentions production requirements (PII, private networking, RBAC, cost caps), interpret it as an engineering governance question, not a prompt-writing question—even if Azure OpenAI appears in the stem.

Exam Tip: When you see wording like “minimize administrative effort,” “follow least privilege,” or “meet compliance,” think first about managed identities, RBAC, private endpoints, logging, and content safety controls—these are frequent differentiators between answer choices.

Section 1.2: Registration, exam policies, accommodations, and retakes

Register for AI-102 through Microsoft’s certification portal and schedule via the authorized provider (typically online proctored or in-person). Plan your date backward from a realistic study runway: beginners often need multiple weeks of structured practice plus lab time. Your goal is not just completion of reading; it’s exposure to scenario questions where governance, deployment, and service limits appear together.

Know the policies because they affect your test-day performance. Online proctoring has strict rules: clean desk, no secondary screens, and limited breaks. In-person centers reduce proctoring friction but require travel time and can add scheduling constraints. If you need accommodations (extra time, assistive technology), apply early; delays here can disrupt your study calendar and force you into a suboptimal date.

Retake rules matter for strategy. Some candidates rush an attempt “to see the exam” and then spend the retake window frustrated. A better approach is to use a baseline diagnostic (without writing down exact items) to identify weak domains, then book the exam once you can consistently explain why each incorrect option is wrong. Budget both time and cost: factor in subscription charges for labs and potential retake fees.

Exam Tip: Schedule your exam at a time when you can do a full-length practice set in the same time slot during the two weeks prior. This reduces cognitive surprises and helps you pace case studies and multi-part items.

Section 1.3: Scoring model, question types, and case study tactics

AI-102 uses scaled scoring: your raw number of correct answers is converted to a scaled score, with Microsoft exams typically reported on a 1–1000 scale and 700 required to pass. Treat every question as valuable; don’t assume some items “don’t count.” Your controllable variable is consistency: avoid preventable misses caused by misreading constraints, confusing similar services, or ignoring the requirement to choose the best option (not merely a working one).

Expect multiple question types: single-answer multiple choice, multiple-response, ordering/sequence (e.g., correct deployment steps), drag-and-drop matching, and case studies. Case studies are time traps: they present a business scenario, existing environment, requirements, and constraints. The exam often hides key constraints in non-obvious places—such as “must support customer-managed keys,” “data must not traverse public internet,” or “solution must be regionally pinned.” Those constraints eliminate otherwise attractive answers.

Case study tactics: read requirements first, then skim the environment for blockers (identity, networking, quotas), and only then read the question. Re-check each answer against the exact wording. Many distractors are “nearly right,” but violate one constraint (e.g., using keys in code instead of managed identity, or proposing a service that doesn’t support the needed feature in that context).

Exam Tip: Build a habit of underlining three items mentally: (1) the user goal (what outcome), (2) the hard constraint (what must/must not), and (3) the optimization target (minimize cost, maximize reliability, reduce latency). Most wrong answers fail one of these three.

Section 1.4: Skill outline mapping to the six official exam domains

Your study plan should be domain-driven and timeboxed, especially if you’re new to Azure AI. While exact weightings can change, AI-102 commonly evaluates competency across these six functional domains that align tightly with your course outcomes: (1) Plan and manage an Azure AI solution; (2) Implement generative AI solutions with Azure OpenAI; (3) Implement an agentic solution; (4) Implement computer vision solutions; (5) Implement NLP solutions; (6) Implement knowledge mining and information extraction with Azure AI Search.

Map each domain to what the exam actually tests:

  • Planning/management: identity (managed identity, RBAC), network isolation, cost controls, monitoring/alerts, deployment patterns, and responsible AI governance.
  • Generative AI (Azure OpenAI): prompt patterns, RAG architecture, embeddings + retrieval, safety filters, evaluation approaches, and operational considerations (quotas, rate limits).
  • Agentic solutions: tool/function calling, orchestration choices, memory boundaries, reliability controls (timeouts, retries), and guardrails (allowlists, content checks).
  • Computer vision: OCR vs image analysis vs custom vision workflows, when to use prebuilt vs custom, and privacy/responsible use.
  • NLP: extraction/classification/summarization patterns, conversation design decisions, and measuring quality.
  • Knowledge mining: indexing pipelines, enrichment/skillsets, data sources, field mappings, and query patterns.

A frequent trap is mixing similarly named services or features. For example, candidates confuse “knowledge mining” (indexing + enrichment pipelines) with “RAG” (retrieval at query time). The exam expects you to know how they complement each other: knowledge mining builds the searchable index; RAG uses retrieval results to ground a generative response.

Exam Tip: Whenever a question mentions “indexing,” “enrichment,” “skillset,” or “cognitive skills,” anchor on Azure AI Search ingestion. Whenever it mentions “grounding,” “citations,” or “use retrieved passages,” anchor on RAG with search + embeddings.

Section 1.5: Study workflow: notes, flashcards, labs, and spaced repetition

A beginner-friendly plan is a two-track workflow: concept track (reading + notes) and execution track (labs). Timebox by domain: allocate weekly blocks so you touch all domains early, then return for deepening. This prevents the common failure mode of spending all week on generative AI and then realizing you never practiced AI Search skillsets or OCR configuration.

Notes: keep them objective-aligned and decision-focused. Write down “if-then” rules that appear in questions: if private networking required → consider private endpoints; if secretless auth required → managed identity; if cost must be minimized → pick prebuilt capability rather than custom training, when it meets requirements.

Flashcards: convert your “if-then” rules and service differentiators into spaced repetition prompts. Your goal is speed: on exam day, you should instantly recognize which service/feature a keyword implies.

Labs: set up a stable practice environment. Use one Azure subscription (or a dedicated resource group strategy), define naming conventions, and clean up resources to control cost. Plan for quotas and regional availability: generative AI and some vision/NLP features can be region-limited. Track your Azure OpenAI quota and model availability, and document which region you can reliably deploy to.

Exam Tip: In labs, practice the “boring” pieces: identity, RBAC, logging, and key management. Many candidates can demo a chatbot but miss questions about secure access to storage/search or how to monitor failures and throughput.

Section 1.6: Practice strategy: reviewing rationales and tracking weak areas

Practice is where your score moves. The key is not volume; it’s rationale review. After each practice set, classify misses into buckets: (1) knowledge gap (didn’t know feature), (2) misread constraint (missed “must/only/not”), (3) elimination failure (didn’t spot why distractor violates requirements), or (4) execution gap (never built it in a lab). Each bucket has a different fix: read docs, slow down and annotate constraints, practice elimination, or run a lab.

Use a baseline diagnostic early to set goals and timebox. Your diagnostic isn’t about memorizing items; it’s about discovering which domains are currently unstable. Beginners often find planning/management and knowledge mining are weaker than expected because they require Azure platform fluency (identity, networking, indexing pipelines). Set a realistic target score and a retest plan only if needed, but don’t schedule a retake as motivation—schedule study milestones as motivation.

Track weak areas with a simple spreadsheet: domain, sub-skill, symptom, corrective action, and re-test date. Then apply spaced repetition: revisit the same sub-skill 1 day, 3 days, 7 days, and 14 days after you fix it. The exam rewards durable recognition of patterns, not short-term cramming.

Exam Tip: When reviewing rationales, always write a one-sentence “why the correct answer is best” and a one-sentence “why each distractor fails.” This builds elimination speed, which is often the difference between passing and timing out on case studies.

Chapter milestones
  • Exam format, registration, and scoring: what to expect
  • Domain-by-domain study plan and timeboxing for beginners
  • Lab environment setup strategy (Azure subscription, quotas, tools)
  • How to read AI-102 questions: keywords, distractors, and elimination
  • Baseline diagnostic quiz and goal setting
Chapter quiz

1. You are planning your AI-102 preparation. Based on the exam’s scoring approach, which study strategy best aligns with how AI-102 evaluates candidates?

Correct answer: Study by exam objectives and practice end-to-end solution implementation across services (security, cost, monitoring, responsible AI)
AI-102 is an implementation-and-operations exam that scores you against objective domains and expects integrated solutions (for example, combining retrieval, safety, deployment, and monitoring). Option B is a common pitfall: studying “by service” misses cross-service integration and objective coverage. Option C is wrong because AI-102 is not a theory-heavy exam; it emphasizes practical build, deploy, secure, and operate tasks.

2. You are setting up a lab environment for AI-102 practice. You have a limited budget and want to avoid getting blocked mid-lab by platform limits. What is the best initial approach?

Correct answer: Create an Azure subscription and verify required resource provider registration, quotas, and region availability before building labs; use cost controls to limit spend
AI-102 labs commonly require Azure resources that can be constrained by quotas, SKUs, and regional availability. Proactively verifying subscription access, quotas, and regions reduces failure risk and aligns with real exam expectations around constraints and operations. Option B is wrong because hands-on implementation is central to AI-102; postponing setup delays required practice. Option C is wrong because quotas and regional/SKU limitations are common blockers in Azure AI services.

3. During a practice question, you see the requirement: “The solution must minimize operational overhead and support monitoring in production.” Which approach best matches how to interpret and answer AI-102 questions?

Correct answer: Treat the requirement as a key constraint and eliminate options that increase manual steps or lack monitoring/operability signals
AI-102 questions often hide scoring-relevant constraints in wording (for example, operational overhead, monitoring, cost, security). Correct technique is to identify keywords and use them to eliminate distractors. Option B is wrong because operability and monitoring are explicitly tested as real-world constraints. Option C is wrong because “most feature-rich” can violate constraints (higher ops overhead/cost) and is a common distractor pattern.

4. A beginner candidate has four weeks to prepare for AI-102 and struggles to stay consistent. Which plan best reflects the chapter’s recommended domain-by-domain approach and timeboxing?

Correct answer: Create a weekly schedule mapped to exam domains, timebox each domain, and include (1) targeted docs, (2) a lab, and (3) question patterns per objective
The chapter emphasizes studying by objective/domain with a repeatable loop: documentation, hands-on lab, and practice questions, using timeboxing to prevent over-investing in a single area. Option B is wrong because delaying questions and labs reduces feedback and fails to build exam-relevant implementation skills. Option C is wrong because “by service” studying often misses integrated objectives and exam domain coverage.

5. You take a baseline diagnostic quiz and score below your target. What is the best next action aligned with the chapter’s goal-setting guidance?

Correct answer: Use the diagnostic results to identify weak domains, set a realistic score goal and timeline, then adjust your study plan and labs accordingly
A baseline diagnostic is meant to inform goal setting and planning: identify weak domains, set a realistic target, and adapt study activities (docs/labs/practice patterns). Option B is wrong because diagnostics should drive changes in prioritization and timeboxing, not be ignored. Option C is wrong because AI-102 emphasizes implementation and operational decision-making, not pure memorization.

Chapter 2: Plan and Manage an Azure AI Solution

AI-102 does not only test whether you can call an API. It tests whether you can operate an AI solution in Azure: choose the right services, secure them, control cost, deploy safely, and monitor the system over time. Many exam scenarios describe an “AI app” in business terms (support chatbot, document processing pipeline, image moderation workflow) and then ask you to pick architecture, identity, networking, or operations choices that meet constraints like “no public internet,” “use least privilege,” “minimize cost,” or “support dev/test/prod.”

This chapter maps to the exam objective of planning and managing an Azure AI solution. Expect scenario questions that force tradeoffs: speed vs. cost, simplicity vs. governance, and proof-of-concept vs. production readiness. The correct answer is usually the one that aligns with Azure’s recommended patterns: managed identity over keys, private networking over public endpoints for sensitive data, Azure Monitor over ad-hoc logging, and automated deployments over manual portal changes.

You’ll also see knowledge mining and retrieval-augmented generation (RAG) architectures show up indirectly here: storage + Azure AI Search indexing + enrichment + Azure OpenAI inference. Even when the prompt engineering is the “main feature,” the exam often scores you on how you secure data, prevent exfiltration, and manage quotas.

Practice note: for each topic in this chapter (solution architecture across Azure AI services; identity, network, and data security decisions; cost management, quotas, and capacity planning; deployment patterns for dev/test/prod and CI/CD; and the governance and operations practice set), document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Section 2.1: Choose Azure AI services for the workload (tradeoffs and fit)

AI-102 expects you to recognize which Azure AI service best matches a requirement and why. Start by classifying the workload: (1) generative chat or content creation, (2) search and knowledge mining, (3) vision/OCR, (4) language understanding/extraction, or (5) orchestration/agentic workflows. For generative models, the managed path is Azure OpenAI, typically paired with Azure AI Search for RAG. For knowledge mining and enterprise search, Azure AI Search is the hub: it indexes content from Blob, SQL, Cosmos DB, and can apply enrichment skillsets (OCR, entity recognition, key phrase extraction) during indexing.

For document-heavy scenarios, distinguish between “OCR only” and “structured extraction.” Azure AI Vision OCR handles text detection; Azure AI Document Intelligence (formerly Form Recognizer) targets forms, invoices, receipts, and layout extraction. For image understanding beyond OCR (tags, captions, object detection), Azure AI Vision fits. For classic NLP (classification, NER, summarization), Azure AI Language is the go-to—unless the scenario explicitly requires generative output with grounding, in which case Azure OpenAI + RAG is usually more aligned.

Exam Tip: When a prompt-based chatbot must answer using internal PDFs and must cite sources, the exam is usually steering you to Azure OpenAI + Azure AI Search (RAG) rather than “fine-tune a model” or “store documents in a database and prompt directly.”

  • Choose Azure AI Search when the requirement includes relevance ranking, filtering, facets, semantic search, vector search, or hybrid retrieval across many documents.
  • Choose enrichment (skillsets) when the question mentions “extract entities during indexing,” “normalize content,” or “make images searchable.”
  • Choose Document Intelligence when the output must be fields (vendor, total, date) with consistent schema.

Common trap: picking the most “powerful” service instead of the most direct fit. The exam often rewards minimizing complexity: if you only need OCR, don’t introduce an LLM. If you need search relevance and citations, don’t rely on blob listing or “prompt stuffing.”

Section 2.2: Authentication and authorization (Entra ID, keys, managed identity)

Identity questions are frequent because they combine governance, security, and operational best practices. Azure AI services typically support key-based authentication, and many support Microsoft Entra ID (formerly Azure AD) via role-based access control (RBAC). For production, prefer Entra ID because it enables least privilege, auditing, and easy rotation without embedding secrets. For workloads running in Azure (App Service, Functions, AKS, Logic Apps), use managed identity so the runtime can obtain tokens without storing credentials.

Understand the “who calls what” chain. Example: an Azure Function calls Azure AI Search and Azure OpenAI. With a system-assigned managed identity on the Function, you can grant RBAC roles to that identity on the Search service and the Azure OpenAI resource. Your app code then requests tokens; no API keys required. If the scenario involves developers or CI/CD pipelines, service principals or federated credentials (OIDC) may appear as the right choice to avoid long-lived secrets.
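To make the chain concrete, here is a minimal Python sketch of secretless, token-based access from an Azure-hosted workload. It assumes the azure-identity, azure-search-documents, and openai packages; the endpoint, index, and API-version values are placeholders, not a prescribed configuration.

    # Minimal sketch: managed identity + RBAC instead of API keys.
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from azure.search.documents import SearchClient
    from openai import AzureOpenAI

    credential = DefaultAzureCredential()  # resolves to the managed identity when running in Azure

    # Data-plane call to Azure AI Search: the Function's identity needs a data-plane
    # role such as Search Index Data Reader on the Search service.
    search_client = SearchClient(
        endpoint="https://<your-search-service>.search.windows.net",
        index_name="<your-index>",
        credential=credential,
    )

    # Azure OpenAI via Entra ID: exchange the identity for a bearer token, no key in config.
    token_provider = get_bearer_token_provider(
        credential, "https://cognitiveservices.azure.com/.default"
    )
    openai_client = AzureOpenAI(
        azure_endpoint="https://<your-openai-resource>.openai.azure.com",
        azure_ad_token_provider=token_provider,
        api_version="2024-06-01",  # placeholder; use a currently supported API version
    )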

Exam Tip: If the stem says “avoid storing secrets,” “use least privilege,” or “centralize access control,” the best answer is typically managed identity + RBAC, not API keys in configuration.

  • Keys: simplest, but rotation and leakage risk; often used for quick demos or where Entra ID isn’t supported for a specific operation.
  • Entra ID + RBAC: preferred for enterprise controls, auditability, and conditional access policies (human access).
  • Managed identity: best for Azure-hosted workloads calling Azure services.

Common trap: confusing RBAC with data-plane permissions. Some services require both management-plane permissions (create resources) and data-plane roles (query indexes, use models). On the exam, read for verbs like “deploy the resource” vs. “call the endpoint.” Another trap is granting overly broad roles (Owner/Contributor) when a narrower built-in role exists; least privilege is a consistent scoring theme.

Section 2.3: Network security and data protection (private endpoints, encryption)

AI-102 operations questions often hinge on whether data must stay off the public internet. In Azure, the standard pattern is to use Private Link (private endpoints) so clients connect to services over a private IP in a virtual network. For Azure AI Search, Azure OpenAI, Storage, and many related services, private endpoints reduce exposure and help meet compliance requirements. Combine this with disabling public network access when the scenario demands it.

Data protection includes encryption at rest (typically enabled by default) and encryption in transit (HTTPS/TLS). Exam items may ask how to manage customer-managed keys (CMK) using Azure Key Vault for services that support it. Also consider secrets handling: store keys/certificates in Key Vault, not in code or app settings—unless a managed identity removes the need for secrets altogether.

Exam Tip: If the question includes “must not traverse the public internet,” the answer is usually private endpoint + VNet integration (or equivalent private networking), not “IP restrictions” alone.

  • Private endpoints: map the service to a private IP; works well with hub-and-spoke VNets and on-prem via VPN/ExpressRoute.
  • Network rules/firewalls: can restrict allowed IP ranges, but still uses public endpoints unless paired with private access.
  • Key Vault: for CMK and secret storage; typically accessed via managed identity.
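When a scenario still requires a stored secret (for example, a third-party API key that managed identity cannot replace), a minimal sketch of reading it from Key Vault with the workload’s identity might look like the following; the vault URL and secret name are placeholders:

    # Minimal sketch: fetch a secret at runtime; nothing is stored in code or app settings.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    kv_client = SecretClient(
        vault_url="https://<your-key-vault>.vault.azure.net",
        credential=DefaultAzureCredential(),  # the identity needs a Key Vault secrets data-plane role
    )
    third_party_api_key = kv_client.get_secret("<secret-name>").value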

Common trap: mixing up “service endpoint” vs. “private endpoint.” Service endpoints extend VNet identity to Azure services but do not provide the same private IP-based isolation as Private Link. Another trap: assuming that disabling public access is always possible; some scenarios require a transitional architecture (e.g., dev public, prod private), which the exam may reward if it’s aligned to environment separation and governance.

Section 2.4: Monitoring and reliability (logging, metrics, alerting, SLAs)

Production AI solutions require observability: you must detect failures, performance regressions, and usage anomalies (including cost spikes). In Azure, the baseline is Azure Monitor: metrics for service health and latency, logs (often via Log Analytics workspace), and alerts routed to action groups. For app-level tracing, Application Insights is commonly paired with Functions/App Service to correlate requests across dependencies.

For Azure AI Search, monitor query latency, throttling, and indexer failures. For Azure OpenAI, watch request rates, token usage, and throttling/429 responses—then implement retry with exponential backoff. Reliability decisions also include designing for transient faults and regional issues: use retries, circuit breakers, and idempotency where appropriate. If the scenario mentions “business-critical” or “high availability,” you may need to consider multi-region design (where supported) and clear RTO/RPO expectations, but avoid inventing complex architectures when a single-region SLA meets stated requirements.
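As an illustration, a minimal retry-with-backoff sketch for throttled chat completions might look like this; it assumes the openai Python package (where a 429 surfaces as RateLimitError) and an already-configured client, and the retry limits are illustrative:

    import random
    import time

    from openai import RateLimitError

    def chat_with_backoff(client, deployment, messages, max_retries=5):
        """Call a chat deployment, retrying throttled (429) requests with exponential backoff and jitter."""
        for attempt in range(max_retries):
            try:
                return client.chat.completions.create(model=deployment, messages=messages)
            except RateLimitError:
                if attempt == max_retries - 1:
                    raise  # out of retries: surface the throttling to the caller and alerting
                time.sleep((2 ** attempt) + random.uniform(0, 1))  # 1s, 2s, 4s, ... plus jitter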

Exam Tip: When you see “intermittent 429” or “requests failing under load,” the exam usually wants throttling-aware client behavior (retry/backoff) and/or capacity planning—not just “increase timeout.”

  • Logs: use structured logging for prompts, retrieval queries, and tool calls (with redaction) to support troubleshooting and safety reviews.
  • Metrics: alert on error rate, latency percentiles, throttling, and indexer status.
  • SLAs: know that SLAs apply to the service, but your end-to-end SLA depends on every dependency in the chain.

Common trap: logging sensitive content. The “best” operational answer often includes collecting enough telemetry to debug while applying data minimization: avoid storing full prompts or documents unless required, mask PII, and control access to logs using RBAC.

Section 2.5: Cost, quotas, and capacity planning for AI services

Cost is a first-class exam topic because AI workloads can scale unpredictably. Azure OpenAI is frequently constrained by quotas and token-based consumption; Azure AI Search costs depend on service tier/partitions/replicas; enrichment incurs additional compute. You must translate requirements into capacity choices: throughput, latency, index size, and concurrency. A good mental model: replicas scale query throughput (read), partitions scale index/storage and ingestion throughput (write). If the stem says “high query volume,” think replicas; if it says “large index” or “heavy ingestion,” think partitions.

Quotas show up as “deployment capacity” or rate limits. When usage exceeds quota, you’ll see throttling, and the right mitigation depends on the bottleneck: request a quota increase, implement batching, reduce tokens (shorter prompts, smaller context), cache responses, or move heavy processing offline (precompute embeddings, enrich at index time). For RAG, token cost is strongly driven by how much text you stuff into the prompt—improving retrieval quality and chunking strategy is a cost control mechanism, not only an accuracy tweak.
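To make the token budget tangible, a small sketch can estimate how much of the context window retrieved chunks will consume before the request is sent. It assumes the tiktoken package and that the chosen encoding matches your model; the budget number is illustrative:

    import tiktoken

    def select_chunks_within_budget(chunks, question, max_prompt_tokens=3000, encoding_name="cl100k_base"):
        """Keep the most relevant chunks (assumed ordered best-first) that fit a fixed prompt-token budget."""
        enc = tiktoken.get_encoding(encoding_name)
        used = len(enc.encode(question))
        selected = []
        for chunk in chunks:
            cost = len(enc.encode(chunk))
            if used + cost > max_prompt_tokens:
                break
            selected.append(chunk)
            used += cost
        return selected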

Exam Tip: If the scenario asks to “reduce cost without losing accuracy,” consider architectural changes like better retrieval (fewer chunks), caching, and using the smallest model that meets requirements—before jumping to “buy more capacity.”

  • Azure Cost Management: budgets, alerts, and cost analysis to detect spikes.
  • Tagging: enforce cost allocation by environment/app/team.
  • Autoscale vs. fixed: some AI services scale differently; the exam often tests whether you know to scale the correct dimension (replicas/partitions) rather than generic “scale up.”

Common trap: assuming “serverless” means “cheap.” A bursty workload can be expensive if each request is large (tokens) or triggers heavy enrichment. Read for constraints like “predictable workload” vs. “spiky traffic,” and match to reserved capacity/throughput planning where appropriate.

Section 2.6: Lifecycle management: environments, IaC/CI-CD concepts, rollbacks

AI-102 expects you to treat AI resources as deployable infrastructure with controlled promotion across dev/test/prod. Environment separation reduces blast radius and supports compliance: separate resource groups/subscriptions, distinct keys/endpoints, and distinct data sets. Use Infrastructure as Code (IaC) such as Bicep or ARM templates (and commonly Terraform in real projects) to create repeatable deployments: AI Search services, indexes, data sources, indexers, OpenAI deployments, managed identities, and private endpoints.

CI/CD concepts appear as “how do you deploy safely?” The exam-friendly answer usually includes: version control, automated builds, automated tests, staged releases, and approvals for production. For AI systems, include configuration versioning (prompts, index schema, skillsets) and safe rollout patterns. Blue/green or canary deployments reduce risk: route a small percentage of traffic to the new version, monitor, then promote. Rollbacks must be planned—especially for schema changes (index fields) and model deployment changes (switching OpenAI deployments). A practical rollback approach includes keeping the prior index and deployment available until the new one is validated.

Exam Tip: If the stem mentions “avoid downtime” during index changes, prefer deploying a new index (v2), reindexing, then swapping the app to the new index—rather than modifying an index in-place.
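A minimal sketch of that “new index, then swap” pattern with the Azure AI Search Python SDK might look like this; the index name and fields are illustrative, and in practice the schema and the application’s index-name setting would live in versioned configuration:

    # Minimal sketch: deploy docs-v2 alongside docs-v1, reindex, validate, then repoint the app.
    from azure.identity import DefaultAzureCredential
    from azure.search.documents.indexes import SearchIndexClient
    from azure.search.documents.indexes.models import (
        SearchIndex, SimpleField, SearchableField, SearchFieldDataType,
    )

    index_client = SearchIndexClient(
        endpoint="https://<your-search-service>.search.windows.net",
        credential=DefaultAzureCredential(),
    )

    docs_v2 = SearchIndex(
        name="docs-v2",
        fields=[
            SimpleField(name="id", type=SearchFieldDataType.String, key=True),
            SearchableField(name="content", type=SearchFieldDataType.String),
            SimpleField(name="sourceUrl", type=SearchFieldDataType.String, filterable=True),
        ],
    )
    index_client.create_or_update_index(docs_v2)
    # Next: run indexing against docs-v2, validate results, then change the app's configured
    # index name from docs-v1 to docs-v2. Keep docs-v1 until validation passes so rollback
    # is just a configuration change.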

  • Dev/test/prod: separate resources and data; don’t test with production PII.
  • IaC: repeatable and auditable; supports policy enforcement.
  • Rollbacks: keep prior versions and use staged traffic shifting.

Common trap: manual portal changes in production. The exam often frames this as “fastest” but not “correct.” Look for answers that emphasize automation, traceability, and policy-based governance (e.g., Azure Policy to restrict public network access or enforce tags), aligning operational controls with the solution’s security and cost requirements.

Chapter milestones
  • Design solution architecture across Azure AI services
  • Identity, network, and data security decisions for AI workloads
  • Cost management, quotas, capacity planning, and scaling
  • Deployment patterns: dev/test/prod and CI/CD concepts
  • Exam-style practice set: governance and operations scenarios
Chapter quiz

1. A company is building a RAG-based support chatbot using Azure OpenAI, Azure AI Search, and Azure Storage. Security policy requires that no service endpoints are reachable over the public internet and that access follows least privilege. Which approach best meets the requirements?

Correct answer: Use private endpoints (Private Link) for Storage, Azure AI Search, and Azure OpenAI, and use managed identities with Azure RBAC for service-to-service access
Private endpoints remove public reachability and keep traffic on private networking, which aligns with common exam requirements like “no public internet.” Managed identities reduce secret management and support least privilege via RBAC. Option B still uses public endpoints (even if IP-restricted) and relies on long-lived secrets (keys), which is weaker governance. Option C violates the explicit “no public internet” constraint even if traffic is encrypted.

2. You are planning capacity for an Azure OpenAI workload used by multiple internal applications. The business expects periodic spikes and wants to avoid production outages caused by throttling. What should you plan for first to reduce the risk of request failures during spikes?

Correct answer: Request sufficient Azure OpenAI quota/capacity in the target region and implement client-side retry/backoff for 429 responses
Azure OpenAI workloads are commonly limited by service quota/capacity and throttling (429), so planning quota and implementing retry/backoff is the most direct mitigation for spike-driven failures. Option B can help storage cost/throughput but does not address model inference throttling. Option C may improve the front-end’s compute capacity, but it won’t prevent Azure OpenAI rate limits from rejecting requests.

3. A team manages dev/test/prod environments for an AI document processing pipeline (Storage -> Azure AI Search indexing -> Azure OpenAI summarization). They want consistent, repeatable deployments and to prevent configuration drift caused by manual portal changes. Which approach best aligns with certification exam recommended practices?

Correct answer: Use Infrastructure as Code (ARM/Bicep/Terraform) and a CI/CD pipeline to deploy each environment with parameterized configurations and approval gates for production
CI/CD plus IaC is the recommended pattern to achieve repeatability, environment consistency, and reduced drift, often emphasized in exam scenarios about dev/test/prod. Option B and C rely on manual steps and environment-specific tweaks, which increases drift and makes auditing and rollback difficult.

4. A healthcare organization must ensure that only an Azure Function can read documents from a Storage account and call Azure AI Search indexing, without storing secrets in code. What is the best identity approach?

Correct answer: Enable a managed identity on the Azure Function and assign the minimum required RBAC roles on Storage and Azure AI Search
Managed identity avoids embedding secrets and supports least privilege via RBAC assignments, which is a common exam best practice. Option B uses powerful keys (often over-privileged) and increases secret-handling risk even with rotation. Option C introduces shared credentials across environments, breaks least-privilege isolation, and is poor operational governance.

5. A company is running an AI content moderation workflow and wants centralized visibility into failures, latency, and request volume over time. They also need alerting when error rates exceed a threshold. Which solution best meets these operational requirements?

Correct answer: Use Azure Monitor (metrics, logs), Application Insights for the app layer, and configure alert rules based on logs/metrics
Azure Monitor plus Application Insights provides centralized telemetry, queryable logs, metrics, and alerting—typical exam guidance for operating AI solutions. Option B is not real-time, is difficult to query at scale, and doesn’t provide alerting. Option C is insufficient for application/service health (activity logs track control-plane operations, not detailed runtime failures/latency).

Chapter 3: Implement Generative AI Solutions

This chapter maps to the AI-102 skills measured around building production-grade generative AI apps with Azure OpenAI: choosing the right model and deployment, engineering prompts for reliable outputs, designing Retrieval-Augmented Generation (RAG) to ground responses, integrating tools/functions for agent-like behaviors, applying safety and responsible AI controls, and evaluating/optimizing for cost, latency, and quality. The exam frequently tests whether you can connect design decisions (model choice, context strategy, filters, caching) to real operational outcomes (accuracy, hallucination rate, throughput, cost per request, and data exposure risk).

Expect scenario questions that include partial constraints: “must not store prompts,” “needs citations,” “needs low latency,” “needs deterministic JSON,” or “budget is capped.” Your job is to recognize which Azure OpenAI features, prompt patterns, and RAG components address those constraints. A common trap is answering with a generic “use GPT-4” or “use RAG” without aligning to governance, token budgets, and safety controls. Use the sections below as your decision checklist.

Practice note: for each topic in this chapter (model selection and prompt engineering fundamentals; RAG design, grounding, and context window management; safety, content filters, and responsible AI; evaluation and optimization for latency, cost, and quality; and the generative solution practice set), document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Section 3.1: Azure OpenAI resources, deployments, and model selection

AI-102 expects you to understand Azure OpenAI as an Azure resource with one or more deployments (model + version + capacity configuration). Exam scenarios often distinguish between the “resource” (networking, identity, keys, private endpoints, logging) and the “deployment” (which model is served and how your app calls it). You typically choose a model family based on capability needs (reasoning quality, instruction following, multimodal), and then constrain cost/latency by selecting an appropriate size and response budget.

Model selection is not just “best model wins.” The exam tests tradeoffs: small/fast models for high-throughput classification or extraction; larger reasoning models for complex synthesis; and embedding models for semantic search/RAG indexing. Also consider whether you need vision input, strict structured outputs, or tool calling support in your chosen model/version.

  • Capability fit: reasoning-heavy tasks vs routine generation; multimodal needs; tool/function calling.
  • Operational fit: latency SLOs, concurrency, and cost per 1K tokens.
  • Data/regional fit: resource region constraints and enterprise networking requirements.

Exam Tip: When a question mentions “capacity,” “throughput,” “quota,” or “multiple environments,” think in terms of separate deployments (dev/test/prod) and possibly separate resources for isolation. Answer options that only mention “switch the model in code” may be incomplete if the scenario needs governance, network isolation, or distinct monitoring boundaries.

Common trap: confusing Azure AI Search indexing/embeddings (offline or batch) with the chat completion runtime. On the exam, embeddings are typically generated during ingestion (or periodically refreshed), while chat completions are runtime and should be optimized for token usage and response size.

Section 3.2: Prompt patterns (system vs user, few-shot, structured outputs)

Prompt engineering fundamentals are heavily tested through “why is the model ignoring instructions?” or “how do I force JSON?” scenarios. You should know role hierarchy and intent: system messages define durable behavior (tone, rules, boundaries), user messages provide the task and data, and assistant messages can be used for few-shot demonstrations or to continue conversation state. The exam expects you to place policy-like instructions (safety boundaries, citation requirements, refusal behaviors) in the system message, not buried in the user prompt.

Few-shot prompting is a reliability tool: show 1–3 high-quality examples that match the desired format and edge cases (e.g., missing fields). This is especially important for extraction/classification tasks that must be consistent. For structured outputs, prefer explicit schemas and constraints (for example: “Return a JSON object with keys X, Y, Z; do not include extra keys; values must be strings.”). If the platform feature set includes structured output modes, use them to reduce format drift.

  • Instruction clarity: separate task, constraints, and input data blocks.
  • Delimiters: wrap user-provided content to reduce prompt injection risk.
  • Determinism controls: lower temperature for repeatable extraction and routing tasks.

Exam Tip: If the scenario demands “machine-readable output,” choose answers that combine (1) explicit schema, (2) low temperature, and (3) validation/retry logic. Relying on “the model usually outputs JSON” is a classic exam trap; the correct design includes guardrails.
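Here is a minimal sketch of that combination (explicit schema in the system message, temperature 0, delimiters around user content, and validate-then-retry in the application). The client and deployment are assumed to be configured as in Chapter 2, and the schema is illustrative:

    import json

    SYSTEM_PROMPT = (
        "You extract invoice data. Return only a JSON object with keys "
        "'vendor' (string), 'total' (string), and 'date' (string). No extra keys, no prose."
    )

    def extract_invoice_fields(client, deployment, document_text, max_attempts=2):
        """Request schema-constrained JSON at temperature 0 and validate before trusting it."""
        for _ in range(max_attempts):
            response = client.chat.completions.create(
                model=deployment,
                temperature=0,  # determinism control for extraction tasks
                messages=[
                    {"role": "system", "content": SYSTEM_PROMPT},
                    # Delimiters make it harder for document content to override instructions.
                    {"role": "user", "content": f"Document:\n<<<\n{document_text}\n>>>"},
                ],
            )
            try:
                data = json.loads(response.choices[0].message.content)
                if isinstance(data, dict) and {"vendor", "total", "date"} <= data.keys():
                    return data
            except (json.JSONDecodeError, TypeError):
                pass  # format drift: fall through and retry once
        raise ValueError("Model did not return valid JSON matching the expected schema")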

Common trap: putting critical rules in the user message while allowing user-provided text to override them. The exam often implies prompt injection attempts (“the document says ignore previous instructions”). Your best defense is layered: system policies, delimiters, and grounding to trusted content.

Section 3.3: RAG architecture: chunking, embeddings, vector search, citations

RAG design is a core AI-102 objective because it directly addresses hallucinations and enterprise knowledge needs. A standard RAG pipeline: ingest documents, chunk text, generate embeddings for chunks, store them in a vector index (often Azure AI Search), retrieve top-K relevant chunks at runtime, and provide those chunks as grounded context to the model. The exam tests whether you can tune chunking and retrieval to fit the model’s context window and still return accurate, cited answers.

Chunking: choose sizes that preserve meaning (headers + paragraphs) while enabling targeted retrieval. Overly large chunks waste tokens; overly small chunks lose context and reduce retrieval precision. Overlap can help maintain continuity, but too much overlap increases index size and cost.

Embeddings and vector search: embeddings represent semantic meaning; vector similarity finds conceptually related text even if keywords differ. Many scenarios use hybrid retrieval (keyword + vector) to handle both exact terms (policy numbers, product names) and semantic matches.

Citations: exams frequently require “show sources.” That implies you must pass document IDs/URLs/page numbers through the pipeline and instruct the model to cite retrieved sources only. Citations are also a safety control: they encourage answers grounded in retrieved evidence.

  • Context window management: retrieve only what is needed; summarize long context; keep a token budget for the answer.
  • Ranking and filters: metadata filters (tenant, ACL, language, date) are critical for multi-user apps.
  • Failure modes: no relevant chunks → answer should abstain or request clarification.

Exam Tip: If you see “must not answer from general knowledge” or “only answer from internal docs,” the correct solution includes (1) retrieval, (2) explicit instruction to use only retrieved content, and (3) a fallback behavior when retrieval confidence is low. A tempting wrong answer is “increase temperature/model size,” which does not solve grounding.
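
A small sketch of the grounding step itself, with retrieval already done; the chunk fields (doc_id, page, text) and the top-K cutoff are assumptions you would adapt:

    # Sketch: build a grounded prompt from retrieved chunks, keep source IDs for
    # citations, and instruct the model to abstain when evidence is missing.
    def build_grounded_messages(question: str, chunks: list[dict]) -> list[dict]:
        if not chunks:
            # Failure mode: nothing relevant retrieved -> abstain instead of guessing.
            return [{"role": "system", "content": "Say you cannot answer from the provided sources."},
                    {"role": "user", "content": question}]

        context = "\n\n".join(
            f"[{c['doc_id']} p.{c['page']}] {c['text']}" for c in chunks[:5]  # top-K only
        )
        system = (
            "Answer using ONLY the sources below. Cite source IDs in brackets. "
            "If the sources do not contain the answer, say you don't know.\n\n"
            f"SOURCES:\n{context}"
        )
        return [{"role": "system", "content": system},
                {"role": "user", "content": question}]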

Section 3.4: Tooling integration basics (function calling and structured actions)

Agentic behaviors on AI-102 begin with “tool use”: the model decides when to call a function (tool) and returns structured arguments, while your application executes the tool and provides results back to the model. This pattern is used for database lookups, workflow automation (create ticket, send email), calculations, and calling Azure services (Search queries, storage retrieval, or internal APIs). The exam tests that you understand the boundary: the model proposes actions; your code enforces authorization, validation, and side-effect controls.

Function calling reduces brittle prompt parsing. Instead of asking the model to “output an API call,” you define a tool signature (name, description, JSON schema). The model returns a structured payload that you can validate. This is also a safety feature: you can block tools, require user confirmation, or enforce least privilege.
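
A sketch of a tight tool contract plus application-side validation, using the common JSON-schema tool format; the tool name, fields, and allow-list are illustrative:

    # Sketch: a tight tool contract plus application-side argument validation.
    # The tool name, fields, and allow-list are illustrative assumptions.
    import json

    CREATE_TICKET_TOOL = {
        "type": "function",
        "function": {
            "name": "create_ticket",
            "description": "Create a support ticket. Use only when the user confirms.",
            "parameters": {
                "type": "object",
                "properties": {
                    "priority": {"type": "string", "enum": ["low", "medium", "high"]},
                    "summary": {"type": "string", "maxLength": 200},
                    "queue": {"type": "string", "enum": ["billing", "technical"]},
                },
                "required": ["priority", "summary", "queue"],
                "additionalProperties": False,
            },
        },
    }

    ALLOWED_QUEUES = {"billing", "technical"}            # allow-list for a risky field

    def validate_ticket_args(raw_arguments: str) -> dict:
        args = json.loads(raw_arguments)                  # model-proposed arguments
        if args.get("queue") not in ALLOWED_QUEUES:
            raise ValueError("Queue not permitted for this user")
        return args                                       # your code executes the action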

  • Tool selection: small set of well-described tools beats many vague ones.
  • Argument validation: schema validation + allow-lists for risky fields (e.g., email recipients, destinations).
  • Idempotency: avoid duplicate side effects by using request IDs and safe retries.

Exam Tip: If a scenario includes “the model executed an unintended action” or “prompt injection caused data exfiltration,” look for answers that add application-side controls: tool allow-lists, parameter validation, user confirmation, and RBAC checks. Answers that only say “improve the prompt” are usually insufficient because the exam wants defense in depth.

Common trap: treating tool outputs as trusted. Tool results should be treated like any external input—sanitize, constrain, and pass only what’s needed back into the model to limit token cost and exposure.

Section 3.5: Safety controls: content filtering, grounding, data handling

Generative apps must be safe by design, and AI-102 often frames this as “responsible AI + enterprise security.” You should recognize three layers: (1) platform safety features (content filtering), (2) grounding/RAG to reduce hallucinations, and (3) secure data handling (PII, secrets, retention, access control). Content filters typically apply to both prompts and completions; questions may describe blocked responses or false positives and ask what to tune (for example: adjusting filtering configuration, adding user messaging, or refining prompts and retrieval).

Grounding as a safety control: RAG plus strict instructions (“use only provided sources; cite them”) reduces unsupported claims. Pair this with abstention logic when evidence is missing.

Data handling: the exam expects least privilege, secure networking, and careful logging. Avoid storing sensitive prompts/responses unless necessary; if you must log, redact PII/secrets. Ensure that retrieved documents respect ACLs (Azure AI Search security trimming or application-side filtering). Mis-scoped retrieval is both a security incident and an exam pitfall.
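
A minimal redaction sketch, assuming the azure-ai-textanalytics package; the endpoint and key are placeholders, and the fail-closed behavior is a design assumption:

    # Sketch: redact detected PII before logging or before sending text onward.
    # Assumes the azure-ai-textanalytics package; endpoint/key are placeholders.
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics import TextAnalyticsClient

    client = TextAnalyticsClient(endpoint="https://<language-resource>.cognitiveservices.azure.com/",
                                 credential=AzureKeyCredential("<key>"))

    def redact(text: str) -> str:
        result = client.recognize_pii_entities([text])[0]
        if result.is_error:
            raise RuntimeError(result.error)              # fail closed rather than log raw text
        return result.redacted_text                       # PII spans masked by the service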

  • Prompt injection defenses: delimit untrusted text; never allow documents to redefine system rules.
  • PII controls: detect/redact before sending to the model when required by policy.
  • User transparency: explain refusals and provide safe alternatives.

Exam Tip: When you see “multi-tenant,” “role-based access,” or “internal HR/finance documents,” prioritize answers that enforce authorization at retrieval time (metadata filters, ACL trimming) rather than “post-filtering the final answer.” Post-filtering is too late because the model may have already seen unauthorized content.

Section 3.6: Evaluate and optimize: quality metrics, A/B tests, caching, cost

Evaluation and optimization is where many candidates lose points because they focus only on “prompt quality” and ignore system metrics. AI-102 expects you to manage latency, throughput, and cost while preventing quality regressions. Define success metrics aligned to the task: grounded answer rate, citation correctness, factual consistency, extraction accuracy, refusal rate, and user satisfaction. For RAG, also track retrieval metrics (recall@K, precision@K, and how often the answer uses retrieved sources).

Regression testing: keep a fixed evaluation set of representative prompts/documents and compare outputs across prompt changes, model version changes, and retrieval tuning. Many production failures come from silent changes (new chunking strategy, updated embeddings, model upgrade) that degrade answers. The exam frequently hints at “it used to work” scenarios—choose answers that introduce repeatable tests and monitoring, not just ad-hoc prompt edits.
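
A sketch of such a regression gate over a fixed test set; run_model, the example cases, and the thresholds are placeholders to replace with your own pipeline:

    # Sketch: run a fixed evaluation set before deploying a prompt/model change and
    # fail the gate if quality or operational metrics regress. run_model() and the
    # thresholds are placeholders for your own pipeline.
    import time

    EVAL_SET = [
        {"prompt": "Summarize doc 17", "must_contain": "refund policy"},
        {"prompt": "What is the SLA for P1 tickets?", "must_contain": "4 hours"},
    ]

    def run_gate(run_model, max_p95_ms=2000, max_avg_tokens=800, min_pass_rate=0.9):
        latencies, tokens, passes = [], [], 0
        for case in EVAL_SET:
            start = time.perf_counter()
            answer, tokens_used = run_model(case["prompt"])   # returns (text, token count)
            latencies.append((time.perf_counter() - start) * 1000)
            tokens.append(tokens_used)
            passes += case["must_contain"].lower() in answer.lower()

        p95 = sorted(latencies)[int(0.95 * (len(latencies) - 1))]  # approximate on small sets
        assert passes / len(EVAL_SET) >= min_pass_rate, "Quality regression"
        assert p95 <= max_p95_ms, "Latency regression"
        assert sum(tokens) / len(tokens) <= max_avg_tokens, "Token/cost regression"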

Latency and cost levers: reduce tokens (shorter context, smaller max output), cache frequent results (especially for deterministic system prompts + identical retrieval), and consider model right-sizing. Caching is especially effective for stable FAQs or repeated tool results, but be careful with personalization and access control.

  • A/B tests: compare two prompts or retrieval strategies with the same traffic slice.
  • Streaming: improve perceived latency for long responses.
  • Batching: for embeddings and offline enrichment, batch requests to reduce overhead.

Exam Tip: If the scenario says “cost spiked” or “token usage is high,” the correct answer usually combines (1) context reduction (better retrieval/chunking), (2) output limits, and (3) caching. A common wrong choice is “increase context window,” which can worsen both cost and latency unless paired with better retrieval discipline.

Finally, treat evaluation as continuous: monitor production for drift (new document formats, new user intents) and update chunking, prompts, and safety thresholds with controlled rollouts and regression checks.

Chapter milestones
  • Model selection and prompt engineering fundamentals for Azure OpenAI
  • RAG design: grounding, retrieval, and context window management
  • Safety, content filters, and responsible AI for generative apps
  • Evaluation and optimization: latency, cost, quality, and regression testing
  • Exam-style practice set: generative solution design and troubleshooting
Chapter quiz

1. You are building an API that uses Azure OpenAI to generate receipts in a strict JSON schema (no extra keys). The calling system rejects responses that contain non-JSON text. You must minimize format errors without adding significant latency. Which approach should you implement first?

Show answer
Correct answer: Use structured output enforcement (JSON schema/response format) and set low temperature to reduce variability
Structured output/response format (schema-constrained output) is designed to enforce machine-readable JSON and reduces the common failure mode where models add prose or extra keys; lowering temperature further improves determinism. A larger model plus an instruction-only prompt still commonly leaks non-JSON text because instructions are not a hard constraint. RAG can help content grounding, but it does not reliably enforce output structure and adds tokens/latency; retrieving a template doesn’t prevent the model from emitting extra text.

2. A support chatbot must answer questions using an internal policy library and include citations. The policy corpus is 50,000 documents and changes daily. Users also ask follow-up questions in the same chat. You need to reduce hallucinations while staying within a limited context window. Which design best fits?

Show answer
Correct answer: Use Retrieval-Augmented Generation (RAG): query the index each turn, inject only top-ranked passages with source metadata, and keep conversation state compact (summaries/selected turns)
RAG is the standard pattern for grounding answers in a large, frequently changing corpus and for producing citations by passing retrieved chunks plus source identifiers into the prompt. Fine-tuning is slower to update, does not guarantee factual recall, and does not inherently provide citations; it also risks baking in outdated policies. Putting the entire corpus in the system prompt is infeasible due to context window limits and would be extremely costly and slow.

3. You are deploying a generative assistant for a bank. The app must block self-harm instructions and hate content, and it must not reveal customer PII in responses. The business requires consistent enforcement across prompts, regardless of user attempts to jailbreak. Which solution should you prioritize?

Show answer
Correct answer: Enable Azure OpenAI content filters/safety settings and implement additional application-layer checks/redaction for PII before returning responses
Certification-style best practice is to combine platform safety controls (content filters) with app-layer governance for sensitive data (PII detection/redaction, allowlists, policy checks). Prompts alone are not a reliable control against jailbreaks and do not guarantee consistent policy enforcement. A larger model may improve general compliance but is not a control mechanism and does not address PII leakage risk by itself.

4. A team reports that their RAG-based assistant sometimes answers using outdated information even though the latest documents are in the index. They also notice that retrieved passages are correct, but the model ignores them and responds from general knowledge. What change is most likely to improve grounding to the retrieved content?

Show answer
Correct answer: Adjust the prompt to clearly instruct: answer only from provided sources, include citations, and say you don’t know when sources are insufficient; ensure retrieved chunks are placed prominently in the message
When correct retrieval is present but the model ignores it, the issue is often the prompt/grounding strategy: explicit instructions, citation requirements, and careful placement/formatting of retrieved context typically improve adherence. Increasing temperature generally increases variability and can worsen hallucinations. Reducing retrieval to a single chunk can remove needed context and does not address the core problem of the model not being instructed (and shaped) to privilege supplied sources.

5. You operate a generative summarization service with strict cost and latency SLOs. After a prompt change, average tokens per request increased by 40%, cost rose, and P95 latency degraded. You need a process that detects this kind of degradation before production. What should you implement?

Show answer
Correct answer: An automated regression evaluation pipeline that tracks quality metrics plus operational metrics (token usage, cost per call, latency) across a fixed test set before deployment
Exam-aligned practice emphasizes evaluation/optimization with regression testing: a repeatable test set and automated gates for both quality and operational KPIs (latency, throughput, token counts, cost). Increasing max_tokens typically increases token usage and can worsen cost/latency. Switching to a larger model often increases cost and latency and does not prevent regressions caused by prompt changes.

Chapter 4: Implement an Agentic Solution

This chapter targets the AI-102 skills measured around building agentic applications: systems that can plan, call tools, observe outcomes, and iterate toward a goal while operating under constraints. On the exam, you’ll be tested less on “agent hype” and more on whether you can select the right architecture and controls: tool/function contracts, orchestration choices, memory/state design, and reliability and governance mechanisms that keep an agent safe, auditable, and cost-effective.

A practical way to frame an agent (and a common exam lens) is: Goal → Plan → Act (tool calls) → Observe (tool results) → Update state → Repeat. Your job is to implement that loop with the right boundaries: limiting tool permissions, validating tool inputs/outputs, handling errors, and deciding what the agent should remember. Many exam scenarios disguise these choices as “which component should you use?” questions—often between orchestration logic in your app vs. model-driven planning, and between short-lived session context vs. durable storage.

As you read, keep translating requirements into design decisions: Is the task deterministic or open-ended? Do we need approvals? What data can be stored? What failure modes matter? Those are the levers the exam expects you to pull.

Practice note for Agent design: goals, plans, tools, and constraints: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Orchestration and tool/function integration patterns: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Memory, state, and knowledge access for multi-step tasks: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Reliability engineering: guardrails, retries, and human-in-the-loop: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Exam-style practice set: agent workflows and control scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 4.1: Agent fundamentals: planning, acting, and observation loops
Section 4.2: Tools and functions: contracts, schemas, and error handling
Section 4.3: Orchestration patterns: single vs multi-agent, routing, delegation
Section 4.4: State and memory: session context, long-term memory, privacy
Section 4.5: Safety and governance for agents: permissions, policies, approvals
Section 4.6: Testing and reliability: evals, red-teaming basics, incident response

Section 4.1: Agent fundamentals: planning, acting, and observation loops

Agent design starts with specifying goals (what “done” means), plans (the intermediate steps), tools (what actions are allowed), and constraints (policies, time/cost limits, and data boundaries). The exam frequently probes whether you can separate a model’s generative reasoning from the application’s responsibility for control. In practice, an agent loop includes: (1) interpret user intent, (2) draft a plan or next action, (3) call a tool, (4) observe results, and (5) decide whether to continue or stop.

Two design choices show up often. First, who owns the plan: a “model-plans” agent lets the LLM propose steps, but your app still enforces constraints and validates outputs. A “workflow-first” agent uses a predefined state machine where the LLM fills in slots (like parameters), which is easier to test and safer for regulated workloads. Second, termination: you need clear stop conditions (goal achieved, max steps, max cost, or human approval required). Many production incidents come from missing termination logic—on the exam, that maps to “limit iterations,” “timeouts,” and “budgets.”
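
A minimal sketch of the loop with explicit stop conditions; propose_next_action and execute_tool are placeholders for your own orchestration and tool layer:

    # Sketch: the Goal -> Plan -> Act -> Observe loop with explicit termination.
    # propose_next_action() and execute_tool() are placeholders; the limits are the point.
    def run_agent(goal: str, propose_next_action, execute_tool,
                  max_steps: int = 8, max_cost_usd: float = 0.50):
        state = {"goal": goal, "observations": [], "cost": 0.0}
        for step in range(max_steps):                        # hard step limit
            action = propose_next_action(state)              # model proposes, app decides
            if action["type"] == "finish":
                return action["answer"]
            if action["type"] == "needs_approval":
                return {"status": "pending_human_approval", "action": action}
            result, cost = execute_tool(action)              # validated, permissioned call
            state["observations"].append(result)
            state["cost"] += cost
            if state["cost"] > max_cost_usd:                 # budget stop condition
                break
        return {"status": "stopped", "reason": "step or cost limit reached"}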

Exam Tip: If a scenario emphasizes compliance, predictability, or auditability, prefer explicit workflow/state-machine orchestration and constrained toolsets over free-form autonomous planning.

  • Goals: define success criteria and expected output format.
  • Constraints: step limits, timeouts, tool allowlists, and data access boundaries.
  • Observations: tool results must be treated as data, not trusted reasoning.

Common trap: assuming the model can “remember” or “self-correct” without external state. On AI-102, memory and state are explicit engineering tasks; the loop must persist what matters and discard what doesn’t.

Section 4.2: Tools and functions: contracts, schemas, and error handling

Tools/functions are how an agent affects the world: calling APIs, querying Azure AI Search, writing tickets, sending emails, or executing business operations. The exam focuses on correct contracts: define what the function does, required parameters, parameter types, allowed ranges, and what errors look like. If you use function calling, treat the schema as an interface specification, not documentation—tight schemas reduce hallucinated parameters and make tool calls reliably parseable.

Error handling is an agent’s “muscle.” You should implement: input validation before calling tools, output validation after tool returns, and well-defined retry behavior. A robust pattern is: (1) attempt tool call, (2) classify errors (transient vs. permanent), (3) retry with backoff for transient failures, and (4) fall back or escalate for permanent errors. For example, a 429/503 suggests backoff; a 400 with schema mismatch suggests you must reformat parameters rather than retry blindly.
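
A sketch of that pattern; call_tool and its return shape are placeholders, and the status-code classification is an assumption to tune:

    # Sketch: classify tool errors and retry only transient ones with backoff.
    # call_tool() and its (status, payload) return shape are placeholders.
    import random
    import time

    TRANSIENT_STATUS = {429, 500, 502, 503, 504}

    def call_with_retries(call_tool, args: dict, max_attempts: int = 4):
        for attempt in range(max_attempts):
            status, payload = call_tool(args)
            if status < 400:
                return payload
            if status in TRANSIENT_STATUS and attempt < max_attempts - 1:
                time.sleep((2 ** attempt) + random.random())  # exponential backoff + jitter
                continue
            # Permanent errors (e.g., 400 schema mismatch): don't retry blindly;
            # surface structured details so the orchestrator can reformat or escalate.
            raise RuntimeError(f"Tool failed permanently with status {status}: {payload}")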

Exam Tip: When you see “tool returns malformed data” or “agent calls API with wrong fields,” the best answer usually includes strict JSON schema, server-side validation, and safe failure behavior (don’t let the model invent missing required values).

  • Parameter hygiene: enforce enums for categorical choices (e.g., region, priority).
  • Idempotency: design tool actions so retries don’t duplicate side effects (use request IDs).
  • Observability: log tool name, arguments (redacted), latency, and status codes for debugging.

Common trap: letting the model “handle” exceptions in natural language only. The correct engineering approach is to surface structured error details to the orchestration layer, then decide whether to retry, ask the user for missing inputs, or route to human review.

Section 4.3: Orchestration patterns: single vs multi-agent, routing, delegation

Orchestration is the control plane for agent behavior: deciding which model prompt, which tool, and which sequence of steps to use. AI-102 questions often present a requirement like “support multiple task types” or “use specialist skills” and ask you to choose between a single general agent, a router pattern, or a multi-agent design. A single-agent approach is simpler and easier to govern; you add tools and instructions but keep one loop. A router pattern classifies the request and dispatches it to a specialized workflow (e.g., “search-and-summarize” vs. “create-support-ticket”). A multi-agent pattern delegates subproblems to specialized agents (planner, researcher, writer), but increases complexity, latency, and failure surface area.
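
A minimal router sketch; classify_intent and the workflow handlers are placeholders, and the point is the separation of tool scopes per workflow:

    # Sketch: router pattern: classify the request, then dispatch to a scoped workflow.
    # classify_intent() and the workflow handlers are placeholders.
    def handle_search_and_summarize(request: str) -> str:
        return "summary with citations"                      # read-only tool scope
    def handle_create_support_ticket(request: str) -> str:
        return "ticket created after approval"               # write scope + approval gate

    WORKFLOWS = {
        "search_and_summarize": handle_search_and_summarize,
        "create_support_ticket": handle_create_support_ticket,
    }

    def route(request: str, classify_intent) -> str:
        intent = classify_intent(request)                    # keyword rules, classifier, or LLM
        workflow = WORKFLOWS.get(intent)
        if workflow is None:
            return "I can't help with that request."         # safe default, no tool access
        return workflow(request)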

Delegation requires explicit boundaries: what context each agent receives, what outputs are accepted, and who has authority to call sensitive tools. A strong exam answer typically mentions least privilege: the “writer” agent shouldn’t have permission to send emails; only the “action” agent does after approvals.

Exam Tip: If the prompt mentions “different departments,” “different policies,” or “different tool permissions,” think router or multi-agent with separated tool scopes—not one omnipotent agent.

  • Routing signals: intent classification, keyword rules, or embedding similarity.
  • Delegation outputs: require structured outputs (e.g., JSON with citations) before downstream actions.
  • Concurrency: multi-agent can parallelize research steps, but watch rate limits and costs.

Common trap: choosing multi-agent when a router-to-workflows solution is safer. On the exam, prefer the simplest architecture that meets requirements, especially when governance and predictability are emphasized.

Section 4.4: State and memory: session context, long-term memory, privacy

State answers: “What does the agent need to know right now to finish the task?” Memory answers: “What can it reuse later?” AI-102 expects you to distinguish short-lived session context (conversation turns, current plan, last tool results) from long-term memory (user preferences, prior cases, durable facts), and to handle privacy correctly.

Session state is typically stored in your app tier or a low-latency store and should be bounded: keep only what’s needed for the next steps to control token costs and reduce leakage risk. Long-term memory should be explicit and justifiable: store stable preferences (“use metric units”), prior decisions, or curated summaries—not raw transcripts by default. When you do store content, consider encryption, access control, retention policies, and user consent.
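
A small sketch of bounded session state; summarize is a placeholder (for example, an LLM call) and the turn limit is an assumption:

    # Sketch: keep a rolling summary plus the last few turns instead of the full
    # transcript. summarize() is a placeholder for your own summarization step.
    MAX_RECENT_TURNS = 6

    def update_session(session: dict, new_turn: dict, summarize) -> dict:
        turns = session.get("recent_turns", []) + [new_turn]
        if len(turns) > MAX_RECENT_TURNS:
            overflow, turns = turns[:-MAX_RECENT_TURNS], turns[-MAX_RECENT_TURNS:]
            session["summary"] = summarize(session.get("summary", ""), overflow)
        session["recent_turns"] = turns                      # bounded token footprint
        return session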

Exam Tip: If the stem mentions PII, regulatory requirements, or “should not store user prompts,” the correct design is usually ephemeral session state + redaction + minimal retention, with long-term memory limited to user-approved, non-sensitive summaries.

  • Memory sourcing: use Azure AI Search or a database for durable retrieval rather than stuffing everything into the prompt.
  • Privacy controls: redact secrets/PII before logging; apply RBAC to memory stores.
  • Grounding: retrieve authoritative facts (RAG) rather than relying on “remembered” model behavior.

Common trap: conflating memory with RAG. Retrieval is for authoritative external knowledge; memory is for user- or agent-specific continuity. On the exam, pick retrieval for “latest policies” and memory for “user’s recurring preferences,” but apply strict governance to both.

Section 4.5: Safety and governance for agents: permissions, policies, approvals

Agent safety is primarily about controlling actions. A chatbot that only answers questions has a different risk profile than an agent that can modify data, send messages, or trigger workflows. AI-102 scenarios frequently ask how to prevent unauthorized actions, data exfiltration, or prompt injection leading to dangerous tool calls. Your design should implement least privilege at multiple layers: tool allowlists, API scopes, managed identities, and role-based access control on underlying resources.

Policy enforcement should be externalized where possible: use centralized policy and approvals rather than embedding rules only in prompts. For sensitive operations (e.g., “delete records,” “send to all users,” “approve refund”), implement human-in-the-loop approvals or multi-step confirmations. Additionally, constrain tools with server-side checks (e.g., user can only access their tenant’s data) so even if the model is manipulated, the tool cannot overreach.
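
A minimal authorization sketch; the tool names and user fields are illustrative, and deny-by-default is the design point:

    # Sketch: gate write/sensitive tools behind server-side checks and a recorded
    # human approval; the model can request the action, but policy decides.
    READ_ONLY_TOOLS = {"search_docs", "get_customer_record"}
    APPROVAL_REQUIRED_TOOLS = {"update_customer_record", "send_email"}

    def authorize_tool_call(tool: str, user: dict, approved: bool) -> bool:
        if tool in READ_ONLY_TOOLS:
            return True
        if tool in APPROVAL_REQUIRED_TOOLS:
            # Server-side checks: scoped permission AND a recorded human approval.
            return user.get("can_write", False) and approved
        return False                                         # unknown tools are denied by default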

Exam Tip: If a question mentions “agent should not execute unless…”, the best answer usually combines (1) explicit approval gates, (2) permission-scoped credentials, and (3) server-side authorization checks—prompts alone are not sufficient controls.

  • Prompt injection defense: treat retrieved content as untrusted; never let documents override tool policy.
  • Action gating: require confirmations with a summary of intended effects.
  • Auditability: log who requested the action, what tool was called, and what data was accessed (with redaction).

Common trap: overreliance on content filters for action safety. Content safety helps with outputs; governance for agents must also control capabilities and permissions.

Section 4.6: Testing and reliability: evals, red-teaming basics, incident response

Reliability engineering for agents means making behavior repeatable enough to trust in production. The exam focuses on guardrails (schema validation, tool constraints), retries/backoff, monitoring, and a basic evaluation strategy. You should test: (1) correctness of tool selection, (2) correctness of tool arguments, (3) grounding/citation behavior when using retrieval, and (4) refusal/approval behavior for restricted actions. Automated evals can measure task success rates and policy compliance across a regression suite of representative prompts.

Red-teaming basics are also relevant: attempt prompt injection, jailbreaks, and data-leak scenarios, and verify the agent fails safely (refuses, requests approval, or limits tool scope). When failures occur, incident response should include rapid rollback (feature flags), tighter tool permissions, updated allowlists/denylists, and expanding test cases to prevent recurrence.

Exam Tip: If the problem statement includes “intermittent failures” or “rate limits,” expect answers involving retries with exponential backoff, circuit breakers, and idempotent tool design. If it includes “unexpected actions,” expect approval gates, tool allowlists, and better logging/auditing.

  • Telemetry: capture step count, tool latency, error types, and completion reasons (stopped, max steps, approved, refused).
  • Regression testing: keep a fixed set of prompts and expected structured outputs to detect drift.
  • Runbooks: define what to do when the agent loops, leaks data, or triggers a restricted tool.

Common trap: treating evaluation as subjective. The exam leans toward measurable checks: schema validity, citation presence, policy decisions, and tool-call success—things you can automate and monitor continuously.

Chapter milestones
  • Agent design: goals, plans, tools, and constraints
  • Orchestration and tool/function integration patterns
  • Memory, state, and knowledge access for multi-step tasks
  • Reliability engineering: guardrails, retries, and human-in-the-loop
  • Exam-style practice set: agent workflows and control scenarios
Chapter quiz

1. A company is building an agent that can create support tickets. Requirements: the agent must call a "CreateTicket" tool, but only after validating required fields and ensuring the user explicitly confirmed the ticket submission. Which design best meets the requirement?

Show answer
Correct answer: Implement app-side orchestration that validates tool arguments, requires an explicit confirmation step, and only then allows the tool call
Correct: App-side orchestration provides deterministic control over tool invocation (validation and explicit confirmation/human-in-the-loop gating) and is aligned with agent reliability and governance expectations. B is wrong because prompt-only controls are not reliable guardrails; the model can still call tools with malformed or unconfirmed inputs. C is wrong because persistence and async execution do not enforce confirmation/validation at the point of action, and long-term memory is not a control mechanism.

2. You are implementing a multi-step agent workflow: (1) query an internal catalog, (2) compute a quote, (3) generate an email. The agent must persist progress so that if the process fails after step 2, it can resume without repeating expensive tool calls. What should you use?

Show answer
Correct answer: Persist workflow state and tool results in durable storage keyed by a workflow/session identifier
Correct: Durable state storage (e.g., database/table/blob) enables resumption and idempotent behavior by recording step completion and tool outputs. A is wrong because prompt context is session-scoped, size-limited, and can be lost or truncated; it is not reliable for resuming after failures. C is wrong because models do not retain runtime tool results from a prior execution; relying on training data is not the same as persisting workflow state.

3. A team is integrating several tools into an agent: "SearchDocs", "GetCustomerRecord", and "UpdateCustomerRecord". Security policy requires least privilege and prevention of unintended data changes. Which approach best aligns with this requirement?

Show answer
Correct answer: Provide the agent only read-only tools by default and require a separate approval gate before enabling any update tool calls
Correct: Least privilege plus explicit gating for write operations is a standard control for safe agentic systems; it reduces blast radius and supports auditable approvals. B is wrong because always-on write tools violate least privilege and increase risk of unintended changes. C is wrong because a prompt warning is not an enforceable control; you need permission boundaries and/or approvals, not guidance alone.

4. An agent uses a function/tool calling pattern to fetch inventory counts. During peak hours, the inventory API intermittently returns HTTP 429 (rate limit). The business requirement is to increase success rate while controlling costs and preventing infinite loops. What should you implement?

Show answer
Correct answer: Automatic retries with exponential backoff and a maximum retry limit; surface failure to a human or fallback after the limit is reached
Correct: Exponential backoff with caps is a reliability engineering pattern that improves success rates under transient failures while avoiding runaway cost/loops; escalation/fallback after limits supports safe operation. B is wrong because tight-loop retries amplify rate limiting and can cause infinite loops and excessive cost. C is wrong because eliminating retries reduces resiliency; controlled retries are expected for transient errors like 429/5xx.

5. A company wants an agent that answers HR policy questions. Requirements: the agent must cite the most current policy and avoid storing employee questions containing personal data beyond the active session. Which design best fits?

Show answer
Correct answer: Use retrieval (knowledge access) at runtime for policy documents and keep only short-lived session context; avoid writing PII-containing queries to long-term memory
Correct: Runtime retrieval ensures answers reflect the latest internal documents, and limiting memory to session context reduces unnecessary persistence of PII—both are common exam decisions around knowledge access and memory constraints. B is wrong because fine-tuning does not guarantee immediate currency and storing all questions conflicts with the requirement to avoid persisting PII beyond the session. C is wrong because public browsing is not an appropriate source for internal HR policy and increases governance risk.

Chapter 5: Computer Vision, NLP, and Knowledge Mining

This chapter targets the AI-102 skills that repeatedly show up in scenario questions: choosing the right vision capability (image analysis vs OCR vs document extraction), implementing NLP tasks (extraction, classification, summarization), and building knowledge mining pipelines with Azure AI Search (indexing, enrichment, skillsets). The exam rarely asks you to recite API names; it asks you to design an end-to-end solution under constraints like latency, cost, governance, security, and accuracy—then identify the best Azure service combination.

As you read, keep a “pipeline mindset.” Most real solutions on the exam chain together: content ingestion (Blob/ADLS/SharePoint), extraction (OCR/document intelligence), enrichment (language/vision skills), indexing (keyword + vector), and then an app layer that queries Search or orchestrates tools via an agent. Traps typically involve picking a model/service that can’t meet the input format, throughput, or retrieval requirement.

Exam Tip: When the scenario mentions “search across documents” with “filters/facets,” think Azure AI Search first. When it mentions “extract structured fields from forms,” think Document Intelligence (formerly Form Recognizer). When it mentions “describe an image / detect objects,” think Azure AI Vision image analysis. Then decide whether you need real-time calls, batch enrichment, or both.

Practice note for Computer vision solutions: image analysis and OCR design choices: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for NLP solutions: extraction, classification, summarization, and conversation: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Knowledge mining with Azure AI Search: indexing and enrichment pipelines: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Information extraction workflows: documents, entities, and metadata at scale: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Exam-style practice set: CV/NLP/Search integrated scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 5.1: Implement computer vision solutions (analysis, OCR, document flows)
Section 5.2: Vision architecture decisions: latency, batch vs real-time, storage
Section 5.3: Implement NLP solutions: key phrase/entity extraction and classification
Section 5.4: Conversational patterns and summarization use cases (prompt + NLP)
Section 5.5: Implement knowledge mining: Azure AI Search indexes, ingestion, vector + keyword
Section 5.6: Enrichment and information extraction: skillsets, projections, indexing strategy

Section 5.1: Implement computer vision solutions (analysis, OCR, document flows)

AI-102 tests whether you can map a vision requirement to the correct capability and output type. “Image analysis” tasks include generating captions, tags, object detection, and sometimes dense captions depending on the feature set. “OCR” is specifically about extracting text and layout from images or scanned documents. “Document flows” usually imply multi-page PDFs, forms, invoices, receipts, or contracts where you must turn unstructured pages into structured fields and searchable content.

Design starts with the input and the expected output. If the app needs searchable text from images, OCR is non-negotiable. If the app needs key-value pairs (invoice number, totals, vendor), use a document extraction approach (Document Intelligence prebuilt models or custom models) rather than trying to regex raw OCR. For photos where the user wants “what’s in the picture,” image analysis is the better match.
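
A minimal extraction sketch, assuming the azure-ai-formrecognizer package (Document Intelligence) and its prebuilt invoice model; the endpoint, key, and file path are placeholders:

    # Sketch: extract structured invoice fields with Document Intelligence rather
    # than regexing raw OCR text. Assumes azure-ai-formrecognizer; values are placeholders.
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.formrecognizer import DocumentAnalysisClient

    client = DocumentAnalysisClient(endpoint="https://<doc-intel-resource>.cognitiveservices.azure.com/",
                                    credential=AzureKeyCredential("<key>"))

    with open("invoice.pdf", "rb") as f:
        poller = client.begin_analyze_document("prebuilt-invoice", document=f)
    result = poller.result()

    for doc in result.documents:
        vendor = doc.fields.get("VendorName")
        total = doc.fields.get("InvoiceTotal")
        print(vendor.value if vendor else None,
              total.value if total else None,
              vendor.confidence if vendor else None)         # confidence supports review loops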

  • Image analysis: captions/tags/object detection for photos; often used to enrich metadata for search and downstream automation.
  • OCR: convert pixels to text + bounding boxes; useful for indexing, highlighting, and compliance review.
  • Document extraction: maps to structured schemas; supports higher-quality field extraction than OCR alone.

Common trap: Selecting image analysis to “extract text from a PDF.” The exam expects you to recognize that OCR/document extraction is needed, and that PDFs may require a document-capable pipeline (including page handling) rather than a single-image endpoint.

Exam Tip: In scenario stems, keywords like “scanned,” “handwritten,” “invoice,” “receipt,” “multi-page,” “tables,” and “key-value pairs” point away from generic image analysis and toward OCR/document intelligence. Keywords like “detect people/vehicles,” “describe scene,” or “generate tags” point toward image analysis.

Section 5.2: Vision architecture decisions: latency, batch vs real-time, storage

Many AI-102 questions are architecture decisions disguised as feature questions. You must decide whether to run vision in real time (synchronous API call as the user uploads) or batch (asynchronous processing of a backlog). Real-time is appropriate when user experience requires immediate feedback—e.g., a mobile app reading text from a package label. Batch is appropriate when processing large archives—e.g., back-scanning a million PDFs for knowledge mining.

Latency and throughput drive the design. For batch pipelines, you often stage raw content in Azure Blob Storage or ADLS Gen2, trigger processing via Event Grid/Functions, and write extracted artifacts (text, JSON fields, thumbnails) back to storage for later indexing. For real-time, you may still store the original content for audit/reprocessing, but your primary success metric is request latency and reliability (timeouts, retries, idempotency).

  • Batch patterns: queue-based fan-out (Storage + Functions + Queue/Service Bus), backpressure, and retry policies; good for cost control and predictable processing.
  • Real-time patterns: API gateway + function/app service + vision endpoint; requires careful limits, caching, and graceful degradation.
  • Storage choices: Blob/ADLS for raw + enriched artifacts; consider immutable storage or versioning for governance.

Common trap: Treating OCR output as “the source of truth” and discarding originals. Exam scenarios often include compliance, audit, or reprocessing requirements; keeping original documents (and associating them with extracted metadata) is a safer design choice.

Exam Tip: If the stem mentions “cost,” “large backlog,” “daily ingestion,” or “indexing pipeline,” prefer batch enrichment. If it mentions “interactive,” “upload and instantly,” “mobile,” or “live camera,” prefer synchronous calls—then mitigate risk with timeouts, fallbacks, and storing for later reprocessing.

Section 5.3: Implement NLP solutions: key phrase/entity extraction and classification

NLP questions on AI-102 usually revolve around picking the right extraction or classification technique and knowing what the output is used for. Key phrase extraction produces topical phrases suitable for tagging and search facets. Entity recognition extracts typed entities (people, organizations, locations, dates, product identifiers) that can drive filtering, linking, or compliance checks. Classification assigns labels to text (e.g., “refund request,” “technical issue,” “legal hold”). The exam expects you to connect the NLP result to downstream actions: routing tickets, indexing fields, or triggering workflows.

For implementation, the typical pattern is: collect text (from documents or chat logs), clean/normalize it, run an NLP model, then persist outputs in a structured store or search index. In knowledge mining scenarios, those outputs become searchable fields and facets. In operational apps, those outputs become decision inputs (queue selection, priority, escalation).
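
A small enrichment sketch, assuming the azure-ai-textanalytics package; the endpoint, key, and sample text are placeholders:

    # Sketch: enrich text with key phrases and entities for indexing or routing.
    # Assumes azure-ai-textanalytics; endpoint and key are placeholders.
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics import TextAnalyticsClient

    client = TextAnalyticsClient(endpoint="https://<language-resource>.cognitiveservices.azure.com/",
                                 credential=AzureKeyCredential("<key>"))

    docs = ["Customer requests a refund for invoice 4417 shipped to Oslo in March."]

    key_phrases = client.extract_key_phrases(docs)[0].key_phrases            # tags/facets
    entities = [(e.text, e.category) for e in client.recognize_entities(docs)[0].entities]

    print(key_phrases)   # topical phrases suitable for search facets
    print(entities)      # typed entities, e.g. locations and dates, for filters and routing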

  • Extraction is about enriching content with structured metadata (entities, key phrases).
  • Classification is about selecting a category for automation and analytics.
  • Quality controls: language detection, confidence thresholds, and human review loops for low-confidence cases.

Common trap: Confusing summarization with key phrase extraction. Summaries are narrative compressions; key phrases are short tags. If the requirement is “generate tags” or “improve search facets,” key phrases/entities are the better fit.

Exam Tip: Look for the verb in the requirement: “tag,” “extract,” “identify,” “detect” → extraction. “route,” “categorize,” “assign label” → classification. Then verify the data shape needed by the next component (Search fields, filters, or workflow rules).

Section 5.4: Conversational patterns and summarization use cases (prompt + NLP)

AI-102 increasingly blends classic NLP tasks with Azure OpenAI prompting patterns. On the exam, “conversational patterns” often mean a chat experience that uses tools: retrieve knowledge, call a function, summarize results, and maintain state. Summarization use cases include meeting notes, customer support transcripts, case histories, or long documents where users need a brief synopsis plus citations.

A reliable pattern is: (1) retrieve relevant content (often via Azure AI Search), (2) ground the model with retrieved passages, (3) instruct the model to summarize with constraints (length, sections, tone), and (4) optionally run lightweight NLP to extract structured items (action items, entities, sentiment) for reporting. The exam aims to see that you can combine prompt design with deterministic post-processing when the output must be structured.

Common trap: Using only a prompt to produce “structured” results without validation. If the scenario emphasizes auditability, downstream automation, or strict schemas, the better answer usually includes function calling/JSON schema enforcement and/or post-processing plus retries.

Exam Tip: When you see “conversation must remember context,” separate “short-term context” (chat history window) from “long-term memory” (stored summaries, embeddings, or key facts). The correct design frequently summarizes older turns and stores them, rather than passing the entire conversation every time (cost/latency/token limits).

Also watch for safety and policy requirements: if the bot interacts with users, the exam may require content filtering, PII handling, or restricted tool access. Those are controls that sit around the conversational workflow, not after the fact.

Section 5.5: Implement knowledge mining: Azure AI Search indexes, ingestion, vector + keyword

Knowledge mining is a high-frequency AI-102 scenario: ingest heterogeneous content, enrich it, and make it searchable. Azure AI Search is the core service, and the exam expects you to understand index design and ingestion choices. Indexes contain fields with attributes that control behavior (searchable, filterable, sortable, facetable, retrievable). A common objective is to enable both keyword search (lexical match, filters, facets) and vector search (semantic similarity using embeddings). Many solutions need a hybrid approach: keyword for precision and compliance queries (“find ‘termination for cause’”), vector for “find similar” and natural-language questions.

Ingestion typically uses indexers that connect to data sources like Blob Storage, ADLS Gen2, or databases. Indexers can run on schedules for incremental updates. When the content is documents, you often split text into chunks before embedding, so each chunk becomes a searchable unit with its own vector and metadata (document id, page, section, security labels).
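
A hybrid query sketch, assuming the azure-search-documents package (11.4 or later), an index with a vector field named contentVector and filterable metadata, and an embed helper that returns the query embedding; all names are placeholders:

    # Sketch: hybrid query (keyword + vector) with a metadata filter.
    # Assumes azure-search-documents 11.4+; index/field names and embed() are placeholders.
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from azure.search.documents.models import VectorizedQuery

    search_client = SearchClient(endpoint="https://<search-service>.search.windows.net",
                                 index_name="engineering-docs",
                                 credential=AzureKeyCredential("<query-key>"))

    def hybrid_search(question: str, embed, project: str):
        results = search_client.search(
            search_text=question,                            # lexical/keyword side
            vector_queries=[VectorizedQuery(vector=embed(question),
                                            k_nearest_neighbors=5,
                                            fields="contentVector")],  # semantic side
            filter=f"project eq '{project}'",                # filterable metadata field
            select=["doc_id", "page", "content"],
            top=5,
        )
        return [{"doc_id": r["doc_id"], "page": r["page"], "text": r["content"]}
                for r in results]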

  • Keyword retrieval: excels with exact terms, filters, and structured fields.
  • Vector retrieval: excels with paraphrases and concept matching; requires embeddings and vector fields.
  • Hybrid: often the best exam answer when the scenario wants both accuracy and natural language discovery.

Common trap: Proposing “just use a database” for enterprise search requirements. If the scenario includes facets, relevance ranking, full-text queries, and enrichment pipelines, Azure AI Search is the expected platform component.

Exam Tip: If the stem mentions “permissions per document,” look for security trimming patterns (indexing ACLs, filtering by user/group claims) and ensure your index has filterable fields to enforce access at query time.

Section 5.6: Enrichment and information extraction: skillsets, projections, indexing strategy

This is where many candidates lose points: they overlook that Azure AI Search does more than ingest text. It can run enrichment via skillsets and shape output via projections. A skillset is a pipeline of cognitive skills (built-in and custom) applied during indexing: OCR, language detection, entity recognition, key phrase extraction, text splitting, embedding generation, and custom Web API skills. The output is an enriched document tree that you must map into index fields.

Projections matter when you need to index at “chunk level” for RAG: instead of one index document per file, you project child items (pages/sections/chunks) into separate index documents with shared metadata. This improves retrieval quality and citation granularity. The indexing strategy should also preserve traceability: store references to the original blob path, page number, and offsets so you can show citations and re-run enrichment when models or schemas change.

  • Skillset design: order matters (OCR before NLP; splitting before embeddings); handle failures with retry and dead-letter patterns externally.
  • Field mapping: map enriched outputs (entities, key phrases, normalized text) into filterable/facetable/searchable fields appropriately.
  • Versioning: plan for reindexing when chunking logic or embedding models change.

Common trap: Indexing only the final summary and discarding extracted entities/metadata. Exam scenarios often require faceted navigation (“filter by vendor,” “filter by date range,” “show entities”), which needs structured fields, not just a summary blob.

Exam Tip: If the question emphasizes “at scale” and “documents,” think end-to-end: OCR/document extraction → enrichment skillset → projections for chunks → index fields for filters/facets → hybrid vector+keyword queries. The best answer is usually the one that maintains lineage (document id/page) and enables both precision filtering and semantic retrieval.

Chapter milestones
  • Computer vision solutions: image analysis and OCR design choices
  • NLP solutions: extraction, classification, summarization, and conversation
  • Knowledge mining with Azure AI Search: indexing and enrichment pipelines
  • Information extraction workflows: documents, entities, and metadata at scale
  • Exam-style practice set: CV/NLP/Search integrated scenarios
Chapter quiz

1. A company stores millions of scanned PDF invoices in Azure Blob Storage. Users must search across invoices by vendor name, invoice number, and total amount, and refine results using facets (for example, vendor) and filters (for example, date range). You must extract these fields from the documents and make them searchable with low operational overhead. Which solution best meets the requirements?

Show answer
Correct answer: Create an Azure AI Search index with an indexer for Blob Storage and an enrichment pipeline that uses Document Intelligence to extract structured fields into the index.
Azure AI Search is designed for document search with filters/facets, and indexers + skillsets support at-scale enrichment. Document Intelligence is the right choice for extracting structured fields from forms/invoices into searchable fields. Option B is wrong because image tagging/captioning is not reliable for invoice field extraction and Cosmos DB is not a search engine with rich faceting like Search. Option C is wrong because doing extraction at query time is high-latency/costly and Azure AI Language cannot reliably parse scanned PDFs without OCR/document extraction, and it does not provide faceting/filtering search capabilities by itself.

2. You are designing an API that receives a photo taken from a mobile device and must return a short description and detected objects in under 500 ms for an interactive user experience. The images can include street scenes and products. Which approach should you choose?

Show answer
Correct answer: Use Azure AI Vision image analysis in a real-time call to generate captions and detect objects.
For real-time image understanding (captions/object detection) under strict latency, Azure AI Vision image analysis is the intended capability. Option B is wrong because Document Intelligence targets structured document/form extraction, not general scene description or object detection. Option C is wrong because Azure AI Search enrichment pipelines are typically asynchronous/batch-oriented for knowledge mining; waiting for indexing/enrichment to complete will not meet sub-second interactive latency.

3. A support team wants to automatically route incoming customer emails into one of five categories (Billing, Technical, Account, Sales, Other). The model must return a single label per email and will be called in real time from a ticketing system. Which NLP capability best fits the requirement?

Show answer
Correct answer: Text classification using Azure AI Language (custom or built-in classification) to assign one category label per message.
The requirement is supervised (or predefined) labeling into a small set of classes, which maps directly to text classification in Azure AI Language. Option B is wrong because entity extraction finds spans like names/organizations, not a single routing label; entities may be absent or not correlate to category. Option C is wrong because summarization produces condensed text, not a deterministic label, and still requires an additional classification step (manual mapping fails the automation requirement).

4. You are implementing a knowledge mining solution for an engineering portal. Content includes Word documents and PDFs in SharePoint and Azure Blob Storage. Users need keyword search plus semantic/vector search over extracted text, and results must include metadata filters such as project, author, and last modified date. Which design best meets these requirements?

Show answer
Correct answer: Use Azure AI Search with connectors/indexers for the sources, an enrichment skillset to extract text and metadata, and an index configured for both lexical and vector search with filterable fields.
Azure AI Search is the correct service for enterprise search across multiple sources with filters/facets, and it can combine keyword and vector retrieval in one index while storing filterable metadata fields. Option B is wrong because Table Storage is not designed for efficient similarity search or faceted filtering across large corpora, and the approach recreates indexing/search features already provided by Search. Option C is wrong because sending all documents per query is not scalable/cost-effective, and LLMs do not natively enforce structured filters/facets over a corpus without a retrieval/index layer.

5. A company processes 500,000 mixed documents per day (scanned forms, printed letters, and handwritten notes). They must extract structured fields from forms when possible, and otherwise extract entities (people, locations, dates) from the free text. The solution must be scalable and support downstream search and analytics. Which workflow is most appropriate?

Show answer
Correct answer: Create a pipeline that uses Document Intelligence for form/document extraction, applies OCR when needed, then runs Azure AI Language entity extraction on the extracted text, and indexes outputs in Azure AI Search.
At high scale, you want an offline/batch extraction and enrichment pipeline: Document Intelligence for structured form fields (and OCR/document reading), then Language for entity extraction on text, and Azure AI Search for indexing/retrieval and analytics-friendly structured fields. Option B is wrong because image captions are not a substitute for accurate field extraction, and sentiment analysis does not produce required structured fields/entities. Option C is wrong because query-time processing is not scalable for large corpora, leads to high latency and repeated compute cost, and does not provide the indexed filters/facets expected in knowledge mining solutions.
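One stage of such a pipeline could be sketched as below, using the azure-ai-formrecognizer and azure-ai-textanalytics SDKs; the endpoints, keys, and the downstream push to Azure AI Search are placeholders, and long documents would need chunking before entity extraction:

```python
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

doc_client = DocumentAnalysisClient("https://<doc-intel>.cognitiveservices.azure.com",
                                    AzureKeyCredential("<key>"))   # placeholder
lang_client = TextAnalyticsClient("https://<language>.cognitiveservices.azure.com",
                                  AzureKeyCredential("<key>"))     # placeholder

def enrich_document(path: str) -> dict:
    """Extract structured fields plus free-text entities from one document."""
    with open(path, "rb") as f:
        analysis = doc_client.begin_analyze_document("prebuilt-document", document=f).result()

    text = analysis.content  # OCR/reading is applied automatically for scans and handwriting
    fields = {kv.key.content: kv.value.content
              for kv in analysis.key_value_pairs if kv.key and kv.value}

    # Language service size limits apply; chunk very long text in a real pipeline.
    entities = [{"text": e.text, "category": e.category}
                for e in lang_client.recognize_entities([text[:5000]])[0].entities]

    # The returned record would then be uploaded to an Azure AI Search index downstream.
    return {"fields": fields, "entities": entities, "content": text}
```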

Chapter 6: Full Mock Exam and Final Review

This chapter is your transition from learning to scoring. AI-102 doesn’t reward trivia as much as it rewards correct engineering decisions under constraints: security boundaries, latency, cost, reliability, and governance. The goal of the full mock exam is not to “see what you get,” but to surface patterns: what you misread, what you over-assume, and what you fail to connect across domains (Azure OpenAI + AI Search + monitoring + identity).

Use this chapter as a guided rehearsal. You’ll run two mock blocks, perform weak-spot analysis, and finish with rapid recall drills mapped to exam outcomes: planning and managing Azure AI solutions; implementing generative AI solutions and RAG patterns; building agentic orchestration with tools/functions and memory; implementing CV and NLP solutions; and implementing knowledge mining with Azure AI Search indexing and enrichment.

Exam Tip: Treat every item like a real customer ticket. Ask: “What is the constraint?” (data residency, private network, cost ceiling, safety requirement, latency, multilingual input). The correct answer is usually the option that satisfies the constraint with the least added complexity—unless the prompt explicitly demands customization, deterministic output, or strict isolation.

Practice note (applies to every milestone in this chapter: Mock Exam Part 1, Mock Exam Part 2, Weak Spot Analysis, Exam Day Checklist, and the final domain review with rapid recall drills): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
  • Section 6.1: Mock exam instructions and pacing strategy (AI-102 style)
  • Section 6.2: Mock Exam Part 1 (mixed domains, scenario heavy)
  • Section 6.3: Mock Exam Part 2 (case study + troubleshooting items)
  • Section 6.4: Answer review method: rationales, flags, and notes that stick
  • Section 6.5: Final review map by domain (what to memorize vs what to reason)
  • Section 6.6: Exam day operations: environment checks, time management, retake plan

Section 6.1: Mock exam instructions and pacing strategy (AI-102 style)

AI-102 questions commonly blend design and implementation details. You’ll see scenario prompts that implicitly test: identity and network access, data flow (ingest → enrich → index → retrieve), evaluation and safety controls, and operational readiness (monitoring, cost, deployment). Your pacing must prevent over-investing in early items while still reading carefully enough to catch constraints.

Run your mock in two passes. Pass 1 is “efficient correctness”: answer what you can in 60–90 seconds each and flag anything that requires rereading, computation, or comparing two nearly-correct services. Pass 2 is “constraint resolution”: return to flagged items and resolve them by mapping each requirement to a feature (e.g., private endpoint support, managed identity, content filtering, semantic ranker, vector search, skillset enrichment).

  • Start by skimming the prompt for hard requirements: private networking, customer-managed keys, region restrictions, PII, regulated industry, or guaranteed citations.
  • Identify the architecture layer being tested: governance/ops, generation/RAG, agents/tools, CV, NLP, or search/enrichment.
  • Eliminate options that violate constraints (e.g., public endpoints when private is required; embedding-only when full generation is required; OCR when object detection is requested).

Exam Tip: When two options both “work,” choose the one that uses the Azure-native service intended for that outcome and minimizes custom glue code. AI-102 frequently rewards service-fit (e.g., Azure AI Search skillsets for enrichment rather than hand-rolled ETL, unless you need bespoke transformations).

Common pacing trap: rereading the entire scenario repeatedly. Instead, write a one-line “constraint summary” on scratch paper (e.g., “private, PII, citations, low latency, multilingual”). Use it to test each option quickly.

Section 6.2: Mock Exam Part 1 (mixed domains, scenario heavy)

Mock Exam Part 1 should feel like a cross-domain sprint: short scenarios that jump between Azure OpenAI, Azure AI Search, Vision, Language, and operational controls. Your task is to build “service instinct”—the ability to map a requirement to the correct product feature without second-guessing.

Expect mixed-domain prompts such as: deploying a RAG app with governance requirements; choosing between document-level vs chunk-level retrieval; configuring vector search + semantic ranking; selecting OCR vs document intelligence for forms; and applying responsible AI controls for content. In agentic scenarios, the exam looks for correct tool/function boundaries, state management (conversation history vs long-term memory), and reliability techniques (timeouts, retries, idempotency, and safe tool invocation).

  • Generative AI and RAG: Watch for the citation requirement. If the user must see “where the answer came from,” prioritize retrieval grounded in Azure AI Search results and pass sources into prompts; do not rely on the model’s memory.
  • Knowledge mining: If the prompt mentions PDFs, images, or scans, you likely need OCR extraction during indexing (built-in skills or integrated document processing) rather than plain text ingestion.
  • Vision vs Language: Describing objects/scene is vision; extracting entities/PII/sentiment is language. Tricky prompts combine both (e.g., receipts: OCR + entity extraction).

Exam Tip: For security and governance scenarios, the “right” answer often includes managed identity for service-to-service access, private endpoints for data-plane isolation, and logging/monitoring via Azure Monitor/Application Insights—these are high-frequency exam themes.

Common trap: assuming “more advanced” equals “more correct.” The exam frequently rewards simpler choices when they satisfy requirements (e.g., using built-in skillsets rather than custom skills, or using Azure OpenAI content filtering and prompt engineering before building a complex moderation pipeline).

Section 6.3: Mock Exam Part 2 (case study + troubleshooting items)

Mock Exam Part 2 should simulate a case study and troubleshooting set, where you must keep multiple requirements in your head while diagnosing failures. Case studies often hide “gotchas” in constraints: data must remain in a private VNet; customer-managed keys are required; logs cannot contain PII; or latency targets force caching and smaller context windows.

For generative AI case studies, you’re typically asked to improve answer quality and reliability. Quality improvements map to: better chunking strategy, hybrid retrieval (vector + keyword), semantic ranking, query rewriting, grounding instructions, and evaluation metrics. Reliability improvements map to: fallbacks (retrieval failure handling), tool call validation, rate limit backoff, and deterministic formatting. Safety improvements map to: content filtering, prompt shields, red-teaming, and structured output validation.
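As a concrete reference for the hybrid-retrieval lever, here is a query sketch with the azure-search-documents SDK; the index, vector field, and semantic configuration names are assumptions, and `user_question` and `question_embedding` are assumed inputs:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

search_client = SearchClient("https://<your-search-service>.search.windows.net",
                             "engineering-docs", AzureKeyCredential("<query-key>"))  # placeholders

results = search_client.search(
    search_text=user_question,                       # keyword (lexical) side of hybrid retrieval
    vector_queries=[VectorizedQuery(
        vector=question_embedding,                   # embedding of the same question, same model as the index
        k_nearest_neighbors=5,
        fields="content_vector",
    )],
    query_type="semantic",                           # apply semantic ranking to the fused results
    semantic_configuration_name="default-semantic",  # assumes a semantic configuration on the index
    top=5,
)

sources = [{"id": doc["id"], "content": doc["content"]} for doc in results]
```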

  • Troubleshooting RAG: If answers are irrelevant, check indexing (fields searchable/filterable), chunking granularity, embedding model consistency, and whether vector search is actually being used.
  • Troubleshooting agents: If tools are misused, enforce tool schemas, add tool-choice constraints, and implement confirmation steps for high-impact actions.
  • Ops troubleshooting: If costs spike, identify high-token prompts, excessive retrieval top-k, and unnecessary re-embedding or re-indexing; add caching and batch operations.

Exam Tip: In troubleshooting items, distinguish between “symptom fixes” and “root cause fixes.” AI-102 tends to prefer changes that align with the architecture layer responsible (e.g., fix retrieval configuration in Search rather than adding more prompt text to compensate for bad indexing).

Common case-study trap: ignoring deployment realities. If the scenario mentions staging/production, blue-green or canary, or model versioning, the correct answer will include controlled rollout and monitoring—especially for prompt changes and agent tool updates.

Section 6.4: Answer review method: rationales, flags, and notes that stick

Your score improves most during review, not during the mock. The goal is to convert every missed or guessed item into a reusable decision rule. Use a structured method: for each flagged question, write (1) the constraint, (2) the tested objective, (3) why the wrong option was tempting, and (4) the “trigger phrase” that should have guided you.

Build a “trap log” with categories that match AI-102 patterns:

  • Service confusion: mixing Azure AI Search vs Azure OpenAI “memory”; mixing Vision OCR vs Document Intelligence; mixing Language entity extraction vs custom ML.
  • Security omissions: forgetting managed identity, private endpoint, RBAC scopes, key management, or data exfiltration boundaries.
  • RAG mechanics: wrong chunk size, missing semantic configuration, ignoring filters, embedding mismatch, or using too-large top-k.
  • Agent controls: missing tool validation, lack of idempotency, unsafe tool execution, or no fallback when tools fail.

Exam Tip: Your review notes should be phrased as if-then rules: “If private networking is required, then verify each service supports private endpoints and that the data path stays private.” These rules become rapid recall drills in Section 6.5.

A common review trap is rewriting the textbook. Don’t. Instead, extract the decision hinge. Example hinge: “Need citations and grounding → retrieve from Search and pass sources; don’t rely on chat history.” This is the kind of note that sticks under time pressure.

Section 6.5: Final review map by domain (what to memorize vs what to reason)

This final domain review is your weak-spot analysis translated into a memorization plan. AI-102 rewards reasoning from constraints, but there are still items you should memorize because they show up as near-miss distractors.

Plan and manage an Azure AI solution: Memorize core governance controls (RBAC, managed identity, private endpoints, key vault integration, logging/metrics). Reason about tradeoffs: cost vs latency, isolation vs complexity, and monitoring placement (app + service telemetry). Exam Tip: If the scenario mentions “least privilege,” assume managed identity and scoped roles; if it mentions “no public internet,” assume private endpoints and network rules.

Generative AI with Azure OpenAI: Memorize RAG building blocks (embeddings, chunking, hybrid retrieval, citations, evaluation). Reason about prompt design and safety: the correct approach usually layers retrieval grounding + structured output + safety filters. Trap: treating prompt engineering as a substitute for retrieval quality.

Agentic solutions: Memorize tool/function concepts, schemas, and orchestration patterns (planner-executor, tool-calling, memory types). Reason about reliability: validate tool inputs, handle retries, and maintain state intentionally. Trap: storing everything as “memory” instead of storing only what’s useful and safe.

Computer vision: Memorize when to use OCR vs image analysis vs custom vision workflows. Reason from the artifact type: scanned docs imply OCR; product defects imply custom detection/classification. Trap: using generic image captions when you need structured extraction.

NLP solutions: Memorize common tasks (classification, extraction, summarization, conversational patterns). Reason about multilingual needs, PII handling, and evaluation. Trap: confusing entity extraction with document search relevance.

Knowledge mining: Memorize indexing concepts (data sources, indexers, skillsets, field attributes, semantic config, vector fields). Reason about enrichment pipelines and incremental updates. Trap: forgetting that retrieval quality starts at indexing design.

Section 6.6: Exam day operations: environment checks, time management, retake plan

On exam day, reduce variance. Your job is to execute your process, not to “feel confident.” Start with environment checks: stable internet, allowed testing space, and a quick system check if remote proctoring is used. Have scratch paper (if permitted) and write your constraint shorthand method at the start: “Read constraints → map to domain → eliminate violations → choose simplest compliant option.”

Time management: commit to a two-pass strategy. Pass 1 answers quickly and flags. Pass 2 resolves flags by re-reading only the relevant lines. Don’t let one ambiguous item steal minutes from five straightforward ones. If a question feels like two correct answers, re-check for a single word that changes everything (private/public, must/should, guarantee/best effort, audit/regulatory, multilingual, deterministic format).

  • Use flags for: multi-requirement scenarios, cost/latency tradeoffs, and anything involving private networking or identity (high-impact constraints).
  • Before submitting, do a “constraint sweep” on flagged items: did your chosen service violate any explicit requirement?

Exam Tip: If you’re stuck, anchor on the exam objective being tested. Ask: “Is this really a model question, a search/indexing question, or a governance question?” Misclassification is a common reason candidates pick plausible but wrong answers.

Retake plan (if needed): schedule within 7–14 days while context is fresh. Use your trap log to drive targeted practice: one domain per day, then a mini-mock. Focus on converting repeated misses into if-then rules. The fastest score gains come from fixing recurring misreads, not learning new features.

Chapter milestones
  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist
  • Final domain review and rapid recall drills
Chapter quiz

1. You are building a RAG application using Azure OpenAI and Azure AI Search. The customer requires: (1) data must not traverse the public internet, (2) least-privilege access for the app to both services, and (3) minimal operational complexity. Which design best meets the requirements?

Show answer
Correct answer: Deploy Azure OpenAI and Azure AI Search with private endpoints in a VNet, use Managed Identity from the app to access both services, and disable public network access on the services.
The recommended design satisfies private connectivity (private endpoints with public network access disabled) and least privilege via Microsoft Entra ID with managed identity. The other options still rely on public endpoints (even if restricted) and on static secrets/keys, which increases secret-management risk and fails the "must not traverse the public internet" constraint.
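The identity half of that answer can be sketched in a few lines, assuming the azure-identity and openai packages; the endpoint and API version are placeholders, and the private-endpoint/VNet half is infrastructure configuration rather than code. The same credential object can also be passed to an Azure AI Search client.

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Resolves to the app's managed identity when running in Azure (no keys to manage or rotate).
credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

openai_client = AzureOpenAI(
    azure_endpoint="https://<your-openai-resource>.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,
    api_version="2024-06-01",
)
```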

2. During a mock exam, your RAG answers are frequently incorrect because the model cites content that is not present in the retrieved documents. You want the fastest mitigation with the least architectural change while preserving answer quality. What should you do first?

Show answer
Correct answer: Lower the model temperature, require citations, and instruct the model to answer only from retrieved passages; if insufficient context, respond that information is not available.
This is the quickest, lowest-complexity mitigation and aligns with common AI-102 guidance: prompt/grounding controls (citations, refusal when context is missing) combined with conservative decoding settings reduce unsupported claims. The alternatives either require a major architectural change that may not address grounding behavior, or rely on fine-tuning, which is heavier, costlier, and does not guarantee factual grounding without retrieval and guardrails.
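A sketch of those grounding controls applied to a chat completion call; `openai_client` is assumed from the previous example, `sources` and `user_question` are assumed retrieval outputs, and the deployment name is a placeholder:

```python
SYSTEM_PROMPT = (
    "Answer ONLY from the numbered sources provided. "
    "Cite sources inline as [1], [2] after each claim. "
    "If the sources do not contain the answer, reply: "
    "'The requested information is not available in the provided documents.'"
)

context = "\n\n".join(f"[{i + 1}] {s['content']}" for i, s in enumerate(sources))

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder deployment name
    temperature=0,        # lower temperature reduces unsupported, creative phrasing
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Sources:\n{context}\n\nQuestion: {user_question}"},
    ],
)
print(response.choices[0].message.content)
```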

3. You are implementing an agent that can call tools (functions) to look up order status and issue refunds. A compliance requirement states: refunds above $500 must require human approval, and all tool calls must be auditable. Which approach best satisfies this requirement?

Show answer
Correct answer: Implement tool-call gating: validate tool arguments server-side, enforce business rules (human-in-the-loop approval for refunds > $500), and log each tool invocation with correlation IDs to an audit store.
Tool-call gating enforces policy outside the model with deterministic controls, supports human approval workflows, and provides auditable logs, which are typical certification expectations for governance and reliability. Relying on prompt adherence alone is not enforceable, and parsing free text for critical transactions increases fragility and risk while still lacking strong, explicit audit and control boundaries.
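The gating pattern needs no special SDK; a plain-Python sketch follows, where the tool name, argument shape, and downstream refund call are hypothetical:

```python
import json
import logging
import uuid

audit_log = logging.getLogger("tool_audit")
APPROVAL_THRESHOLD = 500.00  # refunds above this amount require human approval

def handle_refund_tool_call(raw_arguments: str, correlation_id: str | None = None) -> dict:
    """Server-side gate for a hypothetical issue_refund tool: validate, enforce policy, audit."""
    correlation_id = correlation_id or str(uuid.uuid4())
    args = json.loads(raw_arguments)  # arguments proposed by the model are never trusted directly
    order_id = str(args["order_id"])
    amount = float(args["amount"])
    if amount <= 0:
        raise ValueError("Refund amount must be positive")

    needs_approval = amount > APPROVAL_THRESHOLD
    audit_log.info("tool=issue_refund correlation_id=%s order_id=%s amount=%.2f needs_approval=%s",
                   correlation_id, order_id, amount, needs_approval)

    if needs_approval:
        # Park the request for a human reviewer instead of executing it.
        return {"status": "pending_approval", "correlation_id": correlation_id}

    # execute_refund(order_id, amount)  # hypothetical downstream call, performed only after validation
    return {"status": "completed", "correlation_id": correlation_id}
```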

4. A global support team needs near real-time monitoring to detect spikes in failed requests and elevated latency across an Azure OpenAI + Azure AI Search application. You also need to correlate the user request with downstream calls. Which solution is most appropriate?

Show answer
Correct answer: Instrument the app with Application Insights distributed tracing, propagate a correlation ID through the request pipeline, and create Azure Monitor alerts on failure rate and latency metrics.
Distributed tracing with Application Insights plus Azure Monitor alerting aligns with operational excellence: near real-time telemetry, correlation, and actionable alerts. One alternative covers only a single service and is not real-time; the other is delayed, provides no actionable alerting, and transcripts are not a substitute for request/dependency telemetry.
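A minimal instrumentation sketch, assuming the azure-monitor-opentelemetry distro; the retrieval and generation helpers are hypothetical, and alert rules on failure rate and latency would be created separately in Azure Monitor:

```python
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

# Exports traces/metrics/logs to Application Insights; reads
# APPLICATIONINSIGHTS_CONNECTION_STRING from the environment.
configure_azure_monitor()
tracer = trace.get_tracer(__name__)

def answer_question(user_question: str, correlation_id: str) -> str:
    # One parent span per user request, with child spans for each downstream call,
    # so failures and latency can be attributed to the responsible dependency.
    with tracer.start_as_current_span("answer_question") as span:
        span.set_attribute("app.correlation_id", correlation_id)
        with tracer.start_as_current_span("search_retrieval"):
            sources = retrieve_sources(user_question)       # hypothetical helper
        with tracer.start_as_current_span("openai_generation"):
            return generate_answer(user_question, sources)  # hypothetical helper
```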

5. You are doing weak-spot analysis after a mock exam. You notice you often choose solutions that are technically correct but violate a stated constraint (cost ceiling, private network, deterministic output). What is the best next step to improve exam performance?

Show answer
Correct answer: Adopt a constraint-first checklist: identify explicit constraints (security boundary, latency, cost, governance), eliminate options that violate them, then pick the least-complex option that satisfies all constraints.
A constraint-first checklist directly targets the failure mode described and reflects how AI-102 questions are designed: the best answer typically satisfies constraints with minimal added complexity. One alternative may help occasionally but does not address misreading or constraint violations; the other often increases complexity, cost, and operational risk and frequently conflicts with the exam's "least complexity that meets requirements" pattern.