HELP

AI-102 Practice Tests: 250+ Questions Mapped to Objectives

AI Certification Exam Prep — Beginner

AI-102 Practice Tests: 250+ Questions Mapped to Objectives

AI-102 Practice Tests: 250+ Questions Mapped to Objectives

250+ AI-102 questions mapped to objectives, plus a full mock exam.

Beginner ai-102 · microsoft · azure · azure-ai

Prepare to pass Microsoft AI-102 with objective-mapped practice

This Edu AI course blueprint is built for learners targeting the Microsoft AI-102 (Azure AI Engineer Associate) exam. If you’re new to certification exams but have basic IT literacy, you’ll get a guided path that turns the official skills measured into structured practice—so you can build confidence, speed, and accuracy.

The AI-102 exam validates your ability to design and implement Azure AI solutions end-to-end. That includes selecting the right Azure AI services, securing and operating solutions, and implementing workloads across generative AI, agentic patterns, vision, language, and search-based knowledge mining. This course is organized as a 6-chapter “book” that mirrors those domains and keeps your study time focused on what Microsoft actually tests.

What this course covers (aligned to the official AI-102 domains)

  • Plan and manage an Azure AI solution: architecture choices, provisioning, identity and network security, deployment patterns, monitoring, and cost management.
  • Implement generative AI solutions: Azure OpenAI deployments, prompt design, retrieval-augmented generation (RAG) with Azure AI Search, evaluation, and safety.
  • Implement an agentic solution: tool/function calling concepts, orchestration patterns, memory strategies, and guardrails for reliable automation.
  • Implement computer vision solutions: OCR/document scenarios and image analysis decision-making, plus solution design tradeoffs.
  • Implement NLP solutions: Azure AI Language capabilities, translation, speech fundamentals, and integration decisions.
  • Implement knowledge mining and information extraction: Azure AI Search indexing/enrichment and document extraction workflows.

How the 6 chapters work

Chapter 1 gets you exam-ready before you even start: registration and scheduling, scoring and retakes, common traps, and a practical study strategy for beginners. Chapters 2–5 each focus on one or two domains with clear subtopic breakdowns and dedicated exam-style practice sets mapped back to the objective names. Chapter 6 is a full mock exam split into two timed parts, followed by a structured weak-spot analysis process and an exam-day checklist.

Why this blueprint helps you pass

Most learners don’t fail because they never saw the content—they fail because they can’t recognize what the question is really testing under time pressure. This course is designed to build that recognition skill: you’ll practice by domain, review misses by objective, and then retest with a tighter focus until your accuracy stabilizes.

To get started on Edu AI, use Register free. If you want to compare other certification tracks first, you can also browse all courses.

Recommended pacing

Plan for 2–4 weeks depending on your schedule. Use Chapter 1 to set your baseline, then complete one domain chapter at a time, finishing with the Chapter 6 mock exam and final review to confirm readiness for AI-102 exam day.

What You Will Learn

  • Plan and manage an Azure AI solution (governance, security, monitoring, cost, deployment)
  • Implement generative AI solutions (Azure OpenAI, prompt design, RAG, safety)
  • Implement an agentic solution (tools/functions, orchestration, memory, evaluation)
  • Implement computer vision solutions (Azure AI Vision, OCR, image analysis, custom vision workflows)
  • Implement NLP solutions (Azure AI Language, translation, speech, conversational experiences)
  • Implement knowledge mining and information extraction (Azure AI Search, enrichment, document intelligence)

Requirements

  • Basic IT literacy (web apps, REST APIs, JSON, authentication concepts)
  • Familiarity with Azure basics (resource groups, subscriptions) is helpful but not required
  • No prior certification experience needed
  • A computer with internet access to review explanations and take timed practice sets

Chapter 1: AI-102 Exam Orientation and Study Strategy

  • Understand AI-102 format, skills measured, and question types
  • Register, schedule, and set up Pearson VUE (online or test center)
  • Scoring, retakes, accommodations, and exam-day rules
  • Build a 2–4 week study plan using objective mapping
  • How to use practice tests: timing, review loops, and error logs

Chapter 2: Plan and Manage an Azure AI Solution

  • Design Azure AI solution architecture and resource organization
  • Secure AI workloads (identity, keys, private endpoints, RBAC)
  • Deploy and operate (CI/CD, endpoints, SDKs, APIs)
  • Monitor and optimize (logging, metrics, cost, reliability)
  • Domain practice set: 50+ questions with objective-level review

Chapter 3: Implement Generative AI Solutions

  • Azure OpenAI fundamentals (models, tokens, deployments, parameters)
  • Prompt engineering and system design for reliability
  • RAG with Azure AI Search (indexing, chunking, embeddings, citations)
  • Content safety, grounding, evaluation, and responsible AI
  • Domain practice set: 60+ questions mapped to generative AI objectives

Chapter 4: Implement an Agentic Solution + NLP Solutions

  • Agent design: goals, planning, tool use, and orchestration patterns
  • Function calling/tools and state: memory, grounding, and session handling
  • Language solutions: classification, extraction, summarization, and Q&A
  • Speech and translation building blocks and integration choices
  • Domain practice set: 60+ questions covering agentic + NLP objectives

Chapter 5: Implement Computer Vision + Knowledge Mining and Information Extraction

  • Vision fundamentals: image analysis, OCR, and document scenarios
  • Design vision pipelines (batch vs real-time, edge considerations)
  • Knowledge mining architecture: Azure AI Search indexes and enrichment
  • Information extraction workflows with Document Intelligence and skillsets
  • Domain practice set: 60+ questions for vision + knowledge mining objectives

Chapter 6: Full Mock Exam and Final Review

  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist
  • Final objective-by-objective review plan

Jordan Whitaker

Microsoft Certified Trainer (MCT) | Azure AI Engineer Associate

Jordan Whitaker is a Microsoft Certified Trainer who designs Azure exam-prep programs focused on practical, objective-mapped learning. Jordan has supported learners across Microsoft role-based certifications with an emphasis on AI-102 exam readiness and real-world Azure AI implementation skills.

Chapter 1: AI-102 Exam Orientation and Study Strategy

AI-102 (Designing and Implementing a Microsoft Azure AI Solution) rewards candidates who can translate requirements into the right Azure AI services, secure and monitor them properly, and make design tradeoffs under constraints. This chapter sets your “test-taking operating system”: how the exam is structured, how to schedule it, how scoring works, and how to build a short, high-yield plan that maps directly to objectives. You’ll also learn how to use practice tests like a diagnostic tool—timed, reviewed, and converted into a repeatable error-log loop—so your study time produces measurable gains.

The most common failure mode on AI-102 is not “not knowing enough,” but “answering a different question than the one asked.” Many items are written as mini design reviews where one constraint (cost, latency, region, data residency, security boundary, token limits, or throughput) silently dominates the decision. Throughout this chapter, you’ll learn to spot those pivot words, identify the true objective being tested, and eliminate attractive but invalid options.

Practice note for Understand AI-102 format, skills measured, and question types: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Register, schedule, and set up Pearson VUE (online or test center): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Scoring, retakes, accommodations, and exam-day rules: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Build a 2–4 week study plan using objective mapping: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for How to use practice tests: timing, review loops, and error logs: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Understand AI-102 format, skills measured, and question types: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Register, schedule, and set up Pearson VUE (online or test center): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Scoring, retakes, accommodations, and exam-day rules: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Build a 2–4 week study plan using objective mapping: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for How to use practice tests: timing, review loops, and error logs: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 1.1: Exam overview and skills measured for AI-102

Section 1.1: Exam overview and skills measured for AI-102

AI-102 is a role-based exam aimed at practitioners who design and implement Azure AI solutions end-to-end. Expect scenario-driven questions that force you to select services and architectures, not just recall definitions. The skills measured align closely with the outcomes in this course: planning and managing an Azure AI solution (governance, security, monitoring, cost, deployment); implementing generative AI solutions (Azure OpenAI, prompt design, RAG, safety); building agentic solutions (tools/functions, orchestration, memory, evaluation); and implementing computer vision, NLP, and knowledge mining pipelines (Azure AI Vision, Language, Speech, Translator, Azure AI Search, Document Intelligence).

Question formats vary, but they often share a pattern: a short business context, a few technical constraints, and multiple “almost right” answers. You will see items that test your ability to pick between adjacent services (for example, when Azure AI Search is required versus when storage + app logic is enough), to design a RAG flow with proper chunking and citations, or to decide where to implement safety and governance controls (resource-level RBAC, content filters, private networking, logging, and cost controls).

Exam Tip: Treat every question as an objective check. Before looking at the options, state the objective in your own words (e.g., “choose the right service for OCR with structured extraction,” “secure a generative endpoint,” “design monitoring for production”). That one sentence reduces misreads and narrows the plausible answers.

  • Management domain: authentication/authorization, network isolation, telemetry, responsible AI controls, deployment strategies, cost governance.
  • Generative AI domain: model selection, prompt patterns, RAG, embeddings and vector search, grounding, safety filters, evaluation.
  • Agentic patterns: tool/function calling, orchestration, memory boundaries, conversation state, reliability and evaluation loops.
  • Vision/NLP/Knowledge mining: service selection, data formats, limitations, batch vs real-time, enrichment pipelines, search indexing.

A frequent exam emphasis is “fit-for-purpose” design: the simplest solution that meets constraints. If an option adds services without benefit, it is often wrong—unless the question explicitly requires features like vector search, private endpoint access, or enterprise compliance boundaries.

Section 1.2: Registration and scheduling workflow (Microsoft Learn + Pearson VUE)

Section 1.2: Registration and scheduling workflow (Microsoft Learn + Pearson VUE)

Scheduling is straightforward, but avoid preventable friction that can derail your timeline. The official workflow typically starts on Microsoft Learn: locate AI-102, confirm it’s the correct exam (name and code), and proceed to schedule through the exam provider (commonly Pearson VUE). Your Microsoft account becomes the identity anchor—ensure the name on your Microsoft profile exactly matches your government-issued ID, including middle names/initials if present. Mismatches are a common last-minute issue.

Choose between an online proctored exam and a test center. Online proctoring is convenient but demands a clean environment: stable internet, a supported OS/browser, a webcam, and a quiet room with no prohibited items. Test centers reduce environmental risk but require travel and fixed schedules. Decide based on your risk tolerance: if your home network is unreliable or you can’t guarantee a distraction-free space, a test center is often the safer option.

Exam Tip: Schedule first, then plan backward. A booked date creates urgency and helps you allocate study hours realistically. If you aim for a 2–4 week sprint, pick a date that forces consistency without creating panic.

  • Run the Pearson VUE system test early (not the night before) and again on exam day.
  • Review check-in rules (desk clear, phone placement, room scan) and plan your setup.
  • Know your time zone and confirm the appointment time in the confirmation email.

Finally, ensure you can access Microsoft Learn and any training resources from the device you’ll use. If your organization enforces endpoint restrictions, confirm you can complete the online check-in workflow without blocked camera/microphone permissions.

Section 1.3: Scoring model, cut score, retake policy, and case studies

Section 1.3: Scoring model, cut score, retake policy, and case studies

AI-102 uses a scaled scoring model. You do not need to “ace” every domain; you need to meet the passing standard (commonly communicated as a cut score on a 1000-point scale). Scaled scoring means the raw number correct may not map directly to your final score, and different question types can contribute differently. Practically, your strategy should be to avoid zero-strength areas: weak domains are where you hemorrhage points because scenario questions compound misunderstandings (service choice + security + monitoring in one prompt).

Be prepared for longer scenario sets often referred to as case studies. These may include tabs or exhibits (requirements, current architecture, constraints, or user stories). The trap is time: candidates read every line twice. Instead, scan for hard constraints first (region, data residency, PII handling, network isolation, latency, budget, throughput) and then map each question to those constraints.

Exam Tip: In case studies, write (mentally) a short “constraint list” and reuse it for each sub-question. If an answer violates any constraint, eliminate it immediately—even if it sounds technically impressive.

Retake policies can change, but generally you must wait before retesting after a failed attempt, and subsequent retakes may have longer waiting periods or limits per year. Treat this as a reason to plan your first attempt seriously: your 2–4 week strategy should include at least one full-length timed simulation and a remediation cycle. For accommodations, Microsoft and Pearson VUE provide processes for candidates who need them; initiate early because approvals can take time.

Exam-day rules matter because violations can invalidate the attempt. Know what’s allowed (ID requirements, breaks, scratch paper rules at a test center) and what’s not (unauthorized devices, leaving camera view, reading questions aloud). Many “I failed unexpectedly” stories are actually “my exam ended unexpectedly” due to proctoring issues.

Section 1.4: Common AI-102 traps (wording, constraints, service limits)

Section 1.4: Common AI-102 traps (wording, constraints, service limits)

AI-102 questions are designed to reward careful reading. The highest-yield skill is identifying the “dominant constraint” hidden in the wording. Watch for qualifiers such as “minimize cost,” “no code changes,” “must be private,” “must support offline/batch,” “must provide citations,” or “data must not leave region.” These phrases often disqualify half the options immediately.

Service confusion is another trap: Azure AI services have overlapping capabilities, but the exam expects you to choose the service that natively meets requirements. For example, OCR and document extraction can involve Azure AI Vision and Document Intelligence, but the question will signal the expected direction through output needs (plain text vs structured fields/tables), training requirements, or workflow constraints. Similarly, knowledge mining tasks frequently imply Azure AI Search with enrichment and indexing rather than custom database querying.

Exam Tip: When two options both “work,” prefer the one that requires fewer custom components while meeting the stated constraints. The exam often tests best-practice architecture, not creative improvisation.

  • Wording trap: “Ensure” vs “help” vs “monitor.” “Ensure” implies enforcement (policy/RBAC/network) not just logging.
  • Generative AI trap: “Ground responses in enterprise documents” implies RAG (retrieval + citations), not just a bigger prompt.
  • Agent trap: “Use tools to call APIs” implies function/tool calling and orchestration, not only prompt instructions.
  • Limits trap: Token/context constraints, throughput, and latency: solutions that ignore chunking, batching, or caching are often invalid.
  • Security trap: Private networking and identity boundaries: “no public internet exposure” implies private endpoints/VNet integration patterns and strict access control.

Finally, beware “buzzword gravity.” Options that include extra services (event buses, multiple databases, or complex orchestration) can look enterprise-grade but may violate “minimize cost” or “reduce operational overhead.” On AI-102, elegance is often simplicity plus governance.

Section 1.5: Study strategy by domain and objective mapping

Section 1.5: Study strategy by domain and objective mapping

A 2–4 week plan works if you study by objectives, not by random topic browsing. Start by listing the measured skills domains (management/governance, generative AI, agentic solutions, vision, language/speech/translation, knowledge mining). For each domain, map it to tasks you must be able to perform on the exam: select the correct service, design the architecture, apply security and monitoring, and reason about tradeoffs (cost, latency, compliance, reliability).

Use an “objective map” spreadsheet with columns: objective, key services, common constraints, must-know terminology, and “failure patterns” you personally exhibit (from practice tests). Your weekly plan should rotate through domains but revisit weak ones more frequently. A practical cadence for a 3-week plan is: Week 1 (baseline + fundamentals + management), Week 2 (deep practice: generative/RAG + Search + Document Intelligence), Week 3 (mixed timed sets + remediation + full simulation).

Exam Tip: Study like the exam asks: start from requirements. When you learn a service feature, attach it to a requirement phrase you expect to see (e.g., “extract key-value pairs and tables” → Document Intelligence; “vector similarity over chunks” → Azure AI Search vector search + embeddings; “moderate prompts and outputs” → safety/content filtering strategy).

  • Plan/Manage: RBAC, keys vs managed identity where applicable, network isolation, monitoring/alerts, cost controls, deployment environments.
  • Generative AI: prompt structure, grounding, RAG steps (ingest → chunk → embed → index → retrieve → generate), safety and evaluation.
  • Agentic: tool selection, function calling, orchestration patterns, memory scope, reliability and fallback behaviors.
  • Vision/NLP: image analysis vs OCR vs custom models; text analytics vs conversational; translation and speech pipeline choices.
  • Knowledge mining: indexing strategy, enrichment, document parsing, search relevance, and end-to-end query experience.

Keep your plan honest: if you can’t explain why one option is wrong, you don’t fully own the objective yet. Your goal is not “coverage,” but confident discrimination among near-miss answers.

Section 1.6: Practice-test methodology (timed sets, review, remediation)

Section 1.6: Practice-test methodology (timed sets, review, remediation)

Practice tests are most valuable when you treat them as a measurement-and-remediation system. Start with a diagnostic timed set to establish your baseline and expose blind spots. Then shift into short, focused timed sets (for example, 20–30 questions) that target one or two domains at a time. Time pressure matters because AI-102 punishes over-reading; you must learn to extract constraints quickly.

After each set, do a structured review loop: (1) re-read the question and underline the constraint words, (2) explain why the correct answer satisfies them, (3) explain why each wrong option fails, and (4) log the error type. Your error log should categorize misses into patterns such as “misread constraint,” “service confusion,” “security/governance gap,” “ignored limit,” and “over-engineered design.” The remediation step is not “read more”—it is “write a rule.” Example rule: “If the requirement includes citations and grounding, I must retrieve from an index/store; prompts alone are insufficient.”

Exam Tip: Separate knowledge errors from execution errors. If you knew the concept but rushed, you need pacing tactics. If you didn’t know the concept, you need targeted study and a second attempt on similar questions within 48 hours.

  • Timing tactic: On long scenarios, identify constraints first, then read for details. Don’t memorize the story; solve the requirement.
  • Elimination tactic: Remove any option that violates a stated constraint (region, privacy, cost, latency, “must,” “only”).
  • Remediation tactic: Re-test the same objective with new questions until your error rate drops consistently.

End your preparation with at least one full-length simulation under exam-like conditions. Then do a final pass through your error log—not to relearn everything, but to reinforce your personal “trap detectors.” That is how practice questions become exam points.

Chapter milestones
  • Understand AI-102 format, skills measured, and question types
  • Register, schedule, and set up Pearson VUE (online or test center)
  • Scoring, retakes, accommodations, and exam-day rules
  • Build a 2–4 week study plan using objective mapping
  • How to use practice tests: timing, review loops, and error logs
Chapter quiz

1. You are preparing for the AI-102 exam. During practice questions, you frequently choose an option that is technically valid but does not satisfy a specific constraint stated in the prompt. Which approach best aligns with real AI-102 item strategy to reduce this failure mode?

Show answer
Correct answer: Identify pivot words and constraints (for example, latency, region/data residency, security boundary, throughput/token limits) and select the option that optimizes for the dominant constraint
AI-102 items often read like mini design reviews where one constraint dominates the correct choice. Actively scanning for pivot words (cost/latency/region/security/limits/throughput) aligns to the exam’s skills of translating requirements into the right design. Option B is wrong because “more services” can increase cost/complexity and may violate constraints. Option C is wrong because tradeoffs are common in AI-102 scenarios; the exam expects you to choose the best fit under constraints.

2. A candidate wants to take AI-102 from home. They need to minimize the risk of being turned away on exam day due to policy issues. Which action is MOST appropriate when setting up the exam through Pearson VUE?

Show answer
Correct answer: Schedule the exam as an online proctored session and complete the system test and workspace check in advance
Pearson VUE online proctoring has specific technical and environment requirements; completing the system test and ensuring your workspace meets rules reduces the highest-risk failure points. Option B is wrong because check-in is not intended for troubleshooting and can result in cancellation if requirements aren’t met. Option C is wrong because practice tests do not address exam delivery requirements or exam-day rules.

3. Your manager asks how AI-102 scoring works and whether there is a penalty for incorrect answers. You want to provide guidance that supports good exam-time decision making. What should you recommend?

Show answer
Correct answer: Answer every question because leaving items blank reduces the chance to earn points, and the exam is scored based on correct answers
Certification exams like AI-102 score based on correct responses; the practical strategy is to attempt all questions and use review/marking features to revisit uncertain items. Option B is wrong because it assumes a penalty model; the recommended behavior is to avoid leaving questions unanswered. Option C is wrong because scoring is not limited to the first half; any unanswered questions due to time mismanagement simply forfeit potential points.

4. A team has 3 weeks to prepare for AI-102. They have uneven experience across objectives and want a plan that maximizes score improvement per hour. Which study plan best matches the chapter’s recommended objective-mapping approach?

Show answer
Correct answer: Run a timed diagnostic practice test, map missed items to the official skill areas/objectives, then schedule focused study blocks and retest cycles on the weakest objectives
Objective mapping is a high-yield strategy: measure current performance (diagnostic), align gaps to the skills measured, study targeted areas, then validate with retesting. Option B is wrong because linear reading is inefficient and doesn’t ensure coverage of weak objectives. Option C is wrong because AI-102 includes scenario-based decision questions where selecting the right service/design under constraints is central, not just implementation steps.

5. You are using practice tests as a diagnostic tool. After each attempt, you want a repeatable loop that produces measurable gains and prevents repeating the same mistakes. What is the BEST next step after finishing a timed practice test?

Show answer
Correct answer: Review each missed or guessed question, log the error category (for example, misread constraint vs. knowledge gap), map it to an objective, and create targeted review tasks before retesting
The chapter emphasizes using practice tests in a loop: timed attempt, structured review, and an error log that distinguishes knowledge gaps from reading/constraint errors, then targeted remediation and retest. Option B is wrong because retaking without review mainly trains recall of answers rather than fixing underlying gaps. Option C is wrong because guessed-correct items often indicate fragile understanding and can hide recurring issues with constraints or objective knowledge.

Chapter 2: Plan and Manage an Azure AI Solution

AI-102 doesn’t just test whether you can call an API—it tests whether you can run an Azure AI solution in production. That means picking the right service for the job, organizing resources so teams can ship safely, securing data and models, deploying consistently, and operating reliably under cost and quota constraints. In practice tests, many “almost right” answers fail because they ignore governance (who can do what), networking (public vs private access), or capacity limits (regional availability and quotas).

This chapter aligns to the exam outcome “Plan and manage an Azure AI solution” and supports later outcomes (generative AI, agents, vision, NLP, search) by focusing on the platform decisions that enable them. As you read, keep asking: “What is the Azure-native way to manage identity, secrets, networks, monitoring, and deployments across environments?” The exam rewards solutions that are secure by default, automate repeatable actions, and use Azure control-plane features correctly.

Exam Tip: When a question mentions “enterprise,” “regulated,” “private data,” or “production,” assume the expected answer includes Entra ID authentication, RBAC, Key Vault, private endpoints, and centralized monitoring—unless the scenario explicitly relaxes those requirements.

Each section below maps to a specific objective area: architecture and service selection, resource planning, security, deployment, and operations. Finish with an objective-level practice set review strategy so you can diagnose weaknesses quickly after each domain set.

Practice note for Design Azure AI solution architecture and resource organization: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Secure AI workloads (identity, keys, private endpoints, RBAC): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Deploy and operate (CI/CD, endpoints, SDKs, APIs): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Monitor and optimize (logging, metrics, cost, reliability): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Domain practice set: 50+ questions with objective-level review: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Design Azure AI solution architecture and resource organization: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Secure AI workloads (identity, keys, private endpoints, RBAC): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Deploy and operate (CI/CD, endpoints, SDKs, APIs): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Monitor and optimize (logging, metrics, cost, reliability): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 2.1: Choose Azure AI services and solution architecture patterns

Section 2.1: Choose Azure AI services and solution architecture patterns

AI-102 expects you to recognize which Azure AI capability fits a requirement and how to assemble it into a manageable architecture. Start by separating control plane (Azure resource management, RBAC, networking, monitoring) from data plane (calling the model/service endpoints). Architecture questions often hide the key clue in non-functional requirements: latency, data residency, customization, or operational simplicity.

Common patterns include: (1) API-first inference where an app calls Azure OpenAI, Azure AI Vision, or Azure AI Language directly; (2) RAG where Azure OpenAI is paired with Azure AI Search for retrieval and grounding; (3) document processing pipelines combining Azure AI Document Intelligence with Storage/Event Grid/Functions; and (4) agentic orchestration where a hosted API (Azure OpenAI) calls tools/functions in your backend via a controlled tool layer. The exam typically wants you to choose the simplest pattern that meets constraints and is supportable.

  • Use Azure AI Search for indexing, retrieval, hybrid/vector search, and enrichment pipelines—don’t “store embeddings in a database” unless the scenario explicitly avoids Search.
  • Use Azure AI Document Intelligence for structured extraction from forms and documents (tables, key-value pairs) rather than generic OCR-only approaches.
  • Use Azure AI Vision for image analysis/OCR scenarios where you don’t need custom training; use custom workflows (e.g., Custom Vision) only when the requirement calls for domain-specific classification/detection.

Common trap: Over-selecting services. If a scenario only needs language detection and key phrase extraction, don’t add OpenAI “because it can.” The exam often marks down architectures that increase cost/complexity without a requirement. Conversely, if the scenario requires “grounded responses with citations,” a pure prompt-only OpenAI design is usually insufficient; pairing with Azure AI Search is the intended solution.

Exam Tip: When you see “citations,” “company documents,” “latest policy,” or “don’t hallucinate,” translate that to: retrieval + grounding (Azure AI Search) + safe prompt orchestration (system message + tool results + output constraints).

Section 2.2: Provisioning resources, regions, quotas, and capacity planning

Section 2.2: Provisioning resources, regions, quotas, and capacity planning

Provisioning is a frequent source of exam questions because it mixes Azure fundamentals (subscriptions, resource groups, regions) with AI-specific constraints (model availability, quotas, throughput). Your job is to plan where resources live, how they scale, and what happens when you hit limits. In practice tests, the wrong answer is often the one that ignores regional availability or assumes unlimited capacity.

For Azure OpenAI in particular, availability varies by region and by model family. Capacity is governed by quotas and sometimes by deployment-specific throughput. When a scenario mentions “spiky traffic,” “high throughput,” or “SLA,” think about: multiple deployments, load distribution, and fallback strategy (e.g., alternate deployment/model/region) that still respects data residency requirements.

  • Choose regions based on data residency, service/model availability, and latency. The exam may force a tradeoff; residency usually wins unless stated otherwise.
  • Plan resource organization: separate dev/test/prod into different resource groups or subscriptions to prevent accidental cross-environment access and to simplify cost attribution.
  • Understand quotas: if you hit request/token limits, the corrective action may be to request quota increase, optimize prompts, batch requests, or add deployments—not “scale the app service” alone.

Common trap: Confusing Azure resource scaling with model/service quota. Scaling your compute does not increase an Azure OpenAI quota. Also watch for “global” designs that violate residency: sending EU customer prompts to a US region is usually disallowed in the scenario.

Exam Tip: If the question mentions “cannot change application code,” prefer solutions like adding deployments, adjusting configuration, or using API Management policies over refactoring. If it mentions “predictable monthly cost,” look for capacity planning, request limiting, caching, and prompt optimization rather than unconstrained autoscale.

Section 2.3: Security and compliance: Entra ID, RBAC, networking, key management

Section 2.3: Security and compliance: Entra ID, RBAC, networking, key management

Security is heavily tested in AI-102 because AI solutions often process sensitive prompts, documents, and derived embeddings. Expect to be examined on how clients authenticate, where secrets live, and how to restrict network access. A secure architecture typically uses Microsoft Entra ID for identity, role-based access control for authorization, private networking where required, and Key Vault for secrets and keys.

For authentication, prefer Entra ID (Azure AD) token-based auth and managed identities over embedding keys in code. Many Azure AI services support both API keys and Entra ID. When the scenario says “no secrets in code” or “rotate credentials,” the intended direction is managed identity + Key Vault and/or Entra ID auth. Use RBAC to constrain who can manage the resource (control plane) and who can call it (data plane), depending on service capabilities.

  • Key management: store service keys, connection strings, and customer-managed keys (where supported) in Azure Key Vault; use Key Vault access policies or RBAC with least privilege.
  • Networking: apply private endpoints/Private Link to keep traffic on Microsoft backbone; disable public network access where the scenario demands isolation; use VNet integration for apps calling private endpoints.
  • Authorization: use RBAC roles (built-in or custom) scoped to resource groups/subscriptions; avoid granting Owner/Contributor when a narrower role fits.

Common trap: Assuming “RBAC” automatically protects data-plane calls for every service. Some services still rely on API keys for data-plane operations unless Entra ID is supported and configured. Read the question carefully: if it asks about “calling the endpoint,” that’s data plane; if it asks about “creating deployments, changing settings,” that’s control plane.

Exam Tip: When you see “exfiltration risk,” “public internet,” or “only from our network,” look for private endpoints + firewall rules + disabling public access. When you see “auditable access” and “separation of duties,” look for Entra ID + RBAC + logging to a central workspace.

Section 2.4: Deployment models: REST/SDK integration, environments, IaC concepts

Section 2.4: Deployment models: REST/SDK integration, environments, IaC concepts

Deployment questions test whether you can move from a prototype to repeatable environments. You should be comfortable with calling Azure AI services through REST or SDKs, selecting the right endpoint type, and automating provisioning. The “correct” exam answer usually emphasizes consistency: same configuration in dev/test/prod, managed secrets, and minimal manual steps.

For integration, REST is universal and language-agnostic; SDKs provide convenience, retries, and object models. The exam often frames a scenario like “existing microservice uses HTTP” (choose REST) vs “developer wants fastest implementation in Python/.NET” (choose SDK). API Management is a frequent companion when you need centralized authentication, throttling, transformation, versioning, or a single facade endpoint to multiple backends.

  • Environment strategy: separate resources per environment; use configuration (not code changes) for endpoints, model deployment names, and feature flags.
  • IaC: represent resources in ARM/Bicep/Terraform; keep templates in source control; parameterize per environment; avoid “click-ops” in production.
  • CI/CD: automate deployments via Azure DevOps/GitHub Actions; include approvals for prod and automated validation (smoke tests) after deployment.

Common trap: Confusing “model deployment” with “application deployment.” In Azure OpenAI, you deploy a model deployment inside the Azure resource (and reference it by deployment name). Your app deployment (App Service/Functions/AKS) is separate. Questions may ask which setting changes: endpoint URL, API version, deployment name, or model name.

Exam Tip: When a scenario says “repeatable,” “standardized,” or “governed,” assume IaC. When it says “no downtime” or “safe rollout,” think blue/green or canary deployments for the app tier—plus versioned prompts/configuration stored in source control.

Section 2.5: Observability and operations: monitoring, troubleshooting, cost management

Section 2.5: Observability and operations: monitoring, troubleshooting, cost management

Operations is where many candidates lose points: they know which service to use but not how to keep it healthy and affordable. AI-102 expects you to use Azure-native observability: Azure Monitor, Log Analytics, Application Insights, resource diagnostics, and alerts. You should also be ready to reason about reliability (retries, timeouts, regional incidents) and cost controls (budgets, quotas, prompt optimization).

At runtime, instrument your application to capture request IDs, latency, dependency calls, and error rates. Enable diagnostic settings on supported Azure AI resources to export logs/metrics to Log Analytics or Event Hubs for SIEM integration. For generative AI, operational troubleshooting often includes: prompt size/token usage, rate limiting (429), timeouts, and content filtering outcomes. For retrieval pipelines, failures might be indexing errors, chunking/enrichment problems, or stale indexes.

  • Monitoring: set alerts on error rate, latency, throttling responses, and saturation; build dashboards per environment.
  • Troubleshooting: correlate app logs with service metrics; check quota/throttling before scaling compute; validate DNS/network paths for private endpoints.
  • Cost management: use Cost Management budgets/alerts; tag resources; right-size tiers; cache frequent responses; reduce tokens via concise prompts and smaller context windows; choose cheaper models when acceptable.

Common trap: Treating AI spend as “just another API.” Token-based pricing means prompt design is an operational lever. If the scenario asks to reduce cost without losing functionality, look for prompt/context reduction, caching, and retrieval narrowing before proposing “buy reserved capacity” style answers that may not apply.

Exam Tip: If you see intermittent failures plus 429s, the best answer is usually throttling/backoff + quota review (and possibly multiple deployments), not “increase App Service instances.” If you see “need audit trail,” ensure logs flow to a central workspace with retention and access controls.

Section 2.6: Exam-style practice: Plan and manage an Azure AI solution

Section 2.6: Exam-style practice: Plan and manage an Azure AI solution

This domain practice set is where you build the “architect/operator” reflex the exam demands. After attempting the questions, review at the objective level, not just right/wrong. Ask: did you miss a service-selection cue, a security requirement, a networking constraint, or an operational detail (quota/monitoring/IaC)? Most misses cluster into predictable categories.

Use a three-pass review method. Pass 1: classify each item by objective—architecture, provisioning, security, deployment, operations. Pass 2: identify the keyword triggers you overlooked (e.g., “private,” “regulated,” “no secrets,” “data residency,” “high throughput,” “audit”). Pass 3: rewrite the solution in your own words as an “Azure-native plan,” explicitly naming identity method, network path, secret store, deployment approach, and monitoring destination.

  • Architecture misses: you chose a flexible service (OpenAI) when a specialized service (Vision/Language/Document Intelligence) was simpler and more correct.
  • Provisioning misses: you ignored region/model availability or quota limits; you proposed scaling compute instead of addressing service capacity.
  • Security misses: you used API keys when the scenario demanded Entra ID/managed identity; you left public access enabled when private endpoints were required.
  • Ops misses: you didn’t route diagnostics to Log Analytics/Application Insights; you proposed reactive troubleshooting without alerts, budgets, or dashboards.

Exam Tip: When stuck between two plausible answers, choose the one that improves governance and repeatability (RBAC, Key Vault, private endpoints, IaC, centralized monitoring). The AI-102 exam consistently favors secure, automatable, production-ready choices over ad-hoc or developer-convenience approaches.

As you complete the 50+ questions, track a “trap list” of recurring mistakes (quota vs scaling, control-plane vs data-plane permissions, private networking requirements, and deployment-name vs model-name confusion). Reducing those four errors alone typically yields a noticeable score increase in this chapter’s objective area.

Chapter milestones
  • Design Azure AI solution architecture and resource organization
  • Secure AI workloads (identity, keys, private endpoints, RBAC)
  • Deploy and operate (CI/CD, endpoints, SDKs, APIs)
  • Monitor and optimize (logging, metrics, cost, reliability)
  • Domain practice set: 50+ questions with objective-level review
Chapter quiz

1. A healthcare company is deploying an Azure AI solution that processes PHI. The security team requires that service-to-service authentication not use API keys and that access be centrally governed by least privilege. Which approach best meets the requirement for calling Azure AI services from an application hosted in Azure?

Show answer
Correct answer: Use Microsoft Entra ID authentication with managed identity and assign Azure RBAC roles to the identity
Managed identity with Microsoft Entra ID uses token-based authentication and allows least-privilege control via Azure RBAC, aligning with production and regulated requirements. Using Key Vault (option B) is better than embedding keys, but it still relies on key-based auth and doesn’t satisfy the requirement to avoid keys. Embedding keys (option C) is insecure and hard to govern; rotation alone doesn’t provide least-privilege or centralized access control.

2. A financial services company must ensure that traffic from its virtual network to Azure AI services does not traverse the public internet. The solution must also prevent public network access to the AI resource. What should you configure?

Show answer
Correct answer: A private endpoint for the Azure AI resource and disable public network access on the resource
Private endpoints provide private IP connectivity over the Microsoft backbone and can be paired with disabling public network access, meeting the requirement to avoid public internet paths. Service endpoints (option B) do not provide the same private-link semantics for all Azure AI services and can still leave public endpoints in play; IP rules reduce exposure but don’t guarantee private connectivity. API Management external mode (option C) still exposes a public endpoint and subscription keys don’t address private network access requirements.

3. You manage dev, test, and prod environments for an Azure AI solution. The team wants repeatable deployments of AI resources and application configuration, with approvals before production changes. Which approach best aligns with Azure-native CI/CD and governance expectations for the AI-102 exam?

Show answer
Correct answer: Use Infrastructure as Code (Bicep or ARM templates) in an Azure DevOps or GitHub Actions pipeline with environment approvals for production
IaC plus pipelines enables consistent, auditable, and repeatable deployments across environments and supports approvals and change control—key production expectations. Manual portal setup (option B) is error-prone and not repeatable. Deploying only code (option C) leaves infrastructure drift unmanaged and undermines consistent security, networking, and configuration across environments.

4. An AI application experiences intermittent failures when calling an Azure AI service. You suspect throttling due to quotas or capacity limits. What is the most appropriate first step to confirm the cause using Azure-native operational tooling?

Show answer
Correct answer: Review Azure Monitor metrics for the AI resource (for example, request counts, errors, throttled requests) and correlate with logs
Azure Monitor metrics and logs help confirm whether failures are due to throttling, rate limits, or other error conditions; this aligns with the exam focus on monitoring and operating reliably. Increasing tier (option B) may not address regional quota/capacity and is premature without evidence. Adding retries (option C) can help resilience but without telemetry you can’t verify root cause and may worsen throttling by increasing traffic.

5. A company has multiple teams building solutions that use Azure AI services. The platform team wants to organize resources to simplify governance, cost tracking, and access control while enabling separate dev/test/prod environments. Which design best meets these goals?

Show answer
Correct answer: Use separate Azure subscriptions per environment (dev/test/prod) and organize resources into resource groups by workload; apply RBAC and policies at subscription/resource group scope
Separating environments by subscription and grouping by workload enables clear RBAC boundaries, policy assignment, and cost management (chargeback/showback) consistent with enterprise governance patterns. A single resource group (option B) creates governance and blast-radius issues and complicates least-privilege access. Per-developer resources and key-based access (option C) increases sprawl, weakens governance, and conflicts with secure-by-default practices emphasized for production scenarios.

Chapter 3: Implement Generative AI Solutions

This chapter maps to the AI-102 objective area focused on implementing generative AI solutions in Azure. The exam doesn’t just test whether you know what “LLMs” are—it tests whether you can configure Azure OpenAI correctly, design prompts that are reliable under real constraints, implement retrieval-augmented generation (RAG) with Azure AI Search, and apply safety/evaluation practices that keep solutions trustworthy and supportable.

As you read, keep the exam’s style in mind: many questions are scenario-based and hinge on one or two “tell” details (for example, needing grounded answers with citations, minimizing hallucinations, controlling cost via tokens, or enforcing safety policy). Your job is to identify which Azure component solves the scenario and which configuration detail makes it correct.

Exam Tip: When a prompt-only solution is proposed for a requirement that explicitly needs “company data,” “citations,” “latest policy,” or “grounded answers,” assume prompt-only is insufficient. The exam expects RAG (Azure AI Search + embeddings) or another knowledge source integration.

The rest of the chapter breaks down the implementation decisions you’ll repeatedly see in questions: model/deployment settings, prompt patterns, embeddings and search modes, ingestion/chunking/citations, and safety + evaluation workflows.

Practice note for Azure OpenAI fundamentals (models, tokens, deployments, parameters): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Prompt engineering and system design for reliability: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for RAG with Azure AI Search (indexing, chunking, embeddings, citations): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Content safety, grounding, evaluation, and responsible AI: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Domain practice set: 60+ questions mapped to generative AI objectives: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Azure OpenAI fundamentals (models, tokens, deployments, parameters): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Prompt engineering and system design for reliability: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for RAG with Azure AI Search (indexing, chunking, embeddings, citations): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Content safety, grounding, evaluation, and responsible AI: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Domain practice set: 60+ questions mapped to generative AI objectives: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 3.1: Azure OpenAI configuration: deployments, model selection, quotas

AI-102 expects you to understand the difference between a model and a deployment in Azure OpenAI. You select a model family (for example, a chat model for conversational generation or an embeddings model for vectorization) and create a deployment that assigns a name, region/resource, and capacity constraints. Many exam items describe “the app calls deployment X” rather than calling the model directly—because deployments are what your code targets.

Model selection is typically driven by: (1) capability (reasoning quality, tool/function calling, multimodal needs), (2) latency and cost, and (3) token limits (input + output context size). Token accounting is a frequent hidden constraint: long system prompts, large retrieved context, and verbose responses all consume tokens and can push you over limits or increase cost.

Quotas and throughput constraints matter in production scenarios. Expect questions where multiple teams share a resource, or where you must avoid throttling at peak load. The correct answer often involves separating workloads into different deployments/resources, selecting an appropriate pricing tier/throughput configuration, or controlling max tokens and concurrency at the application level.

  • Use separate deployments for chat/completions vs embeddings (different models, different traffic patterns).
  • Tune generation parameters (temperature, top_p, max_tokens) to balance determinism, creativity, and cost.
  • Plan for rate limiting: implement retries with backoff and monitor usage.

Exam Tip: If a scenario mentions “inconsistent outputs” or “needs repeatable formatting,” look for lowering temperature and enforcing structured outputs—not “switch to a bigger model” as the first move.

Common trap: Confusing token limits with character limits. The exam expects you to reason in tokens, especially when combining user input + system prompt + retrieved passages + tool outputs.

Section 3.2: Prompt patterns: instructions, few-shot, structured outputs, tools overview

The exam treats prompt engineering as an engineering discipline: you should design prompts for reliability, not just “better wording.” Start with a strong system message that defines role, constraints, and how to behave when information is missing. Then layer user instructions and retrieved context. A robust pattern is: (1) system rules, (2) developer constraints (format, safety, scope), (3) user request, (4) grounding context (RAG excerpts), (5) response schema.

Few-shot prompting appears when you need consistent classification, extraction, or rewriting behavior. The key is to keep examples small, representative, and aligned with the output format you want. Overloading few-shot examples can waste tokens and still fail if the schema is ambiguous.

Structured outputs are heavily tested because they reduce downstream parsing errors. If the scenario says “must return JSON” or “must integrate into an automated workflow,” you should think: enforce a strict schema, explicitly define allowed keys, and instruct the model to avoid extra prose. In real solutions, you may also validate and retry when schema violations occur.

Tools/functions overview: AI-102 increasingly expects awareness of tool calling (sometimes called function calling). The model can choose to call a tool (search, database lookup, calculator, ticket creation) and then incorporate the tool result into the final response. This is core to building reliable, “agentic” behaviors later, but in this chapter your focus is recognizing when tools are needed to meet requirements like “get current data” or “perform an action.”

Exam Tip: When you see requirements like “must not fabricate,” “use only provided sources,” or “respond with citations,” combine prompt constraints with an architectural control (RAG and/or tool calling). Prompting alone is rarely considered sufficient for strict grounding requirements.

Common trap: Assuming that adding more instructions always improves reliability. On the exam, the better answer often tightens constraints, adds clear schemas, and reduces ambiguity—rather than adding long, conflicting rules.

Section 3.3: Retrieval-augmented generation: embeddings, vector search, hybrid search

RAG is a centerpiece of the generative AI objective. The exam tests whether you can explain and implement the pipeline: create embeddings for documents and queries, store vectors in Azure AI Search, retrieve top matches, and inject the retrieved passages into the prompt for grounded generation.

Embeddings convert text into numeric vectors such that “semantic similarity” is measurable. In Azure AI Search, vector search uses these embeddings to retrieve meaningfully related content even when keywords don’t match. That’s why vector search is the go-to when users ask questions in different wording than the source documents.

Hybrid search (keyword + vector) is often the best practical default. Keyword search helps with precision for exact terms (policy IDs, product names, error codes) while vector search helps with semantic recall. Many AI-102 scenarios hint at this: if users search by “part number” or exact phrase, keyword matters; if they ask open-ended questions, semantic matching matters. Hybrid helps both.

  • Vector search: best for semantic similarity; needs embeddings and vector fields in the index.
  • Keyword (BM25): best for exact matches; needs searchable text fields and analyzers.
  • Hybrid: combines both scoring signals; frequently the most robust choice.

Exam Tip: If a question mentions “synonyms,” “similar meaning,” or “users phrase things differently,” think embeddings/vector or hybrid. If it mentions “exact identifiers” or “must match a code,” think keyword or hybrid with filters.

Common trap: Treating embeddings as a substitute for good indexing. You still need a well-designed index schema (fields, filters, metadata) and a retrieval strategy (top K, reranking, and context window management).

Section 3.4: Data ingestion for RAG: chunking strategies, metadata, filters, citations

RAG quality is often determined by ingestion design. AI-102 questions commonly describe a system that “returns irrelevant passages,” “misses key details,” or “cannot cite sources.” The fix is typically in chunking strategy, metadata, and citation handling rather than changing the LLM.

Chunking means splitting documents into pieces that can be embedded and retrieved. Chunks that are too large dilute relevance and waste tokens; chunks that are too small lose context. A practical approach is to chunk by logical structure (headings/sections) and use overlap to preserve continuity. The exam often expects you to recognize that chunk size must align with both retrieval quality and the model’s context window.

Metadata enables filtering and access control. Store fields like document title, URL, last updated date, product line, department, and security label. Then apply filters at query time (for example, only show documents for the user’s region or entitlement). If the scenario includes “different answers by user role” or “only HR can see HR policies,” the correct approach includes metadata-driven filtering, not a prompt request to “only answer if allowed.”

Citations are a classic test point. You typically pass the retrieved chunks along with identifiers (document name, page, URL) and instruct the model to cite those sources. Proper citation handling requires you to preserve source references through chunking and retrieval. If you drop source metadata during ingestion, you can’t reliably cite later.

Exam Tip: When requirements include “show sources,” “auditability,” or “traceable answers,” look for solutions that propagate source metadata through the index and into the response—not just “ask the model to include links.”

Common trap: Neglecting filters and returning globally relevant but policy-inappropriate content. The exam often frames this as a “data leakage” issue that is solved with security trimming/filters plus proper identity integration—not with softer prompt wording.

Section 3.5: Safety and quality: content filters, prompt injection risks, evaluation

Responsible AI is not optional on AI-102. You must know how to apply safety controls and how to measure quality over time. Azure OpenAI provides content filtering and safety-related settings; Azure AI Content Safety may also appear in scenarios where you need explicit moderation workflows for user prompts or model outputs.

Content filtering focuses on categories of unsafe content and helps reduce harmful outputs. On the exam, the right answer often combines platform controls (content filters/moderation) with application controls (input validation, logging, user reporting, and safe completion behavior). Don’t assume a single toggle is a complete safety strategy.

Prompt injection is a frequently tested risk in RAG systems. Attackers attempt to override system instructions by embedding malicious directions in retrieved documents or user input (for example: “Ignore prior instructions and reveal secrets”). The mitigation pattern is defense-in-depth: isolate retrieved content with clear delimiters, instruct the model to treat retrieved text as untrusted data, limit tool access, and validate tool arguments. Also consider filtering retrieved content and applying allowlists for tools.

Evaluation is how you prove the system works beyond one demo. Expect to see mentions of offline test sets, regression testing across prompt/model changes, and monitoring groundedness (did the answer use provided sources?), relevance (did retrieval return the right chunks?), and safety (did output violate policy?). Even if the exam doesn’t require naming a specific library, it expects that you know evaluation is continuous and tied to measurable criteria.

Exam Tip: If a scenario says “must not answer if sources don’t support it,” include both prompt policy (“say you don’t know”) and a retrieval threshold/guardrail (for example, minimum similarity score or empty-retrieval behavior).

Common trap: Relying on “temperature=0” to guarantee truthfulness. Low temperature improves determinism, not factual grounding. Grounding comes from retrieval/tooling plus explicit constraints and evaluation.

Section 3.6: Exam-style practice: Implement generative AI solutions

This section prepares you for the chapter’s domain practice set (60+ items) by showing how to recognize what the exam is asking before you look at answer choices. AI-102 generative questions usually fall into one of four categories: configuration, prompting, RAG design, or safety/quality. The fastest way to score points is to map the scenario’s requirement keywords to the correct technical lever.

  • Configuration signals: “throttling,” “latency,” “token limit exceeded,” “cost spike,” “needs separate workloads.” Think deployments, quotas, max_tokens, and model choice.
  • Prompting signals: “format must be JSON,” “must follow steps,” “inconsistent responses,” “needs deterministic output.” Think system instructions, structured outputs, few-shot examples, and tighter generation parameters.
  • RAG signals: “use internal documents,” “cite sources,” “latest policy,” “reduce hallucinations.” Think embeddings + Azure AI Search + hybrid/vector search + grounding prompt template.
  • Safety/quality signals: “jailbreak,” “prompt injection,” “harmful content,” “needs monitoring,” “must refuse.” Think content filters, moderation, tool restrictions, retrieval thresholds, and evaluation.

Also practice eliminating wrong answers. If the requirement is about retrieval relevance, changing temperature won’t fix it. If the requirement is access control, a prompt instruction won’t enforce it. If the requirement is citations, you need source metadata and a response format that preserves it.

Exam Tip: In multi-select questions, expect at least one choice related to architecture (RAG/search), one related to prompt/formatting, and one related to safety/monitoring. The exam likes “complete solutions,” not single-feature fixes.

Finally, remember that generative AI solutions are systems. The model is only one component. The exam rewards candidates who design end-to-end: ingestion → indexing → retrieval → prompting → post-processing → evaluation/monitoring.

Chapter milestones
  • Azure OpenAI fundamentals (models, tokens, deployments, parameters)
  • Prompt engineering and system design for reliability
  • RAG with Azure AI Search (indexing, chunking, embeddings, citations)
  • Content safety, grounding, evaluation, and responsible AI
  • Domain practice set: 60+ questions mapped to generative AI objectives
Chapter quiz

1. You are building a customer support copilot using Azure OpenAI. The business requirement states: responses must be grounded in the latest internal policy documents and must include citations to the source paragraphs. Which design best meets the requirement?

Show answer
Correct answer: Implement RAG by indexing policy documents in Azure AI Search, generate embeddings for chunks, retrieve top matches per query, and have the model answer using retrieved passages with citations.
RAG with Azure AI Search is the exam-expected approach when requirements explicitly call for company data + grounded answers + citations. Option B is insufficient because prompt-only instructions cannot guarantee grounding or verifiable citations without a knowledge source. Option C may improve style/consistency but does not inherently provide traceable citations to specific document chunks and is slower to keep current as policies change.

2. A team reports that their Azure OpenAI chat solution produces different answers for the same user question across runs. They need more deterministic, repeatable outputs for a compliance workflow. Which parameter change is MOST appropriate?

Show answer
Correct answer: Lower temperature (and optionally constrain top_p) to reduce randomness in token sampling.
Lowering temperature reduces sampling randomness and is a standard control for more repeatable outputs; limiting top_p can further reduce variance. Option A does the opposite by increasing randomness. Option C affects response length/cost but does not directly address nondeterminism in generation.

3. You are implementing ingestion for RAG with Azure AI Search. The documents are long PDFs with multiple sections. Users often ask questions that require specific details from a single section. Which chunking strategy is MOST likely to improve retrieval quality and reduce hallucinations?

Show answer
Correct answer: Chunk documents into smaller, semantically coherent passages (optionally with overlap) before generating embeddings and indexing them.
Chunking into semantically coherent passages improves embedding relevance and retrieval precision, which helps ground responses and reduce hallucinations. Option A typically harms precision because embeddings represent an entire document, making it harder to retrieve the most relevant section. Option C is brittle and costly due to token limits and does not scale; exam scenarios usually expect Azure AI Search indexing rather than stuffing full documents into prompts.

4. You are designing a production generative AI app in Azure. A requirement states: the solution must block or filter hateful/sexual content and provide an auditable safety configuration. Which Azure capability should you use?

Show answer
Correct answer: Azure AI Content Safety (or Azure OpenAI content filtering policies) to detect and filter unsafe categories and log decisions.
Azure AI Content Safety / Azure OpenAI content filtering is the correct control for detecting and enforcing policy-based filtering with auditable configuration. Option B is not sufficient because prompts are not a reliable enforcement boundary. Option C improves grounding but does not prevent unsafe user prompts or unsafe generations; retrieval does not equal safety enforcement.

5. A company has a generative AI assistant that answers questions from product manuals via RAG. They want to evaluate whether answers are grounded in retrieved content and to detect hallucinations during testing. Which evaluation approach best aligns with this goal?

Show answer
Correct answer: Run evaluations that compare the model’s answer to retrieved passages (groundedness/faithfulness) and track metrics over a test set; investigate low-scoring cases.
Groundedness/faithfulness evaluation checks whether responses are supported by retrieved sources, which is the key RAG reliability concern on the exam. Option B measures cost/length but not factual support. Option C confuses diversity with hallucination; higher temperature increases variation but does not provide evidence that outputs are unsupported by sources.

Chapter 4: Implement an Agentic Solution + NLP Solutions

This chapter targets two AI-102 clusters you will see repeatedly in scenarios: implementing agentic solutions (tools/functions, orchestration patterns, memory/state, evaluation) and implementing NLP solutions (classification, extraction, summarization, Q&A, speech, and translation). The exam rarely asks for “definitions only.” Instead, it tests whether you can choose the right orchestration and language service for a requirement, recognize reliability and safety gaps, and pick the most appropriate Azure building block when constraints (latency, cost, privacy, or accuracy) are specified.

In practice items, pay attention to phrasing like “must call an internal API,” “multi-step workflow,” “maintain conversation history,” “support multilingual,” “PII,” and “human-in-the-loop.” These phrases are cues pointing to tools/function calling, state management, grounding (often via Azure AI Search), and governance controls. This chapter also connects speech and translation choices to conversational experiences—another common exam pattern.

Exam Tip: When a question describes “the model should decide which operation to run,” that’s an agent/tool-orchestration cue. When it says “extract key phrases/entities/sentiment,” that’s an Azure AI Language cue. When it says “convert audio calls to text,” that’s Speech to text; “real-time captions” implies streaming; “translate documents” often implies Translator document translation rather than simple text translation.

Practice note for Agent design: goals, planning, tool use, and orchestration patterns: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Function calling/tools and state: memory, grounding, and session handling: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Language solutions: classification, extraction, summarization, and Q&A: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Speech and translation building blocks and integration choices: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Domain practice set: 60+ questions covering agentic + NLP objectives: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Agent design: goals, planning, tool use, and orchestration patterns: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Function calling/tools and state: memory, grounding, and session handling: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Language solutions: classification, extraction, summarization, and Q&A: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Speech and translation building blocks and integration choices: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Domain practice set: 60+ questions covering agentic + NLP objectives: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 4.1: Agent fundamentals: orchestration, planners, and multi-step workflows

AI-102 agent questions typically assess whether you can design a reliable multi-step workflow, not whether you can name an “agent framework.” An agentic solution combines (1) a goal, (2) an orchestrator (the control loop), (3) tools (functions/APIs), and (4) state (memory) so the system can plan, act, and verify results. You’ll see scenarios like “book a meeting, check calendar, email confirmation” or “triage support tickets, query knowledge base, open an incident.” These are multi-step, tool-using workflows.

Orchestration patterns commonly tested include: a simple tool-calling loop (model chooses tools), planner-executor (a planner creates steps; an executor runs them), and router patterns (classify intent then dispatch to specialized workflows). The exam often wants you to separate planning from execution when you need auditability, deterministic control, or compliance. If a scenario mentions “approval required” or “must log every step,” choose an orchestrator pattern that supports explicit step tracking and human-in-the-loop gates.

  • Single-agent loop: best for straightforward tasks with a few tools; simplest to implement but can be harder to constrain.
  • Planner + executor: improves transparency, allows validation of the plan, and is easier to test; good when the task is long-running or high-risk.
  • Multi-agent / specialist routing: use when tasks require different toolsets or domain-specific prompts; avoid if the requirement emphasizes simplicity or minimal latency.

Common trap: Picking “fine-tune the model” to solve a workflow problem. Multi-step business processes are usually solved with orchestration + tools + grounding, not fine-tuning. Another trap is assuming the model should “remember everything” implicitly—agents need explicit state handling and grounding to remain consistent over time.

Exam Tip: Look for requirements like “must be deterministic,” “must not take external actions without approval,” or “must be resilient to tool failures.” Those signal you should design an orchestrator that validates plans, enforces policy gates, and handles retries/timeouts rather than relying on a free-form chat loop.

Section 4.2: Tools/functions: schema design, reliability, retries, and guardrails

Tools (function calling) are the mechanism that lets a model invoke code safely: query a database, call an internal API, run a search, or trigger a workflow. AI-102 questions focus on designing tool schemas that are precise, constrained, and testable. A strong schema has clear names, typed parameters, allowed enums, and descriptions that steer the model toward correct inputs. When the question says “reduce hallucinations when calling APIs,” the answer is rarely “better prompt” alone—tight schema + validation + retries is the core.

Reliability patterns you should know: validate inputs server-side, implement idempotency for operations (so retries don’t double-charge or double-create), and use structured error returns that the model can reason about (e.g., error codes and human-readable messages). If you need consistent outputs for downstream systems, force structured responses (JSON) and enforce schema validation. If the requirement mentions “must not expose secrets,” ensure tool results do not return credentials; use managed identities and secure secret storage rather than embedding keys in prompts.

  • Retries/timeouts: implement exponential backoff for transient failures; set timeouts to prevent infinite tool loops.
  • Guardrails: restrict tool availability by user role and context; add allowlists for domains/endpoints; require confirmation for destructive actions.
  • Grounding: tools like search retrieval reduce hallucination by providing sources; ensure the model is instructed to cite or reference tool outputs.

Common trap: Overloading one tool with many responsibilities (“do_everything”). On the exam, prefer small, composable tools because they are easier to secure and validate. Another trap: assuming function calling automatically makes actions safe. The model can still request risky calls unless your orchestrator enforces policy (e.g., “no delete unless user confirmed”).

Exam Tip: When an option mentions “validate parameters,” “use enums,” “server-side checks,” “idempotent operations,” or “human confirmation,” those are strong signals for the most secure/reliable tool design choice—exactly what AI-102 likes to reward.

Section 4.3: Memory and context: vector stores, conversation state, privacy considerations

Memory is one of the most tested agent topics because it’s easy to get wrong in production and easy to describe in exam scenarios. Distinguish between (1) short-term context (the chat history in the prompt window), (2) long-term semantic memory (vector store embeddings), and (3) operational state (explicit variables like “current step,” “selected customer,” “cart items”). The exam often expects you to choose the right kind of memory for the requirement rather than “store everything.”

Vector memory is best for “remember relevant facts” and “retrieve similar prior cases” (semantic search). Operational state is best for workflow correctness (e.g., the agent must know it has already created a ticket). Conversation state is best for continuity, but it’s bounded by token limits and can leak sensitive content if not handled carefully.

  • Vector stores: store embeddings of notes, prior chats, or documents; retrieve top-k relevant chunks; use metadata filters (userId, tenantId) to prevent cross-user leakage.
  • Session handling: persist conversation IDs; store structured state separately from free-text; design for expiration and data minimization.
  • Grounding vs memory: don’t confuse “memory” with “authoritative knowledge.” For factual answers, grounding via retrieval from approved sources is safer than recalling prior conversation.

Privacy and governance appear in subtle wording: “PII,” “regulated,” “data residency,” “do not store user prompts,” or “allow users to delete history.” In those cases, prefer ephemeral session context, encryption at rest, strict access controls, and retention policies. If the scenario requires multi-tenant isolation, you need metadata partitioning and access checks in your retrieval layer.

Common trap: Storing raw conversation transcripts into a vector store without tenant/user filters. That can create cross-tenant retrieval, a serious security failure and a common exam “gotcha.”

Exam Tip: If a requirement says “the assistant should remember the user’s preference,” choose structured profile storage (key-value) rather than stuffing preferences into prompt text. If it says “find similar past incidents,” choose embeddings + vector search with filters.

Section 4.4: Azure AI Language: entity extraction, sentiment, summarization, custom models

Azure AI Language is the exam’s home for classic NLP: entity recognition, key phrase extraction, sentiment analysis, summarization, and classification. AI-102 scenario questions usually describe business outcomes—“tag emails,” “extract contract terms,” “summarize call notes,” “detect negative feedback”—and you must map them to the right Language capability.

For extraction tasks, think in terms of general named entities versus domain-specific fields. If the prompt says “extract people/locations/organizations,” prebuilt entity recognition is often enough. If it says “extract policy number, claim type, deductible,” you likely need a custom model (custom Named Entity Recognition) trained with labeled examples. For classification, choose custom text classification when labels are business-specific (e.g., “Billing Dispute,” “Cancellation,” “Fraud Suspected”). For summarization, ensure you recognize the distinction between summarizing a single document and summarizing conversations; scenarios may require concise “action item” summaries for support workflows.

  • Sentiment: used for customer feedback routing and escalation; beware that “sentiment” is not the same as “toxicity/safety,” which is typically handled by separate safety approaches.
  • Extractive vs abstractive summarization: extractive pulls key sentences; abstractive generates new phrasing. If the requirement says “must not introduce new information,” extractive is safer.
  • Q&A: if the scenario is about answering from a curated knowledge base, align to Q&A patterns (often paired with retrieval and grounding).

Common trap: Choosing generative summarization when the scenario demands auditability and “no new content.” Another trap is choosing prebuilt models when the scenario clearly lists domain-specific entities—AI-102 expects you to recognize when custom NER/classification is needed.

Exam Tip: Watch for verbs: “classify” (labels), “extract” (entities/phrases), “summarize” (shorten), “analyze opinions” (sentiment). When the question says “domain-specific fields,” “custom model” is often the differentiator.

Section 4.5: Speech and translation: speech-to-text/text-to-speech and Translator choices

Speech and translation questions frequently appear as integration choices: which service, which mode (batch vs real-time), and what constraints apply (latency, streaming, speaker separation, or custom vocabulary). Speech-to-text (STT) is used for transcribing audio; text-to-speech (TTS) is used to generate spoken responses. The exam often uses clues like “live captions,” “call center streaming,” or “near real-time” to push you toward streaming transcription rather than offline/batch processing.

For translation, the key decision is whether you are translating short text strings, conversations, or entire documents. Azure AI Translator supports text translation; document translation is tailored for files and preserving structure. If a scenario mentions “translate PDFs or Office documents while preserving formatting,” select document translation. If it mentions “translate chat messages,” text translation is typically sufficient. For multilingual voice experiences, you may combine STT → Translator → TTS, but you should consider where to handle language detection and whether you need consistent terminology (custom glossary/terminology features).

  • Speech integration: use STT for input, optionally diarization (speaker identification) if the scenario needs “Agent vs Customer” separation.
  • TTS choices: choose neural voices when naturalness is required; consider caching for repeated prompts to reduce cost/latency.
  • End-to-end conversational apps: ensure you separate “recognition errors” from “model reasoning errors” when troubleshooting.

Common trap: Using translation to solve “summarize foreign language documents.” Translation changes language; summarization changes length. Many scenarios require both, in the right order (often translate → summarize, or summarize in-source-language if supported). Another trap is ignoring streaming requirements—batch transcription won’t satisfy “real-time captions.”

Exam Tip: If the requirement includes “preserve document layout,” think document translation. If it includes “live,” “stream,” or “captions,” think streaming STT. If it includes “brand voice,” “natural speech,” think neural TTS and consider pronunciation/lexicon tuning.

Section 4.6: Exam-style practice: Implement an agentic solution and NLP solutions

This domain combines agent design and NLP selection in scenario form. Your job is to map requirements to architecture choices. A strong approach is to read the scenario and mark keywords in four buckets: actions (tools), knowledge (grounding/retrieval), state (memory/session), and language tasks (classification/extraction/summarization/speech/translation). Most wrong answers fail one bucket—e.g., they propose a tool but omit state handling, or they propose NLP analysis but ignore multilingual requirements.

When evaluating options, ask: does the solution (1) constrain tool inputs, (2) validate outputs, (3) handle failures, and (4) respect privacy boundaries? For example, if an agent must “open a ticket,” the correct design includes an idempotent create-ticket tool, confirmation before submission, and a stored “ticketCreated=true” state to prevent duplicates. If it must “summarize and tag emails,” the correct mapping is summarization + custom classification (if labels are business-specific) and entity extraction (if fields are needed). If it must “support voice calls,” add STT and possibly diarization; if it must “support multiple languages,” add Translator and decide whether translation happens before or after NLP analysis.

  • Identify the service vs the pattern: the exam may offer both a service name and an orchestration pattern; you usually need both to satisfy the full requirement.
  • Prefer explicit state: for workflow correctness, store structured state, not just chat history.
  • Prefer grounded answers: when correctness matters, retrieve from authoritative sources and instruct the model to rely on retrieved context.

Common trap: Picking the “most powerful” option (e.g., generative model for everything) when the requirement calls for a deterministic, auditable NLP API (entity extraction/classification) or a controlled tool workflow. Another trap is missing security constraints: if it says “internal-only,” ensure private networking/auth is implied; if it says “no data retention,” ensure memory choices support minimization.

Exam Tip: If two answers both meet functional needs, the exam often rewards the one that adds governance: validation, least privilege tool access, tenant filtering in vector retrieval, and clear separation of planning/execution for auditability.

Chapter milestones
  • Agent design: goals, planning, tool use, and orchestration patterns
  • Function calling/tools and state: memory, grounding, and session handling
  • Language solutions: classification, extraction, summarization, and Q&A
  • Speech and translation building blocks and integration choices
  • Domain practice set: 60+ questions covering agentic + NLP objectives
Chapter quiz

1. A support chatbot must decide at runtime whether to (1) look up an order in an internal Order API, (2) create a return request, or (3) answer a general policy question from a knowledge base. The requirement states: “The model should decide which operation to run and must not hallucinate order status.” Which design best fits the requirement?

Show answer
Correct answer: Use an agent pattern with function calling/tools for the Order API and a retrieval step (grounding) from Azure AI Search for policy content
Function calling/tool use is the exam cue for “model should decide which operation to run,” and grounding via retrieval (commonly Azure AI Search) addresses the “must not hallucinate” requirement for policies while the Order API is the source of truth for order status. Fine-tuning (B) does not reliably provide real-time order status and still risks hallucination. A prompt-only approach (C) is brittle, increases user burden, and does not guarantee correctness or safe handling of sensitive/order data.

2. You are implementing a multi-turn agent that schedules appointments. The agent must remember user preferences (time window, location) during a session, but the system must avoid storing personal data longer than 30 minutes. Which approach best meets the requirement?

Show answer
Correct answer: Store conversation state in a short-lived session store with a 30-minute TTL and only persist minimal, necessary fields (not full transcripts)
Session handling with explicit state management (short-lived storage + TTL) aligns with the requirement to remember preferences during the session while enforcing retention limits. Long-term vector memory (B) violates the retention constraint and increases privacy risk. Using only the context window (C) can fail across turns as the conversation grows or if the system needs to scale across servers—state is not reliably maintained without external session storage.

3. A company needs to process thousands of customer emails per hour. For each email, they must: (1) detect the language, (2) classify it into one of 15 categories (billing, cancellation, technical issue, etc.), and (3) extract key entities like account number and product name. Which Azure capability is the most appropriate primary service for these NLP tasks?

Show answer
Correct answer: Azure AI Language (Text Analytics / custom classification and entity extraction capabilities)
Azure AI Language is designed for language detection, classification, and entity extraction—common AI-102 NLP objectives. Speech (B) is for audio, not email text. Translator (C) can translate but does not provide classification/extraction as the primary capability; translating first is unnecessary unless explicitly required and adds cost/latency.

4. You are building a call-center solution that must provide real-time captions during live calls, with minimal latency, and store the final transcript after the call ends. Which building block is the best fit for the real-time caption requirement?

Show answer
Correct answer: Azure AI Speech to text using streaming recognition for real-time transcription
Real-time captions imply streaming speech-to-text. Speech to text supports low-latency streaming transcription. Summarization (B) is a post-processing NLP task and doesn’t produce live captions from audio. Document translation (C) is for translating documents/transcripts and is not a real-time captioning solution.

5. A compliance team needs to translate large batches of legal documents (PDF/DOCX) between languages while preserving document structure and formatting. Which Azure translation option best matches this requirement?

Show answer
Correct answer: Azure AI Translator Document Translation
Document Translation is designed for batch translation of documents while preserving layout/formatting and supports storage-based workflows. Text Translation (B) is suited for strings/snippets and typically requires you to handle document parsing and reassembly yourself, risking formatting loss. Prompt-based translation with an LLM (C) is brittle for large documents (context limits), harder to govern, and not purpose-built for preserving document structure at scale.

Chapter 5: Implement Computer Vision + Knowledge Mining and Information Extraction

This chapter targets the AI-102 objectives that frequently appear as scenario-based questions: selecting the right computer vision capability (image analysis vs OCR vs document extraction), designing pipelines that meet latency/throughput constraints, and building knowledge mining solutions with Azure AI Search (indexes, enrichment, semantic ranking, and vector search). The exam rarely rewards “memorize the SKU” answers; instead, it tests whether you can map a business requirement to the correct service feature, then anticipate operational constraints (cost, scale, security, and integration patterns).

You should read each prompt and classify it into one of three buckets: (1) “understand an image” (tags, captions, objects, people, spatial relationships), (2) “read text” (OCR, handwriting, layout), or (3) “turn documents into searchable knowledge” (chunking, extraction, enrichment, indexing, and retrieval). Many wrong options will sound plausible but fail one critical requirement—like needing structured table extraction, requiring private networking, or needing near-real-time updates.

Exam Tip: When a question includes “search across documents,” “enrich content,” “synonyms/analyzers,” “semantic answers,” or “vector similarity,” you are in Azure AI Search territory—not “just store embeddings” or “just use a database.” Conversely, if the question includes “key-value pairs,” “line items,” “tables,” “invoices,” or “forms,” you should think Document Intelligence first, with Azure AI Search as the retrieval layer.

The sections below align to the chapter lessons: vision fundamentals; pipeline design (batch vs real-time and edge); search architecture and enrichment; and information extraction workflows that connect Document Intelligence to search indexers and skillsets. The final section prepares you for the domain practice set by showing how to recognize what the exam is actually asking for—without giving you memorized one-liners that break under a new scenario.

Practice note for Vision fundamentals: image analysis, OCR, and document scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Design vision pipelines (batch vs real-time, edge considerations): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Knowledge mining architecture: Azure AI Search indexes and enrichment: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Information extraction workflows with Document Intelligence and skillsets: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Domain practice set: 60+ questions for vision + knowledge mining objectives: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Vision fundamentals: image analysis, OCR, and document scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Design vision pipelines (batch vs real-time, edge considerations): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Knowledge mining architecture: Azure AI Search indexes and enrichment: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 5.1: Azure AI Vision capabilities: image analysis, spatial understanding concepts

Azure AI Vision (and related vision capabilities) is tested in AI-102 primarily through feature selection: can you choose the right API output for a given requirement? “Image analysis” scenarios commonly ask for captions, tags, object detection, brand/logo detection, adult content detection, and people-related insights. Your job is to map the business question (“What’s in this image?”) to the correct capability, and then to describe how you would operationalize it (input sources, response fields, confidence thresholds, and error handling).

Spatial understanding concepts show up as “where is the object relative to others?” or “extract bounding boxes/regions.” The exam often expects you to know that many vision outputs include coordinates (bounding boxes, polygons) and confidence scores, which let you implement downstream logic (e.g., highlight an object in a UI, crop a region before OCR, or validate that an object is present before accepting a photo upload).

  • Use image analysis for high-level understanding: captions/tags, detected objects, scene description.
  • Use region metadata for spatial logic: overlay boxes, compute proximity, drive workflow decisions.
  • Use confidence scores to avoid brittle automation: set thresholds, route low-confidence cases to human review.

Exam Tip: Watch for trap answers that propose “Custom Vision” when no training requirement exists. If the prompt says “general objects,” “common scenes,” or “no labeled dataset,” use prebuilt image analysis. Custom models are justified when you need domain-specific classes (e.g., “detect this specific part defect”) and can label examples.

Also expect design questions about privacy and compliance: images can contain biometric identifiers or sensitive content. In a correct design, you typically store only what you need (metadata rather than full images), apply encryption at rest, and restrict access via managed identities and private endpoints where required.

Section 5.2: OCR and document scenarios: forms, tables, handwriting considerations

OCR is not just “read the text.” AI-102 frequently distinguishes between simple OCR (text lines/words) and document-centric extraction (layout, tables, key-value pairs). When a scenario includes receipts, invoices, insurance forms, or onboarding packets, the exam expects you to consider Document Intelligence for structure. OCR alone may return text, but not reliable field mapping or table reconstruction.

Handwriting is a classic edge case. If the prompt mentions handwritten notes, signatures, or mixed printed/handwritten forms, you must choose an approach that supports handwriting recognition and anticipate lower accuracy. A strong exam answer includes a mitigation plan: image preprocessing (deskew, denoise), capturing better input (mobile capture guidance), and a human-in-the-loop review for low-confidence fields.

  • Forms: prioritize key-value extraction and field confidence; define validation rules (dates, totals).
  • Tables: require layout-aware extraction; ensure you can preserve rows/columns and header alignment.
  • Handwriting: expect variability; use confidence thresholds and exception workflows.

Exam Tip: If the question says “extract line items,” “table rows,” or “map to a schema,” OCR-only options are usually wrong. The correct path is document extraction (prebuilt or custom models) that returns structured outputs and bounding regions.

Common trap: assuming all PDFs are text-searchable. Many PDFs are scanned images; you still need OCR or document extraction to make them searchable. Another trap is ignoring multi-page documents—ensure the design supports page-wise extraction and stable IDs so that you can reprocess only changed pages in an update pipeline.

Section 5.3: Vision solution design: latency, throughput, storage, and integration patterns

Design questions focus on pipeline shape. Start by classifying the workload: real-time (user waits for an answer) vs batch (process large backlogs). Real-time designs optimize latency and reliability: keep images small but readable, use direct API calls behind an API gateway, and implement retries with idempotency. Batch designs optimize throughput and cost: queue work, process asynchronously, and scale out with worker instances.

Integration patterns that show up on AI-102 include event-driven processing (blob upload triggers indexing or analysis) and decoupled microservices (API layer → queue → workers → storage). For edge considerations, the exam usually wants you to recognize constraints like limited bandwidth, intermittent connectivity, or data residency. In such cases, you may prefilter/compress at the edge, run lightweight validation locally, and upload only required artifacts for cloud inference—or design for local processing where permitted by the service and deployment model.

  • Latency: synchronous calls, small payloads, caching, and minimizing round trips.
  • Throughput: async queues, parallelism, batch submission, and backpressure controls.
  • Storage: store originals in Blob Storage; store derived metadata separately; track versions.
  • Integration: Functions/Logic Apps for orchestration; Event Grid for triggers; Service Bus for reliability.

Exam Tip: In architecture questions, look for hints like “must not lose messages,” “at-least-once processing,” or “bursty uploads.” Those indicate queue-based decoupling (e.g., Service Bus) rather than direct, synchronous processing.

A frequent trap is ignoring cost drivers. Vision calls scale with number of images/pages, resolution, and reprocessing frequency. A well-scored answer includes “avoid reprocessing unchanged documents,” “store extracted text/JSON,” and “monitor throughput and failures.” Another trap is sending sensitive images through multiple services unnecessarily; keep the data path minimal and controlled.

Section 5.4: Azure AI Search: indexes, analyzers, vector search, semantic ranking

Knowledge mining on AI-102 centers on Azure AI Search architecture: index design, ingestion, and query features. Your index is the contract between ingestion and retrieval—fields, types, and attributes (searchable, filterable, sortable, facetable, retrievable). The exam tests whether you can model fields correctly and choose analyzers that match the language and tokenization needs. For example, “part numbers” and “SKU-like identifiers” often require preserving tokens rather than aggressive stemming.

Vector search is tested as “find similar content” and “semantic retrieval” based on embeddings. You’ll often see scenarios involving RAG where documents are chunked, embedded, and stored in a vector field. Semantic ranking is distinct: it reorders results and can produce better snippets/answers for natural language queries using semantic configurations.

  • Indexes: define fields and attributes; plan for document chunking and metadata filters.
  • Analyzers: choose language analyzers; handle exact-match identifiers with appropriate field configuration.
  • Vector search: store embeddings; enable similarity queries; combine with metadata filtering.
  • Semantic ranking: improves relevance ordering and captions; requires semantic configuration.

Exam Tip: When the prompt says “hybrid search” (keyword + vector), the correct answer typically includes both a searchable text field and a vector field plus filters (e.g., security trimming). Don’t confuse semantic ranking with vector similarity—semantic ranking doesn’t replace embeddings.

Common traps: (1) forgetting security trimming (per-user/role access) and returning results a user shouldn’t see; (2) indexing huge documents without chunking, leading to poor relevance and token limits downstream; (3) using the wrong field attributes—if a field must be used in filters (like department, region, classification), it must be filterable.

Section 5.5: Enrichment and extraction: skillsets, indexers, Document Intelligence integration

This is where the exam links ingestion + AI enrichment. An Azure AI Search indexer pulls content from a data source (commonly Blob Storage), optionally uses a skillset to enrich it, then writes enriched fields into the search index. The skillset is a pipeline of skills (built-in or custom) that can extract text, detect language, split text, extract entities/key phrases, and call out to custom web APIs. For document-heavy solutions, Document Intelligence frequently provides structured extraction (fields, tables) that becomes searchable content and metadata.

A practical design: store raw documents in Blob Storage, run Document Intelligence to produce JSON (fields, tables, confidence), then map those outputs into the index—either via an indexer + skillset that processes extracted text, or via an application layer that pushes documents into the index. The exam will probe which component does what: indexers ingest, skillsets enrich, the index stores fields, and semantic/vector features apply at query time.

  • Skillsets: define enrichment steps; map inputs/outputs; control projections into index fields.
  • Indexers: schedule and incremental updates; connect to Blob/SQL/Cosmos sources.
  • Document Intelligence: structured extraction for forms and tables; feed outputs to search.
  • Custom skills: use when built-in skills don’t meet domain needs; remember scalability and auth.

Exam Tip: If the scenario requires “extract invoice fields and enable search,” the best answer usually combines Document Intelligence (extraction) + Azure AI Search (indexing/retrieval). Answers that suggest “OCR then regex” are typically traps unless the prompt explicitly states simple, fixed-format text.

Another trap is failing to plan for enrichment errors. Skillsets can fail per document; robust designs include storing enrichment status, capturing diagnostic logs, and reprocessing only failed items. Also be alert for schema drift: when extraction models evolve, you may need versioned fields or index rebuild strategies.

Section 5.6: Exam-style practice: Computer vision and knowledge mining

Your practice set for this chapter will feel “architectural”: long prompts with multiple constraints. To consistently score well, use a repeatable decision process. First, underline the primary task (image understanding vs reading text vs extracting structured fields vs enabling search). Second, identify nonfunctional constraints (latency, volume, security, network isolation, cost). Third, select the minimal set of services that satisfy the requirements, then verify each requirement is covered by an explicit feature (not a hope).

For vision items, expect distractors that mix up image analysis and OCR. If the output needed is “caption/tags/objects,” pick image analysis. If the output needed is “text,” choose OCR or Document Intelligence depending on whether structure is required. For knowledge mining, distinguish between (a) storing data, (b) indexing data, and (c) ranking/retrieving data. Azure AI Search is the index + retrieval engine; Blob Storage is not a search engine; embeddings alone do not provide filtering, faceting, or enterprise search controls.

  • Correct-answer signal: explicit mention of index fields, analyzers, semantic config, vector fields, skillsets, or indexers.
  • Correct-answer signal: chunking strategy, metadata filters, and security trimming in retrieval.
  • Trap signal: proposing training a custom model without labeled data/time, or ignoring table/key-value needs.
  • Trap signal: synchronous designs for high-volume backlogs or queue-less designs for bursty ingestion.

Exam Tip: In multi-choice architecture questions, eliminate options that fail a single “must” requirement (e.g., “must support incremental updates,” “must keep data private,” “must return only authorized results”). AI-102 often includes one option that is “mostly right” but misses an operational requirement like monitoring, retries, or incremental indexing.

As you work the domain practice set, practice explaining your selection in one or two sentences: “Use Document Intelligence to extract structured fields and tables, store JSON + confidence, push enriched chunks and metadata into Azure AI Search with vector + keyword fields, then use semantic ranking and filters for retrieval.” If you can articulate that mapping quickly, you’re thinking the way the exam expects.

Chapter milestones
  • Vision fundamentals: image analysis, OCR, and document scenarios
  • Design vision pipelines (batch vs real-time, edge considerations)
  • Knowledge mining architecture: Azure AI Search indexes and enrichment
  • Information extraction workflows with Document Intelligence and skillsets
  • Domain practice set: 60+ questions for vision + knowledge mining objectives
Chapter quiz

1. A retail company wants to process millions of product photos to generate alt text (captions) and detect whether a person is present. The processing can run overnight, and results will be stored for later use on the website. Which Azure capability is the best fit?

Show answer
Correct answer: Azure AI Vision Image Analysis to generate captions and detect people/objects in batch
Image Analysis is designed to “understand an image” (captions, tags, object/person detection) and suits offline/batch processing. Document Intelligence targets document/form understanding (key-value pairs, tables, line items) and is not intended for general product photo captioning. Azure AI Search indexers can ingest content and run enrichment, but Search is the retrieval layer; it does not replace vision analysis for generating captions/object detection and would typically store the outputs in an index rather than being the primary analysis service.

2. A manufacturing company needs to detect safety-gear compliance from camera frames on an assembly line. The decision must be made within 200 ms and the site has intermittent internet connectivity. Which design best meets the requirement?

Show answer
Correct answer: Deploy a vision model at the edge (e.g., containerized inference) and send summarized results to the cloud when available
Edge inference addresses both latency (local processing within tight SLAs) and unreliable connectivity (cloud sync when available). Sending frames to a cloud endpoint is likely to violate the 200 ms requirement and fails during outages. Azure AI Search is for indexing and retrieval (text/vector/semantic ranking), not real-time computer vision classification from video frames.

3. A legal firm wants to search across PDFs, scanned images, and Office documents. They need enrichment (OCR for scanned pages, key phrase/entity extraction) and features like semantic answers and vector similarity for RAG. Which architecture best fits?

Show answer
Correct answer: Azure AI Search index with indexers and skillsets for enrichment, enabling semantic ranking and vector search
Azure AI Search is the purpose-built knowledge mining layer: it supports indexing, enrichment via skillsets (including OCR and NLP skills), and retrieval features like semantic ranking/answers and vector search. Embeddings-only storage without a search index misses core search capabilities (filters, analyzers, scoring, semantic answers, hybrid retrieval, incremental indexing). OCR plus SQL text matching lacks enrichment orchestration, relevance ranking, and modern retrieval patterns expected in AI-102 scenarios.

4. An accounts payable team must extract vendor name, invoice number, totals, and line items (table rows) from PDF invoices and scanned images. The output must be structured JSON for downstream processing. Which service should you choose first for extraction?

Show answer
Correct answer: Azure AI Document Intelligence (prebuilt invoice or custom model) to extract fields and tables
Document Intelligence is designed for information extraction from documents, including key-value pairs and table/line-item extraction from invoices and forms. Image Analysis focuses on visual understanding (tags/captions/objects) and does not provide reliable structured invoice/line-item extraction. An OCR-only enrichment pipeline in Azure AI Search can make invoice text searchable, but OCR alone does not produce the structured, schema-aligned fields/line items required for downstream processing; Search is typically the retrieval layer, not the primary extraction engine.

5. You build an Azure AI Search solution that indexes documents from Blob Storage. New documents must become searchable within minutes, and the index must include extracted entities and a vector field for similarity search. Which approach best meets the requirement with minimal custom code?

Show answer
Correct answer: Configure an Azure AI Search indexer with a schedule, attach a skillset for entity extraction and chunking, and write embeddings into a vector field in the index
An indexer + skillset is the standard knowledge-mining pattern for near-real-time updates (scheduled incremental indexing), enrichment (entity extraction/chunking), and indexing vectors for similarity search. A nightly full rebuild fails the “within minutes” requirement and increases cost/operational risk. Document Intelligence can extract content, but without Azure AI Search you lack an index optimized for retrieval (filters, relevance, semantic ranking, vector/hybrid search); folder naming is not a search solution and does not provide similarity search.

Chapter 6: Full Mock Exam and Final Review

This chapter is your dress rehearsal for AI-102. The goal is not to “see more content,” but to practice the exam behaviors that raise your score: time management, reading for intent, eliminating distractors, and identifying which Azure AI service is being tested. You will complete two timed mock blocks (mixed domains), perform a weak-spot analysis tied to the official objectives, and finish with a targeted final review plan that prioritizes high-frequency decision points (service selection, security/governance, RAG patterns, and evaluation).

As you work through this chapter, keep the AI-102 outcomes in view: planning and managing an Azure AI solution (governance, security, monitoring, cost, deployment), implementing generative AI solutions (Azure OpenAI, prompt design, RAG, safety), implementing agentic solutions (tools/functions, orchestration, memory, evaluation), and building solutions across Vision, Language, Speech, and knowledge mining with Azure AI Search, enrichment, and Document Intelligence.

Exam Tip: Your score is often decided by “service fit” and “operationalization” details, not by definitions. If two answers sound plausible, the correct one usually aligns best with a constraint in the prompt: latency, private networking, data residency, cost control, evaluation/monitoring, or content safety requirements.

Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Weak Spot Analysis: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Exam Day Checklist: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Final objective-by-objective review plan: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Weak Spot Analysis: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Exam Day Checklist: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Final objective-by-objective review plan: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 6.1: Mock exam instructions, timing strategy, and question triage

This mock exam is split into two parts to mirror the experience of switching between independent questions and scenario-style reasoning. Use exam-like conditions: single sitting per part, no notes, no documentation lookup, and no “just checking” after each item. Your objective is to train your pacing and decision-making under pressure.

Timing strategy: budget your time per question and protect time for review. If you find yourself re-reading the same paragraph more than twice, you are likely in a rabbit hole. Mark it and move on. Build a triage habit: answer “fast wins” immediately, defer “calculation/architecture” items, and isolate “ambiguous wording” items for second pass.

  • Pass 1: Answer items you can solve confidently in under 60–90 seconds.
  • Pass 2: Return to medium items; eliminate distractors using constraints (security, data flow, scale, model choice, service capabilities).
  • Pass 3: Resolve remaining items; if still unsure, choose the option that best satisfies governance and operational requirements.

Exam Tip: Watch for “multi-service” traps. Many AI-102 questions test whether you understand boundaries: Azure AI Search is for retrieval and indexing, Azure OpenAI is for generation, Document Intelligence is for structured extraction from documents, and Azure AI Vision is for image analysis/OCR. If an answer suggests one service can do all steps end-to-end without orchestration, treat it skeptically.

Question triage cues: if you see terms like “private endpoint,” “customer-managed keys,” “managed identity,” or “RBAC,” the question is likely testing governance/security and not the model choice itself. If you see “grounding,” “citations,” “hallucinations,” “semantic ranker,” or “hybrid search,” it’s likely a RAG or Azure AI Search decision point.

Section 6.2: Mock Exam Part 1 (timed, mixed domains)

Mock Exam Part 1 should feel like a representative slice of AI-102: a mixed set across governance, generative AI, agentic orchestration, vision, language, and search. While you work, force yourself to name the objective being tested before selecting an answer (even silently). This prevents you from solving the wrong problem—one of the most common causes of avoidable misses.

What this block typically tests: (1) service selection under constraints, (2) secure deployment patterns, and (3) interpretation of operational requirements (monitoring, cost, latency). For example, if a scenario mentions enterprise controls, assume you must consider identity (managed identities), network isolation (private endpoints/VNet integration where applicable), and safe output controls (content filtering and logging strategy).

Exam Tip: When two answers both “work,” the exam often rewards the one that reduces operational risk: least-privilege access, minimized data movement, and built-in monitoring. If an option requires you to store secrets in code or skip content safety, it is almost never correct.

Common traps in Part 1: confusing embedding generation with retrieval, or mixing up OCR services. Remember the pipeline: you may use Document Intelligence to extract structured fields from PDFs/forms, use Azure AI Search to index text + vectors (and optionally semantic ranking), then use Azure OpenAI to generate grounded answers from retrieved chunks. Another trap is assuming an LLM “remembers” state without explicitly implementing memory (conversation state store, tool outputs, or message history management).

To identify correct answers, highlight the verb in the requirement: “extract,” “classify,” “translate,” “search,” “generate,” “monitor,” “secure,” “evaluate.” Then map it to the Azure AI capability. If the item mentions “evaluation,” look for language about offline test sets, metrics (quality, groundedness, safety), and continuous monitoring rather than a one-time manual review.

Section 6.3: Mock Exam Part 2 (timed, mixed domains + scenario questions)

Mock Exam Part 2 adds scenario-style reasoning where multiple design decisions must align: data ingestion, enrichment, indexing, retrieval, generation, and governance. Your job is to keep the architecture consistent from end to end. The exam frequently tests whether you can spot a mismatch—such as a RAG design that retrieves documents but fails to include citations/grounding instructions, or a knowledge mining pipeline that enriches content but never maps outputs into an index schema.

In scenario questions, read once for business goal and constraints, then read again for “non-negotiables”: private networking, regulated data, cost caps, latency SLOs, multilingual needs, or a requirement to handle images/forms. Those non-negotiables usually eliminate half the options immediately.

Exam Tip: Scenario items are where you win time by diagramming mentally: (1) source, (2) extraction/enrichment, (3) index/store, (4) retrieval, (5) generation/action, (6) monitoring/safety. If an option skips a stage the prompt clearly needs, it is likely wrong even if it sounds advanced.

High-frequency scenario patterns include: RAG with Azure AI Search (hybrid + vector + semantic), document processing with Document Intelligence, and agentic orchestration with tools/functions. For agentic items, watch for the distinction between: (a) tools/functions (how the model calls actions), (b) orchestration (control flow, retries, guardrails), (c) memory (conversation state, tool results, user profile), and (d) evaluation (test harnesses, regression suites, safety checks). The exam expects you to treat these as explicit components, not implicit “magic.”

Common trap: selecting an answer that uses a powerful model to “infer” structured data when the prompt indicates you need reliable extraction. If the question emphasizes accuracy and schema consistency, prefer Document Intelligence or well-defined extraction techniques, then optionally use an LLM for summarization or explanation—not for primary field capture.

Section 6.4: Results review framework: categorize misses by objective and root cause

After each mock part, do not merely check correct/incorrect. Convert results into an objective-by-objective remediation plan. Create a simple table with columns: Objective area, Question theme, Your choice, Correct choice, Why you missed it, What rule would have prevented the miss. This turns review into repeatable improvement.

Categorize each miss by root cause:

  • Concept gap: You did not know the capability boundary (e.g., Search vs OpenAI vs Document Intelligence).
  • Constraint miss: You overlooked a requirement (private endpoint, compliance, cost, latency, language support).
  • Distractor bias: You picked the “most complex” option rather than the most appropriate.
  • Execution error: You knew it but rushed (misread “not,” mixed up service names, reversed steps).

Exam Tip: If you see a pattern of “constraint misses,” your fix is not more study—it’s a reading protocol. Underline (mentally) constraints and restate them before answering. Many candidates lose points because they solve an imagined problem instead of the one on the screen.

Then tie misses back to the course outcomes: governance/security/monitoring issues map to “Plan and manage an Azure AI solution.” RAG and prompt safety map to “Implement generative AI solutions.” Tool calling, memory, and evaluation misses map to “Implement an agentic solution.” OCR/image analysis misses map to “Implement computer vision solutions.” Language, translation, and speech choices map to “Implement NLP solutions.” Indexing/enrichment/document extraction misses map to “Implement knowledge mining and information extraction.” Your final review plan in Section 6.5 should be based on these categories—not on what feels interesting.

Section 6.5: Final review: high-frequency services, limits, and decision points

Your final review is a “decision-point” review, not a feature tour. Focus on what the exam repeatedly asks you to choose between. Start with a shortlist of high-frequency services: Azure OpenAI (chat/completions, embeddings, safety patterns), Azure AI Search (vector/hybrid/semantic, index schema, ingestion), Document Intelligence (OCR + structured extraction from documents), Azure AI Vision (image analysis/OCR scenarios), Azure AI Language (classification, entity extraction, summarization), Speech (speech-to-text, text-to-speech, translation), and the operational layer (identity, networking, monitoring, cost governance).

Key decision points to rehearse:

  • RAG architecture: chunking strategy, embedding generation, vector + keyword hybrid retrieval, semantic ranking, grounding instructions, and citation handling.
  • Safety: where to apply content filtering, prompt injection mitigations, and logging/monitoring without leaking sensitive data.
  • Agent design: tool/function calling boundaries, orchestration flow, state/memory storage, and evaluation/regression testing.
  • Knowledge mining: when to use enrichments, how outputs map to searchable fields, and how to handle document formats.
  • Governance: RBAC/managed identity, key management, network controls, and cost controls.

Exam Tip: If an option promises better “accuracy” but ignores evaluation, it’s incomplete. The exam increasingly expects you to plan for measurement: curated test sets, automatic checks for groundedness, and monitoring for drift or unsafe outputs.

Also review limits and “gotchas” conceptually (without memorizing every number): token/latency trade-offs for model selection, the impact of chunk size on retrieval, and the operational cost of calling models repeatedly (agent loops, tool retries). When asked to optimize cost, look for answers that reduce calls (cache embeddings, reuse retrieved context, choose smaller models where appropriate) while preserving safety and monitoring.

Section 6.6: Exam day checklist: environment, pacing, and last-minute do’s/don’ts

On exam day, your goal is to eliminate preventable errors: distractions, pacing breakdowns, and second-guess spirals. Prepare your environment (quiet space, stable connectivity, comfortable display settings). If testing remotely, ensure you comply with proctor rules to avoid interruptions that cost time and focus.

Pacing checklist: commit to your triage passes from Section 6.1. Do not let a single scenario consume disproportionate time early. Maintain forward motion and rely on review passes to recover difficult items.

  • Before start: breathe, skim the interface, confirm time remaining display, plan pass structure.
  • During: read constraints first, map to objective, eliminate options that violate governance/safety, then decide.
  • Review: revisit only marked items; do not re-open items you answered confidently unless you spot a clear misread.

Exam Tip: Last-minute do: rehearse service boundaries and end-to-end pipelines (documents → extraction → indexing → retrieval → generation → monitoring). Last-minute don’t: cram obscure features. AI-102 is strongest on applied architecture choices, operational readiness, and matching requirements to the right Azure AI capability.

Common “final hour” traps: changing correct answers due to doubt, missing negation words (NOT/EXCEPT), and choosing an answer that is technically possible but operationally unrealistic (hardcoded keys, no monitoring, no safety strategy). Treat “secure by default” and “observable by default” as tiebreakers when uncertain. Your final objective-by-objective review plan should be short and targeted: one page of rules that would have prevented your misses, then rest.

Chapter milestones
  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist
  • Final objective-by-objective review plan
Chapter quiz

1. You are running a timed mock exam and notice that in many items two answers seem plausible. To increase your score on the real AI-102 exam, which strategy best aligns with Chapter 6 guidance for selecting the correct option?

Show answer
Correct answer: Choose the option that best satisfies an explicit constraint in the prompt (for example, private networking, latency, data residency, cost control, evaluation/monitoring, or safety).
AI-102 questions are frequently decided by service fit and operational constraints (security, networking, residency, cost, latency, monitoring, safety). Option A matches the exam behavior emphasized in the chapter: read for intent and let constraints break ties. Option B is incorrect because the exam tests correct service/application of requirements, not novelty. Option C is incorrect because broader capability can violate constraints (cost, latency, governance) and is not inherently the best fit.

2. A company is building a retrieval-augmented generation (RAG) chatbot over internal policy documents. Requirements: (1) user prompts and retrieved content must not be exposed to the public internet, (2) the solution must support enterprise governance and monitoring, and (3) the chatbot must cite sources. Which architecture best meets the requirements?

Show answer
Correct answer: Azure OpenAI for generation + Azure AI Search for retrieval, both accessed through Private Link/managed networking, with citations returned from retrieved chunks.
Option A is the standard AI-102 RAG pattern: Azure AI Search provides grounded retrieval (enables citations), Azure OpenAI generates responses, and Private Link/managed networking supports the private-networking requirement; governance/monitoring aligns with operationalization objectives. Option B is incorrect because stuffing entire corpora into prompts is not scalable, increases cost/latency, and does not provide reliable, verifiable citations. Option C is incorrect because search alone can return passages but does not satisfy the generative conversational requirement and typically won’t produce synthesized answers.

3. You are doing a weak-spot analysis after Mock Exam Part 2. Your results show repeated misses in these areas: choosing between Azure AI Search vs Azure AI Document Intelligence, and knowing when to use Azure OpenAI function calling. Which next step best aligns with an objective-by-objective final review plan?

Show answer
Correct answer: Map each missed question to the official AI-102 objective area and create targeted drills focused on the specific decision points (service selection for knowledge mining vs extraction, and tool/function calling patterns).
Option A follows the exam-aligned approach: tie misses to objectives and practice high-frequency decision points (service fit and agentic patterns), which is central to Chapter 6. Option B is inefficient for score gains because it is not targeted and delays practice on known weak areas. Option C is insufficient because AI-102 emphasizes applying services to constraints and scenarios; definitions alone don’t resolve near-tie answer choices.

4. A team is deploying a generative AI assistant. Requirements: (1) prevent the assistant from returning harmful content, (2) detect policy violations in both user input and model output, and (3) keep an audit trail for compliance reviews. Which combination is the most appropriate?

Show answer
Correct answer: Use Azure AI Content Safety to filter/analyze prompts and completions, and configure logging/monitoring (for example, Application Insights/Log Analytics) to retain audit records per policy.
Option A addresses safety and operationalization: Azure AI Content Safety is designed to classify/filter harmful content for both input and output, and auditability requires appropriate logging/retention through Azure monitoring/governance controls. Option B is incorrect because prompting alone is not a sufficient safety control and disabling logging conflicts with audit requirements (you would instead implement compliant retention/redaction). Option C is incorrect because Vision is not the primary service for text safety classification, and local files do not meet typical enterprise governance/monitoring expectations.

5. During a timed mock exam block, you are behind schedule. You encounter a long scenario about an agent that uses tools, memory, and evaluation. According to Chapter 6 exam behaviors, what is the best approach to maximize your score under time pressure?

Show answer
Correct answer: Skim for the specific constraint(s) and the service/feature being tested, eliminate distractors that violate those constraints, and flag the item if needed to return later.
Option A reflects the recommended exam strategy: read for intent, identify constraints, eliminate distractors, and use flag/review to manage time. Option B is incorrect because over-investing time on one item commonly lowers the overall score; certification exams reward consistent pacing. Option C is incorrect because random guessing without using constraints and option comparison leaves points on the table; even under time pressure you can often eliminate one or two options quickly.
More Courses
Edu AI Last
AI Course Assistant
Hi! I'm your AI tutor for this course. Ask me anything — from concept explanations to hands-on examples.