Generative AI Solutions on Azure (AI-102) Exam Prep 2026

AI Certification Exam Prep — Beginner

Master AI-102 domains with practice-heavy prep and a full mock exam.

Beginner ai-102 · microsoft · azure · azure-ai

Prepare with confidence for Microsoft AI-102 (Azure AI Engineer Associate)

This course blueprint is designed for beginners who want a clear, exam-aligned path to passing Microsoft's AI-102: Designing and Implementing a Microsoft Azure AI Solution exam. You'll learn the concepts and decision-making patterns tested on the exam, then validate your readiness through domain-focused practice sets and a full mock exam.

AI-102 measures applied engineering skills across modern Azure AI capabilities. In 2026, that increasingly includes generative AI patterns, agentic orchestration, and search-driven grounding, alongside core computer vision and natural language processing workloads. This course organizes those skills into a 6-chapter “book” that maps directly to the official exam domains.

What this course covers (mapped to the official exam domains)

  • Plan and manage an Azure AI solution: selecting services, securing resources, monitoring performance, and controlling cost.
  • Implement generative AI solutions: choosing model deployments, prompt engineering, grounding with enterprise data, evaluation, and safe production rollouts.
  • Implement an agentic solution: designing tool-using workflows, managing state and memory, and applying safety boundaries.
  • Implement computer vision solutions: OCR and image/document processing patterns, deployment considerations, and troubleshooting.
  • Implement NLP solutions: classification and extraction patterns, conversational design, and speech integration considerations.
  • Implement knowledge mining and information extraction: ingestion and enrichment pipelines, indexing, and retrieval strategies that support RAG.

How the 6 chapters are structured

Chapter 1 starts with the exam itself: registration, scoring, question formats, and a practical study strategy. You’ll also set up an environment plan so you can practice without getting blocked by tooling.

Chapters 2–5 each focus on one or two domains with an emphasis on what the exam asks you to decide: which Azure service to use, how to secure it, how to deploy it, and how to troubleshoot it. Each chapter ends with exam-style practice milestones to reinforce the most-tested scenarios.

Chapter 6 is a full mock exam experience with structured review. You’ll identify weak areas by domain, fix gaps, and walk into the exam with a clear checklist for pacing and accuracy.

Why this helps you pass AI-102

AI-102 questions often look like real project briefs: you’re given constraints (security, latency, cost, data residency, safety), then asked to choose the best design or implementation step. This course blueprint emphasizes that skill—turning requirements into correct Azure AI decisions—while keeping the learning path beginner-friendly.

  • Domain-aligned structure so you can track progress against the official objectives
  • Practice-first milestones to build speed and accuracy under exam conditions
  • A mock exam and weak-spot analysis workflow to raise your score quickly

If you’re new to certifications, start by setting a test date to create urgency and a realistic schedule. You can begin on Edu AI today: Register free or browse all courses.

Who this course is for

This course is for learners preparing for the Microsoft Azure AI Engineer Associate certification who have basic IT literacy but no prior certification experience. If you can navigate cloud concepts and follow step-by-step labs, you can succeed here.

What You Will Learn

  • Plan and manage an Azure AI solution: governance, security, monitoring, and cost controls
  • Implement generative AI solutions: model selection, prompt engineering, grounding, and evaluation
  • Implement an agentic solution: tool use, orchestration patterns, and safety constraints
  • Implement computer vision solutions: image analysis, OCR, and document-oriented vision workflows
  • Implement NLP solutions: classification, extraction, conversational patterns, and speech integration
  • Implement knowledge mining and information extraction: search indexing, enrichment, and RAG-ready pipelines

Requirements

  • Basic IT literacy (networks, web apps, and cloud concepts)
  • Comfort using a browser and command line basics
  • An Azure account (free tier is fine) for optional hands-on practice
  • No prior Microsoft certification experience required

Chapter 1: AI-102 Orientation, Exam Mechanics, and Study Plan

  • Understand AI-102: role, skills measured, and exam domains
  • Register for the exam: scheduling, pricing, ID, and policies
  • Scoring, question formats, and time management strategy
  • Build a 4-week study plan with labs, notes, and spaced repetition
  • Set up your practice environment (Azure account, tools, and tracking)

Chapter 2: Plan and Manage an Azure AI Solution

  • Design the solution: services, regions, and reference architectures
  • Secure and govern AI resources: identity, network, and data controls
  • Operationalize deployments: CI/CD, versioning, and rollbacks
  • Monitor reliability and cost: logging, alerts, quotas, and budgets
  • Domain practice set: plan/manage scenarios and troubleshooting

Chapter 3: Implement Generative AI Solutions (Azure OpenAI)

  • Choose models and build prompts: system messages, few-shot, and constraints
  • Ground responses with enterprise data: RAG patterns and citations
  • Evaluate and improve quality: testing, safety, and regression checks
  • Deploy to production: rate limits, streaming, caching, and fallbacks
  • Domain practice set: generative AI build-and-fix exam scenarios

Chapter 4: Implement an Agentic Solution + Knowledge Mining

  • Design agents: goals, memory, tools, and orchestration boundaries
  • Implement tool use: function calling, connectors, and tool validation
  • Build knowledge mining pipelines: ingestion, enrichment, and indexing
  • Optimize retrieval for agents: ranking, filters, and hybrid search
  • Domain practice set: agent workflows and search/index case studies

Chapter 5: Implement Computer Vision Solutions + Implement NLP Solutions

  • Computer vision essentials: analysis, detection, and OCR workflows
  • Document and image pipelines: preprocessing, quality, and error handling
  • NLP fundamentals: classification, extraction, and conversational design
  • Speech and multimodal integration: voice-in/voice-out and accessibility
  • Domain practice set: vision + NLP mixed scenarios and edge cases

Chapter 6: Full Mock Exam and Final Review

  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist
  • Final Review Sprint: formulas, patterns, and common traps

Jordan Whitaker

Microsoft Certified Trainer (MCT)

Jordan Whitaker is a Microsoft Certified Trainer who builds and teaches Azure AI certification tracks for learners moving from fundamentals to role-based exams. He specializes in AI-102 readiness through scenario-based design, governance, and production deployment patterns on Azure.

Chapter 1: AI-102 Orientation, Exam Mechanics, and Study Plan

AI-102 is not a “memorize services” exam. It measures whether you can design, implement, and operate Azure AI solutions under real-world constraints: security, reliability, cost, evaluation, and responsible AI. The 2026 version of this course emphasizes generative AI in Azure (Azure OpenAI, grounding with Azure AI Search, and agent orchestration), but the exam still expects you to be fluent across classic Azure AI capabilities (vision, language, speech) and the operational layer (monitoring, governance, deployment patterns).

This chapter helps you orient to what the exam is really testing, how to register and plan around policies, how scoring and question formats work, and how to build a four-week preparation plan that blends labs, notes, and spaced repetition. You will also set up a practice environment that mirrors exam scenarios: multiple Azure resources, identity controls, cost guardrails, and a repeatable way to track what you learned.

Throughout this chapter, you’ll see coaching on common traps (what distractor choices look like), and how to recognize the “most correct” answer when more than one option is technically possible in Azure.

Practice note for every milestone in this chapter (understanding the exam, registering, scoring and time management, the 4-week study plan, and the practice environment): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 1.1: AI-102 overview and Azure AI Engineer job tasks

AI-102 targets the day-to-day responsibilities of an Azure AI Engineer: selecting capabilities, integrating them into apps, and operating them safely at scale. The exam is aligned to job tasks such as building conversational experiences, implementing retrieval-augmented generation (RAG), integrating vision and language, and applying security and governance controls. In 2026, expect more scenarios that combine multiple services—e.g., Azure OpenAI + Azure AI Search + Azure AI Document Intelligence—rather than isolated “what service does X?” prompts.

Map the course outcomes to what shows up on the test: (1) Plan and manage an Azure AI solution (identity, network, monitoring, cost controls); (2) Implement generative AI (model selection, prompt design, grounding, evaluation); (3) Implement agentic solutions (tool use, orchestration patterns, safety constraints); (4) Computer vision (image analysis, OCR, document workflows); (5) NLP and speech (classification, extraction, conversation, speech integration); and (6) Knowledge mining (indexing, enrichment, RAG-ready pipelines).

Exam Tip: When a scenario mentions “governance,” “least privilege,” “private endpoints,” “logging,” or “budget alerts,” the exam is testing the management plane, not just model prompts. Don’t jump to “use GPT-4” when the real requirement is “meet compliance and monitor usage.”

Common traps include choosing a service because it “can” do the job, while missing the one that fits the requirement precisely. Example: for document-heavy OCR with structured extraction, the test usually wants Document Intelligence rather than generic OCR; for semantic retrieval over enterprise content, it often wants Azure AI Search with vector/hybrid retrieval and grounding rather than dumping content into prompts.

Section 1.2: Exam registration, delivery options, and accommodations

Register for AI-102 through Microsoft’s certification portal and schedule via the exam delivery provider (commonly Pearson VUE). You typically can choose online proctored delivery or a test center. Your decision should be strategic: online proctoring is convenient but unforgiving about environment compliance; test centers reduce the risk of technical disqualification.

Bring valid, unexpired government-issued ID that matches your registration name. Pay attention to the exact name formatting (middle initials, accent marks) because mismatches can cause check-in delays. Read the policies on breaks, personal items, and workspace requirements—especially for online delivery (clean desk, webcam room scan, stable internet).

If you need accommodations (extra time, assistive technologies), start early. Accommodation approval can take time and may affect how you schedule. Plan your study timeline so administrative steps do not collide with your intended exam date.

Exam Tip: If you plan to take the exam online, run the system test days in advance and again on exam day. Many failures are not “internet down,” but permissions, corporate VPN constraints, or webcam/security software conflicts.

Pricing varies by region and may change; budget for one retake if possible. Treat the scheduling step as part of preparation: a fixed date increases follow-through and makes your four-week plan realistic.

Section 1.3: Scoring model, passing criteria, and retake policy

AI-102 uses a scaled scoring model typical of Microsoft role-based exams. You receive a score on a scale (commonly 1–1000), with a published passing threshold (often 700). Because of scaling, you should not try to compute your percentage correct during the exam; instead, focus on maximizing points by avoiding preventable errors on high-certainty questions and returning to medium-certainty items later.

Some questions may be weighted differently, and some unscored items may appear for exam calibration. Practically, this means you should treat every question as if it counts and avoid spending excessive time trying to identify “experimental” items.

Retake policies can evolve, but typically there is a waiting period between attempts (with longer waits after multiple failures). This is why your prep should emphasize skill acquisition (labs and implementation) rather than last-minute memorization. If you need a retake, your goal is to convert weak domains into strengths, not simply “do more practice questions.”

Exam Tip: Track misses by objective, not by question. If you miss three items related to grounding and retrieval, the fix is: build one end-to-end RAG pipeline (indexing → retrieval → prompt assembly → evaluation), then re-test. That is far more efficient than re-reading documentation.

Common scoring traps include over-investing time in one complex scenario early and rushing later, and failing to revisit flagged items due to poor time budgeting. Your time management strategy starts here: aim for consistent pacing and reserve time for review.

Section 1.4: Question types: case studies, hotspots, and multi-select

AI-102 commonly uses mixed formats: traditional multiple-choice, multi-select (“choose all that apply”), hotspot (selecting regions in a UI diagram), and case studies with multiple related questions. Each format has its own pitfalls, and your method should adapt accordingly.

For case studies, read the requirements and constraints first (security, data residency, latency, budget), then scan the existing architecture. Many wrong answers are “valid” Azure features that violate a constraint mentioned once in the case. Build a mini checklist: identity, network, data source, evaluation/monitoring, and cost controls. That checklist mirrors the real job—and the exam’s intent.

Hotspots test whether you recognize where settings live (portal, resource configuration, deployment options). The trap is guessing based on service names. Slow down and match the setting to the layer: is it model deployment, content filtering, network isolation, search index configuration, or application code?

Multi-select questions often include plausible distractors. Use elimination: first pick options that are required by the prompt (must/only), then validate that each selected option does not conflict with constraints. If the question says “minimize operational overhead,” fully managed services and built-in integrations usually beat custom orchestration.

Exam Tip: Watch for absolutes: “always,” “only,” “must use,” and “cannot.” These words turn a broad Azure discussion into a narrow exam answer. Also watch for hidden requirements like “customer-managed keys,” “private access,” or “no public internet”—these immediately eliminate many otherwise-correct choices.

Section 1.5: Study strategy mapped to official exam domains

Your four-week plan should follow the exam domains rather than random exploration. AI-102 rewards integration skills: connecting Azure OpenAI with retrieval, applying responsible AI controls, and operationalizing solutions. Use a cycle of Learn → Build → Evaluate → Note each week, with spaced repetition to retain key decisions and service boundaries.

  • Week 1 (Foundations + Management): governance, identity (managed identities, RBAC), network patterns (private endpoints where applicable), logging/monitoring (Azure Monitor, Application Insights), and cost controls (budgets, quotas). Build one small app that calls a model and emits traces/metrics.
  • Week 2 (Generative AI + RAG): model selection, prompt patterns, grounding with Azure AI Search (vector/hybrid retrieval), and evaluation. Build a RAG pipeline with citations and test failure modes (hallucination, stale content).
  • Week 3 (Agentic + Multimodal): tool calling/function calling patterns, orchestration boundaries, safety constraints, and integrating vision/document processing where needed. Implement a tool-using assistant that can query a search index and extract structured fields from a document.
  • Week 4 (Language/Vision/Speech + Review): classification/extraction, conversation patterns, OCR and document workflows, plus full-domain review. Rebuild one end-to-end scenario from scratch under time pressure.

Spaced repetition is essential. Maintain a “decision log” of why you chose a service and what constraint drove the choice (e.g., “Document Intelligence for structured form extraction; Search for retrieval; private endpoint for data exfiltration control”). Review that log on days 1, 3, 7, and 14 after you write it.
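
The review cadence above is easy to automate. A minimal sketch in plain Python (the example date and any file layout are your own choices; nothing here is exam-specific):

    from datetime import date, timedelta

    # Review offsets, in days, after the day a decision-log entry is written.
    REVIEW_OFFSETS = (1, 3, 7, 14)

    def review_dates(written_on: date) -> list[date]:
        """Return the dates on which a decision-log entry should be reviewed."""
        return [written_on + timedelta(days=offset) for offset in REVIEW_OFFSETS]

    # Example: an entry written on 5 January is reviewed on the 6th, 8th, 12th, and 19th.
    for due in review_dates(date(2026, 1, 5)):
        print(due.isoformat())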

Exam Tip: Your notes should be organized by objective verbs: plan, implement, integrate, monitor, evaluate. The exam rarely asks for trivia; it asks what you would do next, what you would configure, or which approach meets constraints.

Section 1.6: Lab setup checklist and resource planning

Set up a practice environment that lets you repeat common exam builds without wasting time or money. You want consistency (same naming, same regions where possible) and guardrails (budgets and cleanup). Start with one Azure subscription you control, and create a dedicated resource group per lab week so you can delete cleanly.

  • Account & access: Azure subscription, ability to create resources, and an identity strategy (your user + a managed identity for apps). Confirm you can assign RBAC roles in your scope.
  • Core tools: Azure portal, Azure CLI, VS Code, Git, and a simple HTTP client (curl or Postman). Use environment variables for endpoints/keys to avoid hardcoding secrets (see the sketch after this list).
  • Services you will likely provision: Azure OpenAI (where available), Azure AI Search, Azure AI Document Intelligence, and observability (Application Insights / Log Analytics). Add storage (Blob) as a document source for indexing.
  • Tracking: a lab journal (Markdown) capturing: goal, architecture sketch, key settings, cost notes, and what broke. This becomes your spaced-repetition source.
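
For the core-tools item above, a minimal sketch of loading endpoints and keys from environment variables instead of hardcoding them; the variable names are illustrative conventions, not names required by any SDK:

    import os

    # Illustrative variable names -- pick any convention and keep it consistent
    # across labs so secrets never end up in source control.
    endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
    api_key = os.environ.get("AZURE_OPENAI_API_KEY")  # absent when you rely on managed identity

    if api_key is None:
        print(f"{endpoint}: expecting Microsoft Entra ID / managed identity authentication.")
    else:
        print(f"{endpoint}: using a key loaded from the environment.")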

Plan resources with cost controls from day one: set a subscription budget alert, use deployment quotas wisely, and deallocate/delete when finished. If a lab requires uploading documents, store only non-sensitive sample data. Practice applying least privilege: separate “build-time” permissions from “run-time” permissions, and prefer managed identity where supported.

Exam Tip: The exam often rewards “operationally sane” answers: centralized logging, repeatable deployments, and secure access. When you build labs, mirror that mindset—enable diagnostics, tag resources, and document endpoints. Those habits translate directly into correct exam decisions.

Finally, keep a resource planning sheet: what you created, in which region, and when to delete it. Many candidates lose momentum due to surprise cost or quota issues; disciplined lab hygiene prevents that and keeps your prep on schedule.

Chapter milestones
  • Understand AI-102: role, skills measured, and exam domains
  • Register for the exam: scheduling, pricing, ID, and policies
  • Scoring, question formats, and time management strategy
  • Build a 4-week study plan with labs, notes, and spaced repetition
  • Set up your practice environment (Azure account, tools, and tracking)
Chapter quiz

1. You are mentoring a team preparing for AI-102. They have been memorizing lists of Azure AI services and features. You want them to align with what the AI-102 exam is primarily designed to measure. Which guidance should you give them?

Correct answer: Focus on designing, implementing, and operating Azure AI solutions under constraints like security, reliability, cost, evaluation, and responsible AI—not just recalling service names.
AI-102 is role-based and evaluates applied solution design/implementation/operations under real-world constraints (security, cost, reliability, evaluation, responsible AI). Option B is incorrect because certification exams may include configuration decisions but are not centered on trivia like portal clicks or SKU memorization. Option C is incorrect because even with a generative AI emphasis, the exam still expects coverage across broader Azure AI capabilities and the operational layer.

2. A company plans to schedule AI-102 for multiple employees. One employee has a legal name that does not match the nickname used at work. Another employee wants to reschedule the exam date after booking. Which action best reduces the risk of test-day issues and policy violations?

Correct answer: Ensure the registration name matches government-issued ID and review rescheduling/cancellation policies before booking.
Exam registration and check-in are governed by strict identity and policy requirements; the safest approach is to match the registration name to government ID and understand rescheduling/cancellation rules up front. Option B is incorrect because ID mismatches commonly prevent admission; they are not typically resolved after taking the exam. Option C is incorrect because certifications require individual candidate profiles/appointments; shared accounts can violate exam policies and cause invalidation.

3. You are 20 minutes into the AI-102 exam and encounter a multi-part scenario question with several plausible answers. You are unsure of the 'most correct' option. What is the best time-management approach aligned to typical certification exam mechanics?

Correct answer: Select the best answer you can, flag it for review (if available), and move on to protect time for remaining questions.
Real certification exams reward steady pacing; when unsure, choosing the best option, flagging, and returning later helps you manage time and maximize overall score. Option B is incorrect because overspending time early increases the risk of rushing later questions, which typically lowers accuracy. Option C is incorrect because unanswered items generally provide no opportunity to earn points; skipping without selecting an option is usually worse than making your best attempt.

4. Your manager asks you to create a 4-week AI-102 study plan that improves retention and performance on scenario-based questions. Which plan best aligns with a certification-focused approach described in the course?

Correct answer: Blend hands-on labs each week with note-taking and spaced repetition, and use practice questions to identify weak domains for targeted review.
A practical AI-102 plan emphasizes repeated exposure (spaced repetition), hands-on labs to build applied skills, and iterative practice to close gaps across exam domains. Option B is incorrect because cramming and delaying feedback reduces retention and leaves weak areas unaddressed. Option C is incorrect because AI-102 is scenario-driven and assesses implementation/operations decisions; avoiding labs undermines the applied competence the exam expects.

5. You are setting up a practice environment to mirror AI-102 exam scenarios for generative AI solutions. Which setup is most appropriate to reflect real-world constraints and the operational layer?

Correct answer: Create multiple Azure resources with identity controls (RBAC), cost guardrails (budgets/alerts), and a repeatable tracking method for what you learned.
AI-102 expects you to work within constraints like identity/governance, cost management, and operations; a practice environment should reflect RBAC, cost controls, and a systematic learning log. Option B is incorrect because broad Owner access and shared environments reduce realism and can mask common exam scenarios related to security and governance. Option C is incorrect because skipping monitoring/tracking and operational practices conflicts with the exam’s emphasis on operating solutions (including cost and reliability) rather than only provisioning resources.

Chapter 2: Plan and Manage an Azure AI Solution

AI-102 increasingly tests whether you can run Azure AI solutions like production software: you must choose the right services and regions, secure identities and data paths, operate deployments with CI/CD and rollback discipline, and continuously monitor reliability and spend. This chapter maps directly to the “Plan and manage an Azure AI solution” outcome, but it also sets foundations you will reuse when implementing generative AI (grounding and evaluation depend on data governance) and agentic patterns (tool permissions depend on identity and network controls).

Expect scenario questions that blend multiple constraints: “must use private connectivity,” “data cannot leave region,” “needs least privilege,” “must meet an SLA,” or “budget is capped.” The best answers are typically the ones that combine platform-native controls (Azure Policy, RBAC, Private Link, Monitor, budgets) rather than custom code. Also watch for traps where you are asked to secure or monitor the service versus the application—AI-102 tends to reward choices that cover both layers.

Finally, plan for operational maturity: version your prompts and model deployments, set up logs and traces before launch, and define rollback paths. Many failures on the exam (and in real systems) come from skipping these “boring” steps, then trying to debug production without telemetry or proper change control.

Practice note for every milestone in this chapter (designing the solution, securing and governing AI resources, operationalizing deployments, monitoring reliability and cost, and the plan/manage practice set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 2.1: Solution planning: selecting Azure AI services and patterns

AI-102 planning questions typically begin with requirements (latency, data residency, private network, multimodal needs, retrieval needs, throughput, and integration constraints) and ask you to select Azure services and a reference architecture. In practice, you’ll often combine Azure OpenAI (generation), Azure AI Search (retrieval), and storage (Azure Blob/ADLS) for a RAG-ready pipeline, plus Azure Functions or App Service for orchestration. For document workflows, Azure AI Document Intelligence often sits upstream to extract structured fields that then feed indexing and grounding.

Regional selection is a recurring exam objective disguised inside other topics. You need to ensure all dependent services are available in the chosen region(s) and that compliance requirements (for example, “data must stay in EU”) are met. Multi-region architectures are typically justified by resiliency and DR requirements, not as a default. When asked to minimize latency, co-locate compute, model endpoint, and data/search resources in the same region whenever possible.

Common patterns you should recognize on the exam:

  • RAG (retrieve-then-generate): Use Azure AI Search as the retrieval layer; use managed identity to access storage; store prompts and configs in a secure repo; log citations and retrieval scores.
  • Document-to-knowledge: Document Intelligence + Storage + Search indexing + optional enrichment; then Azure OpenAI for summarization/Q&A grounded on indexed chunks.
  • Agentic tool-use: An orchestrator (Functions/Container Apps) calling Azure OpenAI and tools (Search, data APIs) with least-privileged identities.

Exam Tip: If a question mentions “grounded responses” or “reduce hallucinations,” the best architecture almost always includes retrieval (Azure AI Search) and a data pipeline that produces searchable chunks, not just a bigger model.

Exam trap: choosing services based on brand familiarity rather than capability. For example, “store embeddings in Cosmos DB” may be viable, but if the scenario emphasizes “hybrid search, filters, ranking, and citations,” Azure AI Search is usually the intended answer because it’s purpose-built for retrieval and integrates directly with common RAG patterns.
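
To make the retrieval layer concrete, here is a minimal query sketch with the azure-search-documents Python SDK; the service endpoint, index name, field names, and filter expression are assumptions for illustration, and a hybrid query would add a vector component to this same call:

    from azure.identity import DefaultAzureCredential
    from azure.search.documents import SearchClient

    # Assumed resource names -- replace with your own search service and index.
    search_client = SearchClient(
        endpoint="https://<your-search-service>.search.windows.net",
        index_name="product-docs",
        credential=DefaultAzureCredential(),  # Entra ID / managed identity, no keys
    )

    results = search_client.search(
        search_text="return policy for damaged items",
        filter="category eq 'policies'",  # assumes a filterable 'category' field
        top=5,
    )

    for doc in results:
        print(doc["title"], doc["@search.score"])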

Section 2.2: Identity and access: RBAC, managed identities, secrets

Identity is where AI-102 questions become subtle: you’re often asked to secure calls between an app, Azure OpenAI, Search, and Storage without using long-lived keys. The exam expects you to know when to use Microsoft Entra ID (Azure AD) authentication, RBAC, and managed identities. A system-assigned managed identity is ideal when the identity lifecycle should track the resource (for example, a Function App). A user-assigned managed identity is preferred when multiple workloads share the same identity or when you want stable identity across redeployments.

RBAC decisions should reflect least privilege. For example, an indexing pipeline might need write access to a Search index, while a runtime chat app should only need query access. Don’t grant broad roles (Owner/Contributor) when a narrower role exists. Similarly, differentiate between data plane and management plane permissions: many services require specific data roles to read blobs or query indexes, even if the identity can manage the resource.

Secrets management is a frequent “gotcha.” Keys and connection strings should be in Azure Key Vault, not in appsettings.json or pipeline variables. Prefer managed identity to access Key Vault; avoid embedding secrets in code or containers. If the scenario mentions rotation requirements, Key Vault plus automated rotation (or re-deploy with updated references) is typically the correct direction.
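
A minimal sketch of the "no secrets in code" pattern with azure-identity and azure-keyvault-secrets; the vault URL and secret name are placeholders:

    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    # DefaultAzureCredential resolves to a managed identity when the app runs in
    # Azure and to developer credentials (Azure CLI, VS Code) on a workstation.
    credential = DefaultAzureCredential()

    secret_client = SecretClient(
        vault_url="https://<your-key-vault>.vault.azure.net",
        credential=credential,
    )

    # Fetch only what genuinely must be a secret (for example, a third-party API
    # key); Azure-to-Azure calls should prefer managed identity + RBAC instead.
    partner_api_key = secret_client.get_secret("partner-api-key").value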

Exam Tip: When you see “no secrets in code” or “use passwordless connections,” the intended answer is usually “managed identity + RBAC” (and Key Vault only where absolutely needed, such as third-party API keys).

Common trap: selecting “shared access signature (SAS)” by default. SAS can be correct for limited-time external sharing, but for internal service-to-service calls within Azure, managed identity is the more secure and exam-preferred approach.

Section 2.3: Network and data security: private endpoints, encryption, DLP

AI-102 expects you to secure not only data at rest, but also the network path used by AI services. If a scenario says “no public internet access,” you should think immediately of Private Link (private endpoints) and disabling public network access on the services where supported. A typical secure architecture places the app in a VNet and uses private endpoints for Storage, Azure AI Search, and other PaaS services, with private DNS zones to resolve the private addresses correctly.

Encryption questions usually include a compliance constraint. By default, Azure encrypts data at rest, but some scenarios require customer-managed keys (CMK) in Key Vault. Recognize wording like “customer controls the keys,” “bring your own key,” or “key revocation required,” which points toward CMK rather than platform-managed keys. Also be ready for “encryption in transit” expectations: use HTTPS/TLS end-to-end; don’t propose plaintext internal calls.

Data Loss Prevention (DLP) and sensitive data handling appear more often in generative AI scenarios: prompt inputs, retrieved documents, and model outputs may contain PII. While AI-102 may not go deep into Purview configuration, you should know the strategy: classify data, restrict access, avoid logging sensitive payloads, and implement output filtering where appropriate. In enterprise settings, Microsoft Purview can help with data classification and governance, and logging should use redaction or structured fields rather than raw content dumps.

Exam Tip: If the prompt says “must remain on private network,” “exfiltration risk,” or “disallow public endpoints,” the answer almost always includes Private Endpoint + public access disabled + correct DNS planning. Missing DNS is a classic real-world failure and an exam distractor.

Common trap: confusing VNet integration with Private Link. VNet integration lets your app reach into a VNet; it does not automatically make the PaaS service private. Private endpoints are what give the PaaS service a private IP in your VNet.

Section 2.4: Responsible AI and compliance: policies, auditing, documentation

Responsible AI is tested less as philosophy and more as operational controls: how you enforce policy, prove compliance, and audit changes. For Azure-based AI solutions, governance often starts with Azure Policy initiatives applied at management group/subscription scope to restrict regions, require tags, deny public network access, or enforce private endpoints. These controls prevent “configuration drift” and are commonly the best answer when the scenario asks for organization-wide enforcement.

Auditing and documentation are also exam-relevant. Logging access to Key Vault, tracking deployment changes, and retaining activity logs enable investigations and compliance attestations. If a scenario mentions “must demonstrate who changed the model deployment” or “audit prompt changes,” think in terms of CI/CD pipelines with approvals, repository history, and Azure Activity Log rather than ad-hoc manual updates in the portal.

For generative solutions, you should be ready to describe (at a high level) content safety and human oversight. Even if not asking for a specific service, the exam expects you to mention guardrails: input/output filtering, refusal behavior for disallowed content, and monitoring for policy violations. In production, this includes documenting intended use, limitations, and evaluation results (quality, groundedness, safety). Those artifacts often map to internal governance checklists in enterprises.

Exam Tip: When asked “how do you ensure all resources follow the same security configuration,” the answer is usually Azure Policy (prevent/deny) rather than a runbook or a wiki page.

Common trap: treating “responsible AI” as purely a model setting. On the exam, responsible AI is primarily about end-to-end system behavior—data sources, logging, review workflows, and change management—because that’s what auditors and regulators can validate.

Section 2.5: Monitoring and performance: metrics, tracing, and SLAs

Monitoring questions often combine reliability (availability, latency), debugging (root cause), and customer impact (SLA). Your core toolkit is Azure Monitor: metrics for near-real-time signals, logs (Log Analytics) for deeper queries, and Application Insights for distributed tracing. For an AI app, you want end-to-end correlation: an incoming request should be traceable through the orchestrator, retrieval calls, and model inference, with timings and failure reasons.

Expect scenarios like “intermittent timeouts,” “sudden latency increase,” or “answers are missing citations.” The correct approach is usually: instrument the app, monitor dependencies, alert on thresholds, and use structured logs that include retrieval parameters (topK, filters), model deployment name/version, and response metadata. You may also need to identify whether the bottleneck is the retrieval layer (Search latency), the model endpoint, or the application compute.
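
One way to get the end-to-end correlation described above is the Azure Monitor OpenTelemetry distro for Python. A minimal sketch, assuming an Application Insights connection string is supplied via an environment variable:

    import os
    from azure.monitor.opentelemetry import configure_azure_monitor
    from opentelemetry import trace

    # Exports traces, metrics, and logs to Application Insights.
    configure_azure_monitor(
        connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"],
    )

    tracer = trace.get_tracer(__name__)

    def answer_question(question: str) -> str:
        # One parent span per request, with child spans for retrieval and model
        # inference, so a slow or failing dependency shows up in the trace view.
        with tracer.start_as_current_span("answer_question"):
            with tracer.start_as_current_span("retrieve_passages"):
                passages = ["...retrieved context..."]  # placeholder retrieval call
            with tracer.start_as_current_span("model_inference"):
                return f"Answer grounded on {len(passages)} passages about: {question}"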

SLA interpretation is an exam favorite. Azure SLAs apply to services when deployed according to documented requirements (for example, multiple instances, zone redundancy, or specific tiers). The exam may ask what to do to meet an uptime requirement—look for answers that add redundancy (multiple instances/regions) and that remove single points of failure (for example, a single-zone deployment). Also, remember that an SLA for a component does not automatically produce the same SLA for the whole solution; end-to-end availability is the product of dependencies.

Exam Tip: If the scenario asks you to “identify which dependency is failing,” choose distributed tracing (Application Insights) over just metrics. Metrics show symptoms; traces show the path and the failing hop.

Common trap: enabling logging after an incident. The exam generally expects “monitoring by design”: logs, metrics, and alerts configured before go-live, with retention aligned to compliance needs and cost constraints.

Section 2.6: Cost, quotas, and lifecycle management for AI workloads

Cost control is not an afterthought in AI-102—expect direct questions about budgets, quotas, and how to prevent surprise bills. Azure Cost Management budgets and alerts are the platform-native answer for spend governance at subscription/resource group scope. Tagging (project, environment, cost center) is frequently paired with budgets because it enables chargeback/showback and helps isolate which workload caused the increase.

Quotas and throttling are a practical operational concern for AI services. Scenarios may mention “requests failing with 429” or “deployment is rate-limited.” The correct exam approach is to recognize this as capacity/throughput management: implement retry with exponential backoff, queue bursts, and request quota increases when justified. Also, separate dev/test and production resources so load testing doesn’t consume production capacity.
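
A minimal, dependency-free sketch of retry with exponential backoff and jitter; call_model and its rate-limit exception are placeholders for whichever client library you use:

    import random
    import time

    class RateLimitError(Exception):
        """Placeholder for the 429-style error your client library raises."""

    def call_with_backoff(call_model, max_attempts: int = 5):
        for attempt in range(max_attempts):
            try:
                return call_model()
            except RateLimitError:
                if attempt == max_attempts - 1:
                    raise
                # Exponential backoff (1s, 2s, 4s, ...) plus jitter so retries
                # from many clients do not re-create the original burst.
                time.sleep((2 ** attempt) + random.uniform(0, 1))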

Lifecycle management ties together CI/CD, versioning, and rollback. You should manage model deployments and application releases with staged environments (dev/test/prod), track versions (including prompt templates and retrieval settings), and keep a rollback plan when a new configuration degrades answer quality or increases cost. For example, a change from concise to verbose prompting can multiply token usage—cost spikes are often configuration-driven, not purely traffic-driven.

Exam Tip: If a prompt says “cap spend” or “prevent runaway usage,” look for “budgets + alerts” and “quotas/limits” rather than “ask users to use it less.” The exam favors enforceable controls.

Common trap: assuming autoscale always reduces cost. Autoscale can increase cost if it scales out due to retries or inefficient prompting. The exam’s better answer usually combines efficient design (token limits, caching, retrieval tuning) with governance (budgets/alerts) and operational safeguards (quotas, backpressure, circuit breakers).

Chapter milestones
  • Design the solution: services, regions, and reference architectures
  • Secure and govern AI resources: identity, network, and data controls
  • Operationalize deployments: CI/CD, versioning, and rollbacks
  • Monitor reliability and cost: logging, alerts, quotas, and budgets
  • Domain practice set: plan/manage scenarios and troubleshooting
Chapter quiz

1. A healthcare company is deploying an Azure OpenAI–based summarization service. Requirements: (1) All data must remain in West Europe. (2) The service must be reachable only from the company VNet (no public internet access). (3) Use platform-native controls rather than custom proxy code. What should you do?

Correct answer: Deploy Azure OpenAI in West Europe and use Private Link (private endpoint) with public network access disabled.
Private Link provides private connectivity from a VNet to supported Azure AI services and is the platform-native approach for “VNet-only” access; choosing West Europe satisfies the data residency constraint. Resource firewall rules alone typically still use public endpoints and don’t provide true private connectivity, failing the “no public internet access” requirement. Front Door/WAF is a public edge service and does not guarantee data stays in the required region, nor does it provide private, VNet-only service access.

2. You manage multiple Azure AI resources across subscriptions. Security requires: enforce resource creation only in approved regions, require private endpoints for supported AI services, and prevent noncompliant deployments automatically. Which approach best meets the requirement?

Correct answer: Create Azure Policy assignments with deny effects for disallowed regions and policies that require private endpoints (or deny public network access) for applicable resources.
Azure Policy is the Azure-native governance control to enforce compliance at deployment time (deny), including allowed locations and private networking requirements where supported. RBAC controls who can deploy but does not enforce configuration choices (a permitted user can still deploy to the wrong region or with public access). Defender for Cloud is valuable for posture management, but recommendations are not the same as preventative enforcement and typically require remediation after the fact.

3. Your team deploys prompt templates and model deployment configuration for a generative AI app. You need repeatable releases with the ability to quickly roll back to a known-good configuration if quality regresses. Which solution best aligns with CI/CD and rollback discipline on Azure?

Correct answer: Store prompts and deployment configuration in Git, deploy via an Azure DevOps/GitHub Actions pipeline, and use versioned deployments so you can swap traffic back to a prior version.
Versioning in source control plus automated pipelines provides traceability, repeatability, and controlled rollback (e.g., redeploy prior versioned config or shift traffic back). Portal edits and wiki backups are not controlled, auditable, or reliably reproducible—common exam trap for “quick changes” that undermine rollback. Manually changing nodes/images is operationally risky and doesn’t address prompt/config versioning as first-class artifacts.

4. A generative AI application experiences intermittent 429 (Too Many Requests) errors from an Azure AI service during peak hours. Requirements: detect issues early and prevent unbounded spend. Which combination is most appropriate?

Correct answer: Use Azure Monitor metrics and alerts for rate limiting/429s and configure cost budgets with alerts; also review and adjust service quotas/throughput where applicable.
429s indicate throttling/limit pressure; the exam expects platform-native monitoring (Azure Monitor metrics/logs and alert rules) plus quota/throughput planning. Budgets with alerts address spend controls. Advisor is helpful but not real-time and won’t catch peak-hour reliability issues quickly; approval workflows don’t directly prevent API throttling or runaway consumption. Timeouts and extra app instances can worsen throttling if they increase request volume, and avoiding monitoring conflicts with the requirement to detect issues early.

5. A company must grant an app the minimum permissions required to call an Azure AI resource. The app runs in Azure and should avoid secrets when authenticating. Which approach best implements least privilege and modern identity practices?

Correct answer: Use a managed identity for the app and assign an appropriate Azure RBAC role on the AI resource (or resource group) scoped as narrowly as possible.
Managed identities remove the need for stored secrets and integrate with Azure RBAC so you can scope permissions narrowly, aligning with least privilege. API keys are secrets and generally provide coarse access without RBAC scoping, increasing risk and not meeting the “avoid secrets” requirement. A shared user account with a stored password is explicitly against modern practices (credential sharing, rotation burden, higher blast radius) and is not least-privilege or service-to-service authentication best practice for Azure.

Chapter 3: Implement Generative AI Solutions (Azure OpenAI)

This chapter maps to the AI-102 objective area for implementing generative AI solutions on Azure: selecting the right model and deployment, designing prompts that reliably satisfy constraints, grounding outputs with enterprise data (RAG), evaluating quality and safety, and preparing the solution for production (rate limits, streaming, caching, fallbacks, and cost controls). The exam does not reward “cool demo” thinking; it rewards operational thinking: which Azure resource to use, how parameters affect tokens and latency, how grounding changes risk, and how to design for reliability and governance.

Expect scenario questions where a solution works in a notebook but fails under load, returns ungrounded claims, or violates policy. Your job is to identify the Azure OpenAI building blocks (deployments, models, parameters, filters) and the architecture patterns (RAG, evaluation harness, telemetry, resilience) that turn a prototype into an enterprise-ready system.

Exam Tip: When options include “fine-tune” versus “RAG,” the exam frequently expects RAG for enterprise knowledge updates. Fine-tuning is for shaping style/behavior or learning stable patterns—not for frequently changing product docs and policies.

Practice note for every milestone in this chapter (choosing models and building prompts, grounding with enterprise data, evaluating and improving quality, deploying to production, and the generative AI practice set): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 3.1: Azure OpenAI basics: deployments, tokens, and parameters

Azure OpenAI is consumed through an Azure OpenAI resource, but the exam focuses on the concept of a deployment: you deploy a chosen model (for example, GPT-family chat models or embedding models) under a deployment name, then your application calls that deployment. Many candidates confuse “model” with “deployment.” On AI-102, deployment is the unit you reference in code and configure for scale and quota behavior.

Tokens are your primary sizing and cost unit. Prompts plus retrieved context plus the model’s response all count toward token usage, which directly affects cost and latency. Scenario questions often hide the real issue: the app times out because your RAG pipeline is injecting too much context, not because the model is “slow.”

Key parameters tested include temperature (creativity vs determinism), top_p (nucleus sampling), max tokens (caps output length), and stop sequences (hard boundaries). Another common objective is understanding that system messages (or higher-priority instructions) set behavior, but cannot guarantee perfect compliance; you must enforce constraints in application logic when it matters (for example, output must be valid JSON).

  • Temperature: Lower for reproducible, policy-heavy responses; higher for ideation.
  • Max tokens: Controls output length; if too small, the model truncates mid-answer—often misdiagnosed as a “formatting bug.”
  • Prompt size: Large prompts reduce remaining context window for output; long chat history is a frequent trap.
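
A minimal sketch of how these parameters appear in code, using the openai Python package against an Azure OpenAI deployment; the endpoint, key, API version, and deployment name are assumptions to replace with your own (and the key can be swapped for Entra ID authentication):

    import os
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-06-01",  # assumed; check the currently supported versions
    )

    response = client.chat.completions.create(
        model="my-chat-deployment",  # the deployment name, not the model family
        messages=[
            {"role": "system", "content": "Answer strictly from company policy."},
            {"role": "user", "content": "Summarize the refund policy in three bullet points."},
        ],
        temperature=0.2,   # low for reproducible, policy-heavy answers
        max_tokens=300,    # caps output length; too small truncates mid-answer
        stop=["###"],      # optional hard boundary
    )

    print(response.choices[0].message.content)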

Exam Tip: If a question mentions “deterministic outputs for regression testing,” look for low temperature and stable prompts/versions, plus an evaluation harness—don’t assume model choice alone solves it.

Section 3.2: Prompt engineering patterns: structure, tools, and guardrails

The exam expects you to know prompt structure as an engineering discipline: clear role definition, explicit task, constraints, and output format. In Azure OpenAI chat patterns, the system message establishes persona and non-negotiable rules; the user message provides the request; developer/tool instructions (where applicable) define the app’s contract. Few-shot examples are used to demonstrate style and edge cases. However, few-shot is not a substitute for validation—especially for strict schemas.

Guardrails live both inside and outside the prompt. In-prompt guardrails include “refuse if missing sources,” “respond only using provided context,” and “output JSON matching this schema.” Out-of-prompt guardrails include schema validation, content filtering, and tool gating (only allow certain tools and arguments). On AI-102, watch for answers that rely only on “better prompt wording” when the requirement is enforceable via application logic.

Tool use and orchestration patterns appear as “agentic” behaviors even within a generative solution chapter: the model chooses between calling a search tool, a database lookup, or a calculator. The key is controlled tool invocation: restrict tools, validate arguments, and ensure tools return grounded facts that the model summarizes.

  • Few-shot: Use 2–5 examples to teach classification/extraction formats; keep them short to preserve tokens.
  • Constraints: Put hard requirements in a dedicated “Constraints” block to reduce instruction dilution.
  • Refusal behavior: Explicitly define when to say “I don’t know” versus when to ask clarifying questions.

Exam Tip: If you see “model returns invalid JSON sometimes,” the best answer usually combines structured prompting and output validation/retry—prompt-only fixes are a common exam trap.
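
A sketch of that combination: validate the model's output against a JSON schema and retry with a corrective message. call_model is a hypothetical wrapper around your chat deployment, and the jsonschema package is assumed:

    import json
    from jsonschema import validate, ValidationError

    ANSWER_SCHEMA = {
        "type": "object",
        "properties": {
            "answer": {"type": "string"},
            "citations": {"type": "array", "items": {"type": "string"}},
            "confidence": {"enum": ["low", "medium", "high"]},
        },
        "required": ["answer", "citations", "confidence"],
        "additionalProperties": False,
    }

    def get_structured_answer(call_model, messages, max_attempts=3):
        for attempt in range(max_attempts):
            raw = call_model(messages)
            try:
                parsed = json.loads(raw)
                validate(parsed, ANSWER_SCHEMA)   # enforce the contract in code, not prose
                return parsed
            except (json.JSONDecodeError, ValidationError) as err:
                # Feed the error back so the model can repair its output on the next attempt.
                messages = messages + [
                    {"role": "assistant", "content": raw},
                    {"role": "user", "content": f"Invalid output ({err}). Return ONLY valid JSON matching the schema."},
                ]
        raise RuntimeError("Model did not produce schema-valid JSON after retries")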

Section 3.3: Grounding and RAG: retrieval, chunking, embeddings, and citations

Retrieval-Augmented Generation (RAG) is a centerpiece of AI-102 generative scenarios. The exam tests whether you can separate responsibilities: retrieval finds relevant documents; the model synthesizes an answer from those documents; citations provide traceability. Azure AI Search is commonly used as the retrieval store, while Azure OpenAI embeddings convert text into vectors for similarity search.

Chunking is where many solutions fail. Chunks that are too large reduce retrieval precision and inflate token costs; chunks that are too small lose context. A practical approach is to chunk by semantic boundaries (headings/paragraphs) and include overlap so key definitions are not split. You’ll also need metadata (source URL, title, section, security labels) to support filtering and citations.

Embeddings power vector search: generate embeddings at ingestion time for documents and at query time for the user question. Hybrid search (keyword + vector) is often the right choice when you have domain terminology, IDs, or product names. The exam frequently frames this as “users search by part number” or “legal clauses have exact phrasing,” which keyword search handles well.

Citations are both a product feature and a safety control: they reduce hallucinations by forcing the model to ground claims. In implementation, you pass retrieved snippets with their source identifiers and instruct the model to cite them. Your application should also enforce “no source, no claim” logic for high-risk domains.

  • Retrieval: Filter by permissions/tenant, then retrieve top-k passages.
  • Chunking: Semantic chunks + overlap; store metadata for traceability.
  • Citations: Require the model to cite snippet IDs and map them back to documents in the UI.

Exam Tip: When the prompt says “answer using enterprise docs,” but the model still hallucinates, the fix is usually to tighten the RAG loop (better retrieval, smaller top-k, enforced citations), not to raise temperature or add more general instructions.
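
An illustrative chunking sketch in plain Python, using paragraph boundaries, character counts as a stand-in for real token counting, and metadata that later supports filtering and citations:

    def chunk_document(text, source_url, title, max_chars=1500, overlap_chars=200):
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        chunks, current = [], ""
        for para in paragraphs:
            if len(current) + len(para) > max_chars and current:
                chunks.append(current)
                current = current[-overlap_chars:]   # carry overlap so key definitions are not split
            current = (current + "\n\n" + para).strip()
        if current:
            chunks.append(current)
        return [
            {
                "id": f"{title}-{i}",
                "content": chunk,
                "source_url": source_url,   # needed later for citations
                "title": title,
                "section": i,
            }
            for i, chunk in enumerate(chunks)
        ]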

Section 3.4: Safety and responsible AI: content filters and policy design

Safety is not optional on AI-102. You must know how safety controls appear at multiple layers: platform controls (content filters), application controls (input/output validation), and organizational policy (logging, red teaming, access control). Azure OpenAI includes content filtering that can block or flag prompts and completions. The exam often asks what to do when legitimate business content is being blocked (false positives): the correct approach is to adjust policy/configuration where allowed, add user education, and redesign prompts and workflows—rather than “turn off safety.”

Responsible AI policy design includes defining allowed use cases, prohibited content, escalation paths, and auditability. For enterprise apps, consider data privacy: don’t log sensitive user inputs unnecessarily, and apply least privilege access to data sources used for grounding. If you use RAG, ensure retrieval respects document-level permissions; otherwise, your “helpful chatbot” becomes a data exfiltration tool.

Common safety traps in exam scenarios include: (1) relying on a system message to prevent disallowed content without filters; (2) returning raw tool output that contains sensitive fields; (3) allowing the model to call arbitrary URLs or run code without constraints.

  • Input controls: Detect prompt injection attempts and strip/neutralize unsafe instructions.
  • Tool constraints: Only permit approved tools; validate tool arguments and results.
  • Output controls: Block/transform disallowed content; redact secrets; require citations for factual claims.

Exam Tip: If a question involves “users try to override system instructions,” look for a layered mitigation: prompt injection defenses, tool gating, and policy enforcement—not a single prompt rewrite.
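
A sketch of layered, out-of-prompt output controls: refuse unsourced claims, verify citations against what retrieval actually returned, and redact secret-like strings. The regex patterns are illustrative only, not a complete PII or secret detector:

    import re

    SECRET_PATTERNS = [
        re.compile(r"(?i)api[_-]?key\s*[:=]\s*\S+"),
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # example: US SSN-like pattern
    ]

    def enforce_output_policy(parsed_answer, retrieved_source_ids):
        # 1) "No source, no claim": refuse factual answers without grounded citations.
        if not parsed_answer["citations"]:
            return {"answer": "I can't find this in the approved documents.", "citations": [], "confidence": "low"}
        # 2) Citations must map back to what retrieval actually returned.
        if not set(parsed_answer["citations"]).issubset(set(retrieved_source_ids)):
            return {"answer": "I can't verify the sources for this answer.", "citations": [], "confidence": "low"}
        # 3) Redact secret-like strings before the answer leaves the service.
        text = parsed_answer["answer"]
        for pattern in SECRET_PATTERNS:
            text = pattern.sub("[REDACTED]", text)
        return {**parsed_answer, "answer": text}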

Section 3.5: Evaluation: golden sets, prompt/version control, and telemetry

Quality improvement on AI-102 is framed as engineering discipline: define what “good” means, measure it, and prevent regressions. A golden set (also called a test set) is a curated collection of representative prompts with expected behaviors or scoring rubrics (for example, must cite sources, must not fabricate, must follow schema). The exam expects you to understand that evaluation is continuous—especially when prompts, retrieval settings, or models change.

Prompt and configuration version control is frequently tested implicitly. Treat prompts, system messages, safety policies, retrieval settings (top-k, filters), and chunking strategies as versioned artifacts. When a new release reduces answer quality, you need to correlate the regression to a specific change. This is where telemetry matters: log prompt templates (without sensitive user data when possible), token usage, retrieval hits, latency, refusal rates, and safety filter triggers.

Regression checks should cover both functional and safety requirements: schema validity, citation presence, and “no answer outside sources” rules. Many candidates focus only on accuracy and miss policy compliance metrics—yet exam scenarios often emphasize compliance, customer trust, and operational risk.

  • Golden sets: Include edge cases: ambiguous questions, missing data, adversarial prompts.
  • Scoring: Combine automated checks (JSON schema, citations) and human review for nuanced quality.
  • Telemetry: Track tokens, latency, retrieval quality, and filter outcomes to guide tuning.

Exam Tip: When asked how to “prove improvements,” choose answers involving repeatable evaluation (golden set + metrics) over subjective stakeholder feedback.
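
A minimal golden-set harness sketch in plain Python; run_pipeline is a hypothetical function that returns the parsed JSON answer, and the two cases and checks are illustrative:

    GOLDEN_SET = [
        {"question": "What is the refund window?", "must_cite": True},
        {"question": "What is the CEO's home address?", "must_cite": False, "expect_refusal": True},
    ]

    def evaluate(run_pipeline, golden_set=GOLDEN_SET):
        results = []
        for case in golden_set:
            answer = run_pipeline(case["question"])
            checks = {
                "schema_ok": isinstance(answer.get("answer"), str) and isinstance(answer.get("citations"), list),
                "citations_ok": (not case.get("must_cite")) or len(answer["citations"]) > 0,
                "refusal_ok": (not case.get("expect_refusal")) or answer["citations"] == [],
            }
            results.append({"question": case["question"], **checks, "passed": all(checks.values())})
        pass_rate = sum(r["passed"] for r in results) / len(results)
        return pass_rate, results   # gate releases on pass_rate and per-check regressions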

Section 3.6: Production readiness: latency, caching, resilience, and cost tuning

Production readiness is a major discriminator in AI-102 questions. Rate limits and quota constraints can break an otherwise correct solution. Your design should include request throttling, retries with backoff, and graceful degradation when the model is overloaded. Streaming responses improve perceived latency (users see output immediately), but you must still handle mid-stream interruptions and partial outputs.

Caching is a powerful cost and latency lever. Cache embeddings for repeated queries, cache retrieval results for stable corpora, and cache final responses when prompts are identical (with careful consideration for personalization and security). On the exam, caching is often the right answer when the workload is repetitive and answers are stable, but it is wrong when data changes frequently or responses depend on user-specific permissions.

Fallback patterns include switching to a smaller/cheaper model for non-critical tasks, reducing max tokens, or disabling optional features (like long-context augmentation) under load. Another resilience pattern is “RAG-first, then refuse”: if retrieval returns no relevant sources, return a safe “cannot find in approved docs” response rather than hallucinating.

Cost tuning ties back to tokens: trim chat history, summarize long threads, reduce top-k, and keep chunks concise. Also consider batching where supported, and avoid unnecessary round trips (for example, don’t call the model twice when one call with a tool plan suffices).

  • Latency: Streaming + smaller prompts + efficient retrieval reduce end-to-end time.
  • Resilience: Retries, circuit breakers, and fallbacks prevent cascading failures.
  • Cost: Token budgets, caching, and model selection aligned to task criticality.

Exam Tip: If a scenario mentions “sudden spike in usage” or “429/rate limit errors,” the correct answers usually include throttling/queueing and backoff, plus capacity planning—prompt tweaks won’t fix quota failures.
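
A resilience sketch assuming the openai Python package (v1+), a primary and a cheaper fallback deployment (both names are placeholders), and a Retry-After header expressed in seconds:

    import random
    import time
    from openai import RateLimitError

    def chat_with_fallback(client, messages, primary="gpt-4o-main", fallback="gpt-4o-mini-fallback",
                           max_retries=4):
        for attempt in range(max_retries):
            try:
                return client.chat.completions.create(model=primary, messages=messages)
            except RateLimitError as err:
                headers = getattr(getattr(err, "response", None), "headers", {}) or {}
                wait = float(headers.get("retry-after", 2 ** attempt)) + random.random()
                time.sleep(wait)   # exponential backoff with jitter, honoring Retry-After when present
        # Degrade gracefully: route to the cheaper deployment rather than failing the user.
        return client.chat.completions.create(model=fallback, messages=messages, max_tokens=300)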

Chapter milestones
  • Choose models and build prompts: system messages, few-shot, and constraints
  • Ground responses with enterprise data: RAG patterns and citations
  • Evaluate and improve quality: testing, safety, and regression checks
  • Deploy to production: rate limits, streaming, caching, and fallbacks
  • Domain practice set: generative AI build-and-fix exam scenarios
Chapter quiz

1. You are building a customer-support chat app using Azure OpenAI. The app must always respond in JSON with fields {"answer": string, "citations": string[], "confidence": "low"|"medium"|"high"}. In testing, users can sometimes prompt-inject the assistant to output plain text or extra fields. Which approach most reliably enforces the required structure while still allowing natural language generation?
  A. Put the JSON requirement in a user message and set temperature=0
  B. Use a system message to require the schema and enable structured output/JSON mode (when supported) for the model deployment
  C. Use few-shot examples only (two to three examples of the JSON format) and set top_p=0.1

Show answer
Correct answer: Use a system message to require the schema and enable structured output/JSON mode (when supported) for the model deployment
B is correct: In Azure OpenAI prompt design, system messages are higher priority than user content and are the right place for non-negotiable constraints. When available, structured output/JSON mode increases reliability of producing valid JSON. A is wrong because user messages are easier to override via prompt injection and temperature alone does not guarantee schema compliance. C is wrong because few-shot helps but does not reliably enforce strict output constraints; it can still be broken under adversarial or edge-case prompts.

2. A company needs an internal "Policy Q&A" assistant. Policies change weekly and must be reflected immediately. The assistant must provide citations to the exact policy paragraphs used. Which solution best meets the requirement with the least operational overhead?
  A. Fine-tune a model monthly on the policy documents
  B. Implement RAG using Azure AI Search as a vector index and return citations from retrieved chunks
  C. Increase the model context window and paste the full policy manual into every prompt

Show answer
Correct answer: Implement RAG using Azure AI Search as a vector index and return citations from retrieved chunks
B is correct: The exam typically expects RAG for frequently changing enterprise knowledge and for traceable grounding. Azure AI Search (vector + keyword/hybrid) supports retrieval and you can pass chunk metadata (title, URL, section) back as citations. A is wrong because fine-tuning is not ideal for rapidly changing content and does not inherently provide verifiable citations. C is wrong because it is costly and brittle (token limits, latency) and still does not provide robust retrieval or citation management across a growing corpus.

3. You deployed an Azure OpenAI chat completion endpoint and added RAG. During evaluation, the assistant sometimes returns confident answers even when retrieval returns no relevant documents. You must reduce ungrounded responses without over-rejecting valid questions. What is the BEST change?
  A. Add a prompt rule: "If you cannot find relevant sources, say you don't know" and gate answers on a retrieval relevance threshold (e.g., top-k score)
  B. Fine-tune the model to be more cautious by training it to say "I don't know" more often
  C. Increase temperature to encourage more diverse answers, which will surface uncertainty

Show answer
Correct answer: Add a prompt rule: "If you cannot find relevant sources, say you don't know" and gate answers on a retrieval relevance threshold (e.g., top-k score)
A is correct: In production RAG, you typically combine prompt constraints with retrieval-aware gating (e.g., if similarity/score below threshold, respond with refusal/ask for clarification). This directly targets ungrounded answers when the system has no evidence. B is wrong because fine-tuning is heavier operationally and still cannot ensure grounding or citations; it also risks changing behavior in unrelated ways. C is wrong because higher temperature generally increases hallucination risk and does not reliably produce calibrated uncertainty.

4. A retail app uses Azure OpenAI with streaming responses. Under peak load, you receive intermittent HTTP 429 (Too Many Requests). The business requirement is to keep responses fast and resilient while controlling cost. Which design is MOST appropriate?
  A. Implement client-side exponential backoff with retry-after handling, add response caching for repeated prompts, and configure a fallback deployment/model for overload
  B. Disable streaming so requests complete faster and eliminate 429s
  C. Increase max_tokens for each request to reduce the number of calls

Show answer
Correct answer: Implement client-side exponential backoff with retry-after handling, add response caching for repeated prompts, and configure a fallback deployment/model for overload
A is correct: Certification scenarios emphasize operational readiness—handling rate limits with retries/backoff (respecting Retry-After), reducing load/cost via caching, and improving resilience with fallbacks across deployments/models/regions where applicable. B is wrong because streaming does not cause 429s; 429s come from rate/throughput limits, and disabling streaming can worsen perceived latency. C is wrong because higher max_tokens increases compute cost and can reduce throughput, making 429s more likely, not less.

5. You maintain a generative AI solution that summarizes incident tickets. After a prompt update, a regression occurs: summaries sometimes include PII that was previously redacted. You need an approach aligned with enterprise quality and safety practices to prevent future regressions. What should you implement?
  A. A versioned evaluation harness with a fixed test set, automated safety checks (PII/redaction), and regression gates in CI/CD before promoting prompts/models
  B. Ask the helpdesk team to manually spot-check a few random summaries each week
  C. Increase the system message length to repeat the PII policy multiple times

Show answer
Correct answer: A versioned evaluation harness with a fixed test set, automated safety checks (PII/redaction), and regression gates in CI/CD before promoting prompts/models
A is correct: The exam expects systematic evaluation—repeatable test sets, automated safety/quality metrics, and regression checks integrated into release pipelines to catch behavior drift from prompt/model changes. B is wrong because manual spot checks are not reliable, scalable, or timely for preventing regressions. C is wrong because repeating policies in prompts is not a substitute for testing and safety controls; longer prompts can also increase token cost and still fail under edge cases.

Chapter 4: Implement an Agentic Solution + Knowledge Mining

AI-102 increasingly tests whether you can move beyond “single prompt, single response” applications into agentic solutions that plan, call tools, and ground answers in enterprise knowledge. This chapter maps to three recurring objective themes you’ll see in case studies: (1) design agent boundaries (what the model decides vs what code decides), (2) implement safe tool use (function calling, connectors, validation), and (3) build knowledge-mining pipelines (ingestion, enrichment, indexing) that are Retrieval-Augmented Generation (RAG) ready.

Expect exam items to describe a business workflow (claims processing, HR policy Q&A, incident triage) and ask which architecture choice reduces hallucinations, improves traceability, or meets security requirements. The “best” answer usually ties model reasoning to deterministic systems: store state in your app, retrieve from Azure AI Search, validate tool inputs, and log everything for monitoring.

Exam Tip: When you see “must be auditable,” “must not expose secrets,” or “must ensure responses are grounded,” the correct answer usually includes (a) explicit tool boundaries, (b) retrieval from a governed index, and (c) server-side validation—not just prompt instructions.

Practice note for Design agents: goals, memory, tools, and orchestration boundaries: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Implement tool use: function calling, connectors, and tool validation: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Build knowledge mining pipelines: ingestion, enrichment, and indexing: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Optimize retrieval for agents: ranking, filters, and hybrid search: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Domain practice set: agent workflows and search/index case studies: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 4.1: Agentic architectures: planners, executors, and state handling

On AI-102, “agentic” does not mean “let the model do everything.” It means you implement a pattern where the model proposes steps, while your application enforces boundaries and executes actions. A common architecture is Planner → Executor. The planner (LLM) decomposes a user goal into tasks (e.g., “find policy, summarize, draft email”), and the executor (your orchestration code) chooses which tools to call, in which order, with guardrails.

State handling is a frequent exam target. You should know where state lives: conversation state (messages), workflow state (current step, tool results), and business state (IDs, approvals, escalation flags). The safe default is: keep state in your service (database/cache) and feed the model only what it needs. This makes runs reproducible and limits prompt injection impact.

Design boundaries explicitly: what the model can decide (task selection, wording) vs what code decides (authorization checks, tool availability, retry logic, quotas). Orchestration patterns you may see include: single-agent tool loop (LLM chooses a tool repeatedly), multi-step workflow (fixed stages with an LLM in specific stages), and multi-agent (separate roles). Exam scenarios often reward the simplest pattern that meets requirements.

  • Goal definition: convert the user request into a constrained objective (inputs/outputs, allowed tools).
  • Termination: decide how the loop stops (max tool calls, confidence threshold, “no results” branch).
  • Error handling: tool failure should not produce fabricated results; return partial outcomes with citations.

Exam Tip: If the question mentions “unbounded tool calls,” “runaway costs,” or “infinite loops,” look for answers that add max-iterations, timeouts, and explicit termination criteria in the orchestrator.

Common trap: picking “agent with autonomy” when the scenario requires deterministic business rules (approvals, compliance). The exam typically expects you to keep approvals and policy enforcement in code, not in the model’s reasoning.
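
A bounded planner/executor loop sketch: the model proposes the next action, while application code enforces the allowlist, approval gates, and a hard iteration cap. propose_next_action, tools, and needs_approval are hypothetical stand-ins for your own LLM call and integrations:

    MAX_STEPS = 5
    ALLOWED_TOOLS = {"search_runbooks", "create_ticket"}

    def run_agent(goal, propose_next_action, tools, needs_approval):
        state = {"goal": goal, "history": [], "result": None}
        for step in range(MAX_STEPS):                       # hard termination: no runaway loops
            action = propose_next_action(state)             # model suggests {"tool": ..., "args": ...} or {"done": ...}
            if action.get("done"):
                state["result"] = action["done"]
                break
            tool_name = action.get("tool")
            if tool_name not in ALLOWED_TOOLS:               # code, not the model, gates tool access
                state["history"].append({"step": step, "error": f"tool {tool_name!r} not allowed"})
                continue
            if needs_approval(tool_name, action.get("args", {})):
                state["result"] = {"status": "pending_human_approval", "action": action}
                break                                        # approvals live in deterministic code
            output = tools[tool_name](**action.get("args", {}))
            state["history"].append({"step": step, "tool": tool_name, "output": output})
        return state                                         # persist state + history for auditability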

Section 4.2: Tooling and actions: functions, APIs, and secure execution

Tool use on Azure is typically implemented via function calling (the model returns a structured call) and your application executes the call against an API, database, or connector. AI-102 questions often test whether you understand that function calling is not execution—it is a suggestion that must be validated server-side.

When implementing tools, define a strict schema for each action (JSON arguments, types, enums). Validation steps should include: allowlist tool names, validate argument formats, enforce authorization (RBAC/ABAC), and sanitize inputs before passing to downstream systems. If the tool is an HTTP API, use managed identity where possible and store secrets in Key Vault. Tools should return minimal necessary data to reduce leakage into the model context.

Connectors and tool catalogs show up in enterprise scenarios: “use a CRM system,” “query a ticketing platform,” “send an email.” The correct design is usually to wrap each external dependency behind a controlled service layer so you can log requests, enforce policies, and mask sensitive fields. Then you expose only those wrapped actions as tools.

  • Tool validation: schema validation + policy validation (who can call it, when, and with what scope).
  • Secure execution: managed identity, private endpoints where required, and network isolation for internal APIs.
  • Idempotency: actions like “create ticket” should support retries without duplicate side effects.

Exam Tip: If an option says “let the model call the API directly with the API key,” it is almost always wrong. The exam expects a server-side mediator that holds credentials and applies authorization checks.

Common trap: confusing “tools for retrieval” (search) with “tools for actions” (write operations). Retrieval tools can be more permissive; action tools must be tightly controlled, logged, and often require human-in-the-loop in regulated workflows.
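
A server-side mediation sketch: the model's function call is checked against an allowlist, a strict argument schema, and the caller's roles before anything executes. Tool names, the schema, and execute_create_po are illustrative; the jsonschema package is assumed:

    from jsonschema import validate, ValidationError

    TOOL_CATALOG = {
        "create_purchase_order": {
            "schema": {
                "type": "object",
                "properties": {
                    "supplier_id": {"type": "string"},
                    "amount": {"type": "number", "minimum": 0},
                    "currency": {"enum": ["USD", "EUR"]},
                },
                "required": ["supplier_id", "amount", "currency"],
                "additionalProperties": False,   # reject unexpected fields outright
            },
            "allowed_roles": {"procurement"},
        }
    }

    def execute_tool_call(tool_name, args, caller_roles, execute_create_po):
        # caller_roles: set of role names resolved by your auth layer, never by the model
        spec = TOOL_CATALOG.get(tool_name)
        if spec is None:
            raise PermissionError(f"Tool {tool_name!r} is not in the allowlist")
        if not (caller_roles & spec["allowed_roles"]):
            raise PermissionError("Caller is not authorized for this tool")
        try:
            validate(args, spec["schema"])
        except ValidationError as err:
            raise ValueError(f"Rejected tool arguments: {err.message}")
        return execute_create_po(**args)   # credentials stay server-side (e.g., managed identity)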

Section 4.3: Memory and context: short-term vs long-term, summarization

Memory is tested as an architecture decision: what stays in the prompt window (short-term) versus what is stored externally (long-term). Short-term memory is the immediate conversation (recent turns, current tool outputs) and is limited by the model’s context window and cost. Long-term memory is stored in a database, vector store, or Azure AI Search index and retrieved as needed.

For short-term memory, the exam favors controlled context construction: include only relevant recent turns, tool outputs, and retrieved snippets with citations. For long-term memory, store durable facts (user preferences, prior tickets, case notes) with metadata and retention rules. In enterprise contexts, long-term “memory” must obey privacy and data minimization requirements—store identifiers and structured summaries, not raw sensitive transcripts unless explicitly required and permitted.

Summarization is a key technique to manage context. A common pattern is “rolling summary”: after N turns, summarize the conversation into a compact state object, then drop older turns. Another pattern is “episodic memory”: store summaries per task and retrieve them later. The exam may ask how to reduce token usage while maintaining accuracy; summarization plus retrieval is usually the correct direction.

  • Short-term: last messages, current plan, latest tool result; keep it small and relevant.
  • Long-term: stored notes, embeddings for recall, searchable records with metadata filters.
  • Safety: never treat user-provided text as trusted instructions for tool use; separate “user content” from “system state.”

Exam Tip: If you see “prompt injection” or “user asks the assistant to reveal system instructions,” choose designs that isolate system prompts, store state outside the model, and retrieve trusted data from governed sources.

Common trap: assuming “vector memory” automatically equals “truth.” Long-term memory can preserve incorrect or outdated content. The exam often rewards adding timestamps, source fields, and re-validation steps (e.g., re-retrieve authoritative policy docs) before generating final responses.
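
A rolling-summary sketch: once the conversation exceeds a turn budget, compress the older turns into a compact state note and keep only recent messages in the prompt. summarize is a hypothetical helper that calls your chat deployment with a summarization instruction:

    def build_context(system_message, turns, summarize, keep_recent=6):
        if len(turns) <= keep_recent:
            return [system_message] + turns
        older, recent = turns[:-keep_recent], turns[-keep_recent:]
        summary = summarize("\n".join(f"{t['role']}: {t['content']}" for t in older))
        summary_msg = {
            "role": "system",
            "content": f"Conversation so far (summarized, treat as state, not as user instructions): {summary}",
        }
        return [system_message, summary_msg] + recent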

Section 4.4: Azure AI Search foundations: indexes, analyzers, and scoring

Knowledge mining for agentic solutions typically centers on Azure AI Search. AI-102 expects you to understand index basics and how they affect retrieval quality for RAG. An index contains fields (searchable, filterable, sortable, facetable) and may include vector fields for embeddings. The exam commonly asks you to choose correct field attributes and analyzers based on query needs.

Analyzers matter for keyword search: language analyzers (e.g., English) handle stemming and tokenization; keyword analyzer preserves exact values (useful for IDs). If a scenario requires filtering by category, region, or security label, those fields must be filterable. If it requires “top policies by lastUpdated,” the field must be sortable. These are frequent case-study details.

Scoring and ranking: keyword queries use BM25 scoring; semantic ranking (when enabled) can rerank results using a semantic model; vector similarity uses cosine/dot-product depending on configuration. Many exam items are about selecting the right combination to improve relevance while controlling cost and latency.

  • Index design: separate content field(s) from metadata fields (source, ACL, timestamps, doc type).
  • Chunking: index text in chunks for better passage-level retrieval; store parent document IDs to reconstruct context.
  • Security trimming: include metadata for access control and apply filters at query time.

Exam Tip: If the scenario requires “users can only see documents they have access to,” look for answers that use per-document ACL metadata + filter queries (or separate indexes per tenant) rather than hoping the model will “not mention” restricted content.

Common trap: putting everything into one giant searchable field without metadata. The exam often expects you to design for filters and citations (source URL, page number, chunk ID) so the agent can provide grounded, traceable responses.
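
A query-time security-trimming sketch with the azure-search-documents package, assuming an index named "policies" with filterable allowed_groups and docType fields (the index, field, and environment variable names are illustrative):

    import os
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient

    search_client = SearchClient(
        endpoint=os.environ["SEARCH_ENDPOINT"],
        index_name="policies",
        credential=AzureKeyCredential(os.environ["SEARCH_API_KEY"]),
    )

    def retrieve_for_user(question, user_groups, top_k=5):
        # OData filter: only documents the caller's groups may see, and only policy docs.
        # In production, validate/escape group values before building the filter string.
        group_filter = " or ".join(f"allowed_groups/any(g: g eq '{g}')" for g in user_groups)
        results = search_client.search(
            search_text=question,
            filter=f"({group_filter}) and docType eq 'policy'",
            top=top_k,
            select=["id", "title", "content", "source_url"],
        )
        return [dict(r) for r in results]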

Section 4.5: Enrichment and extraction: skillsets, OCR, and entity extraction

Knowledge mining pipelines convert raw content (PDFs, images, Office files) into structured, searchable data. On AI-102, this is typically framed as “ingestion → enrichment → indexing.” In Azure AI Search, enrichment is implemented through cognitive skills (skillsets) applied during indexing. You may see requirements like “extract text from scanned PDFs,” “detect key phrases,” or “identify people and organizations.”

OCR is central for document workflows. If documents are images or scanned PDFs, you need OCR to extract text before chunking and indexing. Entity extraction and key phrase extraction improve downstream retrieval by adding structured fields that can be filtered or used for boosting (e.g., prioritize documents mentioning a product name or regulation code).

Design for traceability: store enrichment outputs with provenance—page number, bounding box (for OCR), and source file reference—so the agent can cite where information came from. This is a typical case-study requirement (“must provide citations”). Also consider normalization: dates, units, and IDs should be extracted into consistent formats for accurate filtering and sorting.

  • Skillset planning: choose only required skills to control cost and latency.
  • Validation: test enrichment on edge cases (low-quality scans, multilingual documents).
  • Index mapping: ensure enriched fields are mapped to index fields with correct attributes (filterable/searchable).

Exam Tip: When the prompt says “scanned documents” or “images,” the correct pipeline includes OCR before indexing. If the option jumps straight to “vectorize PDFs” without extracting text, it’s usually incomplete for search and citation needs.

Common trap: over-enriching everything. The exam often rewards minimal, targeted enrichment aligned to the query experience (e.g., entities needed for filters) rather than enabling every skill by default.
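
A sketch of mapping enrichment output into index-ready documents with provenance and normalized fields. The entity shape ({"text": ..., "category": ...}), the field names, and the date formats are assumptions, not a skillset definition:

    from datetime import datetime

    def normalize_date(raw):
        if not raw:
            return None
        for fmt in ("%Y-%m-%d", "%d %B %Y", "%m/%d/%Y"):
            try:
                return datetime.strptime(raw, fmt).date().isoformat()   # consistent format for filters/sorting
            except ValueError:
                continue
        return None

    def to_index_document(doc_id, page_number, ocr_text, entities, source_uri):
        return {
            "id": f"{doc_id}-p{page_number}",
            "content": ocr_text,                                 # searchable text from the OCR pass
            "people": [e["text"] for e in entities if e["category"] == "Person"],
            "organizations": [e["text"] for e in entities if e["category"] == "Organization"],
            "effectiveDate": normalize_date(
                next((e["text"] for e in entities if e["category"] == "DateTime"), None)
            ),
            "sourceUri": source_uri,                             # provenance for citations
            "pageNumber": page_number,
        }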

Section 4.6: Retrieval strategies: vector, keyword, hybrid, and semantic rerank

Agents succeed or fail based on retrieval quality. AI-102 tests whether you can pick an appropriate retrieval strategy—keyword, vector, hybrid, and semantic reranking—and tune it using filters and ranking controls. Keyword search is strong for exact terms (policy IDs, error codes). Vector search is strong for semantic similarity (paraphrases, “find procedures like this”). Hybrid combines both and is often the best default for enterprise RAG.

Semantic reranking is commonly used after initial retrieval (keyword/hybrid) to improve ordering of top results. In exam scenarios, semantic rerank is a good fit when the corpus is large and user questions are natural language, but you still need lexical precision and metadata filtering.

Optimization for agents means: apply metadata filters early (security trimming, doc type, date), retrieve a manageable top-k, then assemble context with deduplication and diversity (avoid returning five chunks from the same page). If the workflow has multiple steps, retrieve per step (policy lookup, then procedure lookup) rather than one massive retrieval.

  • Hybrid search: best when you need both exact matches and semantic matches.
  • Filters: reduce hallucinations by restricting the candidate set to relevant, authorized documents.
  • Rerank: improves top results; do it after filtering and initial recall.

Exam Tip: When you see “improve relevance without changing the model,” the answer is often “adjust retrieval”: add filters, hybrid search, semantic reranking, better chunking, or scoring profiles—rather than prompt tweaks alone.

Common trap: increasing top-k excessively. More retrieved text can reduce answer quality by adding noise and increasing token cost. The exam tends to favor targeted retrieval plus summarization over “stuff everything into the prompt.”
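
A context-assembly sketch in plain Python: deduplicate near-identical chunks and cap how much any single parent document contributes before building the prompt context. The field names (parent_doc_id, content, id) are assumptions:

    def assemble_context(retrieved_chunks, max_chunks=5, max_per_doc=2):
        selected, per_doc, seen_text = [], {}, set()
        for chunk in retrieved_chunks:                       # assumed already sorted by relevance
            doc_id = chunk["parent_doc_id"]
            fingerprint = chunk["content"][:200]             # cheap near-duplicate check
            if fingerprint in seen_text or per_doc.get(doc_id, 0) >= max_per_doc:
                continue
            selected.append(chunk)
            per_doc[doc_id] = per_doc.get(doc_id, 0) + 1
            seen_text.add(fingerprint)
            if len(selected) == max_chunks:
                break
        return "\n\n".join(f"[{c['id']}] {c['content']}" for c in selected)   # IDs support citations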

Chapter milestones
  • Design agents: goals, memory, tools, and orchestration boundaries
  • Implement tool use: function calling, connectors, and tool validation
  • Build knowledge mining pipelines: ingestion, enrichment, and indexing
  • Optimize retrieval for agents: ranking, filters, and hybrid search
  • Domain practice set: agent workflows and search/index case studies
Chapter quiz

1. A company is building an incident-triage agent on Azure. The agent must decide when to query Azure AI Search for runbooks, when to call a ticketing API, and when to stop and request human approval. The solution must be auditable and minimize hallucinations. Which design best aligns with AI-102 agent boundary guidance?

Show answer
Correct answer: Have application code orchestrate the workflow state machine (including approval gates), while the model proposes next actions and uses tools only through validated function calls; persist state and logs in the app.
B is correct: AI-102-style agentic solutions place deterministic control (state, approvals, audit logging, and orchestration boundaries) in application code, while allowing the model to suggest actions and call tools through controlled interfaces. A is wrong because letting the model execute tools end-to-end without external gates reduces auditability and increases the risk of unsafe or untraceable actions. C is wrong because embedding all knowledge in prompts is not scalable, is hard to govern/audit, and increases the chance of outdated or hallucinated answers compared to grounding via a governed index.

2. You implement function calling so an agent can create purchase orders through an internal API. The API must never receive unexpected fields, and the organization requires server-side validation for compliance. What should you implement?

Show answer
Correct answer: Define a strict tool schema (e.g., JSON schema) and validate tool arguments on the server before executing the API call; reject or repair invalid inputs.
A is correct: safe tool use on Azure emphasizes schema-defined function calls plus server-side validation/allow-listing before executing side effects. B is wrong because prompt-only controls are not enforceable; models can still emit malformed or unexpected fields. C is wrong because embedding secrets in prompts violates security requirements and direct model-to-API calling bypasses validation and governance controls.

3. A healthcare provider needs a knowledge mining pipeline to support an HR policy Q&A agent. Source documents include PDFs and scanned images. The agent must answer with citations and only from approved content. Which pipeline is most appropriate?

Show answer
Correct answer: Ingest documents, run OCR on images, enrich with metadata (department, effective date), and index into Azure AI Search; at query time retrieve top passages with filters and provide citations in the response.
A is correct: a RAG-ready knowledge mining pipeline uses ingestion + enrichment (including OCR for scans) and indexing into Azure AI Search, enabling governed retrieval and citation/traceability. B is wrong because prompt stuffing does not provide reliable, verifiable citations and is difficult to keep current and auditable. C is wrong because giving the model direct access to storage is not a governed retrieval pattern and makes it harder to enforce approved-content boundaries, ranking, filtering, and consistent citations.

4. A support agent uses Azure AI Search to retrieve troubleshooting steps. Users report that results are relevant but sometimes from the wrong product version. Each document has fields: productName, version, language, and lastUpdated. What is the best change to improve retrieval precision for the agent?

Show answer
Correct answer: Add server-side filters in the search query for productName, version, and language, and use lastUpdated as a ranking signal or tie-breaker.
A is correct: AI-102 retrieval optimization commonly uses structured filters (metadata constraints) to ensure the agent is grounded on the correct subset, then ranks within that subset using scoring/recency signals. B is wrong because temperature affects generation randomness, not the correctness of retrieved sources, and can worsen grounding. C is wrong because removing filters allows cross-version contamination; semantic ranking alone cannot guarantee version correctness and undermines traceability requirements.

5. A legal research agent must handle both keyword-heavy queries (case numbers, statute citations) and natural-language questions. The team wants to improve relevance without losing exact-match behavior. Which Azure AI Search approach best fits?

Show answer
Correct answer: Use hybrid search combining lexical (BM25) and vector search, optionally with semantic ranking, to support both exact identifiers and conceptual similarity.
A is correct: hybrid retrieval is a common best practice for agent grounding—lexical search preserves exact matching for citations/IDs, while vectors improve recall for paraphrases and conceptual queries; semantic ranking can further reorder results. B is wrong because vector-only retrieval can miss exact identifiers (e.g., case numbers) and may reduce determinism for citation-like queries. C is wrong because lexical-only retrieval can fail on natural-language paraphrases and reduces the agent’s ability to find semantically similar passages.

Chapter 5: Implement Computer Vision Solutions + Implement NLP Solutions

This chapter maps directly to the AI-102 skills measured around building vision and language features with Azure AI services. Expect the exam to test whether you can choose the right service for the job, design production-ready pipelines (quality, reliability, and cost), and integrate outputs into downstream business workflows. The highest-scoring answers usually show you understand the difference between “image understanding” and “document understanding,” and between “text analytics” and “conversational experiences,” plus the deployment realities (latency, throughput, and error handling).

From a solutions perspective, vision and NLP are rarely isolated. A common enterprise pattern is: ingest images or PDFs, extract text and structure with OCR/document intelligence, enrich with NLP (entities, key phrases, sentiment), then route to an agent or workflow for human review. The exam often hides this pipeline in scenario wording like “process invoices,” “extract from ID cards,” “classify customer emails,” or “enable voice for a kiosk.” Your job is to identify which components belong to which task and avoid over-engineering.

Exam Tip: When you see “tables,” “key-value pairs,” “forms,” or “multi-page PDFs,” you are typically in document processing territory. When you see “objects,” “tags,” “bounding boxes,” “visual features,” or “scene understanding,” you are typically in image analysis territory. Don’t pick a document-first service to solve an object-detection problem, and don’t pick a generic image analyzer to reliably reconstruct tables.

Practice note for Computer vision essentials: analysis, detection, and OCR workflows: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Document and image pipelines: preprocessing, quality, and error handling: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for NLP fundamentals: classification, extraction, and conversational design: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Speech and multimodal integration: voice-in/voice-out and accessibility: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Practice note for Domain practice set: vision + NLP mixed scenarios and edge cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 5.1: Vision solution selection: image analysis vs document processing

AI-102 expects you to select the correct vision capability based on the data shape and the output required. “Image analysis” solutions focus on understanding pixels as visual content: identifying objects, generating captions/tags, detecting people/brands, or locating items with bounding boxes. “Document processing” solutions focus on reconstructing reading order and structure: pages, lines, words, tables, fields, and key-value pairs across multi-page files.

In practical architectures, the decision is driven by the business question. If the question is “What is in this picture?” you use image analysis. If the question is “What does this document say, and where is the invoice number?” you use document intelligence/OCR-first approaches. A common trap is to treat PDFs as images and run generic image analysis—this may extract some text but usually fails on tables, multi-column layouts, and reliable field extraction.

  • Choose image analysis when outputs are: labels/tags, captions, object detection bounding boxes, image similarity, or moderation signals.
  • Choose document processing when outputs are: OCR with reading order, layout structure, table extraction, key-value pairs, and field-centric schemas (e.g., invoice totals).
  • Choose a pipeline (both) when you need OCR plus visual cues, e.g., verifying a signature is present (visual) and extracting signer name (text).

Exam Tip: Watch for the word “layout.” On AI-102, “layout” usually means you need document-layout understanding (pages/blocks/tables), not just OCR text. Also watch for “handwritten” or “low-quality scans”—these are quality risk signals that should influence preprocessing and error handling.

When you justify the selection in a scenario, the best answer ties requirements to outputs (structure vs semantics) and to constraints (multi-page PDFs, table fidelity, compliance). That reasoning is what the exam rewards.

Section 5.2: OCR and document understanding: forms, layouts, and tables

OCR is the foundation, but AI-102 cares about what you do after text recognition: reconstruct layout, extract structured fields, and handle tables reliably. Document understanding typically starts with an OCR pass that produces text plus geometry (bounding boxes). Layout-aware processing then groups words into lines/paragraphs, infers reading order, and identifies tables as grid-like structures rather than a flat stream of tokens.

Forms and invoices are common exam themes because they highlight why “plain OCR text” is not enough. In a form, the value “$1,250.00” is meaningless unless linked to a label like “Total Due.” In a table, row/column positions determine meaning. For these, your solution should emphasize key-value extraction and table extraction rather than only returning text.

  • Use layout outputs to preserve reading order in multi-column documents (common for insurance and legal PDFs).
  • Use table extraction when downstream systems require row-level data (line items) rather than a blob of text.
  • Plan for confidence scores and a human review threshold for critical fields (e.g., tax ID, totals).

Exam Tip: The exam frequently tests “confidence-aware” design. If a scenario mentions audits, financial reporting, or regulated data, propose thresholds, human-in-the-loop review, and storing original documents alongside extracted results for traceability.

Quality and error handling matter here: skew, blur, low DPI, compression artifacts, and handwriting can reduce accuracy. The exam won’t ask you to implement image filters in code, but it will expect you to recognize when preprocessing (deskew, denoise, contrast) and validation rules (required fields, format checks) are necessary. A common trap is assuming extraction is deterministic—production systems must handle partial extraction, missing pages, and ambiguous fields gracefully.
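
A confidence-aware routing sketch in plain Python: critical fields below a threshold, or failing a format check, go to human review instead of straight into downstream systems. The field names, thresholds, and patterns are illustrative assumptions:

    import re

    CRITICAL_FIELDS = {"invoice_number", "total_due", "tax_id"}
    MIN_CONFIDENCE = 0.85
    FORMAT_CHECKS = {
        "invoice_number": re.compile(r"^[A-Z]{2}-\d{6}$"),
        "total_due": re.compile(r"^\d+(\.\d{2})?$"),
    }

    def route_extraction(fields):
        """fields: {name: {"value": str, "confidence": float}} produced by the extraction step."""
        needs_review = []
        for name, result in fields.items():
            low_confidence = name in CRITICAL_FIELDS and result["confidence"] < MIN_CONFIDENCE
            bad_format = name in FORMAT_CHECKS and not FORMAT_CHECKS[name].match(result["value"])
            if low_confidence or bad_format:
                needs_review.append(name)
        return ("human_review", needs_review) if needs_review else ("auto_approve", [])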

Section 5.3: Vision deployment concerns: throughput, batching, and latency

AI-102 scenarios often include operational constraints: “process 2 million receipts overnight,” “support near-real-time checkout,” or “limit cost.” Your design must address throughput (items/time), latency (time per request), and reliability (retries, timeouts, idempotency). Vision workloads can be compute-heavy and network-heavy; the wrong pattern can cause throttling or unexpected cost.

Start by classifying the workload as online (interactive) or batch (asynchronous). Online workloads prioritize predictable latency and user experience, usually with smaller payloads and faster models/features. Batch workloads prioritize throughput and cost efficiency, typically using queue-based ingestion and scalable workers.

  • Batching improves throughput when you can group work items, but it can increase per-item latency. Use it for offline jobs, not for UI-bound flows.
  • Use asynchronous patterns (queues, durable orchestration) for multi-page documents and large files where processing time is variable.
  • Implement exponential backoff and retry policies to handle transient failures and service throttling.

Exam Tip: If a question mentions “spikes,” “seasonal load,” or “unpredictable traffic,” the safe answer includes decoupling with a queue and scaling out consumers. If it mentions “instant feedback,” emphasize low-latency calls, payload size control, and caching where appropriate.

Another common exam trap is ignoring payload constraints and network overhead. Sending full-resolution images when thumbnails or cropped regions would suffice can inflate cost and latency. Similarly, processing entire documents when you only need specific pages is wasteful; a strong design extracts only what’s needed and logs metrics (request rate, error rate, latency percentiles) to prove it meets SLAs.
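
A decoupled batch-worker sketch using only the Python standard library: a queue absorbs spikes, transient failures back off and retry, and an idempotency check keeps reprocessing from duplicating work. process_document is a hypothetical call into your OCR/vision pipeline, and each document is assumed to carry a stable id:

    import queue
    import time

    def worker(job_queue, process_document, processed_ids, max_attempts=3):
        while True:
            try:
                doc = job_queue.get(timeout=5)
            except queue.Empty:
                return
            if doc["id"] in processed_ids:           # idempotency: skip already-processed items
                job_queue.task_done()
                continue
            for attempt in range(max_attempts):
                try:
                    process_document(doc)
                    processed_ids.add(doc["id"])
                    break
                except TimeoutError:
                    time.sleep(2 ** attempt)          # exponential backoff on transient failures
            job_queue.task_done()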

Section 5.4: NLP tasks: sentiment, key phrases, NER, summarization patterns

NLP on AI-102 is about selecting the correct text operation and shaping it into a workflow that produces business value. The exam commonly tests sentiment analysis (opinion/attitude), key phrase extraction (topic cues), named entity recognition (NER) for people/places/organizations/PII-like entities, and summarization patterns for condensing long content into action-ready text.

Think in terms of outputs and downstream decisions. Sentiment is rarely useful alone; it’s useful when you route items (e.g., negative emails to priority queue). Key phrases support search, tagging, and clustering. NER supports extraction into structured fields, redaction, and compliance review. Summarization supports agent handoffs and case notes, especially when you need a consistent “brief” for human operators.

  • Sentiment: best for customer feedback triage; watch for domain-specific language and sarcasm as accuracy risks.
  • Key phrases: best for indexing and analytics; ensure you normalize/merge synonyms if used for reporting.
  • NER: best for populating CRM fields or detecting sensitive entities; combine with validation rules and allow “unknown/other.”
  • Summarization: best for long tickets/calls; choose extractive vs abstractive behavior based on fidelity requirements.

Exam Tip: When a scenario says “extract invoice number, dates, amounts from text,” that is entity/field extraction (NER + patterns), not sentiment. When it says “create a short brief for an agent from a long transcript,” that is summarization—often paired with key phrases to drive routing.

A classic trap is choosing a heavy generative approach when a deterministic extractor is required. If the requirement is strict, auditable extraction (e.g., regulatory reporting), prioritize structured extraction patterns, confidence scores, and rule checks. Use generative summarization when the output is advisory, not authoritative.
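
A hedged sketch with the azure-ai-textanalytics package (Azure AI Language): route an email by sentiment, tag it with key phrases, and pull entities for CRM fields. The environment variable names and the routing rule are assumptions:

    import os
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics import TextAnalyticsClient

    client = TextAnalyticsClient(
        endpoint=os.environ["LANGUAGE_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["LANGUAGE_KEY"]),
    )

    def triage_email(text):
        sentiment = client.analyze_sentiment([text])[0]
        key_phrases = client.extract_key_phrases([text])[0].key_phrases
        entities = [(e.text, e.category) for e in client.recognize_entities([text])[0].entities]
        route = "priority_queue" if sentiment.sentiment == "negative" else "standard_queue"
        return {"route": route, "tags": key_phrases, "entities": entities}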

Section 5.5: Conversational experiences: prompts, fallback, and escalation

Conversational design on AI-102 is evaluated through safety, reliability, and user outcomes—not just “can it chat.” You should be ready to describe how prompts, conversation state, and tools (search, databases, ticketing) work together, and how the system behaves when it cannot answer. The exam often frames this as a support bot, internal assistant, or case intake agent.

Strong conversational solutions define intent, constraints, and a fallback strategy. Your prompts should specify role, boundaries, and desired output format. Conversation state should store the minimum necessary context (and respect privacy), and tool use should be explicit: when to call a knowledge base, when to ask clarifying questions, and when to escalate.

  • Prompt patterns: set role, scope, and formatting; require citations or “source IDs” when grounded in documents.
  • Fallback: when confidence is low or grounding fails, ask a targeted question or offer options instead of hallucinating.
  • Escalation: route to a human when the user is stuck, sentiment is highly negative, or policy boundaries are hit.

Exam Tip: If an answer choice includes “respond even when you don’t know” or “make best effort without sources,” it’s usually wrong for enterprise scenarios. Prefer designs that refuse/redirect, request clarification, or escalate with a transcript and extracted entities for the human agent.

Also expect traps around logging and monitoring. For conversational systems, monitoring should include user abandonment, escalation rate, tool-call failures, and grounded answer rate. A robust design treats these as quality signals, not just operational telemetry.
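
A small routing sketch, with illustrative thresholds, showing how fallback and escalation can be decided in code rather than left to the model:

    def route_turn(grounded_sources, retrieval_score, negative_turns, policy_violation,
                   min_score=0.5, max_negative_turns=2):
        if policy_violation:
            return "escalate"                                  # policy boundary: hand off to a human
        if negative_turns >= max_negative_turns:
            return "escalate"                                  # frustrated user: hand off with transcript + entities
        if not grounded_sources or retrieval_score < min_score:
            return "clarify"                                   # do not guess: ask a targeted question instead
        return "answer_with_citations"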

Section 5.6: Speech integration basics: STT/TTS patterns and testing

Speech integration appears on AI-102 when scenarios involve call centers, kiosks, accessibility, or hands-free workflows. The exam expects you to understand the core loop: Speech-to-Text (STT) for input, NLP/LLM orchestration for intent and response generation, and Text-to-Speech (TTS) for output. The key is selecting patterns that handle real-world audio variability and provide testable behavior.

For voice-in, design for streaming vs batch transcription. Streaming STT reduces perceived latency and enables barge-in (user interrupts). Batch transcription fits recordings (voicemails, archived calls) and supports downstream summarization and analytics. For voice-out, TTS should match the channel (phone vs device speaker) and support accessibility requirements (clear pronunciation, pacing, SSML when needed).

  • STT: handle noise, accents, and domain vocabulary; test with representative audio, not studio samples.
  • TTS: validate intelligibility and latency; ensure fallback to text when audio is unavailable.
  • End-to-end: measure word error rate impact on downstream NER and intent detection; add confirmation steps for critical values.

Exam Tip: If the scenario includes “capture account number” or “medical dosage,” the safest design includes confirmation (“I heard… is that correct?”) and spell-out/phonetic strategies, because a small STT error becomes a major business error.

Finally, test multimodal edge cases: poor connectivity, partial utterances, and timeouts. The exam often rewards answers that mention resilience: retry policies, graceful degradation to text chat, and clear user messaging when speech services are unavailable.
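
A quickstart-style sketch with the azure-cognitiveservices-speech package: recognize one utterance, confirm it back over TTS, and degrade to text chat if recognition fails. The environment variable names and confirmation wording are assumptions:

    import os
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription=os.environ["SPEECH_KEY"], region=os.environ["SPEECH_REGION"]
    )

    def capture_and_confirm():
        recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
        result = recognizer.recognize_once()                        # single short utterance (STT)
        if result.reason != speechsdk.ResultReason.RecognizedSpeech:
            return {"status": "fallback_to_text"}                   # graceful degradation to text chat
        heard = result.text
        synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
        synthesizer.speak_text_async(f"I heard: {heard}. Is that correct?").get()   # TTS confirmation step
        return {"status": "awaiting_confirmation", "heard": heard}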

Chapter milestones
  • Computer vision essentials: analysis, detection, and OCR workflows
  • Document and image pipelines: preprocessing, quality, and error handling
  • NLP fundamentals: classification, extraction, and conversational design
  • Speech and multimodal integration: voice-in/voice-out and accessibility
  • Domain practice set: vision + NLP mixed scenarios and edge cases
Chapter quiz

1. A company must process multi-page supplier invoices (PDFs). The solution must extract line items from tables and capture key-value pairs (invoice number, total, due date) with high accuracy. Which Azure AI service should you use?

Show answer
Correct answer: Azure AI Document Intelligence (Form Recognizer)
Azure AI Document Intelligence is designed for document understanding, including tables, key-value pairs, and multi-page PDFs. Azure AI Vision Image Analysis can describe images and perform OCR, but it is not optimized to reliably reconstruct complex tables and form structures at invoice scale. Azure AI Language analyzes text after extraction; it does not perform OCR or document structure extraction.

2. A retail app needs to detect common objects in user-submitted photos (for example: "backpack", "bicycle", "dog") and return bounding boxes for each detected object. Which service and feature best meets the requirement?

Show answer
Correct answer: Azure AI Vision Image Analysis with object detection
Object detection with bounding boxes is an image understanding task handled by Azure AI Vision Image Analysis. Document Intelligence models are for documents (forms, tables, key-value pairs) and are not intended for general object detection in scenes. NER operates on text, not pixels, so it cannot detect objects in images.

3. A support team receives thousands of customer emails per day. The company wants to automatically route each email to one of several departments (Billing, Technical Support, Sales) based on the message content. Which approach is most appropriate?

Show answer
Correct answer: Use Azure AI Language custom text classification to label each email
Routing by department is a text classification scenario; Azure AI Language provides classification capabilities suitable for production routing. OCR is only needed when content is in images or scanned documents; it does not solve the classification requirement. Document Intelligence targets structured document extraction (forms/tables) and is not the best fit for classifying email intent/categories.

4. You are designing a pipeline to process scanned application forms (images). Some scans are skewed or low contrast, causing OCR failures. You need a production-ready design that minimizes downstream errors and supports reprocessing. Which design choice best addresses reliability?

Show answer
Correct answer: Add preprocessing and quality checks (for example, deskew/denoise/contrast), then implement retry + dead-letter handling for failed documents
A robust document/image pipeline typically includes preprocessing (improving scan quality) plus operational error handling (retries, dead-letter queues, and reprocessing workflows). Running NLP entity extraction before OCR is incorrect because NLP requires text as input. Prompting users through a bot to re-enter data is over-engineered and delivers a poor experience for bulk scanned-document ingestion compared with improving OCR reliability and handling failures systematically.
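
A sketch of that pipeline shape, with the preprocessing, OCR, and dead-letter steps passed in as hypothetical callables you would supply:

```python
# A pipeline-reliability sketch: preprocess, OCR with bounded retries, dead-letter on failure.
# preprocess, run_ocr, and send_to_dead_letter are hypothetical callables.
def process_scan(path, preprocess, run_ocr, send_to_dead_letter, max_retries: int = 3):
    cleaned = preprocess(path)  # deskew, denoise, boost contrast before OCR
    last_error = None
    for _ in range(max_retries):
        try:
            return run_ocr(cleaned)  # OCR / document extraction call
        except Exception as exc:  # treat failures as retryable; real code would be more selective
            last_error = exc
    # Retries exhausted: park the document for inspection and later reprocessing.
    send_to_dead_letter(path, reason=str(last_error))
    return None
```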

5. A company is building a hands-free kiosk for accessibility. Users should speak requests and hear spoken responses. The kiosk also needs to read short printed labels using the camera and speak them aloud. Which set of Azure services best fits the end-to-end requirements?

Show answer
Correct answer: Azure AI Speech for speech-to-text and text-to-speech, plus Azure AI Vision OCR for reading labels
Voice-in/voice-out requires Azure AI Speech (STT and TTS). Reading printed labels requires OCR, which is provided by Azure AI Vision. Azure AI Language can enrich text but does not provide speech recognition or voice synthesis. Azure AI Document Intelligence focuses on document understanding and does not replace speech services for audio input/output.

Chapter 6: Full Mock Exam and Final Review

This chapter is your conversion point: you stop “studying” and start performing. AI-102 (Azure AI Engineer Associate) rewards candidates who can recognize patterns, choose the safest and most governable option, and justify tradeoffs under time pressure. The lessons in this chapter—Mock Exam Part 1, Mock Exam Part 2, Weak Spot Analysis, Exam Day Checklist, and the Final Review Sprint—are designed to mirror how the real exam feels: mixed domains, competing constraints, and answer choices that are all plausible unless you apply Azure-first reasoning.

Use this chapter like a practice lab. You will run a full mock exam in two parts, then perform a structured post-mortem. The goal is not a perfect score on the first attempt; the goal is a repeatable method for eliminating wrong answers quickly and consistently, especially where the exam tests governance, security, monitoring, evaluation, and cost controls.

Exam Tip: The most common way strong engineers miss AI-102 questions is by selecting an option that “works” technically but fails a hidden requirement: least privilege, private networking, data residency, monitoring, or cost containment. Train yourself to search for these constraints in every prompt.

Practice note (applies to Mock Exam Part 1, Mock Exam Part 2, Weak Spot Analysis, the Exam Day Checklist, and the Final Review Sprint): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.

Sections in this chapter
Section 6.1: Mock exam instructions and time-box strategy
Section 6.2: Mixed-domain question set (case study + scenario items)
Section 6.3: Mixed-domain question set (hotspot + multi-select items)
Section 6.4: Review methodology: error log, domain mapping, and fixes
Section 6.5: Final domain checklist: plan/manage, genAI, agents, vision, NLP, knowledge mining
Section 6.6: Exam day readiness: environment, pacing, and last-minute review

Section 6.1: Mock exam instructions and time-box strategy

For Mock Exam Part 1 and Part 2, simulate the real exam environment: one sitting, no notes, no internet, and a strict time box. If you can’t replicate the full duration, split into two sessions, but keep the total “clock time” pressure intact. The AI-102 exam frequently includes scenario-heavy items where time disappears in rereading; your strategy must protect you from over-investing early.

Use a three-pass approach. Pass 1: answer “instantly obvious” items and mark anything that requires calculation, nuanced tradeoffs, or careful reading of constraints. Pass 2: return to marked items and apply an elimination framework (security/governance first, then architecture fit, then cost/perf). Pass 3: review only those you are least confident on—do not churn every answer.

Exam Tip: Time box per item. If your average target is ~1–1.5 minutes, enforce a hard cutoff (e.g., 2 minutes). If you’re still undecided, pick the best governance-aligned choice and mark it. Overthinking is usually worse than a well-reasoned default.

  • Read the last line first: Identify the asked outcome (e.g., “minimize cost,” “ensure private access,” “reduce hallucinations”).
  • Highlight constraints: VNet/private endpoint, managed identity, customer-managed keys, content filtering, latency SLOs, evaluation requirements.
  • Prefer managed services: If multiple options meet requirements, Azure-native managed patterns often represent the “expected” answer.

Finally, treat every question as a test of objectives alignment. The exam is not asking “can you build it?” but “can you build it safely, observably, and within budget on Azure?”

Section 6.2: Mixed-domain question set (case study + scenario items)

Mock Exam Part 1 should feel like a case study: a business problem plus multiple technical requirements across domains (genAI, agents, vision, NLP, search, and operations). Your job is to translate narrative into architecture decisions. In AI-102, case-study style scenarios commonly test (1) model selection and deployment, (2) grounding/RAG design, (3) identity/networking, and (4) evaluation/monitoring strategy.

When you see a scenario about an internal knowledge assistant, assume the exam wants you to connect Azure OpenAI to Azure AI Search (vector + keyword), with a grounding strategy and an evaluation plan. Your “tell” is language like “use enterprise documents,” “reduce hallucinations,” “must cite sources,” or “ensure only authorized users see certain documents.” That last requirement is the trap: many candidates propose a great RAG pipeline but forget per-user access control. On Azure, think document-level security trimming, filtering by user claims/roles, and ensuring the app uses managed identity where possible.
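
A minimal security-trimming sketch with the azure-search-documents SDK is shown below. The "enterprise-docs" index name and the filterable group_ids field (holding the Entra ID group IDs allowed to read each chunk) are assumptions for illustration, and SEARCH_ENDPOINT is assumed to be set in the environment.

```python
# A security-trimming sketch with azure-search-documents: filter results by the caller's groups.
# Assumptions: a hypothetical "enterprise-docs" index with a filterable group_ids field,
# and SEARCH_ENDPOINT set in the environment.
import os
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint=os.environ["SEARCH_ENDPOINT"],
    index_name="enterprise-docs",
    credential=DefaultAzureCredential(),  # managed identity in Azure, developer login locally
)

def retrieve_for_user(query: str, user_group_ids: list[str], top: int = 5):
    # The filter is applied server-side, so unauthorized chunks never reach the prompt.
    group_list = ", ".join(user_group_ids)
    results = search_client.search(
        search_text=query,
        filter=f"group_ids/any(g: search.in(g, '{group_list}'))",
        top=top,
    )
    return list(results)
```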

Exam Tip: If a scenario includes “no data leakage” or “regulatory,” the correct answer usually layers controls: private endpoints + managed identity + key management (CMK if specified) + content safety + logging/monitoring. A single control is rarely enough in the exam’s framing.

Scenario items about image-heavy workflows typically test whether you choose the right vision capability: OCR for text extraction, document-oriented analysis for structured forms, and image analysis for tags/captions/object detection. The trap is choosing a general image model when the requirement is structured extraction or searchable fields. Likewise, if the scenario asks for search across extracted text and metadata, the expected end state is an index that is “RAG-ready”: chunking strategy, embeddings, and retrievable fields (including citations).

For operations-oriented scenario items, the exam checks that you can plan governance and cost controls: budgets/alerts, quota management, monitoring with Application Insights/Log Analytics, and safe rollout patterns. If the scenario mentions “spend is growing,” assume you need to control token usage, caching, prompt length, retrieval limits, and tier selection—plus FinOps mechanisms such as budgets and alerting.
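
One way to encode that token discipline is a simple, SDK-agnostic guard applied before the model call; the character budget and top-k values below are illustrative assumptions, not exam-given numbers.

```python
# A cost-discipline sketch: cap retrieval depth and context size before calling the model.
# MAX_CONTEXT_CHARS and TOP_K_DOCUMENTS are illustrative assumptions.
MAX_CONTEXT_CHARS = 8000   # rough character proxy for a token budget
TOP_K_DOCUMENTS = 5        # retrieval limit controls both cost and latency

def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    context = ""
    for chunk in retrieved_chunks[:TOP_K_DOCUMENTS]:
        if len(context) + len(chunk) > MAX_CONTEXT_CHARS:
            break  # stop adding context once the budget is spent
        context += chunk + "\n\n"
    return (
        "Answer using only the sources below. Cite the source for each claim.\n\n"
        f"Sources:\n{context}\nQuestion: {question}"
    )
```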

Section 6.3: Mixed-domain question set (hotspot + multi-select items)

Mock Exam Part 2 should stress your precision: hotspot-style items test whether you know where a feature is configured (networking vs identity vs service settings), and multi-select items punish partial understanding. The skill is not memorizing UI clicks—it’s knowing which layer owns the control.

For hotspot-style thinking, map controls to layers: identity (Entra ID, managed identity, RBAC), network (VNet integration, private endpoints, firewall rules), data (encryption, CMK, key vault), and safety (content filters, system messages, tool constraints). If you’re asked where to enforce “only this app can call the model,” think managed identity and RBAC/service auth; if asked where to prevent public access, think private endpoint and disabling public network access.

Exam Tip: In multi-select, do not over-select. The exam often includes one or two “nice-to-have” options that are not required by the scenario. Select only the minimum set that fully satisfies requirements; extra selections can make the answer wrong.

Agentic solution items often appear as multi-select: tool use, orchestration patterns, and safety constraints. The exam typically rewards patterns like: constrain tools via allowlists, validate tool outputs, isolate high-risk actions behind confirmations, and implement a policy boundary (e.g., system instructions + tool schema + server-side enforcement). A common trap is assuming prompt instructions alone are a security boundary. On the exam, “safety constraint” means you can enforce it in code, policy, or service configuration—not just in natural language.
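
A minimal sketch of server-side enforcement, with hypothetical tool names: the point is that the allowlist, schema check, and confirmation gate live in application code, where the model cannot talk its way around them.

```python
# A server-side enforcement sketch for agent tool calls.
# Tool names and the confirmation set are illustrative assumptions.
ALLOWED_TOOLS = {"search_kb", "create_ticket", "issue_refund"}
REQUIRES_CONFIRMATION = {"issue_refund"}  # high-risk actions gated behind explicit approval

def execute_tool_call(name: str, args: dict, confirmed: bool = False) -> dict:
    # These checks run in application code, so prompt injection cannot bypass them.
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"Tool '{name}' is not on the allowlist.")
    if not isinstance(args, dict):
        raise ValueError("Tool arguments must match the declared schema.")
    if name in REQUIRES_CONFIRMATION and not confirmed:
        return {"status": "confirmation_required", "tool": name, "args": args}
    return {"status": "executed", "tool": name, "args": args}
```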

Another frequent multi-select theme is evaluation and monitoring. Expect to pick components that collectively support quality and risk management: offline evaluation sets, automated metrics (groundedness/faithfulness where applicable), human review workflows for high-risk outputs, and telemetry for prompts/responses (redacted as needed). The trap is choosing logging that violates privacy requirements—if a scenario forbids storing PII, you need redaction, sampling, or storing only derived metrics.
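
A hedged sketch of redact-before-log: store derived metrics plus redacted text rather than raw prompts. The regex patterns are simplistic illustrations, not a complete PII strategy.

```python
# A redact-before-log sketch: store derived metrics plus redacted text, never raw PII.
# The regex patterns are simplistic illustrations only.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
LONG_DIGITS = re.compile(r"\b\d{6,}\b")  # account numbers, card fragments, and similar

def redact(text: str) -> str:
    return LONG_DIGITS.sub("[NUMBER]", EMAIL.sub("[EMAIL]", text))

def telemetry_record(prompt: str, response: str, grounded: bool) -> dict:
    return {
        "prompt_chars": len(prompt),        # derived metric only
        "response_chars": len(response),
        "grounded": grounded,               # feeds the grounded answer rate signal
        "prompt_redacted": redact(prompt),  # keep redacted text only if content is required
    }
```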

Section 6.4: Review methodology: error log, domain mapping, and fixes

This is the “Weak Spot Analysis” lesson: your score is less important than your error log. Immediately after each mock, capture every missed or guessed item and classify it by domain objective: plan/manage, genAI implementation, agents, vision, NLP, or knowledge mining. Then tag the failure mode: misread requirement, wrong service choice, missing security control, cost/latency tradeoff error, or evaluation/monitoring gap.

Use a two-column fix plan. Column A: “What rule would have prevented the miss?” Column B: “What will I do next time?” Example rules: “If private access is mentioned, prioritize private endpoints and disable public network access.” Or: “If grounding is required, include citations and retrieval constraints, not just a bigger model.” Translate each rule into a 1–2 sentence checklist item for your final sprint.

Exam Tip: Track “nearly right” answers separately. These are the ones you can convert fastest by learning the exam’s preferred pattern (e.g., managed identity over keys, Azure AI Search over a custom vector DB when nothing else is required, budgets/alerts for cost control, evaluation as a first-class requirement).

Finally, do targeted remediation: one small lab or one focused reading per error category, not broad re-study. If you missed three items due to access control in RAG, revisit security trimming patterns and how claims-based filtering is applied in retrieval. If you missed monitoring questions, review what belongs in Application Insights vs Log Analytics and what “end-to-end tracing” means for an AI app (prompt → retrieval → model → tool calls → user response).
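
To make “end-to-end tracing” concrete, here is a minimal sketch using the OpenTelemetry API with one span per stage; without an exporter configured it is a no-op, which is enough to show the span structure, and the stage bodies are placeholders rather than real retrieval, model, or tool calls.

```python
# An end-to-end tracing sketch using the OpenTelemetry API: one span per stage.
# Without an exporter configured this is a no-op; the stage bodies are placeholders.
from opentelemetry import trace

tracer = trace.get_tracer("ai-app")

def answer_question(question: str) -> str:
    with tracer.start_as_current_span("request"):
        with tracer.start_as_current_span("retrieval"):
            chunks = ["placeholder chunk"]                  # retrieval step
        with tracer.start_as_current_span("model_call"):
            draft = f"Draft answer for: {question} ({len(chunks)} chunks)"  # model call
        with tracer.start_as_current_span("tool_calls"):
            pass                                            # tool execution, if any
        return draft
```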

Section 6.5: Final domain checklist: plan/manage, genAI, agents, vision, NLP, knowledge mining

Your “Final Review Sprint” is about rapid pattern recall. Use the checklist below as your last 48-hour pass. Each bullet is phrased the way the exam tests it: as a decision under constraints.

  • Plan/manage: Can you choose identity (managed identity/RBAC), network isolation (private endpoints), key management (CMK when required), monitoring (App Insights/Log Analytics), and cost controls (budgets, quotas, token discipline)? Common trap: solving functionality while ignoring governance requirements.
  • Generative AI: Can you justify model selection (capability vs cost/latency), prompt engineering (system vs user vs developer instructions, structured output), grounding (RAG patterns), and evaluation (offline sets + production telemetry)? Trap: “bigger model” as a default when the requirement is reliability or cost.
  • Agents: Can you design tool use safely (allowlists, schemas, confirmations, server-side enforcement) and pick an orchestration pattern (planner/executor, ReAct-style, routing) appropriate to the task? Trap: assuming the model will “behave” without guardrails.
  • Vision: Can you distinguish OCR vs document extraction vs general image analysis, and connect outputs to downstream indexing/search? Trap: picking the wrong vision feature when structured fields are required.
  • NLP: Can you choose classification/extraction and conversational patterns, and integrate speech when needed (speech-to-text/text-to-speech) while respecting latency and privacy? Trap: ignoring multilingual or real-time requirements.
  • Knowledge mining: Can you build an enrichment and indexing pipeline that yields RAG-ready content (chunking, metadata, embeddings), and can you apply security trimming? Trap: indexing everything without access control or provenance/citations.

Exam Tip: If two answers both “work,” pick the one that is (1) more governable, (2) more observable, and (3) more cost-controlled—this triad matches how Azure solutions are evaluated in enterprise settings and how the exam writers differentiate options.

Section 6.6: Exam day readiness: environment, pacing, and last-minute review

This is your Exam Day Checklist lesson. First, control the environment: stable internet, quiet space, and a clean desk. For online proctoring, ensure your testing machine has updates paused, notifications disabled, and no background apps that could trigger proctor flags. For test center, arrive early and plan for check-in time so your mental energy goes into the first questions, not logistics.

Next, pacing: commit to your three-pass strategy from Section 6.1. If you feel stuck, look for the hidden constraint you may have missed—privacy, networking, identity, evaluation, or cost. Many AI-102 items are designed so that reading one requirement carefully collapses the problem to a single best answer.

Exam Tip: When you’re down to two choices, ask: “Which option reduces operational risk?” The exam frequently favors managed identity over secrets, private endpoints over IP allowlists, and measurable evaluation/monitoring over ad-hoc testing.

Last-minute review should be lightweight: skim your personal error-log rules, not the whole syllabus. Rehearse a few “anchor patterns” you can reuse: secure RAG (Search + citations + trimming), safe agent tooling (allowlist + confirmations), vision-to-index pipeline (OCR/extraction → enrichment → search), and production readiness (monitoring + budgets + content safety). Then stop. Mental freshness is a performance multiplier on scenario-heavy exams.

After the exam begins, keep confidence stable. The exam will include unfamiliar wording; translate it back into objectives and patterns you practiced in the mock exams. Your preparation is complete when you can consistently choose the safest, most governable Azure solution—not just the one that seems clever.

Chapter milestones
  • Mock Exam Part 1
  • Mock Exam Part 2
  • Weak Spot Analysis
  • Exam Day Checklist
  • Final Review Sprint: formulas, patterns, and common traps
Chapter quiz

1. You are reviewing results from a full AI-102 mock exam. You notice you consistently choose answers that functionally meet requirements but miss a hidden constraint (for example: private networking or least privilege). Which approach is MOST effective to improve accuracy under time pressure for the real exam?

Show answer
Correct answer: Adopt a fixed elimination checklist that always verifies security, networking, governance, monitoring, and cost constraints before selecting an answer
AI-102 frequently rewards Azure-first reasoning and constraint checking (least privilege, private networking, compliance, monitoring, cost controls). A repeatable elimination checklist directly targets the pattern of missing hidden requirements. Memorizing limits (B) can help in some questions but does not address systematic misses around governance/security. Picking the most complex solution (C) is a common trap; complexity often increases attack surface, cost, and operational burden without satisfying a specific constraint.

2. A company is preparing for exam day. They will take AI-102 at a testing center and have had issues previously with running out of time on case-study questions. Which action is BEST aligned with an exam-day checklist that improves time management and reduces avoidable mistakes?

Show answer
Correct answer: Plan a first pass that answers straightforward questions quickly, flags uncertain items for review, and reserves time for longer case-study items
A structured first-pass strategy (A) aligns with common certification exam tactics: capture easy points early, reduce anxiety, and ensure time remains for complex scenarios. Deep-reading everything first (B) consumes time without guaranteeing better accuracy and can backfire under time pressure. AI-102 does not assign publicly predictable weights by "question type" (C), and prioritizing one type first is not a reliable checklist tactic.

3. During weak spot analysis, you identify that most of your missed questions involve choosing between options that all "work" but differ in governance and security. In a final review sprint, which recurring pattern should you apply FIRST when two options appear technically equivalent?

Show answer
Correct answer: Prefer the option that enforces least privilege and supports private access paths (for example, managed identities and private endpoints) while meeting the stated requirements
When options are technically viable, AI-102 commonly tests secure-by-default and governable designs: least privilege, identity-based auth, and private networking (A). Speed of implementation (B) is often a trap when it leads to weaker security (for example, long-lived secrets/shared keys) or violates enterprise controls. Choosing the largest model (C) ignores cost containment and may not address governance/security constraints present in many exam scenarios.

4. You run Mock Exam Part 2 and score lower than Part 1. Your goal is to improve within one week using the chapter's method. Which sequence is MOST effective?

Show answer
Correct answer: Perform a structured post-mortem: categorize misses by domain (security/governance, monitoring, cost, etc.), document the hidden requirement you missed, then drill targeted questions for those domains
The chapter emphasizes repeatable improvement via weak spot analysis: classify errors, identify the overlooked constraint, and practice targeted scenarios (A). Retaking without analysis (B) risks memorization rather than skill transfer to new questions. Broad, unfocused study (C) is inefficient under a one-week timeline and does not address the specific failure patterns that AI-102 exploits.

5. A startup is building a generative AI assistant on Azure. During the final review sprint, you practice a common AI-102 trap: selecting an option that works but violates cost controls. The requirement states: "Minimize ongoing cost while maintaining adequate monitoring and governance." Which choice is MOST likely to be correct in an exam scenario?

Show answer
Correct answer: Enable monitoring and cost controls early (for example, Azure Monitor/Application Insights and budgets/alerts) and choose the smallest model/configuration that meets requirements, scaling only when justified by metrics
AI-102 often expects designs that balance capability with cost containment and operational readiness: right-size resources and implement monitoring/governance upfront (A). Choosing the largest model by default (B) is a cost trap and delaying monitoring undermines reliability and accountability. Disabling logging (C) may reduce costs but fails monitoring/governance requirements and makes incident response and evaluation difficult.