AI Certification Exam Prep — Beginner
Master AI-102 domains with practice-heavy prep and a full mock exam.
This course blueprint is designed for beginners who want a clear, exam-aligned path to passing Microsoft’s AI-102: Designing and Implementing a Microsoft Azure AI Solution exam. You’ll learn the concepts and decision-making patterns tested on the exam, then validate your readiness through domain-focused practice sets and a full mock exam.
AI-102 measures applied engineering skills across modern Azure AI capabilities. In 2026, that increasingly includes generative AI patterns, agentic orchestration, and search-driven grounding, alongside core computer vision and natural language processing workloads. This course organizes those skills into a 6-chapter “book” that maps directly to the official exam domains.
Chapter 1 starts with the exam itself: registration, scoring, question formats, and a practical study strategy. You’ll also set up an environment plan so you can practice without getting blocked by tooling.
Chapters 2–5 each focus on one or two domains with an emphasis on what the exam asks you to decide: which Azure service to use, how to secure it, how to deploy it, and how to troubleshoot it. Each chapter ends with exam-style practice milestones to reinforce the most-tested scenarios.
Chapter 6 is a full mock exam experience with structured review. You’ll identify weak areas by domain, fix gaps, and walk into the exam with a clear checklist for pacing and accuracy.
AI-102 questions often look like real project briefs: you’re given constraints (security, latency, cost, data residency, safety), then asked to choose the best design or implementation step. This course blueprint emphasizes that skill—turning requirements into correct Azure AI decisions—while keeping the learning path beginner-friendly.
If you’re new to certifications, start by setting a test date to create urgency and a realistic schedule. You can begin on Edu AI today: Register free or browse all courses.
This course is for learners preparing for the Microsoft Azure AI Engineer Associate certification who have basic IT literacy but no prior certification experience. If you can navigate cloud concepts and follow step-by-step labs, you can succeed here.
Microsoft Certified Trainer (MCT)
Jordan Whitaker is a Microsoft Certified Trainer who builds and teaches Azure AI certification tracks for learners moving from fundamentals to role-based exams. He specializes in AI-102 readiness through scenario-based design, governance, and production deployment patterns on Azure.
AI-102 is not a “memorize services” exam. It measures whether you can design, implement, and operate Azure AI solutions under real-world constraints: security, reliability, cost, evaluation, and responsible AI. The 2026 version of this course emphasizes generative AI in Azure (Azure OpenAI, grounding with Azure AI Search, and agent orchestration), but the exam still expects you to be fluent across classic Azure AI capabilities (vision, language, speech) and the operational layer (monitoring, governance, deployment patterns).
This chapter helps you orient to what the exam is really testing, how to register and plan around policies, how scoring and question formats work, and how to build a four-week preparation plan that blends labs, notes, and spaced repetition. You will also set up a practice environment that mirrors exam scenarios: multiple Azure resources, identity controls, cost guardrails, and a repeatable way to track what you learned.
Throughout this chapter, you’ll see coaching on common traps (what distractor choices look like), and how to recognize the “most correct” answer when more than one option is technically possible in Azure.
Practice note for Understand AI-102: role, skills measured, and exam domains: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Register for the exam: scheduling, pricing, ID, and policies: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Scoring, question formats, and time management strategy: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Build a 4-week study plan with labs, notes, and spaced repetition: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Set up your practice environment (Azure account, tools, and tracking): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 targets the day-to-day responsibilities of an Azure AI Engineer: selecting capabilities, integrating them into apps, and operating them safely at scale. The exam is aligned to job tasks such as building conversational experiences, implementing retrieval-augmented generation (RAG), integrating vision and language, and applying security and governance controls. In 2026, expect more scenarios that combine multiple services—e.g., Azure OpenAI + Azure AI Search + Azure AI Document Intelligence—rather than isolated “what service does X?” prompts.
Map the course outcomes to what shows up on the test: (1) Plan and manage an Azure AI solution (identity, network, monitoring, cost controls); (2) Implement generative AI (model selection, prompt design, grounding, evaluation); (3) Implement agentic solutions (tool use, orchestration patterns, safety constraints); (4) Computer vision (image analysis, OCR, document workflows); (5) NLP and speech (classification, extraction, conversation, speech integration); and (6) Knowledge mining (indexing, enrichment, RAG-ready pipelines).
Exam Tip: When a scenario mentions “governance,” “least privilege,” “private endpoints,” “logging,” or “budget alerts,” the exam is testing the management plane, not just model prompts. Don’t jump to “use GPT-4” when the real requirement is “meet compliance and monitor usage.”
Common traps include choosing a service because it “can” do the job, while missing the one that fits the requirement precisely. Example: for document-heavy OCR with structured extraction, the test usually wants Document Intelligence rather than generic OCR; for semantic retrieval over enterprise content, it often wants Azure AI Search with vector/hybrid retrieval and grounding rather than dumping content into prompts.
Register for AI-102 through Microsoft’s certification portal and schedule via the exam delivery provider (commonly Pearson VUE). You typically can choose online proctored delivery or a test center. Your decision should be strategic: online proctoring is convenient but unforgiving about environment compliance; test centers reduce the risk of technical disqualification.
Bring valid, unexpired government-issued ID that matches your registration name. Pay attention to the exact name formatting (middle initials, accent marks) because mismatches can cause check-in delays. Read the policies on breaks, personal items, and workspace requirements—especially for online delivery (clean desk, webcam room scan, stable internet).
If you need accommodations (extra time, assistive technologies), start early. Accommodation approval can take time and may affect how you schedule. Plan your study timeline so administrative steps do not collide with your intended exam date.
Exam Tip: If you plan to take the exam online, run the system test days in advance and again on exam day. Many failures are not “internet down,” but permissions, corporate VPN constraints, or webcam/security software conflicts.
Pricing varies by region and may change; budget for one retake if possible. Treat the scheduling step as part of preparation: a fixed date increases follow-through and makes your four-week plan realistic.
AI-102 uses a scaled scoring model typical of Microsoft role-based exams. You receive a score on a scale (commonly 1–1000), with a published passing threshold (often 700). Because of scaling, you should not try to compute your percentage correct during the exam; instead, focus on maximizing points by avoiding preventable errors on high-certainty questions and returning to medium-certainty items later.
Some questions may be weighted differently, and some unscored items may appear for exam calibration. Practically, this means you should treat every question as if it counts and avoid spending excessive time trying to identify “experimental” items.
Retake policies can evolve, but typically there is a waiting period between attempts (with longer waits after multiple failures). This is why your prep should emphasize skill acquisition (labs and implementation) rather than last-minute memorization. If you need a retake, your goal is to convert weak domains into strengths, not simply “do more practice questions.”
Exam Tip: Track misses by objective, not by question. If you miss three items related to grounding and retrieval, the fix is: build one end-to-end RAG pipeline (indexing → retrieval → prompt assembly → evaluation), then re-test. That is far more efficient than re-reading documentation.
Common scoring traps include over-investing time in one complex scenario early and rushing later, and failing to revisit flagged items due to poor time budgeting. Your time management strategy starts here: aim for consistent pacing and reserve time for review.
AI-102 commonly uses mixed formats: traditional multiple-choice, multi-select (“choose all that apply”), hotspot (selecting regions in a UI diagram), and case studies with multiple related questions. Each format has its own pitfalls, and your method should adapt accordingly.
For case studies, read the requirements and constraints first (security, data residency, latency, budget), then scan the existing architecture. Many wrong answers are “valid” Azure features that violate a constraint mentioned once in the case. Build a mini checklist: identity, network, data source, evaluation/monitoring, and cost controls. That checklist mirrors the real job—and the exam’s intent.
Hotspots test whether you recognize where settings live (portal, resource configuration, deployment options). The trap is guessing based on service names. Slow down and match the setting to the layer: is it model deployment, content filtering, network isolation, search index configuration, or application code?
Multi-select questions often include plausible distractors. Use elimination: first pick options that are required by the prompt (must/only), then validate that each selected option does not conflict with constraints. If the question says “minimize operational overhead,” fully managed services and built-in integrations usually beat custom orchestration.
Exam Tip: Watch for absolutes: “always,” “only,” “must use,” and “cannot.” These words turn a broad Azure discussion into a narrow exam answer. Also watch for hidden requirements like “customer-managed keys,” “private access,” or “no public internet”—these immediately eliminate many otherwise-correct choices.
Your four-week plan should follow the exam domains rather than random exploration. AI-102 rewards integration skills: connecting Azure OpenAI with retrieval, applying responsible AI controls, and operationalizing solutions. Use a cycle of Learn → Build → Evaluate → Note each week, with spaced repetition to retain key decisions and service boundaries.
Spaced repetition is essential. Maintain a “decision log” of why you chose a service and what constraint drove the choice (e.g., “Document Intelligence for structured form extraction; Search for retrieval; private endpoint for data exfiltration control”). Review that log on days 1, 3, 7, and 14 after you write it.
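As a quick illustration of that review cadence, here is a minimal Python sketch; the day offsets come straight from the schedule above, and the function name is just for illustration:

```python
from datetime import date, timedelta

# Review offsets from the spaced-repetition schedule: days 1, 3, 7, and 14.
REVIEW_OFFSETS = (1, 3, 7, 14)

def review_dates(written_on: date) -> list[date]:
    """Return the dates on which a decision-log entry should be reviewed."""
    return [written_on + timedelta(days=offset) for offset in REVIEW_OFFSETS]

# Example: an entry written today gets four scheduled review dates.
for d in review_dates(date.today()):
    print(d.isoformat())
```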
Exam Tip: Your notes should be organized by objective verbs: plan, implement, integrate, monitor, evaluate. The exam rarely asks for trivia; it asks what you would do next, what you would configure, or which approach meets constraints.
Set up a practice environment that lets you repeat common exam builds without wasting time or money. You want consistency (same naming, same regions where possible) and guardrails (budgets and cleanup). Start with one Azure subscription you control, and create a dedicated resource group per lab week so you can delete cleanly.
Plan resources with cost controls from day one: set a subscription budget alert, use deployment quotas wisely, and deallocate/delete when finished. If a lab requires uploading documents, store only non-sensitive sample data. Practice applying least privilege: separate “build-time” permissions from “run-time” permissions, and prefer managed identity where supported.
Exam Tip: The exam often rewards “operationally sane” answers: centralized logging, repeatable deployments, and secure access. When you build labs, mirror that mindset—enable diagnostics, tag resources, and document endpoints. Those habits translate directly into correct exam decisions.
Finally, keep a resource planning sheet: what you created, in which region, and when to delete it. Many candidates lose momentum due to surprise cost or quota issues; disciplined lab hygiene prevents that and keeps your prep on schedule.
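If it helps to make that concrete, here is a minimal sketch of such a planning sheet kept as a local CSV log; the file name, field names, and seven-day retention default are all hypothetical:

```python
import csv
import os
from datetime import date, timedelta

SHEET = "lab-resources.csv"  # hypothetical local tracking file
FIELDS = ["resource_group", "resource", "region", "created_on", "delete_by"]

def log_resource(resource_group: str, resource: str, region: str, keep_days: int = 7) -> None:
    """Append one row per created resource, with a planned cleanup date."""
    row = {
        "resource_group": resource_group,
        "resource": resource,
        "region": region,
        "created_on": date.today().isoformat(),
        "delete_by": (date.today() + timedelta(days=keep_days)).isoformat(),
    }
    write_header = not os.path.exists(SHEET)
    with open(SHEET, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)

log_resource("rg-ai102-week1", "search-ai102-lab", "westeurope")
```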
1. You are mentoring a team preparing for AI-102. They have been memorizing lists of Azure AI services and features. You want them to align with what the AI-102 exam is primarily designed to measure. Which guidance should you give them?
2. A company plans to schedule AI-102 for multiple employees. One employee has a legal name that does not match the nickname used at work. Another employee wants to reschedule the exam date after booking. Which action best reduces the risk of test-day issues and policy violations?
3. You are 20 minutes into the AI-102 exam and encounter a multi-part scenario question with several plausible answers. You are unsure of the 'most correct' option. What is the best time-management approach aligned to typical certification exam mechanics?
4. Your manager asks you to create a 4-week AI-102 study plan that improves retention and performance on scenario-based questions. Which plan best aligns with a certification-focused approach described in the course?
5. You are setting up a practice environment to mirror AI-102 exam scenarios for generative AI solutions. Which setup is most appropriate to reflect real-world constraints and the operational layer?
AI-102 increasingly tests whether you can run Azure AI solutions like production software: you must choose the right services and regions, secure identities and data paths, operate deployments with CI/CD and rollback discipline, and continuously monitor reliability and spend. This chapter maps directly to the “Plan and manage an Azure AI solution” outcome, but it also sets foundations you will reuse when implementing generative AI (grounding and evaluation depend on data governance) and agentic patterns (tool permissions depend on identity and network controls).
Expect scenario questions that blend multiple constraints: “must use private connectivity,” “data cannot leave region,” “needs least privilege,” “must meet an SLA,” or “budget is capped.” The best answers are typically the ones that combine platform-native controls (Azure Policy, RBAC, Private Link, Monitor, budgets) rather than custom code. Also watch for traps where you are asked to secure or monitor the service versus the application—AI-102 tends to reward choices that cover both layers.
Finally, plan for operational maturity: version your prompts and model deployments, set up logs and traces before launch, and define rollback paths. Many failures on the exam (and in real systems) come from skipping these “boring” steps, then trying to debug production without telemetry or proper change control.
Practice note for Design the solution: services, regions, and reference architectures: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Secure and govern AI resources: identity, network, and data controls: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Operationalize deployments: CI/CD, versioning, and rollbacks: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Monitor reliability and cost: logging, alerts, quotas, and budgets: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: plan/manage scenarios and troubleshooting: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 planning questions typically begin with requirements (latency, data residency, private network, multimodal needs, retrieval needs, throughput, and integration constraints) and ask you to select Azure services and a reference architecture. In practice, you’ll often combine Azure OpenAI (generation), Azure AI Search (retrieval), and storage (Azure Blob/ADLS) for a RAG-ready pipeline, plus Azure Functions or App Service for orchestration. For document workflows, Azure AI Document Intelligence often sits upstream to extract structured fields that then feed indexing and grounding.
Regional selection is a recurring exam objective disguised inside other topics. You need to ensure all dependent services are available in the chosen region(s) and that compliance requirements (for example, “data must stay in EU”) are met. Multi-region architectures are typically justified by resiliency and DR requirements, not as a default. When asked to minimize latency, co-locate compute, model endpoint, and data/search resources in the same region whenever possible.
Common patterns you should recognize on the exam include RAG (Azure OpenAI for generation, Azure AI Search for retrieval, Blob/ADLS for source content), document-processing pipelines in which Azure AI Document Intelligence extracts structured fields before indexing and grounding, and lightweight orchestration with Azure Functions or App Service tying the services together.
Exam Tip: If a question mentions “grounded responses” or “reduce hallucinations,” the best architecture almost always includes retrieval (Azure AI Search) and a data pipeline that produces searchable chunks, not just a bigger model.
Exam trap: choosing services based on brand familiarity rather than capability. For example, “store embeddings in Cosmos DB” may be viable, but if the scenario emphasizes “hybrid search, filters, ranking, and citations,” Azure AI Search is usually the intended answer because it’s purpose-built for retrieval and integrates directly with common RAG patterns.
Identity is where AI-102 questions become subtle: you’re often asked to secure calls between an app, Azure OpenAI, Search, and Storage without using long-lived keys. The exam expects you to know when to use Microsoft Entra ID (Azure AD) authentication, RBAC, and managed identities. A system-assigned managed identity is ideal when the identity lifecycle should track the resource (for example, a Function App). A user-assigned managed identity is preferred when multiple workloads share the same identity or when you want stable identity across redeployments.
RBAC decisions should reflect least privilege. For example, an indexing pipeline might need write access to a Search index, while a runtime chat app should only need query access. Don’t grant broad roles (Owner/Contributor) when a narrower role exists. Similarly, differentiate between data plane and management plane permissions: many services require specific data roles to read blobs or query indexes, even if the identity can manage the resource.
Secrets management is a frequent “gotcha.” Keys and connection strings should be in Azure Key Vault, not in appsettings.json or pipeline variables. Prefer managed identity to access Key Vault; avoid embedding secrets in code or containers. If the scenario mentions rotation requirements, Key Vault plus automated rotation (or re-deploy with updated references) is typically the correct direction.
Exam Tip: When you see “no secrets in code” or “use passwordless connections,” the intended answer is usually “managed identity + RBAC” (and Key Vault only where absolutely needed, such as third-party API keys).
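As a sketch of that pattern, the following Python shows passwordless authentication to an Azure OpenAI deployment, assuming the `azure-identity` and `openai` packages; the endpoint, deployment name, and API version are placeholders you would replace with your own values:

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# DefaultAzureCredential resolves to a managed identity when running in Azure,
# and to developer credentials (Azure CLI, VS Code) locally, so no keys live in code.
credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(
    credential, "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint="https://<your-openai-resource>.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,
    api_version="2024-06-01",  # assumption: use the API version current for your resource
)

response = client.chat.completions.create(
    model="chat-deployment",  # placeholder: your deployment name, not the model family name
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```

The corresponding RBAC step is granting that identity a narrow data-plane role on the target resource, rather than Contributor on the whole resource group.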
Common trap: selecting “shared access signature (SAS)” by default. SAS can be correct for limited-time external sharing, but for internal service-to-service calls within Azure, managed identity is the more secure and exam-preferred approach.
AI-102 expects you to secure not only data at rest, but also the network path used by AI services. If a scenario says “no public internet access,” you should think immediately of Private Link (private endpoints) and disabling public network access on the services where supported. A typical secure architecture places the app in a VNet and uses private endpoints for Storage, Azure AI Search, and other PaaS services, with private DNS zones to resolve the private addresses correctly.
Encryption questions usually include a compliance constraint. By default, Azure encrypts data at rest, but some scenarios require customer-managed keys (CMK) in Key Vault. Recognize wording like “customer controls the keys,” “bring your own key,” or “key revocation required,” which points toward CMK rather than platform-managed keys. Also be ready for “encryption in transit” expectations: use HTTPS/TLS end-to-end; don’t propose plaintext internal calls.
Data Loss Prevention (DLP) and sensitive data handling appear more often in generative AI scenarios: prompt inputs, retrieved documents, and model outputs may contain PII. While AI-102 may not go deep into Purview configuration, you should know the strategy: classify data, restrict access, avoid logging sensitive payloads, and implement output filtering where appropriate. In enterprise settings, Microsoft Purview can help with data classification and governance, and logging should use redaction or structured fields rather than raw content dumps.
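A minimal sketch of that logging discipline, with hypothetical redaction rules; a production system would rely on a vetted PII detection library or service rather than two regexes:

```python
import logging
import re

log = logging.getLogger("chat")
logging.basicConfig(level=logging.INFO)

# Hypothetical redaction rules for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\+?\d[\d\s().-]{7,}\d\b")

def redact(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

def log_turn(user_id: str, prompt: str, tokens: int) -> None:
    # Log structured fields and a short redacted preview, never the raw payload.
    log.info("chat_turn user=%s tokens=%d preview=%r", user_id, tokens, redact(prompt)[:80])

log_turn("u-123", "My email is jane@contoso.com, call me at +1 425 555 0100", tokens=42)
```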
Exam Tip: If the prompt says “must remain on private network,” “exfiltration risk,” or “disallow public endpoints,” the answer almost always includes Private Endpoint + public access disabled + correct DNS planning. Missing DNS is a classic real-world failure and an exam distractor.
Common trap: confusing VNet integration with Private Link. VNet integration lets your app reach into a VNet; it does not automatically make the PaaS service private. Private endpoints are what give the PaaS service a private IP in your VNet.
Responsible AI is tested less as philosophy and more as operational controls: how you enforce policy, prove compliance, and audit changes. For Azure-based AI solutions, governance often starts with Azure Policy initiatives applied at management group/subscription scope to restrict regions, require tags, deny public network access, or enforce private endpoints. These controls prevent “configuration drift” and are commonly the best answer when the scenario asks for organization-wide enforcement.
Auditing and documentation are also exam-relevant. Logging access to Key Vault, tracking deployment changes, and retaining activity logs enable investigations and compliance attestations. If a scenario mentions “must demonstrate who changed the model deployment” or “audit prompt changes,” think in terms of CI/CD pipelines with approvals, repository history, and Azure Activity Log rather than ad-hoc manual updates in the portal.
For generative solutions, you should be ready to describe (at a high level) content safety and human oversight. Even when a question does not ask for a specific service, the exam expects you to mention guardrails: input/output filtering, refusal behavior for disallowed content, and monitoring for policy violations. In production, this includes documenting intended use, limitations, and evaluation results (quality, groundedness, safety). Those artifacts often map to internal governance checklists in enterprises.
Exam Tip: When asked “how do you ensure all resources follow the same security configuration,” the answer is usually Azure Policy (prevent/deny) rather than a runbook or a wiki page.
Common trap: treating “responsible AI” as purely a model setting. On the exam, responsible AI is primarily about end-to-end system behavior—data sources, logging, review workflows, and change management—because that’s what auditors and regulators can validate.
Monitoring questions often combine reliability (availability, latency), debugging (root cause), and customer impact (SLA). Your core toolkit is Azure Monitor: metrics for near-real-time signals, logs (Log Analytics) for deeper queries, and Application Insights for distributed tracing. For an AI app, you want end-to-end correlation: an incoming request should be traceable through the orchestrator, retrieval calls, and model inference, with timings and failure reasons.
Expect scenarios like “intermittent timeouts,” “sudden latency increase,” or “answers are missing citations.” The correct approach is usually: instrument the app, monitor dependencies, alert on thresholds, and use structured logs that include retrieval parameters (topK, filters), model deployment name/version, and response metadata. You may also need to identify whether the bottleneck is the retrieval layer (Search latency), the model endpoint, or the application compute.
SLA interpretation is an exam favorite. Azure SLAs apply to services when deployed according to documented requirements (for example, multiple instances, zone redundancy, or specific tiers). The exam may ask what to do to meet an uptime requirement—look for answers that add redundancy (multiple instances/regions) and that remove single points of failure (for example, a single-zone deployment). Also, remember that an SLA for a component does not automatically produce the same SLA for the whole solution; end-to-end availability is the product of dependencies.
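A quick back-of-the-envelope calculation makes the "product of dependencies" point concrete; the availability figures below are illustrative, not the published numbers for any specific service:

```python
# End-to-end availability is the product of the dependency SLAs in the request path.
app = 0.999      # application compute (illustrative)
search = 0.999   # retrieval layer (illustrative)
model = 0.9995   # model endpoint (illustrative)

composite = app * search * model
downtime_hours_per_year = (1 - composite) * 365 * 24

print(f"Composite availability: {composite:.4%}")            # ~99.75%
print(f"Expected downtime: {downtime_hours_per_year:.1f} h")  # ~21.9 hours/year
```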
Exam Tip: If the scenario asks you to “identify which dependency is failing,” choose distributed tracing (Application Insights) over just metrics. Metrics show symptoms; traces show the path and the failing hop.
Common trap: enabling logging after an incident. The exam generally expects “monitoring by design”: logs, metrics, and alerts configured before go-live, with retention aligned to compliance needs and cost constraints.
Cost control is not an afterthought in AI-102—expect direct questions about budgets, quotas, and how to prevent surprise bills. Azure Cost Management budgets and alerts are the platform-native answer for spend governance at subscription/resource group scope. Tagging (project, environment, cost center) is frequently paired with budgets because it enables chargeback/showback and helps isolate which workload caused the increase.
Quotas and throttling are a practical operational concern for AI services. Scenarios may mention “requests failing with 429” or “deployment is rate-limited.” The correct exam approach is to recognize this as capacity/throughput management: implement retry with exponential backoff, queue bursts, and request quota increases when justified. Also, separate dev/test and production resources so load testing doesn’t consume production capacity.
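A minimal sketch of the retry-with-backoff pattern; `ThrottledError` and `send_request` are hypothetical stand-ins for however your client surfaces a 429:

```python
import random
import time

class ThrottledError(Exception):
    """Hypothetical stand-in for an HTTP 429 response from an Azure AI service."""
    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after

def call_with_backoff(send_request, max_attempts: int = 5):
    """Retry a throttled call, honoring Retry-After and backing off exponentially with jitter."""
    for attempt in range(max_attempts):
        try:
            return send_request()
        except ThrottledError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the throttling to the caller
            delay = err.retry_after if err.retry_after is not None else (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)
```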
Lifecycle management ties together CI/CD, versioning, and rollback. You should manage model deployments and application releases with staged environments (dev/test/prod), track versions (including prompt templates and retrieval settings), and keep a rollback plan when a new configuration degrades answer quality or increases cost. For example, a change from concise to verbose prompting can multiply token usage—cost spikes are often configuration-driven, not purely traffic-driven.
Exam Tip: If a prompt says “cap spend” or “prevent runaway usage,” look for “budgets + alerts” and “quotas/limits” rather than “ask users to use it less.” The exam favors enforceable controls.
Common trap: assuming autoscale always reduces cost. Autoscale can increase cost if it scales out due to retries or inefficient prompting. The exam’s better answer usually combines efficient design (token limits, caching, retrieval tuning) with governance (budgets/alerts) and operational safeguards (quotas, backpressure, circuit breakers).
1. A healthcare company is deploying an Azure OpenAI–based summarization service. Requirements: (1) All data must remain in West Europe. (2) The service must be reachable only from the company VNet (no public internet access). (3) Use platform-native controls rather than custom proxy code. What should you do?
2. You manage multiple Azure AI resources across subscriptions. Security requires: enforce resource creation only in approved regions, require private endpoints for supported AI services, and prevent noncompliant deployments automatically. Which approach best meets the requirement?
3. Your team deploys prompt templates and model deployment configuration for a generative AI app. You need repeatable releases with the ability to quickly roll back to a known-good configuration if quality regresses. Which solution best aligns with CI/CD and rollback discipline on Azure?
4. A generative AI application experiences intermittent 429 (Too Many Requests) errors from an Azure AI service during peak hours. Requirements: detect issues early and prevent unbounded spend. Which combination is most appropriate?
5. A company must grant an app the minimum permissions required to call an Azure AI resource. The app runs in Azure and should avoid secrets when authenticating. Which approach best implements least privilege and modern identity practices?
This chapter maps to the AI-102 objective area for implementing generative AI solutions on Azure: selecting the right model and deployment, designing prompts that reliably satisfy constraints, grounding outputs with enterprise data (RAG), evaluating quality and safety, and preparing the solution for production (rate limits, streaming, caching, fallbacks, and cost controls). The exam does not reward “cool demo” thinking; it rewards operational thinking: which Azure resource to use, how parameters affect tokens and latency, how grounding changes risk, and how to design for reliability and governance.
Expect scenario questions where a solution works in a notebook but fails under load, returns ungrounded claims, or violates policy. Your job is to identify the Azure OpenAI building blocks (deployments, models, parameters, filters) and the architecture patterns (RAG, evaluation harness, telemetry, resilience) that turn a prototype into an enterprise-ready system.
Exam Tip: When options include “fine-tune” versus “RAG,” the exam frequently expects RAG for enterprise knowledge updates. Fine-tuning is for shaping style/behavior or learning stable patterns—not for frequently changing product docs and policies.
Practice note for Choose models and build prompts: system messages, few-shot, and constraints: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Ground responses with enterprise data: RAG patterns and citations: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Evaluate and improve quality: testing, safety, and regression checks: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Deploy to production: rate limits, streaming, caching, and fallbacks: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: generative AI build-and-fix exam scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Azure OpenAI is consumed through an Azure OpenAI resource, but the exam focuses on the concept of a deployment: you deploy a chosen model (for example, GPT-family chat models or embedding models) under a deployment name, then your application calls that deployment. Many candidates confuse “model” with “deployment.” On AI-102, deployment is the unit you reference in code and configure for scale and quota behavior.
Tokens are your primary sizing and cost unit. Prompts plus retrieved context plus the model’s response all count toward token usage, which directly affects cost and latency. Scenario questions often hide the real issue: the app times out because your RAG pipeline is injecting too much context, not because the model is “slow.”
Key parameters tested include temperature (creativity vs determinism), top_p (nucleus sampling), max tokens (caps output length), and stop sequences (hard boundaries). Another common objective is understanding that system messages (or higher-priority instructions) set behavior, but cannot guarantee perfect compliance; you must enforce constraints in application logic when it matters (for example, output must be valid JSON).
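The following hedged sketch shows where those parameters sit in a chat completion call, assuming the `openai` Python package against an Azure OpenAI deployment; the endpoint, key handling, API version, and deployment name are placeholders:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-openai-resource>.openai.azure.com",  # placeholder
    api_key="<retrieved-from-key-vault-or-use-managed-identity>",      # placeholder
    api_version="2024-06-01",                                          # assumption
)

response = client.chat.completions.create(
    model="chat-deployment",   # the deployment name you created, not the model family name
    messages=[
        {"role": "system", "content": "You are a concise support assistant. Answer in two sentences."},
        {"role": "user", "content": "How do I reset my password?"},
    ],
    temperature=0,     # low temperature for repeatable, regression-testable output
    max_tokens=200,    # hard cap on response length (and response-side cost)
    stop=["\n\n###"],  # optional hard boundary for the completion
)

print(response.usage.total_tokens)           # prompt + completion tokens drive cost and latency
print(response.choices[0].message.content)
```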
Exam Tip: If a question mentions “deterministic outputs for regression testing,” look for low temperature and stable prompts/versions, plus an evaluation harness—don’t assume model choice alone solves it.
The exam expects you to know prompt structure as an engineering discipline: clear role definition, explicit task, constraints, and output format. In Azure OpenAI chat patterns, the system message establishes persona and non-negotiable rules; the user message provides the request; developer/tool instructions (where applicable) define the app’s contract. Few-shot examples are used to demonstrate style and edge cases. However, few-shot is not a substitute for validation—especially for strict schemas.
Guardrails appear both in prompt and outside prompt. In-prompt guardrails include “refuse if missing sources,” “respond only using provided context,” and “output JSON matching this schema.” Out-of-prompt guardrails include schema validation, content filtering, and tool gating (only allow certain tools and arguments). On AI-102, watch for answers that rely only on “better prompt wording” when the requirement is enforceable via application logic.
Tool use and orchestration patterns appear as “agentic” behaviors even within a generative solution chapter: the model chooses between calling a search tool, a database lookup, or a calculator. The key is controlled tool invocation: restrict tools, validate arguments, and ensure tools return grounded facts that the model summarizes.
Exam Tip: If you see “model returns invalid JSON sometimes,” the best answer usually combines structured prompting and output validation/retry—prompt-only fixes are a common exam trap.
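A minimal sketch of that combination: output validation plus a bounded retry and a safe fallback. `ask_model` is a hypothetical callable that returns the raw model text:

```python
import json

REQUIRED_FIELDS = {"answer", "citations", "confidence"}

def parse_or_none(raw: str):
    """Accept the model output only if it is valid JSON with exactly the expected fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or set(data) != REQUIRED_FIELDS:
        return None
    return data

def get_structured_answer(ask_model, max_attempts: int = 2) -> dict:
    for _ in range(max_attempts):
        data = parse_or_none(ask_model())
        if data is not None:
            return data
    # Fall back to a safe, well-formed response instead of passing malformed output downstream.
    return {"answer": "Unable to produce a valid response.", "citations": [], "confidence": "low"}
```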
Retrieval-Augmented Generation (RAG) is a centerpiece of AI-102 generative scenarios. The exam tests whether you can separate responsibilities: retrieval finds relevant documents; the model synthesizes an answer from those documents; citations provide traceability. Azure AI Search is commonly used as the retrieval store, while Azure OpenAI embeddings convert text into vectors for similarity search.
Chunking is where many solutions fail. Chunks that are too large reduce retrieval precision and inflate token costs; chunks that are too small lose context. A practical approach is to chunk by semantic boundaries (headings/paragraphs) and include overlap so key definitions are not split. You’ll also need metadata (source URL, title, section, security labels) to support filtering and citations.
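A simple character-budget sketch of paragraph-based chunking with overlap; the size and overlap values are illustrative starting points, not tuned recommendations:

```python
def chunk_paragraphs(text: str, max_chars: int = 1500, overlap: int = 200) -> list[str]:
    """Chunk on paragraph boundaries, carrying a small overlap so definitions are not split."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            current = current[-overlap:]  # carry the tail of the previous chunk as overlap
        current = f"{current}\n\n{para}".strip()
    if current:
        chunks.append(current)
    return chunks
```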
Embeddings power vector search: generate embeddings at ingestion time for documents and at query time for the user question. Hybrid search (keyword + vector) is often the right choice when you have domain terminology, IDs, or product names. The exam frequently frames this as “users search by part number” or “legal clauses have exact phrasing,” which keyword search handles well.
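A hedged sketch of hybrid retrieval, assuming recent versions of the `openai` and `azure-search-documents` packages and an index that already contains a vector field; every endpoint, deployment, index, and field name below is a placeholder:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery
from openai import AzureOpenAI

openai_client = AzureOpenAI(
    azure_endpoint="https://<your-openai-resource>.openai.azure.com",  # placeholder
    api_key="<key>", api_version="2024-06-01",                         # placeholders
)
search_client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",       # placeholder
    index_name="policies",                                             # hypothetical index
    credential=AzureKeyCredential("<search-query-key>"),
)

question = "What is the travel reimbursement limit?"
embedding = openai_client.embeddings.create(
    model="embedding-deployment", input=question                       # deployment name placeholder
).data[0].embedding

# Hybrid retrieval: keyword match on search_text plus vector similarity on the same question.
results = search_client.search(
    search_text=question,
    vector_queries=[VectorizedQuery(vector=embedding, k_nearest_neighbors=5, fields="contentVector")],
    select=["title", "chunk", "sourceUrl"],                            # hypothetical index fields
    top=5,
)
for doc in results:
    print(doc["title"], doc["sourceUrl"])
```

The retrieved titles and source URLs are what you pass along as citation metadata when assembling the grounded prompt.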
Citations are both a product feature and a safety control: they reduce hallucinations by forcing the model to ground claims. In implementation, you pass retrieved snippets with their source identifiers and instruct the model to cite them. Your application should also enforce “no source, no claim” logic for high-risk domains.
Exam Tip: When the prompt says “answer using enterprise docs,” but the model still hallucinates, the fix is usually to tighten the RAG loop (better retrieval, smaller top-k, enforced citations), not to raise temperature or add more general instructions.
Safety is not optional on AI-102. You must know how safety controls appear at multiple layers: platform controls (content filters), application controls (input/output validation), and organizational policy (logging, red teaming, access control). Azure OpenAI includes content filtering that can block or flag prompts and completions. The exam often asks what to do when legitimate business content is being blocked (false positives): the correct approach is to adjust policy/configuration where allowed, add user education, and redesign prompts and workflows—rather than “turn off safety.”
Responsible AI policy design includes defining allowed use cases, prohibited content, escalation paths, and auditability. For enterprise apps, consider data privacy: don’t log sensitive user inputs unnecessarily, and apply least privilege access to data sources used for grounding. If you use RAG, ensure retrieval respects document-level permissions; otherwise, your “helpful chatbot” becomes a data exfiltration tool.
Common safety traps in exam scenarios include: (1) relying on a system message to prevent disallowed content without filters; (2) returning raw tool output that contains sensitive fields; (3) allowing the model to call arbitrary URLs or run code without constraints.
Exam Tip: If a question involves “users try to override system instructions,” look for a layered mitigation: prompt injection defenses, tool gating, and policy enforcement—not a single prompt rewrite.
Quality improvement on AI-102 is framed as engineering discipline: define what “good” means, measure it, and prevent regressions. A golden set (also called a test set) is a curated collection of representative prompts with expected behaviors or scoring rubrics (for example, must cite sources, must not fabricate, must follow schema). The exam expects you to understand that evaluation is continuous—especially when prompts, retrieval settings, or models change.
Prompt and configuration version control is frequently tested implicitly. Treat prompts, system messages, safety policies, retrieval settings (top-k, filters), and chunking strategies as versioned artifacts. When a new release reduces answer quality, you need to correlate the regression to a specific change. This is where telemetry matters: log prompt templates (without sensitive user data when possible), token usage, retrieval hits, latency, refusal rates, and safety filter triggers.
Regression checks should cover both functional and safety requirements: schema validity, citation presence, and “no answer outside sources” rules. Many candidates focus only on accuracy and miss policy compliance metrics—yet exam scenarios often emphasize compliance, customer trust, and operational risk.
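A minimal sketch of such a regression gate; the golden-set cases, the `generate` callable, and the check rules are all hypothetical:

```python
# Each golden-set case defines a prompt plus the rules its answer must satisfy.
GOLDEN_SET = [
    {"prompt": "Summarize ticket 123", "must_cite": True, "forbidden": ["ssn", "@"]},
    {"prompt": "What is our refund policy?", "must_cite": True, "forbidden": []},
]

def run_regression(generate) -> dict:
    """`generate` is a hypothetical callable: prompt -> {"answer": str, "citations": list[str]}."""
    failures = []
    for case in GOLDEN_SET:
        result = generate(case["prompt"])
        if case["must_cite"] and not result.get("citations"):
            failures.append((case["prompt"], "missing citations"))
        if any(term in result.get("answer", "").lower() for term in case["forbidden"]):
            failures.append((case["prompt"], "policy violation in answer"))
    return {"total": len(GOLDEN_SET), "failures": failures}

# Gate the release: promote the new prompt/model version only if failures is empty.
```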
Exam Tip: When asked how to “prove improvements,” choose answers involving repeatable evaluation (golden set + metrics) over subjective stakeholder feedback.
Production-readiness is a major discriminator in AI-102 questions. Rate limits and quota constraints can break an otherwise correct solution. Your design should include request throttling, retries with backoff, and graceful degradation when the model is overloaded. Streaming responses improve perceived latency (users see output immediately), but you must still handle mid-stream interruptions and partial outputs.
Caching is a powerful cost and latency lever. Cache embeddings for repeated queries, cache retrieval results for stable corpora, and cache final responses when prompts are identical (with careful consideration for personalization and security). On the exam, caching is often the right answer when the workload is repetitive and answers are stable, but it is wrong when data changes frequently or responses depend on user-specific permissions.
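A minimal caching sketch that bakes in the two caveats above by keying on the deployment and a user/permission scope; `generate` is a hypothetical callable that performs the real model call:

```python
import hashlib

_cache: dict[str, str] = {}

def cache_key(deployment: str, prompt: str, user_scope: str) -> str:
    # Include the deployment and a user/permission scope so cached answers are never
    # served across security boundaries or across model/prompt versions.
    return hashlib.sha256(f"{deployment}|{user_scope}|{prompt}".encode()).hexdigest()

def cached_answer(deployment: str, prompt: str, user_scope: str, generate) -> str:
    key = cache_key(deployment, prompt, user_scope)
    if key not in _cache:
        _cache[key] = generate(prompt)
    return _cache[key]
```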
Fallback patterns include switching to a smaller/cheaper model for non-critical tasks, reducing max tokens, or disabling optional features (like long-context augmentation) under load. Another resilience pattern is “RAG-first, then refuse”: if retrieval returns no relevant sources, return a safe “cannot find in approved docs” response rather than hallucinating.
Cost tuning ties back to tokens: trim chat history, summarize long threads, reduce top-k, and keep chunks concise. Also consider batching where supported, and avoid unnecessary round trips (for example, don’t call the model twice when one call with a tool plan suffices).
Exam Tip: If a scenario mentions “sudden spike in usage” or “429/rate limit errors,” the correct answers usually include throttling/queueing and backoff, plus capacity planning—prompt tweaks won’t fix quota failures.
1. You are building a customer-support chat app using Azure OpenAI. The app must always respond in JSON with fields {"answer": string, "citations": string[], "confidence": "low"|"medium"|"high"}. In testing, users can sometimes prompt-inject the assistant to output plain text or extra fields. Which approach most reliably enforces the required structure while still allowing natural language generation?
A. Put the JSON requirement in a user message and set temperature=0
B. Use a system message to require the schema and enable structured output/JSON mode (when supported) for the model deployment
C. Use few-shot examples only (two to three examples of the JSON format) and set top_p=0.1
2. A company needs an internal "Policy Q&A" assistant. Policies change weekly and must be reflected immediately. The assistant must provide citations to the exact policy paragraphs used. Which solution best meets the requirement with the least operational overhead?
A. Fine-tune a model monthly on the policy documents
B. Implement RAG using Azure AI Search as a vector index and return citations from retrieved chunks
C. Increase the model context window and paste the full policy manual into every prompt
3. You deployed an Azure OpenAI chat completion endpoint and added RAG. During evaluation, the assistant sometimes returns confident answers even when retrieval returns no relevant documents. You must reduce ungrounded responses without over-rejecting valid questions. What is the BEST change?
A. Add a prompt rule: "If you cannot find relevant sources, say you don't know" and gate answers on a retrieval relevance threshold (e.g., top-k score)
B. Fine-tune the model to be more cautious by training it to say "I don't know" more often
C. Increase temperature to encourage more diverse answers, which will surface uncertainty
4. A retail app uses Azure OpenAI with streaming responses. Under peak load, you receive intermittent HTTP 429 (Too Many Requests). The business requirement is to keep responses fast and resilient while controlling cost. Which design is MOST appropriate?
A. Implement client-side exponential backoff with retry-after handling, add response caching for repeated prompts, and configure a fallback deployment/model for overload
B. Disable streaming so requests complete faster and eliminate 429s
C. Increase max_tokens for each request to reduce the number of calls
5. You maintain a generative AI solution that summarizes incident tickets. After a prompt update, a regression occurs: summaries sometimes include PII that was previously redacted. You need an approach aligned with enterprise quality and safety practices to prevent future regressions. What should you implement?
A. A versioned evaluation harness with a fixed test set, automated safety checks (PII/redaction), and regression gates in CI/CD before promoting prompts/models
B. Ask the helpdesk team to manually spot-check a few random summaries each week
C. Increase the system message length to repeat the PII policy multiple times
AI-102 increasingly tests whether you can move beyond “single prompt, single response” applications into agentic solutions that plan, call tools, and ground answers in enterprise knowledge. This chapter maps to three recurring objective themes you’ll see in case studies: (1) design agent boundaries (what the model decides vs what code decides), (2) implement safe tool use (function calling, connectors, validation), and (3) build knowledge-mining pipelines (ingestion, enrichment, indexing) that are Retrieval-Augmented Generation (RAG) ready.
Expect exam items to describe a business workflow (claims processing, HR policy Q&A, incident triage) and ask which architecture choice reduces hallucinations, improves traceability, or meets security requirements. The “best” answer usually ties model reasoning to deterministic systems: store state in your app, retrieve from Azure AI Search, validate tool inputs, and log everything for monitoring.
Exam Tip: When you see “must be auditable,” “must not expose secrets,” or “must ensure responses are grounded,” the correct answer usually includes (a) explicit tool boundaries, (b) retrieval from a governed index, and (c) server-side validation—not just prompt instructions.
Practice note for Design agents: goals, memory, tools, and orchestration boundaries: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Implement tool use: function calling, connectors, and tool validation: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Build knowledge mining pipelines: ingestion, enrichment, and indexing: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Optimize retrieval for agents: ranking, filters, and hybrid search: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: agent workflows and search/index case studies: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
On AI-102, “agentic” does not mean “let the model do everything.” It means you implement a pattern where the model proposes steps, while your application enforces boundaries and executes actions. A common architecture is Planner → Executor. The planner (LLM) decomposes a user goal into tasks (e.g., “find policy, summarize, draft email”), and the executor (your orchestration code) chooses which tools to call, in which order, with guardrails.
State handling is a frequent exam target. You should know where state lives: conversation state (messages), workflow state (current step, tool results), and business state (IDs, approvals, escalation flags). The safe default is: keep state in your service (database/cache) and feed the model only what it needs. This makes runs reproducible and limits prompt injection impact.
Design boundaries explicitly: what the model can decide (task selection, wording) vs what code decides (authorization checks, tool availability, retry logic, quotas). Orchestration patterns you may see include: single-agent tool loop (LLM chooses a tool repeatedly), multi-step workflow (fixed stages with an LLM in specific stages), and multi-agent (separate roles). Exam scenarios often reward the simplest pattern that meets requirements.
Exam Tip: If the question mentions “unbounded tool calls,” “runaway costs,” or “infinite loops,” look for answers that add max-iterations, timeouts, and explicit termination criteria in the orchestrator.
Common trap: picking “agent with autonomy” when the scenario requires deterministic business rules (approvals, compliance). The exam typically expects you to keep approvals and policy enforcement in code, not in the model’s reasoning.
Tool use on Azure is typically implemented via function calling (the model returns a structured call) and your application executes the call against an API, database, or connector. AI-102 questions often test whether you understand that function calling is not execution—it is a suggestion that must be validated server-side.
When implementing tools, define a strict schema for each action (JSON arguments, types, enums). Validation steps should include: allowlist tool names, validate argument formats, enforce authorization (RBAC/ABAC), and sanitize inputs before passing to downstream systems. If the tool is an HTTP API, use managed identity where possible and store secrets in Key Vault. Tools should return minimal necessary data to reduce leakage into the model context.
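A minimal server-side validation sketch follows, assuming a hypothetical create_purchase_order tool and a simple role check; a real system would back this with RBAC/ABAC and credentials held in Key Vault, as described above.

# Server-side validation of a model-proposed function call (hypothetical
# schema and tool names). The model's output is a suggestion; nothing runs
# until these checks pass.

ALLOWED_TOOLS = {
    "create_purchase_order": {
        "required": {"supplier_id": str, "amount": float, "currency": str},
        "enums": {"currency": {"USD", "EUR", "GBP"}},
    }
}

def validate_tool_call(call, user_roles):
    name = call.get("name")
    schema = ALLOWED_TOOLS.get(name)
    if schema is None:
        raise PermissionError(f"Tool '{name}' is not on the allowlist")
    if "purchasing" not in user_roles:                 # authorization enforced in code
        raise PermissionError("Caller is not authorized for this action")

    args = call.get("arguments", {})
    for field, expected_type in schema["required"].items():
        if field not in args:
            raise ValueError(f"Missing required field '{field}'")
        if not isinstance(args[field], expected_type):
            raise ValueError(f"Field '{field}' has the wrong type")
    for field, allowed in schema.get("enums", {}).items():
        if args[field] not in allowed:
            raise ValueError(f"Field '{field}' must be one of {allowed}")
    if set(args) - set(schema["required"]):
        raise ValueError("Unexpected fields are not allowed")   # strict schema
    return args                                        # safe to pass downstream

proposed = {"name": "create_purchase_order",
            "arguments": {"supplier_id": "S-1001", "amount": 250.0, "currency": "USD"}}
print(validate_tool_call(proposed, user_roles={"purchasing"}))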
Connectors and tool catalogs show up in enterprise scenarios: “use a CRM system,” “query a ticketing platform,” “send an email.” The correct design is usually to wrap each external dependency behind a controlled service layer so you can log requests, enforce policies, and mask sensitive fields. Then you expose only those wrapped actions as tools.
Exam Tip: If an option says “let the model call the API directly with the API key,” it is almost always wrong. The exam expects a server-side mediator that holds credentials and applies authorization checks.
Common trap: confusing “tools for retrieval” (search) with “tools for actions” (write operations). Retrieval tools can be more permissive; action tools must be tightly controlled, logged, and often require human-in-the-loop in regulated workflows.
Memory is tested as an architecture decision: what stays in the prompt window (short-term) versus what is stored externally (long-term). Short-term memory is the immediate conversation (recent turns, current tool outputs) and is limited by the model’s context window and cost. Long-term memory is stored in a database, vector store, or Azure AI Search index and retrieved as needed.
For short-term memory, the exam favors controlled context construction: include only relevant recent turns, tool outputs, and retrieved snippets with citations. For long-term memory, store durable facts (user preferences, prior tickets, case notes) with metadata and retention rules. In enterprise contexts, long-term “memory” must obey privacy and data minimization requirements—store identifiers and structured summaries, not raw sensitive transcripts unless explicitly required and permitted.
Summarization is a key technique to manage context. A common pattern is “rolling summary”: after N turns, summarize the conversation into a compact state object, then drop older turns. Another pattern is “episodic memory”: store summaries per task and retrieve them later. The exam may ask how to reduce token usage while maintaining accuracy; summarization plus retrieval is usually the correct direction.
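A small sketch of the rolling-summary pattern is shown below; the summarize helper is a stub standing in for a model or summarization service, and the turn limit is an arbitrary example value.

# Rolling-summary sketch (hypothetical helper names; summarize() would call a
# model or summarization service in a real system).

MAX_TURNS_IN_CONTEXT = 6   # keep only the most recent turns verbatim

def summarize(turns):
    """Stub: compress older turns into a compact state string."""
    return "Summary of earlier conversation: " + " | ".join(t["text"][:40] for t in turns)

def build_context(conversation_state):
    turns = conversation_state["turns"]
    if len(turns) > MAX_TURNS_IN_CONTEXT:
        older, recent = turns[:-MAX_TURNS_IN_CONTEXT], turns[-MAX_TURNS_IN_CONTEXT:]
        conversation_state["rolling_summary"] = summarize(older)
        conversation_state["turns"] = recent           # drop older turns from the window
    context = []
    if conversation_state.get("rolling_summary"):
        context.append({"role": "system", "text": conversation_state["rolling_summary"]})
    context.extend(conversation_state["turns"])
    return context

state = {"turns": [{"role": "user", "text": f"turn {i}"} for i in range(10)]}
print(build_context(state))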
Exam Tip: If you see “prompt injection” or “user asks the assistant to reveal system instructions,” choose designs that isolate system prompts, store state outside the model, and retrieve trusted data from governed sources.
Common trap: assuming “vector memory” automatically equals “truth.” Long-term memory can preserve incorrect or outdated content. The exam often rewards adding timestamps, source fields, and re-validation steps (e.g., re-retrieve authoritative policy docs) before generating final responses.
Knowledge mining for agentic solutions typically centers on Azure AI Search. AI-102 expects you to understand index basics and how they affect retrieval quality for RAG. An index contains fields (searchable, filterable, sortable, facetable) and may include vector fields for embeddings. The exam commonly asks you to choose correct field attributes and analyzers based on query needs.
Analyzers matter for keyword search: language analyzers (e.g., English) handle stemming and tokenization; keyword analyzer preserves exact values (useful for IDs). If a scenario requires filtering by category, region, or security label, those fields must be filterable. If it requires “top policies by lastUpdated,” the field must be sortable. These are frequent case-study details.
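For orientation, here is an illustrative index definition written as a Python dict in the shape of an Azure AI Search index payload. The field names are hypothetical, and attribute names should be verified against current service documentation before use.

# Illustrative index definition (hypothetical fields; verify attribute names
# against the current Azure AI Search documentation).

policy_index = {
    "name": "hr-policies",
    "fields": [
        {"name": "chunkId",   "type": "Edm.String", "key": True, "filterable": True},
        {"name": "policyId",  "type": "Edm.String", "searchable": True,
         "analyzer": "keyword"},                       # preserves exact ID values
        {"name": "content",   "type": "Edm.String", "searchable": True,
         "analyzer": "en.microsoft"},                  # language analyzer for stemming
        {"name": "category",  "type": "Edm.String", "filterable": True,
         "facetable": True},                           # needed for filter queries
        {"name": "securityLabel", "type": "Edm.String", "filterable": True},
        {"name": "lastUpdated",   "type": "Edm.DateTimeOffset",
         "filterable": True, "sortable": True},        # "top policies by lastUpdated"
        {"name": "sourceUrl", "type": "Edm.String", "retrievable": True},  # citations
    ],
}
print(policy_index["name"], len(policy_index["fields"]), "fields")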
Scoring and ranking: keyword queries use BM25 scoring; semantic ranking (when enabled) can rerank results using a semantic model; vector similarity uses cosine/dot-product depending on configuration. Many exam items are about selecting the right combination to improve relevance while controlling cost and latency.
Exam Tip: If the scenario requires “users can only see documents they have access to,” look for answers that use per-document ACL metadata + filter queries (or separate indexes per tenant) rather than hoping the model will “not mention” restricted content.
Common trap: putting everything into one giant searchable field without metadata. The exam often expects you to design for filters and citations (source URL, page number, chunk ID) so the agent can provide grounded, traceable responses.
Knowledge mining pipelines convert raw content (PDFs, images, Office files) into structured, searchable data. On AI-102, this is typically framed as “ingestion → enrichment → indexing.” In Azure AI Search, enrichment is implemented through cognitive skills (skillsets) applied during indexing. You may see requirements like “extract text from scanned PDFs,” “detect key phrases,” or “identify people and organizations.”
OCR is central for document workflows. If documents are images or scanned PDFs, you need OCR to extract text before chunking and indexing. Entity extraction and key phrase extraction improve downstream retrieval by adding structured fields that can be filtered or used for boosting (e.g., prioritize documents mentioning a product name or regulation code).
Design for traceability: store enrichment outputs with provenance—page number, bounding box (for OCR), and source file reference—so the agent can cite where information came from. This is a typical case-study requirement (“must provide citations”). Also consider normalization: dates, units, and IDs should be extracted into consistent formats for accurate filtering and sorting.
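A sketch of what an index-ready chunk with provenance might look like is shown below; the field names are illustrative, not a required schema.

# Enriched, index-ready chunk record with provenance metadata (illustrative).

chunk = {
    "chunkId": "contoso-hr-handbook-p12-c03",
    "content": "Employees may carry over up to five days of unused leave...",
    "entities": ["Contoso", "HR"],          # from entity extraction, used for filters
    "keyPhrases": ["unused leave", "carry over"],
    "sourceFile": "hr-handbook.pdf",        # provenance for citations
    "pageNumber": 12,
    "boundingBox": [72, 340, 520, 410],     # OCR geometry (x1, y1, x2, y2)
    "effectiveDate": "2025-07-01",          # normalized date for filtering/sorting
}

def format_citation(record):
    """Build the citation string an agent can attach to a grounded answer."""
    return f"{record['sourceFile']}, page {record['pageNumber']} ({record['chunkId']})"

print(format_citation(chunk))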
Exam Tip: When the prompt says “scanned documents” or “images,” the correct pipeline includes OCR before indexing. If the option jumps straight to “vectorize PDFs” without extracting text, it’s usually incomplete for search and citation needs.
Common trap: over-enriching everything. The exam often rewards minimal, targeted enrichment aligned to the query experience (e.g., entities needed for filters) rather than enabling every skill by default.
Agents succeed or fail based on retrieval quality. AI-102 tests whether you can pick an appropriate retrieval strategy—keyword, vector, hybrid, and semantic reranking—and tune it using filters and ranking controls. Keyword search is strong for exact terms (policy IDs, error codes). Vector search is strong for semantic similarity (paraphrases, “find procedures like this”). Hybrid combines both and is often the best default for enterprise RAG.
Semantic reranking is commonly used after initial retrieval (keyword/hybrid) to improve ordering of top results. In exam scenarios, semantic rerank is a good fit when the corpus is large and user questions are natural language, but you still need lexical precision and metadata filtering.
Optimization for agents means: apply metadata filters early (security trimming, doc type, date), retrieve a manageable top-k, then assemble context with deduplication and diversity (avoid returning five chunks from the same page). If the workflow has multiple steps, retrieve per step (policy lookup, then procedure lookup) rather than one massive retrieval.
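Below is a hedged sketch of a hybrid query with security trimming and a small top-k, using the azure-search-documents Python SDK. The endpoint, key, index, field names, and embedding are placeholders, and parameter names can differ between SDK versions, so check the version you install.

# Hybrid (keyword + vector) query with an early metadata filter and small top-k.
# All service details are placeholders.

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

search_client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="hr-policies",
    credential=AzureKeyCredential("<query-key>"),
)

question_embedding = [0.0] * 1536   # placeholder: produced by your embedding model

results = search_client.search(
    search_text="how many leave days carry over",               # keyword side of hybrid
    vector_queries=[VectorizedQuery(vector=question_embedding,
                                    k_nearest_neighbors=10,
                                    fields="contentVector")],    # vector side of hybrid
    filter="securityLabel eq 'hr-general' and category eq 'leave'",  # trim before ranking
    select=["chunkId", "content", "sourceUrl", "pageNumber"],
    top=5,                                                        # keep context small and relevant
)

for doc in results:
    print(doc["chunkId"], doc.get("sourceUrl"))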
Exam Tip: When you see “improve relevance without changing the model,” the answer is often “adjust retrieval”: add filters, hybrid search, semantic reranking, better chunking, or scoring profiles—rather than prompt tweaks alone.
Common trap: increasing top-k excessively. More retrieved text can reduce answer quality by adding noise and increasing token cost. The exam tends to favor targeted retrieval plus summarization over “stuff everything into the prompt.”
1. A company is building an incident-triage agent on Azure. The agent must decide when to query Azure AI Search for runbooks, when to call a ticketing API, and when to stop and request human approval. The solution must be auditable and minimize hallucinations. Which design best aligns with AI-102 agent boundary guidance?
2. You implement function calling so an agent can create purchase orders through an internal API. The API must never receive unexpected fields, and the organization requires server-side validation for compliance. What should you implement?
3. A healthcare provider needs a knowledge mining pipeline to support an HR policy Q&A agent. Source documents include PDFs and scanned images. The agent must answer with citations and only from approved content. Which pipeline is most appropriate?
4. A support agent uses Azure AI Search to retrieve troubleshooting steps. Users report that results are relevant but sometimes from the wrong product version. Each document has fields: productName, version, language, and lastUpdated. What is the best change to improve retrieval precision for the agent?
5. A legal research agent must handle both keyword-heavy queries (case numbers, statute citations) and natural-language questions. The team wants to improve relevance without losing exact-match behavior. Which Azure AI Search approach best fits?
This chapter maps directly to the AI-102 skills measured for building vision and language features with Azure AI services. Expect the exam to test whether you can choose the right service for the job, design production-ready pipelines (quality, reliability, and cost), and integrate outputs into downstream business workflows. The highest-scoring answers usually show you understand the difference between “image understanding” and “document understanding,” and between “text analytics” and “conversational experiences,” plus the deployment realities (latency, throughput, and error handling).
From a solutions perspective, vision and NLP are rarely isolated. A common enterprise pattern is: ingest images or PDFs, extract text and structure with OCR/document intelligence, enrich with NLP (entities, key phrases, sentiment), then route to an agent or workflow for human review. The exam often hides this pipeline in scenario wording like “process invoices,” “extract from ID cards,” “classify customer emails,” or “enable voice for a kiosk.” Your job is to identify which components belong to which task and avoid over-engineering.
Exam Tip: When you see “tables,” “key-value pairs,” “forms,” or “multi-page PDFs,” you are typically in document processing territory. When you see “objects,” “tags,” “bounding boxes,” “visual features,” or “scene understanding,” you are typically in image analysis territory. Don’t pick a document-first service to solve an object-detection problem, and don’t pick a generic image analyzer to reliably reconstruct tables.
Practice note for Computer vision essentials: analysis, detection, and OCR workflows: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Document and image pipelines: preprocessing, quality, and error handling: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for NLP fundamentals: classification, extraction, and conversational design: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Speech and multimodal integration: voice-in/voice-out and accessibility: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Domain practice set: vision + NLP mixed scenarios and edge cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 expects you to select the correct vision capability based on the data shape and the output required. “Image analysis” solutions focus on understanding pixels as visual content: identifying objects, generating captions/tags, detecting people/brands, or locating items with bounding boxes. “Document processing” solutions focus on reconstructing reading order and structure: pages, lines, words, tables, fields, and key-value pairs across multi-page files.
In practical architectures, the decision is driven by the business question. If the question is “What is in this picture?” you use image analysis. If the question is “What does this document say, and where is the invoice number?” you use document intelligence/OCR-first approaches. A common trap is to treat PDFs as images and run generic image analysis—this may extract some text but usually fails on tables, multi-column layouts, and reliable field extraction.
Exam Tip: Watch for the word “layout.” On AI-102, “layout” usually means you need document-layout understanding (pages/blocks/tables), not just OCR text. Also watch for “handwritten” or “low-quality scans”—these are quality risk signals that should influence preprocessing and error handling.
When you justify the selection in a scenario, the best answer ties requirements to outputs (structure vs semantics) and to constraints (multi-page PDFs, table fidelity, compliance). That reasoning is what the exam rewards.
OCR is the foundation, but AI-102 cares about what you do after text recognition: reconstruct layout, extract structured fields, and handle tables reliably. Document understanding typically starts with an OCR pass that produces text plus geometry (bounding boxes). Layout-aware processing then groups words into lines/paragraphs, infers reading order, and identifies tables as grid-like structures rather than a flat stream of tokens.
Forms and invoices are common exam themes because they highlight why “plain OCR text” is not enough. In a form, the value “$1,250.00” is meaningless unless linked to a label like “Total Due.” In a table, row/column positions determine meaning. For these, your solution should emphasize key-value extraction and table extraction rather than only returning text.
Exam Tip: The exam frequently tests “confidence-aware” design. If a scenario mentions audits, financial reporting, or regulated data, propose thresholds, human-in-the-loop review, and storing original documents alongside extracted results for traceability.
Quality and error handling matter here: skew, blur, low DPI, compression artifacts, and handwriting can reduce accuracy. The exam won’t ask you to implement image filters in code, but it will expect you to recognize when preprocessing (deskew, denoise, contrast) and validation rules (required fields, format checks) are necessary. A common trap is assuming extraction is deterministic—production systems must handle partial extraction, missing pages, and ambiguous fields gracefully.
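A confidence-aware triage sketch follows; the threshold, field names, and format rule are illustrative, and the extraction results are assumed to arrive as values paired with confidence scores.

# Route low-confidence or malformed fields to human review instead of letting
# them flow straight into downstream systems (illustrative names and threshold).

import re

CONFIDENCE_THRESHOLD = 0.85
REQUIRED_FIELDS = {"invoiceNumber", "totalDue", "dueDate"}

def triage_extraction(extracted):
    """Return (accepted, needs_review) given {field: {"value": ..., "confidence": ...}}."""
    accepted, needs_review = {}, {}
    for field in REQUIRED_FIELDS:
        item = extracted.get(field)
        if item is None:
            needs_review[field] = "missing"
            continue
        if item["confidence"] < CONFIDENCE_THRESHOLD:
            needs_review[field] = f"low confidence ({item['confidence']:.2f})"
            continue
        accepted[field] = item["value"]
    # Simple format rule: dates must be ISO-like; otherwise send to review.
    if "dueDate" in accepted and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", accepted["dueDate"]):
        needs_review["dueDate"] = "unexpected date format"
        accepted.pop("dueDate")
    return accepted, needs_review

sample = {
    "invoiceNumber": {"value": "INV-20931",  "confidence": 0.97},
    "totalDue":      {"value": "1250.00",    "confidence": 0.72},   # below threshold
    "dueDate":       {"value": "2026-03-15", "confidence": 0.94},
}
print(triage_extraction(sample))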
AI-102 scenarios often include operational constraints: “process 2 million receipts overnight,” “support near-real-time checkout,” or “limit cost.” Your design must address throughput (items/time), latency (time per request), and reliability (retries, timeouts, idempotency). Vision workloads can be compute-heavy and network-heavy; the wrong pattern can cause throttling or unexpected cost.
Start by classifying the workload as online (interactive) or batch (asynchronous). Online workloads prioritize predictable latency and user experience, usually with smaller payloads and faster models/features. Batch workloads prioritize throughput and cost efficiency, typically using queue-based ingestion and scalable workers.
Exam Tip: If a question mentions “spikes,” “seasonal load,” or “unpredictable traffic,” the safe answer includes decoupling with a queue and scaling out consumers. If it mentions “instant feedback,” emphasize low-latency calls, payload size control, and caching where appropriate.
Another common exam trap is ignoring payload constraints and network overhead. Sending full-resolution images when thumbnails or cropped regions would suffice can inflate cost and latency. Similarly, processing entire documents when you only need specific pages is wasteful; a strong design extracts only what’s needed and logs metrics (request rate, error rate, latency percentiles) to prove it meets SLAs.
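Here is a small retry-with-backoff sketch for throttled calls; call_vision_api is a stand-in for whatever client you use, and a production batch worker would pull items from a queue so spikes are absorbed upstream rather than hammering the service.

# Jittered exponential backoff around a throttled call (simulated with a
# stand-in function and exception class).

import random
import time

class ThrottledError(Exception):
    """Raised when the service responds with HTTP 429 (stand-in for a real SDK error)."""

def call_vision_api(payload):
    # Placeholder: simulate occasional throttling.
    if random.random() < 0.3:
        raise ThrottledError("429 Too Many Requests")
    return {"status": "ok", "items": len(payload)}

def call_with_backoff(payload, max_attempts=5, base_delay=1.0):
    for attempt in range(1, max_attempts + 1):
        try:
            return call_vision_api(payload)
        except ThrottledError:
            if attempt == max_attempts:
                raise
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.5)  # jittered backoff
            time.sleep(delay)

print(call_with_backoff(["receipt-001.jpg", "receipt-002.jpg"]))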
NLP on AI-102 is about selecting the correct text operation and shaping it into a workflow that produces business value. The exam commonly tests sentiment analysis (opinion/attitude), key phrase extraction (topic cues), named entity recognition (NER) for people/places/organizations/PII-like entities, and summarization patterns for condensing long content into action-ready text.
Think in terms of outputs and downstream decisions. Sentiment is rarely useful alone; it’s useful when you route items (e.g., negative emails to priority queue). Key phrases support search, tagging, and clustering. NER supports extraction into structured fields, redaction, and compliance review. Summarization supports agent handoffs and case notes, especially when you need a consistent “brief” for human operators.
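A hedged routing sketch with the azure-ai-textanalytics SDK is shown below; the endpoint, key, and routing rule are placeholders, and method names reflect the SDK as commonly documented, so confirm them against your installed version.

# Sentiment + key phrases feeding a routing decision (placeholders throughout).

from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<key>"),
)

email = ["My invoice was charged twice and nobody has replied for a week."]

sentiment = client.analyze_sentiment(email)[0]
phrases = client.extract_key_phrases(email)[0].key_phrases

# Sentiment alone isn't the decision; it feeds a routing rule.
if sentiment.sentiment == "negative" and any("invoice" in p.lower() for p in phrases):
    queue = "billing-priority"
else:
    queue = "general-support"
print(queue, phrases)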
Exam Tip: When a scenario says “extract invoice number, dates, amounts from text,” that is entity/field extraction (NER + patterns), not sentiment. When it says “create a short brief for an agent from a long transcript,” that is summarization—often paired with key phrases to drive routing.
A classic trap is choosing a heavy generative approach when a deterministic extractor is required. If the requirement is strict, auditable extraction (e.g., regulatory reporting), prioritize structured extraction patterns, confidence scores, and rule checks. Use generative summarization when the output is advisory, not authoritative.
Conversational design on AI-102 is evaluated through safety, reliability, and user outcomes—not just “can it chat.” You should be ready to describe how prompts, conversation state, and tools (search, databases, ticketing) work together, and how the system behaves when it cannot answer. The exam often frames this as a support bot, internal assistant, or case intake agent.
Strong conversational solutions define intent, constraints, and a fallback strategy. Your prompts should specify role, boundaries, and desired output format. Conversation state should store the minimum necessary context (and respect privacy), and tool use should be explicit: when to call a knowledge base, when to ask clarifying questions, and when to escalate.
Exam Tip: If an answer choice includes “respond even when you don’t know” or “make best effort without sources,” it’s usually wrong for enterprise scenarios. Prefer designs that refuse/redirect, request clarification, or escalate with a transcript and extracted entities for the human agent.
Also expect traps around logging and monitoring. For conversational systems, monitoring should include user abandonment, escalation rate, tool-call failures, and grounded answer rate. A robust design treats these as quality signals, not just operational telemetry.
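A tiny sketch of turning those signals into numbers is shown below; the event and field names are illustrative and would come from your telemetry store in practice.

# Compute conversational quality signals from logged events (illustrative schema).

events = [
    {"conversationId": "c1", "escalated": False, "abandoned": False, "toolCallFailed": False, "grounded": True},
    {"conversationId": "c2", "escalated": True,  "abandoned": False, "toolCallFailed": True,  "grounded": True},
    {"conversationId": "c3", "escalated": False, "abandoned": True,  "toolCallFailed": False, "grounded": False},
]

def rate(events, flag):
    return sum(e[flag] for e in events) / len(events)

print(f"escalation rate:    {rate(events, 'escalated'):.0%}")
print(f"abandonment rate:   {rate(events, 'abandoned'):.0%}")
print(f"tool-call failures: {rate(events, 'toolCallFailed'):.0%}")
print(f"grounded answers:   {rate(events, 'grounded'):.0%}")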
Speech integration appears on AI-102 when scenarios involve call centers, kiosks, accessibility, or hands-free workflows. The exam expects you to understand the core loop: Speech-to-Text (STT) for input, NLP/LLM orchestration for intent and response generation, and Text-to-Speech (TTS) for output. The key is selecting patterns that handle real-world audio variability and provide testable behavior.
For voice-in, design for streaming vs batch transcription. Streaming STT reduces perceived latency and enables barge-in (user interrupts). Batch transcription fits recordings (voicemails, archived calls) and supports downstream summarization and analytics. For voice-out, TTS should match the channel (phone vs device speaker) and support accessibility requirements (clear pronunciation, pacing, SSML when needed).
Exam Tip: If the scenario includes “capture account number” or “medical dosage,” the safest design includes confirmation (“I heard… is that correct?”) and spell-out/phonetic strategies, because a small STT error becomes a major business error.
Finally, test multimodal edge cases: poor connectivity, partial utterances, and timeouts. The exam often rewards answers that mention resilience: retry policies, graceful degradation to text chat, and clear user messaging when speech services are unavailable.
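To tie the loop together, here is a hedged single-shot voice-in/voice-out sketch using the Speech SDK for Python; the key and region are placeholders, and a real kiosk would use streaming recognition with barge-in rather than recognize_once.

# Single-shot STT -> confirmation -> TTS, with graceful degradation on failure.

import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<speech-key>", region="<region>")
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)      # default microphone
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)    # default speaker

result = recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    # Confirmation step: repeat critical values back before acting on them.
    synthesizer.speak_text_async(f"I heard: {result.text}. Is that correct?").get()
else:
    # Graceful degradation when speech input fails or times out.
    synthesizer.speak_text_async(
        "Sorry, I didn't catch that. Please try again or use the touch screen."
    ).get()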
1. A company must process multi-page supplier invoices (PDFs). The solution must extract line items from tables and capture key-value pairs (invoice number, total, due date) with high accuracy. Which Azure AI service should you use?
2. A retail app needs to detect common objects in user-submitted photos (for example: "backpack", "bicycle", "dog") and return bounding boxes for each detected object. Which service and feature best meets the requirement?
3. A support team receives thousands of customer emails per day. The company wants to automatically route each email to one of several departments (Billing, Technical Support, Sales) based on the message content. Which approach is most appropriate?
4. You are designing a pipeline to process scanned application forms (images). Some scans are skewed or low contrast, causing OCR failures. You need a production-ready design that minimizes downstream errors and supports reprocessing. Which design choice best addresses reliability?
5. A company is building a hands-free kiosk for accessibility. Users should speak requests and hear spoken responses. The kiosk also needs to read short printed labels using the camera and speak them aloud. Which set of Azure services best fits the end-to-end requirements?
This chapter is your conversion point: you stop “studying” and start performing. AI-102 (Designing and Implementing a Microsoft Azure AI Solution) rewards candidates who can recognize patterns, choose the safest and most governable option, and justify tradeoffs under time pressure. The lessons in this chapter (Mock Exam Part 1, Mock Exam Part 2, Weak Spot Analysis, Exam Day Checklist, and the Final Review Sprint) are designed to mirror how the real exam feels: mixed domains, competing constraints, and answer choices that are all plausible unless you apply Azure-first reasoning.
Use this chapter like a practice lab. You will run a full mock exam in two parts, then perform a structured post-mortem. The goal is not a perfect score on the first attempt; the goal is a repeatable method for eliminating wrong answers quickly and consistently, especially where the exam tests governance, security, monitoring, evaluation, and cost controls.
Exam Tip: The most common way strong engineers miss AI-102 questions is by selecting an option that “works” technically but fails a hidden requirement: least privilege, private networking, data residency, monitoring, or cost containment. Train yourself to search for these constraints in every prompt.
Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Weak Spot Analysis: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Exam Day Checklist: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Final Review Sprint: formulas, patterns, and common traps: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
For Mock Exam Part 1 and Part 2, simulate the real exam environment: one sitting, no notes, no internet, and a strict time box. If you can’t replicate the full duration, split into two sessions, but keep the total “clock time” pressure intact. The AI-102 exam frequently includes scenario-heavy items where time disappears in rereading; your strategy must protect you from over-investing early.
Use a three-pass approach. Pass 1: answer “instantly obvious” items and mark anything that requires calculation, nuanced tradeoffs, or careful reading of constraints. Pass 2: return to marked items and apply an elimination framework (security/governance first, then architecture fit, then cost/perf). Pass 3: review only those you are least confident on—do not churn every answer.
Exam Tip: Time box per item. If your average target is ~1–1.5 minutes, enforce a hard cutoff (e.g., 2 minutes). If you’re still undecided, pick the best governance-aligned choice and mark it. Overthinking is usually worse than a well-reasoned default.
Finally, treat every question as a test of objectives alignment. The exam is not asking “can you build it?” but “can you build it safely, observably, and within budget on Azure?”
Mock Exam Part 1 should feel like a case study: a business problem plus multiple technical requirements across domains (genAI, agents, vision, NLP, search, and operations). Your job is to translate narrative into architecture decisions. In AI-102, case-study style scenarios commonly test (1) model selection and deployment, (2) grounding/RAG design, (3) identity/networking, and (4) evaluation/monitoring strategy.
When you see a scenario about an internal knowledge assistant, assume the exam wants you to connect Azure OpenAI to Azure AI Search (vector + keyword), with a grounding strategy and an evaluation plan. Your “tell” is language like “use enterprise documents,” “reduce hallucinations,” “must cite sources,” or “ensure only authorized users see certain documents.” That last requirement is the trap: many candidates propose a great RAG pipeline but forget per-user access control. On Azure, think document-level security trimming, filtering by user claims/roles, and ensuring the app uses managed identity where possible.
Exam Tip: If a scenario includes “no data leakage” or “regulatory,” the correct answer usually layers controls: private endpoints + managed identity + key management (CMK if specified) + content safety + logging/monitoring. A single control is rarely enough in the exam’s framing.
Scenario items about image-heavy workflows typically test whether you choose the right vision capability: OCR for text extraction, document-oriented analysis for structured forms, and image analysis for tags/captions/object detection. The trap is choosing a general image model when the requirement is structured extraction or searchable fields. Likewise, if the scenario asks for search across extracted text and metadata, the expected end state is an index that is “RAG-ready”: chunking strategy, embeddings, and retrievable fields (including citations).
For operations-oriented scenario items, the exam checks that you can plan governance and cost controls: budgets/alerts, quota management, monitoring with Application Insights/Log Analytics, and safe rollout patterns. If the scenario mentions “spend is growing,” assume you need to control token usage, caching, prompt length, retrieval limits, and tier selection—plus FinOps mechanisms such as budgets and alerting.
Mock Exam Part 2 should stress your precision: hotspot-style items test whether you know where a feature is configured (networking vs identity vs service settings), and multi-select items punish partial understanding. The skill is not memorizing UI clicks—it’s knowing which layer owns the control.
For hotspot-style thinking, map controls to layers: identity (Entra ID, managed identity, RBAC), network (VNet integration, private endpoints, firewall rules), data (encryption, CMK, key vault), and safety (content filters, system messages, tool constraints). If you’re asked where to enforce “only this app can call the model,” think managed identity and RBAC/service auth; if asked where to prevent public access, think private endpoint and disabling public network access.
Exam Tip: In multi-select, do not over-select. The exam often includes one or two “nice-to-have” options that are not required by the scenario. Select only the minimum set that fully satisfies requirements; extra selections can make the answer wrong.
Agentic solution items often appear as multi-select: tool use, orchestration patterns, and safety constraints. The exam typically rewards patterns like: constrain tools via allowlists, validate tool outputs, isolate high-risk actions behind confirmations, and implement a policy boundary (e.g., system instructions + tool schema + server-side enforcement). A common trap is assuming prompt instructions alone are a security boundary. On the exam, “safety constraint” means you can enforce it in code, policy, or service configuration—not just in natural language.
Another frequent multi-select theme is evaluation and monitoring. Expect to pick components that collectively support quality and risk management: offline evaluation sets, automated metrics (groundedness/faithfulness where applicable), human review workflows for high-risk outputs, and telemetry for prompts/responses (redacted as needed). The trap is choosing logging that violates privacy requirements—if a scenario forbids storing PII, you need redaction, sampling, or storing only derived metrics.
This is the “Weak Spot Analysis” lesson: your score is less important than your error log. Immediately after each mock, capture every missed or guessed item and classify it by domain objective: plan/manage, genAI implementation, agents, vision, NLP, or knowledge mining. Then tag the failure mode: misread requirement, wrong service choice, missing security control, cost/latency tradeoff error, or evaluation/monitoring gap.
Use a two-column fix plan. Column A: “What rule would have prevented the miss?” Column B: “What will I do next time?” Example rules: “If private access is mentioned, prioritize private endpoints and disable public network access.” Or: “If grounding is required, include citations and retrieval constraints, not just a bigger model.” Translate each rule into a 1–2 sentence checklist item for your final sprint.
Exam Tip: Track “nearly right” answers separately. These are the ones you can convert fastest by learning the exam’s preferred pattern (e.g., managed identity over keys, Azure AI Search over a custom vector DB when nothing else is required, budgets/alerts for cost control, evaluation as a first-class requirement).
Finally, do targeted remediation: one small lab or one focused reading per error category, not broad re-study. If you missed three items due to access control in RAG, revisit security trimming patterns and how claims-based filtering is applied in retrieval. If you missed monitoring questions, review what belongs in Application Insights vs Log Analytics and what “end-to-end tracing” means for an AI app (prompt → retrieval → model → tool calls → user response).
Your “Final Review Sprint” is about rapid pattern recall. Use the checklist below as your last 48-hour pass. Each bullet is phrased the way the exam tests it: as a decision under constraints.
Exam Tip: If two answers both “work,” pick the one that is (1) more governable, (2) more observable, and (3) more cost-controlled—this triad matches how Azure solutions are evaluated in enterprise settings and how the exam writers differentiate options.
This is your Exam Day Checklist lesson. First, control the environment: stable internet, quiet space, and a clean desk. For online proctoring, ensure your testing machine has updates paused, notifications disabled, and no background apps that could trigger proctor flags. For test center, arrive early and plan for check-in time so your mental energy goes into the first questions, not logistics.
Next, pacing: commit to your three-pass strategy from Section 6.1. If you feel stuck, look for the hidden constraint you may have missed—privacy, networking, identity, evaluation, or cost. Many AI-102 items are designed so that reading one requirement carefully collapses the problem to a single best answer.
Exam Tip: When you’re down to two choices, ask: “Which option reduces operational risk?” The exam frequently favors managed identity over secrets, private endpoints over IP allowlists, and measurable evaluation/monitoring over ad-hoc testing.
Last-minute review should be lightweight: skim your personal error-log rules, not the whole syllabus. Rehearse a few “anchor patterns” you can reuse: secure RAG (Search + citations + trimming), safe agent tooling (allowlist + confirmations), vision-to-index pipeline (OCR/extraction → enrichment → search), and production readiness (monitoring + budgets + content safety). Then stop. Mental freshness is a performance multiplier on scenario-heavy exams.
After the exam begins, keep confidence stable. The exam will include unfamiliar wording; translate it back into objectives and patterns you practiced in the mock exams. Your preparation is complete when you can consistently choose the safest, most governable Azure solution—not just the one that seems clever.
1. You are reviewing results from a full AI-102 mock exam. You notice you consistently choose answers that functionally meet requirements but miss a hidden constraint (for example: private networking or least privilege). Which approach is MOST effective to improve accuracy under time pressure for the real exam?
2. A company is preparing for exam day. They will take AI-102 at a testing center and have had issues previously with running out of time on case-study questions. Which action is BEST aligned with an exam-day checklist that improves time management and reduces avoidable mistakes?
3. During weak spot analysis, you identify that most of your missed questions involve choosing between options that all "work" but differ in governance and security. In a final review sprint, which recurring pattern should you apply FIRST when two options appear technically equivalent?
4. You run Mock Exam Part 2 and score lower than Part 1. Your goal is to improve within one week using the chapter's method. Which sequence is MOST effective?
5. A startup is building a generative AI assistant on Azure. During the final review sprint, you practice a common AI-102 trap: selecting an option that works but violates cost controls. The requirement states: "Minimize ongoing cost while maintaining adequate monitoring and governance." Which choice is MOST likely to be correct in an exam scenario?