AI Certification Exam Prep — Beginner
A complete, domain-mapped AI-102 prep path with practice and a mock exam.
This course is a structured exam-prep blueprint for the Microsoft AI-102 exam (Azure AI Engineer Associate). It’s designed for beginners with basic IT literacy who want a clear, step-by-step path from “new to certification exams” to confident exam readiness. The focus is on the exact skills measured in the official domains, with an emphasis on how Microsoft tests design decisions, tradeoffs, and operational readiness—not just terminology.
The AI-102 exam spans six core domains. This course is organized as a 6-chapter “book,” with Chapters 2–5 mapping directly to the skill areas and Chapter 6 providing a full mock exam and final review cycle.
Chapter 1 is your exam on-ramp: registration, scoring, question styles, and a beginner-friendly study strategy so you don’t waste time on low-yield topics. Chapters 2–5 each go deep into one or two official domains, emphasizing common scenario patterns (security constraints, quota limitations, latency vs cost decisions, and “best answer” architecture selection). Chapter 6 delivers a full mock exam split into two parts, followed by weak-spot analysis and an exam-day checklist.
AI-102 questions often reward applied reasoning: choosing the right Azure AI capability, securing access properly, selecting a retrieval strategy, or identifying the most supportable deployment approach. This course is designed to build that reasoning through structured coverage and exam-style practice in every major chapter, culminating in a full mock exam and a final review map by domain.
If you’re ready to build a reliable study routine and prep with a domain-mapped plan, start by setting up your learning account and bookmarking your study schedule. Register free to begin, or browse all courses to compare related Azure AI tracks.
Microsoft Certified Trainer (MCT)
Jordan Whitaker is a Microsoft Certified Trainer who designs exam-aligned learning paths for Azure and AI certifications. He has coached learners through Microsoft certification prep with a focus on practical architecture, governance, and deployment skills.
AI-102 tests whether you can design and implement practical Azure AI solutions under real-world constraints: security, cost, reliability, and responsible AI. This chapter sets expectations for the exam experience, then gives you a repeatable study system that aligns to the domains you’ll be scored on. You’ll also learn how to set up a lab environment without burning time or money, how to read AI-102 questions for keywords and hidden constraints, and how to use diagnostics to set realistic score goals.
The biggest mistake candidates make is studying “by service” (e.g., only Azure OpenAI or only Azure AI Search) rather than “by objective.” The exam rewards integration: using Azure AI Search for retrieval-augmented generation (RAG), adding safety filters and evaluation, deploying with monitoring, and applying governance practices. Throughout this chapter, you’ll see how to translate an objective into (1) documentation to read, (2) a lab to build, and (3) question patterns to practice.
Exam Tip: Treat AI-102 as an implementation-and-operations exam, not a conceptual AI theory test. If your study notes don’t include deployment steps, quota constraints, authentication choices, and monitoring signals, you’re missing what the exam is measuring.
Practice note for each subsection in this chapter (exam format, registration, and scoring; the domain-by-domain study plan and timeboxing for beginners; lab environment setup strategy covering your Azure subscription, quotas, and tools; how to read AI-102 questions for keywords, distractors, and elimination; and the baseline diagnostic quiz and goal setting): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 validates the skills of an Azure AI Engineer Associate: someone who plans, builds, integrates, and operates AI capabilities in Azure. The exam focuses on applied decision-making: choosing the right Azure AI services, wiring them together securely, and meeting non-functional requirements (latency, cost, governance, and reliability). Expect scenarios that combine multiple components—like Azure OpenAI + Azure AI Search + storage + managed identities + monitoring—because that reflects how solutions ship in production.
The course outcomes map to the job role you’re being tested on: planning and managing solutions (governance/security/cost/monitoring/deployment), implementing generative AI solutions with Azure OpenAI (prompting, RAG, safety, evaluation), implementing agentic solutions (tools/functions, orchestration, memory, reliability controls), plus computer vision, NLP, and knowledge mining with Azure AI Search indexing/enrichment.
On the exam, “role clarity” matters because it tells you what the question expects you to own. A common trap is over-indexing on model selection or data science details. AI-102 is not asking you to train foundation models; it’s asking you to deploy and integrate capabilities safely. When a question mentions production requirements (PII, private networking, RBAC, cost caps), interpret it as an engineering governance question, not a prompt-writing question—even if Azure OpenAI appears in the stem.
Exam Tip: When you see wording like “minimize administrative effort,” “follow least privilege,” or “meet compliance,” think first about managed identities, RBAC, private endpoints, logging, and content safety controls—these are frequent differentiators between answer choices.
Register for AI-102 through Microsoft’s certification portal and schedule via the authorized provider (typically online proctored or in-person). Plan your date backward from a realistic study runway: beginners often need multiple weeks of structured practice plus lab time. Your goal is not just completion of reading; it’s exposure to scenario questions where governance, deployment, and service limits appear together.
Know the policies because they affect your test-day performance. Online proctoring has strict rules: clean desk, no secondary screens, and limited breaks. In-person centers reduce proctoring friction but require travel time and can add scheduling constraints. If you need accommodations (extra time, assistive technology), apply early; delays here can disrupt your study calendar and force you into a suboptimal date.
Retake rules matter for strategy. Some candidates rush an attempt “to see the exam” and then spend the retake window frustrated. A better approach is to use a baseline diagnostic (without writing down exact items) to identify weak domains, then book the exam once you can consistently explain why each incorrect option is wrong. Budget both time and cost: factor in subscription charges for labs and potential retake fees.
Exam Tip: Schedule your exam at a time when you can do a full-length practice set in the same time slot during the two weeks prior. This reduces cognitive surprises and helps you pace case studies and multi-part items.
AI-102 uses scaled scoring, so your raw correct count translates to a scaled score. Treat every question as valuable; don’t assume some items “don’t count.” Your controllable variable is consistency: avoid preventable misses caused by misreading constraints, confusing similar services, or ignoring the requirement to choose the best option (not merely a working one).
Expect multiple question types: single-answer multiple choice, multiple-response, ordering/sequence (e.g., correct deployment steps), drag-and-drop matching, and case studies. Case studies are time traps: they present a business scenario, existing environment, requirements, and constraints. The exam often hides key constraints in non-obvious places—such as “must support customer-managed keys,” “data must not traverse public internet,” or “solution must be regionally pinned.” Those constraints eliminate otherwise attractive answers.
Case study tactics: read requirements first, then skim the environment for blockers (identity, networking, quotas), and only then read the question. Re-check each answer against the exact wording. Many distractors are “nearly right,” but violate one constraint (e.g., using keys in code instead of managed identity, or proposing a service that doesn’t support the needed feature in that context).
Exam Tip: Build a habit of underlining three items mentally: (1) the user goal (what outcome), (2) the hard constraint (what must/must not), and (3) the optimization target (minimize cost, maximize reliability, reduce latency). Most wrong answers fail one of these three.
Your study plan should be domain-driven and timeboxed, especially if you’re new to Azure AI. While exact weightings can change, AI-102 commonly evaluates competency across these six functional domains that align tightly with your course outcomes: (1) Plan and manage an Azure AI solution; (2) Implement generative AI solutions with Azure OpenAI; (3) Implement an agentic solution; (4) Implement computer vision solutions; (5) Implement NLP solutions; (6) Implement knowledge mining and information extraction with Azure AI Search.
Map each domain to what the exam actually tests, not just to the service names it mentions.
A frequent trap is mixing similarly named services or features. For example, candidates confuse “knowledge mining” (indexing + enrichment pipelines) with “RAG” (retrieval at query time). The exam expects you to know how they complement each other: knowledge mining builds the searchable index; RAG uses retrieval results to ground a generative response.
Exam Tip: Whenever a question mentions “indexing,” “enrichment,” “skillset,” or “cognitive skills,” anchor on Azure AI Search ingestion. Whenever it mentions “grounding,” “citations,” or “use retrieved passages,” anchor on RAG with search + embeddings.
A beginner-friendly plan is a two-track workflow: concept track (reading + notes) and execution track (labs). Timebox by domain: allocate weekly blocks so you touch all domains early, then return for deepening. This prevents the common failure mode of spending all week on generative AI and then realizing you never practiced AI Search skillsets or OCR configuration.
Notes: keep them objective-aligned and decision-focused. Write down “if-then” rules that appear in questions: if private networking required → consider private endpoints; if secretless auth required → managed identity; if cost must be minimized → pick prebuilt capability rather than custom training, when it meets requirements.
Flashcards: convert your “if-then” rules and service differentiators into spaced repetition prompts. Your goal is speed: on exam day, you should instantly recognize which service/feature a keyword implies.
Labs: set up a stable practice environment. Use one Azure subscription (or a dedicated resource group strategy), define naming conventions, and clean up resources to control cost. Plan for quotas and regional availability: generative AI and some vision/NLP features can be region-limited. Track your Azure OpenAI quota and model availability, and document which region you can reliably deploy to.
Exam Tip: In labs, practice the “boring” pieces: identity, RBAC, logging, and key management. Many candidates can demo a chatbot but miss questions about secure access to storage/search or how to monitor failures and throughput.
Practice is where your score moves. The key is not volume; it’s rationale review. After each practice set, classify misses into buckets: (1) knowledge gap (didn’t know feature), (2) misread constraint (missed “must/only/not”), (3) elimination failure (didn’t spot why distractor violates requirements), or (4) execution gap (never built it in a lab). Each bucket has a different fix: read docs, slow down and annotate constraints, practice elimination, or run a lab.
Use a baseline diagnostic early to set goals and timebox. Your diagnostic isn’t about memorizing items; it’s about discovering which domains are currently unstable. Beginners often find planning/management and knowledge mining are weaker than expected because they require Azure platform fluency (identity, networking, indexing pipelines). Set a realistic target score and a retest plan only if needed, but don’t schedule a retake as motivation—schedule study milestones as motivation.
Track weak areas with a simple spreadsheet: domain, sub-skill, symptom, corrective action, and re-test date. Then apply spaced repetition: revisit the same sub-skill 1 day, 3 days, 7 days, and 14 days after you fix it. The exam rewards durable recognition of patterns, not short-term cramming.
Exam Tip: When reviewing rationales, always write a one-sentence “why the correct answer is best” and a one-sentence “why each distractor fails.” This builds elimination speed, which is often the difference between passing and timing out on case studies.
1. You are planning your AI-102 preparation. Based on the exam’s scoring approach, which study strategy best aligns with how AI-102 evaluates candidates?
2. You are setting up a lab environment for AI-102 practice. You have a limited budget and want to avoid getting blocked mid-lab by platform limits. What is the best initial approach?
3. During a practice question, you see the requirement: “The solution must minimize operational overhead and support monitoring in production.” Which approach best matches how to interpret and answer AI-102 questions?
4. A beginner candidate has four weeks to prepare for AI-102 and struggles to stay consistent. Which plan best reflects the chapter’s recommended domain-by-domain approach and timeboxing?
5. You take a baseline diagnostic quiz and score below your target. What is the best next action aligned with the chapter’s goal-setting guidance?
AI-102 does not only test whether you can call an API. It tests whether you can operate an AI solution in Azure: choose the right services, secure them, control cost, deploy safely, and monitor the system over time. Many exam scenarios describe an “AI app” in business terms (support chatbot, document processing pipeline, image moderation workflow) and then ask you to pick architecture, identity, networking, or operations choices that meet constraints like “no public internet,” “use least privilege,” “minimize cost,” or “support dev/test/prod.”
This chapter maps to the exam objective of planning and managing an Azure AI solution. Expect scenario questions that force tradeoffs: speed vs. cost, simplicity vs. governance, and proof-of-concept vs. production readiness. The correct answer is usually the one that aligns with Azure’s recommended patterns: managed identity over keys, private networking over public endpoints for sensitive data, Azure Monitor over ad-hoc logging, and automated deployments over manual portal changes.
You’ll also see knowledge mining and retrieval-augmented generation (RAG) architectures show up indirectly here: storage + Azure AI Search indexing + enrichment + Azure OpenAI inference. Even when the prompt engineering is the “main feature,” the exam often scores you on how you secure data, prevent exfiltration, and manage quotas.
Practice note for each subsection in this chapter (designing solution architecture across Azure AI services; identity, network, and data security decisions for AI workloads; cost management, quotas, capacity planning, and scaling; deployment patterns for dev/test/prod and CI/CD concepts; and the exam-style practice set of governance and operations scenarios): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 expects you to recognize which Azure AI service best matches a requirement and why. Start by classifying the workload: (1) generative chat or content creation, (2) search and knowledge mining, (3) vision/OCR, (4) language understanding/extraction, or (5) orchestration/agentic workflows. For generative models, the managed path is Azure OpenAI, typically paired with Azure AI Search for RAG. For knowledge mining and enterprise search, Azure AI Search is the hub: it indexes content from sources such as Blob Storage, Azure SQL Database, and Azure Cosmos DB, and it can apply enrichment skillsets (OCR, entity recognition, key phrase extraction) during indexing.
For document-heavy scenarios, distinguish between “OCR only” and “structured extraction.” Azure AI Vision OCR handles text detection; Azure AI Document Intelligence (formerly Form Recognizer) targets forms, invoices, receipts, and layout extraction. For image understanding beyond OCR (tags, captions, object detection), Azure AI Vision fits. For classic NLP (classification, NER, summarization), Azure AI Language is the go-to—unless the scenario explicitly requires generative output with grounding, in which case Azure OpenAI + RAG is usually more aligned.
Exam Tip: When a prompt-based chatbot must answer using internal PDFs and must cite sources, the exam is usually steering you to Azure OpenAI + Azure AI Search (RAG) rather than “fine-tune a model” or “store documents in a database and prompt directly.”
Common trap: picking the most “powerful” service instead of the most direct fit. The exam often rewards minimizing complexity: if you only need OCR, don’t introduce an LLM. If you need search relevance and citations, don’t rely on blob listing or “prompt stuffing.”
Identity questions are frequent because they combine governance, security, and operational best practices. Azure AI services typically support key-based authentication, and many support Microsoft Entra ID (Azure AD) via role-based access control (RBAC). For production, prefer Entra ID because it enables least privilege, auditing, and easy rotation without embedding secrets. For workloads running in Azure (App Service, Functions, AKS, Logic Apps), use managed identity so the runtime can obtain tokens without storing credentials.
Understand the “who calls what” chain. Example: an Azure Function calls Azure AI Search and Azure OpenAI. With a system-assigned managed identity on the Function, you can grant RBAC roles to that identity on the Search service and the Azure OpenAI resource. Your app code then requests tokens; no API keys required. If the scenario involves developers or CI/CD pipelines, service principals or federated credentials (OIDC) may appear as the right choice to avoid long-lived secrets.
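A minimal sketch of this secretless chain, assuming an Azure-hosted runtime with a managed identity; the endpoint, index, deployment, and API version values are placeholders:

```python
# Secretless access: the managed identity supplies Entra ID tokens,
# so no API keys appear in code or configuration.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.search.documents import SearchClient
from openai import AzureOpenAI

credential = DefaultAzureCredential()  # resolves to the managed identity in Azure

# Data-plane call to Azure AI Search authenticated with a token.
search_client = SearchClient(
    endpoint="https://your-search.search.windows.net",  # placeholder
    index_name="your-index",                            # placeholder
    credential=credential,
)

# Azure OpenAI client that fetches tokens on demand instead of using a key.
token_provider = get_bearer_token_provider(
    credential, "https://cognitiveservices.azure.com/.default"
)
openai_client = AzureOpenAI(
    azure_endpoint="https://your-openai.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,
    api_version="2024-06-01",  # example API version
)
```

Grant the identity narrow data-plane roles (for example, Search Index Data Reader and Cognitive Services OpenAI User) rather than broad roles like Contributor.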
Exam Tip: If the stem says “avoid storing secrets,” “use least privilege,” or “centralize access control,” the best answer is typically managed identity + RBAC, not API keys in configuration.
Common trap: confusing RBAC with data-plane permissions. Some services require both management-plane permissions (create resources) and data-plane roles (query indexes, use models). On the exam, read for verbs like “deploy the resource” vs. “call the endpoint.” Another trap is granting overly broad roles (Owner/Contributor) when a narrower built-in role exists; least privilege is a consistent scoring theme.
AI-102 operations questions often hinge on whether data must stay off the public internet. In Azure, the standard pattern is to use Private Link (private endpoints) so clients connect to services over a private IP in a virtual network. For Azure AI Search, Azure OpenAI, Storage, and many related services, private endpoints reduce exposure and help meet compliance requirements. Combine this with disabling public network access when the scenario demands it.
Data protection includes encryption at rest (typically enabled by default) and encryption in transit (HTTPS/TLS). Exam items may ask how to manage customer-managed keys (CMK) using Azure Key Vault for services that support it. Also consider secrets handling: store keys/certificates in Key Vault, not in code or app settings—unless a managed identity removes the need for secrets altogether.
Exam Tip: If the question includes “must not traverse the public internet,” the answer is usually private endpoint + VNet integration (or equivalent private networking), not “IP restrictions” alone.
Common trap: mixing up “service endpoint” vs. “private endpoint.” Service endpoints extend VNet identity to Azure services but do not provide the same private IP-based isolation as Private Link. Another trap: assuming that disabling public access is always possible; some scenarios require a transitional architecture (e.g., dev public, prod private), which the exam may reward if it’s aligned to environment separation and governance.
Production AI solutions require observability: you must detect failures, performance regressions, and usage anomalies (including cost spikes). In Azure, the baseline is Azure Monitor: metrics for service health and latency, logs (often via Log Analytics workspace), and alerts routed to action groups. For app-level tracing, Application Insights is commonly paired with Functions/App Service to correlate requests across dependencies.
For Azure AI Search, monitor query latency, throttling, and indexer failures. For Azure OpenAI, watch request rates, token usage, and throttling/429 responses—then implement retry with exponential backoff. Reliability decisions also include designing for transient faults and regional issues: use retries, circuit breakers, and idempotency where appropriate. If the scenario mentions “business-critical” or “high availability,” you may need to consider multi-region design (where supported) and clear RTO/RPO expectations, but avoid inventing complex architectures when a single-region SLA meets stated requirements.
Exam Tip: When you see “intermittent 429” or “requests failing under load,” the exam usually wants throttling-aware client behavior (retry/backoff) and/or capacity planning—not just “increase timeout.”
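A sketch of that throttling-aware client behavior using the openai Python package; the deployment name is a placeholder and the retry budget is illustrative:

```python
# Retry with exponential backoff and jitter on throttled (429) requests.
import random
import time

from openai import AzureOpenAI, RateLimitError

client = AzureOpenAI()  # endpoint, auth, and API version from environment variables

def chat_with_backoff(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-deployment",  # placeholder deployment name
                messages=messages,
            )
        except RateLimitError:
            # Backoff grows 1s, 2s, 4s, ... with jitter to avoid thundering herds.
            time.sleep(2 ** attempt + random.random())
    raise RuntimeError("Retries exhausted; revisit quota or capacity planning")
```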
Common trap: logging sensitive content. The “best” operational answer often includes collecting enough telemetry to debug while applying data minimization: avoid storing full prompts or documents unless required, mask PII, and control access to logs using RBAC.
Cost is a first-class exam topic because AI workloads can scale unpredictably. Azure OpenAI is frequently constrained by quotas and token-based consumption; Azure AI Search costs depend on service tier/partitions/replicas; enrichment incurs additional compute. You must translate requirements into capacity choices: throughput, latency, index size, and concurrency. A good mental model: replicas scale query throughput (read), partitions scale index/storage and ingestion throughput (write). If the stem says “high query volume,” think replicas; if it says “large index” or “heavy ingestion,” think partitions.
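A quick worked example of that mental model, assuming the usual Azure AI Search billing unit of search units (replicas × partitions):

```python
# Search units (SUs) combine both scaling axes: replicas for query
# throughput (read), partitions for index size and ingestion (write).
replicas, partitions = 3, 2
search_units = replicas * partitions  # 6 SUs billed at the tier's SU rate
print(f"{replicas} replicas x {partitions} partitions = {search_units} SUs")
```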
Quotas show up as “deployment capacity” or rate limits. When usage exceeds quota, you’ll see throttling, and the right mitigation depends on the bottleneck: request a quota increase, implement batching, reduce tokens (shorter prompts, smaller context), cache responses, or move heavy processing offline (precompute embeddings, enrich at index time). For RAG, token cost is strongly driven by how much text you stuff into the prompt—improving retrieval quality and chunking strategy is a cost control mechanism, not only an accuracy tweak.
Exam Tip: If the scenario asks to “reduce cost without losing accuracy,” consider architectural changes like better retrieval (fewer chunks), caching, and using the smallest model that meets requirements—before jumping to “buy more capacity.”
Common trap: assuming “serverless” means “cheap.” A bursty workload can be expensive if each request is large (tokens) or triggers heavy enrichment. Read for constraints like “predictable workload” vs. “spiky traffic,” and match to reserved capacity/throughput planning where appropriate.
AI-102 expects you to treat AI resources as deployable infrastructure with controlled promotion across dev/test/prod. Environment separation reduces blast radius and supports compliance: separate resource groups/subscriptions, distinct keys/endpoints, and distinct data sets. Use Infrastructure as Code (IaC) such as Bicep or ARM templates (and commonly Terraform in real projects) to create repeatable deployments: AI Search services, indexes, data sources, indexers, OpenAI deployments, managed identities, and private endpoints.
CI/CD concepts appear as “how do you deploy safely?” The exam-friendly answer usually includes: version control, automated builds, automated tests, staged releases, and approvals for production. For AI systems, include configuration versioning (prompts, index schema, skillsets) and safe rollout patterns. Blue/green or canary deployments reduce risk: route a small percentage of traffic to the new version, monitor, then promote. Rollbacks must be planned—especially for schema changes (index fields) and model deployment changes (switching OpenAI deployments). A practical rollback approach includes keeping the prior index and deployment available until the new one is validated.
Exam Tip: If the stem mentions “avoid downtime” during index changes, prefer deploying a new index (v2), reindexing, then swapping the app to the new index—rather than modifying an index in-place.
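A minimal sketch of that swap pattern with the azure-search-documents package; the endpoint, index names, and fields are placeholders:

```python
# Zero-downtime index change: create docs-v2 alongside docs-v1, reindex,
# validate, then point the application's configuration at the new name.
from azure.identity import DefaultAzureCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchableField, SearchFieldDataType, SearchIndex, SimpleField,
)

index_client = SearchIndexClient(
    endpoint="https://your-search.search.windows.net",  # placeholder
    credential=DefaultAzureCredential(),
)

new_index = SearchIndex(
    name="docs-v2",  # deployed side by side with docs-v1
    fields=[
        SimpleField(name="id", type=SearchFieldDataType.String, key=True),
        SearchableField(name="content"),
    ],
)
index_client.create_index(new_index)

# After reindexing and validation, flip the app's index-name setting.
# Keep docs-v1 until docs-v2 is proven, so rollback is a config change.
```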
Common trap: manual portal changes in production. The exam often frames this as “fastest” but not “correct.” Look for answers that emphasize automation, traceability, and policy-based governance (e.g., Azure Policy to restrict public network access or enforce tags), aligning operational controls with the solution’s security and cost requirements.
1. A company is building a RAG-based support chatbot using Azure OpenAI, Azure AI Search, and Azure Storage. Security policy requires that no service endpoints are reachable over the public internet and that access follows least privilege. Which approach best meets the requirements?
2. You are planning capacity for an Azure OpenAI workload used by multiple internal applications. The business expects periodic spikes and wants to avoid production outages caused by throttling. What should you plan for first to reduce the risk of request failures during spikes?
3. A team manages dev/test/prod environments for an AI document processing pipeline (Storage → Azure AI Search indexing → Azure OpenAI summarization). They want consistent, repeatable deployments and to prevent configuration drift caused by manual portal changes. Which approach best aligns with exam-recommended practices?
4. A healthcare organization must ensure that only an Azure Function can read documents from a Storage account and call Azure AI Search indexing, without storing secrets in code. What is the best identity approach?
5. A company is running an AI content moderation workflow and wants centralized visibility into failures, latency, and request volume over time. They also need alerting when error rates exceed a threshold. Which solution best meets these operational requirements?
This chapter maps to the AI-102 skills measured around building production-grade generative AI apps with Azure OpenAI: choosing the right model and deployment, engineering prompts for reliable outputs, designing Retrieval-Augmented Generation (RAG) to ground responses, integrating tools/functions for agent-like behaviors, applying safety and responsible AI controls, and evaluating/optimizing for cost, latency, and quality. The exam frequently tests whether you can connect design decisions (model choice, context strategy, filters, caching) to real operational outcomes (accuracy, hallucination rate, throughput, cost per request, and data exposure risk).
Expect scenario questions that include partial constraints: “must not store prompts,” “needs citations,” “needs low latency,” “needs deterministic JSON,” or “budget is capped.” Your job is to recognize which Azure OpenAI features, prompt patterns, and RAG components address those constraints. A common trap is answering with a generic “use GPT-4” or “use RAG” without aligning to governance, token budgets, and safety controls. Use the sections below as your decision checklist.
Practice note for each subsection in this chapter (model selection and prompt engineering fundamentals for Azure OpenAI; RAG design covering grounding, retrieval, and context window management; safety, content filters, and responsible AI for generative apps; evaluation and optimization for latency, cost, quality, and regression testing; and the exam-style practice set on generative solution design and troubleshooting): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 expects you to understand Azure OpenAI as an Azure resource with one or more deployments (model + version + capacity configuration). Exam scenarios often distinguish between the “resource” (networking, identity, keys, private endpoints, logging) and the “deployment” (which model is served and how your app calls it). You typically choose a model family based on capability needs (reasoning quality, instruction following, multimodal), and then constrain cost/latency by selecting an appropriate size and response budget.
Model selection is not just “best model wins.” The exam tests tradeoffs: small/fast models for high-throughput classification or extraction; larger reasoning models for complex synthesis; and embedding models for semantic search/RAG indexing. Also consider whether you need vision input, strict structured outputs, or tool calling support in your chosen model/version.
Exam Tip: When a question mentions “capacity,” “throughput,” “quota,” or “multiple environments,” think in terms of separate deployments (dev/test/prod) and possibly separate resources for isolation. Answer options that only mention “switch the model in code” may be incomplete if the scenario needs governance, network isolation, or distinct monitoring boundaries.
Common trap: confusing Azure AI Search indexing/embeddings (offline or batch) with the chat completion runtime. On the exam, embeddings are typically generated during ingestion (or periodically refreshed), while chat completions are runtime and should be optimized for token usage and response size.
Prompt engineering fundamentals are heavily tested through “why is the model ignoring instructions?” or “how do I force JSON?” scenarios. You should know role hierarchy and intent: system messages define durable behavior (tone, rules, boundaries), user messages provide the task and data, and assistant messages can be used for few-shot demonstrations or to continue conversation state. The exam expects you to place policy-like instructions (safety boundaries, citation requirements, refusal behaviors) in the system message, not buried in the user prompt.
Few-shot prompting is a reliability tool: show 1–3 high-quality examples that match the desired format and edge cases (e.g., missing fields). This is especially important for extraction/classification tasks that must be consistent. For structured outputs, prefer explicit schemas and constraints (for example: “Return a JSON object with keys X, Y, Z; do not include extra keys; values must be strings.”). If the platform feature set includes structured output modes, use them to reduce format drift.
Exam Tip: If the scenario demands “machine-readable output,” choose answers that combine (1) explicit schema, (2) low temperature, and (3) validation/retry logic. Relying on “the model usually outputs JSON” is a classic exam trap; the correct design includes guardrails.
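A sketch combining those three guardrails; the deployment name is a placeholder, and the JSON response format flag assumes a model and API version that support it:

```python
# Machine-readable output: schema in the system message, temperature 0,
# JSON response mode where supported, then validate and retry once.
import json

from openai import AzureOpenAI

client = AzureOpenAI()  # endpoint and auth from environment

SYSTEM = (
    "Return a JSON object with exactly the keys 'vendor', 'date', and 'total'. "
    "All values must be strings. Do not include extra keys or any prose."
)

def extract_receipt(text: str, retries: int = 1) -> dict:
    for _ in range(retries + 1):
        resp = client.chat.completions.create(
            model="gpt-deployment",  # placeholder
            temperature=0,
            response_format={"type": "json_object"},
            messages=[{"role": "system", "content": SYSTEM},
                      {"role": "user", "content": text}],
        )
        try:
            data = json.loads(resp.choices[0].message.content)
            if set(data) == {"vendor", "date", "total"}:
                return data  # passed the schema check
        except (json.JSONDecodeError, TypeError):
            pass  # malformed output; fall through and retry
    raise ValueError("Model output failed schema validation")
```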
Common trap: putting critical rules in the user message while allowing user-provided text to override them. The exam often implies prompt injection attempts (“the document says ignore previous instructions”). Your best defense is layered: system policies, delimiters, and grounding to trusted content.
RAG design is a core AI-102 objective because it directly addresses hallucinations and enterprise knowledge needs. A standard RAG pipeline: ingest documents, chunk text, generate embeddings for chunks, store them in a vector index (often Azure AI Search), retrieve top-K relevant chunks at runtime, and provide those chunks as grounded context to the model. The exam tests whether you can tune chunking and retrieval to fit the model’s context window and still return accurate, cited answers.
Chunking: choose sizes that preserve meaning (headers + paragraphs) while enabling targeted retrieval. Overly large chunks waste tokens; overly small chunks lose context and reduce retrieval precision. Overlap can help maintain continuity, but too much overlap increases index size and cost.
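A simple overlapping chunker that makes the tradeoff concrete; the sizes are illustrative, not recommendations:

```python
# Fixed-size chunking with overlap: bigger chunks preserve context but
# cost tokens at query time; overlap avoids splitting ideas at boundaries.
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks
```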
Embeddings and vector search: embeddings represent semantic meaning; vector similarity finds conceptually related text even if keywords differ. Many scenarios use hybrid retrieval (keyword + vector) to handle both exact terms (policy numbers, product names) and semantic matches.
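A sketch of a hybrid query, assuming azure-search-documents 11.4 or later, a vector field named content_vector, and a hypothetical embed() helper that calls your embedding deployment; all names are placeholders:

```python
# Hybrid retrieval: keyword matching and vector similarity in one request.
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

search_client = SearchClient(
    endpoint="https://your-search.search.windows.net",  # placeholder
    index_name="docs-v1",                               # placeholder
    credential=DefaultAzureCredential(),
)

question = "data retention policy"
query_vector = embed(question)  # hypothetical embedding helper

results = search_client.search(
    search_text=question,  # keyword side catches exact terms and IDs
    vector_queries=[VectorizedQuery(
        vector=query_vector,
        k_nearest_neighbors=5,
        fields="content_vector",  # placeholder vector field name
    )],
    top=5,
)
for doc in results:
    print(doc["id"], doc["@search.score"])
```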
Citations: exams frequently require “show sources.” That implies you must pass document IDs/URLs/page numbers through the pipeline and instruct the model to cite retrieved sources only. Citations are also a safety control: they encourage answers grounded in retrieved evidence.
Exam Tip: If you see “must not answer from general knowledge” or “only answer from internal docs,” the correct solution includes (1) retrieval, (2) explicit instruction to use only retrieved content, and (3) a fallback behavior when retrieval confidence is low. A tempting wrong answer is “increase temperature/model size,” which does not solve grounding.
Agentic behaviors on AI-102 begin with “tool use”: the model decides when to call a function (tool) and returns structured arguments, while your application executes the tool and provides results back to the model. This pattern is used for database lookups, workflow automation (create ticket, send email), calculations, and calling Azure services (Search queries, storage retrieval, or internal APIs). The exam tests that you understand the boundary: the model proposes actions; your code enforces authorization, validation, and side-effect controls.
Function calling reduces brittle prompt parsing. Instead of asking the model to “output an API call,” you define a tool signature (name, description, JSON schema). The model returns a structured payload that you can validate. This is also a safety feature: you can block tools, require user confirmation, or enforce least privilege.
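A sketch of a tool contract and application-side dispatch using the openai package; get_order_status is a hypothetical internal function and the deployment name is a placeholder:

```python
# Tool calling: the model proposes a structured call; the application
# validates and executes it, keeping authorization in app code.
import json

from openai import AzureOpenAI

client = AzureOpenAI()  # endpoint and auth from environment

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the status of a customer order by ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-deployment",  # placeholder
    messages=[{"role": "user", "content": "Where is order 1234?"}],
    tools=TOOLS,
)

message = resp.choices[0].message
if message.tool_calls:  # the model may also answer without a tool call
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    if call.function.name == "get_order_status" and "order_id" in args:
        # Execute in app code after checking the caller's permissions.
        result = get_order_status(args["order_id"])  # hypothetical helper
```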
Exam Tip: If a scenario includes “the model executed an unintended action” or “prompt injection caused data exfiltration,” look for answers that add application-side controls: tool allow-lists, parameter validation, user confirmation, and RBAC checks. Answers that only say “improve the prompt” are usually insufficient because the exam wants defense in depth.
Common trap: treating tool outputs as trusted. Tool results should be treated like any external input—sanitize, constrain, and pass only what’s needed back into the model to limit token cost and exposure.
Generative apps must be safe by design, and AI-102 often frames this as “responsible AI + enterprise security.” You should recognize three layers: (1) platform safety features (content filtering), (2) grounding/RAG to reduce hallucinations, and (3) secure data handling (PII, secrets, retention, access control). Content filters typically apply to both prompts and completions; questions may describe blocked responses or false positives and ask what to tune (for example: adjusting filtering configuration, adding user messaging, or refining prompts and retrieval).
Grounding as a safety control: RAG plus strict instructions (“use only provided sources; cite them”) reduces unsupported claims. Pair this with abstention logic when evidence is missing.
Data handling: the exam expects least privilege, secure networking, and careful logging. Avoid storing sensitive prompts/responses unless necessary; if you must log, redact PII/secrets. Ensure that retrieved documents respect ACLs (Azure AI Search security trimming or application-side filtering). Mis-scoped retrieval is both a security incident and an exam pitfall.
Exam Tip: When you see “multi-tenant,” “role-based access,” or “internal HR/finance documents,” prioritize answers that enforce authorization at retrieval time (metadata filters, ACL trimming) rather than “post-filtering the final answer.” Post-filtering is too late because the model may have already seen unauthorized content.
Evaluation and optimization is where many candidates lose points because they focus only on “prompt quality” and ignore system metrics. AI-102 expects you to manage latency, throughput, and cost while preventing quality regressions. Define success metrics aligned to the task: grounded answer rate, citation correctness, factual consistency, extraction accuracy, refusal rate, and user satisfaction. For RAG, also track retrieval metrics (recall@K, precision@K, and how often the answer uses retrieved sources).
Regression testing: keep a fixed evaluation set of representative prompts/documents and compare outputs across prompt changes, model version changes, and retrieval tuning. Many production failures come from silent changes (new chunking strategy, updated embeddings, model upgrade) that degrade answers. The exam frequently hints at “it used to work” scenarios—choose answers that introduce repeatable tests and monitoring, not just ad-hoc prompt edits.
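A minimal regression harness along those lines, assuming a fixed eval set and a hypothetical answer_uses_sources scorer you define for your pipeline:

```python
# Regression testing: run a fixed eval set through the RAG pipeline and
# fail the build if the grounded-answer rate drops below a threshold.
EVAL_SET = [
    {"question": "What is the refund window?", "source_ids": ["policy-12"]},
    # ...more representative prompts with their expected source documents
]

def run_regression(pipeline, threshold: float = 0.9) -> None:
    grounded = 0
    for case in EVAL_SET:
        answer, cited_ids = pipeline(case["question"])
        # answer_uses_sources is a hypothetical scorer for grounding checks.
        if answer_uses_sources(answer, cited_ids, case["source_ids"]):
            grounded += 1
    rate = grounded / len(EVAL_SET)
    assert rate >= threshold, f"Grounded rate {rate:.0%} is below {threshold:.0%}"
```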
Latency and cost levers: reduce tokens (shorter context, smaller max output), cache frequent results (especially for deterministic system prompts + identical retrieval), and consider model right-sizing. Caching is especially effective for stable FAQs or repeated tool results, but be careful with personalization and access control.
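A sketch of response caching keyed on the stable parts of a request; include user or tenant identifiers in the key whenever responses are personalized or access-controlled:

```python
# Cache completions for identical (prompt version, question, retrieval) inputs.
import hashlib
import json

_cache: dict[str, str] = {}

def cache_key(prompt_version: str, question: str, chunk_ids: list[str]) -> str:
    payload = json.dumps([prompt_version, question, sorted(chunk_ids)])
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_answer(key: str, generate) -> str:
    if key not in _cache:
        _cache[key] = generate()  # call the model only on a cache miss
    return _cache[key]
```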
Exam Tip: If the scenario says “cost spiked” or “token usage is high,” the correct answer usually combines (1) context reduction (better retrieval/chunking), (2) output limits, and (3) caching. A common wrong choice is “increase context window,” which can worsen both cost and latency unless paired with better retrieval discipline.
Finally, treat evaluation as continuous: monitor production for drift (new document formats, new user intents) and update chunking, prompts, and safety thresholds with controlled rollouts and regression checks.
1. You are building an API that uses Azure OpenAI to generate receipts in a strict JSON schema (no extra keys). The calling system rejects responses that contain non-JSON text. You must minimize format errors without adding significant latency. Which approach should you implement first?
2. A support chatbot must answer questions using an internal policy library and include citations. The policy corpus is 50,000 documents and changes daily. Users also ask follow-up questions in the same chat. You need to reduce hallucinations while staying within a limited context window. Which design best fits?
3. You are deploying a generative assistant for a bank. The app must block self-harm instructions and hate content, and it must not reveal customer PII in responses. The business requires consistent enforcement across prompts, regardless of user attempts to jailbreak. Which solution should you prioritize?
4. A team reports that their RAG-based assistant sometimes answers using outdated information even though the latest documents are in the index. They also notice that retrieved passages are correct, but the model ignores them and responds from general knowledge. What change is most likely to improve grounding to the retrieved content?
5. You operate a generative summarization service with strict cost and latency SLOs. After a prompt change, average tokens per request increased by 40%, cost rose, and P95 latency degraded. You need a process that detects this kind of degradation before production. What should you implement?
This chapter targets the AI-102 skills measured around building agentic applications: systems that can plan, call tools, observe outcomes, and iterate toward a goal while operating under constraints. On the exam, you’ll be tested less on “agent hype” and more on whether you can select the right architecture and controls: tool/function contracts, orchestration choices, memory/state design, and reliability and governance mechanisms that keep an agent safe, auditable, and cost-effective.
A practical way to frame an agent (and a common exam lens) is: Goal → Plan → Act (tool calls) → Observe (tool results) → Update state → Repeat. Your job is to implement that loop with the right boundaries: limiting tool permissions, validating tool inputs/outputs, handling errors, and deciding what the agent should remember. Many exam scenarios disguise these choices as “which component should you use?” questions—often between orchestration logic in your app vs. model-driven planning, and between short-lived session context vs. durable storage.
As you read, keep translating requirements into design decisions: Is the task deterministic or open-ended? Do we need approvals? What data can be stored? What failure modes matter? Those are the levers the exam expects you to pull.
Practice note for each subsection in this chapter (agent design covering goals, plans, tools, and constraints; orchestration and tool/function integration patterns; memory, state, and knowledge access for multi-step tasks; reliability engineering with guardrails, retries, and human-in-the-loop; and the exam-style practice set on agent workflows and control scenarios): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Agent design starts with specifying goals (what “done” means), plans (the intermediate steps), tools (what actions are allowed), and constraints (policies, time/cost limits, and data boundaries). The exam frequently probes whether you can separate a model’s generative reasoning from the application’s responsibility for control. In practice, an agent loop includes: (1) interpret user intent, (2) draft a plan or next action, (3) call a tool, (4) observe results, and (5) decide whether to continue or stop.
Two design choices show up often. First, who owns the plan: a “model-plans” agent lets the LLM propose steps, but your app still enforces constraints and validates outputs. A “workflow-first” agent uses a predefined state machine where the LLM fills in slots (like parameters), which is easier to test and safer for regulated workloads. Second, termination: you need clear stop conditions (goal achieved, max steps, max cost, or human approval required). Many production incidents come from missing termination logic—on the exam, that maps to “limit iterations,” “timeouts,” and “budgets.”
Exam Tip: If a scenario emphasizes compliance, predictability, or auditability, prefer explicit workflow/state-machine orchestration and constrained toolsets over free-form autonomous planning.
Common trap: assuming the model can “remember” or “self-correct” without external state. On AI-102, memory and state are explicit engineering tasks; the loop must persist what matters and discard what doesn’t.
Tools/functions are how an agent affects the world: calling APIs, querying Azure AI Search, writing tickets, sending emails, or executing business operations. The exam focuses on correct contracts: define what the function does, required parameters, parameter types, allowed ranges, and what errors look like. If you use function calling, treat the schema as an interface specification, not documentation—tight schemas reduce hallucinated parameters and make tool calls reliably parseable.
Error handling is an agent’s “muscle.” You should implement: input validation before calling tools, output validation after tool returns, and well-defined retry behavior. A robust pattern is: (1) attempt tool call, (2) classify errors (transient vs. permanent), (3) retry with backoff for transient failures, and (4) fall back or escalate for permanent errors. For example, a 429/503 suggests backoff; a 400 with schema mismatch suggests you must reformat parameters rather than retry blindly.
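A sketch of that classify-then-act pattern; TransientToolError and PermanentToolError are illustrative exception types you would raise from your tool layer:

```python
# Classify tool failures: retry transient ones with backoff; surface
# permanent ones to the orchestration layer instead of retrying blindly.
import time

class TransientToolError(Exception):
    """For example, 429/503 responses from the tool's API."""

class PermanentToolError(Exception):
    """For example, a 400 caused by a parameter schema mismatch."""

def call_tool_with_policy(tool, args: dict, max_retries: int = 3):
    for attempt in range(max_retries):
        try:
            return tool(**args)
        except TransientToolError:
            time.sleep(2 ** attempt)  # backoff, then retry
        except PermanentToolError as err:
            # Don't retry: reformat arguments, ask the user, or escalate.
            return {"error": str(err), "action": "escalate"}
    return {"error": "transient failures exhausted", "action": "escalate"}
```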
Exam Tip: When you see “tool returns malformed data” or “agent calls API with wrong fields,” the best answer usually includes strict JSON schema, server-side validation, and safe failure behavior (don’t let the model invent missing required values).
Common trap: letting the model “handle” exceptions in natural language only. The correct engineering approach is to surface structured error details to the orchestration layer, then decide whether to retry, ask the user for missing inputs, or route to human review.
Orchestration is the control plane for agent behavior: deciding which model prompt, which tool, and which sequence of steps to use. AI-102 questions often present a requirement like “support multiple task types” or “use specialist skills” and ask you to choose between a single general agent, a router pattern, or a multi-agent design. A single-agent approach is simpler and easier to govern; you add tools and instructions but keep one loop. A router pattern classifies the request and dispatches it to a specialized workflow (e.g., “search-and-summarize” vs. “create-support-ticket”). A multi-agent pattern delegates subproblems to specialized agents (planner, researcher, writer), but increases complexity, latency, and failure surface area.
Delegation requires explicit boundaries: what context each agent receives, what outputs are accepted, and who has authority to call sensitive tools. A strong exam answer typically mentions least privilege: the “writer” agent shouldn’t have permission to send emails; only the “action” agent does after approvals.
Exam Tip: If the prompt mentions “different departments,” “different policies,” or “different tool permissions,” think router or multi-agent with separated tool scopes—not one omnipotent agent.
Common trap: choosing multi-agent when a router-to-workflows solution is safer. On the exam, prefer the simplest architecture that meets requirements, especially when governance and predictability are emphasized.
State answers: “What does the agent need to know right now to finish the task?” Memory answers: “What can it reuse later?” AI-102 expects you to distinguish short-lived session context (conversation turns, current plan, last tool results) from long-term memory (user preferences, prior cases, durable facts), and to handle privacy correctly.
Session state is typically stored in your app tier or a low-latency store and should be bounded: keep only what’s needed for the next steps to control token costs and reduce leakage risk. Long-term memory should be explicit and justifiable: store stable preferences (“use metric units”), prior decisions, or curated summaries—not raw transcripts by default. When you do store content, consider encryption, access control, retention policies, and user consent.
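A minimal sketch of bounded session state, assuming a rough characters-per-token estimate; the budget and the stored summary are illustrative. Older turns are represented by a curated summary rather than replayed verbatim.

```python
TOKEN_BUDGET = 2000   # illustrative cap on context size

def rough_tokens(text: str) -> int:
    return max(1, len(text) // 4)   # crude ~4-chars-per-token estimate

def bound_context(turns: list[str], summary: str) -> list[str]:
    """Keep the newest turns that fit the budget; summary covers the rest."""
    kept, used = [], rough_tokens(summary)
    for turn in reversed(turns):             # newest first
        cost = rough_tokens(turn)
        if used + cost > TOKEN_BUDGET:
            break                            # older turns stay summarized
        kept.append(turn)
        used += cost
    return [summary] + list(reversed(kept))
```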
Exam Tip: If the stem mentions PII, regulatory requirements, or “should not store user prompts,” the correct design is usually ephemeral session state + redaction + minimal retention, with long-term memory limited to user-approved, non-sensitive summaries.
Common trap: conflating memory with RAG. Retrieval is for authoritative external knowledge; memory is for user- or agent-specific continuity. On the exam, pick retrieval for “latest policies” and memory for “user’s recurring preferences,” but apply strict governance to both.
Agent safety is primarily about controlling actions. A chatbot that only answers questions has a different risk profile than an agent that can modify data, send messages, or trigger workflows. AI-102 scenarios frequently ask how to prevent unauthorized actions, data exfiltration, or prompt injection leading to dangerous tool calls. Your design should implement least privilege at multiple layers: tool allowlists, API scopes, managed identities, and role-based access control on underlying resources.
Policy enforcement should be externalized where possible: use centralized policy and approvals rather than embedding rules only in prompts. For sensitive operations (e.g., “delete records,” “send to all users,” “approve refund”), implement human-in-the-loop approvals or multi-step confirmations. Additionally, constrain tools with server-side checks (e.g., user can only access their tenant’s data) so even if the model is manipulated, the tool cannot overreach.
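A minimal sketch of an approval-gated, server-side-checked tool; the threshold, scope names, and in-process identity dict are illustrative stand-ins for managed identity and RBAC. The key property: even a manipulated model cannot make this tool overreach.

```python
APPROVAL_THRESHOLD = 500.0   # illustrative policy value

def issue_refund(user: dict, amount: float, order_tenant: str,
                 approved_by: str | None = None) -> dict:
    # server-side authorization: the tool checks, not the prompt
    if order_tenant != user["tenant"]:
        raise PermissionError("cross-tenant access denied")
    if "refund.write" not in user["scopes"]:
        raise PermissionError("caller lacks refund.write scope")
    if amount > APPROVAL_THRESHOLD and approved_by is None:
        # human-in-the-loop: park the request instead of executing
        return {"status": "pending_approval", "amount": amount}
    audit = {"actor": user["id"], "amount": amount,
             "approved_by": approved_by}     # every call is auditable
    return {"status": "executed", "audit": audit}
```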
Exam Tip: If a question mentions “agent should not execute unless…”, the best answer usually combines (1) explicit approval gates, (2) permission-scoped credentials, and (3) server-side authorization checks—prompts alone are not sufficient controls.
Common trap: overreliance on content filters for action safety. Content safety helps with outputs; governance for agents must also control capabilities and permissions.
Reliability engineering for agents means making behavior repeatable enough to trust in production. The exam focuses on guardrails (schema validation, tool constraints), retries/backoff, monitoring, and a basic evaluation strategy. You should test: (1) correctness of tool selection, (2) correctness of tool arguments, (3) grounding/citation behavior when using retrieval, and (4) refusal/approval behavior for restricted actions. Automated evals can measure task success rates and policy compliance across a regression suite of representative prompts.
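A minimal regression-eval sketch; the suite, the agent callable, and the citation convention are illustrative assumptions. Each check is binary and automatable, so the pass rate can be tracked per release.

```python
SUITE = [
    {"prompt": "What does the travel policy say about taxis?",
     "expect_tool": "SearchDocs", "require_citation": True},
    {"prompt": "Refund order 123 for $20",
     "expect_tool": "IssueRefund", "require_citation": False},
]

def evaluate(agent) -> float:
    """agent(prompt) returns a dict with 'tool', 'args', and 'answer'."""
    passed = 0
    for case in SUITE:
        result = agent(case["prompt"])
        ok = (result["tool"] == case["expect_tool"]            # tool selection
              and isinstance(result.get("args"), dict)         # schema-valid args
              and (not case["require_citation"]
                   or "[source:" in result["answer"]))         # grounding check
        passed += ok
    return passed / len(SUITE)   # task-success rate for the regression suite
```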
Red-teaming basics are also relevant: attempt prompt injection, jailbreaks, and data-leak scenarios, and verify the agent fails safely (refuses, requests approval, or limits tool scope). When failures occur, incident response should include rapid rollback (feature flags), tighter tool permissions, updated allowlists/denylists, and expanding test cases to prevent recurrence.
Exam Tip: If the problem statement includes “intermittent failures” or “rate limits,” expect answers involving retries with exponential backoff, circuit breakers, and idempotent tool design. If it includes “unexpected actions,” expect approval gates, tool allowlists, and better logging/auditing.
Common trap: treating evaluation as subjective. The exam leans toward measurable checks: schema validity, citation presence, policy decisions, and tool-call success—things you can automate and monitor continuously.
1. A company is building an agent that can create support tickets. Requirements: the agent must call a "CreateTicket" tool, but only after validating required fields and ensuring the user explicitly confirmed the ticket submission. Which design best meets the requirement?
2. You are implementing a multi-step agent workflow: (1) query an internal catalog, (2) compute a quote, (3) generate an email. The agent must persist progress so that if the process fails after step 2, it can resume without repeating expensive tool calls. What should you use?
3. A team is integrating several tools into an agent: "SearchDocs", "GetCustomerRecord", and "UpdateCustomerRecord". Security policy requires least privilege and prevention of unintended data changes. Which approach best aligns with this requirement?
4. An agent uses a function/tool calling pattern to fetch inventory counts. During peak hours, the inventory API intermittently returns HTTP 429 (rate limit). The business requirement is to increase success rate while controlling costs and preventing infinite loops. What should you implement?
5. A company wants an agent that answers HR policy questions. Requirements: the agent must cite the most current policy and avoid storing employee questions containing personal data beyond the active session. Which design best fits?
This chapter targets the AI-102 skills that repeatedly show up in scenario questions: choosing the right vision capability (image analysis vs OCR vs document extraction), implementing NLP tasks (extraction, classification, summarization), and building knowledge mining pipelines with Azure AI Search (indexing, enrichment, skillsets). The exam rarely asks you to recite API names; it asks you to design an end-to-end solution under constraints like latency, cost, governance, security, and accuracy—then identify the best Azure service combination.
As you read, keep a “pipeline mindset.” Most real solutions on the exam chain together: content ingestion (Blob/ADLS/SharePoint), extraction (OCR/document intelligence), enrichment (language/vision skills), indexing (keyword + vector), and then an app layer that queries Search or orchestrates tools via an agent. Traps typically involve picking a model/service that can’t meet the input format, throughput, or retrieval requirement.
Exam Tip: When the scenario mentions “search across documents” with “filters/facets,” think Azure AI Search first. When it mentions “extract structured fields from forms,” think Document Intelligence (formerly Form Recognizer). When it mentions “describe an image / detect objects,” think Azure AI Vision image analysis. Then decide whether you need real-time calls, batch enrichment, or both.
Practice note for Computer vision solutions: image analysis and OCR design choices: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for NLP solutions: extraction, classification, summarization, and conversation: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Knowledge mining with Azure AI Search: indexing and enrichment pipelines: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Information extraction workflows: documents, entities, and metadata at scale: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Exam-style practice set: CV/NLP/Search integrated scenarios: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 tests whether you can map a vision requirement to the correct capability and output type. “Image analysis” tasks include generating captions, tags, object detection, and sometimes dense captions depending on the feature set. “OCR” is specifically about extracting text and layout from images or scanned documents. “Document flows” usually imply multi-page PDFs, forms, invoices, receipts, or contracts where you must turn unstructured pages into structured fields and searchable content.
Design starts with the input and the expected output. If the app needs searchable text from images, OCR is non-negotiable. If the app needs key-value pairs (invoice number, totals, vendor), use a document extraction approach (Document Intelligence prebuilt models or custom models) rather than trying to regex raw OCR. For photos where the user wants “what’s in the picture,” image analysis is the better match.
Common trap: Selecting image analysis to “extract text from a PDF.” The exam expects you to recognize that OCR/document extraction is needed, and that PDFs may require a document-capable pipeline (including page handling) rather than a single-image endpoint.
Exam Tip: In scenario stems, keywords like “scanned,” “handwritten,” “invoice,” “receipt,” “multi-page,” “tables,” and “key-value pairs” point away from generic image analysis and toward OCR/document intelligence. Keywords like “detect people/vehicles,” “describe scene,” or “generate tags” point toward image analysis.
Many AI-102 questions are architecture decisions disguised as feature questions. You must decide whether to run vision in real time (synchronous API call as the user uploads) or batch (asynchronous processing of a backlog). Real-time is appropriate when user experience requires immediate feedback—e.g., a mobile app reading text from a package label. Batch is appropriate when processing large archives—e.g., back-scanning a million PDFs for knowledge mining.
Latency and throughput drive the design. For batch pipelines, you often stage raw content in Azure Blob Storage or ADLS Gen2, trigger processing via Event Grid/Functions, and write extracted artifacts (text, JSON fields, thumbnails) back to storage for later indexing. For real-time, you may still store the original content for audit/reprocessing, but your primary success metric is request latency and reliability (timeouts, retries, idempotency).
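A batch-enrichment sketch, assuming the azure-storage-blob and azure-ai-formrecognizer packages; the container names, model choice, and connection placeholders are illustrative. Note that originals are kept and referenced, not discarded.

```python
import json
from azure.core.credentials import AzureKeyCredential
from azure.storage.blob import ContainerClient
from azure.ai.formrecognizer import DocumentAnalysisClient

raw = ContainerClient.from_connection_string("<conn-str>", "raw-invoices")
out = ContainerClient.from_connection_string("<conn-str>", "extracted-json")
di = DocumentAnalysisClient("<endpoint>", AzureKeyCredential("<key>"))

for blob in raw.list_blobs():
    content = raw.download_blob(blob.name).readall()
    poller = di.begin_analyze_document("prebuilt-invoice", content)
    result = poller.result()
    fields = {k: str(v.value) for doc in result.documents
              for k, v in doc.fields.items()}
    # keep the original blob path so the index can cite and reprocess it
    out.upload_blob(blob.name + ".json",
                    json.dumps({"source": blob.name, "fields": fields}),
                    overwrite=True)
```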
Common trap: Treating OCR output as “the source of truth” and discarding originals. Exam scenarios often include compliance, audit, or reprocessing requirements; keeping original documents (and associating them with extracted metadata) is a safer design choice.
Exam Tip: If the stem mentions “cost,” “large backlog,” “daily ingestion,” or “indexing pipeline,” prefer batch enrichment. If it mentions “interactive,” “upload and instantly,” “mobile,” or “live camera,” prefer synchronous calls—then mitigate risk with timeouts, fallbacks, and storing for later reprocessing.
NLP questions on AI-102 usually revolve around picking the right extraction or classification technique and knowing what the output is used for. Key phrase extraction produces topical phrases suitable for tagging and search facets. Entity recognition extracts typed entities (people, organizations, locations, dates, product identifiers) that can drive filtering, linking, or compliance checks. Classification assigns labels to text (e.g., “refund request,” “technical issue,” “legal hold”). The exam expects you to connect the NLP result to downstream actions: routing tickets, indexing fields, or triggering workflows.
For implementation, the typical pattern is: collect text (from documents or chat logs), clean/normalize it, run an NLP model, then persist outputs in a structured store or search index. In knowledge mining scenarios, those outputs become searchable fields and facets. In operational apps, those outputs become decision inputs (queue selection, priority, escalation).
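A minimal sketch of that collect, analyze, persist pattern, assuming the azure-ai-textanalytics package; the endpoint, key, and persistence target are illustrative.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient("<endpoint>", AzureKeyCredential("<key>"))

docs = ["The Contoso invoice from 3 May is missing a PO number."]
key_phrases = client.extract_key_phrases(docs)[0].key_phrases
entities = [(e.text, e.category)
            for e in client.recognize_entities(docs)[0].entities]

# persist as structured fields, ready to become search facets/filters
record = {"text": docs[0], "tags": key_phrases, "entities": entities}
print(record)
```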
Common trap: Confusing summarization with key phrase extraction. Summaries are narrative compressions; key phrases are short tags. If the requirement is “generate tags” or “improve search facets,” key phrases/entities are the better fit.
Exam Tip: Look for the verb in the requirement: “tag,” “extract,” “identify,” “detect” → extraction. “route,” “categorize,” “assign label” → classification. Then verify the data shape needed by the next component (Search fields, filters, or workflow rules).
AI-102 increasingly blends classic NLP tasks with Azure OpenAI prompting patterns. On the exam, “conversational patterns” often mean a chat experience that uses tools: retrieve knowledge, call a function, summarize results, and maintain state. Summarization use cases include meeting notes, customer support transcripts, case histories, or long documents where users need a brief synopsis plus citations.
A reliable pattern is: (1) retrieve relevant content (often via Azure AI Search), (2) ground the model with retrieved passages, (3) instruct the model to summarize with constraints (length, sections, tone), and (4) optionally run lightweight NLP to extract structured items (action items, entities, sentiment) for reporting. The exam aims to see that you can combine prompt design with deterministic post-processing when the output must be structured.
Common trap: Using only a prompt to produce “structured” results without validation. If the scenario emphasizes auditability, downstream automation, or strict schemas, the better answer usually includes function calling/JSON schema enforcement and/or post-processing plus retries.
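A minimal prompt-plus-validation sketch, assuming the openai package pointed at an Azure OpenAI deployment and an API version that supports JSON-mode output; the deployment name and retry budget are illustrative.

```python
import json
from openai import AzureOpenAI

client = AzureOpenAI(azure_endpoint="<endpoint>", api_key="<key>",
                     api_version="2024-06-01")

def summarize(passages: list[str], max_attempts: int = 3) -> dict:
    prompt = ("Summarize ONLY from the sources below. Return JSON with keys "
              "'summary' and 'citations' (a list of source indexes).\n\n"
              + "\n".join(f"[{i}] {p}" for i, p in enumerate(passages)))
    for _ in range(max_attempts):
        resp = client.chat.completions.create(
            model="<deployment-name>",
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"},   # constrain to JSON
        )
        data = json.loads(resp.choices[0].message.content)
        # deterministic post-processing: validate before anything downstream
        if (isinstance(data.get("summary"), str)
                and isinstance(data.get("citations"), list)):
            return data
    raise ValueError("model never produced schema-valid output")
```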
Exam Tip: When you see “conversation must remember context,” separate “short-term context” (chat history window) from “long-term memory” (stored summaries, embeddings, or key facts). The correct design frequently summarizes older turns and stores them, rather than passing the entire conversation every time (cost/latency/token limits).
Also watch for safety and policy requirements: if the bot interacts with users, the exam may require content filtering, PII handling, or restricted tool access. Those are controls that sit around the conversational workflow, not after the fact.
Knowledge mining is a high-frequency AI-102 scenario: ingest heterogeneous content, enrich it, and make it searchable. Azure AI Search is the core service, and the exam expects you to understand index design and ingestion choices. Indexes contain fields with attributes that control behavior (searchable, filterable, sortable, facetable, retrievable). A common objective is to enable both keyword search (lexical match, filters, facets) and vector search (semantic similarity using embeddings). Many solutions need a hybrid approach: keyword for precision and compliance queries (“find ‘termination for cause’”), vector for “find similar” and natural-language questions.
Ingestion typically uses indexers that connect to data sources like Blob Storage, ADLS Gen2, or databases. Indexers can run on schedules for incremental updates. When the content is documents, you often split text into chunks before embedding, so each chunk becomes a searchable unit with its own vector and metadata (document id, page, section, security labels).
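An index-design sketch, assuming azure-search-documents 11.4+; the field names, vector dimensions, and profile name are illustrative, and a matching VectorSearch configuration would accompany the index. Field attributes are what make facets and filters possible later.

```python
from azure.search.documents.indexes.models import (
    SearchIndex, SimpleField, SearchableField, SearchField, SearchFieldDataType,
)

fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),
    SearchableField(name="content", type=SearchFieldDataType.String),
    SimpleField(name="vendor", type=SearchFieldDataType.String,
                filterable=True, facetable=True),          # facet + filter
    SimpleField(name="invoice_date", type=SearchFieldDataType.DateTimeOffset,
                filterable=True, sortable=True),           # date-range filter
    SearchField(name="embedding",
                type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
                searchable=True, vector_search_dimensions=1536,
                vector_search_profile_name="default-profile"),
]
index = SearchIndex(name="invoices", fields=fields)
# a VectorSearch config with the "default-profile" profile is also required
```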
Common trap: Proposing “just use a database” for enterprise search requirements. If the scenario includes facets, relevance ranking, full-text queries, and enrichment pipelines, Azure AI Search is the expected platform component.
Exam Tip: If the stem mentions “permissions per document,” look for security trimming patterns (indexing ACLs, filtering by user/group claims) and ensure your index has filterable fields to enforce access at query time.
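A hybrid-query sketch with query-time security trimming, again assuming azure-search-documents 11.4+; the field names, group-claim convention, and embedding input are illustrative.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

search = SearchClient("<endpoint>", "invoices", AzureKeyCredential("<key>"))

def query(text: str, embedding: list[float], user_groups: list[str]):
    # security trimming: only documents tagged with the user's groups
    group_filter = " or ".join(
        f"allowed_groups/any(g: g eq '{g}')" for g in user_groups)
    return search.search(
        search_text=text,                                   # keyword precision
        vector_queries=[VectorizedQuery(vector=embedding,   # semantic recall
                                        k_nearest_neighbors=5,
                                        fields="embedding")],
        filter=group_filter,                                # trim at query time
        facets=["vendor"],
    )
```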
This is where many candidates lose points: knowing that Azure AI Search can do more than ingest text—it can run enrichment via skillsets and shape output via projections. A skillset is a pipeline of cognitive skills (built-in and custom) applied during indexing: OCR, language detection, entity recognition, key phrase extraction, text splitting, embedding generation, and custom Web API skills. The output is an enriched document tree that you must map into index fields.
Projections matter when you need to index at “chunk level” for RAG: instead of one index document per file, you project child items (pages/sections/chunks) into separate index documents with shared metadata. This improves retrieval quality and citation granularity. The indexing strategy should also preserve traceability: store references to the original blob path, page number, and offsets so you can show citations and re-run enrichment when models or schemas change.
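A chunk-projection sketch in plain Python; the id scheme and chunk size are illustrative. Each chunk document carries lineage (parent id, path, page, offset) so results can cite precisely and enrichment can be re-run when models or schemas change.

```python
def to_chunk_documents(doc_id: str, blob_path: str, pages: list[str],
                       chunk_size: int = 1000) -> list[dict]:
    chunks = []
    for page_no, page_text in enumerate(pages, start=1):
        for i in range(0, len(page_text), chunk_size):
            chunks.append({
                "id": f"{doc_id}-p{page_no}-c{i // chunk_size}",
                "parent_id": doc_id,        # shared metadata for the file
                "source_path": blob_path,   # lineage for citations
                "page": page_no,
                "offset": i,
                "content": page_text[i:i + chunk_size],
            })
    return chunks   # upload with SearchClient.upload_documents(chunks)
```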
Common trap: Indexing only the final summary and discarding extracted entities/metadata. Exam scenarios often require faceted navigation (“filter by vendor,” “filter by date range,” “show entities”), which needs structured fields, not just a summary blob.
Exam Tip: If the question emphasizes “at scale” and “documents,” think end-to-end: OCR/document extraction → enrichment skillset → projections for chunks → index fields for filters/facets → hybrid vector+keyword queries. The best answer is usually the one that maintains lineage (document id/page) and enables both precision filtering and semantic retrieval.
1. A company stores millions of scanned PDF invoices in Azure Blob Storage. Users must search across invoices by vendor name, invoice number, and total amount, and refine results using facets (for example, vendor) and filters (for example, date range). You must extract these fields from the documents and make them searchable with low operational overhead. Which solution best meets the requirements?
2. You are designing an API that receives a photo taken from a mobile device and must return a short description and detected objects in under 500 ms for an interactive user experience. The images can include street scenes and products. Which approach should you choose?
3. A support team wants to automatically route incoming customer emails into one of five categories (Billing, Technical, Account, Sales, Other). The model must return a single label per email and will be called in real time from a ticketing system. Which NLP capability best fits the requirement?
4. You are implementing a knowledge mining solution for an engineering portal. Content includes Word documents and PDFs in SharePoint and Azure Blob Storage. Users need keyword search plus semantic/vector search over extracted text, and results must include metadata filters such as project, author, and last modified date. Which design best meets these requirements?
5. A company processes 500,000 mixed documents per day (scanned forms, printed letters, and handwritten notes). They must extract structured fields from forms when possible, and otherwise extract entities (people, locations, dates) from the free text. The solution must be scalable and support downstream search and analytics. Which workflow is most appropriate?
This chapter is your transition from learning to scoring. AI-102 doesn’t reward trivia as much as it rewards correct engineering decisions under constraints: security boundaries, latency, cost, reliability, and governance. The goal of the full mock exam is not to “see what you get,” but to surface patterns: what you misread, what you over-assume, and what you fail to connect across domains (Azure OpenAI + AI Search + monitoring + identity).
Use this chapter as a guided rehearsal. You’ll run two mock blocks, perform weak-spot analysis, and finish with rapid recall drills mapped to exam outcomes: planning and managing Azure AI solutions; implementing generative AI solutions and RAG patterns; building agentic orchestration with tools/functions and memory; implementing CV and NLP solutions; and implementing knowledge mining with Azure AI Search indexing and enrichment.
Exam Tip: Treat every item like a real customer ticket. Ask: “What is the constraint?” (data residency, private network, cost ceiling, safety requirement, latency, multilingual input). The correct answer is usually the option that satisfies the constraint with the least added complexity—unless the prompt explicitly demands customization, deterministic output, or strict isolation.
Practice note for Mock Exam Part 1: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Mock Exam Part 2: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Weak Spot Analysis: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Exam Day Checklist: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Final domain review and rapid recall drills: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI-102 questions commonly blend design and implementation details. You’ll see scenario prompts that implicitly test: identity and network access, data flow (ingest → enrich → index → retrieve), evaluation and safety controls, and operational readiness (monitoring, cost, deployment). Your pacing must prevent over-investing in early items while still reading carefully enough to catch constraints.
Run your mock in two passes. Pass 1 is “efficient correctness”: answer what you can in 60–90 seconds each and flag anything that requires rereading, computation, or comparing two nearly-correct services. Pass 2 is “constraint resolution”: return to flagged items and resolve them by mapping each requirement to a feature (e.g., private endpoint support, managed identity, content filtering, semantic ranker, vector search, skillset enrichment).
Exam Tip: When two options both “work,” choose the one that uses the Azure-native service intended for that outcome and minimizes custom glue code. AI-102 frequently rewards service-fit (e.g., Azure AI Search skillsets for enrichment rather than hand-rolled ETL, unless you need bespoke transformations).
Common pacing trap: rereading the entire scenario repeatedly. Instead, write a one-line “constraint summary” on scratch paper (e.g., “private, PII, citations, low latency, multilingual”). Use it to test each option quickly.
Mock Exam Part 1 should feel like a cross-domain sprint: short scenarios that jump between Azure OpenAI, Azure AI Search, Vision, Language, and operational controls. Your task is to build “service instinct”—the ability to map a requirement to the correct product feature without second-guessing.
Expect mixed-domain prompts such as: deploying a RAG app with governance requirements; choosing between document-level vs chunk-level retrieval; configuring vector search + semantic ranking; selecting OCR vs document intelligence for forms; and applying responsible AI controls for content. In agentic scenarios, the exam looks for correct tool/function boundaries, state management (conversation history vs long-term memory), and reliability techniques (timeouts, retries, idempotency, and safe tool invocation).
Exam Tip: For security and governance scenarios, the “right” answer often includes managed identity for service-to-service access, private endpoints for data-plane isolation, and logging/monitoring via Azure Monitor/Application Insights—these are high-frequency exam themes.
Common trap: assuming “more advanced” equals “more correct.” The exam frequently rewards simpler choices when they satisfy requirements (e.g., using built-in skillsets rather than custom skills, or using Azure OpenAI content filtering and prompt engineering before building a complex moderation pipeline).
Mock Exam Part 2 should simulate a case study and troubleshooting set, where you must keep multiple requirements in your head while diagnosing failures. Case studies often hide “gotchas” in constraints: data must remain in a private VNet; customer-managed keys are required; logs cannot contain PII; or latency targets force caching and smaller context windows.
For generative AI case studies, you’re typically asked to improve answer quality and reliability. Quality improvements map to: better chunking strategy, hybrid retrieval (vector + keyword), semantic ranking, query rewriting, grounding instructions, and evaluation metrics. Reliability improvements map to: fallbacks (retrieval failure handling), tool call validation, rate limit backoff, and deterministic formatting. Safety improvements map to: content filtering, prompt shields, red-teaming, and structured output validation.
Exam Tip: In troubleshooting items, distinguish between “symptom fixes” and “root cause fixes.” AI-102 tends to prefer changes that align with the architecture layer responsible (e.g., fix retrieval configuration in Search rather than adding more prompt text to compensate for bad indexing).
Common case-study trap: ignoring deployment realities. If the scenario mentions staging/production, blue-green or canary, or model versioning, the correct answer will include controlled rollout and monitoring—especially for prompt changes and agent tool updates.
Your score improves most during review, not during the mock. The goal is to convert every missed or guessed item into a reusable decision rule. Use a structured method: for each flagged question, write (1) the constraint, (2) the tested objective, (3) why the wrong option was tempting, and (4) the “trigger phrase” that should have guided you.
Build a “trap log” with categories that match AI-102 patterns: constraint misreads (private networking, cost ceilings, data residency), service-fit errors (custom glue code where a native feature exists), over-engineering (multi-agent where a router-to-workflows design suffices), governance gaps (keys instead of managed identity, missing approval gates), and retrieval quality blamed on prompts instead of indexing design.
Exam Tip: Your review notes should be phrased as if-then rules: “If private networking is required, then verify each service supports private endpoints and that the data path stays private.” These rules become rapid recall drills in Section 6.5.
A common review trap is rewriting the textbook. Don’t. Instead, extract the decision hinge. Example hinge: “Need citations and grounding → retrieve from Search and pass sources; don’t rely on chat history.” This is the kind of note that sticks under time pressure.
This final domain review is your weak-spot analysis translated into a memorization plan. AI-102 rewards reasoning from constraints, but there are still items you should memorize because they show up as near-miss distractors.
Plan and manage an Azure AI solution: Memorize core governance controls (RBAC, managed identity, private endpoints, key vault integration, logging/metrics). Reason about tradeoffs: cost vs latency, isolation vs complexity, and monitoring placement (app + service telemetry). Exam Tip: If the scenario mentions “least privilege,” assume managed identity and scoped roles; if it mentions “no public internet,” assume private endpoints and network rules.
Generative AI with Azure OpenAI: Memorize RAG building blocks (embeddings, chunking, hybrid retrieval, citations, evaluation). Reason about prompt design and safety: the correct approach usually layers retrieval grounding + structured output + safety filters. Trap: treating prompt engineering as a substitute for retrieval quality.
Agentic solutions: Memorize tool/function concepts, schemas, and orchestration patterns (planner-executor, tool-calling, memory types). Reason about reliability: validate tool inputs, handle retries, and maintain state intentionally. Trap: storing everything as “memory” instead of storing only what’s useful and safe.
Computer vision: Memorize when to use OCR vs image analysis vs custom vision workflows. Reason from the artifact type: scanned docs imply OCR; product defects imply custom detection/classification. Trap: using generic image captions when you need structured extraction.
NLP solutions: Memorize common tasks (classification, extraction, summarization, conversational patterns). Reason about multilingual needs, PII handling, and evaluation. Trap: confusing entity extraction with document search relevance.
Knowledge mining: Memorize indexing concepts (data sources, indexers, skillsets, field attributes, semantic config, vector fields). Reason about enrichment pipelines and incremental updates. Trap: forgetting that retrieval quality starts at indexing design.
On exam day, reduce variance. Your job is to execute your process, not to “feel confident.” Start with environment checks: stable internet, allowed testing space, and a quick system check if remote proctoring is used. Have scratch paper (if permitted) and write your constraint shorthand method at the start: “Read constraints → map to domain → eliminate violations → choose simplest compliant option.”
Time management: commit to a two-pass strategy. Pass 1 answers quickly and flags. Pass 2 resolves flags by re-reading only the relevant lines. Don’t let one ambiguous item steal minutes from five straightforward ones. If a question feels like two correct answers, re-check for a single word that changes everything (private/public, must/should, guarantee/best effort, audit/regulatory, multilingual, deterministic format).
Exam Tip: If you’re stuck, anchor on the exam objective being tested. Ask: “Is this really a model question, a search/indexing question, or a governance question?” Misclassification is a common reason candidates pick plausible but wrong answers.
Retake plan (if needed): schedule within 7–14 days while context is fresh. Use your trap log to drive targeted practice: one domain per day, then a mini-mock. Focus on converting repeated misses into if-then rules. The fastest score gains come from fixing recurring misreads, not learning new features.
1. You are building a RAG application using Azure OpenAI and Azure AI Search. The customer requires: (1) data must not traverse the public internet, (2) least-privilege access for the app to both services, and (3) minimal operational complexity. Which design best meets the requirements?
2. During a mock exam, your RAG answers are frequently incorrect because the model cites content that is not present in the retrieved documents. You want the fastest mitigation with the least architectural change while preserving answer quality. What should you do first?
3. You are implementing an agent that can call tools (functions) to look up order status and issue refunds. A compliance requirement states: refunds above $500 must require human approval, and all tool calls must be auditable. Which approach best satisfies this requirement?
4. A global support team needs near real-time monitoring to detect spikes in failed requests and elevated latency across an Azure OpenAI + Azure AI Search application. You also need to correlate the user request with downstream calls. Which solution is most appropriate?
5. You are doing weak-spot analysis after a mock exam. You notice you often choose solutions that are technically correct but violate a stated constraint (cost ceiling, private network, deterministic output). What is the best next step to improve exam performance?