AI Ethics, Safety & Governance — Beginner
Set clear AI rules in a week using plain policies, roles, and checklists.
AI tools are now part of everyday work: writing emails, summarizing meetings, drafting proposals, and analyzing documents. For small teams, the biggest problem is not “building AI.” It’s using AI without clear rules. That can lead to simple but costly issues: sharing sensitive information, publishing incorrect content, treating people unfairly, or making decisions nobody can explain later.
This beginner-friendly course is a short, practical “book” that helps you set up AI governance without needing a legal team or technical background. You will build a small, workable system: a basic policy, a few checklists, and a simple way to review and document AI use.
You will leave with a starter AI governance kit that fits a small organization. Instead of abstract theory, you will create clear, repeatable habits your team can actually follow.
The course has exactly six chapters, and each chapter builds on the previous one. You start by learning what AI governance means in simple terms. Next, you inventory the AI tools and use cases you already have. Then you turn that inventory into a practical policy, define roles and approvals, add checklists that reduce common risks, and finally set up incident handling and a 30-day rollout plan.
Every chapter focuses on “small-team friendly” actions: minimal meetings, minimal paperwork, and maximum clarity. The goal is progress you can sustain, not perfection.
This course is designed for absolute beginners. If you are a manager, team lead, operations professional, HR teammate, marketer, customer support lead, founder, or anyone who needs sensible AI rules for daily work, you are in the right place. You do not need coding skills, data science knowledge, or prior experience with AI.
As you go, think of one real AI tool and one real use case from your team (for example: “using a chatbot to draft customer replies”). You will apply the frameworks directly to that scenario. This makes the final output immediately usable: a policy you can publish, checklists you can adopt, and a workflow you can run.
If you want a simple, safe way to adopt AI without slowing your team down, this course will guide you step by step. Register free to begin, or browse all courses to see related topics.
AI Governance Lead & Risk Program Designer
Sofia Chen designs practical AI governance programs for small and mid-sized organizations, turning big compliance ideas into simple team habits. She focuses on privacy-by-design, safe AI use policies, and lightweight review workflows that non-technical teams can run.
Small teams adopt AI because it saves time: drafting emails, summarizing calls, writing code, triaging support tickets, generating images, or searching internal knowledge. The risk is that AI also changes how decisions are made, how data moves, and what gets shared outside your walls. AI governance is simply the set of rules and routines that keep those changes safe, intentional, and aligned with your team’s goals.
This chapter is written for day-to-day reality: you may not have a legal department, a security team, or an ML engineer. You still need clarity—what tools are allowed, what data is off-limits, who can approve new tools, and what to do when something goes wrong. Good governance is not bureaucracy; it is how you avoid avoidable incidents (privacy leaks, embarrassing errors, biased outputs, or insecure integrations) while increasing trust in the results you ship.
We will build five practical milestones into your thinking. First, a simple definition of “governance.” Next, you will learn to spot where AI already shows up in ordinary work. Then you’ll map the main risk types—privacy, errors, bias, and security—into a small set of failure modes you can check quickly. After that, you’ll pick your first governance goal (risk reduction or trust-building). Finally, you’ll draft a short “why now” statement to align your team and make the policy stick.
By the end of this chapter, you’ll know what to govern, why, and what “good enough” looks like for a small team starting from scratch. You are not trying to predict every edge case. You are trying to catch the common ones early and build the muscle to respond well when something slips through.
Practice note for Milestone 1: Understand what “governance” means (simple definition): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 2: Identify where AI shows up in everyday work: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 3: Learn the main risk types (privacy, errors, bias, security): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 4: Decide your first governance goal (reduce risk, increase trust): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 5: Draft your “why now” statement for the team: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
In small teams, “AI” can mean anything from a chatbot to an autocomplete feature. For governance, you need a working definition that is broad enough to catch real risk, but narrow enough that people can follow it. A practical definition: AI is any tool or feature that generates, classifies, ranks, or recommends content or decisions based on statistical patterns—especially when it is trained on large datasets or uses a model you did not create.
That includes: large language models (chatbots, meeting summarizers, coding assistants), image/video generators, automated scoring (lead scores, fraud flags), “smart” search, recommendation feeds, and vendor tools labeled “AI-powered.” It also includes workflow automation that uses AI to decide what to do next (e.g., auto-routing tickets based on message text).
Common mistake: treating AI as “just another software tool.” Unlike normal software, AI outputs can be plausible but wrong, can reflect bias in training data, and can leak sensitive inputs through logs, prompts, connectors, or human copy/paste. Another mistake is focusing only on “model building.” Many teams do not train models; they use and buy AI. Governance must cover all three: use, buy, and build.
Milestone tie-in: once you adopt this definition, you can identify where AI shows up in everyday work. Ask: “Does this tool generate content, classify, or recommend?” If yes, it falls under your AI policy—no argument about labels required.
Teams often stall because the words feel heavy. Here is the simple split that keeps you moving.
Governance is the operating system: who decides, what rules exist, what steps happen before and after AI use, and how you document choices. Think “process and accountability.”
Ethics is the value lens: what your team believes is acceptable, fair, and respectful—especially when users or customers could be harmed. Think “should we do it, even if we can?”
Compliance is the external constraint: laws, regulations, contracts, and industry standards you must follow (privacy laws, security requirements, customer DPAs, procurement rules). Think “must do.”
Milestone 1 (simple definition): a governance program for a small team can be as small as a one-page policy, a one-page risk checklist, and a simple register of the AI tools you use. The goal is not to “be perfect.” The goal is to make decisions visible, consistent, and reviewable.
Engineering judgment shows up when you decide what level of formality fits the risk. A marketing team using AI for headline ideas needs a lighter process than a healthcare team summarizing patient notes. Governance is how you right-size the process without pretending all AI use is equal.
Most small teams touch AI in three ways: use (employees use tools), buy (you procure a vendor feature), and build (you integrate an API or ship an AI feature). Governance must follow the lifecycle, because risk shows up at different moments.
Use: People paste content into chatbots, connect tools to Google Drive, or ask an assistant to draft client responses. Risks here are often privacy and accidental disclosure. Practical control: define what data is allowed, require human review before sending outputs externally, and require approved accounts (not personal logins).
Buy: You add an “AI add-on” to a SaaS product. Risks here include vendor data handling, retention, training on your inputs, and unclear security posture. Practical control: an approval step before purchase or enablement, with a short vendor questionnaire (data use, retention, access controls, incident notification).
Build: You integrate an LLM API, ship a recommender, or automate decisions. Risks here include prompt injection, insecure connectors, logging of sensitive prompts, and users over-trusting outputs. Practical control: a lightweight threat-modeling pass, red teaming of common misuse cases, and release criteria (privacy review, safety tests, rollback plan).
Common mistake: only reviewing AI at purchase time. Teams often start with “free trials” and personal accounts, and only later realize data has been shared or outputs were used in customer work. Your lifecycle approach should include a lightweight onboarding step: when someone starts using a new AI tool, it gets recorded and checked once.
You do not need a long risk taxonomy to start. For beginner governance, you can catch most issues with four failure modes: wrong, harmful, leaked, and unsafe. These map cleanly to the main risk types you’ll see: errors, bias, privacy, and security.
Wrong (errors): AI makes up facts, cites nonexistent sources, misreads a table, or writes buggy code. Outcome: customer misinformation, bad decisions, rework. Control: require human verification, cite sources, run tests, and keep AI outputs as drafts unless validated.
Harmful (bias and unfairness): AI produces discriminatory language, unfair rankings, or stereotypes; it may disadvantage certain user groups. Outcome: reputational harm, user harm, legal risk. Control: avoid high-stakes automation at first, review outputs for sensitive categories, and document “no-go” uses (e.g., hiring decisions without structured review).
Leaked (privacy and confidentiality): sensitive data is pasted into prompts, stored in logs, or shared through connectors. Outcome: breach, contract violation, loss of trust. Control: clear data handling rules, redaction, approved tools only, and least-privilege access for integrations.
Unsafe (security and misuse): prompt injection, jailbreaking, malicious outputs, or insecure agent actions (sending emails, executing code, calling tools). Outcome: account takeover, data exfiltration, operational disruption. Control: restrict tool permissions, separate environments, input/output filtering where appropriate, and “human-in-the-loop” for actions.
Common mistake: treating “bias” as the only ethics issue. In small teams, the most frequent incidents are mundane: someone pastes a customer list into a chatbot, or a confident but incorrect summary gets emailed. Your checklist should prioritize the failure modes you are most likely to experience next week.
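To make the four failure modes easy to run under time pressure, here is a minimal sketch of how a team could encode them as quick-check questions. The wording of each check is illustrative, drawn from the controls above, and the open_questions helper is a hypothetical convenience, not part of any prescribed toolkit.

```python
# Illustrative only: the four failure modes paired with a quick question a
# reviewer can answer before an AI output is used or shared.
FAILURE_MODE_CHECKS = {
    "wrong":   "Has a human verified the key facts, sources, or code?",
    "harmful": "Could the output stereotype or disadvantage a user group, or touch a documented no-go use?",
    "leaked":  "Does the prompt, file, or connector expose sensitive or confidential data?",
    "unsafe":  "Can the tool take actions (send email, run code, call tools) without a human in the loop?",
}

def open_questions(answers: dict) -> list:
    """Return the failure modes still needing attention (answered False or missing)."""
    return [mode for mode in FAILURE_MODE_CHECKS if not answers.get(mode, False)]

# Example: a draft that has been fact-checked and bias-checked but not privacy-checked.
print(open_questions({"wrong": True, "harmful": True}))  # ['leaked', 'unsafe']
```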
Beginner governance is about reducing surprise. “Good enough” means: people know the rules, risky uses get a second set of eyes, and you can answer basic questions quickly (What tools do we use? What data do we allow? Who approved this?). You are aiming for lightweight consistency, not heavyweight committees.
Start by choosing your first governance goal (Milestone 4). Most small teams pick one of two: reduce risk (fewer surprises and incidents) or increase trust (more confidence in what you ship and share).
Then translate that goal into simple operating rules. “Good enough” controls usually look like: approved tools and accounts only, clear limits on what data goes into prompts, and human review before outputs reach customers.
Common mistakes: writing a policy that is too abstract (“use AI responsibly”) or too strict to follow (forcing legal review for every prompt). A policy that people ignore is worse than a small policy people actually use. Good governance fits your workflow: a checkbox in a ticket, a template in procurement, a short review step before launch.
Your starter kit has three parts: a policy (rules), a checklist (risk spotting), and a register (inventory). This is enough to run a lightweight review and incident process without technical skills.
1) One-page AI Use Policy (what people may do). Include: approved tools/accounts; data you must not share (credentials, customer PII, health/financial data, confidential contracts); rules for prompt sharing (no pasting sensitive prompts into public forums); requirements for human review; and when disclosure is needed (e.g., customer-facing AI assistance).
2) One-page Risk Checklist (what to check before using/buying/building). Keep it tied to the four failure modes: wrong, harmful, leaked, and unsafe.
3) Simple AI Register (what exists). A spreadsheet is fine: tool name, owner, purpose, tier (1 or 2), data types used, approval date, renewal date, and notes on vendor terms.
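To picture the register concretely, here is a minimal sketch of one row as structured data; the field names follow the list above, while the class name, file name, and sample values are invented for illustration.

```python
import csv
from dataclasses import dataclass, asdict, fields

# Field names mirror the register described above; the values are invented examples.
@dataclass
class RegisterEntry:
    tool_name: str
    owner: str
    purpose: str
    tier: int            # 1 or 2, matching the policy's tiers
    data_types: str
    approval_date: str
    renewal_date: str
    vendor_notes: str

entry = RegisterEntry(
    tool_name="Example support chatbot",
    owner="Support lead",
    purpose="Draft replies to routine customer questions",
    tier=2,
    data_types="Non-sensitive ticket text only",
    approval_date="2024-05-01",
    renewal_date="2025-05-01",
    vendor_notes="Retention confirmed; inputs not used for training",
)

# Appending the row to a shared CSV keeps the register in one visible place.
with open("ai_register.csv", "a", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(RegisterEntry)])
    writer.writerow(asdict(entry))
```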
Now write your “why now” statement (Milestone 5). Keep it short enough to read in a team meeting: “We’re adopting AI across writing, support, and engineering. To protect customer trust and move fast without surprises, we’re standardizing which tools we use, what data we share, and when we require review. This lets us benefit from AI while reducing privacy, security, and quality risks.”
Finally, define basic roles and a lightweight incident path: one person owns the policy, one person approves Tier 2 uses, and anyone can report issues in a shared channel. When an incident happens (wrong/harmful/leaked/unsafe), your process is: stop the use, capture what happened, notify the owner, decide containment, and update the policy/checklist so it is less likely next time. That feedback loop is governance in action.
1. In plain English, what does AI governance mean for a small team?
2. Why does the chapter say small teams need AI governance even without legal, security, or ML specialists?
3. Which set lists the main risk types the chapter highlights for quick checking?
4. Which statement best captures the chapter’s view that “good governance is not bureaucracy”?
5. What is the purpose of drafting a short “why now” statement for the team?
Small-team AI governance starts with a deceptively simple question: what AI are we actually using? If you can’t answer that quickly and confidently, every other governance step becomes guesswork. You can’t set rules, approvals, or incident response for tools you don’t know exist. You also can’t make good engineering judgments about privacy, security, or quality if you haven’t captured what data flows through each tool and who relies on its outputs.
This chapter is intentionally practical. You will build your first AI tool inventory (Milestone 1), assign each tool a purpose and owner (Milestone 2), classify tools by risk (Milestone 3), capture key facts that matter for privacy and reliability (Milestone 4), and set a lightweight monthly update habit so the inventory stays real (Milestone 5). The goal is not a perfect spreadsheet; the goal is a working map of your AI footprint that is accurate enough to drive decisions.
Think of the inventory as your “AI register.” In many organizations this becomes a formal governance artifact, but for small teams it can be a one-page table in a shared doc. What makes it powerful is consistency: every tool gets the same basic fields, every new tool gets added before it becomes routine, and every tool has an owner who can answer questions when something goes wrong.
Throughout, remember a key governance principle: inventory is a control. When you maintain it, you reduce surprise, speed up approvals, and create a single place to coordinate changes. When you don’t, you discover tools only after a customer complaint, a security finding, or a bad output shipped to production.
Practice note for Milestone 1: Create your first AI tool list (inventory): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 2: Map each tool to a purpose and owner: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 3: Classify tools by risk level (low/medium/high): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 4: Capture key facts (data in/out, users, vendors): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 5: Set a monthly update habit for the inventory: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
An AI inventory is a living list of the AI-enabled tools, features, and integrations your team uses to create outputs or make decisions. It matters because it is the foundation for every later governance activity: policy, approvals, training, incident response, and audits. If you’re a small team, it also reduces cognitive load—people stop debating “do we use AI?” and start discussing specific tools and workflows.
For Milestone 1 (create your first AI tool list), start with a single table. Keep it small enough that you’ll maintain it. A good inventory includes both “obvious AI” (chatbots, image generators, model APIs) and “embedded AI” (CRM lead scoring, email triage, document summarization inside apps). Include any tool where a model influences content, recommendations, ranking, classification, or automation.
Engineering judgment comes in when deciding the boundary: don’t inventory every minor feature toggle, but do inventory anything that changes decisions, creates user-facing text, or stores prompts/inputs. A common mistake is starting too big and quitting. Start with your top 10 tools and expand as you find “shadow AI.”
End this section by saving the inventory somewhere shared (not a personal note) and naming it clearly (e.g., “Team AI Inventory”). That simple act turns it into a governance artifact your team can reference during reviews and onboarding.
Shadow AI is any AI tool used without the team realizing it, documenting it, or setting expectations for data handling. It’s common in small teams because individuals optimize for speed: they sign up for a free assistant, paste in a customer email thread, generate a proposal, and move on. The risk is not that people are “doing something wrong”; it’s that no one has assessed the data exposure, output quality, or vendor terms.
To find shadow AI, use a low-friction discovery process. Ask “what do you use to draft, summarize, code, design, research, or transcribe?” rather than “are you using AI?” People often don’t label their behavior as AI use. Run a 20-minute inventory workshop and collect tools live. Then follow up asynchronously with a short form that asks for the tool name and what it’s used for.
Use judgment when responding. Don’t start with enforcement; start with visibility. Tell the team the objective is safe enablement: “We want to keep the tools that help, and make sure we don’t leak sensitive data.” This framing increases honesty and reduces the incentive to hide usage.
Once you discover a shadow tool, immediately apply Milestone 2: assign an owner and write a one-line purpose. Ownership prevents the tool from remaining “everyone’s responsibility,” which is functionally nobody’s.
After you have a list, add two simple categories that unlock better decisions: internal vs. external and free vs. paid. These are not about accounting; they’re proxies for control and risk. Internal tools (your own models, self-hosted services, or tightly managed enterprise deployments) typically allow stronger access controls and clearer data handling. External tools (public web apps, third-party APIs, consumer assistants) often have more variability in retention, training usage, and support.
Free vs. paid matters because paid plans often include enterprise controls: admin logs, SSO, contractual terms, and explicit data processing commitments. Free plans may be great for experimentation but are frequently the worst place to put any sensitive input. A common mistake is assuming “everyone uses it, so it must be fine.” Popularity is not a privacy policy.
Engineering judgment: some external paid tools may be lower risk than an internal proof-of-concept with no access controls. Categories are not a verdict; they’re a way to prioritize which tools need more facts collected next.
Once categorized, you can set lightweight defaults (later chapters will formalize them), such as: “External free tools are allowed only for non-sensitive content,” and “External paid tools require an owner and a risk label before use.”
Milestone 4 is where governance becomes real: capture key facts about data in and data out. Every AI tool is a data pipeline. If you don’t document the touchpoints, you can’t enforce privacy rules, you can’t respond to incidents, and you can’t explain decisions to stakeholders.
Start by listing the input types people provide: customer tickets, call transcripts, contract text, internal strategy docs, source code, employee information, analytics exports, or “just a prompt.” Then list outputs: draft emails, code suggestions, summaries, classifications (e.g., sentiment), recommendations, or images. Record where outputs go next—into a customer reply, a public blog post, a sales deck, a product feature, or a support macro.
Use engineering judgment to set “default safe handling.” For example, if the tool touches regulated or contractual data (health, financial, student records, or sensitive customer data), treat the tool as higher risk until you confirm retention and access controls. If the tool’s output is used to make decisions (prioritizing leads, flagging fraud, ranking applicants), note that explicitly—decision support should be governed differently than brainstorming.
Write the data touchpoints in plain language so non-technical teammates can follow them. The point is not to model your architecture; the point is to prevent accidental oversharing and to ensure outputs are used appropriately.
Even small teams can capture vendor facts that dramatically improve safety. You don’t need a full legal review to record the basics: who provides the tool, where it runs, and who can access it. This is often the difference between a manageable issue and an emergency when something breaks or data is exposed.
For each external tool, record the vendor name, the product name, and the access method (web app, API, plugin, embedded feature). Note the hosting model if you know it (vendor cloud, your cloud, on-prem). If you don’t know, write “unknown” and assign the owner to find out. This is not busywork: hosting affects data residency, logging, and your ability to restrict access.
Common mistakes include relying on one person’s personal account for a team workflow, or assuming “it’s just a website” so access doesn’t matter. In practice, a single account can become a single point of failure and a compliance risk. Another mistake is not recording where to go when something happens. Your inventory should include “support path”: who to contact internally (the owner) and where vendor support lives.
These vendor notes also help you make purchase decisions later. When you can compare tools side-by-side on access control, data handling, and hosting, you stop choosing tools purely on feature demos.
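If it helps to see the vendor facts side by side, the sketch below records them for a single external tool; the vendor and product names are placeholders, and “unknown” is a legitimate value that simply creates a follow-up for the owner.

```python
# Illustrative vendor-fact record for one external tool; names are placeholders.
vendor_facts = {
    "vendor": "Example Vendor Inc.",
    "product": "AI summarizer add-on",
    "access_method": "web app",      # web app, API, plugin, or embedded feature
    "hosting": "unknown",            # vendor cloud, your cloud, on-prem, or unknown
    "internal_owner": "Ops lead",
    "vendor_support": "support portal listed in the contract",
}

# Anything still marked "unknown" becomes a follow-up task for the owner.
follow_ups = [field for field, value in vendor_facts.items() if value == "unknown"]
print(follow_ups)  # ['hosting']
```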
Milestone 3 is to classify tools by risk level (low/medium/high) so you can focus effort where it matters. The label is not about fear; it’s a prioritization mechanism. A small team cannot do deep reviews for every tool, so you need a quick, repeatable approach that produces consistent outcomes.
Use a simple rule set based on two dimensions: data sensitivity and impact of mistakes. Data sensitivity asks what the tool can see (public vs. confidential vs. personal/regulated). Impact asks what happens if the tool is wrong (minor inconvenience vs. customer harm vs. legal/financial harm). Combine the two into a single low, medium, or high label.
Engineering judgment: if a tool is “unknown” on key facts (retention, access, training usage), temporarily label it one level higher until clarified. This prevents uncertainty from silently becoming risk. Also note that “internal tool” does not automatically mean low risk; an internal model that can access your customer database and auto-take actions may be the highest risk system you have.
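One way to make the two-dimension rule repeatable is a tiny lookup like the sketch below. The exact mapping (taking the higher of the two dimensions) and the “bump one level when key facts are unknown” rule are assumptions you should tune to your own context, not a standard formula.

```python
LEVELS = ["low", "medium", "high"]

# Assumed scoring: 0-2 per dimension, matching the public/confidential/regulated
# and inconvenience/customer-harm/legal-harm scales described above.
SENSITIVITY = {"public": 0, "confidential": 1, "personal_or_regulated": 2}
IMPACT = {"minor_inconvenience": 0, "customer_harm": 1, "legal_or_financial_harm": 2}

def classify(sensitivity: str, impact: str, has_unknown_facts: bool = False) -> str:
    """Combine data sensitivity and impact of mistakes into a low/medium/high label."""
    score = max(SENSITIVITY[sensitivity], IMPACT[impact])
    if has_unknown_facts:                 # retention, access, or training usage unclear
        score = min(score + 1, len(LEVELS) - 1)
    return LEVELS[score]

print(classify("confidential", "minor_inconvenience"))                    # medium
print(classify("public", "minor_inconvenience", has_unknown_facts=True))  # medium
```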
Finally, implement Milestone 5: set a monthly update habit. Put a 15-minute recurring calendar event where the team owner reviews new tools, removes retired ones, and re-checks high-risk entries. The inventory stays useful only if it reflects reality. Treat it like any other operational register: lightweight, consistent, and always ready when you need it.
1. Why does AI governance for small teams start with an AI tool inventory?
2. Which set of milestones best matches what Chapter 2 asks you to do?
3. What is the chapter’s recommended mindset for the inventory artifact itself?
4. Which of the following is a common mistake the chapter warns against?
5. What makes an AI inventory powerful and useful over time, according to the chapter?
A good AI policy for a small team is not a legal document and not a manifesto. It is a short, readable agreement that helps people move faster without creating avoidable risk. In practice, “governance” means deciding what tools and behaviors are acceptable, what needs extra care, and what is off-limits—then making those decisions easy to follow in daily work.
This chapter walks you through building a one-page AI use policy that your team will actually use. The trick is to design it like an operational tool: clear boundaries, examples that match real roles, and simple review steps for high-impact work. You will also add minimal data-handling rules, because most real AI incidents in small teams come from accidentally sharing the wrong information or trusting outputs without checking.
Keep your policy format intentionally small: one page, scannable headings, and a few checklists. If you need more detail later, attach an appendix. For now, optimize for adoption. People follow policies that (1) tell them what to do, (2) explain why in plain language, and (3) fit the way work already happens.
As you draft, aim for “minimum effective governance.” You are not trying to predict every future AI use case. You are trying to prevent predictable failures: leaking client data, using AI to make decisions it shouldn’t make, and shipping content that looks authoritative but is incorrect.
Practice note for Milestone 1: Choose the policy format (one page, readable): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 2: Define what’s allowed, limited, and not allowed: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 3: Add clear rules for sensitive data and confidentiality: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 4: Add human review rules for high-impact outputs: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 5: Publish and communicate the policy in plain language: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Start your policy with three sentences: purpose, scope, and who must follow it. This sounds obvious, but many teams skip it—and then argue later about whether the policy “counts” for contractors, interns, or a specific tool. Write it so nobody has to guess.
Purpose should be operational: “Help the team use AI to work faster while protecting privacy, client trust, and our brand.” Avoid vague goals like “be ethical,” because they do not tell someone what to do at 4:30 p.m. when a deadline hits.
Scope should define what counts as “AI” for your team: chat assistants, code assistants, image generators, transcription tools, automated scoring tools, and any feature labeled “AI” in a SaaS product. Include both free and paid tools, browser extensions, and “built-in AI” features in docs, CRM, support, and marketing platforms.
“Applies to” should be explicit: employees, contractors, and anyone acting on behalf of the team. If your team works with agencies, include them too. Then complete the policy-format milestone: keep it to one page, with headings that match the decisions people make (Allowed / Limited / Not Allowed, plus Data Rules and Output Rules). This structure is easier to follow than a narrative document and reduces “policy drift” as tools change.
Common mistake: writing a policy that reads like a threat. A practical policy assumes people want to do the right thing and gives them a safe path. Reserve enforcement language for the end; start with clarity and usability.
People follow policies when they see themselves in the examples. Create an “Allowed uses” section that lists safe, everyday tasks by role. This is your acceleration zone: tasks where AI help is low-risk because the inputs are non-sensitive and the outputs are reviewed naturally as part of work.
Use concrete examples rather than categories like “brainstorming.” For instance: drafting internal emails and meeting summaries, suggesting code that a teammate will review, or producing first-draft marketing copy that an editor will rework before publication.
Also define “Limited uses” (allowed with extra steps). Examples: using AI to help with contract language, customer-facing medical/financial guidance, or any content that could be interpreted as official advice. Limited uses are not forbidden—they just require human review rules (see Section 3.5) and sometimes approval (for example, from a team lead).
Engineering judgment tip: If a task already has a strong review loop (PR review, editorial review, manager sign-off), it’s easier to allow AI assistance safely. If a task is often shipped “as-is” (quick replies, one-off analyses), treat it as higher risk and add guardrails.
Your prohibited list should be short, memorable, and defensible. The goal is not to ban AI; it’s to draw clear red lines where AI use creates unacceptable harm, legal exposure, or trust damage. People are more likely to comply when you explain why a red line exists.
Common mistake: creating red lines that are too abstract, like “no unethical use.” Replace abstraction with behaviors people can recognize. Another mistake is over-banning (e.g., “no AI in customer support”), which pushes usage underground. If you can’t tolerate a use case, explain what safe alternative looks like (e.g., “use AI to draft internal notes, but a human writes the final message”).
Finally, add a one-sentence escalation rule: if someone is unsure whether an activity is prohibited, they must treat it as prohibited until reviewed by the designated owner (defined in the policy header).
Data handling rules are the most important part of a small-team AI policy because they prevent the most common real-world incident: pasting sensitive information into a tool that stores it, reuses it, or exposes it via logs and support access. Write rules that answer one question: “What can I paste into an AI system?”
Use a simple three-tier model that fits on a page:
Be explicit about examples of personal data: names with contact details, addresses, government IDs, payment data, and any dataset that could identify an individual. Include “secrets” like passwords, tokens, private keys, and internal URLs that grant access. Small teams often forget that prompts are data: the prompt itself can contain restricted details, and prompts may be stored by vendors or copied into tickets and docs.
Add two practical rules people can execute:
Engineering judgment tip: if a tool offers “do not train on my data” but still stores prompts for abuse monitoring, treat it as retained unless you have confirmed retention terms. If you don’t have vendor clarity, default to using only Public data.
Most AI failures that reach customers are output failures: incorrect claims, made-up citations, unsafe advice, or confident nonsense. Your policy should treat AI output as a draft, not a source of truth. This is where you implement Milestone 4: human review rules for high-impact outputs.
Create a two-level review rule:
Add concrete checking steps that do not require technical skills:
Common mistake: equating “review” with a quick skim. Your policy should define what reviewers are responsible for: factual correctness of key claims, compliance with data rules, and appropriateness for the audience. Make ownership explicit: every AI-assisted artifact has a human accountable for it.
Practical outcome: this turns AI into a speed tool without letting it become an unchecked decision-maker or an unverified publisher.
Even a simple policy needs an exceptions path; otherwise, people will route around it. Define a lightweight request-and-approval process for cases that fall outside “Allowed” usage but may still be reasonable. Keep it fast: a short form in your ticketing tool or a template message in chat.
An exceptions request should capture: tool name, use case, what data will be used, who reviews outputs, and how results will be stored. Approvals should be role-based, not committee-based. For small teams, a common model is: team lead approves business need; security/privacy owner (or the most relevant person) approves data handling; legal approves contract terms when buying.
Enforcement should be predictable and proportionate. Spell out what happens when rules are broken:
Also define an incident reporting path that does not punish honesty: “If you think sensitive data was pasted into a tool, report it within 24 hours to X.” Include what to report (tool, time, what data, whether it was shared externally). The faster you know, the more you can do (revoke tokens, rotate credentials, contact vendors).
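To make the reporting path concrete, here is a minimal sketch of an incident report entry; the fields mirror the items listed above (tool, time, what data, whether it was shared externally), and the contact placeholder stands in for whoever you designate.

```python
# Fields mirror the reporting items above; names and values are placeholders.
incident_report = {
    "tool": "Example chatbot",
    "time": "2024-06-12 16:40",
    "what_data": "One customer email thread pasted into a prompt",
    "shared_externally": False,
    "reported_to": "designated policy owner",
    "reported_within_24h": True,
}

def needs_urgent_containment(report: dict) -> bool:
    """Faster follow-up (revoke tokens, rotate credentials, contact the vendor) if data left the team."""
    return bool(report["shared_externally"])

print(needs_urgent_containment(incident_report))  # False
```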
Milestone 5 is publishing and communication. Don’t just upload the policy to a drive. Announce it, explain the intent in plain language, and pin the one-page version where work happens (wiki homepage, onboarding checklist, shared channel). Revisit monthly for the first quarter: your goal is not perfection, it is a stable habit that keeps AI usage visible, safe, and useful.
1. Why does the chapter recommend a one-page AI use policy for small teams?
2. In this chapter, what does “governance” mean in practice for a small team?
3. Which set of categories should the policy define to create clear boundaries for AI use?
4. What is the main reason the chapter says to include minimal rules for sensitive data and confidentiality?
5. According to the chapter, when should human review rules be included and used?
Small teams move fast because decisions happen in hallways, group chats, and quick calls. That speed is a strength—until AI enters the picture. AI tools can touch customer data, generate public-facing content, and influence decisions in ways that are hard to notice until something goes wrong. Governance for a small team should not feel like a “committee.” It should feel like a safety rail: lightweight, predictable, and easy to follow.
This chapter gives you a practical review loop that fits a team of 3–30 people: assign three simple roles (owner, reviewer, approver), add one intake step for new tools or new use cases, keep approvals to two steps maximum, define a short “high-risk” trigger list that automatically escalates, and schedule periodic check-ins. The goal is not to eliminate risk. The goal is to catch obvious issues early, document decisions, and keep accountability clear when priorities shift.
The most common failure mode in small-team AI governance is ambiguity: nobody knows who is responsible, approvals are ad hoc, and “temporary” experiments quietly become production workflows. The review loop in this chapter prevents that by making one person responsible for each AI use case, one person responsible for challenge-review, and one person responsible for final authorization—while still keeping the process fast enough that people actually use it.
Practice note for Milestone 1: Assign three simple roles (owner, reviewer, approver): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 2: Set an intake step for new AI tools or new use cases: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 3: Build a lightweight approval workflow (2-step max): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 4: Create a “high-risk” trigger list for extra review: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 5: Schedule periodic check-ins (quarterly or monthly): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Start with a principle: governance roles exist to create clarity, not to create blame. If people think a role is a liability, they will avoid it, rush reviews, or keep AI usage unofficial. Your goal is “single-threaded accountability” (someone owns the outcome) paired with “shared responsibility” (others can challenge decisions without being punished for slowing things down).
For small teams, three roles are enough for most AI activities: an owner who runs the tool or use case day to day, a reviewer who challenges the plan and checks outputs, and an approver who makes the final call and accepts the risk.
Common mistake: assigning the same person to all three roles. If you do that, you’ve created paperwork, not governance. If your team is too small to fully separate roles, keep one separation: the approver should be different from the owner for anything external-facing or data-sensitive.
Practical outcome: every AI tool and use case has a named owner, a lightweight peer review, and a clear “yes” decision by someone with authority to accept risk. That’s accountability without scapegoating.
You don’t need governance jargon, but you do need decision clarity: who proposes, who checks, who decides, and who needs to be informed. Write it in plain language so it survives staff changes and busy weeks.
Use a simple table (even in a shared doc) that maps common AI decisions to your three roles:
Two-step maximum is a design constraint, not a suggestion. If everything needs three approvals, people will route around the process. A workable default is: (1) peer review by the reviewer, then (2) approval by a manager/lead. If it’s clearly low risk (e.g., internal brainstorming with no sensitive data), you can define “owner-only” usage that still requires an intake entry and a decision log entry.
Practical outcome: you remove “decision fog.” People know exactly what they can do today, what needs review, and who can make the call when time is tight.
An intake step is the heartbeat of a small-team review loop. It’s a short form (one page) that forces the owner to think clearly before an experiment becomes a dependency. Keep it fast: 10–15 questions, mostly checkboxes, with a few short text fields.
Your intake form should cover five areas:
Engineering judgment matters here: you are not trying to predict every failure mode; you are trying to identify the “big rocks” early. A common mistake is treating the intake as a compliance exercise and writing vague answers (“we will be careful”). Replace vagueness with concrete controls: “No customer PII in prompts,” “All outbound content must be reviewed by a human,” “Outputs stored in ticketing system with labels.”
Practical outcome: you can onboard new AI tools or new use cases without chaos, and you create a consistent record that makes later audits, incident reviews, and team handoffs straightforward.
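As a rough picture of what a completed intake entry might hold, here is a minimal sketch; the field names are assumptions rather than a fixed template, and the example answers reuse the concrete controls quoted above. The small readiness check illustrates the point about replacing vague answers with specific ones.

```python
# Hypothetical intake entry; field names are assumptions, values echo the
# concrete controls discussed above.
intake_entry = {
    "tool": "Example assistant",
    "use_case": "Summarize support tickets for a weekly report",
    "owner": "Support lead",
    "data_rule": "No customer PII in prompts",
    "review_rule": "All outbound content must be reviewed by a human",
    "output_storage": "Outputs stored in ticketing system with labels",
}

def ready_for_review(entry: dict) -> bool:
    """Reject vague answers so the intake stays a thinking tool, not a checkbox."""
    vague = {"", "tbd", "we will be careful"}
    return all(str(value).strip().lower() not in vague for value in entry.values())

print(ready_for_review(intake_entry))  # True
```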
Not every AI use needs the same scrutiny. The trick is to define a short “high-risk” trigger list that automatically adds an extra review gate. This keeps the normal path fast while ensuring you slow down for the situations most likely to cause harm.
Define your default workflow as two steps max: Reviewer check → Approver decision. Then add triggers that require a manager or privacy check (or both). Example high-risk triggers:
When a trigger fires, the process changes in a predictable way. For example: the reviewer must confirm data handling and a privacy-safe configuration, and the approver must explicitly accept the risk in the decision log. If your organization has a privacy or security point-person, route triggered items to them as a time-boxed consult, not an open-ended review.
Common mistake: making the trigger list too long or too subjective (“anything risky”). Keep it short and binary. People should be able to answer in under a minute whether escalation is required.
Practical outcome: you preserve speed for low-risk work while reliably catching the few scenarios that create reputational, legal, or customer harm.
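A trigger list can be as small as a dictionary of yes/no questions, as in the sketch below. The three example triggers are placeholders informed by earlier chapters (personal or regulated data, external-facing output, automated decisions); substitute your own short, binary list.

```python
# Placeholder triggers; replace with your own short, binary list.
HIGH_RISK_TRIGGERS = {
    "personal_or_regulated_data": "Does it see personal, health, or financial data?",
    "external_facing_output": "Does the output go directly to customers or the public?",
    "automated_decision": "Does it score, prioritize, or decide without a human in the loop?",
}

def needs_escalation(answers: dict) -> bool:
    """Escalate to the extra review gate if any trigger fires."""
    return any(answers.get(trigger, False) for trigger in HIGH_RISK_TRIGGERS)

print(needs_escalation({"external_facing_output": True}))  # True
print(needs_escalation({}))                                # False
```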
Small teams often adopt AI through a credit card and a browser tab. That’s convenient, but procurement is where many governance risks hide: data retention, unclear ownership, weak security controls, and surprise pricing. You don’t need a complex vendor program; you need a short, repeatable vendor check.
Evaluate AI vendors with plain questions tied to real operational needs: how your data is used and retained, whether your inputs are used for training, who can access them, how incidents are communicated, and how pricing behaves as usage grows.
Engineering judgment: match diligence to impact. A tool used for internal brainstorming with no sensitive data can be approved with minimal vendor checks. A tool that touches customer records or generates regulated communications needs stronger assurances and a clear contract path.
Common mistake: approving a vendor because it “has enterprise customers.” That’s not evidence your use case is safe. Your governance loop should require that the owner documents the vendor’s data practices and that the approver confirms the risk is acceptable.
Practical outcome: fewer surprises—no accidental data sharing, fewer emergency migrations, and clearer costs as AI usage grows.
The final piece of a small-team review loop is consistency over time. People forget why choices were made, and new team members re-litigate old decisions. A simple decision log prevents this. It’s not bureaucracy; it’s memory.
Your decision log can be a spreadsheet or a page in your team wiki. Each entry should be short and searchable: the tool or use case, the decision (approved, approved with conditions, or rejected), any conditions, the owner and approver, and the date.
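Here is a minimal sketch of one log entry using those fields; the dataclass name, the status field, and all sample values are invented for illustration.

```python
from dataclasses import dataclass

# Fields follow the entry described above; values are invented for illustration.
@dataclass
class DecisionLogEntry:
    tool_or_use_case: str
    decision: str       # "approved", "approved with conditions", or "rejected"
    conditions: str
    owner: str
    approver: str
    date: str
    status: str         # e.g. "pilot" or "production"; update when the status changes

entry = DecisionLogEntry(
    tool_or_use_case="Meeting summarizer for internal notes",
    decision="approved with conditions",
    conditions="No client names in uploaded transcripts",
    owner="Ops lead",
    approver="Team lead",
    date="2024-06-10",
    status="pilot",
)

print(f"{entry.tool_or_use_case}: {entry.decision}; signed off by {entry.approver} on {entry.date}")
```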
Schedule periodic check-ins—monthly for fast-changing usage, quarterly for stable environments. The check-in agenda is simple: confirm the tool is still being used as approved, review any incidents or near-misses, update conditions, and confirm the owner is still the right person.
Common mistake: logging only approvals. Log rejections and “approved with conditions” decisions too, because they teach the team what good looks like. Another mistake is forgetting to update the log when a pilot becomes production—make “status change” a reason to add a new entry or update the existing one.
Practical outcome: you can answer, quickly and confidently, “Why are we using this AI tool, under what rules, and who signed off?” That’s the backbone of governance that remains lightweight even as your team evolves.
1. What is the main purpose of the small-team AI review loop described in Chapter 4?
2. Which set of roles best matches the three simple roles the chapter recommends assigning?
3. What is the recommended approach to approvals in a small-team AI governance workflow?
4. What problem is the chapter describing when it warns that “temporary” experiments can quietly become production workflows?
5. How should a team handle AI tools or use cases that meet items on a “high-risk” trigger list?
Small teams rarely fail because they “don’t care about governance.” They fail because they are busy. AI tools add new failure modes—privacy leaks, confident hallucinations, biased wording, insecure accounts—on top of normal delivery pressure. The goal of this chapter is to give you checklists that act like guardrails: quick to run, hard to argue with, and reliable across different people and busy weeks.
A checklist is not a bureaucracy ritual. It is a compact memory aid plus a decision record. When it’s done well, it turns “we should probably think about that” into a repeatable step that any teammate can execute. In this chapter you’ll build four small checklists (privacy, quality, fairness, security) and then combine them into one practical “go/no-go” list for launches and sharing AI outputs.
As you read, keep the mindset: your checklist should be short enough to use, specific enough to matter, and tied to an action. Each item should lead to one of three outcomes: proceed, fix something, or escalate to a designated reviewer. This is how small teams stay fast while still being safe.
Practice note for Milestone 1: Use a privacy checklist for prompts and data sharing: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 2: Use a quality checklist for accuracy and hallucinations: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 3: Use a fairness checklist for harmful or biased outputs: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 4: Use a security checklist for access and account risks: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 5: Combine them into one “go/no-go” checklist for launches: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Checklists work because people are inconsistent under time pressure. In AI work, inconsistency shows up as “I thought it was fine” decisions: someone pastes sensitive data once, skips source checking once, or shares outputs that sound authoritative but are wrong. None of these are rare edge cases—they’re normal human shortcuts. A checklist is a way to design around those limits without requiring everyone to become an expert.
For small teams, the checklist also acts as a shared contract. It clarifies what “good enough” looks like and reduces debates that burn time. If the checklist says “no customer identifiers in prompts,” the discussion becomes about whether the content contains identifiers, not whether privacy matters. That shift is crucial: governance becomes operational, not philosophical.
To make a checklist usable, keep each line item: (1) observable (“Does the prompt contain X?”), (2) actionable (“Remove or mask X; use a synthetic example”), and (3) assigned (“Author checks; reviewer verifies”). Avoid vague items like “be ethical” or “ensure accuracy.” Replace them with concrete gates such as “verify two key claims against a primary source.”
Common mistakes include making a checklist too long (people stop using it), too abstract (people interpret it differently), or detached from workflow (it lives in a doc nobody opens). Place the checklist where work happens: a template in your ticketing system, a pull request checklist, or a “ready to share” note in your doc. Treat the completed checklist as a lightweight record of engineering judgment: what you checked, what you fixed, and what you accepted with rationale.
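To show what “observable, actionable, and assigned” looks like in practice, here are two illustrative checklist items as structured entries; the wording is an example, not a complete checklist, and the outcome set matches the proceed/fix/escalate rule from the chapter opening.

```python
# Each line item is observable (a question), actionable (a fix), and assigned.
checklist = [
    {
        "check": "Does the prompt contain customer identifiers?",
        "action": "Remove or mask them; use a synthetic example.",
        "assigned": "Author checks; reviewer verifies.",
    },
    {
        "check": "Have two key claims been verified against a primary source?",
        "action": "Verify or cut the claims before sharing.",
        "assigned": "Author checks; reviewer verifies.",
    },
]

# Every completed item resolves to one of three outcomes.
OUTCOMES = {"proceed", "fix", "escalate"}
```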
Milestone 1 is a privacy checklist for prompts and data sharing. The core rule is simple: assume anything you paste into an AI tool could be stored, reviewed for safety, or leaked via logs. Even when vendors promise strong controls, privacy-by-default keeps you safe across tools, accounts, and future changes.
Start by defining “restricted data” for your team in plain language. Typical categories include: customer personal data (names, emails, phone numbers, addresses), government IDs, payment details, authentication secrets, internal financials, unreleased product plans, legal matters, health information, and any dataset you promised to keep confidential. Then encode “what not to paste” as a prompt checklist that runs before you hit Enter.
Make the checklist practical with “safe alternatives.” Examples: mask values (“[EMAIL]”), aggregate counts instead of rows, paraphrase a message rather than pasting it, or create a minimal repro sample that contains no real customer data. A common mistake is forgetting attachments and screenshots—teams redact text but upload a spreadsheet or paste a screenshot with visible names. Your checklist should explicitly cover files, images, and links.
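If a small script fits your workflow, a rough masking helper can make redaction the default before anything is pasted. The sketch below assumes simple regex patterns for emails, phone numbers, and card-like digit runs; it will miss names, attachments, and screenshots, so it supplements the checklist rather than replacing a human look.

```python
# Rough "mask before you paste" helper. The patterns are a first pass,
# not a complete redaction tool; always eyeball the result.
import re

PATTERNS = {
    "[EMAIL]": r"[\w.+-]+@[\w-]+\.[\w.-]+",
    "[PHONE]": r"\+?\d[\d\s().-]{7,}\d",
    "[CARD]":  r"\b(?:\d[ -]*?){13,16}\b",
}

def mask(text: str) -> str:
    for placeholder, pattern in PATTERNS.items():
        text = re.sub(pattern, placeholder, text)
    return text

print(mask("Contact Ana at ana@example.com or +1 415 555 0100 about invoice 4821."))
# -> "Contact Ana at [EMAIL] or [PHONE] about invoice 4821."
```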
Outcome: by the end of this milestone you should have a one-page “Prompt Privacy Rules” sheet that anyone can apply in 30 seconds and a default behavior of using synthetic or redacted inputs.
Milestone 2 is a quality checklist aimed at accuracy and hallucinations. The practical reality is that AI can produce fluent output that is wrong in subtle ways. Your goal is not perfection; it is to reduce predictable failures with a small set of repeatable controls.
Start by classifying the output type, because the right checks differ. For internal brainstorming, you may accept higher error rates. For customer-facing instructions, legal claims, medical guidance, pricing, or technical configuration steps, you need stronger controls. Your checklist should explicitly ask: “What is the impact if this is wrong?” That single question drives the level of review.
Common mistakes include trusting the first draft, checking only grammar (not correctness), and failing to test edge cases. Add a simple “challenge pass”: ask the model to list uncertainties, or to produce counterexamples, or to provide a step-by-step rationale you can verify. The practical outcome is a workflow where AI speeds up drafting, but humans own correctness—especially at customer touchpoints.
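The "what is the impact if this is wrong?" question can be written down as a tiny routing rule so everyone applies the same level of review. The categories and review levels below are illustrative, not an official taxonomy.

```python
# Illustrative routing rule: higher-impact outputs get stronger review.
HIGH_IMPACT = {"customer_instructions", "legal", "medical", "pricing", "technical_config"}

def review_level(output_type: str, audience: str) -> str:
    if output_type in HIGH_IMPACT or audience == "customer":
        return "verify key claims against a primary source + second reviewer"
    if audience == "internal_decision":
        return "author verifies key claims against a source"
    return "light self-review (internal brainstorming)"

print(review_level("customer_instructions", "customer"))
```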
Milestone 3 is a fairness checklist to catch harmful or biased outputs early. Bias in small-team AI use often appears in subtle forms: stereotyped descriptions, unequal tone in customer communication, exclusionary language in job posts, or “policy” text that discourages certain groups. You do not need a research lab to reduce these risks—you need a few consistent review habits.
First, define where harm can occur in your context. If you generate marketing copy, you risk stereotypes and exclusion. If you summarize tickets, you risk misrepresenting customers or using disrespectful labels. If you draft hiring materials, you risk discouraging protected groups. Your checklist should include a quick “who is affected?” prompt: name the user groups and stakeholders impacted by the output.
A common mistake is treating fairness as only “no slurs.” The more frequent risk is unequal impact from seemingly neutral wording (for example, “must have native English” when it’s not required). Another mistake is assuming the model is “objective.” Your practical outcome is a lightweight review step where you scan for risk patterns and adjust wording, examples, or decision logic before content reaches users.
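A wording scan cannot judge fairness, but it can flag phrases worth a second look before the human review step. The sketch below uses a small, hypothetical pattern list purely to show the mechanic; it is not a vetted list.

```python
# Illustrative wording scan for the fairness pass. A keyword list only
# flags phrases for a human to reconsider; it does not decide anything.
RISK_PATTERNS = [
    "native english",       # often an unnecessary requirement
    "young and energetic",  # age-coded
    "cultural fit",         # can mask exclusion; say what you actually mean
    "recent graduate",      # age-coded in many contexts
]

def fairness_flags(text: str) -> list[str]:
    lowered = text.lower()
    return [p for p in RISK_PATTERNS if p in lowered]

draft = "Looking for a young and energetic writer; must have native English."
print(fairness_flags(draft))  # ['native english', 'young and energetic']
```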
Milestone 4 is a security checklist focused on access and account risks. Many AI incidents in small teams are not sophisticated attacks; they are ordinary security lapses: shared logins, personal accounts used for work, overbroad permissions, and unclear retention rules. Because AI tools often handle sensitive context (documents, conversations, code), account hygiene matters.
Start with identity and access management. Your checklist should require: named accounts (no shared passwords), strong authentication (MFA), and role-based access (only the people who need it). If your AI tool integrates with Google Drive, Slack, GitHub, or your CRM, treat that integration like any other system connection: it can expose more than you expect.
Common mistakes include assuming “enterprise” equals safe, forgetting that browser extensions can read content, and ignoring model settings that allow training on your data. Your practical outcome is a minimum security baseline: controlled accounts, explicit retention expectations, and a habit of reviewing what the tool can access before connecting it to internal systems.
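The baseline can be run as a short question set before any tool or integration is connected. The field names and wording below are illustrative assumptions; adapt them to your own vendor review.

```python
# Sketch of a pre-connection review. Any check that is not explicitly
# confirmed counts as a gap to close before rollout.
def security_gaps(tool: dict) -> list[str]:
    checks = {
        "named_accounts": "Shared logins in use; switch to named accounts.",
        "mfa_enabled": "Enable MFA before rollout.",
        "role_based_access": "Restrict access to the people who need it.",
        "training_on_data_disabled": "Confirm the tool does not train on your data.",
        "retention_documented": "Document how long prompts and files are retained.",
    }
    return [msg for key, msg in checks.items() if not tool.get(key, False)]

print(security_gaps({"named_accounts": True, "mfa_enabled": True}))
```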
Milestone 5 combines privacy, quality, fairness, and security into one “go/no-go” checklist for launches. This is the moment where checklists become a decision tool. Instead of arguing abstractly, you decide: are we ready to publish this content, ship this feature, or share these outputs with customers?
Your combined checklist should be short enough to run at the end of a project or before a major share-out. It should also define who can say “go.” For small teams, a practical model is: the author completes the checklist, and a designated reviewer (team lead, product owner, or rotating “AI steward”) signs off for customer-facing or high-impact releases.
Make “no-go” meaningful by pairing it with fast fixes. If privacy fails, redact and rerun. If quality fails, narrow the scope, add citations, or require human-authored final wording. If fairness fails, adjust tone and examples and re-review. If security fails, fix access before shipping. The practical outcome is a repeatable launch habit: you catch problems early, you can show what you checked, and you can ship with confidence without adding heavy process.
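The combined gate can be expressed as a small function: four areas, each with a pass/fail and a fast fix. The structure below is a sketch under those assumptions, not a prescribed format; a shared doc with four rows works just as well.

```python
# Illustrative go/no-go gate: the launch is "go" only when all four
# areas pass; each failed area carries its fast fix.
def go_no_go(checks: dict) -> tuple[str, list[str]]:
    fixes = [f"{area}: no-go -> {fix}"
             for area, (passed, fix) in checks.items() if not passed]
    return ("GO" if not fixes else "NO-GO", fixes)

decision, fixes = go_no_go({
    "privacy":  (True,  "redact and rerun"),
    "quality":  (False, "verify two key claims against a primary source"),
    "fairness": (True,  "adjust tone and examples, then re-review"),
    "security": (True,  "fix access before shipping"),
})
print(decision, fixes)
```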
1. Why do small teams most often fail at AI governance, according to the chapter?
2. What is the chapter’s core purpose for using checklists in AI work?
3. Which description best matches a well-designed checklist in this chapter?
4. What should each checklist item be tied to, based on the chapter’s guidance?
5. Why does the chapter recommend combining privacy, quality, fairness, and security checklists into a single “go/no-go” checklist for launches?
Small teams usually adopt AI the same way they adopt any helpful tool: one person tries it, it works, and suddenly it’s everywhere. That speed is a strength—until something goes wrong. This chapter gives you a lightweight way to define “what counts as an incident,” respond without panic, and keep enough documentation to learn and improve without turning your team into a bureaucracy.
The goal is not to eliminate mistakes. The goal is to notice them early, reduce harm, and prevent repeats. Incidents are also where governance becomes real: your policy, roles, and risk checklist turn into concrete actions. You will set a simple report-and-respond process, create a few reusable templates (a register, a decision log, and an incident note), run a mini training with FAQs, and finish with a 30-day rollout plan with success measures.
Engineering judgment matters here. AI outputs are probabilistic, and failures often look like “almost correct” work. The best small-team governance assumes two truths: (1) incidents will happen, and (2) you can manage them with consistent habits—clear definitions, quick triage, and minimal but reliable documentation.
As you read, keep your team’s real workflows in mind: drafting customer emails, summarizing calls, building code snippets, creating marketing content, analyzing support tickets, or screening candidates. The best incident process is the one your team will actually use on a busy Tuesday.
Practice note for Milestone 1: Define what counts as an AI incident for your team: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 2: Set a simple report-and-respond process: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 3: Create a small set of templates (register, decision log, incident note): document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 4: Run a mini training and publish FAQs: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for Milestone 5: Build your 30-day plan and success measures: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Milestone 1 is to define what counts as an AI incident for your team. If you don’t define it, people will guess—and they’ll usually under-report because they’re unsure, embarrassed, or too busy. Your definition should be broad enough to capture meaningful risk, but narrow enough that it doesn’t label every typo as an emergency.
A practical definition: an AI incident is any event where an AI tool, AI-generated output, or AI-enabled decision (a) exposes restricted data, (b) creates harm or credible risk to people, customers, or the business, or (c) leads to a materially wrong decision that wouldn’t likely have happened without the AI.
Common mistake: only counting incidents once there is external damage. For small teams, near-misses are gold. A near-miss (caught before sending, publishing, or deploying) should still be reported, at a lower severity, because it reveals a gap in your workflow.
Make the definition operational by including “report triggers” in plain language: “If you used AI with customer data,” “If an AI output would embarrass us if posted online,” or “If AI influenced a decision about a person.” This creates a shared reflex and reduces hesitation.
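The report triggers work best as plain yes/no questions: if any answer is yes, it gets logged, and severity is sorted out later. A minimal sketch, with the trigger wording taken from the examples above:

```python
# If any trigger is true, log a report; the reporter never has to decide severity.
REPORT_TRIGGERS = [
    "Did you use AI with customer or other restricted data?",
    "Would the AI output embarrass us if posted online?",
    "Did AI influence a decision about a person?",
]

def should_report(answers: list[bool]) -> bool:
    return any(answers)

print(should_report([False, True, False]))  # True -> log it
```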
Milestone 2 is a simple report-and-respond process. You want four steps that fit on one screen and work regardless of whether the incident involves marketing copy or production code. A good triage process is consistent, quick, and calm: stop, assess impact, notify, document.
1) Stop: Pause the activity that could spread harm. Don’t keep prompting “until it looks right.” Don’t ship the change “just this once.” If it’s customer-facing, disable the feature, pause the campaign, revert the commit, or switch to manual review. Stopping is not an admission of failure; it is a safety control.
2) Assess impact: Do a fast, bounded assessment (15–30 minutes). Ask: What data was involved? Who could be affected (customers, employees, partners)? How many items (one email vs. 10,000 records)? Is it ongoing? What’s the likelihood of real-world harm? If you lack certainty, assume a higher risk until proven otherwise.
3) Notify: Define a short notification chain. For small teams, one “AI owner” (or operations lead) plus one privacy/security point of contact is usually enough. If the incident touches regulated data, employment decisions, or public statements, include legal/compliance or a designated executive sponsor. Make it explicit when to notify an external party (e.g., vendor support, customers) and who approves that message.
4) Document: Capture the minimum facts while they are fresh: tool name/version, where the output appeared, who saw it, what inputs were used (or a safe summary), timestamps, and actions taken. Documentation is not for blame; it’s for learning and defensibility.
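The minimum facts fit in a small, consistent structure that doubles as a register entry. The field names below are an illustrative assumption; keep raw inputs out and store a safe summary instead.

```python
# Minimal incident note matching the "capture while fresh" fields.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class IncidentNote:
    tool: str             # name/version of the AI tool
    where_seen: str       # where the output appeared
    who_saw_it: str       # audience or recipients
    inputs_summary: str   # safe summary, never raw sensitive data
    actions_taken: str    # stop / contain / notify steps so far
    severity: str = "triage-pending"
    reported_at: str = field(
        default_factory=lambda: datetime.now().isoformat(timespec="minutes"))

note = IncidentNote(
    tool="Chat assistant (web, free tier)",
    where_seen="Draft reply in the shared support inbox",
    who_saw_it="Support team only; not sent to the customer",
    inputs_summary="Customer email pasted with name removed",
    actions_taken="Draft discarded; reviewer notified; prompt rules re-shared",
)
print(note)
```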
Common mistake: triage becomes a debate about whether it is “really an incident.” Avoid this by letting anyone report and by classifying severity later. If it feels like it might matter, it should be logged. Your process should reward early reporting with quick support, not extra work.
After triage, you turn the incident into a fix. Corrective actions should target the true failure mode, not just the visible symptom. In small teams, most AI incidents come from one of four places: the prompt, the process, the tool, or access.
Fix the prompt: If the model produced unsafe or misleading content, improve instructions and constraints. Add “do not” rules (e.g., “Do not infer medical conditions”), force citations to internal sources, or require a structured output with checks (“List assumptions,” “Flag uncertainty,” “Ask clarifying questions before advising”). Keep a prompt snippet library so people don’t re-invent risky prompts (an example snippet follows these four fixes).
Fix the process: If the incident occurred because AI output was treated as final, add a human review step at the right point. Examples: require a second-person check for outbound customer messages generated by AI; add a checklist item: “Verify with source doc”; label drafts clearly (“AI-assisted draft—needs review”). The best process fix is the smallest change that reliably prevents repetition.
Fix the tool: Sometimes the right fix is switching configurations or vendors: disable prompt logging, tighten retention, restrict plugins, turn off web browsing, or use an enterprise plan with better privacy controls. If hallucinations are unacceptable, use retrieval from approved documents, or constrain use to summarization instead of decision-making.
Fix access: Many incidents are permission problems. Remove access for high-risk tools, separate accounts, use SSO, set role-based permissions, and limit who can connect integrations to customer data. If contractors use AI, explicitly cover them in the policy and onboarding.
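Here is an illustrative snippet of the kind that belongs in a prompt library: explicit "do not" rules, a structured output request, and an instruction to ask rather than guess. The wording is an example, not a vetted template.

```python
# Example prompt snippet with explicit constraints; adapt the rules to your context.
SAFE_SUMMARY_PROMPT = """\
Summarize the ticket below for an internal handover.
Rules:
- Do not infer medical conditions, emotions, or intent.
- Quote the customer only where exact wording matters; otherwise paraphrase.
- List your assumptions and flag anything you are uncertain about.
- If key details are missing, ask clarifying questions instead of guessing.

Ticket:
{ticket_text}
"""

print(SAFE_SUMMARY_PROMPT.format(ticket_text="[redacted example ticket]"))
```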
Close the loop with a brief “what changed” note and a date to re-check. Common mistake: writing a long postmortem that no one reads. Prefer a short incident note plus one concrete control improvement. Governance succeeds when fixes are observable in day-to-day behavior.
Milestone 3 is a small set of reusable templates, because documentation should be easy. Create a single “governance folder” in the tool your team already uses (Google Drive, Notion, Confluence, SharePoint). Keep it boring, searchable, and shared with the right people. The point is not perfect recordkeeping; it is having the right artifacts when someone asks, “Why did we choose this tool?” or “What did we do after that incident?”
Minimum contents: an AI tool register (approved tools, their owners, and what data each is allowed to handle), a decision log (which tools were approved or rejected, and why), an incident note template (what happened, impact, actions taken, and the fix), and links to your AI use policy and FAQ.
Where to store sensitive details: be careful not to create a new data leak through documentation. Store raw prompts or outputs only if necessary, and redact personal data. Often, a summarized description (“customer email with name removed”) is enough. Restrict incident notes to a small group if they contain security or HR-sensitive information.
Common mistake: scattering documents across email threads and chat messages. Your governance folder should be the single source of truth. Link to it from your AI use policy and pin it in your team chat so reporting feels like a normal workflow, not a special event.
Milestone 4 is to run a mini training and publish FAQs. Small teams don’t need a certification program; they need shared expectations and quick answers in the moment of use. Plan a 30–45 minute session that covers: what AI is allowed for, what data is restricted, how to use the risk checklist, and how to report incidents without fear.
Make the training practical. Bring three real examples from your work: a customer email draft, a meeting summary, and a data analysis task. Show what “good” looks like (sanitized inputs, clear prompt constraints, human review). Then show what “not allowed” looks like (pasting raw customer records into a public tool, using AI to make final HR decisions, or publishing AI content without review).
Your FAQ should remove friction. Include questions like: “Can I paste customer text if I remove names?”, “Can I use AI for code?”, “What if I already pasted something sensitive?”, “Which tools are approved?”, and “Who do I message if I’m unsure?” Put the FAQ in the same place people work, not in a hidden policy document.
Adoption depends on social design. Assign a visible point person (often the AI owner) who answers questions quickly and treats reports as helpful. Praise near-miss reporting. If people feel punished for raising issues, they will stop reporting—and you will only learn about incidents when customers do.
Common mistake: training focuses on abstract ethics instead of daily decisions. Teach rules as defaults: what to do every time (sanitize, label drafts, verify against sources) and what to do never (share secrets, bypass review, automate high-stakes decisions without oversight).
Milestone 5 is a 30-day rollout plan with success measures. Metrics should reinforce the behaviors you want: early reporting, fast containment, and steady improvement. Avoid vanity metrics like “number of AI prompts written.” Instead, measure governance outcomes: fewer surprises, faster review, better trust.
A practical 30-day plan: week 1, publish your incident definition, report triggers, and the governance folder with its templates; week 2, run the mini training and post the FAQ where people actually work; week 3, confirm every AI tool in use has an owner and a register entry, and handle early reports with quick, supportive triage; week 4, review what was reported, pick one concrete control improvement, and share the metrics below with the team.
Suggested metrics for a small team: median time from report to containment; number of near-misses reported (higher at first is good); percentage of AI tools with an owner and register entry; review turnaround time for new tools; and the count of customer-facing AI outputs that passed required review.
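If your register lives in a spreadsheet or a simple list, two of these metrics take only a few lines to compute. The entries and field names below are illustrative.

```python
# Median time to containment and near-miss count from a toy register.
from datetime import datetime
from statistics import median

FMT = "%Y-%m-%dT%H:%M"
register = [
    {"reported": "2024-05-02T09:10", "contained": "2024-05-02T10:05", "near_miss": True},
    {"reported": "2024-05-09T14:00", "contained": "2024-05-09T14:40", "near_miss": False},
    {"reported": "2024-05-20T11:30", "contained": "2024-05-20T12:45", "near_miss": True},
]

def minutes_to_containment(entry: dict) -> float:
    start = datetime.strptime(entry["reported"], FMT)
    end = datetime.strptime(entry["contained"], FMT)
    return (end - start).total_seconds() / 60

print("median minutes to containment:",
      median(minutes_to_containment(e) for e in register))
print("near-misses reported:", sum(e["near_miss"] for e in register))
```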
Common mistake: aiming for zero incidents. A mature governance program often sees more incident reports early because people finally have a safe way to flag problems. Success looks like faster response, fewer repeated failures, and increasing confidence from stakeholders who can see that AI use is intentional and controlled.
1. Which definition best matches the chapter’s approach to an “AI incident” for a small team?
2. What is the primary goal of the incident process described in the chapter?
3. Why does the chapter say “engineering judgment” matters in incident handling?
4. Which set of templates does the chapter recommend keeping small but reusable?
5. Which combination best reflects the chapter’s lightweight rollout approach to make governance real in day-to-day work?