AI in EdTech & Career Growth — Intermediate
Evaluate EdTech AI vendors fast—securely, legally, and procurement-ready.
AI-powered EdTech tools can deliver real instructional and operational gains—but they also introduce new privacy, security, and contracting risks that traditional vendor reviews often miss. This course is structured like a short technical book: six chapters that take you from framing the decision to producing a procurement-ready approval packet. You’ll learn how to ask the right questions, request the right evidence, and translate findings into contract language and measurable service commitments.
Whether you’re supporting a school district, a university, or an education company buying AI-enabled platforms, you need a process that is repeatable, auditable, and fast enough to keep up with product cycles. You’ll build a practical due diligence workflow that aligns IT, privacy, legal, procurement, and academic stakeholders—so decisions don’t stall or get made on trust alone.
AI features change data handling in subtle ways: prompts and chat logs can contain sensitive information, outputs can reveal protected data, and “model improvement” clauses can quietly expand how vendors use your content. You’ll learn to map data flows end-to-end, validate vendor statements about training and retention, and set guardrails that are realistic for classroom and campus operations.
This course demystifies common artifacts such as SOC 2 reports, ISO certificates, security whitepapers, and data protection addenda. You’ll learn what these documents can prove, what they can’t, and which follow-up questions reveal real maturity. On the privacy and legal side, you’ll connect practical EdTech realities to FERPA and COPPA expectations, plus GDPR/UK GDPR requirements when applicable—so you can confidently coordinate with counsel and avoid last-minute surprises.
Strong vendor governance doesn’t end with “approved.” You’ll learn how to negotiate SLAs that match educational impact, define support and escalation paths, set incident reporting timelines, and require ongoing assurance (not just one-time paperwork). Finally, you’ll build a scoring model and documentation trail that makes your decision defensible to leadership, auditors, and the public.
Ready to formalize your process and reduce risk on your next AI purchase? Register free to start, or browse all courses to compare learning paths.
EdTech Compliance & Security Lead (Privacy, SOC 2, Vendor Risk)
Sofia Chen leads vendor risk, privacy, and security reviews for K-12 and higher-ed technology deployments. She has supported SOC 2 audits, FERPA/GDPR contract addenda, and procurement teams implementing risk-based due diligence for AI-powered tools.
AI has moved EdTech procurement from “does it work?” to “does it work safely, lawfully, and predictably under pressure?” Traditional software due diligence focused on availability, basic security controls, and a privacy policy. AI adds new failure modes: sensitive data can be inferred from prompts, model outputs can leak regulated information, and vendors may rely on sub-processors you never directly contract with (foundation model providers, labeling services, analytics, or monitoring tools).
This chapter gives you a practical framework you can reuse: define the decision (use case, users, data, and impact), set risk appetite and approvals, build an evidence-first checklist and log, score objectively with go/no-go thresholds, and plan timelines and communication so procurement is defensible. The goal is not to eliminate risk; it is to make risk visible, assigned, and controlled—so your campus can adopt AI without surprises.
As you read, keep a single question in mind: “If this vendor has a bad day—an outage, breach, or inaccurate output—what happens to students, staff, and the institution?” Your due diligence process should turn that question into measurable requirements, documented evidence, and contract terms that hold up in audits and post-incident reviews.
Practice note for “Define the decision: use case, users, data, and impact”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Set risk appetite and approvals (IT, legal, procurement, academics)”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Create a due diligence checklist and evidence log”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Choose a scoring model and go/no-go thresholds”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Plan timelines, ownership, and stakeholder communications”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
AI changes vendor risk because it changes data behavior and decision behavior. In many EdTech tools, student data is stored and retrieved. In AI-enabled tools, student data may be transformed into embeddings, used to fine-tune a model, logged as prompts, or used to personalize outputs. Even if the vendor promises “we do not train on your data,” you still must ask where prompts are stored, who can access them, and what downstream systems receive them.
Start every review by defining the decision: the use case, users, data, and impact. Write a one-page “AI Use Case Definition” that answers: Who uses it (students, faculty, advisors, admins)? What decisions does it influence (grading feedback, advising, accommodations, disciplinary actions)? What data enters (PII, education records, health data, biometrics, minors’ data)? What outputs are produced (summaries, recommendations, automated messages), and how could those outputs cause harm (bias, hallucinations, disclosure, reputational damage)?
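To keep these one-pagers consistent across reviews, it can help to capture them as structured records rather than free-form prose. The sketch below is one possible shape in Python; the field names and the sample use case are illustrative, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class AIUseCaseDefinition:
    """One-page definition that frames the rest of the review."""
    name: str
    users: list[str]                 # students, faculty, advisors, admins...
    decisions_influenced: list[str]  # grading feedback, advising, accommodations...
    data_elements: list[str]         # PII, education records, minors' data...
    outputs: list[str]               # summaries, recommendations, automated messages
    potential_harms: list[str]       # bias, hallucinations, disclosure, reputation
    involves_minors: bool = False

    def missing_answers(self) -> list[str]:
        """Flag empty fields so the one-pager cannot silently skip a question."""
        required = ("users", "decisions_influenced", "data_elements",
                    "outputs", "potential_harms")
        return [f for f in required if not getattr(self, f)]

# Hypothetical example: an AI writing tutor pilot in higher ed.
tutor = AIUseCaseDefinition(
    name="AI writing tutor pilot",
    users=["students", "faculty"],
    decisions_influenced=["formative feedback on drafts"],
    data_elements=["assignment text", "student ID"],
    outputs=["inline suggestions"],
    potential_harms=["hallucinated feedback", "disclosure of draft content"],
)
assert not tutor.missing_answers()
```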
Common mistakes include treating AI as a feature instead of a system, accepting generic vendor claims (“enterprise-grade security”) without proof, and skipping data flow mapping because “it’s just chat.” In education, AI can implicate FERPA (education records), COPPA (children under 13), GDPR/UK GDPR (lawful basis, transparency, data subject rights), accessibility requirements, and institutional academic integrity policies. Your framework must connect the use case to the relevant legal and policy obligations before you ever negotiate price.
AI due diligence fails most often due to unclear ownership. Security may review controls, legal may review terms, and academics may champion usability—yet no one integrates the decision into a single “yes/no with conditions.” Build a stakeholder map early and use a simple RACI (Responsible, Accountable, Consulted, Informed) to prevent gaps.
At minimum, map: IT/security (security architecture, identity, logging, incident response), privacy/legal (FERPA/COPPA/GDPR/UK GDPR, DPA terms, sub-processor clauses), procurement (RFP/RFI governance, vendor onboarding, financial/insurance checks), academic leadership (pedagogical fit, integrity, fairness), accessibility/ADA office (assistive tech compatibility, VPAT review), and data governance/institutional research (data definitions, retention rules, analytics constraints). If minors are involved, include K-12 district compliance leadership or campus youth-program leads.
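The stakeholder map itself can live as data, which makes the most common gap (no single accountable owner) mechanically checkable. A minimal sketch, with illustrative roles and decisions:

```python
# RACI sketch: R=Responsible, A=Accountable, C=Consulted, I=Informed.
# Roles and decisions are examples; adjust to your institution.
raci = {
    "approve data flows":     {"IT/security": "A", "privacy/legal": "R", "procurement": "C"},
    "sign DPA":               {"privacy/legal": "A", "procurement": "R", "IT/security": "C"},
    "accept pedagogical fit": {"academic leadership": "A", "accessibility office": "C"},
    "grant exceptions":       {"CISO": "A", "privacy/legal": "R", "academic leadership": "I"},
}

# Each decision needs exactly one Accountable owner, or the integration gap
# described above (everyone reviews, no one decides) reappears.
for decision, assignments in raci.items():
    accountable = [role for role, code in assignments.items() if code == "A"]
    assert len(accountable) == 1, f"{decision!r} lacks a single accountable owner"
```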
Set risk appetite and approvals explicitly. Example: “High-risk AI use cases (automated decisions impacting student standing or involving minors’ data) require CISO + privacy officer + counsel sign-off and a contractual right to audit.” Define escalation paths: when a vendor cannot meet a control, who can approve an exception, and what compensating controls are acceptable (e.g., disable training, reduce data scope, enforce SSO and least privilege, or route through a privacy proxy).
An evidence-first workflow means you don’t debate claims; you collect artifacts, log them, and decide based on what is verifiable. Start with an intake step (often an RFI) that requests: architecture overview, data flow diagram, SOC 2 Type II report or ISO 27001 certificate (and scope statement), penetration test executive summary, vulnerability management process, incident response policy, privacy policy, DPA template, sub-processor list, data retention schedule, and AI-specific documentation (model behavior limits, training data policy, prompt logging practices).
Create a due diligence checklist and an evidence log. The checklist is your control questions; the evidence log is your record of what you received, from whom, when, and what it proves. Treat this like change management: each artifact has a named reviewer, a review date, and an outcome (pass/fail/needs follow-up). This turns informal email threads into a defensible procurement record.
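The evidence log works well as a plain CSV kept next to the checklist. The sketch below shows one possible column set; the sample artifact and dates are hypothetical.

```python
import csv

# Columns that turn an email thread into a procurement record: who sent what,
# when it was reviewed, by whom, and what it actually proves.
FIELDS = ["artifact", "received_from", "received_on", "reviewer",
          "reviewed_on", "outcome", "what_it_proves"]

rows = [
    {"artifact": "SOC 2 Type II report", "received_from": "vendor security team",
     "received_on": "2025-09-02", "reviewer": "IT/security lead",
     "reviewed_on": "2025-09-10", "outcome": "needs follow-up",
     "what_it_proves": "controls operated over period; AI pipeline out of scope"},
]

with open("evidence_log.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```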
Use targeted questions to validate SOC 2/ISO coverage. A SOC 2 report is not a blanket approval; check: Is the system you’re buying in scope? Are sub-processors covered? Are there exceptions in the auditor’s testing? For AI, ask where prompts and outputs are stored, how long, and whether humans can review them. If the vendor uses third-party LLM APIs, require clarity on data handling at that layer and the contractual terms that prevent provider-side training on your data.
To keep reviews consistent, group findings into risk categories and map them to control families. This allows you to compare vendors fairly and communicate risks in language stakeholders understand. A practical set of categories for EdTech AI includes: Security, Privacy/Legal, Reliability/Operations, AI Safety & Quality, Accessibility & Equity, and Financial/Commercial.
Within Security, anchor on control families such as identity and access management (SSO/SAML, MFA, RBAC), encryption (at rest/in transit), secure SDLC, vulnerability management, logging and monitoring, incident response, and third-party risk management. For Privacy/Legal, track data minimization, purpose limitation, retention and deletion, sub-processor transparency, breach notification timelines, cross-border transfers, and FERPA “school official” requirements where applicable. For Reliability/Operations, focus on uptime, RTO/RPO, support hours, change management, and dependency risk (e.g., reliance on a single LLM provider).
AI Safety & Quality deserves its own controls: model limitations disclosure, human-in-the-loop options, output filtering, citation/grounding features, bias testing, evaluation methodology, and mechanisms to prevent prompt injection and data exfiltration. Don’t ignore the educational dimension: if the tool impacts grading or advising, require a documented policy for how staff validate outputs and how students can contest decisions.
Procurement becomes defensible when you can explain why one vendor is acceptable and another is not, using a scoring model that reflects campus risk. Build a weighted rubric with categories aligned to your control families. For example: Security 30%, Privacy/Legal 25%, Reliability/SLA 20%, AI Safety/Quality 15%, Accessibility 10%. Adjust weights by use case: a student-facing chatbot that handles education records might increase Privacy/Legal and AI Safety weight.
Define go/no-go thresholds before you start scoring. Examples of common “no-go unless exception approved” items: no signed DPA; refusal to disclose sub-processors; inability to support SSO/MFA for staff accounts; breach notification longer than your policy; use of student prompts for training by default; no clear data deletion process; or an SLA that disclaims meaningful remedies. A scoring model is not just arithmetic—it embeds engineering judgment about what risks are intolerable.
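One way to make the rubric-plus-thresholds idea concrete is to compute the weighted score and the gate check separately, so a strong score can never mask a failed gate. The weights mirror the example above; the gate names and vendor data are illustrative.

```python
# Weighted rubric with hard no-go gates. Scores are 0-5 per category.
WEIGHTS = {"security": 0.30, "privacy_legal": 0.25, "reliability_sla": 0.20,
           "ai_safety_quality": 0.15, "accessibility": 0.10}
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9

NO_GO_GATES = ["signed_dpa", "sub_processors_disclosed", "sso_mfa_for_staff",
               "no_training_on_prompts_by_default", "data_deletion_process"]

def evaluate(vendor: dict) -> tuple[float, list[str]]:
    """Return (weighted score, failed gates). Failed gates mean no-go
    unless an exception is formally approved."""
    failed = [g for g in NO_GO_GATES if not vendor["gates"].get(g, False)]
    score = sum(WEIGHTS[c] * vendor["scores"][c] for c in WEIGHTS)
    return score, failed

vendor = {
    "scores": {"security": 4, "privacy_legal": 3, "reliability_sla": 4,
               "ai_safety_quality": 3, "accessibility": 5},
    "gates": {g: True for g in NO_GO_GATES} | {"signed_dpa": False},
}
score, failed = evaluate(vendor)
print(f"score={score:.2f}", "NO-GO:" if failed else "eligible", failed)
# A 3.70/5 score does not rescue a missing DPA; the gate decides.
```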
When a vendor fails a control, decide whether to reject, require remediation, or accept with compensating controls. Compensating controls might include limiting the data scope (no PII in prompts), configuring “no logging,” using a gateway that redacts identifiers, restricting use to staff-only accounts, or requiring a pilot in a sandbox tenant. Document exceptions with a named approver, an expiration date, and a follow-up plan. The most common mistake is granting permanent exceptions with no owner and no sunset, which turns temporary risk into institutional debt.
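An exception record needs only a few fields to stay honest: a named approver, an expiration date, and a follow-up plan. A minimal sketch, assuming your governance system can store records like this (names are hypothetical):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class RiskException:
    control: str                      # the control the vendor failed
    compensating_controls: list[str]  # what reduces the risk meanwhile
    approver: str                     # a named person, not a team
    expires: date                     # no permanent exceptions
    follow_up: str

    def is_expired(self, today=None) -> bool:
        return (today or date.today()) >= self.expires

exc = RiskException(
    control="vendor cannot disable prompt logging",
    compensating_controls=["no PII in prompts", "staff-only accounts"],
    approver="J. Rivera (CISO)",  # hypothetical approver
    expires=date(2026, 6, 30),
    follow_up="re-review when vendor ships retention controls",
)
```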
Assume you will need to justify this purchase later—to auditors, leadership, a regulator, or after an incident. Build a documentation pack as you go. At minimum, include: the AI Use Case Definition, data flow diagram, data classification decision, RACI and approval record, completed checklist, evidence log, security and privacy findings summary, scoring rubric and completed scorecard, exception approvals (if any), and final contract exhibits (DPA, SLA, security addendum, sub-processor list).
Keep the audit trail simple but consistent. Use versioned documents, store artifacts in a controlled repository, and record key decisions in a procurement ticket or governance system. For communications, don’t rely on informal chat threads; summarize decisions in writing with dates and named approvers. This is especially important for AI-ready SLAs: document uptime targets, support response times, incident response commitments, breach notification windows, and audit rights. If you negotiate “right to receive SOC 2 annually” or “advance notice of sub-processor changes,” include those as explicit contract terms, not side emails.
Finally, plan timelines and stakeholder communications. Set expectations early: security review may take two to four weeks depending on vendor responsiveness, and legal negotiation can be the critical path. Provide status updates in plain language (“waiting on SOC 2 scope statement,” “DPA redlines with counsel,” “pilot approved with data minimization constraints”). The practical goal is to move quickly without cutting corners, because the pack you create is what makes fast adoption repeatable.
1. What is the key shift AI introduces to EdTech procurement compared to traditional software due diligence?
2. Which of the following is an AI-specific failure mode highlighted in the chapter?
3. Why does the chapter stress that vendors may rely on sub-processors you do not directly contract with?
4. What is the purpose of using an evidence-first checklist and evidence log in the framework?
5. Which question should guide the due diligence process according to the chapter?
Before you can evaluate an EdTech AI vendor’s security claims or negotiate an AI-ready SLA, you need a clear picture of what data exists, where it moves, and what an AI feature changes about privacy risk. Many procurement teams start with a vendor questionnaire and end with a “meets FERPA” checkbox. That approach fails when AI is involved because AI tools introduce new data types (prompts, embeddings, generated content), new transfers (to model providers, logging systems, and analytics pipelines), and new ambiguity (whether a vendor “trains” on your data, how long they keep it, and what “delete” means).
This chapter is a practical workflow: (1) inventory and classify data (student, staff, research); (2) draw a data flow diagram from collection to processing, storage, and sharing; (3) validate vendor claims about training, retention, and deletion; (4) design consent, notice, and acceptable-use guardrails; and (5) turn the result into procurement and contract requirements. If you do this work up front, you can spot hidden transfers, reduce scope, and make your due diligence evidence-based rather than trust-based.
Throughout, keep a simple rule: every AI use case must have a documented “data story”—what is collected, why, where it goes, how long it stays, and who can access it. If a vendor cannot tell that story clearly, your risk is already high.
Practice note for “Classify data types and sensitivity (student, staff, research)”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Draw the data flow diagram (collection → processing → storage → sharing)”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Validate vendor claims about training, retention, and deletion”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Design consent, notice, and acceptable-use guardrails”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Write AI data requirements for procurement and contracts”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Start by classifying data based on sensitivity and regulatory impact, not on where it lives today. A useful approach is to define 3–5 tiers (e.g., Public, Internal, Confidential, Restricted) and then map common education data types to those tiers. In K-12, the presence of minors shifts defaults: treat most student-associated data as at least Confidential, and anything that could harm a student if exposed as Restricted. In higher ed, add an explicit category for research data (including IRB-covered human subjects data) because it often has contractual and ethical constraints beyond FERPA.
Build your inventory around “data elements” rather than systems. For an AI tutoring tool, list what may be sent: student name/ID, course and roster membership, assignment content, rubric scores, disability accommodations, behavior notes, free-text reflections, and any files uploaded. Then label each element with: owner (district/university), legal basis (FERPA/COPPA/GDPR/contract), and minimum necessary principle (is it required, optional, or prohibited?).
Common mistakes include relying on vendor labels (“PII” vs “non-PII”) without your own definitions, and forgetting derived data (risk scores, engagement metrics) that may be treated as student records once linked to an identifiable student. A practical outcome of this section is a one-page data classification matrix you can attach to RFIs/RFPs and DPAs: “these elements may be processed,” “these may not,” and “these require explicit approval.”
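That matrix can live as a short spreadsheet, or as data a review script can check before an integration goes live. A sketch with hypothetical elements and tiers:

```python
# Classification matrix: element -> (tier, what the vendor may receive).
TIERS = ("Public", "Internal", "Confidential", "Restricted")

MATRIX = {
    "course catalog text":       ("Public",       "allowed"),
    "assignment content":        ("Confidential", "allowed"),
    "student name / ID":         ("Confidential", "requires approval"),
    "disability accommodations": ("Restricted",   "prohibited"),
    "behavior notes":            ("Restricted",   "prohibited"),
}

def vendor_may_receive(element: str) -> bool:
    tier, rule = MATRIX[element]
    assert tier in TIERS
    return rule == "allowed"

assert not vendor_may_receive("behavior notes")
```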
A data flow diagram (DFD) is the fastest way to reveal what a vendor description hides. Draw it even if the vendor provides an architecture diagram; marketing diagrams typically omit logging, support access, and sub-processors. Your DFD should follow the lifecycle: collection → processing → storage → sharing, and include every boundary crossing (student device to vendor, vendor to cloud host, vendor to AI model provider, vendor to analytics, vendor to customer support tools).
Use a consistent set of nodes: users (students, teachers, admins), client apps, your identity provider (SSO), the vendor application, the AI service (first-party or third-party), data stores, logging/telemetry, and external recipients (sub-processors, integrations, and partners). Label edges with the data elements from your inventory and whether the transfer is continuous, event-based, or batch. Include “non-obvious” flows: error reporting, feature analytics, content moderation, abuse detection, and human review queues.
Engineering judgment matters in deciding which flows are acceptable for the use case. For example, sending full assignment text to an external LLM provider may be acceptable for plagiarism feedback in higher ed, but not for K-12 counseling notes. A strong DFD becomes a contract exhibit: it anchors sub-processors, audit scope, and incident response expectations. If a vendor cannot confirm the DFD in writing, treat their security posture as unverified.
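Even before drawing the picture, you can record the flows as labeled edges and scan them for boundary crossings that carry restricted elements. Node names and data elements below are illustrative:

```python
# Each edge: (source, destination, data elements crossing the boundary).
FLOWS = [
    ("student app", "vendor app",        ["assignment text", "student ID"]),
    ("vendor app",  "LLM provider",      ["assignment text"]),
    ("vendor app",  "telemetry service", ["usage events", "student ID"]),
    ("vendor app",  "support tooling",   ["screenshots"]),  # often omitted
]

EXTERNAL = {"LLM provider", "telemetry service", "support tooling"}
RESTRICTED = {"student ID", "behavior notes", "accommodations"}

for src, dst, elements in FLOWS:
    leaked = RESTRICTED.intersection(elements)
    if dst in EXTERNAL and leaked:
        # Here: student ID reaching a telemetry sub-processor gets flagged.
        print(f"review: {src} -> {dst} carries {sorted(leaked)}")
```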
AI features create new categories of data that traditional EdTech reviews miss. The prompt itself can contain sensitive content (a student confiding mental health issues; a teacher pasting an IEP excerpt; a staff member uploading a disciplinary report). Outputs can also become sensitive because they may summarize or infer information (“student appears disengaged,” “likely has ADHD,” “risk of failing”). Derived data—embeddings, topic vectors, personalization profiles, and risk scores—often persist longer than raw text and can be harder to interpret and delete.
During due diligence, require the vendor to identify and classify: (1) prompt data, (2) uploaded attachments, (3) generated outputs, (4) conversation history, (5) safety/moderation logs, and (6) derived representations (embeddings, indexes). Then ask where each is stored, how it is used, and whether it is linked to an identifiable student or staff member. If outputs are stored in a student record system, they may become part of the education record and subject to access and amendment rights.
A common mistake is focusing only on “PII” and missing that a student’s writing sample can be identifiable even without a name. Practical outcomes here include: role-based restrictions (students cannot paste certain categories), UI warnings, and a “prohibited data” list baked into acceptable-use policy and in-product controls. Your DFD should show prompt and output storage separately, not as a single “AI data” bucket.
Vendor statements like “we don’t train on your data” are often ambiguous. You need precise definitions. “Training” may refer to (a) pretraining a foundation model, (b) fine-tuning a model on customer data, (c) using prompts/outputs to improve safety filters, (d) using aggregated analytics to improve product UX, or (e) building retrieval indexes (RAG) from your content. Some vendors consider (c) and (d) not “training,” even though they still involve reusing your data for broader benefit.
Translate marketing language into contractable commitments. Ask the vendor to specify, for each data category (prompts, outputs, attachments, logs, derived data): whether it is used for (1) training/fine-tuning, (2) evaluation, (3) safety monitoring, (4) product analytics, (5) retrieval indexing, or (6) third-party model provider improvement. Require an explicit opt-in for any cross-customer model improvement, especially in K-12.
Engineering judgment: sometimes limited retention for safety (e.g., abuse detection) is reasonable, but it must be bounded, transparent, and configurable. The practical outcome of this section is a set of procurement clauses: “No customer data used for training or fine-tuning without explicit written opt-in,” plus a requirement for a subprocessor list that includes AI model providers and labeling services.
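One way to operationalize this is a category-by-use matrix stating your required position, which you can attach as an RFP exhibit and compare against vendor answers. The stances below are examples of what an institution might require, not any vendor's actual terms:

```python
USES = ("training_finetuning", "evaluation", "safety_monitoring",
        "product_analytics", "retrieval_indexing", "provider_improvement")

# Required position per data category; anything unlisted defaults to
# "prohibited unless stated in the DPA" in this sketch.
REQUIRED_POSITION = {
    "prompts":      {"training_finetuning": "opt-in only",
                     "provider_improvement": "prohibited"},
    "outputs":      {"training_finetuning": "opt-in only"},
    "attachments":  {"training_finetuning": "prohibited",
                     "retrieval_indexing": "customer-scoped index only"},
    "derived_data": {"provider_improvement": "prohibited"},
}

for category, positions in REQUIRED_POSITION.items():
    for use, stance in positions.items():
        assert use in USES
        print(f"{category}: {use} -> {stance}")
```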
Retention is where good intentions fail in production systems. A vendor may offer a “delete” button, but data can persist in logs, analytics warehouses, search indexes, or backups. Your job is to make retention concrete and testable. Break retention into categories: operational data (needed to run the service), security logs (needed for detection and investigations), analytics (often over-collected), and AI artifacts (conversation history, embeddings, vector indexes).
Ask for a retention schedule with default values and customer-configurable options. For example: prompts retained 0–30 days; outputs stored until user deletes; security logs 90–365 days; backups 30–90 days; derived embeddings purged within X days of deleting the source. Require the vendor to explain deletion semantics: soft delete vs hard delete, and how deletion propagates to sub-processors.
Common mistakes include accepting “we retain as long as necessary” and forgetting that “necessary” can expand over time. Practical outcomes: retention requirements in the RFP, SLA language tying support access to time-bounded workflows, and a DPA exhibit that lists each data store type and its deletion timeline. If the vendor can’t describe where data is stored, they can’t delete it reliably.
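Retention becomes testable once written as limits per data store. A sketch comparing illustrative limits against hypothetical vendor answers:

```python
# Maximum retention in days; None means "until the user deletes".
RETENTION_MAX_DAYS = {
    "prompts": 30,
    "outputs": None,
    "security_logs": 365,
    "backups": 90,
    "embeddings": 30,   # purge within 30 days of deleting the source
}

vendor_answers = {"prompts": 90, "outputs": None, "security_logs": 365,
                  "backups": 60, "embeddings": 0}

for store, max_days in RETENTION_MAX_DAYS.items():
    actual = vendor_answers.get(store)
    if max_days is not None and (actual is None or actual > max_days):
        print(f"follow up: {store} retained {actual} days, limit is {max_days}")
```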
Privacy by design is not a policy statement; it is a set of controls that reduce data collection and prevent misuse in daily classroom workflows. Start with guardrails that align consent, notice, and acceptable use to the realities of AI. In K-12, confirm how COPPA consent is handled (school-based consent vs parental), how notices are delivered, and what the vendor expects teachers to communicate. In higher ed, focus on transparency, choice where appropriate, and clear boundaries for staff use.
Implement layered controls: (1) product configuration (disable prompt logging, restrict integrations, region lock), (2) identity and access (SSO, role-based access, least privilege), (3) data minimization (remove names/IDs from prompts by default), and (4) user experience nudges (inline warnings, “don’t paste” banners, and templates that avoid sensitive data). Create an acceptable-use policy that is specific to AI: what may be entered, what may not, and how outputs may be used (e.g., not as sole basis for discipline decisions).
The practical outcome is a “minimum AI privacy controls” checklist attached to your procurement package and referenced in the contract: what must be configurable, what must be logged, and what must be prohibited. When an AI feature is introduced mid-year, these controls let you re-evaluate quickly without restarting due diligence from scratch.
1. Why is a simple vendor questionnaire ending in a “meets FERPA” checkbox insufficient when evaluating an EdTech AI vendor?
2. Which sequence best matches the chapter’s workflow for AI vendor due diligence?
3. What is the purpose of drawing a data flow diagram in this chapter’s approach?
4. Which vendor claim requires explicit validation under this chapter’s guidance?
5. According to the chapter’s “data story” rule, which set of elements must be documented for every AI use case?
Security due diligence for an EdTech AI vendor is not a “checklist exercise.” Your goal is to decide whether the vendor’s controls are strong enough for your specific school or campus risk, and whether the vendor can prove those controls work in practice. This chapter gives you a practical workflow to review third-party assurance (SOC 2/ISO 27001), validate identity and access controls, evaluate encryption and isolation, and confirm the vendor can detect and respond to real incidents.
Start by grounding your review in the data flow you mapped in earlier work: what data enters the system (student records, staff HR data, behavioral analytics, special education indicators), where it is processed (vendor cloud, sub-processors, model providers), and what leaves the system (reports, model outputs, exports). Then align your questions to the most likely failure modes: unauthorized access, leakage across tenants, insecure integrations, unpatched vulnerabilities, and delayed breach response.
In procurement, time is limited, so prioritize evidence that is both objective and current. Vendor claims (“we encrypt everything,” “we are SOC 2 compliant”) are not evidence. Evidence is a report, a policy, a configuration screenshot, a penetration test executive summary, a sample audit log, or contract language that creates enforceable obligations. Your process should end with practical outcomes: a defensible risk rating, a list of required mitigations, and SLA/contract terms that match the operational reality of a school environment (limited IT staff, high account churn, and strict notification timelines).
The sections that follow walk through the core control areas you should validate for AI-enabled EdTech, including the common mistakes that cause schools to accept risk unintentionally.
Practice note for “Review SOC 2/ISO evidence and interpret what it does (and doesn’t) cover”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Assess identity, access, and admin controls (SSO, RBAC, MFA)”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Evaluate encryption, key management, and tenant isolation”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Check vulnerability management and secure SDLC signals”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Confirm incident response readiness and breach notification commitments”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
SOC 2 and ISO 27001 can accelerate due diligence, but only if you read them like a risk reviewer—not like a marketing brochure. Start with the basics: SOC 2 Type II (not Type I) gives evidence that controls operated over a period of time. Verify the report period, the auditor, and which Trust Services Criteria (TSC) are included (Security is common; Availability, Confidentiality, Processing Integrity, and Privacy may be optional). If the vendor says “SOC 2 compliant,” ask for the actual report, plus a bridge letter if the report period ended more than 3–6 months ago.
Next, validate scope. The most frequent gap in EdTech AI reviews is that the SOC 2 scope covers only the “core SaaS platform” but excludes critical components like the AI model pipeline, third-party model providers, data labeling vendors, analytics tools, or customer support systems where sensitive screenshots can live. If your data flow includes an LLM API operated by a sub-processor, but the report scope does not cover it, your review is incomplete. Request a sub-processor list and map it to the scoped systems in the report.
For ISO 27001, ask for the certificate and the Statement of Applicability (SoA). The certificate alone does not tell you which Annex A controls are implemented or justified as “not applicable.” Use the SoA to spot gaps in access management, cryptography, supplier management, and incident response. The practical outcome of this section is a “coverage decision”: what risk can be reduced based on independent evidence, and what requires follow-up testing or contractual protection.
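The coverage decision reduces to set arithmetic once you list the systems your data flow touches and the systems the report actually scopes. A minimal sketch with hypothetical system names:

```python
systems_in_data_flow = {"core SaaS platform", "AI model pipeline",
                        "third-party LLM API", "support tooling"}
soc2_scoped_systems = {"core SaaS platform"}

uncovered = systems_in_data_flow - soc2_scoped_systems
print("covered by independent evidence:", sorted(soc2_scoped_systems))
print("needs follow-up testing or contract terms:", sorted(uncovered))
```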
Identity is where EdTech deployments fail in real life. Your due diligence should confirm the vendor supports secure authentication (how users prove who they are) and secure authorization (what they can do once inside), including strong admin controls. Require SSO (SAML/OIDC) for staff and, where feasible, for students. If SSO is not available, insist on MFA for all admin accounts at minimum, and ideally for any account that can export data, manage integrations, or view sensitive student information.
Move past yes/no questions. Ask for details that reveal engineering maturity: Can the product enforce MFA (not just “supports”) and can you require it per role? Does it support conditional access (IP restrictions, device posture) or at least session timeouts and re-authentication for sensitive actions? How are passwords stored (salted hashing) and are accounts protected against brute force (rate limits, lockouts)?
Least privilege should be demonstrable. Request a role matrix showing permissions, and verify that the default role is restrictive. Ask whether the vendor supports separate admin roles (billing admin, security admin, content admin) and whether API keys and service accounts can be scoped and rotated. If the system includes AI features (prompt libraries, assistant configuration, data connectors), treat those as privileged actions: who can turn on retrieval over student data, who can change the model provider, and who can export conversation transcripts?
The outcome is a deployment-ready access plan: which roles your campus will use, which features must be disabled by default, and what controls must be contractually required (SSO/MFA enforcement, timely deprovisioning, audit logs for admin actions).
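A role matrix can be validated the same way: assert that the default role holds no privileged permission, and list who holds each privileged one. Roles and permissions below are illustrative:

```python
# Treat AI configuration as privileged, per the guidance above.
PRIVILEGED = {"export_data", "manage_integrations", "enable_retrieval",
              "change_model_provider", "export_transcripts"}

ROLES = {
    "default_user":   {"use_assistant"},
    "teacher":        {"use_assistant", "view_class_reports"},
    "security_admin": {"use_assistant", "export_data", "manage_integrations"},
    "ai_admin":       {"enable_retrieval", "change_model_provider"},
}

# The default role must be restrictive: no privileged permissions.
assert not ROLES["default_user"] & PRIVILEGED

# Every privileged permission should map to a named admin role.
for perm in sorted(PRIVILEGED):
    holders = [r for r, perms in ROLES.items() if perm in perms]
    print(perm, "->", holders or "UNASSIGNED (decide before go-live)")
```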
Infrastructure questions are not about whether the vendor uses “the cloud.” They are about whether your data is protected at rest, in transit, and across customers. Begin with encryption in transit: TLS 1.2+ for all web/API traffic, secure cipher suites, and HSTS where applicable. Then confirm encryption at rest for databases, object storage, and backups. If the vendor cannot clearly articulate what is encrypted, where, and how keys are managed, assume the design is fragile.
Key management is where vendors vary widely. Ask whether they use a managed KMS (AWS KMS, Azure Key Vault, GCP KMS), how keys are rotated, and who can access or use keys. For higher-risk data (special education, counseling notes, disciplinary data), ask about customer-managed keys (CMK) or at least strict internal separation of duties. A common mistake is accepting “we encrypt at rest” while the vendor’s staff can still access production data broadly through support tooling.
If AI is involved, include the model data path: embeddings stores, vector databases, feature stores, prompt logs, and training pipelines. Confirm whether your data is used for training by default, and if not, how that is technically enforced. Also ask where data is stored geographically and whether sub-processors can move it across regions. The outcome of this section should be a clear statement of isolation and cryptography assurances you can rely on, plus any compensating controls you must demand (e.g., disable data retention for prompts, restrict support access, or require regional hosting).
Secure systems come from secure engineering habits. Your due diligence should look for signals of a secure SDLC: code review practices, dependency management, secrets handling, and controlled releases. Ask whether the vendor uses automated scanning (SAST, dependency/SCA scanning, container scanning) and whether critical findings block releases. Ask how secrets are stored (no plaintext in repos; use secret managers) and how they prevent credentials from leaking into client-side code.
Change control matters for schools because unexpected updates can break integrations or alter privacy behavior. Validate whether the vendor has release approval, rollback plans, and change logging. If the product has an AI assistant, ask how prompt templates, safety filters, retrieval connectors, and model configurations are changed and tested. A frequent oversight is treating “AI configuration” as content rather than code—when in practice it can change output behavior and data exposure risk.
Treat “we do pen tests annually” as a vague claim until you know what was tested and what has changed since. For a vendor that recently added AI features, you should ask specifically whether the AI components were included: prompt injection risks, data connector authorization, output filtering, and logging/retention of conversations. The outcome is a practical confidence level in the vendor’s engineering process and a list of required deliverables (test summaries, remediation attestations, and patch commitments) to include in your procurement file.
Controls that exist only on paper fail quietly. Monitoring and logging determine whether the vendor can detect misuse, contain incidents, and support your investigations. Start by asking what they log by default: authentication events (success/failure), admin actions, data exports, permission changes, API key creation, integration changes, and AI-relevant actions such as enabling retrieval over datasets or changing retention settings. Logs should be tamper-resistant, time-synchronized, and retained long enough to support investigations (often 90 days minimum; longer for higher-risk contexts).
Then focus on alerting and operational coverage. Ask whether they have 24/7 on-call for security incidents or only business-hours support. Schools operate on predictable cycles (enrollment, grading, testing windows), but attackers do not. For AI features, ask what anomalous behavior detection exists: spikes in exports, unusual admin actions, high-volume prompt activity, or repeated access denied events that may signal credential stuffing.
Common mistakes include accepting “we log everything” without understanding retention, access controls, and whether logs include the actions you care about (exports, role changes, connector configuration). The outcome is a monitoring expectation that is both technical and contractual: required log types, retention, admin access to audit trails, and support response times for suspected account compromise or data exposure.
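Those expectations can be written as a checklist and diffed against vendor answers. Event names and day counts below are illustrative defaults, not standards:

```python
REQUIRED_EVENTS = {"auth_success", "auth_failure", "admin_action",
                   "data_export", "permission_change", "api_key_created",
                   "retention_change", "retrieval_enabled"}
MIN_RETENTION_DAYS = 90

# Hypothetical vendor answer from a security questionnaire.
vendor_logging = {"events": {"auth_success", "auth_failure", "admin_action"},
                  "retention_days": 30}

missing = REQUIRED_EVENTS - vendor_logging["events"]
if missing:
    print("missing event types:", sorted(missing))
if vendor_logging["retention_days"] < MIN_RETENTION_DAYS:
    print(f"retention {vendor_logging['retention_days']}d is below the "
          f"required {MIN_RETENTION_DAYS}d minimum")
```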
Incident response (IR) is where due diligence becomes real. You are not only evaluating whether the vendor has an IR policy—you are confirming whether they can execute under pressure and whether their contract terms match your legal obligations. Ask for their IR plan or an executive summary: roles, escalation paths, severity definitions, and how they coordinate with sub-processors (cloud providers, model providers). Confirm they run tabletop exercises and track improvements.
Next, align breach notification commitments to school requirements. Many vendor templates use vague language like “without undue delay” but omit hard timelines. Your procurement position should specify: initial notification window (e.g., 24–72 hours after confirmation), ongoing updates cadence, and the minimum content of notifications (what happened, what data types, what users affected, what containment steps, what you should do). Also confirm whether they will notify affected individuals directly or support your institution in doing so, consistent with FERPA/COPPA/GDPR obligations and your district’s communications protocols.
Engineering judgment matters here: a vendor with strong prevention controls but weak IR can still be a bad fit for a campus with strict reporting timelines and limited internal incident capacity. The outcome of this section is a negotiated, AI-ready incident clause in your SLA/DPA that includes notification timelines, cooperation commitments, post-incident reporting, and (when justified) the right to obtain independent evidence of remediation.
1. According to Chapter 3, what is the primary goal of security due diligence for an EdTech AI vendor?
2. Which approach best reflects the chapter’s recommended starting point for a security review?
3. Which item is considered “evidence” (not merely a vendor claim) in Chapter 3?
4. When prioritizing what to investigate under limited procurement time, what does Chapter 3 recommend?
5. Which set best matches the chapter’s “most likely failure modes” to align questions against?
Legal review in EdTech AI procurement is not a “paperwork” step; it is where you convert policy promises into enforceable obligations and operating constraints. Schools and campuses handle high-risk data (minors, education records, special categories, research data), and AI features add new processing patterns: model prompts, embeddings, telemetry, and automated decisions. This chapter gives you a practical review workflow that connects regulations (FERPA, COPPA, GDPR/UK GDPR) to the contracts you sign (DPAs, AI terms, SLAs) and the controls you must operate (training, access controls, incident response, retention).
Start by identifying what rules apply and whose rules they are: (1) law and regulation (federal/state, GDPR/UK GDPR); (2) institutional policy (data classification, records retention, acceptable use); (3) district/campus contract standards (required addenda, insurance, audit rights). Then map the data flow for the specific AI use case: what data enters the vendor (inputs), what the vendor generates (outputs), what it logs (telemetry), and where it sends data next (sub-processors, cross-border transfers). Your goal is to ensure the contract matches the real architecture. A common mistake is reviewing a “standard DPA” in isolation while the product’s AI features are sending prompts to a separate model provider not covered by the DPA.
As you read documents, translate legal obligations into operational controls and training. If the contract says “delete within 30 days,” someone must own a deprovisioning workflow. If it says “use only for providing the service,” engineers must ensure analytics or model training is off by default for student data. If it promises “assist with DSARs,” support and data engineering must have a repeatable export/delete process. Treat legal review as requirements engineering: define roles, purposes, and boundaries; then validate evidence and implementation.
In the sections that follow, you’ll apply this workflow to the highest-leverage topics: controller/processor roles, FERPA clauses, COPPA consent, GDPR lawful basis and DSARs, international transfers, and AI-specific terms around ownership and prohibited uses.
Practice note for “Identify applicable regulations and institutional policy requirements”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Review and negotiate DPAs: roles, purpose limitation, and sub-processors”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Validate international transfers and data residency needs”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Handle children’s data, parental consent, and classroom exceptions”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Practice note for “Translate legal obligations into operational controls and training”: document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Before you negotiate any DPA language, determine whether the institution and vendor are acting as controller or processor (GDPR terms) and how that maps to US contract concepts (service provider/operator). In most classroom and campus deployments, the school is the controller (decides purpose and means), and the vendor is the processor (processes data on documented instructions). This distinction is not academic: it determines who can decide new uses, who must respond to rights requests, and whether the vendor can reuse data for product improvement or model training.
Practical test: ask, “If the vendor changed how it uses student data tomorrow, could it do so without our approval?” If the answer is yes, you are not in a clean processor relationship. Another test: “Can the vendor combine our students’ data with other customers’ data to improve a model?” If yes, that is typically a separate purpose requiring explicit permission, and may create joint-controller risks under GDPR and policy risk under FERPA expectations.
Common mistake: accepting a DPA that describes the vendor as a processor while the Terms of Service grant broad rights to “improve our services” using customer content. Resolve conflicts by stating that the DPA controls for personal data and education records, and by defining “improvement” narrowly (e.g., internal QA on aggregated/de-identified data). Your outcome is a role statement that matches reality, plus a list of permitted purposes that the product team can actually operate.
FERPA compliance for vendors is largely contractual and operational: the institution must ensure the vendor qualifies as a “school official” with “legitimate educational interests,” and that the vendor is under the direct control of the institution regarding use and maintenance of education records. Your legal review should therefore focus on clauses that lock the vendor into FERPA-aligned behavior, supported by practical controls.
Start by defining what counts as an education record in the use case. In AI tools, this often includes assignment submissions, feedback, grades, accommodations notes, and sometimes chat transcripts that become part of the record. Then ensure the contract clearly states: (1) the vendor will use data only to deliver the educational service; (2) the institution retains control; (3) the vendor will not redisclose data except as allowed under the agreement; (4) the vendor will support access, amendment, and deletion requests as directed by the institution.
Disclosures matter. If the institution must provide annual FERPA notices or specific disclosures about third-party services, capture exactly what the vendor needs to supply: a plain-language description of data categories, purposes, and sub-processors. Common mistake: leaving disclosure to marketing language that omits AI-specific processing (prompt logs, evaluation datasets). Practical outcome: a FERPA addendum or DPA schedule that can be copied into parent/student notices and that stands up if questioned by counsel or auditors.
COPPA risk increases when an EdTech tool is used by children under 13 and the vendor collects personal information online. AI features often expand collection beyond obvious fields (name/email) into telemetry (device identifiers, usage patterns), free-text prompts, voice recordings, or behavioral analytics. Your review should identify whether the product is directed to children, used in an elementary context, or knowingly collects data from under-13 users.
For school-authorized educational use, COPPA allows schools to consent on behalf of parents in many situations, but only if the vendor uses the data solely for the educational purpose and not for commercial purposes unrelated to the school. Translate that into contract and configuration: prohibit interest-based advertising; limit analytics; disable optional data collection; and require deletion when no longer needed.
Common mistake: focusing only on “consent language” while ignoring embedded SDKs (crash reporting, analytics) that collect identifiers. For AI, also watch for “help us improve the model” checkboxes: for under-13 users, treat opt-in as sensitive and require school approval, not child choice. Practical outcome: a documented COPPA stance—either (1) tool not approved for under-13 accounts, enforced by roster rules, or (2) tool approved with a verified school-consent workflow and minimized telemetry controls.
If you serve learners or staff in the EU/EEA or UK—or partner with institutions there—GDPR/UK GDPR becomes central. Begin with lawful basis for processing. In education, common bases include performance of a task in the public interest (public institutions), performance of a contract, or legitimate interests (often for limited analytics). Consent is fragile in power-imbalanced contexts like schools and is usually a last resort for non-essential processing. Your contract should state the lawful basis assumptions and prohibit the vendor from repurposing data in ways that would require new consent.
Next, determine whether a DPIA (Data Protection Impact Assessment) is required. AI tools frequently trigger DPIA criteria: systematic monitoring, large-scale processing of children’s data, or automated decision-making that affects learners. Even if not strictly required, running a lightweight DPIA is good practice: document the purpose, necessity, risks (bias, hallucinations, privacy leakage), and mitigations (human review, access controls, data minimization, transparency).
Common mistake: assuming GDPR is “handled by the DPA” without checking whether the vendor actually has DSAR workflows, export formats, and identity verification steps. Practical outcome: a GDPR annex (or addendum) with (1) processing details, (2) security measures, (3) DSAR and breach procedures, and (4) DPIA support commitments (information sharing, mitigation cooperation). This turns compliance into an operational capability rather than a statement.
Modern AI vendors rarely process data alone. They rely on cloud hosting, logging providers, support platforms, and often third-party model APIs. Your due diligence must treat sub-processors as part of the system, because data protection obligations flow downstream. Review the vendor’s sub-processor list and ensure it is specific (company names, locations, service function) and kept current with notice of changes and an objection mechanism.
International data transfers are where many reviews fail. If personal data moves from the EU/UK to a country without an adequacy decision, you typically need Standard Contractual Clauses (SCCs) (or the UK Addendum) plus a transfer risk assessment and “supplementary measures” where appropriate. Data residency requirements (e.g., “EU-only processing”) must be validated technically: not just where databases live, but where support staff access occurs, where logs are stored, and where model providers process prompts.
Common mistake: accepting “we use AWS” as sufficient. You need the region, the services used, and whether any data is routed to US-based observability tools or LLM providers. Practical outcome: an approved sub-processor register linked to the contract, plus SCCs where needed and a clear statement of where data is stored and processed (including support access paths).
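A sub-processor register is more useful as structured data than as a PDF appendix, because you can scan it for transfer gaps. Entries and the adequacy set below are simplified, hypothetical examples:

```python
REGISTER = [
    {"name": "CloudHost EU", "location": "EU", "function": "hosting",
     "transfer_mechanism": None},   # intra-EU, no transfer mechanism needed
    {"name": "LLM Provider", "location": "US", "function": "model API",
     "transfer_mechanism": "SCCs + supplementary measures"},
    {"name": "LogCo",        "location": "US", "function": "observability",
     "transfer_mechanism": None},   # gap: needs SCCs or removal from the flow
]

ADEQUATE = {"EU", "UK"}  # simplified stand-in for adequacy decisions

for sp in REGISTER:
    if sp["location"] not in ADEQUATE and not sp["transfer_mechanism"]:
        print(f"transfer gap: {sp['name']} ({sp['function']}) in {sp['location']}")
```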
AI terms are now as important as security terms. They define who owns inputs (prompts, student work), who owns outputs (generated feedback, summaries), and what the vendor can do with both. In education, you want the institution (or user, as appropriate) to retain rights to inputs and outputs, while granting the vendor only a limited license to process them to provide the service. Explicitly address whether customer content can be used to train models, to create benchmarks, or to improve a shared model.
Prohibited uses should be written in a way that matches real classroom risks and AI failure modes. For example, prohibit using the tool for high-stakes decisions without human review (discipline, grading without oversight, admissions screening). Require safeguards against generating or storing sensitive categories (IEP details, health data) unless explicitly approved and protected. Include confidentiality expectations for prompts and outputs, and clarify that the vendor will not claim ownership over student-created work embedded in prompts.
Common mistake: ignoring conflicts between AI terms and privacy terms—e.g., a DPA says “processor only,” but AI terms say “we may use content to develop and improve AI models.” Resolve by making the DPA and an AI addendum control for student/staff personal data and education records. Practical outcome: AI-ready terms that align with your data classification and training program, so staff know what they can paste into the tool, what must be redacted, and when to use approved “student mode” configurations.
1. In this chapter, why is legal review described as more than a “paperwork” step in EdTech AI procurement?
2. What is the recommended first step in the chapter’s practical review workflow?
3. Which scenario best illustrates the chapter’s “common mistake” in reviewing DPAs for AI-enabled products?
4. What does the chapter mean by translating legal obligations into operational controls and training?
5. Which pair of deliverables does the chapter specify for documenting legal/privacy obligations and data movement?
Due diligence often ends with a reassuring demo, a security questionnaire, and a “trust us” slide about reliability. Contracting is where you convert those assurances into enforceable commitments that match instructional reality. In EdTech AI, weak contracts create a predictable pattern: a classroom depends on a tool during testing week, the vendor schedules “routine maintenance,” the service slows down, and everyone learns—too late—that the agreement never defined downtime, response times, or data export formats.
This chapter focuses on practical contracting mechanics: defining service scope and non-functional requirements (availability and latency), aligning SLAs and credits to the educational impact, setting clear support and change-management expectations, establishing audit rights and ongoing assurance, and finalizing an exit strategy with portability and deletion evidence. You are not trying to “win” negotiations; you are trying to reduce operational and compliance risk with language that can be measured, reported, and enforced.
A useful mindset is to treat the contract like an interface specification. If your district or campus depends on the system for identity, rostering, grading workflows, or AI tutoring, you need precise definitions and observable metrics. If the vendor cannot define or measure a commitment, you cannot manage it—and you cannot defend the procurement decision when things fail.
The rest of this chapter breaks down the contract components that matter most for AI-enabled EdTech services and offers practical wording strategies and common mistakes to avoid.
Practice note: apply the same discipline to each skill in this chapter, whether you are defining service scope and non-functional requirements (availability, latency); negotiating SLAs, SLOs, and credits that match instructional impact; setting support, escalation, and change-management expectations; locking in audit rights, reporting, and ongoing assurance; or finalizing the exit strategy (portability, deletion certificate, transition support). Document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
People use SLA, SLO, and KPI interchangeably, which leads to contracts full of “metrics” that don’t actually create obligations. Treat them as three different layers of control.
SLA (Service Level Agreement) is the enforceable promise: if the vendor fails to meet it, there is a remedy (service credits, termination rights, or other contractual consequences). SLO (Service Level Objective) is the target the vendor designs to hit most of the time; it can be stricter than the SLA and is useful for operational transparency, but it may not trigger a remedy. KPI (Key Performance Indicator) is a business/operational metric that informs governance (e.g., adoption, accuracy, time-to-resolution), usually without contractual penalties.
In education, the trick is choosing which items must be SLAs because failures directly disrupt instruction or create compliance exposure. For example: availability during school hours, incident response timing for a suspected breach, or time to restore roster sync after SIS changes. Other metrics—like AI feature accuracy or model response time—may start as KPIs and mature into SLOs/SLAs once you have baseline data.
Common mistake: accepting a KPI dashboard as “the SLA.” If there is no contractual definition, reporting cadence, and remedy, it’s just a chart. Practical outcome: you leave contracting with a short list of true SLAs (enforceable), a longer list of SLOs (transparent operational targets), and KPIs (governance signals) that feed QBRs and renewal decisions.
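A simple metric register keeps the three layers straight, because it forces every entry labeled “SLA” to name its remedy. A minimal sketch, with hypothetical metrics and remedies:

```python
# Each metric is tagged with its layer; only SLAs carry a contractual remedy.
metrics = [
    {"metric": "Uptime during school hours", "layer": "SLA",
     "target": "99.5%", "remedy": "tiered service credits"},
    {"metric": "Breach notification time", "layer": "SLA",
     "target": "72h", "remedy": "termination right"},
    {"metric": "Roster sync restoration", "layer": "SLA",
     "target": "4h", "remedy": None},  # mislabeled: no remedy defined
    {"metric": "Median AI response latency", "layer": "SLO",
     "target": "< 2s", "remedy": None},
    {"metric": "Teacher adoption rate", "layer": "KPI",
     "target": "trend up", "remedy": None},
]

# Sanity check: anything labeled "SLA" without a remedy is really an SLO.
for m in metrics:
    if m["layer"] == "SLA" and not m["remedy"]:
        print(f'WARNING: "{m["metric"]}" has no remedy; it is not a true SLA.')
```

Running this flags the roster-sync entry, which is exactly the “chart, not a commitment” failure described above.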
Availability is the headline number, but the definitions underneath determine whether it protects instruction. Start by stating the service scope and non-functional requirements: which components must be up, and what “usable” means (e.g., login works, assignments load, AI responses return within a defined latency).
Define uptime carefully. Many vendors exclude large categories of failure: third-party outages, “beta” features, integrations, or anything they deem “customer misconfiguration.” You should accept some exclusions, but not the ones that shift core reliability risk onto the school. If the product depends on a specific cloud region or an LLM provider, you can allow third-party exclusions only if the vendor offers mitigations (multi-region, graceful degradation, caching, or alternative providers) and clear communication duties.
Service credits should reflect instructional impact. A flat 5% credit for a major outage during exams rarely motivates improvement. Consider tiered credits (higher credits for school-hour outages), and add a chronic failure clause: if uptime falls below a threshold for two of three months, you gain the right to terminate or require a corrective action plan.
Common mistake: accepting a single uptime number (e.g., 99.9%) without clarifying hours, time zone, exclusions, and degradation. Practical outcome: you can map availability commitments to real classroom risk and know exactly what evidence you need when the vendor misses the mark.
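To make tiered credits and a chronic-failure clause concrete, here is a minimal sketch. The tier structure, school-hours escalation, and two-of-three-months rule follow the discussion above, but the specific percentages and thresholds are hypothetical.

```python
def monthly_credit(uptime_pct: float, school_hours_outage_min: int) -> float:
    """Service credit as a fraction of monthly fees (hypothetical tiers)."""
    if uptime_pct >= 99.9:
        credit = 0.0
    elif uptime_pct >= 99.5:
        credit = 0.05
    elif uptime_pct >= 99.0:
        credit = 0.10
    else:
        credit = 0.25
    # Outages during instructional hours hurt more; escalate the tier.
    if school_hours_outage_min > 60:
        credit = min(credit * 2, 0.50)
    return credit

def chronic_failure(last_three_months: list[float],
                    threshold: float = 99.5) -> bool:
    """Two misses in a rolling three-month window trigger exit rights."""
    return sum(1 for u in last_three_months if u < threshold) >= 2

print(monthly_credit(99.2, school_hours_outage_min=90))  # 0.2
print(chronic_failure([99.8, 99.3, 99.1]))               # True
```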
Support language is where many EdTech contracts quietly fail. “We provide email support” is not a plan when an SSO outage blocks 18,000 students. Build support expectations around incident severity, school calendars, and who is allowed to open tickets.
Start with tier definitions (P1/P2/P3) tied to instructional impact. A P1 in education is often “students cannot access required instruction or assessment,” not merely “system is down.” Require initial response time and restoration targets (or workaround targets) for each severity. Response time alone is insufficient if the vendor responds quickly but resolves slowly.
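Severity tiers are easiest to enforce when written as a table of measurable targets keyed to impact, not vendor labels. A sketch with hypothetical response and restoration targets:

```python
# Severity matrix keyed to instructional impact (targets are hypothetical).
SEVERITY = {
    "P1": {"definition": "students cannot access required instruction/assessment",
           "response": "15 min", "restore_or_workaround": "4 hours"},
    "P2": {"definition": "major feature degraded; workaround exists",
           "response": "1 hour", "restore_or_workaround": "1 business day"},
    "P3": {"definition": "minor issue; no instructional impact",
           "response": "1 business day", "restore_or_workaround": "next release"},
}

def classify(blocks_instruction: bool, workaround_exists: bool) -> str:
    """Pick a severity from observed classroom impact."""
    if blocks_instruction and not workaround_exists:
        return "P1"
    return "P2" if blocks_instruction else "P3"

print(classify(blocks_instruction=True, workaround_exists=False))  # P1
```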
Include expectations for customer responsibilities (e.g., you maintain SIS credentials, your network allows vendor IP ranges) so failures don’t become finger-pointing. Also require a post-incident review for major incidents with root cause, corrective actions, and prevention commitments.
Common mistake: forgetting that integration issues (SIS sync, rostering, SSO) are the most frequent classroom blockers. Practical outcome: you can run a predictable escalation workflow during high-stakes periods and ensure vendor changes do not break school operations without warning.
Security contracting should feel like “ongoing assurance,” not a one-time questionnaire. Attach security requirements as a schedule or exhibit so they are versioned and enforceable. This is where you lock in reporting, attestations, and your right to verify controls.
Start with attestations: SOC 2 Type II and/or ISO 27001 can be strong evidence, but only if you specify delivery timing (e.g., annually within 30 days of issuance), scope (which systems and subservice organizations), and notification of material exceptions. If a vendor has neither, require a roadmap with interim controls and the right to reassess pricing/renewal based on progress.
For AI systems, add specifics: how prompts and outputs are logged, how long they are retained, whether they are used for model training, and how data is segregated by tenant. Require that any use of student data for training is opt-in with clear written consent and a documented de-identification standard when applicable.
Common mistake: treating a SOC 2 as “proof of security” without checking scope or carve-outs. Practical outcome: you have a repeatable mechanism for evidence refresh, you can detect control drift, and you have defined rights when the vendor’s security posture changes.
Exit planning is a core risk control. Schools change tools, grant funding expires, and vendors get acquired. If you cannot export data cleanly or obtain a deletion certificate, you carry privacy and operational risk long after the contract ends.
Begin with data ownership and permitted use. The institution should own or control student/staff data; the vendor should act as a processor/service provider with narrowly defined purposes. Explicitly prohibit secondary uses such as advertising, data brokerage, or training foundation models on identifiable student data unless there is a separate, opt-in agreement.
Also handle legal holds and records retention. The vendor should support your retention obligations (e.g., exporting records before deletion) and clearly state what data must be retained for compliance, how it is protected, and when it is ultimately purged.
Common mistake: assuming “you can export your data” means “you can rebuild your program elsewhere.” Practical outcome: if you switch vendors mid-year, you can move rosters, content mappings, and essential learning records without losing continuity or violating privacy commitments.
Signing the contract is the beginning of vendor management, not the end. AI products evolve quickly—models change, features appear, and data flows expand. Your goal is to prevent “control drift,” where the service gradually moves outside the risk posture you approved.
Set a cadence for Quarterly Business Reviews (QBRs) or at least semiannual reviews, and specify required inputs: uptime reports, incident summaries, support performance, roadmap and deprecations, security evidence refresh, and privacy-impact changes. Tie these reviews to a named owner on both sides (the vendor's customer success manager plus your IT/security/privacy lead) so issues are not trapped in informal email threads.
Use renewal time as leverage. If the vendor consistently misses SLOs or drifts on controls, renewals should require a corrective action plan, pricing adjustments, or feature limitations. Conversely, if performance is strong, you can simplify governance by narrowing reporting to what matters most.
Common mistake: treating renewals as a procurement formality. Practical outcome: you maintain a living due diligence posture, detect early warning signals, and keep the vendor relationship aligned to instructional needs and regulatory obligations over time.
1. What is the chapter’s main reason for emphasizing contracting in EdTech AI due diligence?
2. Which contracting approach best reflects the chapter’s recommended mindset?
3. Which set of terms best represents the non-functional requirements the chapter highlights as needing clear definition?
4. What is the key risk illustrated by the testing-week “routine maintenance” example?
5. Which combination best captures the chapter’s recommended exit strategy elements?
Due diligence becomes real during procurement. This is where a school or campus translates risk appetite into measurable requirements, tests what vendors actually built, and creates a decision record that can survive scrutiny from leadership, auditors, and families. In AI-enabled EdTech, the procurement workflow must do three things at once: protect student/staff data, ensure the product improves learning or operations, and produce a contract and implementation plan that won’t collapse under the first incident or outage.
This chapter walks through an end-to-end execution pattern: writing an AI-aware RFI/RFP, running demos and proof-of-value with privacy-safe data, scoring vendors using a weighted matrix, preparing an executive approval packet, and launching with governance. The goal is not “more paperwork.” The goal is a repeatable workflow that makes tradeoffs explicit and keeps high-risk assumptions from slipping into production by accident.
Procurement is also where engineering judgment matters. Many AI claims are technically true but operationally misleading (for example, “we don’t store prompts” while keeping extensive telemetry, or “we’re SOC 2” without clarifying scope exclusions). Your job is to convert marketing language into testable commitments, request evidence at the right depth, and document any exceptions before approval—not after a complaint or breach.
Practice note: apply the same discipline to each skill in this chapter, whether you are drafting an AI-aware RFI/RFP with measurable requirements; running demos and proof-of-value with privacy-safe test data; scoring vendors with a weighted matrix and documenting exceptions; preparing the executive approval packet and implementation gates; or launching with governance (training, policies, and post-launch reviews). Document your objective, define a measurable success check, and run a small experiment before scaling. Capture what changed, why it changed, and what you would test next. This discipline improves reliability and makes your learning transferable to future projects.
Vendors hand-wave when requirements are vague. “Must be secure” invites a brochure. “Encrypt data at rest using AES-256 and in transit using TLS 1.2+; provide evidence of configuration and key management approach” forces an answer you can verify. Start your RFI/RFP with a short context statement: the use case (instructional tutoring, grading assistance, student support chatbot, analytics), data types involved (student PII, education records, staff HR data), and where the tool will be used (district-managed devices, BYOD, LMS integration). This aligns vendor responses to the actual risk profile.
Write requirements in measurable terms and separate them into: (1) mandatory “shall” items, (2) scored “should” items, and (3) informational questions. For AI features, specify boundaries: what the model can do, what it must not do, and how you will detect drift. Examples of requirements vendors cannot hand-wave include: acceptable identity standards (SAML 2.0/OIDC with district IdP), audit logging fields (user ID, timestamp, action, dataset, prompt category), data retention windows (e.g., prompts deleted in 30 days unless legally required), and training restrictions (no training on student content without explicit written authorization).
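Requirements are harder to hand-wave when each one is structured with its tier and the evidence type that satisfies it. A minimal sketch of a machine-readable requirement list, using hypothetical entries drawn from the examples above:

```python
# Each requirement states its tier and the evidence that satisfies it.
requirements = [
    {"id": "SEC-01", "tier": "shall",
     "text": "Encrypt data at rest (AES-256) and in transit (TLS 1.2+)",
     "evidence": "configuration report + key management summary"},
    {"id": "ID-01", "tier": "shall",
     "text": "SAML 2.0/OIDC federation with district IdP",
     "evidence": "live demo + integration guide"},
    {"id": "AI-01", "tier": "shall",
     "text": "No training on student content without written authorization",
     "evidence": "contract clause + data flow documentation"},
    {"id": "LOG-01", "tier": "should",
     "text": "Audit log fields: user, timestamp, action, dataset, prompt category",
     "evidence": "screenshot of audit log export"},
]

def blockers(responses: dict[str, str]) -> list[str]:
    """A 'shall' with no evidence on file blocks the vendor from advancing."""
    return [r["id"] for r in requirements
            if r["tier"] == "shall" and not responses.get(r["id"])]

print(blockers({"SEC-01": "soc2_section_4.pdf", "ID-01": "demo_notes.md"}))
# ['AI-01']
```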
Common mistakes: listing every possible requirement (making all vendors noncompliant), failing to specify evidence type (policy vs. report vs. screenshot vs. contract clause), and not distinguishing pilot vs. production requirements. A practical approach is a two-tier RFP: pilot requirements (safe test data, limited access, restricted features) and production requirements (full logging, SSO, SLA, DPA, security sign-off). This keeps innovation possible without normalizing shortcuts.
Questionnaires succeed when they are short enough to complete, specific enough to validate, and mapped to a decision. Organize your security and privacy questionnaire around verifiable controls: governance, access control, encryption, vulnerability management, SDLC, incident response, data handling, and AI model governance. For each control, request both an answer and evidence. For example: “Describe your vulnerability scanning cadence” is incomplete; add “Provide the last two completed scan summaries and remediation SLA policy.”
For security posture, anchor on SOC 2 Type II or ISO 27001 when possible, but don’t stop at the badge. Ask for: SOC 2 report with management response, scope statement (systems, products, regions), the list of subservice organizations, and the bridge letter if the report is older than 12 months. For ISO 27001, request the certificate, statement of applicability, and recent internal audit summary. Then add targeted questions that connect to your use case, such as how the vendor isolates tenant data, how admin roles are constrained, and whether privileged access is logged and reviewed.
Privacy evidence must be similarly concrete: DPA template, data retention schedule, data deletion method (including backups), subprocessors and locations, DPIA/PIA availability, and mechanisms for parental consent and age gating where applicable. For AI, request documentation on content filters, hallucination risk mitigation, and how the system handles sensitive attributes. If the vendor offers “no training on your data,” ask what they mean operationally: do they use prompts for product improvement, evaluation, abuse detection, or fine-tuning? Each can be a form of processing requiring controls and transparency.
Finally, make demos and proof-of-value part of evidence gathering. Require vendors to show admin consoles, audit logs, data export/delete flows, and SSO setup—not just end-user features. Run the proof-of-value using privacy-safe test data: synthetic student records, anonymized transcripts, or district-approved de-identified samples. This reduces risk while still validating whether the AI performs and whether controls behave as promised.
A defensible selection process is a scoring model plus a decision record that explains tradeoffs. Build a weighted matrix that reflects institutional priorities. Typical categories include: instructional/operational fit, data protection (security + privacy), AI governance and safety, integration and interoperability, accessibility (WCAG alignment), vendor viability, and total cost of ownership. Weighting is where leadership intent becomes measurable—if privacy and security are “critical,” they must carry enough weight to change outcomes.
Define scoring anchors to reduce subjectivity. A “5” should mean the vendor provided evidence and demonstrated the control; a “3” might mean partial control or policy-only; a “1” means missing or contradicted. Require evaluators to cite artifacts (SOC report section, screenshot of audit log, contract clause). This prevents “gut feel” decisions and supports later audits or public records requests.
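The matrix itself is simple arithmetic; the value lies in fixing weights and anchors before scoring begins. A minimal sketch with hypothetical weights and two illustrative vendors:

```python
# Weights reflect institutional priorities and must sum to 1.0.
WEIGHTS = {
    "instructional_fit": 0.20,
    "data_protection": 0.30,
    "ai_governance": 0.20,
    "integration": 0.10,
    "accessibility": 0.10,
    "cost": 0.10,
}
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9

# Anchor scale: 5 = evidence + demonstrated, 3 = policy-only, 1 = missing.
scores = {
    "Vendor A": {"instructional_fit": 4, "data_protection": 5,
                 "ai_governance": 4, "integration": 3,
                 "accessibility": 4, "cost": 3},
    "Vendor B": {"instructional_fit": 5, "data_protection": 2,
                 "ai_governance": 2, "integration": 4,
                 "accessibility": 3, "cost": 5},
}

def weighted_total(vendor: str) -> float:
    return sum(WEIGHTS[c] * s for c, s in scores[vendor].items())

for v in scores:
    print(v, round(weighted_total(v), 2))
# Vendor A 4.1  -- strong data protection outweighs higher cost
# Vendor B 3.2  -- instructional appeal cannot offset weak data protection
```

Note how the weighting changes the outcome: the vendor with the best demo loses because privacy and security carry enough weight to matter, which is exactly what "leadership intent becomes measurable" means.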
Document exceptions explicitly. The most common failure mode in AI procurement is informal acceptance of risk during a pilot that quietly becomes permanent. Your decision record should include: identified risks, accepted risks, mitigations, owners, and due dates. Keep it readable—one or two pages—so it can be attached to an executive approval packet. When questioned later (“Why did we choose this tool?”), your answer should be in the record, not in someone’s inbox.
Procurement does not end at award; it ends at controlled production launch. Use implementation gates to prevent “contract signed” from turning into “data flowing” without safeguards. Define gates as objective checks with named approvers (IT security, privacy office, legal, instructional sponsor). A typical sequence is: Gate 0 (pilot approval), Gate 1 (security and privacy validation), Gate 2 (integration readiness), Gate 3 (production launch), Gate 4 (post-launch review).
Gate 0 should limit scope: smallest viable user group, least sensitive data, and feature flags that disable risky capabilities (e.g., file uploads, external browsing, student-to-AI chat) until proven safe. Gate 1 should verify the controls you evaluated: SSO works, roles are least-privilege, audit logging is enabled, retention settings match contract terms, and subprocessors align with disclosed lists. Gate 2 checks operational readiness: LMS/SIS integration mapping, data flow diagram sign-off, accessibility review, and support processes (ticket routing, escalation paths).
Gate 3 requires the “approval packet” artifacts: final contract/DPA, SLA summary (uptime, support hours, incident response), risk register with accepted exceptions, and training plan. Include a cutover and rollback plan—AI tools can change user behavior quickly, and you need a way to pause or disable features if issues arise. Common mistakes include skipping tenant configuration hardening, failing to test data deletion, and not verifying that “pilot settings” won’t reset during upgrades. Treat configuration as security: capture screenshots/config exports and store them with the implementation record.
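Gates work when each is an objective check with named approvers and nothing advances on a partial sign-off. A minimal sketch of that discipline, with hypothetical approver roles per gate:

```python
# Each gate lists required sign-offs; all must be recorded before advancing.
GATES = {
    0: {"name": "Pilot approval",
        "approvers": {"instructional_sponsor", "privacy"}},
    1: {"name": "Security/privacy validation",
        "approvers": {"it_security", "privacy"}},
    2: {"name": "Integration readiness",
        "approvers": {"it_security", "lms_admin"}},
    3: {"name": "Production launch",
        "approvers": {"legal", "it_security", "instructional_sponsor"}},
    4: {"name": "Post-launch review",
        "approvers": {"instructional_sponsor"}},
}

def can_advance(gate: int, signoffs: set[str]) -> bool:
    """Block the gate and name the missing approvers."""
    missing = GATES[gate]["approvers"] - signoffs
    if missing:
        print(f"Gate {gate} blocked; missing: {sorted(missing)}")
    return not missing

can_advance(1, {"it_security"})             # Gate 1 blocked; missing: ['privacy']
can_advance(1, {"it_security", "privacy"})  # True
```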
A strong contract cannot compensate for unclear human rules. AI procurement must launch with a policy rollout that matches the use case and audience. Create or update acceptable use guidance to cover: what data users may enter (no student IEP details, no disciplinary notes, no SSNs), what AI output may be used for (drafting, brainstorming, translation), and what requires human verification (grading decisions, placement recommendations, safety-critical communications). For staff, clarify that AI is assistive and does not replace professional judgment.
Academic integrity needs explicit, teachable boundaries. Define what constitutes permitted collaboration vs. prohibited outsourcing. Provide examples that map to classroom reality: “You may use AI to generate study questions from your notes, but you may not submit AI-generated answers as your own unless the assignment explicitly allows it.” Pair rules with disclosure expectations (when students must cite AI use) and with equitable access considerations.
Publish role-based guidance: students, educators, administrators, and IT support. For educators, include prompt hygiene, bias awareness, and how to spot unsafe outputs. For administrators, include how to read audit logs and how to handle suspected misuse. Make training practical: short modules and job aids embedded where people work (LMS banners, admin console notes). A common mistake is issuing a policy PDF without operational reinforcement. Tie policy to onboarding checklists, annual refreshers, and an escalation channel for questions.
AI vendor risk is not static. Models, features, and subprocessors change; your governance must assume change. Establish post-award routines: quarterly vendor check-ins, annual evidence refresh (new SOC 2/ISO artifacts), and change notifications review (new data processing, new regions, new AI features). Track contract commitments in a control register: retention, deletion SLAs, audit rights, incident notification timelines, and support response targets.
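A control register only catches drift if contract commitments are compared against observed evidence on a schedule. A minimal sketch, with hypothetical commitments and a staleness check:

```python
from datetime import date

# Contractual commitments vs. latest observed evidence (values hypothetical).
controls = [
    {"control": "prompt retention", "contract": "30 days",
     "observed": "30 days", "evidence_date": date(2024, 9, 1)},
    {"control": "breach notification", "contract": "72 hours",
     "observed": "72 hours", "evidence_date": date(2024, 9, 1)},
    {"control": "SOC 2 Type II refresh", "contract": "annual",
     "observed": "14 months old", "evidence_date": date(2023, 7, 1)},
]

def drift(today: date, max_age_days: int = 365) -> list[str]:
    """Flag mismatches and stale evidence for the next vendor review."""
    flags = []
    for c in controls:
        if c["observed"] != c["contract"]:
            flags.append(f'{c["control"]}: observed "{c["observed"]}" '
                         f'vs contract "{c["contract"]}"')
        if (today - c["evidence_date"]).days > max_age_days:
            flags.append(f'{c["control"]}: evidence older than '
                         f'{max_age_days} days')
    return flags

for f in drift(date(2024, 10, 1)):
    print(f)  # both flags land on the stale SOC 2 refresh
```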
Run incident drills before you need them. Tabletop exercises should cover: suspected data leakage through prompts, account compromise, harmful content generated for students, and vendor outage during testing season. Validate who contacts whom, what logs are needed, and what communications are required for families and regulators. Ensure your contract supports the drill outcomes: access to logs, cooperation language, and timelines that are realistic for a school environment.
Plan renewals as mini re-evaluations, not automatic extensions. Review usage metrics, learning outcomes, support ticket trends, and any policy violations. Re-check data flows: integrations often expand quietly over time. If exceptions were accepted during selection (e.g., missing feature for admin logging), confirm they were remediated by the agreed date; if not, renegotiate, impose restrictions, or prepare an exit plan. Good procurement execution ends with optionality: a documented offboarding process, data export format, deletion confirmation, and a clear path to switch vendors if risk or performance becomes unacceptable.
1. What is the primary purpose of making an RFI/RFP "AI-aware" in this chapter’s procurement workflow?
2. Why does the chapter recommend running demos and proof-of-value using privacy-safe test data?
3. What is the key benefit of using a weighted scoring matrix during vendor evaluation?
4. According to the chapter, when should exceptions to requirements be documented?
5. Which procurement outcome does the chapter say must be achieved simultaneously for AI-enabled EdTech?