Production AI Institute — vendor-neutral certification for AI practitioners
⚡ Industry Playbook

Energy & Critical Infrastructure AI Deployment Playbook

Deploying AI in power grids, pipelines, water treatment, and other critical infrastructure is categorically different from enterprise AI. Errors don't just produce wrong answers; they can trigger cascading grid failures, pipeline ruptures, or loss of control in safety-instrumented systems. This guide covers the regulatory landscape, OT-specific risk patterns, and how the Production Safety Framework (PSF) maps to critical infrastructure AI requirements.

⚠ Critical Infrastructure Context

AI systems in critical infrastructure face nation-state threat actors, 99.999% uptime requirements, and physical safety consequences. Every PSF domain is elevated to Critical or High in this vertical. Human oversight requirements are non-negotiable and encoded in grid codes, pipeline regulations, and national security frameworks worldwide.

Regulatory Landscape

Critical infrastructure AI sits at the intersection of cybersecurity regulation, safety engineering standards, and emerging AI-specific frameworks. Unlike finance or healthcare, where AI regulation is still developing, critical infrastructure is already covered by OT security standards (IEC 62443, NERC CIP) whose change management, supply chain, and monitoring requirements directly govern AI deployment.

| Framework | Scope | AI Relevance | Penalty |
| --- | --- | --- | --- |
| NERC CIP | North American bulk electric system | CIP-007 (systems security), CIP-010 (change management), CIP-013 (supply chain). AI systems touching BES Cyber Systems require change management review before deployment. | Up to $1M/day per violation |
| IEC 62443 | Industrial automation & control systems worldwide | Security Level (SL) requirements apply to AI components in OT networks. SL-2 minimum for most grid AI; SL-3/4 for safety instrumented systems. | Regulatory enforcement varies by jurisdiction |
| EU NIS2 Directive | Essential services in EU member states | Energy, water, transport, digital infrastructure operators must implement risk management for AI-assisted control systems. Incident reporting within 24h. | Up to €10M or 2% global turnover |
| NIST CSF 2.0 | US critical infrastructure (all sectors) | Govern, Identify, Protect, Detect, Respond, Recover functions must address AI-specific risks. AI RMF integration recommended for autonomous control systems. | No direct penalties; contractual/procurement implications |
| UK CNI Cyber Strategy | UK critical national infrastructure | NCSC guidance on AI in CNI covers adversarial ML, supply chain integrity, and AI-assisted threat detection. Sector-specific requirements for energy licensees. | Enforcement via sector regulators (Ofgem, etc.) |
| TSA Pipeline Directives | US natural gas and hazardous liquid pipelines | SD-02C requires cybersecurity implementation plans covering all OT systems, including AI-based anomaly detection and automated control systems. | Up to $13,000/day per violation |

AI Systems in Critical Infrastructure

AI deployment across the energy and infrastructure sector ranges from low-risk forecasting tools to safety-critical control system components. The table below maps deployment maturity, risk level, and applicable regulatory frameworks.

| AI System | Deployment Maturity | Risk Level | Key Regulations | PSF Domains |
| --- | --- | --- | --- | --- |
| Demand Forecasting | Wide | Medium | Grid code compliance, balancing market rules | D1, D2, D4 |
| Grid Stability / Frequency Response | Limited (testing) | Critical | NERC CIP, grid code, protective relay standards | D1, D2, D5, D6, D7 |
| Predictive Maintenance | Wide | Medium | NERC CIP-007 (systems security) | D1, D4, D8 |
| Anomaly Detection (SCADA) | Growing | High | NERC CIP-007, IEC 62443, NIS2 | D1, D4, D7 |
| Pipeline Leak Detection | Moderate | High | TSA Pipeline Directives, PHMSA | D1, D2, D4, D6 |
| Renewable Energy Dispatch | Wide | High | Grid code, balancing market rules, emissions regulations | D1, D2, D4, D6 |
| Substation Automation | Growing | Critical | IEC 61850, NERC CIP, protective relay standards | D1, D2, D5, D6, D7 |

OT-Specific AI Risk Patterns

Critical infrastructure AI faces risks that do not exist in enterprise environments. The following patterns represent the highest-priority failure modes identified in operational technology deployments.

OT/IT Network Convergence
Severity: Critical

AI systems that span IT (data collection, model training) and OT (real-time control) networks create lateral movement paths between zones that are meant to be isolated from each other. A compromised AI inference endpoint becomes a pivot point into the control network.

PSF: D7 (Security) + D1 (Input Governance)

Sensor Data Manipulation
Severity: Critical

AI systems in SCADA and DCS environments ingest sensor telemetry as input. Adversarial manipulation of sensor readings — spoofed temperature, pressure, flow, or voltage data — can cause AI systems to issue dangerous automated commands.

PSF: D1 (Input Governance) + D7 (Security)
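
One way to picture the input-integrity checks implied by this pattern is a plausibility gate that runs before any reading reaches the model. The sketch below is a minimal illustration, not a reference implementation: `SensorLimits`, `check_reading`, the `PRESSURE` tag, and every numeric limit are hypothetical, and real limits would come from the process design basis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorLimits:
    """Hypothetical per-tag limits used only for this illustration."""
    low: float               # lowest physically plausible reading
    high: float              # highest physically plausible reading
    max_step: float          # largest credible change between consecutive samples
    max_disagreement: float  # tolerated difference from any redundant sensor


def check_reading(value, previous, redundant, limits):
    """Return the reasons to reject a reading; an empty list means accept."""
    reasons = []
    # NaN fails this comparison too, so corrupted values are rejected here.
    if not (limits.low <= value <= limits.high):
        reasons.append("out_of_range")
    if previous is not None and abs(value - previous) > limits.max_step:
        reasons.append("implausible_step")
    if redundant and max(abs(value - r) for r in redundant) > limits.max_disagreement:
        reasons.append("redundant_sensor_mismatch")
    return reasons


# Illustrative pipeline-pressure tag in bar; all numbers are made up.
PRESSURE = SensorLimits(low=0.0, high=100.0, max_step=5.0, max_disagreement=2.0)

print(check_reading(62.0, 61.5, [61.8, 62.3], PRESSURE))  # accepted
print(check_reading(95.0, 61.5, [61.8, 62.3], PRESSURE))  # rejected
```

A reading that fails any check would be withheld from the model and flagged, rather than silently corrected.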

Model Latency in Real-Time Control
Severity: High

Grid-balancing and protective relay applications require deterministic, millisecond-scale response times. Cloud-hosted AI inference adds network latency and jitter that are incompatible with these real-time control requirements. Hybrid edge/cloud architectures are therefore required, but they create synchronisation risks between the edge and cloud components.

PSF: D5 (Deployment Safety) + D4 (Observability)
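
The latency constraint can be made concrete with a deadline check: an AI result is used only if it arrives inside the control-loop budget, and a deterministic fallback is used otherwise. This is a sketch under stated assumptions; `deadline_miss_rate`, `select_command`, the command strings, and the 4 ms budget are hypothetical and not taken from any grid code.

```python
def deadline_miss_rate(latencies_ms, deadline_ms):
    """Fraction of recorded inference latencies that exceeded the deadline."""
    if not latencies_ms:
        raise ValueError("no latency samples recorded")
    misses = sum(1 for t in latencies_ms if t > deadline_ms)
    return misses / len(latencies_ms)


def select_command(ai_command, elapsed_ms, deadline_ms, fallback_command):
    """Use the AI result only if it arrived inside the control-loop deadline."""
    if ai_command is None or elapsed_ms > deadline_ms:
        return fallback_command, "fallback"
    return ai_command, "ai"


# Illustrative numbers only: a 4 ms budget and four recorded latencies.
print(deadline_miss_rate([1.0, 2.0, 9.0, 3.0], deadline_ms=4.0))
print(select_command("raise_output", elapsed_ms=6.0, deadline_ms=4.0,
                     fallback_command="hold_last_safe_setpoint"))
```

Tracking the miss rate during parallel testing gives an early signal of whether an edge deployment can meet its timing budget before it is allowed near a control loop.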

AI-Induced Cascade Failure
Severity: Critical

Interconnected grid systems can amplify AI errors. A demand-forecasting error in one region can propagate incorrect dispatch signals across interconnected systems, triggering load imbalances that require manual intervention at scale.

PSF: D2 (Output Validation) + D6 (Human Oversight)

Vendor Model Updates in OT Environments
Severity: High

Traditional OT change management requires extensive testing before deploying software updates. AI vendors that push silent model updates violate OT change control processes and may introduce unvalidated behaviour into safety-critical systems.

PSF: D5 (Deployment Safety) + D8 (Vendor Resilience)
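
A simple guard against silent updates is to pin each approved model artifact by its cryptographic digest and refuse to load anything else. The sketch below assumes the change-control process records a SHA-256 digest for every approved artifact; `is_approved_model` and the placeholder byte strings are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a model artifact."""
    return hashlib.sha256(data).hexdigest()


def is_approved_model(artifact: bytes, approved_digests) -> bool:
    """True only if the artifact matches a digest recorded at change approval."""
    digest = sha256_hex(artifact)
    return any(hmac.compare_digest(digest, d) for d in approved_digests)


# Placeholder bytes stand in for real model files in this illustration.
approved = frozenset({sha256_hex(b"model-v1.4.2-weights")})

print(is_approved_model(b"model-v1.4.2-weights", approved))  # pinned version
print(is_approved_model(b"model-v1.4.3-weights", approved))  # unreviewed update
```

A digest check only proves the artifact is the one that was reviewed; it says nothing about whether that review was adequate.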

Legacy Protocol Incompatibility
Severity: Medium

Critical infrastructure runs Modbus, DNP3, IEC 61850, and proprietary SCADA protocols that predate modern AI. AI integration layers must translate between these protocols and AI inference APIs without introducing timing vulnerabilities or data integrity issues.

PSF: D3 (Data Protection) + D5 (Deployment Safety)

The OT/IT Convergence Problem

The most dangerous architectural mistake in critical infrastructure AI is connecting IT-side AI systems (data pipelines, model training, dashboards) to OT-side control systems (SCADA, DCS, PLC networks) without enforcing strict unidirectional data flow. AI creates pressure to bridge these networks for real-time inference — which is exactly what adversaries exploit.

Required Architecture Principles

  1. Unidirectional security gateways (data diodes) between OT and IT networks — data flows from OT to IT only for AI training and monitoring
  2. Edge inference for real-time AI control — models must run on OT-side edge hardware, not cloud endpoints
  3. Out-of-band model updates — model updates delivered via removable media or dedicated update channels, not over production OT networks
  4. Separate AI training and inference infrastructure — training on IT-side historian data; inference on OT-side edge devices
  5. No direct API calls from OT to cloud AI providers — eliminates the most common OT/IT convergence attack path
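
The principles above can be expressed as a small policy check over declared network flows, which is useful when reviewing an architecture diagram or firewall rule export. This is a minimal sketch: the `Flow` record, the zone labels `"OT"`, `"IT"`, and `"CLOUD"`, and the rule wording are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source_zone: str  # hypothetical zone labels: "OT", "IT", or "CLOUD"
    dest_zone: str
    purpose: str


def violations(flows):
    """Return (flow, reason) pairs for flows that break the principles above."""
    problems = []
    for flow in flows:
        if flow.dest_zone == "OT" and flow.source_zone != "OT":
            # Data may leave OT through the diode, but nothing may enter over the network.
            problems.append((flow, "network flow into OT breaks unidirectional rule"))
        elif flow.source_zone == "OT" and flow.dest_zone == "CLOUD":
            problems.append((flow, "direct OT-to-cloud call"))
    return problems


declared = [
    Flow("OT", "IT", "historian replication for training"),
    Flow("OT", "OT", "edge inference to controller"),
    Flow("IT", "OT", "model update over production network"),
    Flow("OT", "CLOUD", "inference API call"),
]

for flow, reason in violations(declared):
    print(f"{flow.purpose}: {reason}")
```

In this sketch a model update pushed from IT into OT is reported as a violation, which matches the out-of-band update principle: updates arrive by removable media or a dedicated channel, not as a network flow.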

Human Oversight in Safety-Critical AI

Grid codes and pipeline safety regulations in most jurisdictions require human authorisation for load shedding, protective relay tripping, and emergency shutdown commands. AI may assist or recommend, but it may not autonomously execute these actions. This is not an AI governance choice; it is a regulatory requirement enforced by reliability organisations and system operators (NERC, ENTSO-E, National Grid ESO) and by pipeline regulators.

Human Oversight Threshold Table

| Action | AI Role | Authorisation Required |
| --- | --- | --- |
| Load shedding (intentional outage) | Recommend + rank options | Control room operator |
| Protective relay tripping | Alert + flag anomaly | System operator (automatic relay remains independent) |
| Emergency shutdown (ESD) | Recommend + provide diagnostics | Senior operator or automatic safety system (not AI) |
| Renewable dispatch curtailment | Optimise schedule | Can be automated with validated constraints + override |
| Demand response signal | Forecast + recommend level | Can be automated within pre-approved parameters |
| Predictive maintenance scheduling | Prioritise work orders | Maintenance supervisor review for safety-critical assets |
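
The threshold table can be encoded as an authorisation gate that sits between an AI recommendation and execution. The following is a sketch only: the action keys, the `Authorisation` enum, and `may_execute` are hypothetical names, and the gate covers AI-originated recommendations, not the independent protective relays or automatic safety systems the table refers to.

```python
from enum import Enum


class Authorisation(Enum):
    HUMAN_REQUIRED = "human_required"
    AUTOMATED_WITHIN_LIMITS = "automated_within_limits"


# Hypothetical action keys mirroring rows of the threshold table.
POLICY = {
    "load_shedding": Authorisation.HUMAN_REQUIRED,
    "protective_relay_trip": Authorisation.HUMAN_REQUIRED,
    "emergency_shutdown": Authorisation.HUMAN_REQUIRED,
    "renewable_curtailment": Authorisation.AUTOMATED_WITHIN_LIMITS,
    "demand_response_signal": Authorisation.AUTOMATED_WITHIN_LIMITS,
}


def may_execute(action, approved_by=None, within_limits=False):
    """Decide whether an AI-recommended action may proceed to execution."""
    # Actions missing from the policy fail closed and need a human.
    rule = POLICY.get(action, Authorisation.HUMAN_REQUIRED)
    if rule is Authorisation.HUMAN_REQUIRED:
        return approved_by is not None
    # Automatable actions proceed inside pre-approved limits or with an override.
    return within_limits or approved_by is not None


print(may_execute("load_shedding"))                             # no operator sign-off
print(may_execute("load_shedding", approved_by="operator-17"))  # operator sign-off
print(may_execute("renewable_curtailment", within_limits=True))
```

Defaulting unknown actions to `HUMAN_REQUIRED` is a deliberate choice in this sketch, so that a newly added action type cannot become autonomous by omission.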

PSF Domain Mapping for Critical Infrastructure

All eight PSF domains are elevated in critical infrastructure. The ratings below reflect the specific risk environment of energy and utility AI deployments — not the general enterprise baseline.

| Domain | Name | Rating | Infrastructure Context |
| --- | --- | --- | --- |
| D1 | Input Governance | Critical | Sensor input validation is safety-critical. Spoofed SCADA telemetry is the primary AI attack vector in OT environments. All AI inputs from field devices require integrity verification. |
| D2 | Output Validation | Critical | AI outputs that trigger automated actuation (switching, valve control, load shedding) must pass output validation before execution. No AI command reaches physical infrastructure without a validation layer. |
| D3 | Data Protection | High | Grid topology, asset locations, and operational patterns are sensitive national security data. Exfiltration of AI training data or model parameters constitutes infrastructure intelligence gathering. |
| D4 | Observability | Critical | Real-time monitoring of AI behaviour in OT environments is required for incident response. Minimum: telemetry on all AI control outputs at intervals of 100ms or less. Automated rollback triggers on anomalous command sequences. |
| D5 | Deployment Safety | Critical | Canary deployment is insufficient for OT; parallel operation testing against simulation environments is required. Model version pinning is mandatory; NERC CIP-010 change management applies to all AI deployments touching BES. |
| D6 | Human Oversight | Critical | Human-in-the-loop is non-negotiable for load shedding, protective relay tripping, and emergency shutdown commands. AI may recommend; a human must authorise. Required by most grid codes. |
| D7 | Security | Critical | AI systems in OT environments face nation-state threat actors. Prompt injection, adversarial sensor manipulation, and supply chain compromise are documented attack vectors against critical infrastructure AI. |
| D8 | Vendor Resilience | High | Single-vendor AI dependency in critical infrastructure creates systemic risk. Multi-vendor AI strategy and documented fallback-to-manual procedures are required for NERC CIP compliance and business continuity. |
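
The D2 requirement for a validation layer between AI output and actuation can be sketched as a check against an operating envelope and a ramp limit. The names `SetpointLimits` and `validate_setpoint` and all numeric values below are hypothetical; real envelopes come from equipment ratings and operating procedures.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SetpointLimits:
    minimum: float   # lower bound of the approved operating envelope
    maximum: float   # upper bound of the approved operating envelope
    max_ramp: float  # largest change allowed in one control cycle


def validate_setpoint(proposed, current, limits):
    """Return (accepted, reason); rejected commands are never forwarded."""
    if math.isnan(proposed) or math.isinf(proposed):
        return False, "not_a_finite_number"
    if not (limits.minimum <= proposed <= limits.maximum):
        return False, "outside_operating_envelope"
    if abs(proposed - current) > limits.max_ramp:
        return False, "ramp_rate_exceeded"
    return True, "ok"


# Illustrative valve-position limits in percent open; numbers are made up.
VALVE = SetpointLimits(minimum=10.0, maximum=90.0, max_ramp=5.0)

print(validate_setpoint(52.0, 50.0, VALVE))
print(validate_setpoint(70.0, 50.0, VALVE))
```

This sketch rejects an out-of-envelope command outright instead of clamping it, on the assumption that an implausible AI output should be surfaced to an operator, not quietly adjusted.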

NERC CIP Compliance Checklist for AI Systems

For North American bulk electric system operators, AI systems that touch BES Cyber Systems must be integrated into the existing NERC CIP compliance program. The following checklist covers minimum requirements.

AI systems touching BES Cyber Systems documented in NERC CIP-010 configuration management plan
Change management review completed before any AI model update in OT environments
OT/IT network segmentation enforced — AI inference endpoints isolated from control plane
All sensor inputs to AI systems validated for integrity before processing (D1)
AI-generated control commands blocked from automated execution without validation layer (D2)
Real-time telemetry on all AI outputs at ≤100ms interval; alerts on anomalous command sequences (D4)
Human authorisation required for load shedding, relay tripping, and emergency shutdown commands (D6)
AI vendor supply chain assessed under NERC CIP-013; model provenance documented
Fallback-to-manual procedures tested at least quarterly; operators trained without AI assistance
Grid topology and SCADA data classified; AI training data handling reviewed under D3
Penetration testing of AI inference endpoints using OT-specific threat scenarios (D7)
Multi-vendor AI strategy or documented single-vendor risk acceptance with compensating controls (D8)
Incident response plan updated to include AI-specific failure modes and cascade failure scenarios
Regulatory notification procedures defined for AI-related incidents under NIS2 (24h) / sector rules
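
Two of the checklist items lend themselves to an automated check: the 100ms telemetry interval and alerts on anomalous command sequences (D4). The sketch below is illustrative only. `telemetry_gaps`, `count_reversals`, and `should_rollback` are hypothetical names, and counting direction reversals is just one assumed definition of an anomalous sequence.

```python
def telemetry_gaps(timestamps_ms, max_interval_ms=100):
    """Return (start, end) pairs where consecutive samples are too far apart."""
    return [
        (a, b)
        for a, b in zip(timestamps_ms, timestamps_ms[1:])
        if b - a > max_interval_ms
    ]


def count_reversals(setpoints):
    """Count how often the direction of change flips between consecutive steps."""
    deltas = [b - a for a, b in zip(setpoints, setpoints[1:])]
    return sum(1 for d1, d2 in zip(deltas, deltas[1:]) if d1 * d2 < 0)


def should_rollback(setpoints, window=6, max_reversals=3):
    """Flag oscillating command sequences within the most recent window."""
    return count_reversals(setpoints[-window:]) >= max_reversals


print(telemetry_gaps([0, 50, 140, 400, 450]))
print(should_rollback([10, 12, 9, 13, 8, 14]))    # oscillating commands
print(should_rollback([10, 11, 12, 13, 14, 15]))  # steady ramp
```

The window size and reversal threshold are placeholders; in practice they would be tuned against recorded operating data and reviewed under the same change management as the model itself.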