Requirements Document

Introduction

This control plane provides comprehensive governance, security, and operational oversight for AWS AI services, specifically Amazon Bedrock and Amazon SageMaker. It enables organizations to inventory AI assets, assess and score risks, validate guardrails, conduct red-team testing, govern model lifecycles, enforce approval gates, and generate audit-ready evidence. The platform is designed for Docker-first deployment, CI/CD readiness, and AWS-native integration, supporting sandbox-first deployment before production rollout.

Glossary

AWS AI Governance Control Plane: The platform that provides governance, security, and operational oversight for AWS AI services
Bedrock: Amazon Bedrock, a fully managed service that makes foundation models (FMs) from leading AI companies available via an API
Bedrock Agent: An AI-powered application built using Bedrock that can use tools, knowledge bases, and guardrails
Bedrock Guardrail: A set of policies that help ensure responsible AI usage by blocking or modifying harmful content
SageMaker Model: A machine learning model deployed in Amazon SageMaker
SageMaker Endpoint: A deployed model that can serve real-time predictions
Model Package Group: A collection of model versions in SageMaker
Model Package: A versioned model artifact in SageMaker with metadata
Model Card: A structured document that describes a model's purpose, limitations, and ethical considerations
Agent Alias: A pointer to a specific version of a Bedrock agent
Action Group: A set of functions that a Bedrock agent can invoke
Lambda Tool: An AWS Lambda function used as a tool by a Bedrock agent
Knowledge Base: A Bedrock feature that enables agents to retrieve information from external data sources
Risk Score: A numerical value representing the risk level of an AI asset based on governance gaps
Sandbox Account: A dedicated AWS account for testing and development before production deployment
CI/CD Pipeline: Continuous Integration and Continuous Deployment pipeline for automated testing and deployment

Requirements

Requirement 1: Bedrock AI Asset Discovery

User Story: As a platform operator, I want to discover all Bedrock assets across my AWS account, so that I can maintain a complete inventory and assess governance coverage.

Acceptance Criteria

WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock agents in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock agent aliases in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock action groups in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL discover all Lambda functions used as tools by Bedrock agents
WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock knowledge bases in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock guardrails in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL discover all model IDs used by Bedrock agents
IF a Bedrock resource cannot be discovered due to permissions or service errors, THEN THE Scanner SHALL log the error and continue scanning other resources
WHERE a Bedrock resource has no associated guardrail, THEN THE Scanner SHALL flag it as missing guardrail coverage

Requirement 2: SageMaker AI Asset Discovery

User Story: As a platform operator, I want to discover all SageMaker assets across my AWS account, so that I can maintain a complete inventory and assess model governance.

Acceptance Criteria

WHEN the inventory scanner runs, THE Scanner SHALL discover all SageMaker models in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL discover all SageMaker endpoints in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL discover all SageMaker model package groups in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL discover all SageMaker model packages in the target AWS account and Region
WHEN the inventory scanner runs, THE Scanner SHALL record the approval status for each model package
WHEN the inventory scanner runs, THE Scanner SHALL discover model card availability for each model package
WHEN the inventory scanner runs, THE Scanner SHALL discover monitoring configuration for each SageMaker endpoint
IF a SageMaker resource cannot be discovered due to permissions or service errors, THEN THE Scanner SHALL log the error and continue scanning other resources

Requirement 3: IAM Role Inventory

User Story: As a platform operator, I want to record IAM roles used by AI assets, so that I can assess permission risks and ensure least-privilege access.

Acceptance Criteria

WHEN the inventory scanner runs, THE Scanner SHALL record the IAM role ARN for each Bedrock agent
WHEN the inventory scanner runs, THE Scanner SHALL record the IAM role ARN for each Lambda function used as a Bedrock tool
WHEN the inventory scanner runs, THE Scanner SHALL record the IAM role ARN for each SageMaker endpoint
WHEN the inventory scanner runs, THE Scanner SHALL record the IAM role ARN for each SageMaker pipeline
WHERE an IAM role has broad permissions (e.g., * actions or resources), THEN THE Scanner SHALL flag it as high-risk
WHERE an IAM role lacks proper logging configuration, THEN THE Scanner SHALL flag it as missing logging coverage

Requirement 4: Asset Classification

User Story: As a platform operator, I want to classify AI assets by owner, environment, sensitivity, user exposure, and risk level, so that I can prioritize governance efforts.

Acceptance Criteria

WHEN an asset is discovered, THE Scanner SHALL record the owner tag or metadata
WHEN an asset is discovered, THE Scanner SHALL record the environment tag (e.g., sandbox, development, staging, production)
WHEN an asset is discovered, THE Scanner SHALL record the sensitivity classification (e.g., public, internal, confidential, restricted)
WHEN an asset is discovered, THE Scanner SHALL record the user exposure level (e.g., internal, partner, public)
WHEN an asset is discovered, THE Scanner SHALL calculate a risk score based on governance gaps
WHERE an asset has missing owner information, THEN THE Scanner SHALL assign a higher risk score
WHERE an asset is in production environment with missing guardrails, THEN THE Scanner SHALL assign the highest risk score

Requirement 5: Governance Control Validation

User Story: As a platform operator, I want to flag missing or weak governance controls, so that I can remediate security and compliance gaps.

Acceptance Criteria

WHERE a Bedrock agent has no guardrail attached, THEN THE Scanner SHALL flag it as missing guardrail coverage
WHERE a Bedrock agent has no owner tag, THEN THE Scanner SHALL flag it as missing owner information
WHERE a Bedrock agent's Lambda tools lack CloudWatch logging, THEN THE Scanner SHALL flag it as missing logging coverage
WHERE a SageMaker model has no model card, THEN THE Scanner SHALL flag it as missing model card
WHERE a SageMaker model package has no approval status, THEN THE Scanner SHALL flag it as unapproved
WHERE a SageMaker endpoint lacks monitoring configuration, THEN THE Scanner SHALL flag it as missing monitoring
WHERE an IAM role has broad permissions (e.g., * actions or resources), THEN THE Scanner SHALL flag it as high-risk
WHERE a production model lacks approval, THEN THE Scanner SHALL flag it as unapproved production deployment

Requirement 6: Data Persistence

User Story: As a platform operator, I want to persist inventory records and evidence snapshots, so that I can track changes over time and generate reports.

Acceptance Criteria

WHEN the inventory scanner completes, THE Scanner SHALL store normalized asset records in DynamoDB or PostgreSQL
WHEN the inventory scanner completes, THE Scanner SHALL store raw API response evidence in S3
WHEN the inventory scanner completes, THE Scanner SHALL store evidence snapshots with timestamps
WHERE evidence storage fails, THEN THE Scanner SHALL log the error and retry up to 3 times
WHERE normalized records storage fails, THEN THE Scanner SHALL log the error and retry up to 3 times

Requirement 7: REST API Endpoints

User Story: As a developer or operator, I want to access asset data via REST API endpoints, so that I can integrate with other systems and build custom dashboards.

Acceptance Criteria

THE API SHALL expose a GET /assets endpoint for listing all assets with pagination
THE API SHALL expose a GET /assets/{id} endpoint for retrieving asset details
THE API SHALL expose a GET /summary/risk endpoint for risk summary statistics
THE API SHALL expose a GET /export endpoint for exporting inventory data in JSON or CSV format
WHERE a requested resource does not exist, THEN THE API SHALL return HTTP 404
WHERE a request is malformed, THEN THE API SHALL return HTTP 400 with error details

Requirement 8: Dashboard Visualization

User Story: As a platform operator, I want a dashboard page with critical AI assets and governance gaps, so that I can quickly identify and address high-priority issues.

Acceptance Criteria

THE Dashboard SHALL display critical AI assets with highest risk scores
THE Dashboard SHALL display assets missing guardrail coverage
THE Dashboard SHALL display unapproved production models
THE Dashboard SHALL display high-risk agents (e.g., missing guardrail, broad permissions, no owner)
THE Dashboard SHALL display stale evaluations (e.g., models not evaluated in 90 days)
WHERE dashboard data is unavailable, THEN THE Dashboard SHALL display an error message and retry option

Requirement 9: Docker-First Deployment

User Story: As a DevOps engineer, I want to deploy the platform using Docker, so that I can ensure consistent environments across development, testing, and production.

Acceptance Criteria

THE Platform SHALL provide a Dockerfile for building the application container
THE Platform SHALL provide a docker-compose.yml for local development and testing
THE Platform SHALL support configuration via environment variables
WHERE a required environment variable is missing, THEN THE Application SHALL fail fast with a descriptive error message
THE Platform SHALL be designed for sandbox-first deployment before production rollout

Requirement 10: CI/CD Readiness

User Story: As a DevOps engineer, I want the platform to be CI/CD-ready, so that I can automate testing, building, and deployment.

Acceptance Criteria

THE Platform SHALL include unit tests for all core components
THE Platform SHALL include integration tests for AWS service interactions
THE Platform SHALL provide a build script for creating Docker images
THE Platform SHALL provide deployment scripts for AWS environments
WHERE tests fail, THEN THE CI/CD Pipeline SHALL fail and report the error

Requirement 11: AWS-Native Integration

User Story: As a platform operator, I want the platform to integrate natively with AWS services, so that I can leverage AWS security and management features.

Acceptance Criteria

THE Platform SHALL use AWS SDK for JavaScript/Python for all AWS service interactions
THE Platform SHALL use IAM roles for authentication to AWS services
THE Platform SHALL support AWS Organizations for multi-account scanning
WHERE AWS service calls fail due to permissions, THEN THE Platform SHALL return descriptive error messages
THE Platform SHALL encrypt data at rest using AWS KMS where applicable

Requirement 12: Sandbox-First Deployment

User Story: As a platform operator, I want to deploy the platform in a sandbox environment first, so that I can validate functionality before production rollout.

Acceptance Criteria

THE Platform SHALL support deployment to a dedicated sandbox AWS account
THE Platform SHALL provide deployment scripts for sandbox environments
WHERE production deployment is attempted without validation, THEN THE Platform SHALL warn the user
THE Platform SHALL provide a health check endpoint for sandbox validation

Requirement 13: Evidence Generation for Audits

User Story: As a compliance officer, I want audit-ready evidence generated by the platform, so that I can demonstrate governance and compliance.

Acceptance Criteria

WHEN the inventory scanner runs, THE Scanner SHALL generate an evidence report with scan metadata
THE Evidence Report SHALL include scan timestamp, AWS account ID, Region, and scanner version
THE Evidence Report SHALL include all discovered assets with their governance status
THE Evidence Report SHALL include risk scores and flagged governance gaps
WHERE evidence generation fails, THEN THE Scanner SHALL log the error and retry up to 3 times

Requirement 14: Prompt Red-Team Harness (Module 2)

User Story: As a security engineer, I want to run red-team tests on Bedrock prompts and agents, so that I can identify and mitigate security vulnerabilities.

Acceptance Criteria

WHEN the red-team harness runs, THE Harness SHALL execute predefined attack patterns against Bedrock prompts
WHEN the red-team harness runs, THE Harness SHALL test agent behavior with malicious inputs
WHEN the red-team harness runs, THE Harness SHALL record test results with timestamps and attack patterns
WHERE an attack pattern succeeds in bypassing guardrails, THEN THE Harness SHALL flag it as a security vulnerability
THE Harness SHALL provide a summary report with vulnerability severity and remediation recommendations

Requirement 15: SageMaker Model Governance Gate (Module 3)

User Story: As a ML engineer, I want to enforce approval gates for SageMaker model deployments, so that I can ensure model quality and compliance.

Acceptance Criteria

WHEN a model deployment is requested, THE Gate SHALL check if the model has a valid approval status
WHEN a model deployment is requested, THE Gate SHALL verify model card availability
WHEN a model deployment is requested, THE Gate SHALL check for recent model evaluations
WHERE a model lacks approval, THEN THE Gate SHALL block the deployment and return an error
WHERE a model lacks a model card, THEN THE Gate SHALL warn the user but allow deployment with override
WHERE a model lacks recent evaluations, THEN THE Gate SHALL warn the user but allow deployment with override

Requirement 16: Tailscale VPN Gateway

User Story: As a DevOps engineer, I want to deploy a Tailscale gateway VM to connect the control plane to my on-premises network, so that I can securely access internal resources without exposing them to the public internet.

Acceptance Criteria

THE Platform SHALL deploy an Ubuntu VM with 8 GB of memory in a public subnet
THE VM SHALL use the lowest-cost AMI available for Ubuntu
THE VM SHALL run Tailscale to establish a secure connection to the on-premises network
THE VM SHALL be named "tailscale"
WHERE the Tailscale connection fails, THEN THE System SHALL log the error and retry up to 3 times
WHERE the VM cannot be accessed via Tailscale, THEN THE System SHALL provide diagnostic logs for troubleshooting