Requirements Document
Introduction
This control plane provides governance, security, and operational oversight for AWS AI services, specifically Amazon Bedrock and Amazon SageMaker. It enables organizations to inventory AI assets, assess and score risks, validate guardrails, conduct red-team testing, govern model lifecycles, enforce approval gates, and generate audit-ready evidence. The platform is designed for Docker-first deployment, CI/CD readiness, and AWS-native integration, and it is deployed to a sandbox account before production rollout.
Glossary
- AWS AI Governance Control Plane: The platform that provides governance, security, and operational oversight for AWS AI services
- Bedrock: Amazon Bedrock, a fully managed service that makes foundation models (FMs) from leading AI companies available via an API
- Bedrock Agent: An AI-powered application built using Bedrock that can use tools, knowledge bases, and guardrails
- Bedrock Guardrail: A set of policies that help ensure responsible AI usage by blocking or modifying harmful content
- SageMaker Model: A machine learning model deployed in Amazon SageMaker
- SageMaker Endpoint: A deployed model that can serve real-time predictions
- Model Package Group: A collection of model versions in SageMaker
- Model Package: A versioned model artifact in SageMaker with metadata
- Model Card: A structured document that describes a model's purpose, limitations, and ethical considerations
- Agent Alias: A pointer to a specific version of a Bedrock agent
- Action Group: A set of functions that a Bedrock agent can invoke
- Lambda Tool: An AWS Lambda function used as a tool by a Bedrock agent
- Knowledge Base: A Bedrock feature that enables agents to retrieve information from external data sources
- Risk Score: A numerical value representing the risk level of an AI asset based on governance gaps
- Sandbox Account: A dedicated AWS account for testing and development before production deployment
- CI/CD Pipeline: Continuous Integration and Continuous Deployment pipeline for automated testing and deployment
Requirements
Requirement 1: Bedrock AI Asset Discovery
User Story: As a platform operator, I want to discover all Bedrock assets across my AWS account, so that I can maintain a complete inventory and assess governance coverage.
Acceptance Criteria
- WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock agents in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock agent aliases in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock action groups in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL discover all Lambda functions used as tools by Bedrock agents
- WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock knowledge bases in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL discover all Bedrock guardrails in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL discover all model IDs used by Bedrock agents
- IF a Bedrock resource cannot be discovered due to permissions or service errors, THEN THE Scanner SHALL log the error and continue scanning other resources
- WHERE a Bedrock resource has no associated guardrail, THEN THE Scanner SHALL flag it as missing guardrail coverage
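The log-and-continue behavior in the criteria above could be sketched as follows. This is a minimal illustration, not the scanner's actual design: `run_discovery` and the per-resource "lister" callables are hypothetical names, and a real scanner would wrap AWS SDK calls inside each lister.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("scanner")


def run_discovery(listers: Dict[str, Callable[[], List[dict]]]) -> Dict[str, dict]:
    """Run every lister; log failures and keep scanning the remaining types.

    `listers` maps a resource type name to a zero-argument callable that
    returns the discovered resources (hypothetical interface).
    """
    assets: Dict[str, List[dict]] = {}
    errors: Dict[str, str] = {}
    for resource_type, lister in listers.items():
        try:
            assets[resource_type] = lister()
        except Exception as exc:  # e.g. permission or service errors
            logger.error("Discovery failed for %s: %s", resource_type, exc)
            errors[resource_type] = str(exc)
    return {"assets": assets, "errors": errors}
```

Returning the errors alongside the assets lets a later step record which resource types were not fully scanned.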
Requirement 2: SageMaker AI Asset Discovery
User Story: As a platform operator, I want to discover all SageMaker assets across my AWS account, so that I can maintain a complete inventory and assess model governance.
Acceptance Criteria
- WHEN the inventory scanner runs, THE Scanner SHALL discover all SageMaker models in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL discover all SageMaker endpoints in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL discover all SageMaker model package groups in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL discover all SageMaker model packages in the target AWS account and Region
- WHEN the inventory scanner runs, THE Scanner SHALL record the approval status for each model package
- WHEN the inventory scanner runs, THE Scanner SHALL discover model card availability for each model package
- WHEN the inventory scanner runs, THE Scanner SHALL discover monitoring configuration for each SageMaker endpoint
- IF a SageMaker resource cannot be discovered due to permissions or service errors, THEN THE Scanner SHALL log the error and continue scanning other resources
Requirement 3: IAM Role Inventory
User Story: As a platform operator, I want to record IAM roles used by AI assets, so that I can assess permission risks and ensure least-privilege access.
Acceptance Criteria
- WHEN the inventory scanner runs, THE Scanner SHALL record the IAM role ARN for each Bedrock agent
- WHEN the inventory scanner runs, THE Scanner SHALL record the IAM role ARN for each Lambda function used as a Bedrock tool
- WHEN the inventory scanner runs, THE Scanner SHALL record the IAM role ARN for each SageMaker endpoint
- WHEN the inventory scanner runs, THE Scanner SHALL record the IAM role ARN for each SageMaker pipeline
- WHERE an IAM role has broad permissions (e.g., `*` actions or resources), THEN THE Scanner SHALL flag it as high-risk
- WHERE an IAM role lacks proper logging configuration, THEN THE Scanner SHALL flag it as missing logging coverage
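One way the broad-permissions check might look, assuming the scanner already has the role's policy document as a parsed JSON dictionary. The function name `has_broad_permissions` is hypothetical, and this sketch only detects a bare `*`; service-level wildcards such as `s3:*` would need an additional rule.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def has_broad_permissions(policy_document: Dict[str, Any]) -> bool:
    """Return True if any Allow statement uses a bare "*" action or resource."""
    for statement in _as_list(policy_document.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue  # Deny statements do not grant access
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        if "*" in actions or "*" in resources:
            return True
    return False
```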
Requirement 4: Asset Classification
User Story: As a platform operator, I want to classify AI assets by owner, environment, sensitivity, user exposure, and risk level, so that I can prioritize governance efforts.
Acceptance Criteria
- WHEN an asset is discovered, THE Scanner SHALL record the owner tag or metadata
- WHEN an asset is discovered, THE Scanner SHALL record the environment tag (e.g., `sandbox`, `development`, `staging`, `production`)
- WHEN an asset is discovered, THE Scanner SHALL record the sensitivity classification (e.g., `public`, `internal`, `confidential`, `restricted`)
- WHEN an asset is discovered, THE Scanner SHALL record the user exposure level (e.g., `internal`, `partner`, `public`)
- WHEN an asset is discovered, THE Scanner SHALL calculate a risk score based on governance gaps
- WHERE an asset has missing owner information, THEN THE Scanner SHALL assign a higher risk score
- WHERE an asset is in the production environment with missing guardrails, THEN THE Scanner SHALL assign the highest risk score
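The scoring rules above might be sketched like this. The 0 to 100 scale, the individual weights, and the `Asset` fields are assumptions for illustration only; the requirements fix the ordering (missing owner raises the score, production without guardrails is highest) but not the numbers.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SCORE = 100  # assumed scale; the requirements do not define one


@dataclass
class Asset:
    owner: Optional[str]
    environment: str
    has_guardrail: bool


def risk_score(asset: Asset) -> int:
    """Score an asset from its governance gaps (weights are illustrative)."""
    if asset.environment == "production" and not asset.has_guardrail:
        return MAX_SCORE  # production without guardrails gets the highest score
    score = 0
    if not asset.owner:
        score += 30  # missing owner information raises the score
    if not asset.has_guardrail:
        score += 40
    return min(score, MAX_SCORE)
```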
Requirement 5: Governance Control Validation
User Story: As a platform operator, I want to flag missing or weak governance controls, so that I can remediate security and compliance gaps.
Acceptance Criteria
- WHERE a Bedrock agent has no guardrail attached, THEN THE Scanner SHALL flag it as missing guardrail coverage
- WHERE a Bedrock agent has no owner tag, THEN THE Scanner SHALL flag it as missing owner information
- WHERE a Bedrock agent's Lambda tools lack CloudWatch logging, THEN THE Scanner SHALL flag it as missing logging coverage
- WHERE a SageMaker model has no model card, THEN THE Scanner SHALL flag it as missing model card
- WHERE a SageMaker model package has no approval status, THEN THE Scanner SHALL flag it as unapproved
- WHERE a SageMaker endpoint lacks monitoring configuration, THEN THE Scanner SHALL flag it as missing monitoring
- WHERE an IAM role has broad permissions (e.g., `*` actions or resources), THEN THE Scanner SHALL flag it as high-risk
- WHERE a production model lacks approval, THEN THE Scanner SHALL flag it as unapproved production deployment
Requirement 6: Data Persistence
User Story: As a platform operator, I want to persist inventory records and evidence snapshots, so that I can track changes over time and generate reports.
Acceptance Criteria
- WHEN the inventory scanner completes, THE Scanner SHALL store normalized asset records in DynamoDB or PostgreSQL
- WHEN the inventory scanner completes, THE Scanner SHALL store raw API response evidence in S3
- WHEN the inventory scanner completes, THE Scanner SHALL store evidence snapshots with timestamps
- WHERE evidence storage fails, THEN THE Scanner SHALL log the error and retry up to 3 times
- WHERE normalized records storage fails, THEN THE Scanner SHALL log the error and retry up to 3 times
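A small helper could cover the "log the error and retry up to 3 times" criteria for both stores. This sketch reads the requirement as one initial attempt plus up to three retries, and it adds exponential backoff between attempts; both choices are assumptions, as is the name `with_retries`.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("scanner.storage")


def with_retries(operation: Callable[[], T], max_retries: int = 3,
                 base_delay: float = 0.5) -> T:
    """Call `operation`; on failure, log the error and retry up to `max_retries` times."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception as exc:
            logger.error("Attempt %d failed: %s", attempt + 1, exc)
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            time.sleep(base_delay * (2 ** attempt))  # assumed backoff policy
    raise ValueError("max_retries must be >= 0")
```

A storage call would then be wrapped as `with_retries(lambda: store(record))`, where `store` stands in for whatever write function the scanner uses.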
Requirement 7: REST API Endpoints
User Story: As a developer or operator, I want to access asset data via REST API endpoints, so that I can integrate with other systems and build custom dashboards.
Acceptance Criteria
- THE API SHALL expose a `GET /assets` endpoint for listing all assets with pagination
- THE API SHALL expose a `GET /assets/{id}` endpoint for retrieving asset details
- THE API SHALL expose a `GET /summary/risk` endpoint for risk summary statistics
- THE API SHALL expose a `GET /export` endpoint for exporting inventory data in JSON or CSV format
- WHERE a requested resource does not exist, THEN THE API SHALL return HTTP 404
- WHERE a request is malformed, THEN THE API SHALL return HTTP 400 with error details
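The pagination and HTTP 400 behavior for the asset listing could be sketched framework-free as below. The `limit` and `offset` query parameters, the `max_limit` cap, and the response field names are all assumptions; the requirements only state that listing is paginated and that malformed requests return 400 with error details.

```python
from typing import Any, Dict, List, Tuple


def list_assets_page(assets: List[Dict[str, Any]], limit: str = "50",
                     offset: str = "0", max_limit: int = 100) -> Tuple[int, Dict[str, Any]]:
    """Return (HTTP status, JSON body) for one page of assets.

    `limit` and `offset` arrive as raw query-string values, so they are
    validated here before use.
    """
    try:
        limit_n, offset_n = int(limit), int(offset)
    except ValueError:
        return 400, {"error": "limit and offset must be integers"}
    if not 1 <= limit_n <= max_limit or offset_n < 0:
        return 400, {"error": f"limit must be 1..{max_limit} and offset must be >= 0"}
    page = assets[offset_n:offset_n + limit_n]
    end = offset_n + limit_n
    next_offset = end if end < len(assets) else None  # None marks the last page
    return 200, {"items": page, "total": len(assets), "next_offset": next_offset}
```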
Requirement 8: Dashboard Visualization
User Story: As a platform operator, I want a dashboard page with critical AI assets and governance gaps, so that I can quickly identify and address high-priority issues.
Acceptance Criteria
- THE Dashboard SHALL display critical AI assets with highest risk scores
- THE Dashboard SHALL display assets missing guardrail coverage
- THE Dashboard SHALL display unapproved production models
- THE Dashboard SHALL display high-risk agents (e.g., missing guardrail, broad permissions, no owner)
- THE Dashboard SHALL display stale evaluations (e.g., models not evaluated in 90 days)
- WHERE dashboard data is unavailable, THEN THE Dashboard SHALL display an error message and retry option
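The stale-evaluation check could be as simple as the following sketch. The 90-day threshold comes from the example in the criteria above; treating a never-evaluated model as stale and the name `is_stale` are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STALE_AFTER = timedelta(days=90)  # threshold taken from the example above


def is_stale(last_evaluated: Optional[datetime],
             now: Optional[datetime] = None) -> bool:
    """True if the model was never evaluated or was last evaluated over 90 days ago.

    Both datetimes are expected to be timezone-aware.
    """
    if last_evaluated is None:
        return True  # assumption: no evaluation on record counts as stale
    now = now or datetime.now(timezone.utc)
    return now - last_evaluated > STALE_AFTER
```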
Requirement 9: Docker-First Deployment
User Story: As a DevOps engineer, I want to deploy the platform using Docker, so that I can ensure consistent environments across development, testing, and production.
Acceptance Criteria
- THE Platform SHALL provide a Dockerfile for building the application container
- THE Platform SHALL provide a docker-compose.yml for local development and testing
- THE Platform SHALL support configuration via environment variables
- WHERE a required environment variable is missing, THEN THE Application SHALL fail fast with a descriptive error message
- THE Platform SHALL be designed for sandbox-first deployment before production rollout
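The fail-fast configuration criterion above might be implemented at startup roughly as follows. `load_config`, `MissingConfigError`, and the variable name `EVIDENCE_BUCKET` in the usage note are hypothetical; the requirements do not list the actual variables.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


class MissingConfigError(RuntimeError):
    """Raised at startup when required environment variables are absent."""


def load_config(required: Iterable[str],
                environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read required variables, failing fast with all missing names listed."""
    env = os.environ if environ is None else environ
    names = list(required)
    missing = [name for name in names if not env.get(name)]  # unset or empty
    if missing:
        raise MissingConfigError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in names}
```

Calling `load_config(["AWS_REGION", "EVIDENCE_BUCKET"])` before any other initialization reports every missing variable in one message, rather than failing on the first one.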
Requirement 10: CI/CD Readiness
User Story: As a DevOps engineer, I want the platform to be CI/CD-ready, so that I can automate testing, building, and deployment.
Acceptance Criteria
- THE Platform SHALL include unit tests for all core components
- THE Platform SHALL include integration tests for AWS service interactions
- THE Platform SHALL provide a build script for creating Docker images
- THE Platform SHALL provide deployment scripts for AWS environments
- WHERE tests fail, THEN THE CI/CD Pipeline SHALL fail and report the error
Requirement 11: AWS-Native Integration
User Story: As a platform operator, I want the platform to integrate natively with AWS services, so that I can leverage AWS security and management features.
Acceptance Criteria
- THE Platform SHALL use the AWS SDK for JavaScript or Python for all AWS service interactions
- THE Platform SHALL use IAM roles for authentication to AWS services
- THE Platform SHALL support AWS Organizations for multi-account scanning
- WHERE AWS service calls fail due to permissions, THEN THE Platform SHALL return descriptive error messages
- THE Platform SHALL encrypt data at rest using AWS KMS where applicable
Requirement 12: Sandbox-First Deployment
User Story: As a platform operator, I want to deploy the platform in a sandbox environment first, so that I can validate functionality before production rollout.
Acceptance Criteria
- THE Platform SHALL support deployment to a dedicated sandbox AWS account
- THE Platform SHALL provide deployment scripts for sandbox environments
- WHERE production deployment is attempted without validation, THEN THE Platform SHALL warn the user
- THE Platform SHALL provide a health check endpoint for sandbox validation
Requirement 13: Evidence Generation for Audits
User Story: As a compliance officer, I want audit-ready evidence generated by the platform, so that I can demonstrate governance and compliance.
Acceptance Criteria
- WHEN the inventory scanner runs, THE Scanner SHALL generate an evidence report with scan metadata
- THE Evidence Report SHALL include scan timestamp, AWS account ID, Region, and scanner version
- THE Evidence Report SHALL include all discovered assets with their governance status
- THE Evidence Report SHALL include risk scores and flagged governance gaps
- WHERE evidence generation fails, THEN THE Scanner SHALL log the error and retry up to 3 times
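A possible shape for the evidence report, shown as a sketch. The required contents (timestamp, account ID, Region, scanner version, assets with governance status, risk scores, flagged gaps) come from the criteria above; the JSON field names and the `build_evidence_report` function are assumptions.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_evidence_report(account_id: str, region: str, scanner_version: str,
                          assets: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble a JSON-serializable evidence report for one scan.

    Each asset dict is expected to carry its own `governance_status`,
    `risk_score`, and `gaps` entries (hypothetical field names).
    """
    return {
        "scan_timestamp": datetime.now(timezone.utc).isoformat(),
        "aws_account_id": account_id,
        "region": region,
        "scanner_version": scanner_version,
        "asset_count": len(assets),
        "assets": assets,
    }


def to_json(report: Dict[str, Any]) -> str:
    """Serialize with sorted keys so successive snapshots are easy to diff."""
    return json.dumps(report, sort_keys=True, indent=2)
```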
Requirement 14: Prompt Red-Team Harness (Module 2)
User Story: As a security engineer, I want to run red-team tests on Bedrock prompts and agents, so that I can identify and mitigate security vulnerabilities.
Acceptance Criteria
- WHEN the red-team harness runs, THE Harness SHALL execute predefined attack patterns against Bedrock prompts
- WHEN the red-team harness runs, THE Harness SHALL test agent behavior with malicious inputs
- WHEN the red-team harness runs, THE Harness SHALL record test results with timestamps and attack patterns
- WHERE an attack pattern succeeds in bypassing guardrails, THEN THE Harness SHALL flag it as a security vulnerability
- THE Harness SHALL provide a summary report with vulnerability severity and remediation recommendations
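The core loop of such a harness could look like the sketch below. The `invoke` and `is_blocked` callables are hypothetical stand-ins: the first would send a prompt to the Bedrock prompt or agent under test, and the second would decide whether the response shows that a guardrail intervened. How to detect a bypass reliably is left open here.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List


def run_red_team(patterns: Dict[str, str],
                 invoke: Callable[[str], str],
                 is_blocked: Callable[[str], bool]) -> List[dict]:
    """Run each attack pattern once and record a timestamped result.

    `patterns` maps a pattern name to the adversarial prompt text.
    """
    results = []
    for name, prompt in patterns.items():
        response = invoke(prompt)
        results.append({
            "pattern": name,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            # An unblocked response to an attack is flagged as a vulnerability.
            "vulnerable": not is_blocked(response),
        })
    return results
```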
Requirement 15: SageMaker Model Governance Gate (Module 3)
User Story: As an ML engineer, I want to enforce approval gates for SageMaker model deployments, so that I can ensure model quality and compliance.
Acceptance Criteria
- WHEN a model deployment is requested, THE Gate SHALL check if the model has a valid approval status
- WHEN a model deployment is requested, THE Gate SHALL verify model card availability
- WHEN a model deployment is requested, THE Gate SHALL check for recent model evaluations
- WHERE a model lacks approval, THEN THE Gate SHALL block the deployment and return an error
- WHERE a model lacks a model card, THEN THE Gate SHALL warn the user but allow deployment with override
- WHERE a model lacks recent evaluations, THEN THE Gate SHALL warn the user but allow deployment with override
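The gate logic above can be sketched as a pure decision function. It assumes the approval status uses SageMaker's `ModelApprovalStatus` values (`Approved`, `Rejected`, `PendingManualApproval`) and reads "allow deployment with override" as: warnings block the deployment unless the caller passes an explicit override. `GateDecision` and `evaluate_gate` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GateDecision:
    allowed: bool
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


def evaluate_gate(approval_status: str, has_model_card: bool,
                  has_recent_evaluation: bool, override: bool = False) -> GateDecision:
    """Block unapproved models; warn on soft gaps, which an override can bypass."""
    decision = GateDecision(allowed=True)
    if approval_status != "Approved":
        decision.errors.append("model package is not approved")
    if not has_model_card:
        decision.warnings.append("model card is missing")
    if not has_recent_evaluation:
        decision.warnings.append("no recent model evaluation")
    if decision.errors:
        decision.allowed = False  # a missing approval can never be overridden
    elif decision.warnings and not override:
        decision.allowed = False  # soft gaps need an explicit override
    return decision
```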
Requirement 16: Tailscale VPN Gateway
User Story: As a DevOps engineer, I want to deploy a Tailscale gateway VM to connect the control plane to my on-premises network, so that I can securely access internal resources without exposing them to the public internet.
Acceptance Criteria
- THE Platform SHALL deploy an Ubuntu VM with 8 GB of memory in a public subnet
- THE VM SHALL use the lowest-cost AMI available for Ubuntu
- THE VM SHALL run Tailscale to establish a secure connection to the on-premises network
- THE VM SHALL be named "tailscale"
- WHERE the Tailscale connection fails, THEN THE System SHALL log the error and retry up to 3 times
- WHERE the VM cannot be accessed via Tailscale, THEN THE System SHALL provide diagnostic logs for troubleshooting