Database Maintenance Guide
Overview
ActingWeb stores temporary data (tokens, sessions, auth codes) in the database’s attribute storage. This data has defined lifetimes and should be automatically cleaned up to prevent unbounded database growth.
This guide explains how to configure your deployment for proper data lifecycle management with both DynamoDB and PostgreSQL backends.
Note
ActingWeb is designed for serverless and containerized deployments where many instances may scale concurrently. Never add cleanup logic to your serving path as this impacts cold start time and request latency.
Backend Selection
Choose the appropriate section based on your database backend:
DynamoDB: Use DynamoDB’s built-in TTL feature (zero-overhead, automatic)
PostgreSQL: Use pg_cron or scheduled cleanup scripts
DynamoDB TTL Configuration
ActingWeb stores a ttl_timestamp field on temporary data. You must enable
DynamoDB TTL on your attributes table for automatic cleanup.
Why TTL?
Zero runtime overhead in your Lambda functions
DynamoDB handles deletion automatically in the background
No impact on cold start time or request latency
Works reliably at any scale
Enabling TTL
Using AWS CLI:
aws dynamodb update-time-to-live \
--table-name {your_prefix}_attributes \
--time-to-live-specification "Enabled=true, AttributeName=ttl_timestamp"
Using Terraform:
resource "aws_dynamodb_table" "actingweb_attributes" {
name = "${var.prefix}_attributes"
billing_mode = "PAY_PER_REQUEST"
hash_key = "id"
range_key = "bucket_name"
attribute {
name = "id"
type = "S"
}
attribute {
name = "bucket_name"
type = "S"
}
ttl {
attribute_name = "ttl_timestamp"
enabled = true
}
}
Using CloudFormation:
ActingWebAttributesTable:
Type: AWS::DynamoDB::Table
Properties:
TableName: !Sub "${Prefix}_attributes"
AttributeDefinitions:
- AttributeName: id
AttributeType: S
- AttributeName: bucket_name
AttributeType: S
KeySchema:
- AttributeName: id
KeyType: HASH
- AttributeName: bucket_name
KeyType: RANGE
BillingMode: PAY_PER_REQUEST
TimeToLiveSpecification:
AttributeName: ttl_timestamp
Enabled: true
Verification
Verify TTL is enabled on your table:
aws dynamodb describe-time-to-live --table-name {your_prefix}_attributes
Expected output:
{
"TimeToLiveDescription": {
"TimeToLiveStatus": "ENABLED",
"AttributeName": "ttl_timestamp"
}
}
Scheduled Cleanup Lambda
While DynamoDB TTL handles most cleanup automatically, orphaned index entries may remain when the primary data is deleted. Deploy a scheduled cleanup Lambda to handle these cases.
Key Requirements
Deploy as a separate Lambda function from your serving Lambdas
Trigger via EventBridge/CloudWatch Events on a schedule (e.g., daily)
Do NOT call cleanup methods from your request handling code
Set appropriate timeout (5 minutes recommended)
Cleanup Handler Example
"""
cleanup_handler.py - Database maintenance Lambda
Deploy as a separate Lambda from your serving function.
Triggered by EventBridge on a schedule (daily recommended).
"""
import logging
from actingweb.config import Config
from actingweb.oauth_session import OAuth2SessionManager
from actingweb.oauth2_server.token_manager import ActingWebTokenManager
logger = logging.getLogger(__name__)
def handler(event, context):
"""
Clean up expired tokens and orphaned index entries.
Returns:
dict: Cleanup results
"""
# Initialize with your application's config
config = Config(
database="dynamodb",
# ... your config settings ...
)
results = {}
# Clean up OAuth sessions
session_mgr = OAuth2SessionManager(config)
results["oauth_sessions"] = session_mgr.clear_expired_sessions()
# Clean up SPA tokens
results["spa_tokens"] = session_mgr.cleanup_expired_tokens()
# Clean up MCP tokens and indexes
token_mgr = ActingWebTokenManager(config)
results["mcp_tokens"] = token_mgr.cleanup_expired_tokens()
logger.info(f"Cleanup complete: {results}")
return {
"statusCode": 200,
"body": results
}
Serverless Framework Deployment
# serverless.yml
functions:
cleanup:
handler: cleanup_handler.handler
timeout: 300 # 5 minutes
memorySize: 256
events:
- schedule:
rate: cron(0 3 * * ? *) # Daily at 03:00 UTC
enabled: true
environment:
AWS_DB_PREFIX: ${self:custom.dbPrefix}
AWS SAM Template
# template.yaml
CleanupFunction:
Type: AWS::Serverless::Function
Properties:
Handler: cleanup_handler.handler
Runtime: python3.11
Timeout: 300
MemorySize: 256
Events:
DailyCleanup:
Type: Schedule
Properties:
Schedule: cron(0 3 * * ? *)
Enabled: true
Required IAM Permissions
The cleanup Lambda needs these DynamoDB permissions:
# IAM policy for cleanup Lambda
- Effect: Allow
Action:
- dynamodb:Query
- dynamodb:GetItem
- dynamodb:DeleteItem
Resource:
- !GetAtt ActingWebAttributesTable.Arn
Recommended Schedule: Daily at low-traffic time (e.g., 03:00 UTC)
Monitoring
Set up CloudWatch alarms to detect issues:
Table Size Monitoring
Alert if the attributes table item count grows beyond expected threshold (indicates TTL may not be working):
# CloudWatch alarm for table size
TableSizeAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: ActingWeb-AttributeTableSize
MetricName: ItemCount
Namespace: AWS/DynamoDB
Statistic: Average
Period: 86400 # 24 hours
EvaluationPeriods: 1
Threshold: 100000 # Adjust based on your expected volume
ComparisonOperator: GreaterThanThreshold
Dimensions:
- Name: TableName
Value: !Ref ActingWebAttributesTable
Cleanup Lambda Monitoring
Alert if the cleanup Lambda fails:
CleanupErrorAlarm:
Type: AWS::CloudWatch::Alarm
Properties:
AlarmName: ActingWeb-CleanupLambdaErrors
MetricName: Errors
Namespace: AWS/Lambda
Statistic: Sum
Period: 3600
EvaluationPeriods: 1
Threshold: 1
ComparisonOperator: GreaterThanOrEqualToThreshold
Dimensions:
- Name: FunctionName
Value: !Ref CleanupFunction
TTL Deletion Monitoring
Monitor DynamoDB’s TimeToLiveDeletedItemCount metric to verify TTL is
actively cleaning up data:
aws cloudwatch get-metric-statistics \
--namespace AWS/DynamoDB \
--metric-name TimeToLiveDeletedItemCount \
--dimensions Name=TableName,Value={your_prefix}_attributes \
--start-time $(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
--period 86400 \
--statistics Sum
Troubleshooting with CloudWatch Logs Insights
Query cleanup Lambda results:
fields @timestamp, @message
| filter @message like /Cleanup complete/
| sort @timestamp desc
| limit 20
Find cleanup errors:
fields @timestamp, @message
| filter @message like /Error/ or @message like /error/
| filter @logStream like /cleanup/
| sort @timestamp desc
| limit 50
Monitor token creation rate (if logging enabled):
fields @timestamp, @message
| filter @message like /Stored access token/ or @message like /Created refresh token/
| stats count() as token_count by bin(1h)
PostgreSQL Cleanup Configuration
PostgreSQL doesn’t have automatic TTL deletion like DynamoDB, but ActingWeb stores
a ttl_timestamp field that you can use for scheduled cleanup. There are three
approaches: pg_cron extension, external cron job, or scheduled Lambda/Cloud Function.
Option 1: pg_cron Extension (Recommended)
pg_cron is a PostgreSQL extension that runs scheduled jobs inside the database.
Installation:
-- Enable pg_cron extension (requires superuser)
CREATE EXTENSION pg_cron;
Schedule cleanup job:
-- Run cleanup daily at 03:00 UTC
SELECT cron.schedule(
'actingweb-ttl-cleanup',
'0 3 * * *',
$$
DELETE FROM attributes
WHERE ttl_timestamp IS NOT NULL
AND ttl_timestamp < EXTRACT(EPOCH FROM NOW())::BIGINT
$$
);
Verify job is scheduled:
SELECT * FROM cron.job WHERE jobname = 'actingweb-ttl-cleanup';
View job run history:
SELECT *
FROM cron.job_run_details
WHERE jobid = (SELECT jobid FROM cron.job WHERE jobname = 'actingweb-ttl-cleanup')
ORDER BY start_time DESC
LIMIT 10;
Notes:
pg_cron is available on AWS RDS PostgreSQL 12.5+, Google Cloud SQL, and Azure Database
Runs inside the database process (zero external infrastructure)
Automatic retries on failure
Job history tracking built-in
Option 2: External Cron Job
Run a scheduled script from cron/systemd timer:
cleanup_postgres.sh:
#!/bin/bash
# cleanup_postgres.sh - Scheduled PostgreSQL cleanup
PGPASSWORD="$PG_DB_PASSWORD" psql \
-h "$PG_DB_HOST" \
-p "$PG_DB_PORT" \
-U "$PG_DB_USER" \
-d "$PG_DB_NAME" \
-c "DELETE FROM attributes WHERE ttl_timestamp IS NOT NULL AND ttl_timestamp < EXTRACT(EPOCH FROM NOW())::BIGINT;"
echo "Cleanup completed at $(date)"
Crontab entry:
# Run daily at 03:00
0 3 * * * /path/to/cleanup_postgres.sh >> /var/log/actingweb-cleanup.log 2>&1
Systemd timer (alternative):
# /etc/systemd/system/actingweb-cleanup.timer
[Unit]
Description=ActingWeb PostgreSQL Cleanup Timer
[Timer]
OnCalendar=daily
OnCalendar=03:00
Persistent=true
[Install]
WantedBy=timers.target
# /etc/systemd/system/actingweb-cleanup.service
[Unit]
Description=ActingWeb PostgreSQL Cleanup
[Service]
Type=oneshot
ExecStart=/path/to/cleanup_postgres.sh
User=actingweb
Environment="PG_DB_HOST=localhost"
Environment="PG_DB_PASSWORD=secretpassword"
Enable: systemctl enable --now actingweb-cleanup.timer
Option 3: Cloud Function/Lambda
Deploy a serverless function for PostgreSQL cleanup:
AWS Lambda Example:
"""
cleanup_postgres_handler.py - PostgreSQL maintenance Lambda
"""
import os
import logging
from psycopg import connect
logger = logging.getLogger(__name__)
def handler(event, context):
"""Clean up expired PostgreSQL attributes."""
conn = connect(
host=os.environ["PG_DB_HOST"],
port=int(os.environ["PG_DB_PORT"]),
dbname=os.environ["PG_DB_NAME"],
user=os.environ["PG_DB_USER"],
password=os.environ["PG_DB_PASSWORD"],
)
try:
with conn.cursor() as cur:
cur.execute("""
DELETE FROM attributes
WHERE ttl_timestamp IS NOT NULL
AND ttl_timestamp < EXTRACT(EPOCH FROM NOW())::BIGINT
""")
deleted_count = cur.rowcount
conn.commit()
logger.info(f"Deleted {deleted_count} expired attributes")
return {
"statusCode": 200,
"body": {"deleted": deleted_count}
}
finally:
conn.close()
Serverless Framework deployment:
# serverless.yml
functions:
postgresCleanup:
handler: cleanup_postgres_handler.handler
timeout: 300
memorySize: 256
events:
- schedule:
rate: cron(0 3 * * ? *) # Daily at 03:00 UTC
environment:
PG_DB_HOST: ${env:PG_DB_HOST}
PG_DB_PORT: 5432
PG_DB_NAME: actingweb
PG_DB_USER: actingweb
PG_DB_PASSWORD: ${env:PG_DB_PASSWORD}
PostgreSQL Monitoring
Check for expired records awaiting cleanup:
SELECT COUNT(*)
FROM attributes
WHERE ttl_timestamp IS NOT NULL
AND ttl_timestamp < EXTRACT(EPOCH FROM NOW())::BIGINT;
Monitor table size:
SELECT
pg_size_pretty(pg_total_relation_size('attributes')) AS total_size,
(SELECT COUNT(*) FROM attributes) AS row_count,
(SELECT COUNT(*) FROM attributes WHERE ttl_timestamp IS NOT NULL) AS ttl_rows
FROM pg_class
WHERE relname = 'attributes';
Create monitoring view:
CREATE VIEW attribute_cleanup_status AS
SELECT
COUNT(*) FILTER (WHERE ttl_timestamp IS NULL) AS permanent_rows,
COUNT(*) FILTER (WHERE ttl_timestamp IS NOT NULL AND ttl_timestamp >= EXTRACT(EPOCH FROM NOW())::BIGINT) AS active_ttl_rows,
COUNT(*) FILTER (WHERE ttl_timestamp IS NOT NULL AND ttl_timestamp < EXTRACT(EPOCH FROM NOW())::BIGINT) AS expired_rows,
pg_size_pretty(pg_total_relation_size('attributes')) AS table_size
FROM attributes;
Query the view:
SELECT * FROM attribute_cleanup_status;
Cleanup Script with Application Context
For cleanup that requires application logic (OAuth sessions, MCP tokens):
"""
cleanup_postgres_app.py - Application-aware PostgreSQL cleanup
Run via cron or cloud scheduler.
"""
import logging
from actingweb.config import Config
from actingweb.oauth_session import OAuth2SessionManager
from actingweb.oauth2_server.token_manager import ActingWebTokenManager
logger = logging.getLogger(__name__)
def main():
# Initialize with PostgreSQL backend
config = Config(
database="postgresql",
# ... your config settings ...
)
results = {}
# Clean up OAuth sessions
session_mgr = OAuth2SessionManager(config)
results["oauth_sessions"] = session_mgr.clear_expired_sessions()
# Clean up SPA tokens
results["spa_tokens"] = session_mgr.cleanup_expired_tokens()
# Clean up MCP tokens
token_mgr = ActingWebTokenManager(config)
results["mcp_tokens"] = token_mgr.cleanup_expired_tokens()
logger.info(f"PostgreSQL cleanup complete: {results}")
return results
if __name__ == "__main__":
main()
Data Lifecycle Reference
The following table shows TTL values for different data types:
Data Type |
TTL |
Notes |
|---|---|---|
OAuth sessions |
10 minutes |
Postponed actor creation flow |
SPA access tokens |
1 hour |
Web app authentication |
SPA refresh tokens |
2 weeks |
Web app token refresh |
MCP auth codes |
10 minutes |
OAuth2 authorization flow |
MCP access tokens |
1 hour |
MCP client authentication |
MCP refresh tokens |
30 days |
MCP client token refresh |
Index entries |
+2 hours |
Buffer over data TTL |
These values are defined in actingweb/constants.py:
# OAuth session TTL (for postponed actor creation)
OAUTH_SESSION_TTL = 600 # 10 minutes
# SPA token TTLs
SPA_ACCESS_TOKEN_TTL = 3600 # 1 hour
SPA_REFRESH_TOKEN_TTL = 86400 * 14 # 2 weeks
# MCP token TTLs
MCP_AUTH_CODE_TTL = 600 # 10 minutes
MCP_ACCESS_TOKEN_TTL = 3600 # 1 hour
MCP_REFRESH_TOKEN_TTL = 2592000 # 30 days
# Index buffer
INDEX_TTL_BUFFER = 7200 # 2 hours
Rollout Guide
When deploying TTL support to an existing application:
Update Library: Upgrade to ActingWeb version with TTL support
Deploy Code First: Deploy your application code - TTL timestamps are stored but DynamoDB doesn’t act on them yet - This is a safe, non-breaking change
Enable DynamoDB TTL: Apply the TTL configuration - Run the AWS CLI command or apply Terraform/CloudFormation - DynamoDB begins background cleanup
Deploy Cleanup Lambda: Deploy the scheduled cleanup Lambda - Handles orphaned index entries - Run immediately after deployment to clean existing data
Enable Monitoring: Set up CloudWatch alarms - Table size monitoring - Cleanup Lambda error alerts - TTL deletion metrics
See Also
Attributes and Buckets (Global) - Attribute storage system details
Deployment - General deployment guide
ActingWeb Authentication System - Authentication configuration