Database Backends Reference

ActingWeb supports two production-ready database backends: DynamoDB and PostgreSQL. This document provides a detailed comparison to help you choose the right backend for your use case.

Quick Comparison

Feature	DynamoDB	PostgreSQL
Setup Complexity	Low (auto-creates tables)	Medium (requires Alembic migrations)
Local Development	DynamoDB Local (Docker)	PostgreSQL (Docker or native install)
Installation	`pip install 'actingweb[dynamodb]'`	`pip install 'actingweb[postgresql]'`
Scaling	Automatic, serverless	Manual (vertical/horizontal scaling)
Cost Model	Pay-per-request or provisioned throughput	Instance-based (compute + storage)
Query Flexibility	Key-based with Global Secondary Indexes	Full SQL with JOINs, aggregations, CTEs
Latency (local)	Higher (network overhead to AWS)	Lower (direct database connection)
Latency (same region)	Very low (<10ms)	Very low (<5ms)
Production Management	Fully managed by AWS	Self-managed or RDS/Cloud SQL
Multi-region	Built-in global tables	Manual replication (streaming/logical)
Transactions	Limited (single table, conditions)	Full ACID transactions
Backup/Restore	Point-in-time recovery built-in	pg_dump or continuous archiving
TTL/Cleanup	Built-in automatic TTL deletion (enable TTL on `attributes.ttl_timestamp`)	Self-contained opportunistic purge (no cron required)
Connection Pooling	Not applicable (HTTP API)	Built-in (psycopg3 ConnectionPool)

When to Use DynamoDB

Best For:

AWS-centric deployments: Already using AWS Lambda, API Gateway, etc.
Global distribution: Need multi-region active-active setup with minimal effort
Variable traffic: Traffic patterns with large spikes or long idle periods
Serverless architecture: Zero server management, automatic scaling
Key-value workloads: Simple lookups by actor ID, property name, trust relationship
Pay-per-use economics: Want to pay only for actual usage, not idle capacity

Advantages:

Zero administration: No servers to manage, no capacity planning
Automatic scaling: Handles traffic spikes without intervention
Global tables: Multi-region replication with conflict resolution
Built-in TTL: Automatic cleanup of expired data
On-demand pricing: Pay only for requests (or use provisioned capacity)
High availability: 99.99% SLA within region, 99.999% for global tables

Considerations:

Cost at scale: Can become expensive with sustained high traffic (>100k requests/day)
AWS lock-in: Difficult to migrate away from AWS ecosystem
Index management: Global Secondary Indexes have cost and design implications

Example Use Cases:

Serverless web application on AWS Lambda
Mobile backend with global user base
IoT platform with burst traffic patterns
Microservices with variable load

When to Use PostgreSQL

Best For:

Predictable traffic: Steady load or known capacity requirements
Cost optimization: High-traffic apps where instance-based pricing is cheaper
Multi-cloud/on-prem: Want to avoid cloud vendor lock-in
Existing PostgreSQL expertise: Team already knows PostgreSQL
Transaction requirements: Need complex multi-table transactions

Advantages:

Lower latency: Direct connection pool reduces overhead
Cost efficiency: Fixed monthly cost regardless of request volume
Rich ecosystem: Extensive tools, extensions (PostGIS, full-text search, etc.)
ACID transactions: Full transactional guarantees across operations
No vendor lock-in: Works on any cloud or on-premises
Advanced features: Triggers, stored procedures, custom functions

Considerations:

Manual scaling: Must provision capacity and handle scaling yourself
Database management: Need to manage backups, updates, monitoring
Connection limits: Connection pooling required for high concurrency
Migration setup: Must run Alembic migrations before first use
TTL cleanup: Handled by the library’s self-contained opportunistic purge (see Expired Token Cleanup (TTL)); no pg_cron required for ActingWeb’s own token tables

Example Use Cases:

Traditional web application with steady traffic
Reporting/analytics requirements
Multi-tenant SaaS application
Enterprise deployment with on-premises requirements

Performance Characteristics

Actor Operations

Operation	DynamoDB	PostgreSQL	Notes
Actor create	5-15ms	2-5ms	PostgreSQL faster for local/same-region
Actor read	5-10ms	1-3ms	Both very fast for key-based lookup
Actor update	5-10ms	2-4ms	PostgreSQL slightly faster
Property write	5-10ms	1-3ms	PostgreSQL benefits from connection pooling
Property read	5-10ms	1-2ms	Both efficient for primary key lookup
Trust list	10-20ms	2-5ms	PostgreSQL better for complex queries

Scaling Characteristics

DynamoDB:

Horizontal scaling: Automatic partition management
Read capacity: Auto-scales to millions of requests/second
Write capacity: Auto-scales to millions of requests/second
Latency: Consistent regardless of table size
Limits: 400KB item size, 25 operations per transaction

PostgreSQL:

Vertical scaling: Larger instances for more CPU/RAM
Horizontal scaling: Read replicas, sharding (manual setup)
Connection pooling: Essential for Lambda/serverless (use external pool like RDS Proxy)
Latency: Can degrade with very large tables without proper indexing
Limits: 1GB per row, no practical transaction limit

Cost Analysis

Example: 1 million requests/day

DynamoDB Costs (us-east-1):

On-Demand Pricing:
- 1M writes/day × $1.25/million = $1.25/day = $37.50/month
- 3M reads/day × $0.25/million = $0.75/day = $22.50/month
- Storage: 10GB × $0.25/GB = $2.50/month
Total: ~$63/month

Provisioned Capacity (if steady load):
- 12 WCU × 730 hours × $0.00065 = $5.69/month
- 35 RCU × 730 hours × $0.00013 = $3.32/month
- Storage: 10GB × $0.25/GB = $2.50/month
Total: ~$12/month (62% savings with provisioned)

PostgreSQL Costs (AWS RDS):

db.t4g.small (2 vCPU, 2GB RAM):
- Instance: $0.017/hour × 730 hours = $12.41/month
- Storage: 20GB SSD × $0.115/GB = $2.30/month
- Backup storage: 20GB × $0.095/GB = $1.90/month
Total: ~$17/month

db.t4g.medium (2 vCPU, 4GB RAM):
- Instance: $0.034/hour × 730 hours = $24.82/month
- Storage: 50GB SSD × $0.115/GB = $5.75/month
- Backup storage: 50GB × $0.095/GB = $4.75/month
Total: ~$35/month

Analysis: For this traffic level, PostgreSQL is more cost-effective. DynamoDB becomes competitive only with: - Highly variable traffic (benefit from on-demand) - Global replication requirements - Zero-ops preference (worth the premium)

Expired Token Cleanup (TTL)

ActingWeb’s SPA/mobile session tokens (the /oauth/spa/token flow) are stored in the attributes table with a ttl_timestamp. Access tokens are short-lived (1 hour); refresh tokens rotate, and a used refresh token’s TTL is shortened to a bounded reuse-detection window (SPA_REFRESH_TOKEN_REUSE_WINDOW, 2 days) so consumed tokens do not accumulate. Expired rows still need to be deleted from storage. How that happens depends on the backend:

PostgreSQL — automatic, no configuration required. The library purges expired token rows itself. The /oauth/spa/token endpoint opportunistically calls OAuth2SessionManager.maybe_purge_expired_tokens(), which runs at most once per SPA_TOKEN_PURGE_INTERVAL (1 hour) per process and issues a single set-based DELETE backed by the idx_attributes_ttl partial index (cost scales with the number of expired rows, not table size). No pg_cron or scheduled job is needed for ActingWeb’s own token tables. Applications may still call purge_expired_tokens() directly from a scheduled job if they want a deterministic cadence.

DynamoDB — enable native TTL on the table (one-time setup). DynamoDB deletes expired items automatically, but TTL must be enabled on the attributes table, pointed at the ttl_timestamp attribute. The library does not (and cannot reliably) enable this for you, and a full-table Scan to purge is deliberately avoided, so without this step expired tokens are never removed.

Important

Enabling TTL on the shared attributes table is safe even though that table also holds permanent data (internal store, cached values, application attributes). DynamoDB TTL deletes an item only when its ttl_timestamp attribute is present and in the past. ActingWeb writes ttl_timestamp only for temporary data that is meant to expire — SPA access/refresh tokens, OAuth login sessions, mobile exchange tickets, state nonces, and index entries. Every permanent write goes through set_attr with no ttl, so those items have no ttl_timestamp and are never eligible for TTL deletion. (If your own code calls set_attr(..., ttl_seconds=...) it is likewise opting that item into expiry — don’t pass a TTL for data you want to keep.)

Enable it once per environment:

aws dynamodb update-time-to-live \
    --table-name <PREFIX>_attributes \
    --time-to-live-specification "Enabled=true, AttributeName=ttl_timestamp"

The table name is <AWS_DB_PREFIX>_attributes (default prefix demo_actingweb). TTL deletion is asynchronous (AWS removes items within ~48h of expiry), which is fine for token cleanup. The ttl_timestamp is written with a clock-skew buffer so items are never deleted before they are logically expired.

Migration Between Backends

ActingWeb provides full migration tooling to switch backends without data loss.

DynamoDB → PostgreSQL

See PostgreSQL Migration Guide for complete migration guide.

Summary:

Install PostgreSQL and run Alembic migrations
Export DynamoDB data: python scripts/migrate_dynamodb_to_postgresql.py export
Import to PostgreSQL: python scripts/migrate_dynamodb_to_postgresql.py import
Validate: python scripts/migrate_dynamodb_to_postgresql.py validate
Update configuration: DATABASE_BACKEND=postgresql
Deploy and verify

Downtime: Can be zero with blue-green deployment strategy.

PostgreSQL → DynamoDB

Reverse migration is possible but requires custom export logic:

Export PostgreSQL to JSON (use pg_dump or custom script)
Transform data to DynamoDB format (handle composite keys)
Import using boto3 batch writes
Enable DynamoDB TTL on attributes table
Update configuration: DATABASE_BACKEND=dynamodb

Implementation Details

Table Schema Mapping

ActingWeb uses the same logical schema for both backends:

Table	Primary Key	DynamoDB Implementation	PostgreSQL Implementation
actors	`id`	Hash key: `id`	Primary key: `id`
properties	`(id, name)`	Hash key: `id`, Range: `name`	Composite key: `(id, name)`
attributes	`(id, bucket_name)`	Hash key: `id`, Range: `bucket_name`	Composite key: `(id, bucket_name)`
trusts	`(id, peerid)`	Hash key: `id`, Range: `peerid`	Composite key: `(id, peerid)`
peertrustees	`(id, peerid)`	Hash key: `id`, Range: `peerid`	Composite key: `(id, peerid)`
subscriptions	`(id, peer_sub_id)`	Hash key: `id`, Range: `peer_sub_id`	Composite key: `(id, peer_sub_id)`
subscription_diffs	`(id, subid_seqnr)`	Hash key: `id`, Range: `subid_seqnr`	Composite key: `(id, subid_seqnr)`

Index Strategy

DynamoDB:

Global Secondary Indexes (GSI) for non-key lookups: - actors: GSI on creator field - properties: GSI on value field (reverse lookups) - trusts: GSI on secret field (token lookups)

PostgreSQL:

B-tree indexes on frequently queried columns: - CREATE INDEX idx_actors_creator ON actors(creator) - CREATE INDEX idx_properties_value ON properties(value) - CREATE INDEX idx_trusts_secret ON trusts(secret) - CREATE INDEX idx_attributes_ttl ON attributes(ttl_timestamp) (for cleanup)

Connection Management

DynamoDB:

Stateless HTTP API (no connections to manage)
PynamoDB handles retries and exponential backoff
Works well with AWS Lambda (cold starts not affected)

PostgreSQL:

Connection pooling via psycopg3 ConnectionPool
Default pool: min=2, max=10 connections
Configure via environment variables: - PG_POOL_MIN_SIZE - PG_POOL_MAX_SIZE - PG_POOL_TIMEOUT
For AWS Lambda: Consider RDS Proxy for connection management

Data Types

JSON Storage:

DynamoDB: Native map/list types
PostgreSQL: JSONB column type (efficient binary storage)

Timestamps:

Both: Unix epoch as BIGINT for consistency
PostgreSQL: Could use native TIMESTAMP but uses BIGINT for compatibility

Composite Keys:

DynamoDB: Separate hash and range key fields
PostgreSQL: Concatenated with : separator (e.g., peerid:subid)

Recommendations by Scenario

Serverless/Lambda Deployment

Recommend: DynamoDB

Zero cold start impact (HTTP API)
Automatic scaling with Lambda concurrency
No connection management needed
Pay-per-use aligns with Lambda pricing model

Traditional Server Deployment

Recommend: PostgreSQL

Better cost efficiency for steady load
Simpler connection management with long-lived processes
Lower latency with persistent connections
Easier debugging with SQL query logs

Multi-Region Deployment

Recommend: DynamoDB Global Tables

Built-in multi-region replication
Automatic conflict resolution
<100ms cross-region latency
PostgreSQL multi-region requires manual setup (streaming replication, logical replication)

High-Traffic Application (>1M requests/day)

Recommend: PostgreSQL (if predictable traffic)

Fixed monthly cost
Better cost efficiency at scale
Lower per-request latency
Connection pooling handles high concurrency

Recommend: DynamoDB (if variable/spiky traffic)

Automatic scaling for burst traffic
No capacity planning needed
Pay only for actual usage

Development/Testing

Either backend works well:

DynamoDB Local: docker run -p 8000:8000 amazon/dynamodb-local
PostgreSQL: docker run -p 5432:5432 postgres:16-alpine
Both support rapid iteration and testing

Database Backends Reference

Quick Comparison

When to Use DynamoDB

When to Use PostgreSQL

Performance Characteristics

Actor Operations

Scaling Characteristics

Cost Analysis

Example: 1 million requests/day

Expired Token Cleanup (TTL)

Migration Between Backends

DynamoDB → PostgreSQL

PostgreSQL → DynamoDB

Implementation Details

Table Schema Mapping

Index Strategy

Connection Management

Data Types

Recommendations by Scenario

Serverless/Lambda Deployment

Traditional Server Deployment

Multi-Region Deployment

High-Traffic Application (>1M requests/day)

Development/Testing

See Also