Attributes and Buckets (Global)

Overview

The attribute/bucket system provides flexible JSON storage for application-level or cross-actor data. It supports configurations, registries, indexes, and any data that doesn’t fit the property model.

Key Characteristics:

  • JSON-serializable data stored per actor per bucket

  • Each attribute has a name, data payload, and optional timestamp

  • Global storage via system actors (_actingweb_system, _actingweb_oauth2)

  • Efficient querying by bucket with composite range keys

Basic Usage

from actingweb import attribute

# Per-actor bucket
prefs = attribute.Attributes(actor_id=actor.id, bucket="user_preferences", config=config)
prefs.set_attr(name="theme", data="dark")
theme_attr = prefs.get_attr(name="theme")
# Returns: {"data": "dark", "timestamp": <datetime>}

# Delete single attribute
prefs.delete_attr(name="theme")

# Delete entire bucket
prefs.delete_bucket()
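
As the example shows, get_attr() returns a wrapper dict rather than the bare value, and the other examples in this document treat a falsy result as "not found". A small unwrapping helper can keep call sites short. The sketch below is illustrative only: attr_value is a hypothetical name, not part of ActingWeb, and it runs on plain dicts shaped like the return value shown above.

```python
from typing import Any, Mapping, Optional


def attr_value(attr: Optional[Mapping[str, Any]], default: Any = None) -> Any:
    """Return attr["data"], or default when the attribute is missing."""
    if not attr or "data" not in attr:
        return default
    return attr["data"]


# Plain dicts stand in for what get_attr() would return.
theme = attr_value({"data": "dark", "timestamp": None}, default="light")
missing = attr_value(None, default="light")
stored_false = attr_value({"data": False}, default=True)
print(theme, missing, stored_false)
```

Checking for the "data" key rather than the truthiness of the value matters for flags such as maintenance_mode, where a stored False must not be replaced by the default.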

Global Storage

# Global settings using system actor
from actingweb.constants import ACTINGWEB_SYSTEM_ACTOR

global_config = attribute.Attributes(
    actor_id=ACTINGWEB_SYSTEM_ACTOR,
    bucket="app_settings",
    config=config
)
global_config.set_attr(name="maintenance_mode", data=False)

Atomic Operations (v3.8.2+)

For concurrent access scenarios, use conditional_update_attr() to perform atomic compare-and-swap operations. This is essential for race-free updates when multiple requests might modify the same attribute simultaneously.

import time

from actingweb import attribute
# Assumed to be defined next to ACTINGWEB_SYSTEM_ACTOR
from actingweb.constants import OAUTH2_SYSTEM_ACTOR

# Example: Atomic token rotation (the function name is illustrative)
def mark_refresh_token_used(refresh_token, config):
    """Return True only for the single request that marks the token as used."""
    tokens = attribute.Attributes(
        actor_id=OAUTH2_SYSTEM_ACTOR,
        bucket="spa_refresh_tokens",
        config=config
    )

    # Get current token data
    token_attr = tokens.get_attr(name=refresh_token)
    if not token_attr:
        return False

    old_data = token_attr["data"]

    # Only update if token hasn't been marked as used
    if old_data.get("used"):
        return False  # Already used by another request

    # Prepare new data with used flag
    new_data = old_data.copy()
    new_data["used"] = True
    new_data["used_at"] = int(time.time())

    # Atomic update - only succeeds if current data still matches old_data
    success = tokens.conditional_update_attr(
        name=refresh_token,
        old_data=old_data,
        new_data=new_data
    )

    if success:
        # This request won the race - token is now marked as used
        return True
    else:
        # Another request modified the token first
        return False

How It Works:

  • PostgreSQL: Uses UPDATE ... WHERE data = old_data for atomic compare-and-swap

  • DynamoDB: Uses conditional update expressions with condition=(Attribute.data == old_data)

  • Returns True only if the current database value exactly matches old_data

  • Returns False if value was modified by another request (no update performed)

Use Cases:

  • OAuth refresh token rotation (prevents concurrent reuse)

  • Distributed counters and rate limiting

  • Session management with concurrent requests

  • Any scenario requiring optimistic locking
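
The compare-and-swap behaviour and the counter use case can be sketched without a database. The block below is a minimal model under stated assumptions: InMemoryBucket and increment_counter are hypothetical names, the lock stands in for the atomicity that PostgreSQL or DynamoDB provides, and only the method names and keyword arguments mirror the calls shown earlier in this section.

```python
import threading
from typing import Any, Dict, Optional


class InMemoryBucket:
    """Hypothetical stand-in for one attribute bucket; not ActingWeb code."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}
        self._lock = threading.Lock()  # stands in for the database's atomicity

    def set_attr(self, name: str, data: Any) -> None:
        with self._lock:
            self._items[name] = data

    def get_attr(self, name: str) -> Optional[Dict[str, Any]]:
        with self._lock:
            if name not in self._items:
                return None
            return {"data": self._items[name]}

    def conditional_update_attr(self, name: str, old_data: Any, new_data: Any) -> bool:
        """Write new_data only if the stored value still equals old_data."""
        with self._lock:
            if name not in self._items or self._items[name] != old_data:
                return False  # changed by someone else; nothing is written
            self._items[name] = new_data
            return True


def increment_counter(bucket: InMemoryBucket, name: str, max_attempts: int = 100) -> int:
    """Add 1 to an integer attribute with a read / compare-and-swap / retry loop."""
    for _ in range(max_attempts):
        attr = bucket.get_attr(name=name)
        if not attr:
            raise KeyError(name)
        current = attr["data"]
        if bucket.conditional_update_attr(name=name, old_data=current, new_data=current + 1):
            return current + 1
        # Lost the race: another writer got in first, so re-read and try again.
    raise RuntimeError(f"gave up on {name!r} after {max_attempts} attempts")


bucket = InMemoryBucket()
bucket.set_attr("hits", 0)

# A stale snapshot is rejected once the stored value has moved on.
first = bucket.conditional_update_attr("hits", old_data=0, new_data=1)
second = bucket.conditional_update_attr("hits", old_data=0, new_data=1)


def worker() -> None:
    for _ in range(25):
        increment_counter(bucket, "hits")


threads = [threading.Thread(target=worker) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

final_count = bucket.get_attr("hits")["data"]
print(first, second, final_count)  # True False 101
```

No increment is lost because every failed swap triggers a fresh read. Against a real backend, a short randomized delay between attempts would probably be worth adding so that competing writers do not retry in lockstep.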

Complete Bucket Reference

The following tables document all buckets used by ActingWeb:

System-Level Buckets (Global)

===================  =================  ======================================================  ================================
Bucket Name          System Actor       Purpose                                                 TTL
===================  =================  ======================================================  ================================
trust_types          _actingweb_system  Global registry of trust relationship type definitions  Permanent
oauth_sessions       _actingweb_oauth2  Temporary OAuth2 sessions for postponed actor creation  10 minutes
spa_access_tokens    _actingweb_oauth2  SPA (Single Page App) access token storage              1 hour
spa_refresh_tokens   _actingweb_oauth2  SPA refresh token storage                               2 weeks
auth_code_index      _actingweb_oauth2  Global index mapping auth codes to actor IDs            10 minutes (tied to auth code)
access_token_index   _actingweb_oauth2  Global index mapping access tokens to actor IDs         Tied to token lifetime
refresh_token_index  _actingweb_oauth2  Global index mapping refresh tokens to actor IDs        Tied to token lifetime
client_index         _actingweb_oauth2  Global index mapping MCP client IDs to actor IDs        Permanent (until client deleted)
===================  =================  ======================================================  ================================

Per-Actor Buckets

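======================  ===========================================================  ====================  ==========================
Bucket Name             Purpose                                                      TTL                   Cleanup Trigger
======================  ===========================================================  ====================  ==========================
_internal               Internal actor metadata (email, trustee_root, oauth tokens)  Permanent             Actor deletion
trust_permissions       Per-trust permission overrides for peer relationships        Permanent             Trust deletion
mcp_clients             MCP client credentials and registration data                 Permanent             Client deletion
mcp_tokens              MCP access tokens issued to clients                          1 hour                Token revocation or expiry
mcp_refresh_tokens      MCP refresh tokens for token renewal                         30 days               Token revocation or expiry
mcp_auth_codes          Temporary authorization codes during OAuth2 flow             10 minutes            Code exchange or expiry
mcp_google_tokens       Stored Google OAuth2 tokens for MCP authentication           Tied to access token  Access token deletion
oauth_tokens:{peer_id}  OAuth2 tokens per trust relationship                         Permanent             Trust deletion
======================  ===========================================================  ====================  ==========================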

Data Type Details

MCP Client Data (mcp_clients)

{
    "client_id": "mcp_abc123",
    "client_secret": "hashed_secret",
    "client_name": "My MCP Client",
    "redirect_uris": ["https://example.com/callback"],
    "grant_types": ["authorization_code", "refresh_token"],
    "response_types": ["code"],
    "trust_type": "mcp_client",
    "created_at": 1703001234,
    "actor_id": "actor123"
}

MCP Access Token (mcp_tokens)

{
    "token_id": "unique_token_id",
    "token": "aw_access_token_value",
    "actor_id": "actor123",
    "client_id": "mcp_abc123",
    "created_at": 1703001234,
    "expires_at": 1703004834,
    "expires_in": 3600,
    "google_token_key": "google_token_access_xyz"
}

Here expires_at is created_at + expires_in (3600 seconds), and google_token_key refers to the stored Google token.
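
The timestamp arithmetic in this record can be sketched as follows. This is a minimal illustration, not the token manager's actual code: build_token_record and is_expired are hypothetical helpers, only a subset of the fields above is filled in, and the expiry test uses the same comparison as the validate_access_token excerpt later in this document.

```python
import time
from typing import Any, Dict, Optional


def build_token_record(token: str, actor_id: str, client_id: str,
                       expires_in: int = 3600,
                       now: Optional[int] = None) -> Dict[str, Any]:
    """Build a token record whose expires_at is created_at + expires_in."""
    created_at = int(time.time()) if now is None else now
    return {
        "token": token,
        "actor_id": actor_id,
        "client_id": client_id,
        "created_at": created_at,
        "expires_at": created_at + expires_in,
        "expires_in": expires_in,
    }


def is_expired(record: Dict[str, Any], now: Optional[int] = None) -> bool:
    """A record is expired once the current time is past expires_at."""
    current = int(time.time()) if now is None else now
    return current > record["expires_at"]


record = build_token_record("aw_access_token_value", "actor123", "mcp_abc123",
                            now=1703001234)
print(record["expires_at"])                # 1703004834
print(is_expired(record, now=1703004834))  # False
print(is_expired(record, now=1703004835))  # True
```

Passing now explicitly keeps the helpers deterministic in tests; production callers would leave it unset.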

OAuth Session (oauth_sessions)

{
    "token_data": {"access_token": "...", "refresh_token": "..."},
    "user_info": {"email": "user@example.com", "name": "User"},
    "provider": "google",
    "created_at": 1703001234,
    "verified_emails": ["user@example.com"],
    "pkce_verifier": "base64_verifier_string"
}

Trust Permission Override (trust_permissions)

{
    "actor_id": "actor123",
    "peer_id": "peer456",
    "properties": {
        "config/settings": "rw",
        "config/*": "r"
    },
    "updated_at": "2024-01-15T10:30:00Z"
}

Data Lifecycle and Cleanup

Actor Deletion

When actor.delete() is called, the following cleanup occurs:

  1. Peer Trustees - All peer trustee relationships deleted

  2. Properties - All actor properties deleted

  3. Subscriptions - All subscriptions deleted

  4. Trust Relationships - For each trust:

    • Subscriptions for that peer deleted

    • Trust permissions deleted

    • If OAuth2 client trust: triggers client cleanup (tokens revoked, indexes cleaned)

    • Trust record deleted

  5. Attribute Buckets - attribute.Buckets(actor_id).delete() removes ALL buckets:

    • _internal

    • mcp_clients

    • mcp_tokens

    • mcp_refresh_tokens

    • mcp_auth_codes

    • mcp_google_tokens

    • Any custom buckets

  6. Actor Record - Actor deleted from _actors table

MCP Client Deletion

When an MCP client is deleted via client_registry.delete_client():

  1. Token Revocation - All access and refresh tokens for the client revoked

  2. Client Data - Removed from actor’s mcp_clients bucket

  3. Global Index - Removed from system actor’s client_index

  4. Trust Relationship - OAuth2 client trust deleted

  5. Google Token Data - Associated Google tokens cleaned up

Token Expiration Handling

Current Behavior: Lazy Deletion Only

Expired tokens are deleted only when accessed and found to be expired:

# Example from token_manager.py (method excerpt; the module needs "import time")
def validate_access_token(self, token):
    token_data = self._get_access_token(token)
    if token_data and int(time.time()) > token_data["expires_at"]:
        self._remove_access_token(token)  # Lazy deletion
        return None
    return token_data

Known Limitations:

  • No scheduled garbage collection

  • No DynamoDB TTL configured

  • Expired but never-accessed tokens accumulate

  • Abandoned OAuth sessions may persist
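
To make the accumulation problem concrete, here is a sketch of the batch sweep that lazy deletion lacks. It is an assumption-laden illustration: sweep_expired is a hypothetical helper that works on a plain dict of token records, and enumerating a real bucket is left out. Such a sweep would belong in a scheduled job, not in request handling.

```python
import time
from typing import Any, Dict, List, Optional


def sweep_expired(items: Dict[str, Dict[str, Any]],
                  now: Optional[int] = None) -> List[str]:
    """Delete every record whose expires_at has passed; return the removed names."""
    current = int(time.time()) if now is None else now
    expired = [
        name for name, data in items.items()
        # Records without expires_at are treated as permanent and kept.
        if "expires_at" in data and current > data["expires_at"]
    ]
    for name in expired:
        del items[name]
    return sorted(expired)


tokens = {
    "tok_old": {"expires_at": 1703000000},    # expired, never accessed again
    "tok_live": {"expires_at": 1703009999},   # still valid
    "tok_permanent": {"client_id": "mcp_abc123"},
}
removed = sweep_expired(tokens, now=1703004834)
print(removed)         # ['tok_old']
print(sorted(tokens))  # ['tok_live', 'tok_permanent']
```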

Cleanup Methods Available

The following cleanup methods exist but are not automatically called:

from actingweb.oauth_session import OAuth2SessionManager

session_mgr = OAuth2SessionManager(config)

# Clear expired OAuth sessions (10 min TTL)
cleared = session_mgr.clear_expired_sessions()

# Clear expired SPA tokens
cleared = session_mgr.cleanup_expired_tokens()

Known Issues and Gaps

Data That May Accumulate

Potential Data Accumulation

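==========================  ==========  ==============================================================
Data Type                   Risk Level  Description
==========================  ==========  ==============================================================
Expired OAuth sessions      Medium      Sessions from abandoned OAuth flows (10 min TTL, lazy cleanup)
Expired SPA refresh tokens  High        2-week TTL, accumulates if users don’t refresh
Expired MCP refresh tokens  Critical    30-day TTL, significant accumulation potential
Expired auth codes          Medium      Codes from abandoned OAuth flows (10 min TTL)
Orphaned Google token data  Medium      May remain if refresh token deleted before access token
==========================  ==========  ==============================================================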

Cleanup Not Implemented

  1. Reverse Token Lookups - _revoke_refresh_tokens_for_access_token() logs a warning but performs no deletion

  2. Scheduled Cleanup - No cron/background task for expired data

  3. DynamoDB TTL - Not configured on the attributes table
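
For the DynamoDB TTL gap, the request itself is small. The sketch below only builds the parameters for boto3's DynamoDB client method update_time_to_live(); the table name "my_attributes_table" and the attribute name "ttl" are placeholders, not values taken from ActingWeb, and ttl_request is a hypothetical helper. DynamoDB expects the named attribute to be a Number holding a Unix epoch time in seconds.

```python
from typing import Any, Dict


def ttl_request(table_name: str, attribute_name: str = "ttl") -> Dict[str, Any]:
    """Build keyword arguments for the DynamoDB client's update_time_to_live()."""
    return {
        "TableName": table_name,
        "TimeToLiveSpecification": {
            "Enabled": True,
            "AttributeName": attribute_name,  # placeholder attribute name
        },
    }


params = ttl_request("my_attributes_table")  # placeholder table name
print(params)

# With boto3 installed and AWS credentials configured, the call would be:
#   import boto3
#   boto3.client("dynamodb").update_time_to_live(**params)
```

DynamoDB removes expired items in the background some time after they expire, so readers should still check expires_at themselves.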

Maintenance Recommendations

Note

ActingWeb typically runs in AWS Lambda with hundreds of concurrent containers. Fast cold start time is critical. Never add cleanup logic to the serving path.

See also

For detailed deployment instructions, including DynamoDB TTL configuration and cleanup Lambda setup, see the Database Maintenance Guide.

Anti-Patterns for Lambda

Warning

Do NOT do these in Lambda environments:

  • Startup cleanup (adds cold start latency, thundering herd)

  • Request-based periodic cleanup (unpredictable latency spikes)

  • Any synchronous cleanup in the serving path

Monitoring Recommendations

  1. CloudWatch metric for _actingweb_oauth2 bucket item count

  2. Alert if item count exceeds threshold (indicates TTL not working)

  3. Monitor DynamoDB TTL deletion metrics

  4. Track cleanup Lambda execution results

Client Registry Example

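from actingweb import attribute


class ClientRegistry:
    """Example registry: per-actor client records plus a global client_id -> actor_id index."""

    def __init__(self, config):
        self.config = config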

    def register_client(self, actor_id: str, client_data: dict) -> None:
        bucket = attribute.Attributes(actor_id=actor_id, bucket="clients", config=self.config)
        bucket.set_attr(name=client_data["client_id"], data=client_data)
        index = attribute.Attributes(actor_id="_global_registry", bucket="client_index", config=self.config)
        index.set_attr(name=client_data["client_id"], data=actor_id)

    def find_client(self, client_id: str) -> dict | None:
        index = attribute.Attributes(actor_id="_global_registry", bucket="client_index", config=self.config)
        actor_id_attr = index.get_attr(name=client_id)
        if not actor_id_attr or "data" not in actor_id_attr:
            return None
        actor_id = actor_id_attr["data"]
        bucket = attribute.Attributes(actor_id=actor_id, bucket="clients", config=self.config)
        client_attr = bucket.get_attr(name=client_id)
        return client_attr.get("data") if client_attr else None

    def delete_client(self, actor_id: str, client_id: str) -> bool:
        """Delete client and clean up index - ALWAYS do both!"""
        # Delete from actor bucket
        bucket = attribute.Attributes(actor_id=actor_id, bucket="clients", config=self.config)
        bucket.delete_attr(name=client_id)
        # Delete from global index
        index = attribute.Attributes(actor_id="_global_registry", bucket="client_index", config=self.config)
        index.delete_attr(name=client_id)
        return True
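
If delete_client() is interrupted between its two deletes, the index can end up pointing at a client record that no longer exists. A periodic audit can catch this. The sketch below is an assumption-based illustration using plain dicts in place of the index bucket and the per-actor client buckets; find_orphaned_index_entries is a hypothetical helper, not part of ActingWeb.

```python
from typing import Dict, List


def find_orphaned_index_entries(
    index: Dict[str, str],
    clients_by_actor: Dict[str, Dict[str, dict]],
) -> List[str]:
    """Return client IDs whose index entry points at a missing client record."""
    orphans = []
    for client_id, actor_id in index.items():
        if client_id not in clients_by_actor.get(actor_id, {}):
            orphans.append(client_id)
    return sorted(orphans)


# index maps client_id -> actor_id, as in the example registry above.
index = {"mcp_abc123": "actor123", "mcp_stale": "actor999"}
clients_by_actor = {"actor123": {"mcp_abc123": {"client_name": "My MCP Client"}}}

orphans = find_orphaned_index_entries(index, clients_by_actor)
print(orphans)  # ['mcp_stale']
```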

Best Practices

  1. JSON-serializable data only - All data must be JSON serializable

  2. Use attributes for sensitive data - Better than properties for secrets

  3. Keep bucket names stable - Treat keys as logical IDs

  4. Always clean up indexes - When deleting data with global indexes, delete both

  5. Set appropriate TTLs - Plan for data lifecycle from the start

  6. Monitor bucket growth - Especially for system actor buckets

  7. Implement lazy + scheduled cleanup - Both patterns together work best

See Also