AWS S3
Send tracked database events to AWS S3 or S3-compatible storage for analytics and data warehousing
Overview
AWS S3 (Simple Storage Service) is a highly scalable object storage service. pg_track_events supports sending data to S3 or any S3-compatible storage service (such as Cloudflare R2 or MinIO) in two formats:
- Processed Events - Transformed analytics events with defined event names and properties
- Raw DB Events - Raw database change events showing the exact changes made to your database
Events are stored in ndjson (newline-delimited JSON) format, with each line representing a single event.
Processed Events Destination
This destination sends your transformed analytics events to S3 for analysis and reporting.
Configuration
To configure S3 as a destination for your processed events, add the following to your pg_track_events.config.yaml file:
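A sketch of what that configuration might look like is below. Only the option names are documented on this page; the surrounding structure (a destinations list and the s3 type name) and all values are illustrative assumptions.

```yaml
# Illustrative sketch only: "destinations" and "type: s3" are assumed
# structure; the option names match the list below, values are placeholders.
destinations:
  - type: s3
    filter: "user_*"            # optional glob filter on event names
    bucket: my-analytics-bucket
    region: us-east-1
    accessKey: AKIAXXXXXXXXXXXXXXXX
    secretKey: your-secret-access-key
    rootDir: processed-events   # optional base path within the bucket
    # endpoint: https://s3.compat.example.com  # S3-compatible services only
```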
Configuration Options
- filter: Event name glob filter (optional). Learn about filtering events.
- bucket: Your S3 bucket name (required)
- region: AWS region where your bucket is located (required)
- accessKey: Your AWS access key ID (required)
- secretKey: Your AWS secret access key (required)
- rootDir: Base directory path within the bucket where events will be stored (optional)
- endpoint: Custom endpoint URL for S3-compatible services (optional)
Data Format and Storage
Processed events are stored in ndjson format with each line containing a single event. Files are organized by event name, with events for each type stored in separate files.
The files are named using the pattern {timestamp}-{agentID}.ndjson, where:
- timestamp is in the format "YYYYMMDDTHHMMSSZ" (UTC)
- agentID is a unique identifier for the pg_track_events agent instance
Files are automatically rotated after reaching a maximum size or event count to ensure efficient processing.
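For illustration, with rootDir set to events and two tracked event types, the resulting object keys might look like the following. The per-event-name directory layout and the agent ID shown are assumptions for illustration, not documented values:

```
events/user_signed_up/20250115T103045Z-agent-1f3a.ndjson
events/order_created/20250115T110000Z-agent-1f3a.ndjson
```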
Raw DB Events Destination
This destination sends raw database change events to S3, allowing you to see exactly what changed in your database.
Configuration
To configure S3 as a destination for your raw DB events, add the following to your pg_track_events.config.yaml file:
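As with processed events, a sketch of the configuration might look like this. The surrounding structure (destinations list, rawS3 type name) and all values are illustrative assumptions; only the option names come from this page.

```yaml
# Illustrative sketch only: "destinations" and the "rawS3" type name are
# assumed; option names match the list below, values are placeholders.
destinations:
  - type: rawS3
    filter: "*"              # optional glob filter
    bucket: my-raw-events-bucket
    region: us-east-1
    accessKey: AKIAXXXXXXXXXXXXXXXX
    secretKey: your-secret-access-key
    rootDir: raw-events      # optional base path within the bucket
    # endpoint: https://s3.compat.example.com  # S3-compatible services only
```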
Configuration Options
- filter: Event name glob filter (optional). Learn about filtering events.
- bucket: Your S3 bucket name (required)
- region: AWS region where your bucket is located (required)
- accessKey: Your AWS access key ID (required)
- secretKey: Your AWS secret access key (required)
- rootDir: Base directory path within the bucket where events will be stored (optional)
- endpoint: Custom endpoint URL for S3-compatible services (optional)
Data Format and Storage
Raw DB events are stored in ndjson format with each line containing a single event. Files are organized by table name, with events for each table stored in separate files.
The files are named using the pattern {timestamp}-{agentID}.ndjson, where:
- timestamp is in the format "YYYYMMDDTHHMMSSZ" (UTC)
- agentID is a unique identifier for the pg_track_events agent instance
Files are automatically rotated after reaching a maximum size or event count to ensure efficient processing.
Setting Up AWS S3
Creating an IAM User
- Go to the AWS IAM Console
- Navigate to "Users" and click "Add user"
- Give the user a name and select "Programmatic access"
- Attach policies directly or create a custom policy with the following permissions:
  - s3:PutObject
  - s3:GetObject
  - s3:ListBucket
  - s3:DeleteObject (if needed)
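The permissions above can be granted with a standard IAM policy document like the one below. Replace your-bucket-name with your actual bucket; note that the object-level actions apply to the objects ARN (with /*), while s3:ListBucket applies to the bucket ARN itself.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::your-bucket-name/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::your-bucket-name"
    }
  ]
}
```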
Creating a Bucket
- Go to the S3 Console
- Click "Create bucket"
- Choose a globally unique bucket name and select your preferred region
- Configure bucket settings according to your needs
- Complete the bucket creation process
Using S3-Compatible Services
pg_track_events is compatible with any storage service that implements the S3 API:
Cloudflare R2
To use Cloudflare R2, set the endpoint to your R2 endpoint URL:
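A sketch of the R2-specific options, assuming the same option names as the configuration above (replace &lt;account-id&gt; with your Cloudflare account ID):

```yaml
# R2 exposes an S3-compatible endpoint per account; "auto" is the
# region value R2 accepts. Credentials are R2 API tokens, not AWS keys.
endpoint: https://<account-id>.r2.cloudflarestorage.com
region: auto
bucket: my-r2-bucket
accessKey: your-r2-access-key-id
secretKey: your-r2-secret-access-key
```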
MinIO
For MinIO:
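A sketch for a self-hosted MinIO server, again assuming the option names documented above; the host, port, and credentials are placeholders:

```yaml
# Point the endpoint at your MinIO server; MinIO defaults to
# us-east-1 as its region unless configured otherwise.
endpoint: http://minio.example.com:9000
region: us-east-1
bucket: my-minio-bucket
accessKey: minio-access-key
secretKey: minio-secret-key
```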
Environment Variables
For security, you can use environment variables for sensitive information:
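A sketch of what this might look like, assuming the config file supports ${VAR}-style environment variable interpolation; check the pg_track_events documentation for the exact substitution syntax it supports:

```yaml
# Assumption: ${VAR} interpolation syntax is illustrative, not confirmed.
# Keeps credentials out of the config file and version control.
accessKey: ${AWS_ACCESS_KEY_ID}
secretKey: ${AWS_SECRET_ACCESS_KEY}
```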
Note
After making configuration changes, restart the pg_track_events agent for them to take effect.