AWS RDS Database Management Tutorial#
Introduction#
This tutorial walks through the AWS RDS (Relational Database Service) interface, showing you how to manage and monitor the Neotoma database instance. This guide covers the essential features you'll need for day-to-day database administration.
Prerequisites#
- AWS account with appropriate permissions
- Access to the Neotoma AWS environment
- Basic understanding of PostgreSQL databases
Estimated time: 20-30 minutes
Accessing the AWS Console#
Step 1: Log in to AWS Console#
When you first log in to AWS, you'll see the Console Home page.
The console displays:
- Recently visited services
- Quick access to common AWS resources
- Search bar for finding services
Step 2: Pin Frequently Used Services#
To streamline your workflow, you can pin services you use frequently.
- Type "RDS" in the search bar at the top
- Click the star icon next to "RDS" to pin it
- The service will now appear in your top navigation bar

Tip: You can unpin services you no longer use by clicking the star again. This helps keep your interface clean and focused on the tools you actually need.
Understanding RDS Database Instances#
What is an RDS Instance?#
An RDS database instance is a dedicated database server in the cloud. Key concepts:
- One instance = One database server (not one database)
- A single instance can host multiple databases (e.g., neotoma and neotomatank)
- Each instance has a specific combination of CPU, memory, and storage, along with any custom configuration you apply
- Instances are virtual servers optimized specifically for database workloads
Step 3: Navigate to Your Database Instance#
- Click on RDS in your pinned services or search bar
- From the RDS dashboard, you'll see an overview including:
  - Available storage
  - Number of database instances
  - Quick links to tutorials
- Click on DB instances in the left sidebar
Screenshot placeholder: RDS dashboard showing overview statistics
You should see your database instance listed. For Neotoma, the identifier is neotomaPrivate.
Database Instance Overview#
Step 4: View Instance Details#
Click on your database identifier (e.g., neotomaPrivate) to see detailed information.
Screenshot placeholder: Database instance list showing neotomaPrivate with status "Available"
Key information displayed:
| Field | What It Means |
|---|---|
| Status | Should show "Available" when running normally |
| Region | "us-east-2" (Ohio) for Neotoma |
| Engine | PostgreSQL version |
| DB Instance Class | Server size (e.g., db.m5.large) |
| Storage | Amount of disk space allocated |
Understanding Database Status#
The Status field is critical for monitoring:
- Available: Database is running normally
- Storage-full: No available disk space (requires immediate action)
- Backing-up: Automated backup in progress
- Modifying: Configuration changes being applied
Warning: If status shows "unavailable" or "storage-full", the database needs immediate attention. See the Troubleshooting section below.
Database Actions and Quick Operations#
Step 5: Common Database Actions#
At the top of the instance details page, you'll see an Actions dropdown menu.
Screenshot placeholder: Actions dropdown menu showing available operations
Common actions include:
Take Snapshot#
Creates a point-in-time backup of the entire database.
When to use:
- Before major data uploads or changes
- Before schema modifications
- As an extra backup before maintenance
How to:
- Click Actions → Take snapshot
- Provide a descriptive name (e.g., "pre-upload-2024-12-10")
- Click Take Snapshot
Snapshots are stored in AWS and don't count against your database storage.
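The naming step above can be automated with a small helper. This is a minimal sketch (the function and slug rules are our own, not an AWS API): it lowercases a purpose string and appends an ISO date, which keeps names within RDS identifier rules (letters, digits, and hyphens only, no trailing hyphen).

```python
import datetime
import re

def snapshot_name(purpose, when=None):
    """Build a descriptive snapshot identifier like 'pre-upload-2024-12-10'.

    RDS snapshot identifiers may contain only letters, digits, and
    hyphens, must start with a letter, and must not end with a hyphen.
    """
    when = when or datetime.date.today()
    # Collapse any run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", purpose.lower()).strip("-")
    return f"{slug}-{when.isoformat()}"

print(snapshot_name("Pre Upload", datetime.date(2024, 12, 10)))
# pre-upload-2024-12-10
```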
Reboot Database#
Restarts the database server.
When to use:
- After certain parameter group changes
- When database becomes unresponsive
- If temporary or WAL files fill storage completely (rare)
Caution: Rebooting will briefly disconnect all active users and applications.
Monitoring Database Performance#
Step 6: Review Connection and Resource Information#
Scroll down to see detailed instance information:
Screenshot placeholder: Configuration section showing endpoint, port, and VPC information
Key Configuration Details#
Endpoint: neotomaprivate.xxxxx.us-east-2.rds.amazonaws.com
- This is the actual database server address
- Used in environment variables for applications (API, AppRunner, etc.)
- For local development, you'll use an SSH tunnel to localhost:5555 instead
Port: 5432
- Standard PostgreSQL port
- Required for connection strings
VPC (Virtual Private Cloud): Network isolation
- Database is in a private subnet for security
- Cannot be accessed directly from the internet
- Must tunnel through public VPC for external access
Public Accessibility: No
- Database is not exposed to the public internet
- Access only through secure tunnels or authorized AWS services
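The two connection paths above (SSH tunnel versus direct endpoint) differ only in host and port. A minimal sketch, assuming a hypothetical readonly user and the redacted placeholder endpoint from this guide:

```python
def pg_url(user, db, via_tunnel):
    """Assemble a PostgreSQL connection URL.

    Locally, connect through the SSH tunnel on localhost:5555;
    inside AWS, services use the private endpoint on 5432. The
    endpoint hostname below is the redacted placeholder from this
    guide, not a real address.
    """
    if via_tunnel:
        host, port = "localhost", 5555
    else:
        host, port = "neotomaprivate.xxxxx.us-east-2.rds.amazonaws.com", 5432
    return f"postgresql://{user}@{host}:{port}/{db}"

print(pg_url("readonly", "neotoma", via_tunnel=True))
# postgresql://readonly@localhost:5555/neotoma
```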
Step 7: Understanding Security Groups#
Security groups control network access to your database.
Screenshot placeholder: Security groups section showing RDS-EC2 and AppRunner groups
You'll see several security group rules:
- RDS-EC2: Allows SSH tunnel connections from EC2 instances
- AppRunner-Group-for-RDS: Allows API and application connections
- Each rule defines which services can connect to the database
Note: Security groups are nested and include multiple sub-rules. In general, these are stable and don't require frequent changes. You'll only modify them when adding new services.
Step 8: View Active Sessions#
The Current sessions section shows all active database connections.
Screenshot placeholder: Current sessions showing multiple connections
Each connection represents:
- The API making requests
- Tilia application connections
- DBeaver or other database tools
- Explorer (via the API)
- Direct user connections
Tip: If you see an unusually high number of connections, it may indicate:
- A runaway process
- Connection pooling issues
- An application not closing connections properly
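To see where those connections come from, you can query PostgreSQL's pg_stat_activity view from any connected client (DBeaver, psql) with sufficient privileges. The SQL below is a generic summary query, not anything Neotoma-specific:

```python
# Summarize live sessions by user, application, and client address.
# Run from DBeaver or psql; seeing other users' sessions requires
# elevated privileges.
CONNECTION_SUMMARY_SQL = """
SELECT usename, application_name, client_addr, count(*) AS sessions
FROM pg_stat_activity
WHERE state IS NOT NULL
GROUP BY usename, application_name, client_addr
ORDER BY sessions DESC;
"""
print(CONNECTION_SUMMARY_SQL.strip())
```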
Monitoring Database Health#
Step 9: Review Monitoring Metrics#
Click on the Monitoring tab to see performance graphs.
Screenshot placeholder: Monitoring tab showing multiple performance graphs
Key Metrics to Watch#
CPU Utilization#
Shows how much processing power is being used.
- Normal: 5-30% for typical Neotoma usage
- High: >70% sustained may indicate need for optimization or larger instance
- Current status: Generally quite low, suggesting we may be over-provisioned
Burst Balance#
Should normally be at 100%.
- Drops when database experiences sudden high activity
- If consistently below 100%, may need different instance type
- Neotoma rarely has issues with this
Connections#
Number of active database connections.
- Monitor for unusual spikes
- Too many connections can slow performance
- Each application (API, Tilia, etc.) maintains connection pools
Free Storage Space#
Amount of available disk space.
- Critical metric - running out of space will cause database failure
- An alarm is configured to alert when space is low
- If space fills up, database will become unavailable
Screenshot placeholder: Free storage space graph showing current usage
Read/Write Latency#
How long database operations take.
- Low latency = fast queries
- Spikes may indicate slow queries or indexing issues
- Generally stable for Neotoma
CloudWatch Monitoring and Alarms#
Step 10: Configure CloudWatch Alarms#
CloudWatch provides advanced monitoring and alerting.
- From the Monitoring tab, click View in CloudWatch or navigate to CloudWatch from the AWS console
- You can create custom alarms for specific conditions
Screenshot placeholder: CloudWatch alarms configuration page
Setting Up an Alarm#
Example: Alert when free storage drops below 20GB
- Click Create Alarm
- Select metric: FreeStorageSpace
- Set condition: Lower than 20 GB
- Configure notification (SNS topic or email)
- Name the alarm descriptively: "RDS-Low-Storage-Alert"
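One gotcha when entering the threshold: CloudWatch reports the FreeStorageSpace metric in bytes, so a "20 GB" alarm must be entered as a byte count. A quick conversion helper:

```python
def gib_to_bytes(gib):
    """Convert a gibibyte threshold to the byte count CloudWatch
    expects for the RDS FreeStorageSpace metric."""
    return int(gib * 1024 ** 3)

print(gib_to_bytes(20))
# 21474836480
```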
Existing Alarms for Neotoma:
- Low free storage space (configured)
You can add alarms for:
- High CPU usage
- Too many connections
- High query latency
Reviewing Database Logs#
Step 11: Access Error Logs#
Logs are essential for debugging and monitoring database health.
- Click on the Logs & events tab
- You'll see a list of log files organized by date
Screenshot placeholder: Logs & events tab showing multiple log files
Understanding Log Files#
Each log file shows:
- File name: Includes a date/time stamp
- Size: Larger files may indicate issues or high activity
- Last written: When the log was last updated
Viewing a Log File#
- Click on a log file name
- The log will open showing database events
- Most entries are routine checkpoints and status updates
Screenshot placeholder: Log file contents showing various database events
What to Look For in Logs#
Normal entries:
- Checkpoint completed
- Connection established/closed
- Automated backup started/completed
Issues to investigate:
- Repeated error messages
- Failed query attempts
- Connection failures
- Out of memory warnings
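If you download a log file, a few lines of scripting can separate routine entries from the ones listed above. A minimal sketch with illustrative log lines (the timestamps and table name are made up):

```python
import re

def flag_log_lines(lines):
    """Return only the log lines worth investigating: PostgreSQL
    ERROR/FATAL/PANIC entries and out-of-memory warnings."""
    pattern = re.compile(r"\b(ERROR|FATAL|PANIC|out of memory)\b")
    return [line for line in lines if pattern.search(line)]

# Illustrative entries; real log lines carry more prefix detail.
sample = [
    '2024-12-10 03:00:01 UTC LOG: checkpoint complete',
    '2024-12-10 03:02:14 UTC ERROR: relation "sites_tmp" does not exist',
    '2024-12-10 03:02:15 UTC LOG: connection received',
]
print(flag_log_lines(sample))
# ['2024-12-10 03:02:14 UTC ERROR: relation "sites_tmp" does not exist']
```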
Debugging Failed Queries#
When you see an error in the logs:
- Identify the query: Look for SQL statements that failed
- Check the error message: What went wrong?
- Note the timestamp: Correlate with application logs
- Download the log: Click Download to save for analysis
Example use cases:
- API endpoint returning errors → Check logs for failed queries
- Tilia not saving data → Look for INSERT/UPDATE failures
- Landing pages loading slowly → Identify long-running queries
Limitations: Identifying Query Sources#
Currently, all applications (API, Tilia, Explorer) connect using the same database role: neotoma_admin.
Limitation: You cannot immediately tell which application executed a query just from the database logs.
Workaround:
- Identify the failing query in the database logs
- Go to the specific application (API, Tilia, etc.)
- Check the application's logs in AppRunner
- Match timestamps to find the source
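The timestamp-matching step can be rough-checked programmatically. A sketch that treats two log entries as related when they fall within a few seconds of each other (the timestamp format and the 5-second tolerance are assumptions):

```python
from datetime import datetime, timedelta

def near_in_time(db_ts, app_ts, tolerance_s=5):
    """Treat a database log entry and an application log entry as
    related when their timestamps are within tolerance_s seconds.
    The timestamp format and 5-second default are assumptions."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = abs(datetime.strptime(db_ts, fmt) - datetime.strptime(app_ts, fmt))
    return delta <= timedelta(seconds=tolerance_s)

print(near_in_time("2024-12-10 03:02:14", "2024-12-10 03:02:16"))
# True
```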
Future Improvement: Create separate database roles for each application (e.g., neotoma_api_user, tilia_user) to make log analysis easier.
Database Backups and Snapshots#
Step 12: Review Backup Configuration#
Click on the Maintenance & backups tab.
Screenshot placeholder: Maintenance & backups showing automated backup settings
Automated Backups#
The database automatically creates daily backups:
- Backup window: 10:00 PM Ohio time
- Retention period: 7 days (configurable)
- Type: Full database snapshot
- Storage location: AWS S3
Types of Backups#
System (Automated) Snapshots#
Created automatically by AWS:
- Daily at the scheduled time
- Retained for the specified period
- Cannot be manually deleted
- Used for point-in-time recovery
Manual Snapshots#
Created by you as needed:
- Before major changes
- For long-term archival
- Can be kept indefinitely
- Can be deleted when no longer needed
Step 13: View Available Snapshots#
- Click Snapshots in the left sidebar of the RDS console
- Filter by your database instance
- View both system and manual snapshots
Screenshot placeholder: Snapshots list showing dates and types
Snapshot information includes:
- Snapshot ID: Unique identifier
- Creation date/time: When it was taken
- Type: System or Manual
- Size: Storage space used
- Status: Available, Creating, etc.
Step 14: Restore from a Snapshot#
If you need to recover the database:
- Select the snapshot you want to restore
- Click Actions → Restore snapshot
- Configure the restored instance:
  - Instance identifier (must be unique)
  - Instance type
  - VPC and security groups
- Click Restore DB Instance
Important: Restoring creates a NEW database instance - it doesn't overwrite the existing one. This is a safety feature.
Caution: After restoration, you'll need to:
- Update application connection strings
- Verify data integrity
- Test all connections
- Consider deleting the old instance if replacement was intended
Additional Backup Strategy#
Monthly Backups to S3#
In addition to automated RDS snapshots, Neotoma maintains monthly backups uploaded directly to S3.
How it works:
- A Python script creates a database dump
- The dump is compressed
- The compressed dump is uploaded to an S3 bucket for long-term storage
- The job is managed through the AWS Batch service (runs automatically)
- It uses Fargate for compute resources
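The dump-compress-upload sequence can be sketched as two shell steps. The bucket name, paths, and flags below are illustrative, not the actual Batch job's configuration:

```python
def backup_commands(db, bucket, stamp):
    """Sketch of the monthly backup's two shell steps: dump and
    compress, then copy to S3. Bucket name and paths are
    illustrative; the real job runs inside AWS Batch on Fargate."""
    dump_file = f"/tmp/{db}-{stamp}.sql.gz"
    dump = f"pg_dump --format=plain {db} | gzip > {dump_file}"
    upload = f"aws s3 cp {dump_file} s3://{bucket}/monthly/"
    return dump, upload

for cmd in backup_commands("neotoma", "neotoma-backups", "2024-12"):
    print(cmd)
```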
Note: This is separate from RDS snapshots and provides an additional layer of data protection. These backups can be downloaded and restored independently of AWS.
Database Configuration and Parameters#
Step 15: Understanding Parameter Groups#
Parameter groups define PostgreSQL configuration settings for your RDS instance.
- From the instance details, find Parameter group under Configuration
- Click on the parameter group name (e.g., neotoma-postgres)
Screenshot placeholder: Parameter group details showing modified parameters
What Are Parameter Groups?#
- Collections of database engine configuration settings
- Control how PostgreSQL behaves
- Some changes require a database reboot
- Can be shared across multiple instances
Key Modified Parameters for Neotoma#
By default, RDS uses standard PostgreSQL settings. For Neotoma, several parameters have been customized:
shared_preload_libraries#
Modified to include: pg_cron, pg_stat_statements
- pg_cron: Enables scheduled database jobs
- pg_stat_statements: Tracks query performance statistics
- PostGIS: Spatial data extension (installed via CREATE EXTENSION rather than preloading)
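To confirm which extensions are actually installed in the database (as opposed to merely preloaded), you can run a standard catalog query from any SQL client:

```python
# Standard catalog query: lists every extension installed in the
# current database along with its version.
LIST_EXTENSIONS_SQL = "SELECT extname, extversion FROM pg_extension ORDER BY extname;"
print(LIST_EXTENSIONS_SQL)
```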
cron.database_name#
Specifies which database contains the cron scheduling tables.
Page Cost Settings#
- random_page_cost: Cost estimate for non-sequential disk reads
- seq_page_cost: Cost estimate for sequential disk reads
- These affect query planner decisions and indexing efficiency
When to Modify Parameters#
Common scenarios:
- Adding extensions: some extensions (e.g., pg_cron, pg_stat_statements) must be listed in shared_preload_libraries
- Performance tuning: Adjusting memory allocation, connection limits
- Logging: Changing what gets logged and in how much detail
- Maintenance: Auto-vacuum frequency, statistics collection
Caution:
- Some parameter changes require a database reboot
- Changes can significantly affect performance
- Test changes in a development environment first
- Document all modifications
Step 16: View and Edit Parameters#
To view current parameters:
- Open your parameter group
- Browse through categories or search for specific parameters
- Modified parameters are flagged as differing from the default
Screenshot placeholder: Parameter list showing modified vs. default values
To edit a parameter:
- Find the parameter you want to change
- Click Edit parameters
- Modify the value
- Save changes
- Note whether a reboot is required (indicated in the interface)
Database Maintenance#
Automatic Minor Version Updates#
The database is configured to automatically apply minor version updates during the maintenance window.
What this means:
- Security patches are applied automatically
- Minor PostgreSQL updates (e.g., 15.2 → 15.3)
- Occurs during low-traffic periods
- Minimal downtime
Pending Maintenance#
Check the Maintenance & backups tab for:
- Scheduled maintenance windows
- Available updates
- Required actions
Screenshot placeholder: Pending maintenance section showing next maintenance window
When Database Maintenance Happens#
Automatic maintenance includes:
- Operating system patches
- Database engine updates
- Hardware maintenance
You control:
- Maintenance window timing
- Whether to apply immediately or defer
- Major version upgrades (manual only)
Performance Optimization Notes#
Current Instance Size#
Neotoma uses db.m5.large:
- 2 vCPUs
- 8 GB RAM
- Moderate network performance
Is This the Right Size?#
Current observation: CPU utilization is typically quite low (5-15%).
Why we haven't downgraded:
- Burst traffic: Explorer, API, and Tilia can create sudden activity spikes
- Cache performance: Larger instance maintains more query cache in memory
- Response times: Users experience faster query responses
- Cost consideration: Database is already the highest AWS cost
Future consideration: A read replica would help with scaling read operations but would increase costs significantly.
Database Optimization Strategies#
Most optimization for Neotoma focuses on:
- Indexing: Ensuring foreign keys and frequently queried columns are indexed
- Query optimization: Improving slow queries rather than increasing hardware
- Connection pooling: Managing database connections efficiently in applications
- Vacuum management: Automatic cleanup of deleted rows (via auto-vacuum)
Why we don't need aggressive optimization:
- Relatively low write volume
- Most tables are fairly static
- The data table has high activity but is well-indexed
- Current performance meets user needs
When to Reboot the Database#
Reboot Scenarios#
You should reboot the database only in specific situations:
1. Storage Completely Full#
Symptoms:
- Database status shows "storage-full"
- WAL (Write-Ahead Log) files fill all available space
- Database becomes unresponsive
Action:
- Stop all write operations if possible
- Reboot database to clear temporary files
- Investigate what caused space to fill
- Consider increasing storage allocation
2. Parameter Group Changes#
Some parameter modifications require a reboot:
- Adding new extensions
- Changing memory allocation settings
- Modifying connection limits
Note: AWS will indicate if reboot is required when you save parameter changes.
3. Applying Major Updates#
When manually applying major PostgreSQL version updates, a reboot is typically required.
4. Restoring Snapshots#
The database automatically reboots when restoring from a snapshot.
How to Reboot#
- Navigate to your database instance
- Click Actions → Reboot
- Confirm the action
- Wait for status to return to "Available"
Caution:
- All active connections will be dropped
- Applications will temporarily lose database access
- Plan reboots during low-traffic periods
- Notify users of planned downtime
Database Isolation and Security#
Network Architecture#
The Neotoma database is configured with security best practices:
Private subnet:
- Database is NOT publicly accessible
- Cannot connect directly from internet
- Must use SSH tunnel or authorized AWS services
Security groups:
- Act as virtual firewalls
- Control which services can connect
- Nested rules for granular access control
VPC configuration:
- Database in private VPC
- Applications connect through secure channels
- EC2 instances provide SSH tunnel access
Why This Matters#
This isolation means:
- For local development: Use an SSH tunnel to localhost:5555
- For AWS services: Use the internal endpoint directly
- For security: Database is protected from external attacks
- For maintenance: Fewer external factors affect performance
Key takeaway: The database exists in its own protected space. Applications reach into it, but it doesn't need to deal with complex networking or cross-VPC communications. This simplifies management significantly.
Troubleshooting Common Issues#
Issue: Database Status "Storage-Full"#
Symptoms:
- Cannot write to database
- Applications return errors
- Status shows storage is full
Resolution:
- Check logs to identify what's consuming space
- Consider clearing old WAL files (carefully!)
- Increase allocated storage:
- Actions → Modify
- Increase storage allocation
- Apply immediately if critical
- Reboot if necessary after freeing space
Issue: High CPU Usage#
Symptoms:
- Slow query response times
- CPU graph shows sustained >70% usage
- Users report lag
Resolution:
- Check CloudWatch metrics to identify spike timing
- Review logs during high-CPU period
- Look for slow queries using pg_stat_statements
- Optimize problematic queries
- Consider instance upgrade if persistent
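Since pg_stat_statements is enabled on this instance, step 3 can be a single query. The column names below are for PostgreSQL 13 and later (older versions use total_time instead of total_exec_time):

```python
# Ten most expensive queries by cumulative execution time.
# Requires the pg_stat_statements extension (preloaded on this
# instance via shared_preload_libraries).
TOP_QUERIES_SQL = """
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
"""
print(TOP_QUERIES_SQL.strip())
```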
Issue: Too Many Connections#
Symptoms:
- "Too many clients" errors in logs
- Applications can't connect
- Connection pool exhausted
Resolution:
- Check current connections count
- Identify which applications have open connections
- Verify connection pooling configuration
- Check for applications not closing connections
- Consider increasing max_connections parameter (requires reboot)
Issue: Slow Query Performance#
Symptoms:
- Specific queries take long time
- Timeouts in applications
- High read latency
Resolution:
- Identify slow queries in logs or pg_stat_statements
- Run EXPLAIN ANALYZE on slow queries
- Check if indexes exist on frequently queried columns
- Add indexes if needed
- Optimize query structure
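For step 2, wrapping a suspect query in EXPLAIN (ANALYZE, BUFFERS) shows the actual plan, row counts, and timing. A tiny helper (the example query is illustrative):

```python
def explain(query):
    """Prefix a query so PostgreSQL reports its real execution plan.

    Note: ANALYZE actually runs the query, so avoid wrapping
    destructive statements this way.
    """
    return f"EXPLAIN (ANALYZE, BUFFERS) {query}"

print(explain("SELECT * FROM sites WHERE sitename ILIKE '%lake%'"))
```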
Summary and Best Practices#
Key Takeaways#
- The RDS interface is your primary tool for database health monitoring and management
- Status should always show "Available" - any other status requires investigation
- Snapshots are automatic but manual snapshots before changes are good practice
- Logs are essential for debugging issues across applications
- CloudWatch alarms provide proactive monitoring
- Security groups are stable - rarely need modification
- The database is isolated which simplifies management
- Reboots are rare - only needed for specific scenarios
Regular Monitoring Tasks#
Daily:
- Check database status is "Available"
- Review CloudWatch alarms for any alerts
Weekly:
- Review recent error logs
- Check storage space usage
- Verify automated backups are running
Monthly:
- Review performance metrics
- Check for AWS recommendations
- Verify S3 backup completed
- Review and clean up old manual snapshots
Before major changes:
- Take manual snapshot
- Review current resource usage
- Plan for potential downtime
- Notify users if reboot required
When to Get Help#
Contact the database administrator or AWS support if:
- Database status is not "Available" for more than a few minutes
- Repeated errors appear in logs
- Storage is filling up rapidly
- Performance degrades significantly
- Parameter changes don't take effect as expected
- AWS recommendations seem urgent
Additional Resources#
AWS Documentation:
Internal Documentation:
- SSH Tunnel Setup (placeholder)
- Database Schema Documentation
- API Configuration
Video Walkthrough:
[Link to accompanying video recording] (placeholder)
Last updated: [Date]
Reviewed by: [Name]