AWS RDS Database Management Tutorial#

Introduction#

This tutorial walks through the AWS RDS (Relational Database Service) interface, showing you how to manage and monitor the Neotoma database instance. This guide covers the essential features you'll need for day-to-day database administration.

Prerequisites#

  • AWS account with appropriate permissions
  • Access to the Neotoma AWS environment
  • Basic understanding of PostgreSQL databases

Estimated time: 20-30 minutes


Accessing the AWS Console#

Step 1: Log in to AWS Console#

When you first log in to AWS, you'll see the Console Home page.

The console displays:

  • Recently visited services
  • Quick access to common AWS resources
  • Search bar for finding services

Step 2: Pin Frequently Used Services#

To streamline your workflow, you can pin services you use frequently.

  1. Type "RDS" in the search bar at the top
  2. Click the star icon next to "RDS" to pin it
  3. The service will now appear in your top navigation bar

Screenshot placeholder: Main AWS console page showing a search for "RDS", with the top result starred so it appears in the navigation bar.

Tip: You can unpin services you no longer use by clicking the star again. This helps keep your interface clean and focused on the tools you actually need.


Understanding RDS Database Instances#

What is an RDS Instance?#

An RDS database instance is a dedicated database server in the cloud. Key concepts:

  • One instance = One database server (not one database)
  • A single instance can host multiple databases (e.g., neotoma and neotomatank)
  • Each instance has a specific combination of CPU, memory, and storage, as well as any custom configurations you may apply.
  • Instances are virtual servers optimized specifically for database workloads

Step 3: Navigate to Your Database Instance#

  1. Click on RDS in your pinned services or search bar
  2. From the RDS dashboard, you'll see an overview including:
      • Available storage
      • Number of database instances
      • Quick links to tutorials
  3. Click on DB instances in the left sidebar

Screenshot placeholder: RDS dashboard showing overview statistics

You should see your database instance listed. For Neotoma, the identifier is neotomaPrivate.


Database Instance Overview#

Step 4: View Instance Details#

Click on your database identifier (e.g., neotomaPrivate) to see detailed information.

Screenshot placeholder: Database instance list showing neotomaPrivate with status "Available"

Key information displayed:

| Field             | What It Means                                   |
|-------------------|-------------------------------------------------|
| Status            | Should show "Available" when running normally   |
| Region            | "us-east-2" (Ohio) for Neotoma                  |
| Engine            | PostgreSQL version                              |
| DB Instance Class | Server size (e.g., db.m5.large)                 |
| Storage           | Amount of disk space allocated                  |

Understanding Database Status#

The Status field is critical for monitoring:

  • Available: Database is running normally
  • Storage-full: No available disk space (requires immediate action)
  • Backing-up: Automated backup in progress
  • Modifying: Configuration changes being applied

Warning: If status shows "unavailable" or "storage-full", the database needs immediate attention. See the Troubleshooting section below.
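If you want the same status check from a script, here is a minimal sketch using boto3. The instance identifier and region follow this tutorial; it assumes AWS credentials with `rds:DescribeDBInstances` permission are configured.

```python
# Sketch: read the current instance status with boto3.
import boto3

rds = boto3.client("rds", region_name="us-east-2")
resp = rds.describe_db_instances(DBInstanceIdentifier="neotomaprivate")
status = resp["DBInstances"][0]["DBInstanceStatus"]
print(f"Instance status: {status}")  # expect "available"

if status in ("storage-full", "failed"):
    print("Database needs immediate attention!")
```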


Database Actions and Quick Operations#

Step 5: Common Database Actions#

At the top of the instance details page, you'll see an Actions dropdown menu.

Screenshot placeholder: Actions dropdown menu showing available operations

Common actions include:

Take Snapshot#

Creates a point-in-time backup of the entire database.

When to use:

  • Before major data uploads or changes
  • Before schema modifications
  • As an extra backup before maintenance

How to:

  1. Click Actions → Take snapshot
  2. Provide a descriptive name (e.g., "pre-upload-2024-12-10")
  3. Click Take Snapshot

Snapshots are stored in AWS and don't count against your database storage.
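The same snapshot can be taken from a script, which is handy before automated uploads. A hedged sketch with boto3 (identifier and region as above; snapshot creation is asynchronous, so the waiter blocks until it finishes):

```python
# Sketch: create a manual snapshot before a major change and wait for it.
import boto3
from datetime import date

rds = boto3.client("rds", region_name="us-east-2")
snapshot_id = f"pre-upload-{date.today().isoformat()}"

rds.create_db_snapshot(
    DBInstanceIdentifier="neotomaprivate",
    DBSnapshotIdentifier=snapshot_id,
)
# Block until the snapshot is usable (can take several minutes).
rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=snapshot_id)
```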

Reboot Database#

Restarts the database server.

When to use:

  • After certain parameter group changes
  • When database becomes unresponsive
  • If storage fills up completely with WAL or temporary files (rare)

Caution: Rebooting will briefly disconnect all active users and applications.


Monitoring Database Performance#

Step 6: Review Connection and Resource Information#

Scroll down to see detailed instance information:

Screenshot placeholder: Configuration section showing endpoint, port, and VPC information

Key Configuration Details#

Endpoint: neotomaprivate.xxxxx.us-east-2.rds.amazonaws.com

  • This is the actual database server address
  • Used in environment variables for applications (API, AppRunner, etc.)
  • For local development, you'll use an SSH tunnel to localhost:5555 instead (see the connection sketch below)

Port: 5432

  • Standard PostgreSQL port
  • Required for connection strings

VPC (Virtual Private Cloud): Network isolation

  • Database is in a private subnet for security
  • Cannot be accessed directly from the internet
  • External access requires an SSH tunnel through an EC2 instance in the public subnet

Public Accessibility: No

  • Database is not exposed to the public internet
  • Access only through secure tunnels or authorized AWS services
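Putting the endpoint and port together, a connection from a local machine goes through the tunnel while AWS services use the endpoint directly. A minimal psycopg2 sketch; the credentials and environment-variable names here are placeholders, not the production configuration:

```python
# Sketch: connect to the neotoma database. From AWS services, host/port are
# the RDS endpoint and 5432; from a local machine, an SSH tunnel forwards
# localhost:5555 to the endpoint, so host/port become "localhost" / 5555.
import os
import psycopg2

conn = psycopg2.connect(
    host=os.environ.get("PGHOST", "localhost"),
    port=int(os.environ.get("PGPORT", "5555")),
    dbname="neotoma",
    user=os.environ["PGUSER"],          # placeholder variable names
    password=os.environ["PGPASSWORD"],
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone()[0])
```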

Step 7: Understanding Security Groups#

Security groups control network access to your database.

Screenshot placeholder: Security groups section showing RDS-EC2 and AppRunner groups

You'll see several security group rules:

  • RDS-EC2: Allows SSH tunnel connections from EC2 instances
  • AppRunner-Group-for-RDS: Allows API and application connections
  • Each rule defines which services can connect to the database

Note: Security groups are nested and include multiple sub-rules. In general, these are stable and don't require frequent changes. You'll only modify them when adding new services.

Step 8: View Active Sessions#

The Current sessions section shows all active database connections.

Screenshot placeholder: Current sessions showing multiple connections

Each connection represents:

  • The API making requests
  • Tilia application connections
  • DBeaver or other database tools
  • Explorer (via the API)
  • Direct user connections

Tip: If you see an unusually high number of connections, it may indicate:

  • A runaway process
  • Connection pooling issues
  • An application not closing connections properly
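To see the same session list from inside the database, you can query pg_stat_activity. A sketch that groups connections by user and application (connection details follow the tunnel example above; the password is a placeholder):

```python
# Sketch: summarize active sessions, grouped by user, application, and state.
import psycopg2

conn = psycopg2.connect(host="localhost", port=5555, dbname="neotoma",
                        user="neotoma_admin", password="...")  # placeholder
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT usename, application_name, client_addr, state, count(*) AS n
        FROM pg_stat_activity
        GROUP BY 1, 2, 3, 4
        ORDER BY n DESC;
    """)
    for usename, app, addr, state, n in cur.fetchall():
        print(f"{n:4d}  {usename}  {app or '-'}  {addr or '-'}  {state}")
```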

Monitoring Database Health#

Step 9: Review Monitoring Metrics#

Click on the Monitoring tab to see performance graphs.

Screenshot placeholder: Monitoring tab showing multiple performance graphs

Key Metrics to Watch#

CPU Utilization#

Shows how much processing power is being used.

  • Normal: 5-30% for typical Neotoma usage
  • High: >70% sustained may indicate need for optimization or larger instance
  • Current status: Generally quite low, suggesting we may be over-provisioned

Burst Balance#

Should normally be at 100%.

  • Drops when database experiences sudden high activity
  • If consistently below 100%, may need different instance type
  • Neotoma rarely has issues with this

Connections#

Number of active database connections.

  • Monitor for unusual spikes
  • Too many connections can slow performance
  • Each application (API, Tilia, etc.) maintains connection pools

Free Storage Space#

Amount of available disk space.

  • Critical metric - running out of space will cause database failure
  • An alarm is configured to alert when space is low
  • If space fills up, database will become unavailable

Screenshot placeholder: Free storage space graph showing current usage

Read/Write Latency#

How long database operations take.

  • Low latency = fast queries
  • Spikes may indicate slow queries or indexing issues
  • Generally stable for Neotoma
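All of these graphs are backed by CloudWatch metrics, so they can also be pulled programmatically. A sketch that fetches the last 24 hours of FreeStorageSpace (region and identifier as above; requires `cloudwatch:GetMetricStatistics`):

```python
# Sketch: pull hourly FreeStorageSpace averages for the past day.
import boto3
from datetime import datetime, timedelta, timezone

cw = boto3.client("cloudwatch", region_name="us-east-2")
now = datetime.now(timezone.utc)
resp = cw.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="FreeStorageSpace",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "neotomaprivate"}],
    StartTime=now - timedelta(hours=24),
    EndTime=now,
    Period=3600,               # one data point per hour
    Statistics=["Average"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    gib = point["Average"] / 1024 ** 3
    print(f"{point['Timestamp']:%Y-%m-%d %H:%M}  {gib:.1f} GiB free")
```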

CloudWatch Monitoring and Alarms#

Step 10: Configure CloudWatch Alarms#

CloudWatch provides advanced monitoring and alerting.

  1. From the Monitoring tab, click View in CloudWatch or navigate to CloudWatch from the AWS console
  2. You can create custom alarms for specific conditions

Screenshot placeholder: CloudWatch alarms configuration page

Setting Up an Alarm#

Example: Alert when free storage drops below 20GB

  1. Click Create Alarm
  2. Select metric: FreeStorageSpace
  3. Set condition: Lower than 20 GB
  4. Configure notification (SNS topic or email)
  5. Name the alarm descriptively: "RDS-Low-Storage-Alert"
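The same alarm can be created with boto3 if you prefer keeping monitoring configuration in code. A sketch; the SNS topic ARN is a placeholder for wherever notifications should go:

```python
# Sketch: alert when free storage on neotomaprivate drops below 20 GB.
import boto3

cw = boto3.client("cloudwatch", region_name="us-east-2")
cw.put_metric_alarm(
    AlarmName="RDS-Low-Storage-Alert",
    AlarmDescription="Free storage below 20 GB on neotomaprivate",
    Namespace="AWS/RDS",
    MetricName="FreeStorageSpace",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "neotomaprivate"}],
    Statistic="Average",
    Period=300,                   # evaluate every 5 minutes
    EvaluationPeriods=1,
    Threshold=20 * 1024 ** 3,     # 20 GB, expressed in bytes
    ComparisonOperator="LessThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-2:123456789012:rds-alerts"],  # placeholder ARN
)
```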

Existing Alarms for Neotoma:

  • Low free storage space (configured)
  • You can add alarms for:
      • High CPU usage
      • Too many connections
      • High query latency

Reviewing Database Logs#

Step 11: Access Error Logs#

Logs are essential for debugging and monitoring database health.

  1. Click on the Logs & events tab
  2. You'll see a list of log files organized by date

Screenshot placeholder: Logs & events tab showing multiple log files

Understanding Log Files#

Each log file shows:

  • File name: Includes date/time stamp

  • Size: Larger files may indicate issues or high activity

  • Last written: When the log was last updated

Viewing a Log File#

  1. Click on a log file name
  2. The log will open showing database events
  3. Most entries are routine checkpoints and status updates

Screenshot placeholder: Log file contents showing various database events

What to Look For in Logs#

Normal entries:

  • Checkpoint completed
  • Connection established/closed
  • Automated backup started/completed

Issues to investigate:

  • Repeated error messages
  • Failed query attempts
  • Connection failures
  • Out of memory warnings

Debugging Failed Queries#

When you see an error in the logs:

  1. Identify the query: Look for SQL statements that failed
  2. Check the error message: What went wrong?
  3. Note the timestamp: Correlate with application logs
  4. Download the log: Click Download to save for analysis

Example use cases:

  • API endpoint returning errors → Check logs for failed queries
  • Tilia not saving data → Look for INSERT/UPDATE failures
  • Landing pages loading slowly → Identify long-running queries
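Downloading logs can also be scripted, which makes it easier to grep for errors around a known timestamp. A sketch using the RDS log APIs in boto3 (identifier and region as above):

```python
# Sketch: fetch the most recently written log file and print ERROR lines.
import boto3

rds = boto3.client("rds", region_name="us-east-2")
logs = rds.describe_db_log_files(DBInstanceIdentifier="neotomaprivate")
latest = sorted(logs["DescribeDBLogFiles"], key=lambda f: f["LastWritten"])[-1]

marker, chunks = "0", []
while True:
    part = rds.download_db_log_file_portion(
        DBInstanceIdentifier="neotomaprivate",
        LogFileName=latest["LogFileName"],
        Marker=marker,
    )
    chunks.append(part.get("LogFileData", ""))
    if not part["AdditionalDataPending"]:
        break
    marker = part["Marker"]

text = "".join(chunks)
print("\n".join(line for line in text.splitlines() if "ERROR" in line))
```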

Limitations: Identifying Query Sources#

Currently, all applications (API, Tilia, Explorer) connect using the same database role: neotoma_admin.

Limitation: You cannot immediately tell which application executed a query just from the database logs.

Workaround:

  1. Identify the failing query in database logs

  2. Go to the specific application (API, Tilia, etc.)

  3. Check the application's logs in AppRunner

  4. Match timestamps to find the source

Future Improvement: Create separate database roles for each application (e.g., neotoma_api_user, tilia_user) to make log analysis easier.


Database Backups and Snapshots#

Step 12: Review Backup Configuration#

Click on the Maintenance & backups tab.

Screenshot placeholder: Maintenance & backups showing automated backup settings

Automated Backups#

The database automatically creates daily backups:

  • Backup window: 10:00 PM Ohio time

  • Retention period: 7 days (configurable)

  • Type: Full database snapshot

  • Storage location: AWS S3

Types of Backups#

System (Automated) Snapshots#

Created automatically by AWS:

  • Daily at scheduled time

  • Retained for specified period

  • Cannot be manually deleted

  • Used for point-in-time recovery

Manual Snapshots#

Created by you as needed:

  • Before major changes

  • For long-term archival

  • Can be kept indefinitely

  • Can be deleted when no longer needed

Step 13: View Available Snapshots#

  1. Click Snapshots in the left sidebar of RDS console

  2. Filter by your database instance

  3. View both system and manual snapshots

Screenshot placeholder: Snapshots list showing dates and types

Snapshot information includes:

  • Snapshot ID: Unique identifier

  • Creation date/time: When it was taken

  • Type: System or Manual

  • Size: Storage space used

  • Status: Available, Creating, etc.

Step 14: Restore from a Snapshot#

If you need to recover the database:

  1. Select the snapshot you want to restore

  2. Click Actions → Restore snapshot

  3. Configure the restored instance:
      • Instance identifier (must be unique)
      • Instance type
      • VPC and security groups

  4. Click Restore DB Instance

Important: Restoring creates a NEW database instance - it doesn't overwrite the existing one. This is a safety feature.

Caution: After restoration, you'll need to:

  • Update application connection strings

  • Verify data integrity

  • Test all connections

  • Consider deleting the old instance if replacement was intended
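For reference, the restore can also be driven from boto3. A sketch only; the subnet group and security-group ID below are placeholders and should match the existing instance's networking:

```python
# Sketch: restore a snapshot into a NEW instance and wait until it is usable.
import boto3

rds = boto3.client("rds", region_name="us-east-2")
rds.restore_db_instance_from_db_snapshot(
    DBInstanceIdentifier="neotomaprivate-restored",   # must be unique
    DBSnapshotIdentifier="pre-upload-2024-12-10",
    DBInstanceClass="db.m5.large",
    DBSubnetGroupName="neotoma-private-subnets",      # placeholder
    VpcSecurityGroupIds=["sg-0123456789abcdef0"],     # placeholder
)
rds.get_waiter("db_instance_available").wait(
    DBInstanceIdentifier="neotomaprivate-restored"
)
# Only now update application connection strings to point at the new instance.
```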


Additional Backup Strategy#

Monthly Backups to S3#

In addition to automated RDS snapshots, Neotoma maintains monthly backups uploaded directly to S3.

How it works:

  1. A Python script creates a database dump

  2. The dump is compressed

  3. Uploaded to an S3 bucket for long-term storage

  4. Managed through AWS Batch service (runs automatically)

  5. Uses Fargate for compute resources

Note: This is separate from RDS snapshots and provides an additional layer of data protection. These backups can be downloaded and restored independently of AWS.
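The production job runs in AWS Batch, but the core dump-and-upload pattern looks roughly like the sketch below. The bucket name, connection details, and file layout are illustrative, not the actual script:

```python
# Sketch of the monthly backup pattern: pg_dump to a compressed custom-format
# file, then upload to S3 for long-term storage.
import subprocess
from datetime import date
import boto3

dump_file = f"neotoma-{date.today():%Y-%m}.dump"
subprocess.run(
    ["pg_dump", "--format=custom", "--compress=9",
     "--host=localhost", "--port=5555", "--username=neotoma_admin",
     "--file", dump_file, "neotoma"],
    check=True,  # raise if pg_dump fails; password comes from PGPASSWORD/.pgpass
)
boto3.client("s3").upload_file(dump_file, "neotoma-monthly-backups", dump_file)  # placeholder bucket
```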


Database Configuration and Parameters#

Step 15: Understanding Parameter Groups#

Parameter groups define PostgreSQL configuration settings for your RDS instance.

  1. From the instance details, find Parameter group under Configuration

  2. Click on the parameter group name (e.g., neotoma-postgres)

Screenshot placeholder: Parameter group details showing modified parameters

What Are Parameter Groups?#

  • Collections of database engine configuration settings

  • Control how PostgreSQL behaves

  • Some changes require database reboot

  • Can be shared across multiple instances

Key Modified Parameters for Neotoma#

By default, RDS uses standard PostgreSQL settings. For Neotoma, several parameters have been customized:

shared_preload_libraries#

Modified to include: pg_cron, pg_stat_statements

  • pg_cron: Enables scheduled database jobs

  • pg_stat_statements: Tracks query performance statistics

  • PostGIS: Spatial data extension

cron.database_name#

Specifies which database contains the cron scheduling tables.

Page Cost Settings#

  • random_page_cost: Cost estimate for non-sequential disk reads

  • seq_page_cost: Cost estimate for sequential disk reads

  • These affect query planner decisions and indexing efficiency

When to Modify Parameters#

Common scenarios:

  • Adding extensions: PostGIS, pg_cron, etc. require preload configuration

  • Performance tuning: Adjusting memory allocation, connection limits

  • Logging: Changing what gets logged and how much detail

  • Maintenance: Auto-vacuum frequency, statistics collection

Caution:

  • Some parameter changes require database reboot

  • Changes can significantly affect performance

  • Test changes in development environment first

  • Document all modifications

Step 16: View and Edit Parameters#

To view current parameters:

  1. Open your parameter group

  2. Browse through categories or search for specific parameters

  3. Modified parameters are marked (different from default)

Screenshot placeholder: Parameter list showing modified vs. default values

To edit a parameter:

  1. Find the parameter you want to change

  2. Click Edit parameters

  3. Modify the value

  4. Save changes

  5. Note if reboot is required (indicated in the interface)
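Parameter edits can also be made with boto3, which is useful for documenting changes. A sketch; "pending-reboot" is required for static parameters such as shared_preload_libraries, while dynamic parameters can use "immediate":

```python
# Sketch: modify a parameter in the neotoma-postgres parameter group.
import boto3

rds = boto3.client("rds", region_name="us-east-2")
rds.modify_db_parameter_group(
    DBParameterGroupName="neotoma-postgres",
    Parameters=[
        {
            "ParameterName": "shared_preload_libraries",
            "ParameterValue": "pg_cron,pg_stat_statements",
            "ApplyMethod": "pending-reboot",  # takes effect at the next reboot
        },
    ],
)
```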


Database Maintenance#

Automatic Minor Version Updates#

The database is configured to automatically apply minor version updates during the maintenance window.

What this means:

  • Security patches are applied automatically
  • Minor PostgreSQL updates (e.g., 15.2 → 15.3)
  • Occurs during low-traffic periods
  • Minimal downtime

Pending Maintenance#

Check the Maintenance & backups tab for:

  • Scheduled maintenance windows
  • Available updates
  • Required actions

Screenshot placeholder: Pending maintenance section showing next maintenance window

When Database Maintenance Happens#

Automatic maintenance includes:

  • Operating system patches
  • Database engine updates
  • Hardware maintenance

You control:

  • Maintenance window timing
  • Whether to apply immediately or defer
  • Major version upgrades (manual only)

Performance Optimization Notes#

Current Instance Size#

Neotoma uses db.m5.large:

  • 2 vCPUs
  • 8 GB RAM
  • Moderate network performance

Is This the Right Size?#

Current observation: CPU utilization is typically quite low (5-15%).

Why we haven't downgraded:

  1. Burst traffic: Explorer, API, and Tilia can create sudden activity spikes
  2. Cache performance: Larger instance maintains more query cache in memory
  3. Response times: Users experience faster query responses
  4. Cost consideration: Database is already the highest AWS cost

Future consideration: A read replica would help with scaling read operations but would increase costs significantly.

Database Optimization Strategies#

Most optimization for Neotoma focuses on:

  1. Indexing: Ensuring foreign keys and frequently queried columns are indexed
  2. Query optimization: Improving slow queries rather than increasing hardware
  3. Connection pooling: Managing database connections efficiently in applications
  4. Vacuum management: Automatic cleanup of deleted rows (via auto-vacuum)

Why we don't need aggressive optimization:

  • Relatively low write volume
  • Most tables are fairly static
  • The data table has high activity but is well-indexed
  • Current performance meets user needs

When to Reboot the Database#

Reboot Scenarios#

You should reboot the database only in specific situations:

1. Storage Completely Full#

Symptoms:

  • Database status shows "storage-full"
  • WAL (Write-Ahead Log) files fill all available space
  • Database becomes unresponsive

Action:

  1. Stop all write operations if possible
  2. Reboot database to clear temporary files
  3. Investigate what caused space to fill
  4. Consider increasing storage allocation

2. Parameter Group Changes#

Some parameter modifications require a reboot:

  • Adding new extensions
  • Changing memory allocation settings
  • Modifying connection limits

Note: AWS will indicate if reboot is required when you save parameter changes.

3. Applying Major Updates#

When manually applying major PostgreSQL version updates, a reboot is typically required.

4. Restoring Snapshots#

The database automatically reboots when restoring from a snapshot.

How to Reboot#

  1. Navigate to your database instance
  2. Click Actions → Reboot
  3. Confirm the action
  4. Wait for status to return to "Available"

Caution:

  • All active connections will be dropped
  • Applications will temporarily lose database access
  • Plan reboots during low-traffic periods
  • Notify users of planned downtime
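If you prefer to script the reboot (for example, as part of a planned maintenance runbook), a minimal boto3 sketch follows; the same cautions about dropped connections apply:

```python
# Sketch: reboot the instance and wait for it to come back to "available".
import boto3

rds = boto3.client("rds", region_name="us-east-2")
rds.reboot_db_instance(DBInstanceIdentifier="neotomaprivate")
rds.get_waiter("db_instance_available").wait(
    DBInstanceIdentifier="neotomaprivate"
)
print("Reboot complete; status is back to available.")
```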

Database Isolation and Security#

Network Architecture#

The Neotoma database is configured with security best practices:

Private subnet:

  • Database is NOT publicly accessible
  • Cannot connect directly from internet
  • Must use SSH tunnel or authorized AWS services

Security groups:

  • Act as virtual firewalls
  • Control which services can connect
  • Nested rules for granular access control

VPC configuration:

  • Database in private VPC
  • Applications connect through secure channels
  • EC2 instances provide SSH tunnel access

Why This Matters#

This isolation means:

  • For local development: Use SSH tunnel to localhost:5555
  • For AWS services: Use internal endpoint directly
  • For security: Database is protected from external attacks
  • For maintenance: Fewer external factors affect performance

Key takeaway: The database exists in its own protected space. Applications reach into it, but it doesn't need to deal with complex networking or cross-VPC communications. This simplifies management significantly.


Troubleshooting Common Issues#

Issue: Database Status "Storage-Full"#

Symptoms:

  • Cannot write to database
  • Applications return errors
  • Status shows storage is full

Resolution:

  1. Check logs to identify what's consuming space
  2. Consider clearing old WAL files (carefully!)
  3. Increase allocated storage:
      • Actions → Modify
      • Increase storage allocation
      • Apply immediately if critical
  4. Reboot if necessary after freeing space

Issue: High CPU Usage#

Symptoms:

  • Slow query response times
  • CPU graph shows sustained >70% usage
  • Users report lag

Resolution:

  1. Check CloudWatch metrics to identify spike timing
  2. Review logs during high-CPU period
  3. Look for slow queries using pg_stat_statements
  4. Optimize problematic queries
  5. Consider instance upgrade if persistent

Issue: Too Many Connections#

Symptoms:

  • "Too many clients" errors in logs
  • Applications can't connect
  • Connection pool exhausted

Resolution:

  1. Check current connections count
  2. Identify which applications have open connections
  3. Verify connection pooling configuration
  4. Check for applications not closing connections
  5. Consider increasing max_connections parameter (requires reboot)
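A quick way to see how close the instance is to its limit is to compare pg_stat_activity against max_connections. A sketch run over the usual tunnel; the password is a placeholder:

```python
# Sketch: compare live connection count against the configured limit.
import psycopg2

conn = psycopg2.connect(host="localhost", port=5555, dbname="neotoma",
                        user="neotoma_admin", password="...")  # placeholder
with conn, conn.cursor() as cur:
    cur.execute("SHOW max_connections;")
    limit = int(cur.fetchone()[0])
    cur.execute("SELECT count(*) FROM pg_stat_activity;")
    used = cur.fetchone()[0]
    print(f"{used} of {limit} connections in use")
```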

Issue: Slow Query Performance#

Symptoms:

  • Specific queries take long time
  • Timeouts in applications
  • High read latency

Resolution:

  1. Identify slow queries in logs or pg_stat_statements
  2. Run EXPLAIN ANALYZE on slow queries
  3. Check if indexes exist on frequently queried columns
  4. Add indexes if needed
  5. Optimize query structure
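Since pg_stat_statements is already preloaded (see the parameter group section), the slowest statements can be listed directly. A sketch; column names follow PostgreSQL 13+ (mean_exec_time), and the EXPLAIN ANALYZE target is just an illustrative query:

```python
# Sketch: list the ten slowest statements, then inspect one query's plan.
import psycopg2

conn = psycopg2.connect(host="localhost", port=5555, dbname="neotoma",
                        user="neotoma_admin", password="...")  # placeholder
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT calls, round(mean_exec_time::numeric, 1) AS mean_ms,
               left(query, 80) AS query
        FROM pg_stat_statements
        ORDER BY mean_exec_time DESC
        LIMIT 10;
    """)
    for calls, mean_ms, query in cur.fetchall():
        print(f"{mean_ms:>10} ms  x{calls}  {query}")

    # Inspect the plan for a suspect query (replace with the real slow query).
    cur.execute("EXPLAIN ANALYZE SELECT count(*) FROM pg_stat_activity;")
    print("\n".join(row[0] for row in cur.fetchall()))
```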

Summary and Best Practices#

Key Takeaways#

  1. The RDS interface is your primary tool for database health monitoring and management
  2. Status should always show "Available" - any other status requires investigation
  3. Snapshots are automatic but manual snapshots before changes are good practice
  4. Logs are essential for debugging issues across applications
  5. CloudWatch alarms provide proactive monitoring
  6. Security groups are stable - rarely need modification
  7. The database is isolated which simplifies management
  8. Reboots are rare - only needed for specific scenarios

Regular Monitoring Tasks#

Daily:

  • Check database status is "Available"
  • Review CloudWatch alarms for any alerts

Weekly:

  • Review recent error logs
  • Check storage space usage
  • Verify automated backups are running

Monthly:

  • Review performance metrics
  • Check for AWS recommendations
  • Verify S3 backup completed
  • Review and clean up old manual snapshots

Before major changes:

  • Take manual snapshot
  • Review current resource usage
  • Plan for potential downtime
  • Notify users if reboot required

When to Get Help#

Contact the database administrator or AWS support if:

  • Database status is not "Available" for more than a few minutes
  • Repeated errors appear in logs
  • Storage is filling up rapidly
  • Performance degrades significantly
  • Parameter changes don't take effect as expected
  • AWS recommendations seem urgent

Additional Resources#

AWS Documentation:

Internal Documentation:

Video Walkthrough:

[Link to accompanying video recording] (placeholder)


Last updated: [Date]
Reviewed by: [Name]
Video recording: [Link]