Database Migration Strategies: Zero-Downtime Deployments

Database Migration Strategies: Zero-Downtime Deployments

BySanjay Goraniya
3 min read
Share:

Database Migration Strategies: Zero-Downtime Deployments

Database migrations are risky. One wrong migration can bring down your entire application. But with the right strategies, you can migrate databases safely, even in production, without downtime. After performing hundreds of migrations on production databases, I've learned what works and what doesn't.

The Challenge

Why Migrations Are Risky

  1. Lock contention - Migrations can lock tables
  2. Data loss - Wrong migrations can delete data
  3. Breaking changes - Schema changes break application
  4. Rollback complexity - Hard to undo migrations

Migration Types

1. Additive Changes (Safest)

Adding new columns, tables, indexes:

Code
-- Safe: Adding nullable column
ALTER TABLE users ADD COLUMN phone VARCHAR(20) NULL;

-- Safe: Adding new table
CREATE TABLE user_preferences (
  user_id INTEGER REFERENCES users(id),
  preference_key VARCHAR(100),
  preference_value TEXT
);

-- Safe: Adding index (concurrent)
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

2. Destructive Changes (Risky)

Removing columns, tables, constraints:

Code
-- Risky: Removing column
ALTER TABLE users DROP COLUMN old_field;

-- Risky: Removing table
DROP TABLE old_table;

3. Transformative Changes (Complex)

Changing column types, constraints:

Code
-- Complex: Changing column type
ALTER TABLE users ALTER COLUMN age TYPE INTEGER;

Zero-Downtime Migration Strategy

Step 1: Add New Column (Nullable)

Code
-- Migration 1: Add new column as nullable
ALTER TABLE users ADD COLUMN email_new VARCHAR(255) NULL;

Application code: Still uses old column, new column exists but unused.

Step 2: Backfill Data

Code
-- Migration 2: Populate new column
UPDATE users SET email_new = email WHERE email_new IS NULL;

Application code: Still uses old column, but new column is populated.

Step 3: Dual Write

Code
// Application code: Write to both columns
async function updateUser(userId, email) {
  await db.query(
    'UPDATE users SET email = $1, email_new = $1 WHERE id = $2',
    [email, userId]
  );
}

Application code: Writes to both old and new columns.

Step 4: Switch Reads

Code
// Application code: Read from new column
async function getUser(userId) {
  const result = await db.query(
    'SELECT id, name, email_new as email FROM users WHERE id = $1',
    [userId]
  );
  return result.rows[0];
}

Application code: Reads from new column, still writes to both.

Step 5: Remove Old Column

Code
-- Migration 3: Remove old column
ALTER TABLE users DROP COLUMN email;
ALTER TABLE users RENAME COLUMN email_new TO email;

Application code: Uses new column name.

Renaming Columns

Safe Rename Strategy

Code
-- Step 1: Add new column
ALTER TABLE users ADD COLUMN email_address VARCHAR(255) NULL;

-- Step 2: Backfill
UPDATE users SET email_address = email;

-- Step 3: Make new column NOT NULL (after backfill)
ALTER TABLE users ALTER COLUMN email_address SET NOT NULL;

-- Step 4: Application switches to new column

-- Step 5: Remove old column
ALTER TABLE users DROP COLUMN email;

Changing Column Types

Safe Type Change

Code
-- Step 1: Add new column with new type
ALTER TABLE users ADD COLUMN age_new INTEGER NULL;

-- Step 2: Migrate data
UPDATE users SET age_new = CAST(age AS INTEGER) WHERE age IS NOT NULL;

-- Step 3: Make NOT NULL
ALTER TABLE users ALTER COLUMN age_new SET NOT NULL;

-- Step 4: Application switches

-- Step 5: Remove old column
ALTER TABLE users DROP COLUMN age;
ALTER TABLE users RENAME COLUMN age_new TO age;

Adding Indexes

Concurrent Index Creation

Code
-- Safe: Concurrent index (doesn't lock table)
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

-- Risky: Regular index (locks table)
CREATE INDEX idx_users_email ON users(email);

Always use CONCURRENTLY for production.

Removing Columns

Safe Removal

Code
-- Step 1: Stop using column in application
-- (Code no longer references column)

-- Step 2: Verify column is unused
-- (Monitor for a period)

-- Step 3: Remove column
ALTER TABLE users DROP COLUMN old_column;

Data Migrations

Batch Processing

Code
// Bad: Migrate all at once (locks table)
await db.query('UPDATE users SET status = "active" WHERE status IS NULL');

// Good: Batch processing
async function migrateUsersInBatches(batchSize = 1000) {
  let offset = 0;
  let hasMore = true;

  while (hasMore) {
    const users = await db.query(
      'SELECT id FROM users WHERE status IS NULL LIMIT $1 OFFSET $2',
      [batchSize, offset]
    );

    if (users.rows.length === 0) {
      hasMore = false;
      break;
    }

    const ids = users.rows.map(u => u.id);
    await db.query(
      'UPDATE users SET status = $1 WHERE id = ANY($2)',
      ['active', ids]
    );

    offset += batchSize;
    
    // Yield to other operations
    await sleep(100);
  }
}

Migration Tools

Using Migration Framework

Code
// migrations/001_add_email_column.js
exports.up = async (db) => {
  await db.query('ALTER TABLE users ADD COLUMN email VARCHAR(255) NULL');
};

exports.down = async (db) => {
  await db.query('ALTER TABLE users DROP COLUMN email');
};

Version Control

Code
// Track migration versions
CREATE TABLE schema_migrations (
  version VARCHAR(255) PRIMARY KEY,
  applied_at TIMESTAMP DEFAULT NOW()
);

// Check if migration applied
async function isMigrationApplied(version) {
  const result = await db.query(
    'SELECT 1 FROM schema_migrations WHERE version = $1',
    [version]
  );
  return result.rows.length > 0;
}

Rollback Strategies

Always Have Rollback Plan

Code
-- Migration
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Rollback
ALTER TABLE users DROP COLUMN phone;

Test Rollbacks

Always test rollback migrations in staging before production.

Real-World Example

Challenge: Migrate email column from VARCHAR(100) to VARCHAR(255) without downtime.

Strategy:

  1. Add new column (nullable)
  2. Backfill data (in batches)
  3. Dual write (application writes to both)
  4. Switch reads (application reads from new)
  5. Remove old column

Timeline:

  • Day 1: Add column, backfill
  • Day 2-3: Dual write period
  • Day 4: Switch reads
  • Day 5: Remove old column

Result: Zero downtime, zero data loss.

Best Practices

  1. Test in staging - Always test migrations first
  2. Backup first - Always backup before migrations
  3. Use transactions - When possible
  4. Batch large changes - Don't lock tables
  5. Monitor during migration - Watch for issues
  6. Have rollback plan - Know how to undo
  7. Communicate - Let team know about migrations
  8. Use CONCURRENTLY - For index creation

Common Pitfalls

1. Locking Tables

Code
-- Bad: Locks table
CREATE INDEX idx_email ON users(email);

-- Good: Doesn't lock
CREATE INDEX CONCURRENTLY idx_email ON users(email);

2. Migrating All Data at Once

Code
-- Bad: Locks table
UPDATE users SET status = 'active';

-- Good: Batch processing
-- (See batch processing example above)

3. No Rollback Plan

Always have a way to undo migrations.

Conclusion

Database migrations don't have to be scary. With the right strategy:

  • Additive changes are safe
  • Destructive changes can be done safely with multi-step process
  • Zero downtime is achievable
  • Data safety is maintained

Remember: Slow and steady wins the race. Take your time with migrations, test thoroughly, and always have a rollback plan.

What database migration challenges have you faced? What strategies have worked best for your systems?

Share:

Related Posts