Engineering

Zero-Downtime Database Migrations Using CDC

Rakesh Mamidala·Founder & Lead Engineer··12 min read

The Problem with Traditional Migration

Traditional database migrations require a maintenance window: stop the app, dump the data, load it, update connection strings, start the app. For a 500 GB Oracle database, that's easily 4–8 hours of downtime. For SaaS businesses, that's unacceptable.

Change Data Capture (CDC) solves this by keeping source and target synchronised while the bulk migration runs. When the bulk load finishes, the lag is seconds — not hours.

How CDC Works

Every relational database maintains a transaction log — Oracle has redo logs, MySQL has the binary log, PostgreSQL has WAL. CDC readers parse these logs and extract row-level change events (INSERT, UPDATE, DELETE) without touching the source tables.

text
CDC Pipeline:

Oracle Redo Logs
      ↓
LogMiner API (no agent required)
      ↓
DBMigrateAIPro CDC Reader
      ↓
Apply Engine (upsert/delete on target)
      ↓
PostgreSQL target

The Zero-Downtime Recipe

  1. Capture SCN — Record the System Change Number before bulk migration starts. This is your CDC resume point.
  2. Bulk migrate — Run the full schema + data copy while the source stays live. This takes minutes to hours depending on size.
  3. Start CDC from SCN — Feed LogMiner the saved SCN. It replays all changes that happened during the bulk copy.
  4. Verify lag — Watch the CDC lag metric in DBMigrateAIPro. Wait until it drops below 5 seconds.
  5. Smoke test — Run your application test suite against the PostgreSQL target while CDC keeps it in sync.
  6. Cutover — Update the connection string. Stop CDC. Run ANALYZE. Done. Typical app downtime: < 30 seconds.

Oracle LogMiner Setup

Oracle must be in ARCHIVELOG mode with supplemental logging enabled. Check with:

sql
SELECT LOG_MODE FROM V$DATABASE;
-- Must return: ARCHIVELOG

SELECT SUPPLEMENTAL_LOG_DATA_MIN FROM V$DATABASE;
-- Must return: YES

-- If not enabled:
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
ALTER TABLE hr.employees ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

Managing CDC Lag

CDC lag is the delay between a commit on the source and its application to the target. Factors that increase lag: high source write rate, slow target, network latency, large transactions. DBMigrateAIPro shows real-time lag in the CDC panel. For most OLTP workloads, steady-state lag is under 2 seconds.

Common Pitfalls

  • DDL changes on the source during CDC — DBMigrateAIPro handles ADD COLUMN and DROP COLUMN; schema restructuring requires a remigration
  • Large LOB columns — streamed as separate LogMiner events, handled automatically
  • Sequences — not captured by CDC, synced separately after cutover
  • Triggers on the target — disable triggers before starting CDC to avoid double-firing