Migration Guide

MongoDB → PostgreSQL

Migrate from MongoDB document storage to a fully relational PostgreSQL schema — with AI-inferred table structures, JSONB for flexible parts, and CDC via Change Streams.

When to migrate from MongoDB to PostgreSQL?

  • Data has stabilised — schema flexibility no longer outweighs query power
  • Need complex JOINs, window functions, or transactional integrity
  • MongoDB Atlas costs are exceeding expectations at scale
  • Reporting stack expects SQL — BI tools, dbt, Metabase
  • Compliance requires row-level security or full audit trail
  • Want to consolidate — one database for both OLTP and analytics

How Schema Inference Works

MongoDB has no schema. DBMigrateAIPro samples up to 10,000 documents per collection (configurable) and infers the dominant schema:

  1. 1.For each field, the type distribution is computed across the sampled documents.
  2. 2.Fields present in ≥ 80% of documents become explicit columns.
  3. 3.Fields below the threshold are merged into a jsonb_extra JSONB column.
  4. 4.Nested subdocuments below a configurable depth are stored as JSONB.
  5. 5.Arrays of scalars with a consistent element type become native arrays (TEXT[], INTEGER[]).
  6. 6.Arrays of objects become JSONB[] or optionally a child table with a foreign key.

Review the inferred schema in the Schema tab before running migration. You can override any field type or flattening depth.

1

Prerequisites

MongoDB 4.0+ (source). For CDC: replica set mode (rs.initiate()) — Change Streams require a replica set. A read-only user with readAnyDatabase or per-collection read access is sufficient for bulk migration. PostgreSQL 14+ target recommended for JSONB performance.

2

Run the Assessment Report

The assessment for MongoDB reports: collection count, estimated document count, average document size, polymorphic field detection, deeply nested field analysis (depth > 3 flagged), DBRef references (cross-collection foreign keys), and GridFS collections.

3

Review the Inferred Schema

After assessment, open the Schema tab. Each collection maps to a proposed PostgreSQL table. Review the inferred columns, types, and which fields are merged into jsonb_extra. Adjust thresholds or override individual field types before proceeding.

4

Create Target Schema

Click Create Schema on Target to execute the generated CREATE TABLE statements on PostgreSQL. Check the DDL preview before executing. ObjectId fields default to VARCHAR(24) — change to UUID in settings if you prefer native UUID storage.

5

Bulk Data Migration

Documents are streamed from MongoDB via the driver cursor and inserted into PostgreSQL using COPY. Arrays and nested objects are serialised to JSONB. Parallel workers default to 8. For large collections, increase workers and set batch_size to 5000–10000 documents.

6

CDC via Change Streams (Optional)

After bulk migration, enable CDC in the CDC tab. DBMigrateAIPro subscribes to the MongoDB Change Stream and applies INSERT/UPDATE/DELETE/REPLACE events to PostgreSQL. Resume token is saved — migration survives restarts.

7

Validate & Cutover

Run the Comparison Report to validate document-to-row counts per collection. Review the orphan-reference report for unresolved DBRef fields. Update your application connection string to PostgreSQL when satisfied.

Field Type Mapping Reference

MongoDB Type / PatternPostgreSQL Mapping
Document (flat)TABLE row — each top-level field becomes a column
Nested subdocumentJSONB column — or flattened with dot-notation column names
Array of scalarsTEXT[] / INT[] — or JSONB if mixed types
Array of objectsJSONB[] — or child table with FK back to parent
ObjectIdVARCHAR(24) or UUID (with hex conversion)
StringTEXT
Int32 / Int64INTEGER / BIGINT
DoubleDOUBLE PRECISION
Decimal128NUMERIC
BooleanBOOLEAN
Date (ISODate)TIMESTAMPTZ
NullNULL
Binary / BinDataBYTEA

Common Issues

PolymorphicField

Same field has different types across documents

DBMigrateAIPro detects type collisions during schema inference and defaults to JSONB. Review the assessment report for flagged polymorphic fields.

MaxDocumentSizeError

Documents exceeding 16 MB

Large GridFS documents are streamed in chunks and stored as BYTEA. Standard document size limit does not apply to GridFS.

NullReference: _id

Missing _id field on some documents

All MongoDB documents have _id by default. If a collection was created without _id index, add one before migration.

FK violation

References between collections not preserved

MongoDB has no enforced FKs. DBMigrateAIPro generates an orphan-reference report showing DBRef fields. Add FK constraints manually after reviewing the report.

ChangeStreamError

Change Streams require replica set

MongoDB Change Streams require a replica set or sharded cluster. A standalone mongod does not support Change Streams. Add the replica set with: rs.initiate().