Migration

How to Assess Your Oracle Database Before Migration

Rakesh Mamidala·Founder & Lead Engineer··11 min read

The Highest-Leverage Hour You'll Spend

Every painful migration shares the same root cause: a surprise that should have been a known quantity. An XMLTYPE column nobody flagged. A package that turned out to hold session state. A circular foreign-key cluster that broke the load order at 2 a.m. on cutover night. None of these are hard to handle — they're only hard to handle when you discover them late.

A proper assessment converts every one of those late surprises into a line item on a plan you wrote weeks earlier. It's the single highest-leverage block of work in the whole project, and it's almost entirely read-only against the source. This guide walks through the five things worth measuring, the SQL to measure them, and how to turn the raw numbers into a risk score you can actually plan against.

Step 1 — Take a Complete Object Inventory

Start with a census. You can't scope what you haven't counted, and the object-type breakdown is the first thing that tells you whether this is a two-week job or a two-quarter program. Count every object type, per schema:

sql
-- Object census by schema and type
SELECT owner, object_type, COUNT(*) AS n
FROM   dba_objects
WHERE  owner IN ('HR', 'SALES', 'FIN')   -- the schemas in scope
GROUP  BY owner, object_type
ORDER  BY owner, n DESC;

-- Typical output worth noting:
--   TABLE, VIEW, INDEX           → mechanical
--   PACKAGE, PACKAGE BODY        → measure the surface (Step 3)
--   TRIGGER, SEQUENCE, TYPE      → mostly mechanical
--   MATERIALIZED VIEW, JOB       → needs a replacement plan

The ratio matters more than the totals. A schema that is 400 tables and 12 small packages is a very different project from 80 tables and 300 packages. The first is dominated by data movement; the second is dominated by code conversion.

Step 2 — Profile the Data Volume

Object count tells you about code effort; segment size tells you about time and cutover window. Pull the physical footprint, the biggest tables, and the LOB volume — large objects move at a different rate than ordinary rows and deserve their own line in the plan.

sql
-- Top 20 segments by size — these drive your migration runtime
SELECT segment_name, segment_type,
       ROUND(bytes/1024/1024/1024, 2) AS gb
FROM   dba_segments
WHERE  owner = 'SALES'
ORDER  BY bytes DESC
FETCH FIRST 20 ROWS ONLY;

-- LOB footprint — CLOB/BLOB columns stream slower than scalar rows
SELECT owner, table_name, column_name, data_type
FROM   dba_tab_columns
WHERE  owner IN ('HR','SALES','FIN')
AND    data_type IN ('CLOB','BLOB','NCLOB','XMLTYPE','LONG');

-- Partitioned tables — they map to PG declarative partitioning,
-- but the partition scheme is a decision worth making up front
SELECT table_owner, table_name, partitioning_type, COUNT(*) AS parts
FROM   dba_tab_partitions
GROUP  BY table_owner, table_name, partitioning_type
ORDER  BY parts DESC;

With sizes in hand you can estimate the bulk-load runtime honestly (a modern pipeline sustains tens of thousands of rows/sec per worker) and decide whether you need CDC to keep the cutover window to seconds rather than hours.

Step 3 — Measure the PL/SQL Surface

This is the number that most often surprises teams. Total PL/SQL line count, and where it lives, is the best single predictor of conversion effort. The good news: the large majority of PL/SQL maps cleanly to PL/pgSQL. The work is concentrated in a small set of constructs — so measure both the volume and the concentration.

sql
-- Lines of PL/SQL by object — where the code mass actually sits
SELECT owner, name, type, COUNT(*) AS lines
FROM   dba_source
WHERE  owner IN ('HR','SALES','FIN')
GROUP  BY owner, name, type
ORDER  BY lines DESC
FETCH FIRST 30 ROWS ONLY;

-- Total surface, by object type
SELECT type, COUNT(DISTINCT name) AS objects, COUNT(*) AS lines
FROM   dba_source
WHERE  owner IN ('HR','SALES','FIN')
GROUP  BY type;

A useful rule of thumb from 500+ migrations: plan for ~95% of PL/SQL to convert automatically, and budget your human review time for the remaining 5% — which the next step helps you find before it finds you.

Step 4 — Catalog the Genuinely Hard Constructs

Not all PL/SQL is equal. A handful of constructs have no one-to-one PostgreSQL equivalent and need a design decision rather than a mechanical translation. Finding every instance now is what separates a calm migration from a scramble. Grep the source for the usual suspects:

sql
-- Find the constructs that need a human decision
SELECT name, type,
       SUM(CASE WHEN UPPER(text) LIKE '%CONNECT BY%'            THEN 1 ELSE 0 END) AS connect_by,
       SUM(CASE WHEN UPPER(text) LIKE '%AUTONOMOUS_TRANSACTION%' THEN 1 ELSE 0 END) AS autonomous,
       SUM(CASE WHEN UPPER(text) LIKE '%DBMS_%'                 THEN 1 ELSE 0 END) AS dbms_calls,
       SUM(CASE WHEN UPPER(text) LIKE '%UTL_%'                  THEN 1 ELSE 0 END) AS utl_calls
FROM   dba_source
WHERE  owner IN ('HR','SALES','FIN')
GROUP  BY name, type
HAVING SUM(CASE WHEN UPPER(text) LIKE '%CONNECT BY%'
                  OR UPPER(text) LIKE '%AUTONOMOUS_TRANSACTION%'
                  OR UPPER(text) LIKE '%DBMS_%'
                  OR UPPER(text) LIKE '%UTL_%' THEN 1 ELSE 0 END) > 0
ORDER  BY 3 DESC, 4 DESC;

Map each finding to its effort class. The pattern for every one of these is well understood — the value of the assessment is simply knowing the count before you commit to a timeline:

ConstructEffort classWhy
Tables, columns, constraints, B-tree indexesMechanicalDirect type + DDL mapping; transpiles automatically
Views, standard sequencesMechanicalNear-identical syntax; IDENTITY replaces sequence+trigger
Most PL/SQL (cursors, loops, exceptions)Mostly automaticPL/pgSQL is block-structured and cursor-aware; ~95% transpiles
Packages with session stateDesign decisionNo CREATE PACKAGE in PG — schema + GUC/temp-table state pattern
Autonomous transactionsDesign decisionNo PRAGMA — dblink loopback pattern, needs an auth design
XMLTYPE, CONNECT BY, hierarchical queriesTargeted rewriteMap to JSONB / WITH RECURSIVE — pattern is known, needs review
DB links, Advanced Queuing, DBMS_SCHEDULERReplacementReplaced by FDW, listen/notify or a queue, pg_cron — architectural
RAC, Exadata-specific featuresArchitectureMap to Patroni HA / standard PG — a deployment decision, not code

Step 5 — Map Dependencies and Load Order

The last technical surprise is structural. Circular foreign keys, cross-schema references, and objects that depend on other schemas all affect the order in which you can load data and create constraints. Surface them now so the load plan is deterministic:

sql
-- Cross-schema dependencies — these affect load order and grants
SELECT owner, name, referenced_owner, referenced_name, referenced_type
FROM   dba_dependencies
WHERE  owner IN ('HR','SALES','FIN')
AND    referenced_owner NOT IN ('SYS','PUBLIC', owner)
ORDER  BY owner, name;

-- Foreign keys — scan for circular clusters that need deferred constraints
SELECT c.owner, c.table_name, c.constraint_name, r.table_name AS refs
FROM   dba_constraints c
JOIN   dba_constraints r ON c.r_constraint_name = r.constraint_name
WHERE  c.constraint_type = 'R'
AND    c.owner IN ('HR','SALES','FIN');

The fix is standard — load with constraints deferred, or load parent tables first — but only if you know the dependency graph before you start. Discovering a circular FK during the load is a classic cause of a blown cutover window.

Turning Findings Into a Risk Score

Raw counts aren't a plan. The point of the assessment is a single readiness signal that leadership and engineers can both act on. A simple, defensible scorecard:

  • Data volume — total GB and largest single table. Drives runtime and whether CDC is required.
  • Code volume — total PL/SQL lines. Drives conversion effort and review time.
  • Hard-construct density — count of packages-with-state, autonomous transactions, CONNECT BY, XMLTYPE, AQ, DB links. The single biggest risk multiplier.
  • Dependency complexity — circular FKs and cross-schema references. Drives load-order risk.
  • Platform features — RAC, Exadata, scheduler jobs, materialized views. Each maps to a known PostgreSQL pattern but needs an explicit decision.

Score each axis low / medium / high. A database that's high on data volume but low everywhere else is a straightforward (if long-running) data move. A database that's high on hard-construct density is where you concentrate senior review time. Either way, you now have a plan instead of a hope.

Doing All of This in 60 Seconds

Every query above is something DBMigrateAIPro's assessment runs for you. Point it at the source, and it produces an assessment report covering the full object inventory, data profile, PL/SQL surface area, and a flagged list of every hard construct in the schema — then the Migration Advisor turns those findings into a risk score with the specific items that need human review called out by name.

bash
python main.py --no-ui assess \
  --source oracle --source-host orahost --source-db PROD \
  --source-user demo_src --source-pass <pwd> \
  --schemas HR,SALES,FIN

# Writes to projects/<JOB>/assessment/:
#   inventory.json        — object census by schema and type
#   data_profile.json     — segment sizes, LOB volume, partitions
#   plsql_surface.json    — line counts + hard-construct findings
#   advisor_report.html   — risk score + items flagged for review

The assessment is read-only and free in the current build — it never writes to your source, so it's safe to run against production. The output is the same punch list this guide describes, produced in the time it takes to make coffee instead of the days it takes to write the queries by hand.

The Honest Verdict

Migrations don't fail because Oracle and PostgreSQL are far apart — they're closer than most teams expect. They fail because a construct nobody counted shows up at the worst possible moment. The assessment is how you make sure nothing shows up that you didn't already plan for.

Spend the hour. Count the objects, profile the data, measure the PL/SQL, catalog the hard constructs, map the dependencies. Turn it into a score. Everything downstream — timeline, staffing, cutover plan — gets more accurate the moment you do.

Assess your database for free

DBMigrateAIPro's assessment is read-only, runs in about a minute, and produces the full inventory + risk score described here. Free for Year 1 — no signup, no license key.