Import & Backfill Scripts

Initial Import

agent/scripts/import_mappedin_mall.py — full pipeline for new malls:

bash
python3 agent/scripts/import_mappedin_mall.py <sniff_dir> "<Mall Name>" "<City>" <country_code>
# With existing mall:
python3 agent/scripts/import_mappedin_mall.py <sniff_dir> "<Mall Name>" "<City>" <country_code> --mall-id <id> --replace-floors

Creates mall → floors → features (with georef) → brands (reuses existing) → stores → occupancies (matched by centroid proximity).

Idempotent Backfill

For malls already imported, backfill scripts upgrade data without duplicating:

| Script | Target | Key | Safe? |
|---|---|---|---|
| backfill_mappedin_amenities.py | Amenity features | sourceIds.mappedin.polygonId | Yes (deletes legacy, recreates keyed) |
| backfill_mappedin_units.py | Unit features | sourceIds.mappedin.polygonId | Yes (upsert only by default) |
| backfill_mappedin_stores.py | Stores | sourceIds.mappedin.importKey | Yes (server upserts by key) |
| backfill_mappedin_all.py | All of above | Driver script | |
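
As a rough illustration of why a keyed upsert can be re-run safely, here is a minimal in-memory sketch. It is not the scripts' or the server's actual code: the function name `upsert_by_key` and the flat `key` field (standing in for the nested `sourceIds.mappedin.*` keys) are assumptions made for the example.

```python
def upsert_by_key(existing, incoming, key_field="key"):
    """Update records whose key already exists, create the rest.

    `existing` is mutated in place. Returns (created, updated).
    Hypothetical helper; the flat `key` field is a simplification.
    """
    # Index only records that already carry a key; keyless (legacy)
    # records are invisible to the upsert and need claiming first.
    index = {rec[key_field]: rec for rec in existing if rec.get(key_field)}
    created = updated = 0
    for item in incoming:
        k = item[key_field]
        if k in index:
            index[k].update(item)
            updated += 1
        else:
            new_rec = dict(item)
            existing.append(new_rec)
            index[k] = new_rec
            created += 1
    return created, updated
```

Running the same input twice creates nothing on the second pass, which is the property the table calls "safe". Keyless legacy records are the exception: they never match, which is why a first run needs a claim step.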

First Run (Legacy Malls)

Malls imported before the import audit have entities without keys. They need --claim-legacy so that those keyless entities can be matched:

bash
# Claims legacy stores by (name, floor), deduplicates, then runs full upsert
python3 agent/scripts/backfill_mappedin_all.py --first-run

--first-run passes --claim-legacy --delete-legacy-dupes to both the units and the stores backfill.

Subsequent Runs

Pure idempotent upsert — no flags needed:

bash
python3 agent/scripts/backfill_mappedin_all.py

Selective Runs

bash
# Only specific malls
python3 agent/scripts/backfill_mappedin_all.py --only chadstone,theglen

# Skip specific phases
python3 agent/scripts/backfill_mappedin_all.py --skip-units
python3 agent/scripts/backfill_mappedin_all.py --skip-amenities --skip-stores

Dry Run

Preview changes without writing:

bash
python3 agent/scripts/backfill_mappedin_all.py --first-run --dry-run

How Claim-Legacy Works

Stores

Legacy stores are matched by (normalized_name, floor_id). If several legacy stores match the same name on the same floor, the oldest (by createdAt) is kept and the rest are deleted, but only when --delete-legacy-dupes is passed. The winner is then PATCHed with its importKey.
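
The store-claiming rule can be sketched roughly as follows. This is an illustration under assumptions, not the script's code: the record field names (`id`, `name`, `floor_id`, `createdAt`, `importKey`), the helper names, and the normalization (casefold plus whitespace collapse) are all hypothetical.

```python
from collections import defaultdict


def normalize_name(name):
    # Assumed normalization: case-insensitive, whitespace collapsed.
    # The real script may normalize differently.
    return " ".join(name.casefold().split())


def claim_legacy_stores(legacy_stores, incoming):
    """Plan claims and deletions for keyless legacy stores.

    Returns (claims, deletions): claims is a list of
    (legacy_store_id, importKey); deletions is a list of legacy ids
    that would only be removed when --delete-legacy-dupes is passed.
    """
    groups = defaultdict(list)
    for store in legacy_stores:
        key = (normalize_name(store["name"]), store["floor_id"])
        groups[key].append(store)

    claims, deletions = [], []
    for item in incoming:
        key = (normalize_name(item["name"]), item["floor_id"])
        # pop() so a group is claimed at most once.
        matches = groups.pop(key, [])
        if not matches:
            continue
        # ISO 8601 timestamps in one consistent format sort
        # chronologically as plain strings.
        matches.sort(key=lambda s: s["createdAt"])
        winner, *dupes = matches
        claims.append((winner["id"], item["importKey"]))
        deletions.extend(d["id"] for d in dupes)
    return claims, deletions
```

The sketch returns a plan instead of issuing requests, which is also a convenient shape for a dry run.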

Units

Legacy units are matched by polygon centroid proximity: greedy nearest-first, within a threshold (0.00005° for geo coordinates, 0.02 for normalized coordinates). The winner is PATCHed with its polygonId. The PATCH touches properties only, not geometry, which avoids triggering the floor quality gate.
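
A minimal sketch of greedy nearest-first matching on centroids, assuming the centroids are already computed and that distance is plain Euclidean distance in the coordinate units (degrees or normalized). The function name, the input shapes, and the inclusive `<=` comparison are assumptions; only the two threshold values come from the text above.

```python
import math

GEO_THRESHOLD = 0.00005        # degrees
NORMALIZED_THRESHOLD = 0.02    # normalized floor coordinates


def greedy_match(legacy, incoming, threshold):
    """Pair legacy units with incoming polygons, closest pairs first.

    legacy:   {legacy_unit_id: (x, y)} centroids of keyless units
    incoming: {polygon_id: (x, y)} centroids from the new import
    Returns {legacy_unit_id: polygon_id}; each side is used at most once.
    """
    candidates = []
    for lid, (lx, ly) in legacy.items():
        for pid, (px, py) in incoming.items():
            dist = math.hypot(lx - px, ly - py)
            if dist <= threshold:
                candidates.append((dist, lid, pid))

    # Nearest-first: the closest pair claims its partners before any
    # farther pair gets a chance.
    candidates.sort(key=lambda c: c[0])

    matched, used_polygons = {}, set()
    for _dist, lid, pid in candidates:
        if lid in matched or pid in used_polygons:
            continue
        matched[lid] = pid
        used_polygons.add(pid)
    return matched
```

For example, with `legacy = {"u1": (0.10, 0.10), "u2": (0.125, 0.10)}` and a single incoming polygon `{"p1": (0.11, 0.10)}`, both units fall within the normalized threshold, but `u1` is closer and wins `p1`; `u2` stays unmatched. Greedy matching is not globally optimal, but it is simple and predictable when thresholds are tight.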

Amenities

No claim needed — delete all legacy amenities and recreate with proper keys. Amenities have no occupancy links to preserve.

Diagnostic Script

agent/scripts/diagnose_chadstone_dupes.py queries a mall's stores, detects duplicates by (name, floor), and clusters them by createdAt date. Useful for auditing before running --claim-legacy.
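
The kind of report such a diagnostic produces can be approximated on an already-fetched list of stores. The sketch below is an assumption-laden illustration, not the script itself: the field names (`name`, `floor_id`, `createdAt`) and both function names are hypothetical, and fetching the stores is left out.

```python
from collections import Counter, defaultdict


def find_duplicates(stores):
    """Group stores by (name, floor) and keep groups with 2+ members."""
    groups = defaultdict(list)
    for store in stores:
        key = (store["name"].strip().casefold(), store["floor_id"])
        groups[key].append(store)
    return {key: group for key, group in groups.items() if len(group) > 1}


def creation_dates(stores):
    """Count stores per creation day.

    Assumes ISO 8601 timestamps, whose first 10 characters are
    YYYY-MM-DD. A spike on a single day often points at one import
    run that created the duplicates.
    """
    return Counter(store["createdAt"][:10] for store in stores)
```

Inspecting the duplicate groups and the per-day counts together gives a quick sense of how many stores a claim run would touch.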