Import & Backfill Scripts
Initial Import
agent/scripts/import_mappedin_mall.py — full pipeline for new malls:
python3 agent/scripts/import_mappedin_mall.py <sniff_dir> "<Mall Name>" "<City>" <country_code>
# With existing mall:
python3 agent/scripts/import_mappedin_mall.py <sniff_dir> "<Mall Name>" "<City>" <country_code> --mall-id <id> --replace-floors
Creates, in order: mall → floors → features (with georef) → brands (reuses existing) → stores → occupancies (matched by centroid proximity).
Idempotent Backfill
For malls already imported, backfill scripts upgrade data without duplicating:
| Script | Target | Key | Safe? |
|---|---|---|---|
| backfill_mappedin_amenities.py | Amenity features | sourceIds.mappedin.polygonId | Yes (deletes legacy, recreates keyed) |
| backfill_mappedin_units.py | Unit features | sourceIds.mappedin.polygonId | Yes (upsert only by default) |
| backfill_mappedin_stores.py | Stores | sourceIds.mappedin.importKey | Yes (server upserts by key) |
| backfill_mappedin_all.py | All of the above | n/a | Driver script |
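The keyed upsert that makes these scripts safe to re-run can be sketched as follows. This is a minimal in-memory illustration, not the scripts' actual code: `get_path` and `upsert_by_key` are hypothetical helpers, and the real backfill presumably talks to a server rather than a Python list.

```python
from typing import Any


def get_path(doc: dict[str, Any], dotted: str) -> Any:
    """Read a nested value such as 'sourceIds.mappedin.polygonId'; None if absent."""
    cur: Any = doc
    for part in dotted.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return None
        cur = cur[part]
    return cur


def upsert_by_key(existing: list[dict], incoming: list[dict], key_path: str) -> dict[str, int]:
    """Update records whose key already exists and append the rest.

    Hypothetical helper: mutates `existing` in place and returns counters.
    """
    # Index the records that already carry a source key.
    index = {}
    for rec in existing:
        key = get_path(rec, key_path)
        if key is not None:
            index[key] = rec

    counts = {"created": 0, "updated": 0, "unchanged": 0}
    for rec in incoming:
        key = get_path(rec, key_path)
        if key is None:
            raise ValueError(f"incoming record has no {key_path}")
        current = index.get(key)
        if current is None:
            stored = dict(rec)  # copy so later edits to `rec` do not leak in
            existing.append(stored)
            index[key] = stored
            counts["created"] += 1
        elif {**current, **rec} != current:
            current.update(rec)  # shallow merge keeps fields the import does not send
            counts["updated"] += 1
        else:
            counts["unchanged"] += 1
    return counts
```

Running the same input twice creates records on the first pass and reports them all as unchanged on the second, which is the property the "Safe?" column relies on.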
First Run (Legacy Malls)
Malls imported before the import audit have no source keys, so their first backfill needs --claim-legacy to match the keyless entities:
# Claims legacy stores by (name, floor), deduplicates, then runs full upsert
python3 agent/scripts/backfill_mappedin_all.py --first-run
--first-run passes --claim-legacy --delete-legacy-dupes to both the units and the stores scripts.
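How a driver might expand --first-run into per-phase flags can be sketched like this. The phase order, the `build_commands` function, and the idea that the driver shells out to each script are assumptions for illustration; only the script names and flags come from this document.

```python
import argparse

# Flags that --first-run forwards, per the description above.
LEGACY_FLAGS = ["--claim-legacy", "--delete-legacy-dupes"]

# Assumed phase order (same as the table); the real driver may differ.
PHASE_SCRIPTS = {
    "amenities": "backfill_mappedin_amenities.py",
    "units": "backfill_mappedin_units.py",
    "stores": "backfill_mappedin_stores.py",
}

# Amenities are deleted and recreated, so only these phases claim legacy rows.
CLAIMING_PHASES = {"units", "stores"}


def build_commands(argv: list[str]) -> list[list[str]]:
    """Return one command line per phase (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--first-run", action="store_true")
    args = parser.parse_args(argv)

    commands = []
    for phase, script in PHASE_SCRIPTS.items():
        cmd = ["python3", f"agent/scripts/{script}"]
        if args.first_run and phase in CLAIMING_PHASES:
            cmd += LEGACY_FLAGS
        commands.append(cmd)
    return commands
```

Each returned list could be handed to `subprocess.run`; building the commands separately from running them keeps the flag logic easy to check.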
Subsequent Runs
Pure idempotent upsert — no flags needed:
python3 agent/scripts/backfill_mappedin_all.py
Selective Runs
# Only specific malls
python3 agent/scripts/backfill_mappedin_all.py --only chadstone,theglen
# Skip specific phases
python3 agent/scripts/backfill_mappedin_all.py --skip-units
python3 agent/scripts/backfill_mappedin_all.py --skip-amenities --skip-stores
Dry Run
Preview changes without writing:
python3 agent/scripts/backfill_mappedin_all.py --first-run --dry-run
How Claim-Legacy Works
Stores
Match by (normalized_name, floor_id). If multiple legacy stores match the same name on the same floor, keep the oldest (by createdAt), delete the rest (with --delete-legacy-dupes). PATCH winner with importKey.
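The store-claiming rule can be sketched as a pure planning step. This is an illustration under assumptions: the field names `id`, `name`, and `floorId`, the normalization (casefold plus whitespace collapse), and ISO 8601 `createdAt` strings in one consistent format are all guesses, and `plan_store_claims` is a hypothetical function.

```python
from collections import defaultdict


def normalize_name(name: str) -> str:
    # Assumed normalization: case-insensitive, whitespace collapsed.
    return " ".join(name.casefold().split())


def plan_store_claims(legacy_stores: list[dict], delete_dupes: bool = False):
    """Pick one winner per (normalized name, floor) and list the losers.

    Returns (winners, to_delete): winners maps the match key to the store id
    that should receive the importKey; to_delete stays empty unless
    delete_dupes is set (mirroring --delete-legacy-dupes).
    """
    groups = defaultdict(list)
    for store in legacy_stores:
        key = (normalize_name(store["name"]), store["floorId"])
        groups[key].append(store)

    winners = {}
    to_delete = []
    for key, stores in groups.items():
        # ISO 8601 timestamps in one format sort chronologically as strings.
        ordered = sorted(stores, key=lambda s: s["createdAt"])
        winners[key] = ordered[0]["id"]  # oldest record wins
        if delete_dupes:
            to_delete.extend(s["id"] for s in ordered[1:])
    return winners, to_delete
```

Keeping the plan separate from the PATCH and DELETE calls is also what makes a dry run cheap: the plan can be printed without touching the server.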
Units
Match by polygon centroid proximity: greedy nearest-first within a threshold (0.00005° for geographic coordinates, 0.02 for normalized coordinates). PATCH the winner with polygonId, updating properties only and not geometry, which avoids triggering the floor quality gate.
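Greedy nearest-first matching can be sketched as below. The thresholds come from the text; the rest is assumed: `greedy_match` is a hypothetical function, distance is plain Euclidean distance in the coordinate plane, and a pair exactly at the threshold counts as a match.

```python
import math

GEO_THRESHOLD = 0.00005        # degrees, for georeferenced floors
NORMALIZED_THRESHOLD = 0.02    # for normalized coordinates

Point = tuple[float, float]


def greedy_match(legacy: dict[str, Point], incoming: dict[str, Point],
                 threshold: float) -> dict[str, str]:
    """Map legacy unit ids to incoming polygon ids, closest pairs first."""
    # Collect every candidate pair that falls within the threshold.
    pairs = []
    for legacy_id, (lx, ly) in legacy.items():
        for polygon_id, (px, py) in incoming.items():
            dist = math.hypot(lx - px, ly - py)
            if dist <= threshold:
                pairs.append((dist, legacy_id, polygon_id))

    # Accept pairs in order of increasing distance; each side is used once.
    pairs.sort()
    matched: dict[str, str] = {}
    used_polygons: set[str] = set()
    for _dist, legacy_id, polygon_id in pairs:
        if legacy_id in matched or polygon_id in used_polygons:
            continue
        matched[legacy_id] = polygon_id
        used_polygons.add(polygon_id)
    return matched
```

Greedy matching is not globally optimal, but with thresholds this tight most units have a single candidate, so a simple approach is usually enough. Legacy units that end up unmatched are left without a polygonId.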
Amenities
No claim needed — delete all legacy amenities and recreate with proper keys. Amenities have no occupancy links to preserve.
Diagnostic Script
agent/scripts/diagnose_chadstone_dupes.py — query a mall's stores and detect duplicates by (name, floor), cluster by createdAt date. Useful for auditing before running --claim-legacy.
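The kind of check this diagnostic performs can be sketched as follows. This is not the script's code: `find_duplicates` is a hypothetical function, the field names `name` and `floorId` are assumed, and `createdAt` is assumed to be an ISO 8601 string whose first ten characters are the date.

```python
from collections import Counter, defaultdict


def find_duplicates(stores: list[dict]) -> dict[tuple[str, str], Counter]:
    """Group stores by (name, floor) and count duplicates per creation date."""
    groups = defaultdict(list)
    for store in stores:
        key = (store["name"].strip().casefold(), store["floorId"])
        groups[key].append(store)

    report = {}
    for key, members in groups.items():
        if len(members) > 1:
            # Cluster by calendar date: "2024-03-01T10:00:00Z" -> "2024-03-01"
            report[key] = Counter(m["createdAt"][:10] for m in members)
    return report
```

A group whose records cluster on one or two dates usually points at a repeated import rather than hand-entered duplicates, which helps when deciding whether --delete-legacy-dupes is safe.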