Boston's municipal digital records system holds hundreds of thousands of scanned documents, including permit applications, zoning filings and inspection reports, and a growing share of that archive is junk. Duplicate images, sometimes the same page scanned three or four times, have quietly inflated storage costs and slowed retrieval at agencies that process everything from building permits in Jamaica Plain to health inspections along Blue Hill Avenue in Dorchester. The city's Office of Digital Innovation, which operates under Mayor Michelle Wu's broader government modernisation push, flagged the problem formally in late 2025, but the conditions that created it stretch back more than a decade.
The timing matters because Boston is in the middle of an expensive infrastructure upgrade. The city is migrating records from legacy systems — some running on server hardware purchased before 2014 — to a cloud-based platform. Every redundant file that moves to the new system carries a cost. Cloud storage is priced per gigabyte, and duplicates that seemed trivial on old on-premise servers become a measurable budget line when they travel. For a city that has been pushing hard on housing production in Dorchester and Jamaica Plain, where permit volumes have surged since 2022, that overhead is not abstract.
How the Duplication Built Up
The root cause is mundane. For years, multiple city departments — the Inspectional Services Department, the Boston Planning & Development Agency on Atlantic Avenue, and the Registry Division at City Hall — operated separate document management workflows with no shared deduplication logic. When a single permit application touched more than one agency, each office scanned and stored its own copy. A routine variance request in Roxbury might generate identical image files in three separate repositories before anyone signed off on it.
Batch scanning projects made it worse. Between 2018 and 2022, the city contracted with outside vendors to digitise paper backlogs accumulated over decades. The contracts specified delivery of complete scanned sets, but did not require vendors to check against existing digital holdings. The result: entire folders of documents that had already been scanned in the early 2010s were scanned again and uploaded alongside their twins. No single department had visibility across the others' systems, so no one caught the overlap in real time.
The MBTA's struggles with data fragmentation offer a local parallel. The transit authority spent years running incompatible systems across its bus and rail divisions, a problem that became undeniable when reliability reforms under the Wu administration required unified performance data that simply did not exist in a single place. The records duplication issue at City Hall is a quieter version of the same structural failure: siloed operations, separate procurement, no single point of accountability.
What Fixing It Actually Looks Like
The Office of Digital Innovation began a deduplication audit in January 2026, working first on Inspectional Services records because that department's permit backlog is directly tied to housing production timelines. Early findings, presented to the Wu administration's technology working group in March 2026, identified redundancy rates above 30 percent in certain document categories — meaning more than three in ten files were copies of something already in the system.
Clearing that backlog is not a single software fix. The audit team is running image-hash comparison tools to identify exact duplicates, but near-duplicates (the same document at slightly different scan quality or rotation) require human review. The city has allocated staff time from the Department of Innovation and Technology, headquartered at City Hall Plaza, for that manual triage work through the end of fiscal year 2026, which closes June 30.
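For readers curious how that split between automated and manual work might look in practice, the following is a minimal sketch and not a description of the city's actual tooling, which has not been made public in detail. It assumes exact duplicates are found by hashing raw file bytes with SHA-256, and it uses a toy "average hash" over a small grid of brightness values to flag near-duplicate candidates for a human reviewer. The function names, file names and the distance threshold are all hypothetical.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """SHA-256 of the raw bytes; byte-identical files share a digest."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Return groups of file names whose contents are byte-identical."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[content_digest(data)].append(name)
    # Only groups with more than one member are duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


def average_hash(pixels: list[list[int]]) -> int:
    """Toy perceptual hash: one bit per pixel, set when the pixel is at
    least as bright as the image mean. Assumes a small greyscale grid."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def flag_for_review(a: list[list[int]], b: list[list[int]],
                    max_distance: int = 1) -> bool:
    """Hypothetical rule: similar hashes are candidates for human review,
    not automatic deletion."""
    return hamming(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical file names and contents, for illustration only.
files = {
    "isd/permit_001.tif": b"AAA",
    "bpda/permit_001_copy.tif": b"AAA",
    "registry/deed_17.tif": b"BBB",
}
exact = group_exact_duplicates(files)

scan_a = [[10, 200], [200, 10]]   # original scan
scan_b = [[12, 190], [205, 9]]    # same page, slightly different exposure
scan_c = [[200, 10], [10, 200]]   # a different page
```

A byte-level hash catches only files that are identical down to the last bit, which is why a rescan of the same page at a different quality setting slips past it. A simple perceptual hash like the one above tolerates small brightness changes but would not survive a rotated scan, which is consistent with the audit team's decision to route those cases to people.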
For residents and developers, the practical effect has been delays. A permit application that should clear Inspectional Services in 15 business days has, in some cases involving properties in Jamaica Plain's Hyde Square neighbourhood, taken longer because retrieval searches turn up multiple conflicting file versions. The deduplication project is expected to reduce average retrieval time once the first phase — covering permits filed since 2018 — completes this autumn. Applicants with pending filings are advised to contact Inspectional Services directly at its Roxbury satellite office on Washington Street to confirm which document version is the active record.