Boston's Office of Digital Transformation began a structured sweep this week to identify and remove duplicate image files embedded across at least a dozen municipal databases, a cleanup effort city technology staff say has been overdue since the consolidation of legacy systems following the 2021 migration to a centralized cloud platform. The work, which started Monday, July 1, is expected to run through the end of the month.
The timing is deliberate. Mayor Michelle Wu's administration set a July deadline for departments to meet new data hygiene standards ahead of a broader overhaul of the city's public-facing document portal, Boston.gov, which handles permit applications, zoning filings, and inspection records for residents across every neighborhood. Duplicate images — scanned permits, site photographs, infrastructure inspection shots filed multiple times by different departments — have accumulated into what technology staff describe as a significant storage and retrieval problem.
Where the Problem Runs Deepest
Two city operations have emerged as priority areas. The Inspectional Services Department, headquartered at 1010 Massachusetts Avenue in Roxbury, maintains a database of building inspection photographs that stretches back to digitization efforts begun in 2014. Staff auditing the records this week found image duplication rates running well above acceptable thresholds in files tied to properties in Jamaica Plain and Dorchester, two neighborhoods that have seen heavy permitting activity under recent housing production pushes. The Boston Planning Department, which manages files connected to proposed developments along corridors like Washington Street and Blue Hill Avenue, also falls within the audit's scope.
The problem is partly a legacy of how different city bureaus handled document uploads before a unified filing protocol was adopted in late 2023. Inspectors on site would photograph a property, upload images directly from a mobile device, and then an office administrator would upload a separate batch from the same visit. With no deduplication logic in place, both sets would sit in the database permanently. Multiply that across hundreds of inspections per month over several years, and the redundancy compounds quickly.
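For readers curious what "deduplication logic" at the point of upload might look like, the following is a minimal sketch, not the city's actual system. The class name `UploadRegistry`, the visit ID format, and the uploader labels are all hypothetical. It assumes both uploads carry byte-identical image files; a photo that was resized or re-encoded between the phone and the office would not be caught by this check.

```python
import hashlib


class UploadRegistry:
    """Hypothetical in-memory registry keyed by (visit ID, content hash)."""

    def __init__(self):
        # Maps (visit_id, SHA-256 hex digest) to the first uploader seen.
        self._seen = {}

    def submit(self, visit_id, uploader, image_bytes):
        """Return True if the image is stored, False if the same bytes
        were already uploaded for the same inspection visit."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        key = (visit_id, digest)
        if key in self._seen:
            return False  # second copy of the same file: reject it
        self._seen[key] = uploader
        return True


registry = UploadRegistry()
photo = b"example image bytes"  # stand-in for a real image file's contents

# The inspector uploads from the field, then an administrator uploads
# the same file again from the office.
first = registry.submit("visit-001", "inspector-mobile", photo)
second = registry.submit("visit-001", "office-admin", photo)
```

In this sketch the inspector's upload is accepted and the administrator's identical copy is turned away, so only one set of images from the visit reaches the database.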
The MBTA's Capital Delivery team is not directly involved in this round of city audits — its infrastructure photo archives fall under a separate state records regime — but transit planners collaborating with Boston on station-area development projects near Orange Line stops in Roxbury and Dorchester have flagged overlapping image sets as a coordination headache when assembling environmental review packages.
What the Cleanup Actually Involves
City IT staff are running automated hash-comparison tools to flag identical or near-identical files before human reviewers make final deletion decisions. The Office of Digital Transformation has committed to keeping audit logs of every removed file for a minimum of three years, a retention requirement under the Massachusetts Secretary of State's records management guidelines. No original file is permanently deleted without supervisor sign-off, a step built into the workflow.
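The hash-comparison step can be illustrated with a short sketch. The city has not published its tooling, so everything here is an assumption: the function names, the file IDs, and the choice of SHA-256. The sketch covers only exact, byte-for-byte duplicates; catching near-identical images would require a different technique, such as perceptual hashing, which is not shown. Nothing is deleted; the function only produces a list for human review.

```python
import hashlib
from collections import defaultdict


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def flag_exact_duplicates(files):
    """Group files by content hash and flag the extras for review.

    `files` maps a (hypothetical) file ID to that file's bytes.
    Returns a list of (kept_id, [duplicate_ids]) pairs. The first ID
    in sorted order is treated as the copy to keep.
    """
    groups = defaultdict(list)
    for file_id in sorted(files):
        groups[sha256_of(files[file_id])].append(file_id)

    flagged = []
    for ids in groups.values():
        if len(ids) > 1:
            flagged.append((ids[0], ids[1:]))
    return flagged


# Stand-in contents; real inputs would be image files read from storage.
sample = {
    "isd/0001.jpg": b"front elevation",
    "isd/0002.jpg": b"rear stairwell",
    "isd/0003.jpg": b"front elevation",  # same bytes as isd/0001.jpg
}

review_queue = flag_exact_duplicates(sample)
```

Here `review_queue` holds a single entry pairing `isd/0001.jpg` with its duplicate `isd/0003.jpg`. Keeping the flagging step separate from deletion mirrors the workflow described above, in which a reviewer and a supervisor decide what is actually removed.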
The practical stakes are not trivial. Boston processed more than 47,000 building permit applications in fiscal year 2025, according to Inspectional Services data published in the city's annual report. Even a modest duplication rate across the images attached to those filings generates tens of thousands of redundant files. Storage costs on the city's cloud contract, negotiated with a vendor through the Massachusetts Operational Services Division, scale with data volume, so the cleanup could generate direct budget savings in the next renewal cycle, due in early 2027.
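A back-of-the-envelope calculation shows how the numbers reach that scale. Only the 47,000 application count comes from the city's data; the images-per-application figure, the duplication rate, and the average file size below are illustrative assumptions, not reported values.

```python
def estimate_redundancy(applications, images_per_application,
                        duplicate_pct, avg_image_mb):
    """Estimate redundant files and their storage footprint.

    duplicate_pct is a whole-number percentage, which keeps the
    arithmetic in integers.
    """
    total_images = applications * images_per_application
    redundant_files = total_images * duplicate_pct // 100
    redundant_mb = redundant_files * avg_image_mb
    return redundant_files, redundant_mb


APPLICATIONS = 47_000        # from the city's annual report
IMAGES_PER_APPLICATION = 4   # assumption for illustration
DUPLICATE_PCT = 10           # assumption: a "modest" 10 percent rate
AVG_IMAGE_MB = 3             # assumption for illustration

redundant_files, redundant_mb = estimate_redundancy(
    APPLICATIONS, IMAGES_PER_APPLICATION, DUPLICATE_PCT, AVG_IMAGE_MB
)
print(f"{redundant_files:,} redundant files, about {redundant_mb / 1000:.1f} GB")
```

Under these assumptions, one fiscal year of filings would carry 18,800 redundant images, roughly 56 GB. Different inputs would shift the total, but the count stays in the tens of thousands across a wide range of plausible values.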
For residents and small developers waiting on public records requests — a frustration that has drawn complaints from contractors working on triple-deckers in Dorchester and mixed-use projects along the Hyde Park Avenue corridor — a leaner, better-organized database should mean faster search results and shorter fulfillment times when they pull inspection histories on a specific parcel.
City staff expect a preliminary report on the scope of duplication found across all target databases by July 18. If the audit surfaces systemic upload problems in specific workflows, the Office of Digital Transformation has authority to mandate procedural changes in those departments before the new Boston.gov portal features go live this fall.