Boston's city government is sitting on a digital records problem years in the making. Across departments ranging from the Boston Planning Department to the Office of Neighborhood Services, municipal image libraries have accumulated thousands of duplicate photograph files: the same aerial shot of Dorchester's Savin Hill waterfront saved under six different filenames, the same ribbon-cutting at Roxbury Community College catalogued in three separate folders by three separate staffers. The city is now midway through a formal deduplication project intended to rationalize those archives before a planned migration to a centralized asset management platform.
The timing matters because the migration deadline is real. The city's Information and Technology Department has been working since late 2024 to consolidate records ahead of a broader digital infrastructure overhaul tied to Mayor Michelle Wu's open-data commitments under Boston's citywide technology plan. Duplicate image files aren't merely an aesthetic problem — they inflate storage costs, slow search retrieval, and create legal ambiguity about which version of a public document is the authoritative one. Every redundant file occupies server space that the city pays for, and Boston's municipal cloud-storage contract comes up for renegotiation in fiscal year 2027.
How the Duplication Happened
The root cause is straightforward: for most of the past two decades, individual departments managed their own photo archives with little coordination. The Boston Public Library's Digital Commonwealth project, which digitizes historical materials at its Copley Square headquarters, operated on entirely different software from the Boston Transportation Department, which maintains its own image records of infrastructure projects along the Route 28 corridor and elsewhere. When the city launched its 311 app — which allows residents to submit geotagged photographs of potholes, broken streetlights, and similar complaints — those images began flowing into yet another separate repository. None of these systems talked to each other in any systematic way.
Staff turnover accelerated the problem. When a project manager left, the successor often started a new folder rather than locating existing files. The Inspectional Services Department, which handles building permits and code enforcement across neighborhoods including Jamaica Plain and East Boston, found during an internal audit completed in March 2025 that roughly 22 percent of its stored images were exact or near-exact duplicates, according to a summary of that audit presented to the city's IT oversight committee. The MBTA, though a separate entity from city government, faced a parallel issue when it began digitizing its construction-documentation photos for the Green Line Extension project, which wrapped its core Phase 1 work in 2022.
The Path Forward
The city contracted with a Massachusetts-based technology vendor in early 2025 to run automated hash-matching and perceptual-similarity algorithms across the affected repositories. Hash matching flags files whose contents are byte-for-byte identical, while perceptual-similarity tools compare the visual content of images and flag near-identical pairs for human review. The project is being piloted first within the Boston Planning Department's archive, which holds planning and zoning photographs stretching back to the early 2000s and covers areas from the Seaport District to the squares of Dorchester. A broader rollout to the Inspectional Services and Public Works archives is scheduled for the third quarter of 2026.
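For readers curious how the two techniques differ, the following is a minimal Python sketch, not the vendor's actual tooling, which the city has not described in detail. It assumes exact duplicates are found by comparing SHA-256 digests of file bytes, and it substitutes a simple "average hash" for whatever perceptual method is really in use. The function names, the input shapes, and the distance threshold are all hypothetical, and the pixel grids are assumed to be small grayscale thumbnails prepared in advance.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def exact_duplicate_groups(files):
    """Group file names whose bytes are identical, using SHA-256 digests.

    `files` maps a file name to its raw bytes (hypothetical input shape).
    """
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def average_hash(pixels):
    """Return an integer bit fingerprint of a small grayscale pixel grid.

    `pixels` is a list of rows of 0-255 brightness values, assumed to be
    already downscaled (for example to 8x8). Each bit records whether a
    pixel is brighter than the grid's mean brightness.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def near_duplicate_pairs(grids, max_distance=5):
    """Return name pairs whose fingerprints differ by at most max_distance bits.

    The default threshold is an illustrative guess, not a tuned value.
    Flagged pairs are candidates for human review, not automatic deletion.
    """
    hashes = {name: average_hash(grid) for name, grid in grids.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(hashes), 2)
        if hamming_distance(hashes[a], hashes[b]) <= max_distance
    ]
```

The split mirrors the distinction in the project: a digest match catches the same file saved under different names, while the fingerprint comparison can catch a resaved or lightly adjusted copy whose bytes differ. Comparing every pair, as this sketch does, would be slow on a large archive; a production system would likely index the fingerprints instead.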
For residents and journalists who rely on Boston's public records systems, the practical effect should eventually be faster search results and fewer dead-end file names in public-facing portals. The city's Analyze Boston open-data platform, which already hosts dozens of public datasets, is the likely eventual home for a cleaned-up, searchable image index. Budget documents submitted to the City Council earlier this year included a line item of approximately $340,000 for the full deduplication and migration effort through fiscal year 2027 — a relatively modest figure given the volume of files involved, though advocates for digital transparency have pressed the city to ensure the cleaned archive is genuinely public once the work is complete.
For anyone filing public records requests that involve photographs — construction permits in Jamaica Plain, streetscape surveys along Blue Hill Avenue, infrastructure inspections in Charlestown — the advice from city IT staff is to be specific about date ranges and department of origin. Until the consolidated platform is live, requests that cast too wide a net are likely to return duplicate results anyway, and narrowing the parameters saves everyone time.