Boston's public-facing digital systems have hit a wall this week. Duplicate images, meaning identical or near-identical visual files stored multiple times across servers, have accumulated to the point where city departments, MBTA digital teams, and several Fenway-area universities are now running active cleanup operations to claw back storage, speed up load times, and correct visual errors on their websites and apps.
The timing is not accidental. The city's push to consolidate its digital infrastructure under the Mayor's Office of New Urban Mechanics, which has been coordinating with the Boston Digital Services team on the Wu administration's broader technology modernization agenda, hit a milestone this spring when agencies began migrating legacy content management systems to a unified platform. That migration exposed just how badly image libraries had sprawled over the past decade — departments uploading the same photographs of City Hall Plaza, the Rose Kennedy Greenway, or MBTA Orange Line stations dozens of times without any deduplication check in place.
Where the Problem Is Worst
The MBTA's rider-facing app and website have been among the most visible casualties. Redundant image files were slowing page load times on route maps and station guides, a problem that transit advocates flagged as early as March 2026. The authority's technology division launched an internal audit in late June, specifically targeting the image assets tied to its recent fare-gate upgrade rollout across 20 downtown stations.
At Northeastern University on Huntington Avenue and at Boston University's Charles River Campus on Commonwealth Avenue, IT administrators have been dealing with a parallel headache. Research portals and departmental websites that let faculty upload their own photos have accumulated image libraries with enough duplication, according to internal reviews, to affect server performance. Both institutions declined to provide specific figures for this story, but technology staff at comparable research universities have described duplication rates of 30 to 60 percent in unmanaged image repositories, a range that gives a sense of the scale involved.
The city's own Department of Innovation and Technology, which oversees the Boston.gov infrastructure, began a structured deduplication push on July 1, coinciding with the start of the new fiscal year. The project is using open-source perceptual hashing tools — software that generates a digital fingerprint for each image so near-duplicates can be identified even when file names differ — to sort through an asset library that has grown substantially since the pandemic-era expansion of online public services.
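For readers curious how a perceptual fingerprint works, here is a minimal sketch of one simple variant, the average hash. The city has not named the tools it is using, so this is an illustration of the general idea rather than the project's actual method. The function names and the tiny synthetic images are assumptions made for the example, and real tools would first decode an image file into grayscale pixels.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale values, 0 (black) to 255 (white)


def shrink(pixels: Gray, size: int = 8) -> Gray:
    """Downsample to size x size by averaging blocks.

    Assumes the image is at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        row = []
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) // len(block))
        small.append(row)
    return small


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Build a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count differing bits; a small distance suggests near-duplicate images."""
    return bin(a ^ b).count("1")


# Hypothetical test images: a left-to-right gradient, a slightly
# brightened copy of it, and an inverted (visually different) version.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brightened = [[min(255, v + 10) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

near = hamming(average_hash(original), average_hash(brightened))
far = hamming(average_hash(original), average_hash(inverted))
print(f"brightened copy differs by {near} bits, inverted image by {far} bits")
```

Because the fingerprint depends on the overall pattern of light and dark rather than on the file's bytes or name, a re-saved or lightly edited copy lands close to the original. The distance threshold that counts as "duplicate" is a tuning choice each team would have to make.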
What Agencies Are Doing About It
The practical fix is not glamorous, but it is necessary. Deduplication software flags redundant files; human reviewers then confirm which version is canonical before the others are deleted or archived. For municipal systems, that review process also has to account for accessibility compliance — the Americans with Disabilities Act requires alt-text and image descriptions, and replacing a duplicated image without carrying over its accessibility metadata creates a fresh legal exposure.
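The metadata step described above can be sketched in a few lines. This is an illustration under stated assumptions, not the city's workflow: the `ImageRecord` type, the `consolidate` function, and the asset IDs are all hypothetical, and the choice to halt when no description exists is one possible policy among several.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical record for one stored image."""
    asset_id: str
    alt_text: Optional[str] = None


def consolidate(
    canonical: ImageRecord, duplicates: List[ImageRecord]
) -> Tuple[ImageRecord, List[str]]:
    """Keep the reviewer-chosen canonical image without losing alt text.

    Returns the canonical record (with alt text filled in if it was
    missing) and the IDs of the duplicates that are safe to archive.
    """
    alt = canonical.alt_text
    if not alt:
        # Borrow the first non-empty description found on a duplicate.
        alt = next((d.alt_text for d in duplicates if d.alt_text), None)
    if not alt:
        # Assumed policy: stop so a person can write a description
        # before anything is archived.
        raise ValueError(f"no alt text available for {canonical.asset_id}")
    kept = replace(canonical, alt_text=alt)
    return kept, [d.asset_id for d in duplicates]


# Hypothetical example: the canonical copy lacks a description,
# but one of its duplicates has one.
kept, to_archive = consolidate(
    ImageRecord("img-001"),
    [ImageRecord("img-007", "City Hall Plaza at dusk"), ImageRecord("img-012")],
)
print(kept.alt_text, to_archive)
```

The point of the design is ordering: the description is copied onto the surviving record before any duplicate is archived, so the published image is never left without one.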
Jamaica Plain's Centre Street neighborhood office, which handles constituent services for one of the city's most densely populated wards, updated its local events page this week after discovering that the same banner image had been uploaded 14 separate times to the city's content management system since 2022 — each upload by a different staff member unaware prior versions existed.
For residents and local organizations that interact with city digital systems, the near-term advice from technology staff is straightforward: when submitting images for publication on city platforms, use file names that include the date and department, and check with the relevant communications contact before uploading assets that may already exist in a shared library. The Boston Digital Services team is expected to publish updated image submission guidelines for community groups by the end of July, part of a broader digital content policy document that has been in draft since February 2026.
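For groups that submit images in bulk, the naming advice can be automated. The sketch below shows one possible convention; the city has not yet published a required format, so the date-department-description pattern, the function names, and the sample values are all assumptions.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def submission_filename(
    department: str, taken: date, description: str, ext: str = "jpg"
) -> str:
    """Build a name like 2026-07-01_department_description.jpg (assumed pattern)."""
    extension = ext.lower().lstrip(".")
    return f"{taken.isoformat()}_{slug(department)}_{slug(description)}.{extension}"


# Hypothetical submission from a parks department.
print(submission_filename("Parks & Recreation", date(2026, 7, 1), "Greenway Fountain"))
```

Putting the ISO date first means files sort chronologically in any folder listing, which makes an earlier upload of the same photo easier to spot.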
The larger deduplication audit across city systems is scheduled to run through September 30, the end of the first quarter of fiscal year 2027. Whether the cleanup holds will depend on whether the new platform enforces upload controls automatically. Technical teams say that capability is being built but is not yet live.