Boston's public agencies and major institutions are sitting on tens of thousands of duplicate digital images scattered across incompatible servers — and a window to fix it is closing fast. With the city's Office of Digital Transformation targeting a unified asset management platform rollout for early 2027, the next six months represent the last practical opportunity to audit, reconcile, and replace redundant files before the migration locks in whatever chaos already exists.
The stakes are higher than cluttered hard drives. Duplicate images cost real money to store, create legal exposure around licensing, and slow down public-facing services that residents increasingly depend on. For a city whose identity is bound up in institutions like Massachusetts General Hospital, MIT, and the Boston Public Library — all of which maintain separate digital archives — the problem compounds across dozens of networks simultaneously.
Where the Problem Is Sharpest
The Boston Public Library's Copley Square headquarters has been digitizing its historical photograph collection since 2019, a project that now encompasses more than 400,000 images. Librarians there have acknowledged publicly that duplication crept in early, when multiple staff members uploaded scans from the same physical negatives without a shared naming convention. The BPL's Digital Services team is now conducting a reconciliation review, but the process is manual and slow.
Over in the Longwood Medical Area, Beth Israel Deaconess Medical Center and Brigham and Women's Hospital both maintain clinical image libraries that feed into research publications and patient education materials. Neither institution shares a platform with the other. When research teams at Harvard Medical School — whose administrative offices sit on Longwood Avenue, less than half a mile from both hospitals — pull images for joint studies, duplicates multiply again at every handoff. A 2024 survey by the Coalition for Networked Information found that large academic medical centers waste an average of 18 percent of their digital storage budgets on redundant file management, a figure that translates into real dollars at institutions operating at Boston's scale.
The city government's own problem is concentrated in the Department of Innovation and Technology, which serves agencies from the Environment Department on City Hall Plaza to the Boston Planning Department offices in Roxbury. An internal audit completed in March 2026 — details of which were shared at a public procurement briefing — identified more than 60,000 image files flagged as probable duplicates across departmental SharePoint environments. The audit did not assign a remediation cost, but procurement documents released alongside it listed three vendor bids for a deduplication and replacement service, ranging from $280,000 to $510,000.
The Decisions That Will Define the Outcome
Three choices now sit in front of decision-makers, and none of them is straightforward. First: whether to pursue hash-matching tools that identify only pixel-identical duplicates, or invest in AI-assisted perceptual matching that catches near-duplicates and differently cropped versions of the same image. Hash-matching tools are cheaper upfront (some open-source options cost nothing beyond implementation hours), but they miss the messier category of near-matches, which is where most of the real redundancy in Boston's institutional libraries sits.
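The gap between those two approaches can be illustrated with a short Python sketch. It is not drawn from any vendor bid or city system: the function names, the sample file names, and the tiny 4x4 grayscale grids are all hypothetical, and the simple "average hash" below stands in for the far more capable perceptual matching a commercial tool would use. It assumes images have already been reduced to small grayscale grids of equal size.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(files):
    """Group file names whose bytes are identical, keyed by SHA-256 digest."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    # Only digests shared by more than one file indicate duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]


def average_hash(pixels):
    """Return one bit per pixel: 1 if brighter than the image mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length bit tuples differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two grids as near-duplicates if their hashes differ in few bits.

    max_distance is an illustrative threshold, not a recommended setting.
    """
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


def to_bytes(pixels):
    """Serialize a grid to raw bytes so it can be fed to the exact matcher."""
    return bytes(value for row in pixels for value in row)


# Hypothetical sample data: one scan, a slightly brightened re-scan of the
# same negative, and an unrelated image.
original = [[10, 200, 10, 200], [200, 10, 200, 10],
            [10, 200, 10, 200], [200, 10, 200, 10]]
brightened = [[value + 10 for value in row] for row in original]
unrelated = [[200] * 4, [200] * 4, [10] * 4, [10] * 4]

files = {
    "scan_a.raw": to_bytes(original),
    "scan_b.raw": to_bytes(original),      # byte-for-byte copy
    "scan_c.raw": to_bytes(brightened),    # same picture, different bytes
}

exact_groups = group_exact_duplicates(files)
rescan_matches = is_near_duplicate(original, brightened)
unrelated_matches = is_near_duplicate(original, unrelated)
```

In this sketch the exact matcher groups only scan_a.raw and scan_b.raw, because the brightened re-scan has different bytes, while the perceptual comparison still pairs the brightened version with the original and rejects the unrelated image. That is the trade-off in miniature: exact hashing is cheap and never wrong about what it finds, but it is blind to the re-scans, crops, and re-exports that make up the near-match category.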
Second: who owns the replacement decision when the duplicate spans agencies or institutions? At the Jamaica Plain branch of the BPL on South Street, for example, community event photographs uploaded by both the library's communications team and the city's Neighborhood Services office have ended up in separate archives with no clear authority over which version is canonical. That governance gap has to be resolved before any technical fix holds.
Third: timing. The city's 2027 platform target is firm, according to procurement documents. Any institution that doesn't complete its deduplication and replacement work before the migration window, expected to open in October 2026, risks having its duplicates baked permanently into the new system's baseline.
The BPL has already opened a public comment period through July 31 for community stakeholders who use its digital image collections. The city's Office of Digital Transformation is expected to release its vendor selection for the deduplication project before Labor Day. For institutions that haven't started their own audits, the math is simple: six weeks to get organized, or inherit the problem for another decade.