Boston's public institutions are sitting on digital image libraries riddled with duplicates, outdated files, and redundant records — and the effort to fix that problem is drawing attention from city hall to Copley Square. Archivists, municipal tech officers, and academic librarians are now pushing for a coordinated approach, arguing that bloated digital collections cost real money and undermine public access to government records.
The issue has sharpened over the past year as Mayor Michelle Wu's administration has accelerated its open-data and digital-services agenda. Maintaining accurate, searchable image databases is foundational to that work, whether the files are construction permit photos filed with the Boston Inspectional Services Department, heritage images held by the Boston Public Library's Digital Commonwealth program, or public-health documentation archived by Boston Public Health Commission staff at 1010 Massachusetts Avenue.
Why Duplicate Images Become an Institutional Headache
Duplicate image replacement, the process of identifying redundant files, selecting a canonical version, and systematically retiring the rest, sounds mundane. It is not. When city agencies and universities each maintain separate digital asset management systems with overlapping content, researchers and journalists requesting records under the Massachusetts public records law (Chapter 66 of the General Laws) can receive inconsistent document sets. Archivists at institutions including Northeastern University's Snell Library and the Boston Athenaeum on Beacon Street have flagged this as a growing concern as collections digitized in the early 2010s age into obsolescence and duplicate files accumulate.
At the Boston Public Library's Kirstein Business Branch, digital services staff have been working through the Digital Commonwealth repository — a statewide platform hosting collections from more than 170 Massachusetts institutions — to address metadata gaps and image redundancy. The effort is part of a broader push to make holdings more usable for the public, but it requires specialized labor that many smaller partner institutions struggle to fund.
Academic technologists at MIT's Digital Humanities group, based in Cambridge just across the Charles River, have noted that automated deduplication tools can flag roughly 15 to 30 percent of images in large institutional repositories as probable duplicates — though human review remains essential before any file is retired. That range reflects findings published in library and information science literature over the past several years, not a figure specific to any single Boston institution.
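To make the flag-then-review idea concrete, here is a minimal Python sketch of the simplest form of automated flagging: grouping files whose bytes are identical. It is an illustration only, not the method used by any institution named here. The function names and the sample file names are hypothetical, and production tools typically also look for near-duplicates (resized or recompressed copies), which this sketch does not attempt. Nothing is deleted; the output is a list of candidate groups for a human reviewer.

```python
import hashlib
from collections import defaultdict


def flag_probable_duplicates(files):
    """Group file names whose contents are byte-for-byte identical.

    `files` maps a file name to its raw bytes (hypothetical input shape).
    Returns a sorted list of groups; each group is a sorted list of names.
    """
    by_digest = defaultdict(list)
    for name, data in files.items():
        # SHA-256 of the content: identical bytes give identical digests.
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    # Only groups with more than one member are duplicate candidates.
    return sorted(sorted(names) for names in by_digest.values() if len(names) > 1)


def redundant_share(files, groups):
    """Fraction of files that are extra copies, keeping one file per group."""
    if not files:
        return 0.0
    redundant = sum(len(group) - 1 for group in groups)
    return redundant / len(files)


# Made-up sample data: two identical scans and one distinct image.
sample = {
    "permit_001.jpg": b"scan-bytes-A",
    "permit_001_copy.jpg": b"scan-bytes-A",
    "parade_2019.jpg": b"scan-bytes-B",
}
groups = flag_probable_duplicates(sample)
share = redundant_share(sample, groups)
```

The share computed this way counts only exact copies, so it would likely understate what a near-duplicate detector flags.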
Money, Staffing, and the City's Role
Cost is the central argument city officials are making for taking the problem seriously now rather than later. Cloud storage is not free. The City of Boston's Department of Innovation and Technology, which oversees municipal data infrastructure, has been reviewing its digital asset overhead as part of budget planning that carried into fiscal year 2026. Storage costs for unmanaged, duplicate-heavy image archives scale quickly when agencies like the Boston Parks and Recreation Department or the Office of Arts and Culture are each uploading event photography without centralized deduplication protocols.
Jamaica Plain's Spontaneous Celebrations arts organization and the Dorchester-based Codman Square Neighborhood Development Corporation have both contributed images to city-affiliated community documentation projects, which complicates the question of who owns which version of a file once it enters a municipal or quasi-public repository.
Library and records professionals point to a practical path forward: institutions need written deduplication policies, not just software. They recommend designating a canonical file standard, typically the highest-resolution original, before any automated replacement process runs. They also stress that audit trails matter: any replaced image should be logged with a timestamp and a reason code, both for legal defensibility under Massachusetts records-retention schedules and for the integrity of the historical record.
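As a rough illustration of that recommendation, the Python sketch below picks the highest-resolution file in a duplicate group as the canonical version and writes an audit entry, with a timestamp and a reason code, for every file it proposes to retire. Everything in it is an assumption made for illustration: the `ImageRecord` fields, the function names, and the reason code `DUP-LOWER-RES` are hypothetical and do not come from any Massachusetts retention schedule or city system. The sketch only produces a plan; it does not delete or replace anything.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical minimal record for one image in a duplicate group."""
    file_id: str
    width: int
    height: int


def choose_canonical(group):
    """Pick the record with the most pixels; ties go to the lowest file_id."""
    return min(group, key=lambda r: (-(r.width * r.height), r.file_id))


def plan_replacements(group, reason_code="DUP-LOWER-RES", now=None):
    """Return the canonical record and an audit log for the other files.

    The log is a list of dicts, one per file proposed for retirement.
    `reason_code` is a made-up placeholder, not an official code.
    """
    now = now or datetime.now(timezone.utc)
    canonical = choose_canonical(group)
    log = [
        {
            "retired": record.file_id,
            "replaced_by": canonical.file_id,
            "timestamp": now.isoformat(),
            "reason_code": reason_code,
        }
        for record in group
        if record.file_id != canonical.file_id
    ]
    return canonical, log


# Made-up example: one full-size original and two smaller derivatives.
group = [
    ImageRecord("img-a", 800, 600),
    ImageRecord("img-b", 4000, 3000),
    ImageRecord("img-c", 800, 600),
]
fixed_time = datetime(2025, 1, 1, tzinfo=timezone.utc)
canonical, audit_log = plan_replacements(group, now=fixed_time)
```

Separating the plan from the deletion keeps a human reviewer in the loop: the audit log can be inspected, corrected, and archived before any file is actually retired.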
For Boston residents and researchers, the practical upshot is straightforward. If you are filing a public records request with a city agency and receive image files that appear inconsistent or mislabeled, you have the right under state law to request clarification on which version is the official record. The Secretary of State's office in Boston maintains guidance on that process. For institutions still sorting out their internal systems, the message from archivists is consistent: start with an audit, not a deletion.