The Daily Boston

Boston's Duplicate Image Problem: The Key Decisions Ahead for City Archives and Public Records

A growing backlog of duplicate and misidentified photographs in the city's digital repositories is forcing libraries, universities, and municipal offices to decide how — and how fast — to clean house.

By Boston News Desk · Published 4 July 2026, 2:45 pm

Photo by Richard Lathrop on Pexels

Boston's public institutions are sitting on a problem that has been accumulating for years: thousands of duplicate, mislabeled, and low-resolution images scattered across municipal digital archives, university repositories, and neighborhood preservation databases. The question now is who pays to fix it, who decides which copies survive, and what standard governs the whole process.

The issue has sharpened in 2026 as the city's ongoing push to digitize historical records — part of Mayor Michelle Wu's broader transparency and open-data agenda — has collided with the unglamorous reality of legacy content management. When you migrate files from old servers, duplicates multiply. When three separate departments each scan the same 1920s photograph of Washington Street, you end up with three versions of unequal quality sitting in three separate folders, none of them talking to the others.

Where the Backlog Lives

The Boston City Archives on Boylston Street and the Boston Public Library's Digital Repository, headquartered at the Central Branch on Copley Square, are the two institutions most directly in the crosshairs. The BPL's digital collections alone span more than 2 million items, a figure the library has cited publicly in fundraising materials, and staff have acknowledged that deduplication work has lagged behind acquisition for at least a decade. Meanwhile, Northeastern University's Archives and Special Collections on Snell Library Drive and the Boston Landmarks Commission — which maintains photographic documentation of historic structures across Roxbury, the South End, and East Boston — each run parallel systems with their own naming conventions and storage standards.

The lack of a shared metadata protocol is the core technical problem. When the city's Department of Innovation and Technology pushed agencies toward a unified content management platform starting in fiscal year 2024, the effort exposed just how inconsistent tagging and file-naming had become. A photograph of the Dudley Square municipal building might exist under four different file names, three different date stamps, and two different rights classifications depending on which office uploaded it. Replacing or retiring a duplicate is not simply a matter of hitting delete — it requires a rights review, a quality assessment, and a decision about which version becomes the canonical record.
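
As a rough illustration of the simplest part of that workflow, the Python sketch below groups byte-identical files that were uploaded under different names and picks one copy as the canonical record. Everything in it is an assumption made for illustration: the `ImageRecord` fields, the function names, and the tie-breaking rule are hypothetical and do not describe any system used by the city or the BPL. It also catches only exact copies. Separate scans of the same print differ byte for byte, so they would still need perceptual comparison and a human quality assessment.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ImageRecord:
    # Hypothetical fields; real archive metadata would be far richer.
    file_name: str
    content: bytes        # raw bytes of the image file
    rights_cleared: bool  # outcome of a manual rights review


def group_exact_duplicates(records):
    """Group records whose bytes are identical, whatever they are named."""
    groups = defaultdict(list)
    for record in records:
        digest = hashlib.sha256(record.content).hexdigest()
        groups[digest].append(record)
    # Only groups with more than one member contain duplicates.
    return [group for group in groups.values() if len(group) > 1]


def pick_canonical(group):
    """Prefer a rights-cleared copy; break ties by file name (assumed rule)."""
    return min(group, key=lambda r: (not r.rights_cleared, r.file_name))


if __name__ == "__main__":
    records = [
        ImageRecord("dudley_sq_001.tif", b"scan-A", True),
        ImageRecord("DudleySquare-final.tif", b"scan-A", False),
        ImageRecord("washington_st_1920s.tif", b"scan-B", True),
    ]
    for group in group_exact_duplicates(records):
        keep = pick_canonical(group)
        retire = [r.file_name for r in group if r is not keep]
        print(f"keep {keep.file_name}; review for retirement: {retire}")
```

The sketch deliberately stops at flagging copies for review and deletes nothing, since retiring a file still depends on the rights and quality checks described above.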

For neighborhood preservation groups, the stakes are concrete. The Jamaica Plain Historical Society, which has been digitizing slides and prints from the 1970s urban renewal era, has had to pause uploads to the BPL's shared portal twice this year while the library works through a backlog of ingestion requests. The Dudley Street Neighborhood Initiative in Roxbury, which holds photographic documentation tied to its community land trust model, faces similar friction when trying to contribute materials to public-facing city collections.

The Decisions That Can't Wait

Three choices are coming to a head before the end of fiscal year 2026, which closes September 30. First, the city must decide whether to adopt a single deduplication standard — likely built around the Library of Congress's PREMIS metadata schema — or allow institutions to maintain separate systems linked by a shared index. Second, the BPL board needs to determine whether to fund a dedicated digital-collections archivist position, a role that has been vacant since March 2025, or contract the work to a vendor. Third, Northeastern and the Boston Landmarks Commission are in preliminary talks about a joint storage agreement that would eliminate redundant cloud costs, estimated internally at a combined six-figure annual expenditure, though no formal agreement has been signed.

The practical consequence for residents and researchers is real. Until these decisions land, anyone searching the city's digital archives for historical photographs of, say, the Orange Line construction through Egleston Square or the old Scollay Square blocks risks pulling up degraded or incorrectly dated images without knowing a better version exists somewhere else in the system. Getting this right matters not just for historical accuracy but for the legal defensibility of public records in land-use disputes, where photographic evidence of prior building conditions can be decisive.

The most likely path forward involves a working group convened through the Mayor's Office of Arts and Culture, with a report due by October 1. Whether the institutions can agree on a common standard in that window — given competing budgets and institutional cultures — is the central question heading into the fall.

About this article

Published by The Daily Boston

This article was produced by The Daily Boston editorial desk and covers news in Boston. See our editorial standards for how we use AI.
