The Daily Boston

Boston's Digital Archives Are Full of Duplicate Images. Officials and Experts Say It's Costing the City More Than Storage Space.

From City Hall to Roxbury Community College, the push to clean up redundant digital records is gaining urgency as budgets tighten and AI-driven cataloguing tools go mainstream.

By Boston News Desk · Published 4 July 2026, 2:51 pm

Photo: North End Union, Boston. School of Printing / Public domain (Wikimedia Commons)

Boston's municipal digital infrastructure is carrying a problem that sounds mundane until you see the price tag. Thousands of duplicate images — redundant photographs, scanned documents, and graphics recycled across city databases — are cluttering records systems maintained by agencies ranging from the Boston Public Library's digital collections unit to the Office of Arts and Culture on City Hall Plaza. The cleanup effort, long deferred, is now drawing pointed attention from technologists, archivists, and budget analysts who say the status quo is unsustainable.

The issue surfaces at an awkward moment. Mayor Michelle Wu's administration has pushed hard on digital equity and open-government data initiatives, and the city's five-year capital plan includes modernisation funding for municipal IT. Redundant image files sit at the intersection of both priorities — they inflate storage costs, slow retrieval systems used by planners and researchers, and complicate the kind of transparent public records access that open-government advocates have demanded from Boston for years.

What the Experts Are Saying

Archivists at Northeastern University's Snell Library, which partners with city agencies on preservation projects, have flagged the proliferation of duplicate images as a growing concern in local government collections. The problem compounds when records are migrated between platforms: a city department moves from one content management system to another, files are imported twice, and no one audits the overlap. In a 2024 report, analysts at the Urban Institute estimated that unmanaged digital redundancy can add between 15 and 30 percent to annual cloud storage costs for mid-sized cities nationally. Boston, which contracts with Amazon Web Services for portions of its cloud infrastructure, is not immune to that math.

Specialists in digital asset management point to a structural issue: Boston's agencies have historically operated their own siloed image libraries. The Boston Planning Department keeps its own archive of neighbourhood survey photographs. The Boston Public Health Commission maintains a separate collection of public-awareness graphics. The Parks and Recreation Department stores images from every ribbon-cutting in Dorchester and Jamaica Plain going back a decade. None of these libraries talk to each other in any automated way, which means the same photograph of, say, Franklin Park can exist in four separate city databases with four different file names and four separate storage costs attached.
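To make the Franklin Park scenario concrete, here is a minimal sketch of how an auditor might find copies of the same file stored under different names in separate libraries. It is an illustration only, not a description of any tool the city or its partners use. The function names and the idea of passing one folder per agency are assumptions made for the example. The approach compares file contents by SHA-256 digest, so it catches byte-identical copies regardless of file name, but it would miss a photograph that has been resized or re-saved in another format.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(roots: list[Path]) -> dict[str, list[Path]]:
    """Group files under the given folders by content digest.

    Only groups containing two or more files are returned, i.e. files
    whose bytes are identical even if their names differ.
    """
    groups: dict[str, list[Path]] = defaultdict(list)
    for root in roots:  # hypothetical: one folder per agency library
        for path in sorted(root.rglob("*")):
            if path.is_file():
                groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Called with a list of folders, `find_duplicates` returns each set of identical files keyed by digest, which an archivist could then review before deciding which copy to keep. Catching near-duplicates, such as a cropped or recompressed version of the same photograph, would require a different technique, for example perceptual hashing.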

Roxbury Community College's digital media program has begun incorporating duplicate-detection methodology into its curriculum, treating Boston's own municipal records challenge as a live case study. Instructors there describe the problem to students as a failure of governance as much as a failure of technology — the tools to identify and remove duplicate images have existed for years, but someone has to mandate their use and fund the audit.

What Happens Next

City technology officials have not publicly committed to a citywide duplicate-image audit with a firm deadline or budget line, though the Wu administration's Digital Equity Action Plan, updated in early 2025, acknowledged the need for improved data hygiene across departments. Industry practitioners say a typical municipal audit of this kind, when contracted to a specialist firm, runs between $80,000 and $150,000 depending on the volume of records. Boston could recoup that cost through reduced storage overhead within 18 to 24 months of completion, they say.
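The payback claim can be checked with simple arithmetic. The sketch below works out how much the city would need to save each month for an audit to pay for itself, using only the cost and timeline figures quoted above. The article's sources did not say which audit cost goes with which payback window, so the example prints every combination. The function name is an assumption made for illustration.

```python
def required_monthly_savings(audit_cost: float, payback_months: int) -> float:
    """Monthly storage savings needed to recover the audit cost in the given time."""
    return audit_cost / payback_months


# Figures quoted by industry practitioners in this article.
audit_costs = [80_000, 150_000]
payback_windows = [18, 24]

for cost in audit_costs:
    for months in payback_windows:
        needed = required_monthly_savings(cost, months)
        print(f"${cost:,} audit, {months}-month payback: ${needed:,.2f} per month")
```

On those figures, the required savings range from roughly $3,333 a month (an $80,000 audit recovered over 24 months) to roughly $8,333 a month (a $150,000 audit recovered over 18 months).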

For residents and researchers who rely on city digital records — journalists pulling public documents, urban planners consulting neighbourhood surveys, historians working through Boston City Archives on Moakley Street in South Boston — the practical upside of a cleanup would be faster search results and more reliable metadata. A photograph that exists once, properly tagged, is far easier to find than the same image duplicated seven times under inconsistent file names.

The conversation is moving faster than it was two years ago, driven partly by the arrival of AI-powered cataloguing tools that can flag duplicates at scale with limited human intervention. Several Boston-area universities and at least one cultural institution on the Fenway have begun piloting such tools internally. Whether the city government moves on a formal program before the next budget cycle closes in the spring of 2027 may depend on how loudly the archivists, IT managers, and open-records advocates make their case between now and then.

About this article

Published by The Daily Boston

This article was produced by The Daily Boston editorial desk and covers news in Boston. See our editorial standards for how we use AI.
