The Daily Boston

Boston news, every day


How Boston's City Archives Ended Up With Thousands of Duplicate Photos — and What It's Doing About It

A years-long backlog of redundant digital images has quietly undermined public records access at City Hall, and officials are now scrambling to fix it.

By Boston News Desk · Published 4 July 2026, 3:16 pm


Photo: Mohan Nannapaneni on Pexels

Boston's municipal archives are sitting on a problem that took well over a decade to build. Thousands of duplicate digital images, including scanned photographs, permit records, and neighborhood survey shots, have accumulated in the city's document management system, clogging searches, inflating storage costs, and making it harder for residents, researchers, and city planners to find what they need. The Boston City Archives, housed at 201 Rivermoor Street in West Roxbury, has formally flagged the issue to the mayor's office and is in the early stages of a remediation project that officials say could run through the end of fiscal year 2027.

The problem didn't start overnight. It traces back to at least 2009, when the city accelerated a bulk digitization push under a federal preservation grant. Departments uploaded files without a unified naming convention, and when the system was migrated again around 2018, duplicate files were copied wholesale rather than deduplicated. Subsequent uploads from agencies including the Boston Planning Department and the Inspectional Services Department compounded the issue. By the time internal auditors began a formal count in early 2025, the redundant files numbered in the tens of thousands.

Why does this matter now? Mayor Michelle Wu's administration has staked significant political capital on government transparency and digital accessibility, two pillars of her progressive agenda. Both are directly undercut when a resident searching the city's online portal for, say, a 1970s Jamaica Plain building survey pulls up fourteen identical scans of the same photograph and no usable metadata. The Jamaica Plain Neighborhood Development Corporation, which regularly accesses historical permit records for affordable housing work on properties along Centre Street, has noted the friction this creates for grant-funded research projects, though it has not issued a formal public complaint.

A System Built on Patch Jobs

The deeper issue is structural. The city's current content management platform, Laserfiche, was adopted in 2012 and was never configured with mandatory deduplication rules. When the Boston Public Library's Digital Repository Services team partnered with the archives on a 2019 joint digitization project covering Dorchester neighborhood records (streets like Bowdoin Avenue and Talbot Avenue featured heavily), both institutions uploaded files without cross-referencing, creating mirror copies across two separate systems. That project covered roughly 14,000 images, and archive staff estimate that between 20 and 30 percent of those files, or roughly 2,800 to 4,200 images, exist in at least two locations with no flag to alert users.

Storage isn't free. Municipal cloud storage contracts, which the city renegotiated with a vendor in 2023, bill at a per-gigabyte rate that makes redundant high-resolution image files a measurable budget line. The remediation project, according to a planning document circulated within the city's Department of Innovation and Technology in March 2026, will involve a phased automated deduplication sweep followed by manual review of flagged files by archive staff. Phase one is expected to cover records predating 2010. No public dollar figure for the full project has been released.
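For readers curious what an automated deduplication sweep might look like in practice, the following is a minimal Python sketch, not the city's actual tooling. The function names (sha256_of, find_duplicate_groups) and the idea of scanning a local folder are assumptions made for illustration. It groups byte-identical files and only flags them, leaving the decision to a human reviewer, in the spirit of the two-step process described in the planning document.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(root: Path) -> list[list[Path]]:
    """Return groups of byte-identical files under root, for manual review."""
    # Cheap first pass: a file with a unique size cannot have an exact duplicate.
    by_size: dict[int, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for candidates in by_size.values():
        if len(candidates) > 1:
            for path in candidates:
                by_hash[sha256_of(path)].append(path)

    # Nothing is deleted here; groups are only flagged for a human to check.
    return [group for group in by_hash.values() if len(group) > 1]
```

A sweep like this catches only exact copies. Two separate scans of the same photograph, or one file saved again at a different resolution, would produce different hashes, which is one reason a manual review step would still be needed.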

What Comes Next

The archives office is expected to issue updated file submission standards to all city departments by September 2026, requiring standardized naming conventions and SHA-256 hash verification on upload — a technical safeguard that flags identical files before they enter the system. The MBTA's own record-keeping overhaul, which has been running parallel to the city's effort as part of broader transit accountability reforms, offers a nearby template: the transit authority began enforcing upload deduplication protocols for maintenance photos in 2024 after an internal review found redundant inspection records had complicated a federal safety audit.
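To illustrate how hash verification on upload can stop an identical file before it enters a system, here is a short Python sketch under stated assumptions. The UploadRegistry class and its check_upload method are hypothetical names invented for this example, and the in-memory dictionary stands in for whatever index a real records platform would keep. It is not a description of the city's or the MBTA's implementation.

```python
import hashlib


class UploadRegistry:
    """Hypothetical in-memory index of SHA-256 digests already accepted."""

    def __init__(self) -> None:
        # Maps each digest to the name of the first file accepted with it.
        self._seen: dict[str, str] = {}

    def check_upload(self, name: str, data: bytes) -> tuple[bool, str]:
        """Accept new content; flag content identical to an earlier upload.

        Returns (accepted, name_of_file_holding_this_content).
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            # Same bytes already on file: reject and point to the original.
            return False, existing
        self._seen[digest] = name
        return True, name
```

In this sketch a second upload with the same bytes is turned away even if its filename differs, which is the point of checking content rather than names. Standardized naming conventions would still matter for search and metadata, since a hash says nothing about what a file depicts.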

For residents, the practical upshot is that the city's public records portal — accessible at bostonma.gov — should become more reliable for property history searches and neighborhood document requests over the next 18 months, assuming the remediation timeline holds. Researchers at institutions like Northeastern University's library on Huntington Avenue, which regularly draws on Boston municipal records for urban planning studies, stand to benefit most from the cleanup. The archives office encourages anyone hitting dead ends in current searches to submit a public records request directly through the city's online form while the deduplication work proceeds.


About this article

Published by The Daily Boston

This article was produced by The Daily Boston editorial desk and covers news in Boston. See our editorial standards for how we use AI.
