The Daily Boston

How Boston's City Archives Ended Up With Thousands of Duplicate Images — and What It's Doing About It

Decades of digitization drives, competing department databases, and a fragmented approach to records management left the city sitting on a sprawling mess of redundant files. Here's the story of how that happened.

By Boston News Desk · Published 4 July 2026, 2:45 pm

Photo: Phil Evenden on Pexels

Boston's city government is sitting on what is estimated to be tens of thousands of duplicate digital images, scattered across at least six separate departmental servers. Archivists and records managers say the problem traces directly to a decade-long digitization push that prioritized speed over coordination. The situation came into sharper focus this spring, when the Mayor's Office of New Urban Mechanics began an audit of the city's digital asset inventory ahead of a planned migration to a unified content management platform.

The timing matters. Mayor Michelle Wu's administration has made open government and digital equity central planks of its agenda, and city technology staff have been under pressure to consolidate legacy systems. But consolidating systems means confronting what years of siloed digitization actually produced: the same photographs, permits, and planning documents stored in multiple formats across multiple drives, with no single authoritative copy flagged for preservation.

A Fragmented History, Neighborhood by Neighborhood

The root of the problem goes back to roughly 2012, when several city departments, including the Boston Planning & Development Agency (then still known as the Boston Redevelopment Authority) and the Department of Innovation and Technology, began independent scanning campaigns. The BPDA's initiative focused heavily on Dorchester and Jamaica Plain, two neighborhoods where zoning disputes generated dense paper trails. Staff photographed buildings, streetscapes, and public hearing documents, then saved those files to department-specific drives using inconsistent naming conventions. At the same time, the Boston City Archives on School Street ran its own scanning program under a separate grant from the Massachusetts Board of Library Commissioners.

Nobody at the time established a shared metadata standard or a deduplication protocol. By 2017, when the city attempted to launch a unified GIS-linked property database, technicians discovered that the same 1920s-era photograph of a triple-decker on Blue Hill Avenue existed in at least four separate files, each with a different file name, resolution, and embedded timestamp. The property database project stalled. The underlying image problem did not get fixed.

Three administrations since then, those of Mayors Menino, Walsh, and Wu, each initiated technology modernization efforts, but none targeted the duplicate image backlog as a standalone project. Instead, each round of modernization added more scanned material without retiring or reconciling the old.

What the Audit Found, and What Comes Next

The current audit, which city technology staff began in March 2026, is using automated deduplication software to compare hash values across servers hosted at City Hall on Cambridge Street and at a secondary data facility in South Boston. Early results, according to public budget documents submitted to the Boston City Council in May, suggest that storage costs tied to redundant files have contributed to data infrastructure expenses running above projections for the fiscal year ending June 30, 2026.
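To illustrate the general idea of comparing hash values, here is a minimal Python sketch. It is not the city's software, which the budget documents do not name. The choice of SHA-256 and the function names `file_digest` and `find_duplicates` are assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large scans never load fully into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(roots):
    """Group files under the given directories by content hash.

    Returns a dict mapping each digest to a sorted list of paths,
    keeping only digests shared by two or more files.
    """
    by_digest = defaultdict(list)
    for root in roots:
        for path in Path(root).rglob("*"):
            if path.is_file():
                by_digest[file_digest(path)].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling `find_duplicates` with a list of directory paths, one per server mount, would return groups of byte-identical files regardless of their names. An exact hash only matches copies that are identical byte for byte. Files saved at different resolutions, like the four versions of the Blue Hill Avenue photograph, would produce different digests and would need a perceptual comparison instead.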

The Massachusetts Public Records Law requires municipalities to maintain accessible, non-duplicative archives for legally designated records. Redundant image files complicate compliance because archivists cannot always certify which version of a document is the official record. That legal ambiguity is part of what prompted the current push for cleanup.

The BPDA, which manages planning records touching neighborhoods from Roxbury to East Boston, is expected to be the first department to complete deduplication under the new protocol. Staff there are working against a September 2026 target date tied to the rollout of a new permitting portal.

For residents and neighborhood groups that rely on city-held historical images, the practical upshot is that access to records may be temporarily restricted during the migration window. The Jamaica Plain Neighborhood Development Corporation, for instance, has used BPDA archive photos in community planning presentations. City technology staff have said in budget hearing testimony that a read-only archive portal will remain available throughout the process. Even so, anyone needing historical property images from the city's collection should submit public records requests now, before the migration begins in late summer, to avoid potential delays during the transition.

About this article

Published by The Daily Boston

This article was produced by The Daily Boston editorial desk and covers news in Boston. See our editorial standards for how we use AI.
