The Daily Boston

Boston's Duplicate Image Problem: What Happens Next and the Key Decisions Ahead

City agencies and universities are sitting on backlogs of redundant digital assets — and the clock is ticking on who pays to clean them up.

By Boston News Desk · Published 4 July 2026, 2:48 pm

Photo: Mohammed Abubakr on Pexels

Boston's public institutions face a growing backlog of duplicate digital imagery: redundant photographs, scanned documents, and visual records stored across multiple servers, all of which add storage costs and administrative work. The problem has quietly grown alongside the city's aggressive push to digitize public records, a process that accelerated after Mayor Michelle Wu's administration expanded the Boston Digital Equity Initiative in 2024. Now, with several major city contracts up for renewal this fall, officials must decide which software platforms will manage the cleanup and what that process will cost taxpayers.

The stakes are higher than they might appear. Duplicate image files inflate cloud storage budgets, slow retrieval times for public records requests, and create compliance risks under Massachusetts public records law, which requires agencies to maintain accurate and accessible document inventories. For a city that has staked part of its progressive governance identity on transparency and open data, a bloated and redundant image archive is an embarrassing contradiction.

Where the Backlog Is Worst

Two institutions illustrate the challenge most sharply. The Boston Public Library's Central Library in Copley Square, which houses the city's primary digital archive operation, has been working through a backlog of duplicate scans dating to a 2021 digitization grant from the Institute of Museum and Library Services. Staff there have identified thousands of image pairs (the same historical photograph stored twice under different file names) that need to be reconciled before the library can migrate to a new asset management platform later this year.
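For readers curious how such pairs can be found, one common approach is to compare files by a hash of their contents rather than by name. The sketch below is illustrative only and does not describe the library's actual tooling; the function names and folder layout are hypothetical. It also catches only byte-for-byte copies, so a photograph that was re-scanned or re-compressed would not be matched.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(root: Path) -> List[List[Path]]:
    """Group files under root whose contents are byte-for-byte identical.

    File names are ignored, so the same scan saved under two different
    names ends up in the same group.
    """
    by_digest: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only digests shared by two or more files indicate duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicate_groups(Path("scans"))` on a hypothetical `scans` folder would return one list per set of identical files, which staff could then review before deleting anything.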

Separately, the city's Office of Planning, which oversees development review for neighborhoods including Jamaica Plain and Dorchester, maintains image libraries tied to zoning case files. Those libraries have ballooned as the office processed a record number of variance applications in 2025, driven by Wu's housing production push. Duplicate site photographs submitted by applicants are routinely stored without deduplication, a workflow gap that IT staff have flagged internally.

Boston's universities aren't immune. Northeastern University's library system on Huntington Avenue has grappled with similar redundancy issues in its institutional repository, particularly in collections digitized under external grants where multiple departments uploaded overlapping materials without a central registry.

The Decisions Coming This Fall

Three choices will define how this gets resolved — or doesn't. First, the city must decide by September 30 whether to renew its existing contract with its current cloud storage vendor or shift to a platform with built-in deduplication tools. The difference in annual licensing costs can run to six figures for a municipal operation the size of Boston's, according to standard enterprise software pricing tiers published by vendors like Microsoft Azure and Google Cloud.

Second, the Office of Planning needs to settle on a workflow standard before the next cycle of Jamaica Plain and Dorchester housing applications arrives in the spring. Without a mandatory deduplication step at the point of submission, the backlog will simply rebuild itself regardless of what cleanup happens now.
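What a deduplication step at the point of submission might look like can be sketched in a few lines. This is an assumption-laden illustration, not the Office of Planning's system: the `SubmissionRegistry` class, its in-memory storage, and the case identifiers are all hypothetical, and a real workflow would persist the registry in a database.

```python
import hashlib
from typing import Dict, Optional, Tuple


class SubmissionRegistry:
    """Hypothetical in-memory record of image digests, tracked per case file."""

    def __init__(self) -> None:
        # Maps (case_id, content digest) to the first filename stored.
        self._seen: Dict[Tuple[str, str], str] = {}

    def submit(self, case_id: str, filename: str, data: bytes) -> Optional[str]:
        """Record a new image for a case.

        Returns None if the image is new, or the filename of the earlier
        identical upload if this one is a byte-for-byte duplicate.
        """
        key = (case_id, hashlib.sha256(data).hexdigest())
        existing = self._seen.get(key)
        if existing is not None:
            return existing
        self._seen[key] = filename
        return None
```

In this sketch the check is scoped to a single case file, so the same photograph attached to two different applications is stored for both; whether duplicates should be detected across cases is a policy choice the code leaves open.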

Third, the Boston Public Library's board is expected to vote before the end of 2026 on a new digital asset management contract. The leading platforms under consideration each handle duplicate detection differently, and that technical distinction will shape how staff time is allocated for years.

For residents and advocates who rely on public records, from housing activists in Roxbury tracking development applications to journalists filing requests about MBTA capital projects, the practical consequence is response time. An agency managing a clean, deduplicated image library can fulfill requests faster and with greater accuracy than one wading through redundant files. The city's stated goal under its open government commitments is to respond to straightforward digital records requests within ten business days. Whether that target holds through the next storage migration will depend on the procurement decisions made in the coming months.

The Fourth of July is a useful marker on the calendar. City IT leadership has until the end of this fiscal quarter, September 30, to submit budget justifications for any new software spending. What happens next is less a technical question than a political one: who inside City Hall will champion the unglamorous work of digital housekeeping, and whether that work gets funded before a more visible crisis forces the issue.

About this article

Published by The Daily Boston

This article was produced by The Daily Boston editorial desk and covers news in Boston. See our editorial standards for how we use AI.
