Boston's municipal technology offices are sitting on a sprawling problem. Across city departments — from the Boston Planning & Development Agency's project files to the Boston Public Library's digital collections on Boylston Street — duplicate images have accumulated in public-facing databases, creating storage bottlenecks, complicating Freedom of Information requests, and raising questions about which version of a record is authoritative. The pressure to act is now acute, as the city pushes forward on a broader digital modernization effort tied to Mayor Michelle Wu's open-government commitments.
The timing matters because Boston is mid-cycle on a technology infrastructure overhaul. The city's Department of Innovation and Technology has been migrating legacy records systems since 2023, and duplicated image files don't just waste server space; they create legal exposure. When multiple versions of a permit photograph, inspection document, or planning map exist in a system, attorneys and auditors cannot always determine which copy is the official one. That ambiguity has already slowed some permit appeals before the Zoning Board of Appeal on City Hall Plaza, according to public meeting records from 2025.
Where the Backlog Is Worst
Two institutions illustrate the scope of the challenge. The Boston Public Library's Digital Repository, which houses more than 2 million images including historic photographs from the Leslie Jones Collection, has long flagged deduplication as an unfinished task in its annual reports. The BPL began a phased cleanup in fiscal year 2024 with a reported budget allocation of roughly $180,000, but the work has moved slowly against a collection that grows by tens of thousands of assets per year.
Separately, the BPDA's project database — which tracks development proposals across neighborhoods like Jamaica Plain and Dorchester, where housing production has been a centerpiece of Wu's agenda — contains thousands of site photographs submitted by developers. Staff there have identified duplicate submissions as a recurring friction point when preparing materials for community review meetings. Residents at Uphams Corner and along Washington Street in Dorchester have shown up to planning sessions with conflicting imagery pulled from the same public portal, sometimes depicting different stages of a project with no clear timestamp hierarchy.
The MBTA adds another dimension. The transit authority's capital project documentation, which feeds into public dashboards for riders tracking station renovation progress at stops like Assembly on the Orange Line, relies on image libraries maintained by multiple contractors simultaneously. Deduplication protocols between those contractors remain inconsistent, a gap noted in a 2024 Federal Transit Administration program review.
The Decisions That Cannot Wait
Three choices now sit on the table for city and institutional decision-makers. First, procurement: Boston must decide whether to license a commercial deduplication platform (options in this space run from roughly $40,000 to $200,000 annually depending on storage volume) or build out open-source tooling through its existing vendor relationships. Second, governance: no single city office currently has authority over image metadata standards across departments, meaning a fix in one silo can be undone by uploads from another. A formal records management directive would need to originate from the City Clerk's office or the Department of Innovation and Technology to have binding effect. Third, timeline: the Wu administration has committed to a broader digital equity and transparency report due before the end of fiscal year 2026, on June 30, 2026. Whether deduplication work gets folded into that report or treated as a separate operational matter will determine how much political attention and funding it attracts.
Advocates at the New England First Amendment Coalition, which monitors public records access across the region, have argued that clean, non-duplicated municipal image libraries are a prerequisite for meaningful transparency, not a back-office detail. The coalition has pushed similar arguments in Providence and Hartford, where deduplication efforts were tied directly to improvements in public records request turnaround times.
Boston's tech office is expected to release a systems audit covering digital asset management before September 2026. That document, once public, will be the clearest signal yet of whether the city intends to treat duplicate images as a genuine governance problem or let the backlog keep growing alongside the housing files, transit records, and library collections it is quietly distorting.