Boston's Office of Digital Innovation confirmed this week that a duplicate-image problem has slowed the rollout of the city's new public records archive, a project that was supposed to give residents and researchers streamlined access to municipal documents going back decades. The issue surfaced publicly on Tuesday, July 1, when city staff flagged that the ingestion pipeline had been accepting and storing duplicate scans, in some cases loading the same image file three or four times. The repeats bloated the system's storage and made retrieval searches unreliable.
The timing is awkward. Mayor Michelle Wu's administration has staked part of its open-government credentials on modernizing how Boston manages public data, and the archive project is central to that effort. The duplication problem doesn't kill the initiative, but it hands critics a tangible example of the implementation gaps that can emerge when ambitious digitization timelines collide with the messiness of decades-old paper records. Several city departments, including the Boston Inspectional Services Department and the Boston Planning Department, had already begun uploading permit documents and zoning maps through the new system.
Where the Problem Started
According to a project status update posted to the city's data portal on July 2, the duplication issue originated during a batch upload in late June involving records scanned at the Boston City Archives facility on Boylston Street in the Fenway. Staff there had been processing a backlog of documents from Jamaica Plain and Dorchester permitting files — neighborhoods at the center of the Wu administration's housing production push — and a misconfigured deduplication script failed to catch repeat submissions before they were committed to the database. The update did not specify exactly how many duplicate images were involved, but described the problem as affecting multiple upload batches totaling thousands of individual files.
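The city's update does not describe how its deduplication script works. For readers unfamiliar with the technique, one common approach is to fingerprint each file's contents with a cryptographic hash and reject any upload whose fingerprint is already on record. The Python sketch below illustrates that general idea only. The function names and the in-memory set standing in for the archive's database are hypothetical and are not drawn from the city's system.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def split_new_and_duplicate(paths, seen_digests):
    """Partition paths into first-seen files and byte-identical repeats.

    seen_digests is a set of digests already stored (a hypothetical
    stand-in for a database lookup); it is updated in place.
    """
    new_files, duplicates = [], []
    for path in paths:
        digest = file_digest(Path(path))
        if digest in seen_digests:
            duplicates.append(path)
        else:
            seen_digests.add(digest)
            new_files.append(path)
    return new_files, duplicates
```

A check like this catches only byte-for-byte copies of the same file. Two separate scans of the same paper page would produce different bytes and would pass through undetected, so they would require a different kind of comparison.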
The Boston Public Library's Leventhal Map and Education Center, which had been coordinating with city archivists on a parallel digitization project covering historical neighborhood maps, said its own pipeline was not affected because it uses a separate storage system. That distinction matters for researchers who rely on the Leventhal's collection, but it also illustrates how fragmented Boston's public digital infrastructure still is — different institutions running different systems, with limited interoperability.
What's Being Done — and What Comes Next
City technicians began a manual audit of the affected batches on Wednesday. The project update indicated that a corrected deduplication script was being tested in a staging environment and would be deployed to the production system no earlier than July 7, pending review. Until then, new uploads to the affected portions of the archive are paused.
The archive project has been running since January 2025 and is managed in part under a contract with a vendor named in the city's published procurement records. Those records list the full contract value as $1.4 million over two years. The duplicate-image episode, while fixable, adds an unwanted complication to a project already navigating the challenge of scanning and cataloguing paper records that were never designed for digital retrieval.
For residents and advocates who use public permitting data — housing attorneys in Dorchester, neighborhood groups tracking development along Washington Street in Jamaica Plain, researchers at Northeastern University or MIT pulling historical zoning records — the practical effect right now is a slower, less reliable search experience. Some documents that should appear in keyword searches are either surfacing multiple times or, in a handful of cases, returning errors because duplicate records created conflicting metadata entries.
The city's data portal team says the audit should be complete by mid-July, with a full status report expected before the end of the month. Anyone who submitted a public records request since June 20 and is waiting on documents that would have been pulled from the affected batches should expect a delay of at least two to three weeks, according to the July 2 project update. Requests can still be tracked through the city's Constituent Experience platform at boston.gov. Paper copies remain available through Inspectional Services at 1010 Massachusetts Avenue in the South End.