Boston Leads U.S. Cities in Purging Duplicate Public Images, but Lags Behind London and Amsterdam
As municipalities worldwide race to clean up redundant digital archives, Boston's approach offers both a model and a cautionary tale.

Boston's city government has quietly been working through one of the more unglamorous problems in modern municipal administration: tens of thousands of duplicate images cluttering public-facing digital records, property databases, and permit archives maintained by agencies including the Boston Planning Department and the Office of Digital Equity. The effort, accelerated under Mayor Michelle Wu's broader push to modernize city data infrastructure, has become a point of comparison for urban administrators in Chicago, Philadelphia, and several European capitals wrestling with the same problem.
The issue matters now because duplicate image data (redundant photographs attached to property records, zoning filings, and public works permits) inflates storage costs, slows database queries, and generates errors when automated systems process planning applications. Boston's Inspectional Services Department, which operates out of City Hall on Cambridge Street, maintains digitized records going back to a 2017 transition from paper filing. The backlog accumulated since that transition produced an estimated 40 percent image-duplication rate across permit records, according to city technology staff who have publicly discussed the cleanup effort in budget presentations.
The city contracted with its existing IT infrastructure vendor in early 2025 to run deduplication software across the Inspectional Services database, targeting records tied to active development corridors in Jamaica Plain and Dorchester — two neighborhoods where permit volumes surged alongside housing production initiatives. The Jamaica Plain work alone covers hundreds of parcels along Washington Street, where new multi-family construction has generated dense clusters of overlapping photographic documentation submitted by multiple contractors for the same sites.
Boston's approach relies on perceptual hashing, a technique that compares compact image fingerprints rather than exact pixel-by-pixel matches. That allows the system to flag near-duplicates that differ slightly in framing, resolution, compression, or lighting. The city's Digital Services team, based at 1 City Hall Square, began a pilot phase in March 2025 with a target of processing the Dorchester and Jamaica Plain permit libraries before expanding citywide. Staff have described the tool publicly as part of a broader open-data hygiene initiative rather than a standalone project.
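For readers curious how fingerprint comparison works in practice, the following is a minimal sketch of an average hash, one of the simplest perceptual hashes. It is not the city's tool, whose internals have not been described publicly. The function names, the list-of-rows grayscale input format, and the 5-bit match threshold are all assumptions made for illustration.

```python
from typing import List

# Hypothetical input format: rows of 0-255 grayscale pixel values.
Gray = List[List[int]]


def shrink(img: Gray, size: int = 8) -> List[List[float]]:
    """Box-average an image down to a size x size grid of cell means."""
    h, w = len(img), len(img[0])
    grid = []
    for r in range(size):
        r0 = r * h // size
        r1 = max((r + 1) * h // size, r0 + 1)  # keep at least one row per cell
        row = []
        for c in range(size):
            c0 = c * w // size
            c1 = max((c + 1) * w // size, c0 + 1)  # at least one column per cell
            block = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        grid.append(row)
    return grid


def average_hash(img: Gray, size: int = 8) -> int:
    """Fingerprint: one bit per cell, set when the cell is brighter than the grid mean."""
    cells = [v for row in shrink(img, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Gray, b: Gray, max_bits: int = 5) -> bool:
    """Flag a pair as near-duplicate; the 5-bit threshold is an illustrative assumption."""
    return hamming(average_hash(a), average_hash(b)) <= max_bits
```

Because each bit records only whether a region is brighter or darker than the image's own average, a uniformly brightened or rescaled copy tends to produce the same fingerprint, while a differently composed photograph does not. A production system would likely use a more robust hash and tune the threshold against hand-reviewed samples.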
Chicago's Department of Buildings launched a comparable cleanup in 2024, working through its integrated permit system, and Philadelphia's Office of Innovation and Technology completed a similar deduplication pass on its L&I permit records in late 2024 after a city controller's report flagged redundant files as a secondary cost driver in a storage contract renewal. Philadelphia's city controller published findings showing storage costs for permit imagery had grown 22 percent over three years, partly attributed to duplication.
London and Amsterdam are further along. Transport for London integrated automated deduplication into its asset management system in 2023, covering street-level infrastructure photographs used in maintenance planning. Amsterdam's municipal archive digitization program, which the city has documented publicly through its open data portal, built deduplication into its pipeline from the start of a 2021 digitization project covering planning records dating to the 1980s. Both cities benefited from building the cleanup function into new systems rather than retrofitting it onto legacy databases — the problem Boston now faces.
Boston's retrofit approach is slower and more expensive. IT budget documents presented to the City Council's Committee on Government Operations in fiscal year 2025 showed the deduplication contract valued at roughly $380,000 over 18 months — a figure that technology policy analysts who track municipal IT spending have noted is on the higher end for a city of Boston's size, reflecting the complexity of working with legacy Inspectional Services data structures. Singapore's Urban Redevelopment Authority has cited its own early-stage deduplication work as costing significantly less per record, largely because its planning database was built on a unified schema from the outset.
The practical stakes are not abstract. Duplicate images have been flagged in at least two instances by Boston planning staff as contributing to processing delays on variance applications in Dorchester, where the development pipeline is among the most active in the city. Architects and project managers who file frequently with Inspectional Services have noted in public testimony before the Zoning Board of Appeal that document management inconsistencies slow turnaround times.
The citywide expansion of the deduplication project is scheduled to roll through fiscal year 2026. Residents and developers who interact with the city's Accela permitting portal — the public-facing interface for building and zoning filings — should expect gradual improvements in search accuracy for historical permit records as the cleanup progresses neighborhood by neighborhood.
About this article
Published by The Daily Boston


