An Automatic Approach for Generating Rich, Linked Geo-Metadata from Historical Map Images

12/03/2021
by Zekun Li, et al.

Historical maps contain detailed geographic information that is difficult to find elsewhere and cover long periods of time (e.g., 125 years for the historical topographic maps in the US). However, these maps typically exist as scanned images without searchable metadata. Existing approaches to making historical maps searchable rely on tedious manual work (including crowd-sourcing) to generate the metadata (e.g., geolocations and keywords). Optical character recognition (OCR) software could reduce the required manual work, but the recognition results are individual words instead of location phrases (e.g., "Black" and "Mountain" vs. "Black Mountain"). This paper presents an end-to-end approach to address the real-world problem of finding and indexing historical map images. The approach automatically processes historical map images to extract their text content and generates a set of metadata that is linked to large external geospatial knowledge bases. The linked metadata, in the RDF (Resource Description Framework) format, supports complex queries for finding and indexing historical maps, such as retrieving all historical maps covering mountain peaks higher than 1,000 meters in California. We have implemented the approach in a system called mapKurator and evaluated it using historical maps from several sources with various map styles, scales, and coverage. Our results show significant improvement over state-of-the-art methods. The code has been made publicly available as modules of the Kartta Labs project at https://github.com/kartta-labs/Project.
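
To make the example query above concrete, here is a minimal sketch of how linked metadata expressed as subject-predicate-object triples could answer "maps covering mountain peaks higher than 1,000 meters in California". It is not mapKurator's implementation or schema: the identifiers (`map:sheet_A`, `place:peak_1`), the predicate names (`covers`, `elevation_m`, `locatedIn`), and all values are hypothetical and invented purely for illustration. A real deployment would presumably store RDF in a triple store and query it with SPARQL.

```python
# Illustrative sketch only: every identifier, predicate, and value below is
# made up. It mimics RDF-style triples with plain Python, not a real schema.
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: object


# Hypothetical linked metadata: maps point at places, places carry attributes
# that would come from an external geospatial knowledge base.
TRIPLES = [
    Triple("map:sheet_A", "covers", "place:peak_1"),
    Triple("map:sheet_B", "covers", "place:peak_2"),
    Triple("map:sheet_C", "covers", "place:peak_3"),
    Triple("place:peak_1", "type", "MountainPeak"),
    Triple("place:peak_1", "elevation_m", 1450),
    Triple("place:peak_1", "locatedIn", "California"),
    Triple("place:peak_2", "type", "MountainPeak"),
    Triple("place:peak_2", "elevation_m", 820),
    Triple("place:peak_2", "locatedIn", "California"),
    Triple("place:peak_3", "type", "MountainPeak"),
    Triple("place:peak_3", "elevation_m", 2100),
    Triple("place:peak_3", "locatedIn", "Nevada"),
]


def objects(triples, subject, predicate):
    """Return all objects for a given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def maps_covering_high_peaks(triples, region, min_elevation_m):
    """Find maps that cover a peak in `region` higher than `min_elevation_m`."""
    result = set()
    for t in triples:
        if t.predicate != "covers":
            continue
        place = t.obj
        is_peak = "MountainPeak" in objects(triples, place, "type")
        in_region = region in objects(triples, place, "locatedIn")
        is_high = any(e > min_elevation_m
                      for e in objects(triples, place, "elevation_m"))
        if is_peak and in_region and is_high:
            result.add(t.subject)
    return sorted(result)


print(maps_covering_high_peaks(TRIPLES, "California", 1000))
```

The point of the sketch is that the map record itself stores only a link to a place. Elevation and region come from the linked entity, which is what lets a query combine map coverage with attributes the map image never states.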

research
12/12/2021

Synthetic Map Generation to Provide Unlimited Training Data for Historical Map Text Detection

Many historical map sheets are publicly available for studies that requi...
research
06/29/2023

The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps

Scanned historical maps in libraries and archives are valuable repositor...
research
12/10/2021

A Label Correction Algorithm Using Prior Information for Automatic and Accurate Geospatial Object Recognition

Thousands of scanned historical topographic maps contain valuable inform...
research
09/13/2021

Project Pipeline: Preservation, Persistence, and Performance

Preservation pipelines demonstrate extended value when digitized content...
research
12/05/2020

Aligning geographic entities from historical maps for building knowledge graphs

Historical maps contain rich geographic information about the past of a ...
research
10/03/2021

Translating Images into Maps

We approach instantaneous mapping, converting images to a top-down view ...
research
04/06/2017

A Service-Oriented Architecture for Assisting the Authoring of Semantic Crowd Maps

Although there are increasingly more initiatives for the generation of s...
