Embedding-Based Multilingual Semantic Search for Geo-Textual Data in Urban Studies

Publication: Contribution to journal, Research article, Contributed, Peer-reviewed

Abstract

Urban studies increasingly rely on vast amounts of geo-textual data from social media and news archives. However, effectively searching and analyzing this data remains a significant challenge. Current keyword-based search methods struggle to capture the inherent vagueness, multilingualism, and dynamic nature of this information, thus limiting comprehensive urban analysis and insight discovery. To address this, we propose a novel and modular geospatial semantic search (GSS) workflow. This framework leverages multilingual text embeddings to enable effective semantic search across flexible geographical units, moving beyond the limitations of keyword matching. The GSS workflow integrates pre-computed document embeddings, adaptive spatial aggregation methods, and interactive map visualization to facilitate the exploration of search results. Our approach enables efficient semantic exploration of user-contributed mass data, improving the understanding of public expression in urban areas, monitoring thematic trends and localized events, and offering a scalable and privacy-aware design. We demonstrate these capabilities in a proof-of-concept using hexagonal bins for visualization.
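
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: rank geo-referenced documents against a query by embedding similarity, then aggregate the scores per spatial unit. Everything in it is an assumption made for illustration. The function names (`cosine`, `grid_cell`, `search_by_cell`) are hypothetical, the three-dimensional vectors stand in for pre-computed multilingual text embeddings, a square lon/lat grid stands in for the hexagonal bins mentioned above, and the per-cell mean is just one possible aggregation.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def grid_cell(lon, lat, size_deg):
    """Map a coordinate to a square grid cell (a stand-in for a hexagonal bin)."""
    return (math.floor(lon / size_deg), math.floor(lat / size_deg))


def search_by_cell(query_vec, docs, size_deg=0.01):
    """Return the mean query similarity of the documents in each grid cell.

    docs is an iterable of (lon, lat, embedding) tuples whose embeddings
    are assumed to have been computed ahead of time.
    """
    scores = defaultdict(list)
    for lon, lat, vec in docs:
        scores[grid_cell(lon, lat, size_deg)].append(cosine(query_vec, vec))
    return {cell: sum(vals) / len(vals) for cell, vals in scores.items()}


# Toy data: hypothetical coordinates and hand-written "embeddings".
docs = [
    (13.7401, 51.0501, [1.0, 0.0, 0.0]),
    (13.7402, 51.0502, [0.0, 1.0, 0.0]),
    (13.7701, 51.0801, [0.0, 1.0, 0.0]),
]
query = [1.0, 0.0, 0.0]
result = search_by_cell(query, docs)
```

Only the aggregated cell scores leave `search_by_cell`, not the individual documents, which is one simple way a design of this kind can limit exposure of single posts. In a real setting the toy vectors would come from a multilingual embedding model and the grid from a proper spatial index.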

Details

Original language: English
Article number: 31
Journal: Journal of Geovisualization and Spatial Analysis
Volume: 9
Issue number: 2
Publication status: Published - 10 July 2025
Peer-review status: Yes

External IDs

ORCID /0000-0003-2949-4887/work/202348963
ORCID /0000-0003-1157-7967/work/202353085

Keywords

  • Geo-textual data, Semantic similarity, Social media, Text embeddings, Urban studies