Embedding-Based Multilingual Semantic Search for Geo-Textual Data in Urban Studies
Publication: Contribution to journal › Research article › Contributed › Peer-reviewed
Contributors
Abstract
Urban studies increasingly rely on vast amounts of geo-textual data from social media and news archives. However, effectively searching and analyzing this data remains a significant challenge. Current keyword-based search methods struggle to capture the inherent vagueness, multilingualism, and dynamic nature of this information, thus limiting comprehensive urban analysis and insight discovery. To address this, we propose a novel and modular geospatial semantic search (GSS) workflow. This framework leverages multilingual text embeddings to enable effective semantic search across flexible geographical units, moving beyond the limitations of keyword matching. The GSS workflow integrates pre-computed document embeddings, adaptive spatial aggregation methods, and interactive map visualization to facilitate the exploration of search results. Our approach enables efficient semantic exploration of user-contributed mass data, improving the understanding of public expression in urban areas, monitoring thematic trends and localized events, and offering a scalable and privacy-aware design. We demonstrate these capabilities in a proof-of-concept using hexagonal bins for visualization.
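The core idea of the workflow described above (score pre-computed document embeddings against a query embedding, then aggregate the scores over spatial bins) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-dimensional vectors stand in for the output of a multilingual embedding model, the coordinates are assumed to be planar (already projected), and the names `cosine`, `hex_cell`, and `search_by_hexbin` are hypothetical. Mean cosine similarity per cell is one possible aggregation choice; the abstract does not specify which method is used.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hex_cell(x, y, size):
    """Axial (q, r) index of the pointy-top hexagon containing planar point (x, y).

    `size` is the hexagon's centre-to-corner distance, in the same units as x and y.
    """
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Cube-coordinate rounding: round all three axes, then repair the one
    # with the largest rounding error so that q + r + s == 0 still holds.
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return (rq, rr)


def search_by_hexbin(query_vec, docs, size):
    """Return {hex cell: mean similarity to the query} for docs given as (x, y, embedding).

    Only per-cell aggregates are returned, never individual documents.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, emb in docs:
        cell = hex_cell(x, y, size)
        sums[cell] += cosine(query_vec, emb)
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


if __name__ == "__main__":
    # Toy data: hypothetical embeddings, not produced by a real model.
    docs = [
        (0.0, 0.0, [1.0, 0.0, 0.0]),
        (0.1, 0.1, [0.0, 1.0, 0.0]),
        (10.0, 0.0, [0.6, 0.8, 0.0]),
    ]
    for cell, score in sorted(search_by_hexbin([1.0, 0.0, 0.0], docs, 1.0).items()):
        print(cell, round(score, 3))
```

In a real setting the embeddings would be computed once per document and stored, so that each new query requires only one embedding call plus the similarity and aggregation pass shown here.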
Details
| Original language | English |
|---|---|
| Article number | 31 |
| Journal | Journal of Geovisualization and Spatial Analysis |
| Volume | 9 |
| Issue number | 2 |
| Publication status | Published - 10 July 2025 |
| Peer-reviewed | Yes |
External IDs
| ORCID | /0000-0003-2949-4887/work/202348963 |
|---|---|
| ORCID | /0000-0003-1157-7967/work/202353085 |
Keywords
- Geo-textual data, Semantic similarity, Social media, Text embeddings, Urban studies