Embedding-Based Multilingual Semantic Search for Geo-Textual Data in Urban Studies
Research output: Contribution to journal › Research article › Contributed › peer-review
Abstract
Urban studies increasingly rely on vast amounts of geo-textual data from social media and news archives. However, effectively searching and analyzing this data remains a significant challenge. Current keyword-based search methods struggle to capture the inherent vagueness, multilingualism, and dynamic nature of this information, thus limiting comprehensive urban analysis and insight discovery. To address this, we propose a novel and modular geospatial semantic search (GSS) workflow. This framework leverages multilingual text embeddings to enable effective semantic search across flexible geographical units, moving beyond the limitations of keyword matching. The GSS workflow integrates pre-computed document embeddings, adaptive spatial aggregation methods, and interactive map visualization to facilitate the exploration of search results. Our approach enables efficient semantic exploration of user-contributed mass data, improving the understanding of public expression in urban areas, monitoring thematic trends and localized events, and offering a scalable and privacy-aware design. We demonstrate these capabilities in a proof-of-concept using hexagonal bins for visualization.
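As an illustration only, the core idea of the abstract (scoring pre-computed document embeddings against a query embedding, then aggregating the scores per spatial unit) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the three-dimensional vectors, example texts, and coordinates are invented; the function names `cosine`, `cell_of`, and `search_by_cell` are hypothetical; and a simple square grid stands in for the hexagonal bins used in the proof-of-concept.

```python
import math
from collections import defaultdict

# Toy stand-ins for pre-computed multilingual document embeddings.
# Real vectors would come from a multilingual embedding model; these
# 3-d values, texts, and coordinates are invented for illustration.
DOCS = [
    # (longitude, latitude, text, embedding)
    (13.737, 51.053, "Flohmarkt am Elbufer", [0.9, 0.1, 0.0]),
    (13.739, 51.056, "flea market by the river", [0.8, 0.2, 0.1]),
    (13.784, 51.032, "traffic jam on the bridge", [0.1, 0.9, 0.2]),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cell_of(lon, lat, size=0.01):
    """Map a coordinate to a square grid cell (assumed stand-in for a hex bin)."""
    return (math.floor(lon / size), math.floor(lat / size))


def search_by_cell(query_vec, docs, size=0.01):
    """Return the mean query similarity of the documents in each grid cell."""
    scores = defaultdict(list)
    for lon, lat, _text, emb in docs:
        scores[cell_of(lon, lat, size)].append(cosine(query_vec, emb))
    return {cell: sum(vals) / len(vals) for cell, vals in scores.items()}


# Hypothetical query embedding, assumed to lie close to the "market" documents.
query = [1.0, 0.0, 0.0]
result = search_by_cell(query, DOCS)
for cell, score in sorted(result.items(), key=lambda kv: -kv[1]):
    print(cell, round(score, 3))
```

Because only per-cell aggregates are returned, individual posts never need to be shown on the map, which is one way a privacy-aware design of this kind could be realized. The German and English example texts share a cell and similar vectors, mimicking how a multilingual embedding space places semantically similar texts close together regardless of language.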
Details
| Field | Value |
|---|---|
| Original language | English |
| Article number | 31 |
| Journal | Journal of Geovisualization and Spatial Analysis |
| Volume | 9 |
| Issue number | 2 |
| Publication status | Published - 10 Jul 2025 |
| Peer-reviewed | Yes |
External IDs
| Identifier | Value |
|---|---|
| ORCID | /0000-0003-2949-4887/work/202348963 |
| ORCID | /0000-0003-1157-7967/work/202353085 |
Keywords
- Geo-textual data, Semantic similarity, Social media, Text embeddings, Urban studies