RetroLive: Analysis of relational retrofitted word embeddings

Research output: Contribution to book/Conference proceedings/Anthology/Report > Conference contribution > Contributed > peer-review

Contributors

Abstract

Text values in relational database systems are a valuable source of information for analysis and machine learning (ML) tasks. Since ML techniques depend on numerical input representations, word embeddings are increasingly used to convert symbolic representations such as text into meaningful numerical vectors. However, those models do not incorporate the context-specific semantics of text values in the database. To significantly improve the representation of text values occurring in a DBMS, we propose a novel retrofitting approach called Retro, which considers both the semantics of the word embedding and the relational schema. Based on this, we developed RetroLive, an interactive system that allows users to explore how the retrofitted embeddings improve performance on various ML and integration tasks. Moreover, the demo includes several interactive visualizations for exploring the characteristics of the adapted vectors and their connection to the relational database.
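The abstract does not spell out how Retro combines the two signals, so the following is only a minimal illustrative sketch of the general retrofitting idea, not the Retro algorithm itself. It uses a generic neighbour-averaging update (in the style of Faruqui et al., 2015): each vector is pulled toward the vectors of related text values while staying close to its original embedding. The function name `retrofit`, the weights `alpha` and `beta`, the toy 2-d vectors, and the foreign-key link are all assumptions made up for illustration.

```python
def retrofit(original, edges, alpha=1.0, beta=1.0, iterations=10):
    """Generic retrofitting sketch (NOT the paper's Retro algorithm).

    original: dict mapping a text-value key to its original embedding (list of floats)
    edges:    pairs of keys that are related, e.g. via a hypothetical foreign key
    alpha:    weight that keeps a vector close to its original embedding
    beta:     weight that pulls a vector toward its related neighbours
    """
    # Build an undirected neighbour map from the relational links.
    neighbours = {key: set() for key in original}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Work on copies so the original embeddings are never mutated.
    vectors = {key: list(vec) for key, vec in original.items()}
    for _ in range(iterations):
        for key, nbrs in neighbours.items():
            if not nbrs:
                continue  # unlinked values keep their original embedding
            dim = len(original[key])
            total = [alpha * x for x in original[key]]
            for n in nbrs:
                for d in range(dim):
                    total[d] += beta * vectors[n][d]
            weight = alpha + beta * len(nbrs)
            vectors[key] = [t / weight for t in total]
    return vectors


# Toy, made-up embeddings for text values from two hypothetical columns.
original = {
    "movie.title:Alien": [1.0, 0.0],
    "director.name:Ridley Scott": [0.0, 1.0],
    "movie.title:Amelie": [-1.0, 0.0],
}
# Hypothetical foreign-key link between a movie and its director.
edges = [("movie.title:Alien", "director.name:Ridley Scott")]

retrofitted = retrofit(original, edges)
```

In this toy setup the two linked values move toward each other, while the unlinked value keeps its original vector. A schema-aware approach would presumably derive the links and their weights from the relational structure rather than from a hand-written edge list.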

Details

Original language: English
Title of host publication: Advances in Database Technology - EDBT 2020
Editors: Angela Bonifati, Yongluan Zhou, Marcos Antonio Vaz Salles, Alexander Bohm, Dan Olteanu, George Fletcher, Arijit Khan, Bin Yang
Publisher: OpenProceedings.org
Pages: 607-610
Number of pages: 4
ISBN (electronic): 9783893180837
Publication status: Published - 2020
Peer-reviewed: Yes

Publication series

Series: Advances in database technology : proceedings / EDBT ...
Volume: 2020-March

Conference

Title: 23rd International Conference on Extending Database Technology, EDBT 2020
Duration: 30 March - 2 April 2020
City: Copenhagen
Country: Denmark

External IDs

Scopus: 85084175541
ORCID: /0000-0001-8107-2775/work/142253450