XLIndy: Interactive recognition and information extraction in spreadsheets

Publication: Contribution to book/conference proceedings/anthology/report > Contribution to conference proceedings > Contributed > Peer-reviewed

Contributors

Abstract

Over the years, spreadsheets have established their presence in many domains, including business, government, and science. However, challenges arise because spreadsheets are only partially structured and carry implicit (visual and textual) information. This creates a bottleneck for the automatic analysis and extraction of information. Therefore, we present XLIndy, a Microsoft Excel add-in with a machine learning back-end, written in Python. It showcases our novel methods for layout inference and table recognition in spreadsheets. For a selected task and method, users can visually inspect the results, change configurations, and compare different runs, which enables iterative fine-tuning. Additionally, users can manually revise the predicted layout and tables and subsequently save them as annotations. These annotations are used to measure performance and (re-)train classifiers. Finally, data in the recognized tables can be extracted for further processing. XLIndy supports several standard formats, such as CSV and JSON.

Details

Original language: English
Title: Proceedings of the ACM Symposium on Document Engineering, DocEng 2019
Publisher: Association for Computing Machinery (ACM), New York
Pages: 25:1-25:4
Number of pages: 4
ISBN (electronic): 978-1-4503-6887-2
Publication status: Published - 23 Sept. 2019
Peer-reviewed: Yes

Publication series

Series: DocEng: Document Engineering

Conference

Title: 19th ACM Symposium on Document Engineering, DocEng 2019
Duration: 23 - 26 September 2019
City: Berlin
Country: Germany

External IDs

dblp conf/doceng/KociKLOTGL019
ORCID /0000-0001-8107-2775/work/142253491
ORCID /0000-0002-5985-4348/work/162348855

Keywords

  • Add-in, Annotation, Excel, Information extraction, Interactive, Layout inference, Spreadsheets, Table recognition