Automated Patent Categorization and Guided Patent Search using IPC as Inspired by MeSH and PubMed

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

Abstract

Document search on PubMed, the pre-eminent database for biomedical literature, relies on the annotation of its documents with relevant terms from the Medical Subject Headings ontology (MeSH) for improving recall through query expansion. Patent documents are another important information source, though they are considerably less accessible. One option to expand patent search beyond pure keywords is the inclusion of classification information: since every patent is assigned at least one class code, it should be possible to use these assignments automatically, in a similar way to the MeSH annotations in PubMed. In order to develop a system for this task, it is necessary to have a good understanding of the properties of both classification systems. This report describes our comparative analysis of MeSH and the main patent classification system, the International Patent Classification (IPC). We investigate the hierarchical structures as well as the properties of the terms and classes, respectively, and we compare the assignment of IPC codes to patents with the annotation of PubMed documents with MeSH terms. Our analysis shows a strong structural similarity of the hierarchies, but significant differences of terms and annotations. The low number of IPC class assignments and the lack of occurrences of class labels in patent texts imply that current patent search is severely limited. To overcome these limits, we evaluate a method for the automated assignment of additional classes to patent documents, and we propose a system for guided patent search based on the use of class co-occurrence information and external resources.

Details

Original language: English
Pages (from - to): S3
Journal: Journal of Biomedical Semantics
Volume: 2013
Issue number: 4 Suppl 1
Publication status: Published - 15 Apr 2013
Peer-review status: Yes

External IDs

PubMed Central: PMC3632996
Scopus: 84948117428
ORCID: /0000-0003-2848-6949/work/141543378

Keywords

Library keywords