Robust named entity detection from optical character recognition output

Abstract: In this paper, we focus on information extraction from optical character recognition (OCR) output. Since the content from OCR inherently has many errors, we present robust algorithms for information extraction from OCR lattices instead of merely looking them up in the top-choice (1-best) OC...

Authors:

Subramanian, Krishna [author]

Prasad, Rohit [author]

Natarajan, Prem [author]

Format:

E-article

Language:

English

Published:

2011

Keywords:

Optical character recognition

Hidden Markov Model

Information extraction

Named entity detection

Parent work:

Contained in: International journal on document analysis and recognition - Berlin : Springer, 1998, 14(2011), 2, 13 Apr., pages 189-200

volume:14 ; year:2011 ; number:2 ; day:13 ; month:04 ; pages:189-200

DOI / URN:

10.1007/s10032-011-0150-z

Catalog ID:

SPR008121966
