Document Understanding

Document layout analysis (DLA) plays an important role in identifying and classifying the regions of digital documents for Document Understanding tasks. To support this, SciBank provides annotated data covering text (abstract, text block, caption, keywords, reference, section, subsection, title), tables, figures, and equations (isolated and inline) from 74,435 pages of scientific articles. Human curators validated that these 12 region types were properly labeled.
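As an illustration, the 12 region types described above can be written as a small label taxonomy. This is a minimal sketch only: the label strings and the names `REGION_GROUPS`, `REGION_LABELS`, and `LABEL_TO_GROUP` are hypothetical and may not match the identifiers used in the SciBank annotation files.

```python
# Hypothetical label names; the dataset's actual annotation strings may differ.
REGION_GROUPS = {
    "text": [
        "abstract", "text_block", "caption", "keywords",
        "reference", "section", "subsection", "title",
    ],
    "table": ["table"],
    "figure": ["figure"],
    "equation": ["isolated_equation", "inline_equation"],
}

# Flat list of all region labels, in group order.
REGION_LABELS = [
    label for labels in REGION_GROUPS.values() for label in labels
]

# Reverse lookup from a region label to its coarse group.
LABEL_TO_GROUP = {
    label: group
    for group, labels in REGION_GROUPS.items()
    for label in labels
}
```

A mapping like this makes it easy to check that an annotation file uses only the expected region types, or to collapse the fine-grained labels into the four coarse groups when training a simpler detector.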
