libscantools is a library for graphics manipulation, written with a view towards
the handling of scanned documents and the generation of high-quality PDF files.
The library is written in C++ and makes heavy use of Qt5.

The development of libscantools currently concentrates on the production of
high-quality, well-compressed and standards-compliant PDF/A files; PDF/A is the
ISO standard for long-term archiving of digital documents. More features,
including graphics manipulation and scanner access, will be added in the future.

* Conversion of images to PDF/A format. HOCR files, which are produced by
optical character recognition programs such as 'tesseract', can optionally be
used to make the PDF file searchable.
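
To illustrate what an HOCR file contributes: in the hOCR format, each recognized element carries a `title` attribute with semicolon-separated properties, one of which is a bounding box of the form `bbox x0 y0 x1 y1`. The following is a minimal, standalone sketch of extracting that box from a title string. The names `BBox` and `parseBBox` are hypothetical and not part of libscantools, which reads HOCR files through its own classes.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper type for illustration; not part of libscantools.
struct BBox {
    int x0, y0, x1, y1;
};

// Looks for the "bbox x0 y0 x1 y1" property in an hOCR title attribute,
// e.g. "bbox 36 92 618 184; x_wconf 95".
// Returns true and fills 'result' on success; leaves 'result' untouched
// if no well-formed bbox property is present.
bool parseBBox(const std::string& title, BBox& result)
{
    std::istringstream properties(title);
    std::string property;
    // Properties inside the title attribute are separated by semicolons.
    while (std::getline(properties, property, ';')) {
        std::istringstream fields(property);
        std::string key;
        BBox box{};
        if ((fields >> key) && key == "bbox"
            && (fields >> box.x0 >> box.y0 >> box.x1 >> box.y1)) {
            result = box;
            return true;
        }
    }
    return false;
}
```

Calling `parseBBox("bbox 36 92 618 184; x_wconf 95", box)` fills `box` with the four coordinates, while a title without a complete `bbox` property makes the function return `false`.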

First-time users will likely want to look at the following classes and
namespaces.

* The class 'PDFAWriter' generates well-crafted, PDF/A-2b compliant
documents. Just construct a PDFAWriter instance, then add graphic files and HOCR
files to create a searchable PDF file. Files in JBIG2 and JPEG format, as well
as JPEG2000 files in JPX format, are written directly into the PDF. All other
graphic files are converted to RGB and encoded losslessly in a way that depends
on the image characteristics. Multi-page TIFF files are well supported.

* The class 'HOCRDocument' reads and interprets HOCR files, the standard output
file format for Optical Character Recognition systems. It converts HOCR files
to text, or renders them on any QPaintDevice.

* The namespace 'compression' gives access to zlib and Fax G4, as well as
state-of-the-art zopfli compression routines, all implemented in a thread-safe
manner.
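
The format-dependent handling described for 'PDFAWriter' can be pictured as a simple dispatch: some formats are embedded as they are, everything else is converted and re-encoded. The sketch below is only an illustration of that rule. The names `Handling` and `handlingFor` are hypothetical, and deciding by file suffix is an assumption made to keep the example short; it does not describe how libscantools itself detects formats.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical names for illustration; not part of the libscantools API.
enum class Handling {
    EmbedDirectly,        // JBIG2, JPEG, JPX: image data is written into the PDF as-is
    ConvertAndRecompress  // everything else: convert to RGB, then encode losslessly
};

// Decides how a file would be handled, judging only by its suffix.
// Assumption: suffix matching stands in for real format detection.
Handling handlingFor(const std::string& fileName)
{
    std::string suffix;
    const std::string::size_type dot = fileName.rfind('.');
    if (dot != std::string::npos) {
        suffix = fileName.substr(dot + 1);
    }
    // Compare case-insensitively, so "SCAN.JPG" and "scan.jpg" are treated alike.
    std::transform(suffix.begin(), suffix.end(), suffix.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (suffix == "jb2" || suffix == "jbig2" || suffix == "jpg"
        || suffix == "jpeg" || suffix == "jpx") {
        return Handling::EmbedDirectly;
    }
    return Handling::ConvertAndRecompress;
}
```

With this sketch, `handlingFor("scan.JPG")` yields `Handling::EmbedDirectly`, while a TIFF or PNG file falls through to `Handling::ConvertAndRecompress`.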

The API is currently experimental and subject to change. We expect that the API
will stabilize with the 1.0.0 release.