Scan Server for Digital Long-Term Archiving

Nowadays, most companies no longer want to waste time and money filling windowless rooms with paper folders or hiring employees to search for paper documents. It is not only in large companies that more and more executives are recognizing the advantages of digital archiving. But how should it be implemented? Some say to leave it to the manufacturers of the scanning equipment, while others think that more is needed.

Is scanning alone enough?

In most companies, scanning paper documents has become a routine task when it comes to incoming mail. For this purpose, multifunction printers (MFPs) or high-performance scanners are used, depending on the type and quantity of paper documents received.

In most cases, the scanned images are created as black and white TIFF files, the typical format used by fax machines. In special cases, such as the scanning of checks or identity photos, the file is generated in color. However, color scans are usually avoided because the resulting TIFF files are either too large or the JPEG compression visibly reduces the image quality.
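To give a feeling for the size difference, the following sketch computes the uncompressed size of a single A4 page. The resolution of 300 dpi and the 24-bit RGB color depth are assumptions chosen for illustration; real files also carry headers and are usually compressed.

```python
# Uncompressed size of one scanned A4 page (210 mm x 297 mm).
# Assumptions for illustration: 300 dpi, no row padding, no file headers.
MM_PER_INCH = 25.4

def raw_page_bytes(width_mm, height_mm, dpi, bits_per_pixel):
    """Return the uncompressed image size in bytes for one page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8

bilevel = raw_page_bytes(210, 297, 300, 1)   # black and white, 1 bit per pixel
color = raw_page_bytes(210, 297, 300, 24)    # RGB, 8 bits per channel

print(f"black and white: {bilevel / 1e6:.1f} MB")  # about 1.1 MB
print(f"color:           {color / 1e6:.1f} MB")    # about 26.1 MB
```

Before any compression, the color page is 24 times larger than the black and white page, which is why the choice of compression method matters so much for color scans.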

However, good image quality is an important condition for a good text recognition rate. Achieving good image quality at a high compression ratio requires processing power that local multifunction printers usually do not have. Separate scanning software can offer significant advantages in this regard.

As a rule, the individual processing steps (text recognition, compression, PDF/A generation and digital signature) cannot all be performed by the scanner itself, because the metadata is often added afterwards at an indexing station. If the scanner has already signed the file, this later step breaks the seal of the digital signature and renders it worthless. Here again, separate software offers a key advantage.
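The effect can be illustrated with a plain hash. This is a simplified sketch, not a real signature: an actual PAdES signature uses asymmetric keys and covers a defined byte range of the PDF. The `embed_metadata` helper and the `InvoiceNo()` placeholder are hypothetical.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """SHA-256 digest, standing in for the integrity check of a signature."""
    return hashlib.sha256(data).hexdigest()

def embed_metadata(document: bytes, invoice_no: str) -> bytes:
    """Hypothetical indexing step: rewrite the file with a filled-in field."""
    return document.replace(b"InvoiceNo()", f"InvoiceNo({invoice_no})".encode())

scan = b"%PDF-1.7 InvoiceNo() ...page image data..."

# Wrong order: seal at the scanner, add metadata afterwards.
sealed_early = fingerprint(scan)
indexed = embed_metadata(scan, "4711")
early_seal_valid = fingerprint(indexed) == sealed_early

# Right order: add metadata first, seal as the last step.
final = embed_metadata(scan, "4711")
sealed_late = fingerprint(final)
late_seal_valid = fingerprint(final) == sealed_late

print(early_seal_valid)  # False: the file changed after sealing
print(late_seal_valid)   # True: nothing changed after sealing
```

The practical consequence is that whichever component applies the signature has to run after indexing, which a scanner on its own cannot guarantee.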

PDF/A – a universal document standard

The PDF/A standard is now widely used in incoming mail processing. It offers the following important advantages over traditional image formats such as TIFF and JPEG:

Standardized format: PDF/A is suitable for storing both scanned and digitally created documents.

High compression ratio: The PDF/A standard supports more modern and more powerful compression methods, which results in small file sizes even for color images.

Text recognition: The created PDF/A documents can be made searchable by embedding the text produced by an OCR engine.

Integrated metadata: So that the document and its associated metadata form an inseparable whole, the metadata is embedded in the PDF/A file. For this purpose, PDF/A uses the Extensible Metadata Platform (XMP) format, which, like PDF/A, is defined as an ISO standard of its own.

Digital signature: To guarantee the integrity and authenticity of the created documents, a digital signature can be applied to the PDF/A document in accordance with the PAdES standard. The digital signature is a type of electronic signature that can serve the same purpose as a handwritten signature, provided that the applicable legal requirements (national signature laws) are met.

In principle, TIFF documents can offer all these advantages too, but only through proprietary extensions, since the TIFF standard itself does not provide for them.
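As an illustration of the integrated metadata mentioned above, the following sketch builds a minimal XMP block containing the PDF/A identification properties and a title. The function name `build_xmp` and the sample title are made up for this example, and the surrounding `xpacket` wrapper that a real file would carry is left out for brevity.

```python
from xml.sax.saxutils import escape

# Minimal XMP metadata block with the PDF/A identification schema (pdfaid)
# and a Dublin Core title. Sketch only; the xpacket wrapper is omitted.
XMP_TEMPLATE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>{part}</pdfaid:part>
      <pdfaid:conformance>{conformance}</pdfaid:conformance>
      <dc:title>
        <rdf:Alt><rdf:li xml:lang="x-default">{title}</rdf:li></rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

def build_xmp(title: str, part: int = 2, conformance: str = "B") -> str:
    """Return an XMP block declaring, for example, PDF/A-2b and a title."""
    return XMP_TEMPLATE.format(
        part=part,
        conformance=conformance,
        title=escape(title),  # escape &, < and > so the XML stays well-formed
    )

xmp = build_xmp("Invoice 4711 & delivery note")
print(xmp)
```

Because this block travels inside the PDF/A file, index values such as a title or an invoice number cannot become separated from the document they describe.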

What can a central scan server do?

A scan server is a central service that converts the locally scanned files and their associated index files from across the company into the standardized PDF/A format. To do this, the service takes over all tasks that the local scanning station can delegate to it. The solution is particularly suitable for processing steps that require no user interaction or that would slow down the local scanning station with CPU-intensive operations (OCR, compression).
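How such a service might pick up its work can be sketched as follows. The naming convention is an assumption made for this example (a scan `name.tif` belongs to an index file `name.xml` in the same inbox folder), as is the function name `pair_jobs`; the article does not describe how the actual product matches the files.

```python
import tempfile
from pathlib import Path

def pair_jobs(inbox: Path):
    """Yield (image, index) pairs; images without an index file are skipped,
    because the indexing station has not finished with them yet."""
    for image in sorted(inbox.glob("*.tif")):
        index = image.with_suffix(".xml")
        if index.exists():
            yield image, index

# Small demonstration with a temporary inbox folder.
with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp)
    for name in ("a.tif", "a.xml", "b.tif"):  # b.tif has no index file yet
        (inbox / name).write_bytes(b"")
    ready = [(img.name, idx.name) for img, idx in pair_jobs(inbox)]

print(ready)  # [('a.tif', 'a.xml')]
```

Processing only complete pairs keeps the server from converting a scan before its metadata is available.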

The main functions of this service are:

Text and barcode recognition: The scanned image files must be searchable. The service can use the 3-Heights® OCR service to recognize the text in an image file and embed it in the converted file so that it becomes searchable. Detected barcodes can be used in different ways: in the text search, as part of the embedded metadata, or to control processing within the service (output file name, page separation, etc.).

Compression: Color images are split into several layers, which are then strongly compressed without visible loss using the MRC (Mixed Raster Content) method.

Metadata integration: The PDF/A standard requires that metadata be embedded in the document in the form of XMP packets. The service provides this function.

Creation of PDF/A: The service creates output documents of one or more pages in accordance with the ISO 19005 series of standards. Parts 1 to 3 of the standard (PDF/A-1, PDF/A-2 and PDF/A-3) are supported.

Digital signature: The signature can be advanced or qualified, and suitable for long-term storage or simply for exchange. It can also contain a time stamp. Alternatively, a time stamp alone can be applied instead of a personal signature. To create a digital signature, the service can use a cryptographic infrastructure (USB token, HSM) via a standard interface (PKCS#11).
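The barcode-controlled page separation mentioned in the list above can be sketched in a few lines. This assumes that barcode recognition has already produced one value (or `None`) per page and that a page carrying the value `"SEP"` is a separator sheet; both the value and the function name are hypothetical.

```python
def split_on_separator(pages, separator="SEP"):
    """Group page ids into documents.

    pages: list of (page_id, barcode_or_None) in scan order.
    A page carrying the separator barcode ends the current document
    and is itself dropped from the output.
    """
    documents, current = [], []
    for page_id, barcode in pages:
        if barcode == separator:
            if current:               # ignore consecutive separator sheets
                documents.append(current)
            current = []
        else:
            current.append(page_id)
    if current:                       # last document has no trailing separator
        documents.append(current)
    return documents

batch = [(1, None), (2, None), (3, "SEP"), (4, None),
         (5, "SEP"), (6, "SEP"), (7, None)]
print(split_on_separator(batch))  # [[1, 2], [4], [7]]
```

Each resulting group would then become one output document.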

A typical sequence would look like this:
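As a rough sketch, the functions listed above can be chained into a fixed order of steps. The order shown is an assumption derived from the list and from the earlier point that the signature must come after the metadata; the step functions are placeholders that only record their name and do not call any real product API.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    """One scanned document on its way through the server (illustrative)."""
    name: str
    steps: list = field(default_factory=list)

def step(label):
    """Create a placeholder processing step that only records its label."""
    def apply(job: Job) -> Job:
        job.steps.append(label)
        return job
    return apply

# Assumed order: the signature is last so that no later step invalidates it.
PIPELINE = [
    step("recognize text and barcodes"),
    step("compress images (MRC)"),
    step("embed XMP metadata"),
    step("write PDF/A"),
    step("apply digital signature"),
]

def run(job: Job) -> Job:
    for apply in PIPELINE:
        job = apply(job)
    return job

result = run(Job("invoice-4711"))
print(result.steps)
```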
