
The i‑net PDFC is a tool specifically for comparing two PDF files (or folders containing PDF files) for content differences. It is useful for comparing the output of two different PDF generator programs, e.g. OpenOffice and Adobe PDF Writer, or different versions of one program.
Thereby the PDF content comparer, as the name states, does not do a pixel-based or structural comparison but an element based check for differences in the documents.
The following elements are compared and any differences logged:
These differences each have a configurable tolerance value so that minor differences can be ignored if necessary. For the eye it does not matter whether a line or box is being miss-placed by a couple of pixels, as long the line is there and looks like intended.
PDFC [-c <config file>] [-[i][o]] [<Folder1> <Folder2> | <File1> <File2>]
| Parameter | Description |
|---|---|
| -c | Specify a configuration file (config.xml) for PDFC. If none is specified, the default “config.xml” is taken |
| -i | Creates diff images in <Folder1>/differences for any differences found (recommended for a graphical comparison) |
| -o | Creates images for each page of each version (need only be used for debug purposes) |
Note that if using two folders, the PDF files must have the same names in each folder.
PDFC -i CRFolder CCFolder
This would compare all PDF files in the folder “CRFolder” with the PDF files of the same name in the folder “CCFolder”.
The following tolerance values can be set in the config.xml file:
| Property | Description |
|---|---|
CHART_DENSITY_THRESHOLD | (Decimal) density threshold: ((number of shapes)^3 / area size) |
CHART_REMOVAL_MARGIN | (Decimal) percent of shape height to use as margin for removing PDF elements above and below detected charts |
CREATE_DIFFIMAGES | True to create png files with the marked difference of the compared pages |
CREATE_ORIGIMAGES | True to create a png file for each page that is compared |
LOG_LEVEL | Level for Logging (OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE, ALL). The default is set to WARN |
MAX_ERRORS_PER_REPORT | maximum number of errors that can occur before the comparison is canceled for the current pdf file. |
MAX_WORD_DIFFERENCES | maximum number of differences that can occur before the comparison is canceled |
MODULES | comma separated list of modules to be executed for each page |
NORMALIZERS | comma separated list of normalizers to be executed before and after each page |
TOLERANCE_BOX_ROUND_EDGES | (Integer) maximum number of pixels that a curve control point may differ in total |
TOLERANCE_IMAGE_DISTANCE | maximum number of pixels that the position of an image can differ |
TOLERANCE_IMAGE_SIZE | maximum difference in percent, that the area spanned by an image may differ |
TOLERANCE_LINE_POSITION | (Decimal) maximum number of pixels that the position of a line or curves can differ per axis |
TOLERANCE_LINE_SIZE | (Integer) maximum number of pixels that the length of a line can differ in total |
TOLERANCE_LINE_STYLE | (Boolean) if true, different stroke styles will be an error |
TOLERANCE_LINE_THICKNESS | (Integer) maximum difference in stroke thickness of two lines or curves |
TOLERANCE_PAGE_LEFTCORNER | maximum number of pixels that the left or top margin of a page can differ (is the upper left corner of all elements) |
TOLERANCE_PAGE_RATIO | tolerance for the aspect ratio of the pdf page |
TOLERANCE_PAGE_SIZE | maximum number of pixels that the width or height of a page can differ |
TOLERANCE_UNDERLINE_LENGTH | (Decimal) the maximum difference in percent, which the length of underlines may differ |
A good example on how i‑net PDFC will help is the decision for a reporting.
E.g, you're using Crystal Reports but find it to expensive to keep and you're looking for a replacement that is much cheaper but has the same capabilities. And, more important, your old reports look the same after migration to the new system.
Using i‑net PDFC you can compare the PDF output of your favorite reporting systems and make an objective decision.
i‑net PDFC bundles two external components that underly the Apache Software License 1.1 and LGPL 3.0 (see downloaded file).
All other components are bound to the proprietary license which you do accept by downloading the software.
If you have any questions, problems or suggestions, please do not hesitate to contact tools@inetsoftware.de for technical support.