veraPDF CLI Configuration


Below the veraPDF installation directory there is a sub-directory called config. This contains the XML configuration files for the veraPDF software components. To see the contents of this directory, open a terminal session in the installation root directory and type ls config/ on Mac or Linux machines, or dir config on Windows machines. On my Windows test VM this outputs the following:

C:\Users\cfw\verapdf>dir config
 Volume in drive C has no label.
 Volume Serial Number is 1C45-2074

 Directory of C:\Users\cfw\verapdf\config

22/01/2017  12:44    <DIR>          .
22/01/2017  12:44    <DIR>          ..
22/01/2017  12:44               411 app.xml
22/01/2017  12:44               186 features.xml
22/01/2017  12:44               109 fixer.xml
22/01/2017  12:44               131 validator.xml
               4 File(s)            837 bytes
               2 Dir(s)   3,695,038,464 bytes free

If you can’t see any files then it’s likely you’ve not run the application since installing it. The software generates default configuration files on start-up if none exist. Try running verapdf --version, which should generate the missing files.

Running veraPDF without installation gives no configuration files

All of the above assumes that you’ve installed veraPDF with the downloaded installer. If you’re running a version of the application that you’ve built yourself and not installed, you won’t have an application home directory. The following only applies if you’re running a jar directly from the command line, that is, something like java -jar target/1.8.1.jar from the veraPDF-apps/gui module. The problem is that the installer adds a couple of invocation scripts that set up the application home directory, and a direct jar invocation bypasses them. The solution is to choose a config directory and pass it to the application when you call it. Here’s an example:

  1. Select a folder you want to use as the application home and create it. A good suggestion is .verapdf beneath your home directory (~/.verapdf), in my case /home/cfw/.verapdf.
  2. Execute the following command: java -Dapp.home="/home/cfw/.verapdf" -jar gui-1.8.1-SNAPSHOT.jar --version
  3. List the generated files: ls -l ~/.verapdf/config

and you should see

-rw-rw-r-- 1 cfw cfw 375 Jan 28 20:36 app.xml
-rw-rw-r-- 1 cfw cfw 186 Jan 28 20:36 features.xml
-rw-rw-r-- 1 cfw cfw 109 Jan 28 20:36 fixer.xml
-rw-rw-r-- 1 cfw cfw   0 Jan 28 20:36 plugins.xml
-rw-rw-r-- 1 cfw cfw 131 Jan 28 20:36 validator.xml

Now proceed to use the config files in this directory, which will work as long as you use java -Dapp.home="/home/cfw/.verapdf" -jar gui-1.8.1-SNAPSHOT.jar as opposed to java -jar gui-1.8.1.jar when you start the app. This form of invocation supports all command line options, e.g.

  • java -Dapp.home="/home/cfw/.verapdf" -jar gui-1.8.1-SNAPSHOT.jar -f 1b somefile.pdf
  • java -Dapp.home="/home/cfw/.verapdf" -jar gui-1.8.1-SNAPSHOT.jar --extract somefile.pdf
  • java -Dapp.home="/home/cfw/.verapdf" -jar gui-1.8.1-SNAPSHOT.jar --policyfile my-policy.sch somefile.pdf
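If you launch the jar from scripts, it can help to assemble the command in one place so the -Dapp.home option is never forgotten. The following is only a minimal sketch in Python: the function name build_verapdf_command is my own invention, and the paths and jar name are simply the ones from the examples above.

```python
import os
import shlex


def build_verapdf_command(app_home, jar_path, cli_args):
    """Return the argument list for running a veraPDF jar with an explicit app.home.

    app_home may start with ~, which is expanded to the user's home directory.
    """
    home = os.path.expanduser(app_home)
    # -Dapp.home must come before -jar so that the JVM treats it as a system property.
    return ["java", "-Dapp.home=" + home, "-jar", jar_path] + list(cli_args)


if __name__ == "__main__":
    cmd = build_verapdf_command(
        "/home/cfw/.verapdf", "gui-1.8.1-SNAPSHOT.jar", ["-f", "1b", "somefile.pdf"]
    )
    # Print a shell-quoted form of the command for inspection.
    print(shlex.join(cmd))
```

The returned list can be handed straight to subprocess.run, which avoids any shell quoting problems with paths that contain spaces.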

veraPDF config files

There are four config files available:

  • app.xml configures the veraPDF CLI and GUI applications;
  • validator.xml sets defaults for PDF/A validation;
  • fixer.xml provides configuration of the metadata fixer; and
  • features.xml configures feature extraction.

The sections below give a brief overview of these files and their options.

Configuring the veraPDF application

A default application config file looks like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<appConfig type="VALIDATE" maxFails="100" isOverwrite="false" format="MRR" isVerbose="false">


The appConfig element has a set of attributes that can be used as follows:

  • type controls the default processing model for the GUI, legal values are:
    • VALIDATE : PDF/A validation.
    • VALIDATE_FIX : PDF/A validation and metadata fixing.
    • EXTRACT : Feature extraction.
    • VALIDATE_EXTRACT: PDF/A validation and feature extraction.
    • EXTRACT_FIX : PDF/A validation, feature extraction and metadata fixing.
    • POLICY : Policy checking, this also enables PDF/A validation and feature extraction as the policy checker depends upon them.
    • POLICY_FIX : Policy checking and metadata fixing, again PDF/A validation and feature extraction are also enabled.
  • maxFails specifies how many failed tests are reported per validation rule.
  • isOverwrite tells the application whether to overwrite existing result files.
  • format chooses the default reporting format, valid values are:
    • MRR : machine readable report, an XML file that has been formatted for machine parsing and reporting.
    • XML : the raw XML data used by the veraPDF APIs, it’s not quite as readable as the MRR format but can be de-serialised by the veraPDF API for further processing.
    • HTML : a formatted HTML report intended for human consumption.
    • TEXT : very brief single line text output.
  • isVerbose can be set to false for brief output, which is the default, or true for verbose output.
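If you edit app.xml by hand, a quick sanity check of the attribute values can save a confusing start-up. Here is a rough sketch in Python that reads the appConfig attributes described above and rejects values outside the listed sets. The function name read_app_config is hypothetical, and the sample document is abbreviated to the root element with its attributes only.

```python
import xml.etree.ElementTree as ET

# Legal values for the type and format attributes, as listed above.
LEGAL_TYPES = {
    "VALIDATE", "VALIDATE_FIX", "EXTRACT", "VALIDATE_EXTRACT",
    "EXTRACT_FIX", "POLICY", "POLICY_FIX",
}
LEGAL_FORMATS = {"MRR", "XML", "HTML", "TEXT"}


def read_app_config(xml_text):
    """Parse an appConfig document and return its attributes as typed values.

    Raises ValueError for an unexpected root element or an illegal type or format.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag != "appConfig":
        raise ValueError("expected an appConfig root element, got " + root.tag)
    process_type = root.get("type")
    report_format = root.get("format")
    if process_type not in LEGAL_TYPES:
        raise ValueError("illegal type: " + str(process_type))
    if report_format not in LEGAL_FORMATS:
        raise ValueError("illegal format: " + str(report_format))
    return {
        "type": process_type,
        "maxFails": int(root.get("maxFails")),
        "isOverwrite": root.get("isOverwrite") == "true",
        "format": report_format,
        "isVerbose": root.get("isVerbose") == "true",
    }


# Abbreviated sample: only the root element and its attributes.
SAMPLE_APP_CONFIG = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<appConfig type="VALIDATE" maxFails="100" isOverwrite="false" '
    'format="MRR" isVerbose="false"/>'
)

if __name__ == "__main__":
    print(read_app_config(SAMPLE_APP_CONFIG))
```

This only checks the attributes documented here. It says nothing about how veraPDF itself reacts to a malformed file.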


The fixerFolder element sets a default folder where the repaired files generated by the metadata fixer are written.


The wikiPath element defines the base URL used to create reference links in the HTML report. You’re unlikely to want to change this unless you intend to host your own local version of the veraPDF validation rule wiki.


The reportFile element defines the default name for report files generated by the application.


The reportFolder element sets the folder where report files generated by the application are written.


The policyFile element defines the default policy file to be applied by the veraPDF policy checker.

Configuring PDF/A validation

The default validation config file contains:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<validatorConfig flavour="NO_FLAVOUR" recordPasses="false" maxFails="-1"/>

The validatorConfig element

The validatorConfig element defines the following attributes:

  • flavour: the default flavour to use when none is specified by the user. It can be PDF_A_1A, PDF_A_1B, PDF_A_2A, PDF_A_2B, PDF_A_2U, PDF_A_3A, PDF_A_3B, PDF_A_3U, PDF_UA1, or NO_FLAVOUR (for automatic detection).
  • recordPasses: set to true to report passed validation checks, or false to report failures only.
  • maxFails: specifies the maximum number of failed checks before validation is terminated. The default value of -1 means report all failures.
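The two special values, NO_FLAVOUR and -1, are easy to trip over when reading the file from a script. As a small illustration, the Python sketch below maps both to None so that "auto-detect" and "no limit" are explicit. The function name read_validator_config and the keys of the returned dictionary are my own choices, not part of veraPDF.

```python
import xml.etree.ElementTree as ET

# Flavour values as listed above; NO_FLAVOUR requests automatic detection.
LEGAL_FLAVOURS = {
    "PDF_A_1A", "PDF_A_1B", "PDF_A_2A", "PDF_A_2B", "PDF_A_2U",
    "PDF_A_3A", "PDF_A_3B", "PDF_A_3U", "PDF_UA1", "NO_FLAVOUR",
}


def read_validator_config(xml_text):
    """Parse a validatorConfig document into a dictionary.

    flavour is None when NO_FLAVOUR (automatic detection) is configured;
    max_fails is None when -1 (report all failures) is configured.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    flavour = root.get("flavour")
    if flavour not in LEGAL_FLAVOURS:
        raise ValueError("unknown flavour: " + str(flavour))
    max_fails = int(root.get("maxFails"))
    return {
        "flavour": None if flavour == "NO_FLAVOUR" else flavour,
        "record_passes": root.get("recordPasses") == "true",
        "max_fails": None if max_fails == -1 else max_fails,
    }


# The default file contents shown above.
DEFAULT_VALIDATOR_CONFIG = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<validatorConfig flavour="NO_FLAVOUR" recordPasses="false" maxFails="-1"/>'
)

if __name__ == "__main__":
    print(read_validator_config(DEFAULT_VALIDATOR_CONFIG))
```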

Configuring feature extraction

The config/features.xml file configures the types of PDF features extracted by the veraPDF software. The default file contains a single entry:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

This enables the extraction of the PDF document metadata held in the information dictionary. You can enable the extraction of other features by adding new <feature> sub-elements to the <enabledFeatures> element.
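Adding a <feature> sub-element can of course be done in a text editor, but here is a hedged Python sketch of doing it programmatically. Only the <enabledFeatures> and <feature> element names come from the description above. The root element name featuresConfig and the feature names INFORMATION_DICTIONARY and FONT in the sample are assumptions for illustration, so check them against your own features.xml. The function name enable_feature is likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def enable_feature(xml_text, feature_name):
    """Return the XML with a <feature> entry for feature_name under <enabledFeatures>.

    The call is idempotent: a feature that is already listed is not added twice.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    # Locate <enabledFeatures> wherever it sits, without assuming the root's name.
    enabled = root if root.tag == "enabledFeatures" else root.find(".//enabledFeatures")
    if enabled is None:
        raise ValueError("no enabledFeatures element found")
    already_listed = any(f.text == feature_name for f in enabled.findall("feature"))
    if not already_listed:
        ET.SubElement(enabled, "feature").text = feature_name
    return ET.tostring(root, encoding="unicode")


# ASSUMPTION: the root element and feature names are placeholders, not confirmed values.
SAMPLE_FEATURES = (
    "<featuresConfig><enabledFeatures>"
    "<feature>INFORMATION_DICTIONARY</feature>"
    "</enabledFeatures></featuresConfig>"
)

if __name__ == "__main__":
    print(enable_feature(SAMPLE_FEATURES, "FONT"))
```

Note that ET.tostring does not write the XML declaration, so add it back if you save the result over the original file.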

For reference here’s a version of features.xml with every type of feature enabled:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>


Lists all action elements associated with the various document, page and interactive form events. The extracted action element contains information about the action type and the location (document, page, annotation, outline) with which the action is associated.


Lists all of the annotations found within the document. The extracted annotation elements contain detailed information about each annotation, e.g. its type, location, and references to the annotation resources and other annotations it uses.


Lists all colour spaces contained in the document. The description of each colour space contains the details relevant to its colour space family, which is specified in the family attribute. Possible colour space families are:

  • DeviceGray
  • DeviceRGB
  • DeviceCMYK
  • CalGray
  • CalRGB
  • Lab
  • ICCBased
  • Indexed
  • Pattern
  • Separation
  • DeviceN


Requests information about document security including encryption, password protection and permissions.


Extracts information about any embedded files contained within a PDF document.


Lists the graphic states used in the document and their properties, e.g. transparency.


Lists any fonts used in the document. The description of each font contains the details relevant to the given font type. The child elements of the font element are:

  • subtype
  • name
  • baseName
  • firstChar
  • lastChar
  • widths
  • encoding
  • embedded
  • subset
  • fontDescriptor (the font descriptor describing the font’s metrics other than its glyph widths)


Extracts information about any forms contained in the document.


Configures the extraction of ICC profiles contained in the PDF document.


Extracts information about the images contained in the document like height, width and compression used.


This enables the extraction of key-value pairs from the PDF document information dictionary. The dictionary key name is saved as the value of the key attribute; the dictionary value is saved as the value of the entry element.


Extracts information about all interactive form fields found in the document. The extracted information includes the name of the form field and its value.


Extracts information about indirect objects, the document ID, and the compression / decoding filters used in the document.


Requests reporting of the document-level XMP metadata package exactly as it is in the original PDF document or, if automatic XMP metadata fixing is enabled, in the resulting PDF document. Since XMP serialization is based on XML there is no need to change the serialized XMP packet, except for its encoding. If the encoding used by the XMP differs from the encoding used for report generation, the XMP will be re-encoded to make it consistent with the rest of the report.


Extracts information pertaining to any bookmarks in the document.


Requests the extraction of information about the document’s output intents.


Lists the page elements, each representing a page in the PDF document. This includes information about:

  • media boxes;
  • crop boxes;
  • trim boxes;
  • bleed boxes;
  • art boxes;
  • rotation;
  • scaling;
  • thumbnails;
  • resources, including reference to fonts and images used on a page; and
  • annotations.


Gathers information about the patterns contained in the PDF.


Extracts information about any PostScript fragments used when printing to a PostScript device.


Lists the properties dictionaries.


Lists the shadings used in the document.


Extracts information about any digital signatures contained in the document.

Configuring plugins

The config/plugins.xml file configures plug-in components for veraPDF. The default file contains an empty entry:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

To have a plug-in executed, a plugin element must be specified.

For reference here’s an example of plugins.xml with a single plugin enabled:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <plugin enabled="true">
        <name>Plugin Name</name>
        <description>Some plugin description</description>
            <attribute key="attrKey" value="attrValue"/>
            <attribute key="attr2Key" value="attr2Value"/>


The enabled attribute specifies whether or not the plugin is executed during feature extraction. It can be used to disable a plugin temporarily without removing its configuration data.


This is the plug-in name, which will be added to the features report.


This is the plug-in version, which will be added to the features report.


This is the plug-in description, which will be added to the features report.

plugin jar

This is the path to the plug-in jar file. It must be either absolute or relative to the veraPDF installation folder.


This is a list of attribute nodes. Each of them carries two XML attributes, key and value. The resulting map is used as the attributes map for the plug-in.
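To make the "resulting map" concrete, here is a minimal Python sketch that turns the attribute nodes of a plugin element into a dictionary. It illustrates the idea and is not how veraPDF itself reads the file. The function name read_plugin is hypothetical, and the attributes wrapper element in the sample is an assumption. The code looks for attribute nodes at any depth so that it does not depend on that wrapper's name.

```python
import xml.etree.ElementTree as ET


def read_plugin(plugin_xml):
    """Return the name, enabled flag and attributes map of one plugin element."""
    plugin = ET.fromstring(plugin_xml.encode("utf-8"))
    # Collect every <attribute key="..." value="..."/> node below the plugin element.
    attributes = {node.get("key"): node.get("value") for node in plugin.iter("attribute")}
    return {
        "name": plugin.findtext("name"),
        "enabled": plugin.get("enabled") == "true",
        "attributes": attributes,
    }


# ASSUMPTION: the <attributes> wrapper element name is a guess made for this sample.
SAMPLE_PLUGIN = (
    '<plugin enabled="true">'
    "<name>Plugin Name</name>"
    "<description>Some plugin description</description>"
    "<attributes>"
    '<attribute key="attrKey" value="attrValue"/>'
    '<attribute key="attr2Key" value="attr2Value"/>'
    "</attributes>"
    "</plugin>"
)

if __name__ == "__main__":
    print(read_plugin(SAMPLE_PLUGIN))
```

If the same key appears twice, the later value wins in this sketch. Whether veraPDF behaves the same way is not something I have checked.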

Configuring the metadata fixer

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<fixerConfig fixId="true" fixesPrefix="veraFixMd_"/>