Guide: Setting up XML validation

Track: Test Bed setup

This guide walks you through the points to consider when setting up XML validation and the steps to bring your validation service online.

What you will achieve

At the end of this guide you will have understood what you need to consider when starting to implement validation services for your XML-based specification. You will also have gone through the steps to bring it online and make it available to your users.

An XML validation service can be created using multiple approaches depending on your needs. You can have an on-premise (or local to your workstation) service through Docker or use the Test Bed’s resources and, with minimal configuration, bring online a public service that is automatically kept up-to-date.

For the purpose of this guide you will be presented with the options to consider and will start with a Docker-based instance that could be replaced (or complemented) by a setup through the Test Bed. Interestingly, the configuration relevant to the validator is the same regardless of the approach you choose to follow.

What you will need

About 30 minutes.
A text editor.
A web browser.
Access to the Internet.
Docker installed on your machine (only if you want to run the validator as a Docker container).
A basic understanding of XML-related technologies. For more information you can check out online resources on XML Schema (XSD), XPath and Schematron.

How to complete this guide

The steps described in this guide are for the most part hands-on, resulting in you creating a fully operational validation service. For these practical steps there are no prerequisites and the content of all files to be created is provided in each step. In addition, if you choose to try your setup as a Docker container you will also be issuing commands on a command line interface (all commands are provided and explained as you proceed).

Steps

You can complete this guide by following the steps described in this section. Not all steps are required, with certain ones being optional or complementary depending on your needs. The following diagram presents an overview of all steps highlighting the ones that apply in all cases (marked as mandatory):

When and why you should skip or consider certain steps depends on your testing needs. Each step’s description covers the options you should consider and the next step(s) to follow depending on your choice.

Step 1: Determine your testing needs

Before proceeding to set up your validator you need to clearly determine your testing needs. A first outline of the approach to follow can be obtained by answering the following questions:

Will the validator be available to your users as a tool to be used on an ad-hoc basis?
Do you plan on measuring the conformance of your community’s members to the XML-based specification?
Is the validator expected to be used in a larger conformance testing context (e.g. during testing of a message exchange protocol)?
Should the validator be publicly accessible?
Should test data and validation reports be treated as confidential?

The first choice to make is on the type of solution that will be used to power your validation service:

Standalone validator: A service allowing validation of individual XML instances based on a predefined configuration of validation artefacts, including XML Schema (XSD) (for syntax validation) and Schematron (for business rule validation). The service supports fine-grained customisation and configuration of different validation types (e.g. specification versions) and supported communication channels. Importantly, use of the validator is anonymous and it is fully stateless in that none of the test data or validation reports are maintained once validation completes.
Complete Test Bed: The Test Bed is used to realise a full conformance testing campaign. It supports the definition of test scenarios as test cases, organised in test suites that are linked to specifications. Access is account-based allowing users to claim conformance to specifications and execute in a self-service manner their defined test cases. All results are recorded to allow detailed reporting, monitoring and eventually certification. Test cases can address XML validation but are not limited to that, allowing validation of any complex exchange of information.

It is important to note that these two approaches are by no means exclusive. It is often the case that a standalone validator is defined as a first step that is subsequently used from within test cases in the Test Bed. The former solution offers a community tool to facilitate work towards compliance supporting ad-hoc data validation, whereas the latter allows for rigorous conformance testing to take place where proof of conformance is required. This could apply in cases where conformance is a qualification criterion before receiving funding or before being accepted as a partner in a distributed system. Finally, it is interesting to consider that non-trivial XML validation may involve multiple validation artefacts (e.g. Schematron files). In such a case, even if ad-hoc data validation is not needed, defining a separate validator simplifies management of the validation artefacts by consolidating them in a single location, as opposed to bundling them within test suites.

Regardless of the choice of solution, the next point to consider will be the type of access. If public access is important then the obvious choice is to allow access over the Internet. An alternative would be an installation that allows access only through a restricted network, be it an organisation’s internal network or a virtual private network accessible only by your community’s members. Finally, an extreme case would be access limited to individual workstations where each community member would be expected to run the service locally (albeit of course without the expectation to test message exchanges with remote parties).

If access to your validation services over the Internet is preferred or at least acceptable, the simplest case is to opt for using the shared DIGIT Test Bed resources, both regarding the standalone validator and the Test Bed itself. If such access is not acceptable or is technically not possible (e.g. access to private resources is needed), the proposed approach would be to go for a Docker-based on-premise installation of all components.

Summarising the options laid out in this section, you will first want to choose:

Whether you will be needing a standalone validator, a complete Test Bed or both.
Whether the validator and/or Test Bed will be accessible over the Internet or not.

Your choices here can help you better navigate the remaining steps of this guide. Specifically:

Step 2: Prepare validation artefacts and Step 3: Prepare validator configuration can be skipped if you just want a quick deployment for testing with a generic validator that allows you to upload your own schemas before validating.
Step 4: Setup validator as Docker container can be skipped if you are interested only in a public service or if you plan to only use the validator as part of conformance testing scenarios (i.e. within the Test Bed).
Step 5: Setup validator on Test Bed can be skipped if a publicly accessible service is not an option for you.
Step 7: Use the validator in GITB TDL test cases can be skipped if you only want data validation without additional conformance testing scenarios.

Step 2: Prepare validation artefacts

As an example case for XML validation we will consider the EU purchase order case first seen in Guide: Creating a test suite. In short, for the purposes of this guide you are considered to be leading an EU cross-border initiative to define a new common specification for the exchange of purchase orders between retailers.

To specify the content of purchase orders your experts have created the following XML Schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://itb.ec.europa.eu/sample/po.xsd" xmlns="http://itb.ec.europa.eu/sample/po.xsd" elementFormDefault="qualified">

  <xs:element name="purchaseOrder" type="PurchaseOrderType"/>

  <xs:element name="comment" type="xs:string"/>

  <xs:complexType name="PurchaseOrderType">
    <xs:sequence>
      <xs:element name="shipTo" type="Address"/>
      <xs:element name="billTo" type="Address"/>
      <xs:element ref="comment" minOccurs="0"/>
      <xs:element name="items"  type="Items"/>
    </xs:sequence>
    <xs:attribute name="orderDate" type="xs:date"/>
  </xs:complexType>

  <xs:complexType name="Address">
    <xs:sequence>
      <xs:element name="name"   type="xs:string"/>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city"   type="xs:string"/>
      <xs:element name="zip"    type="xs:decimal"/>
    </xs:sequence>
    <xs:attribute name="country" type="CountryType" use="required"/>
  </xs:complexType>

  <xs:complexType name="Items">
    <xs:sequence>
      <xs:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="productName" type="xs:string"/>
            <xs:element name="quantity" type="xs:positiveInteger"/>
            <xs:element name="priceEUR"    type="xs:decimal"/>
            <xs:element ref="comment"   minOccurs="0"/>
          </xs:sequence>
          <xs:attribute name="partNum" type="xs:string" use="required"/>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>

  <xs:simpleType name="CountryType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}"/>
    </xs:restriction>
  </xs:simpleType>

</xs:schema>

Based on this, a sample purchase order would be as follows:

<?xml version="1.0"?>
<purchaseOrder xmlns="http://itb.ec.europa.eu/sample/po.xsd" orderDate="2018-01-22">
  <shipTo country="BE">
    <name>John Doe</name>
    <street>Europa Avenue 123</street>
    <city>Brussels</city>
    <zip>1000</zip>
  </shipTo>
  <billTo country="BE">
    <name>Jane Doe</name>
    <street>Europa Avenue 210</street>
    <city>Brussels</city>
    <zip>1000</zip>
  </billTo>
  <comment>Send in one package please</comment>
  <items>
    <item partNum="XYZ-123876">
      <productName>Mouse</productName>
      <quantity>20</quantity>
      <priceEUR>15.99</priceEUR>
      <comment>Confirm this is wireless</comment>
    </item>
    <item partNum="ABC-32478">
      <productName>Keyboard</productName>
      <quantity>15</quantity>
      <priceEUR>25.50</priceEUR>
    </item>
  </items>
</purchaseOrder>

A first obvious validation for purchase orders would be against their XML Schema. However, your business requirements also define the concept of a large purchase order which is one that includes more than 10 of each ordered item. This restriction is not reflected in the XML Schema which is considered as a base for all purchase orders but rather in a Schematron rule file that checks this only for orders that are supposed to be “large”. Such a rule file would be as follows:

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <title>Large Purchase Order business rules</title>
  <ns prefix="po" uri="http://itb.ec.europa.eu/sample/po.xsd"/>
  <pattern name="Check order items">
    <rule context="/po:purchaseOrder/po:items/po:item">
      <assert test="number(po:quantity) > 10" flag="fatal" id="PO-01">[PO-01] The quantities of items for large orders must be greater than 10.</assert>
    </rule>
  </pattern>
</schema>
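To make the PO-01 rule more concrete, the following Python sketch applies the same check (every item quantity must be greater than 10) using only the standard library. It is an illustration of what the rule asserts, not how the validator evaluates Schematron; the function names and the shortened sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the purchase order schema used in this guide.
PO_NS = "http://itb.ec.europa.eu/sample/po.xsd"
NS = {"po": PO_NS}


def to_number(text):
    """Approximate XPath number(): anything non-numeric becomes NaN."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return float("nan")


def find_po01_violations(xml_text: str) -> list[str]:
    """Return the partNum of every item whose quantity is not greater than 10."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}purchaseOrder" % PO_NS:
        return []  # The rule context only matches purchase orders.
    violations = []
    # Same context as the rule: /po:purchaseOrder/po:items/po:item
    for item in root.findall("po:items/po:item", NS):
        quantity = to_number(item.findtext("po:quantity", namespaces=NS))
        # Mirrors test="number(po:quantity) > 10"; NaN compares as False.
        if not quantity > 10:
            violations.append(item.get("partNum"))
    return violations


# Shortened sample (hypothetical quantities) with one failing item.
SAMPLE = """<purchaseOrder xmlns="http://itb.ec.europa.eu/sample/po.xsd">
  <items>
    <item partNum="XYZ-123876"><quantity>20</quantity></item>
    <item partNum="ABC-32478"><quantity>5</quantity></item>
  </items>
</purchaseOrder>"""

print(find_po01_violations(SAMPLE))  # ['ABC-32478']
```

The sample purchase order shown earlier has quantities of 20 and 15, so it would produce no violations under this check.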

Given these requirements and validation artefacts we want to support two types of validation (or profiles):

basic: For all purchase orders acting as a common base. This is realised by XML Schema validation.
large: For large purchase orders. This includes validation against the XML Schema and also against the relevant Schematron rule.

As the first configuration step for the validator we will prepare a folder with the required resources. For this purpose create a root folder named validator with the following subfolders and files:

validator
└── resources
    └── order
        ├── sch
        │   └── LargePurchaseOrder.sch
        └── xsd
            └── PurchaseOrder.xsd

You will likely note that we are creating several folders of no obvious use. Nonetheless please follow this structure as it will facilitate subsequent steps where we add resources depending on our needs. In terms of meaning of these folders consider the following:

validator is the root folder for all files.
resources is the root folder for all files that will be considered by the validator.
order is the root folder for all files pertinent to purchase order validation. We separate this as the validator could be used to also validate completely different content.
sch is the folder containing all Schematron files.
xsd is the folder containing all XML Schemas.

Regarding the PurchaseOrder.xsd and LargePurchaseOrder.sch files you can create them from the above content or download them (here: PurchaseOrder.xsd and LargePurchaseOrder.sch). Finally, note that you are free to use any names for the files and folders; the ones used here will however be the ones considered in this guide’s subsequent steps.
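If you prefer to script the folder creation, a minimal Python sketch such as the following produces the same layout. The function name is an assumption for this example, and the demonstration runs in a temporary directory so that nothing is written to your working folder; you would still need to place PurchaseOrder.xsd and LargePurchaseOrder.sch in the created folders yourself.

```python
import tempfile
from pathlib import Path


def create_validator_layout(base: Path) -> Path:
    """Create validator/resources/order/{sch,xsd} under base; return the order folder."""
    order = base / "validator" / "resources" / "order"
    for sub in ("sch", "xsd"):
        (order / sub).mkdir(parents=True, exist_ok=True)
    return order


# Demonstrated in a temporary directory; pass Path.cwd() to create it in place.
with tempfile.TemporaryDirectory() as tmp:
    order_dir = create_validator_layout(Path(tmp))
    created = sorted(p.name for p in order_dir.iterdir())

print(created)  # ['sch', 'xsd']
```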

Note

Schematrons in pure and XSLT format: The Schematron rules presented above are in “pure” Schematron format. Schematron used in production should be first converted to XSLT as this allows most efficient processing. Using XSLT is also required in case your rules refer to built-in or custom functions. As an example, and in case you are familiar with the Apache Maven build tool, a popular approach for converting Schematron files is the ph-schematron-maven-plugin.

Step 3: Prepare validator configuration

After having defined your testing needs and the validation artefacts for your specific case, the next step will be to configure the validator. The validator is defined by a core engine maintained by the Test Bed team and a layer of configuration, provided by you, that defines its use for a specific scenario. In terms of features the validator supports the following:

Validation channels including a REST web service API, a SOAP web service API, a web user interface, validation via email and a command-line tool.
Configuration of XML Schema and Schematron validation artefacts to drive the validation that can be local or remote.
Definition of different validation types as logically-related sets of validation artefacts.
Support per validation type allowing user-provided XML Schema and Schematron extensions.
Definition of separate validator configurations that are logically split but run as part of a single validator instance. Such configurations are termed “validation domains”.
Customisation of all texts presented to users.

Configuration is provided by means of key-value pairs in a property file. This file can be named as you want but needs to end with the .properties extension. In our case we will name this config.properties and place it within the order folder. Recall that the purpose of this folder is to store all resources relevant to purchase order validation. These are the validation artefacts themselves (PurchaseOrder.xsd and LargePurchaseOrder.sch) and the configuration file (config.properties).

Define the content of the config.properties file as follows:

# The different types of validation to support. These values are reflected in other properties.
validator.type = basic, large
# Labels to describe the defined types.
validator.typeLabel.basic = Basic purchase order
validator.typeLabel.large = Large purchase order
# Validation artefacts (XML Schema) to consider for the "basic" type.
validator.schemaFile.basic = xsd/PurchaseOrder.xsd
# Validation artefacts (XML Schema and Schematron) to consider for the "large" type.
validator.schemaFile.large = xsd/PurchaseOrder.xsd
validator.schematronFile.large = sch/LargePurchaseOrder.sch
# The title to display for the validator's user interface.
validator.uploadTitle = Purchase Order Validator

All validator properties share a validator. prefix. The validator.type property is key as it defines one or more types of validation that will be supported (multiple are provided as a comma-separated list of values). The values provided here are important not only because they define the available validation types but also because they drive most other configuration properties. Regarding the validation artefacts themselves, these are provided by means of the validator.schemaFile and validator.schematronFile properties:

validator.schemaFile.TYPE defines one or more (comma-separated) file paths (relative to the configuration file) to lookup XML Schema files.
validator.schematronFile.TYPE defines one or more (comma-separated) file paths (relative to the configuration file) to lookup Schematron files.

Using these properties you define the validator’s validation artefacts as local files, where in both cases each provided path can be for a file or a folder. If a folder is referenced it will load all contained top-level files (i.e. ignoring subfolders). Note that if your XML Schema or Schematron files import or include other files you need to only point to the “root” or “master” file(s) per case. In case you want your validator to skip XML Schema or Schematron validation you would simply not include any of the relevant configuration properties.
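As a rough illustration of how the type-driven properties relate to each other, the following Python sketch parses a shortened version of the configuration and lists the artefact paths per validation type, resolved relative to the folder holding the configuration file. This is only a reading aid under simplifying assumptions (a basic key = value parser, hypothetical function names, no folder expansion) and does not reflect the validator's actual implementation.

```python
from pathlib import PurePosixPath

CONFIG = """\
# The different types of validation to support.
validator.type = basic, large
validator.schemaFile.basic = xsd/PurchaseOrder.xsd
validator.schemaFile.large = xsd/PurchaseOrder.xsd
validator.schematronFile.large = sch/LargePurchaseOrder.sch
"""


def parse_properties(text: str) -> dict[str, str]:
    """Parse simple 'key = value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def split_list(value: str) -> list[str]:
    """Split a comma-separated property value into trimmed entries."""
    return [part.strip() for part in value.split(",") if part.strip()]


def artefacts_per_type(props: dict[str, str], config_dir: str) -> dict:
    """Map each validation type to its XSD and Schematron paths."""
    result = {}
    for vtype in split_list(props.get("validator.type", "")):
        result[vtype] = {
            kind: [
                str(PurePosixPath(config_dir) / path)
                for path in split_list(props.get(f"validator.{kind}.{vtype}", ""))
            ]
            for kind in ("schemaFile", "schematronFile")
        }
    return result


artefacts = artefacts_per_type(parse_properties(CONFIG), "resources/order")
for vtype, files in artefacts.items():
    print(vtype, files)
```

Note how the basic type ends up with no Schematron entries, which corresponds to skipping Schematron validation for that type.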

Note

Further validation artefact configuration: You may also define validation artefacts as remote resource references and/or as being user-provided. In addition, you can pre-process configured artefacts before they are used for validation.

The purpose of the remaining properties is to customise the text descriptions presented to users:

validator.typeLabel defines a label to present to users on the validator’s user interface for the type in question.
validator.uploadTitle defines the title label to present to users on the validator’s user interface.

Once you have created the config.properties file, the validator folder should be as follows:

validator
└── resources
    └── order
        ├── sch
        │   └── LargePurchaseOrder.sch
        ├── xsd
        │   └── PurchaseOrder.xsd
        └── config.properties

This limited configuration file assumes numerous default configuration properties. An important example is that by default, the validator will expose a web user interface, a SOAP web service API and a REST web service API. This configuration is driven through the validator.channels property that by default is set to form, soap_api, rest_api (for a user form, a SOAP web service and a REST web service respectively). All configuration properties provided in config.properties relate to the specific domain in question, notably purchase orders, reflected in the validator’s resources as the order folder. Although rarely needed, you may define additional validation domains each with its own set of validation artefacts and configuration file (see Configuring additional validation domains for details on this). Finally, if you are planning to host your own validator instance you can also define configuration at the level of the complete validator (see Additional configuration options regarding application-level configuration options).

For the complete reference of all available configuration properties and their default values refer to section Validator configuration properties.

Remote validation artefacts

Defining the validator’s artefacts as local files is not the only option. If these are available online you can also reference them remotely by means of the following properties:

validator.schemaFile.TYPE.remote.N.url for an XML Schema that is to be loaded remotely (e.g. from a GitHub repository).
validator.schematronFile.TYPE.remote.N.url for one or more Schematron files that are to be loaded remotely (e.g. from a GitHub repository).

The N element in the properties’ names is a zero-based integer index allowing you to define more than one entry to match the number of remote files. Similar to the case of local files, you are expected to only reference the “master” or “root” files assuming that included resources can also be looked up remotely based on the defined locations. Note that loading referenced resources from within remote Schematrons (e.g. via sch:include) is currently not supported.

The example that follows illustrates the loading of one remote XML Schema file and one Schematron file for a validation type named v2 from a remote location:

validator.type = v2
...
validator.schemaFile.v2.remote.0.url = https://my.server.com/my_rules_1.xsd
validator.schematronFile.v2.remote.0.url = https://my.server.com/my_rules_2.sch
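The indexed naming scheme can be read as an ordered list per artefact kind and validation type. The Python sketch below gathers such entries from already parsed properties and orders them by their N index. The function name and the second Schematron URL are hypothetical, added only to show more than one index.

```python
import re


def remote_urls(props: dict[str, str], kind: str, vtype: str) -> list[str]:
    """Collect validator.<kind>.<vtype>.remote.N.url values ordered by N."""
    pattern = re.compile(
        rf"^validator\.{re.escape(kind)}\.{re.escape(vtype)}\.remote\.(\d+)\.url$"
    )
    indexed = []
    for key, value in props.items():
        match = pattern.match(key)
        if match:
            indexed.append((int(match.group(1)), value))
    return [url for _, url in sorted(indexed)]


PROPS = {
    "validator.type": "v2",
    "validator.schemaFile.v2.remote.0.url": "https://my.server.com/my_rules_1.xsd",
    # Hypothetical second entry, listed out of order on purpose.
    "validator.schematronFile.v2.remote.1.url": "https://my.server.com/my_rules_3.sch",
    "validator.schematronFile.v2.remote.0.url": "https://my.server.com/my_rules_2.sch",
}

print(remote_urls(PROPS, "schematronFile", "v2"))
```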

In case remote validation artefacts fail to be retrieved, you may choose to report this to your users. This is achieved by using property validator.remoteArtefactLoadErrors.TYPE to adapt this for a given validation type, or validator.remoteArtefactLoadErrors to set your default approach (see Domain-level configuration). The values you may set are:

fail, to log the error, immediately stop validation and report this as an error to the user.
warn, to log the error, continue validation, but display a warning to the user that the results may be incomplete.
log, considered by default, to log the error but continue validation normally without notifying the user.
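The difference between the three values can be sketched in Python as follows. This is a simplified model of the behaviour described above, with a hypothetical exception type and function name; it is not the validator's own code.

```python
import logging

logger = logging.getLogger("validator.sketch")


class ArtefactLoadError(Exception):
    """Raised to stop validation when the policy is 'fail'."""


def handle_load_failure(policy: str, url: str, warnings: list[str]) -> None:
    """Apply the configured policy after a remote artefact failed to load."""
    # All three policies log the error.
    logger.error("Failed to load remote artefact: %s", url)
    if policy == "fail":
        # Stop immediately and surface the problem to the user as an error.
        raise ArtefactLoadError(f"Could not load {url}")
    if policy == "warn":
        # Continue, but tell the user that the results may be incomplete.
        warnings.append(f"Results may be incomplete: could not load {url}")
    elif policy != "log":
        raise ValueError(f"Unknown policy: {policy}")
    # "log": continue normally without notifying the user.


user_warnings: list[str] = []
handle_load_failure("warn", "https://my.server.com/my_rules_2.sch", user_warnings)
print(user_warnings)
```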

You may also combine local and remote Schematron files by defining a validator.schematronFile.TYPE property and one or more validator.schematronFile.TYPE.remote.N.url properties. In all cases, the Schematrons from all sources will be aggregated into a single model for the validation. Such combinations are not possible for XML Schemas where only one schema source is considered.

Note

Remote XML Schema and Schematron caching: Caching is used to avoid constant lookups of remote files. Once loaded, remote files will be automatically refreshed every hour.

User-provided validation artefacts

Apart from defining the XML Schema and Schematron to apply as local and/or remote files, you may also define, for a given validation type, whether user-provided XSDs and user-provided Schematrons are allowed. This is achieved through the following properties:

...
validator.externalSchemaFile.TYPE = optional
validator.externalSchematronFile.TYPE = required

These properties allow three possible values:

required: The relevant validation artefact(s) must be provided by the user.
optional: Providing the relevant validation artefact(s) is allowed but not mandatory.
none (the default value): No such validation artefacts are requested or considered.

Specifying that for a given validation type you allow users to provide XML Schema and Schematron artefacts will result in any such extensions being combined with your predefined artefacts (if present). This could be useful in scenarios where you want to define a common validation base but also allow ad-hoc extensions, for example restrictions defined at user level (such as national validation rules to consider in addition to a common set of EU rules). In the case of XML Schemas, if predefined ones are present no user-provided ones are ever considered (i.e. the relevant property is fixed as none).
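The interplay between the three values and the special case for XML Schemas can be summarised with a small Python sketch. The function names are assumptions for illustration, and the logic simply restates the rules above rather than reproducing the validator's implementation.

```python
ALLOWED = {"required", "optional", "none"}


def effective_schema_setting(configured: str, has_predefined_schema: bool) -> str:
    """For XSDs, a predefined schema forces the setting to 'none'."""
    if configured not in ALLOWED:
        raise ValueError(f"Unsupported value: {configured}")
    return "none" if has_predefined_schema else configured


def use_user_artefacts(setting: str, provided: int) -> bool:
    """Return True if user-provided artefacts should be taken into account."""
    if setting == "required" and provided == 0:
        raise ValueError("A user-provided artefact is required for this type")
    return setting != "none" and provided > 0


# A type with a predefined XSD ignores user-provided schemas...
print(effective_schema_setting("optional", has_predefined_schema=True))   # none
# ...whereas optional Schematron extensions are used when supplied.
print(use_user_artefacts("optional", provided=1))                         # True
```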

Note

Generic validator: It is possible to not predefine any XML Schemas or Schematron, resulting in a validator that is truly generic, expecting all schemas to be provided by users. Such a generic instance actually exists at https://www.itb.ec.europa.eu/xml/upload. This generic validator will automatically be set up if you don’t specify validator configurations.

User-provided context files

Depending on how you have designed your validation artefacts you may need your users to supply complementary inputs to be used as context files for the validation. These are XML files that you would reference in Schematron files as user-provided configuration. They are not validated themselves but rather serve to customise the validation for each user.

Given that you need to refer to such configuration files from pre-configured Schematrons, you need to specify the path at which they will be recorded and referred to for the validation. For example you may expect your users to supply an XML file listing settings, that in your Schematron you refer to as such:

...
<xsl:value-of select="document('context/userSettings.xml')/*:settings/*:settingOne/text()"/>
...

In this case, users will be expected to provide an XML file that will be stored at path context/userSettings.xml for your Schematron rules to find it. Furthermore, you may also want to run an XSD validation on this context file to ensure that it matches what you expect before you proceed with the validation of the actual input. Any failures in this XSD validation will not figure in the produced report, but will raise an error preventing the validation from proceeding. You will also likely want to customise the label presented for this context file’s input to make it more intuitive for users. Finally, everything explained above may also be replicated if you expect more than one context file.

The configuration for context files is driven through the validator’s domain properties. To configure a context file for a given validation type you would use configuration such as the following:

...
# The path at which to record the context file.
# The path is provided as a relative path starting from the domain root folder.
validator.contextFile.TYPE.0.path = path/to/file.xml
# An optional XSD to validate the context file with.
# The XSD path is provided as a relative path starting from the domain root folder.
validator.contextFile.TYPE.0.schema = path/to/schema.xsd
# An optional custom label to display on the context file's input control.
validator.contextFile.TYPE.0.label = Settings
# An optional custom placeholder text to display on the context file's input control.
validator.contextFile.TYPE.0.placeholder = Select settings...

From the above properties only the path is mandatory. As such the most basic context file configuration would be as follows:

...
# The path at which to record the context file.
validator.contextFile.TYPE.0.path = path/to/file.xml

Note above the 0 index in the property names. This corresponds to the first context file to be requested but you can add additional sets of properties with incremental indexes to require multiple files. In this case it is especially interesting to also define at least the label as this will help users distinguish the purpose of each file.

...
# First context file.
validator.contextFile.TYPE.0.path = path/to/file1.xml
validator.contextFile.TYPE.0.label = Settings
# Second context file.
validator.contextFile.TYPE.1.path = path/to/file2.xml
validator.contextFile.TYPE.1.label = Code list

Finally, in case multiple validation types define the same context files you can also define them as defaultContextFile entries, which will have them apply for all types. Note nonetheless that if you define both default files and type-specific ones for a given validation type, only the type-specific ones will be considered for the validation type in question.

...
# Default context files applying to all validation types without type-specific files.
validator.defaultContextFile.0.path = path/to/file.xml
validator.defaultContextFile.0.label = Settings
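The precedence between default and type-specific context files can be sketched in Python as below. The helper names and the two validation types (typeA, typeB) are hypothetical; the sketch only models the rule that type-specific entries, when present, replace the defaults entirely.

```python
import re


def collect_entries(props: dict[str, str], prefix: str) -> list[dict[str, str]]:
    """Group '<prefix>.N.<field>' properties into one dict per index N."""
    pattern = re.compile(rf"^{re.escape(prefix)}\.(\d+)\.(\w+)$")
    entries: dict[int, dict[str, str]] = {}
    for key, value in props.items():
        match = pattern.match(key)
        if match:
            entries.setdefault(int(match.group(1)), {})[match.group(2)] = value
    return [entries[index] for index in sorted(entries)]


def context_files(props: dict[str, str], vtype: str) -> list[dict[str, str]]:
    """Type-specific context files win; defaults apply only when there are none."""
    specific = collect_entries(props, f"validator.contextFile.{vtype}")
    return specific or collect_entries(props, "validator.defaultContextFile")


PROPS = {
    "validator.defaultContextFile.0.path": "path/to/file.xml",
    "validator.defaultContextFile.0.label": "Settings",
    # typeB (hypothetical) defines its own context file.
    "validator.contextFile.typeB.0.path": "path/to/codes.xml",
    "validator.contextFile.typeB.0.label": "Code list",
}

print(context_files(PROPS, "typeA"))  # falls back to the default entry
print(context_files(PROPS, "typeB"))  # only the type-specific entry
```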

Combining context files with the input

In the previous section we saw how context files can complement the main input when validating Schematron rules. Such context files are placed at a configured location so that they can be looked up by XPath expressions when needed.

A further option offered by context files is to use them more directly in the validation, by combining them and the main input into a master XML file that will be used for the Schematron validation. This combination follows a template that defines which context files will be used for this purpose and how. It is interesting to note that context files used in this way are still available to also be used in XPath lookups if necessary.

The main reason why you may want to use such combined inputs, is to introduce validations that span multiple XML documents. An example scenario for this would be the validation of an XML receipt against an XML order, where we want to ensure that the order’s identifier is correctly quoted in the receipt. To achieve this we can combine both the receipt and order into a single XML document and apply Schematron rules that inspect information from both. Even though it may be possible to achieve similar results via XPath lookups (as described in the previous section), using such combined inputs might well result in simpler Schematron rules and more intuitive error reporting for the validator’s users.

Note

Validating combinations of context files and inputs only applies to Schematron validations. If your validator includes an XSD validation this will always be applied to the main input.

The configuration for context file combinations is provided in the validator’s domain properties. We include here the configuration of one or more context files, specifying as well a combinationPlaceholder for the files we want to combine. We also include the definition of a template file that defines how the combination should take place:

...
# Define a context file to be combined with the input.
validator.contextFile.TYPE.0.path = path/to/file1.xml
validator.contextFile.TYPE.0.label = Purchase order
validator.contextFile.TYPE.0.combinationPlaceholder = purchaseOrder
# Define also the template to use for the combination.
validator.contextFileCombinationTemplate.TYPE = path/to/templateFile.xml

This configuration states that if the TYPE validation type is selected, the user will also need to provide an additional “Purchase order” file that will be placed at path/to/file1.xml. In addition, this file will be combined with the input for Schematron validations, based on an XML template file defined at path/to/templateFile.xml, after replacing its placeholder purchaseOrder.

To complete this configuration you also need to define the template file (in this example at path/to/templateFile.xml) that may look as follows:

<?xml version="1.0" encoding="UTF-8"?>
<validation>
    <receipt>${input}</receipt>
    <order>${purchaseOrder}</order>
</validation>

You will notice here that this template file defines two placeholders using the syntax ${PLACEHOLDER}:

${input}, for the main input provided to the validator (the placeholder “input” is reserved for this purpose and cannot be used for context files).
${purchaseOrder}, for the relevant context file (as defined through the combinationPlaceholder property).

The template file must define these placeholders as text nodes, but besides this constraint it may have any structure you want. You can also refer to a placeholder multiple times to include it in more than one location if meaningful, and also include multiple context files. The following template file shows an example that includes multiple files alongside the main input:

<?xml version="1.0" encoding="UTF-8"?>
<validation>
    <receipt>${input}</receipt>
    <references>
        <order>${purchaseOrder}</order>
        <shippingNotice>${shippingNotice}</shippingNotice>
        <trackingInfo>${trackingInfo}</trackingInfo>
    </references>
</validation>
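To illustrate the effect of such a template, the following Python sketch replaces each ${PLACEHOLDER} with the matching document and parses the result. It is a simplified approximation: the function name, the sample receipt and order, and the choice to strip each document's XML declaration before insertion are assumptions for this example, not a description of the validator's internal processing.

```python
import re
import xml.etree.ElementTree as ET

TEMPLATE = """<validation>
    <receipt>${input}</receipt>
    <order>${purchaseOrder}</order>
</validation>"""

# Matches a leading XML declaration such as <?xml version="1.0"?>.
XML_DECLARATION = re.compile(r"^\s*<\?xml[^>]*\?>\s*")


def combine(template: str, documents: dict[str, str]) -> str:
    """Replace every ${name} in the template with the named XML document."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in documents:
            raise KeyError(f"No document provided for placeholder '{name}'")
        # A declaration is only valid at the very start of a document.
        return XML_DECLARATION.sub("", documents[name], count=1)

    return re.sub(r"\$\{(\w+)\}", replace, template)


# Hypothetical main input and context file.
receipt = '<?xml version="1.0"?><receipt><orderRef>PO-1</orderRef></receipt>'
order = '<order id="PO-1"/>'

combined = combine(TEMPLATE, {"input": receipt, "purchaseOrder": order})
root = ET.fromstring(combined)

# A cross-document check: the receipt must quote the order's identifier.
quoted = root.findtext("receipt/receipt/orderRef")
actual = root.find("order/order").get("id")
print(quoted == actual)  # True
```

With both documents under one root, a rule comparing the receipt's reference to the order's identifier becomes a single path comparison, which is the simplification that combined inputs aim for.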

As you have seen you can define such context file combination templates per supported validation type. In case you have multiple types that share the same template, you can also define this as the overall default:

...
# Default context file combination template.
validator.defaultContextFileCombinationTemplate = path/to/defaultTemplate.xml

Note that a default combination template will only apply when a validation type includes context files with defined combinationPlaceholder properties. This means that you can have alongside a default combination template, validation types without context file combinations or without context files altogether.

Supporting options per validation type

The different types of validation supported by the validator (enumerated using property validator.type) determine the different kinds of validation that your users may select. Available types are listed in the validator’s web user interface in a dropdown list, and need to be provided as input when executing a validation.

It could be the case that your validator needs to support an extra level of granularity over the validation types. This would apply if each validation type has itself a set of additional options that actually define the specific validation to take place. For example, a validator for a specification defining rules for different types of data structures may also need to allow users to select the desired version number. In this case we would define:

As validation types, the specification’s foreseen data structures.
As validation type options, the version numbers for each data structure.

Configuring such options can greatly simplify a validator’s configuration given that certain common data needs to be defined only once. In addition, the validator’s user interface becomes much more intuitive by listing two dropdowns in place of one: the first one to select the validation type, and the second one to select its specific option. The alternative, simply configuring all combinations as separate validation types, would render the validator less intuitive and more difficult to maintain.

Options are defined per validation type using validator.typeOptions.TYPE properties, for which the applicable options are defined as a string with comma-separated values. Once options are defined, most configuration properties that are specific to validation types now consider the full type as TYPE.OPTION (type followed by option and separated by .).

In terms of defining labels for options we can use:

validator.typeOptionLabel.TYPE.OPTION, for the label of an option specific to a given validation type.
validator.optionLabel.OPTION, for the label of an option that is the same across types.
validator.completeTypeOptionLabel.TYPE.OPTION, for a label to better express the combination of type plus option.

Revisiting our EU Purchase Order example we could add support for specification versions by configuring properties as follows (we skip defining labels as the option value suffices):

# Validation types
validator.type = basic, large
validator.typeLabel.basic = Basic purchase order
validator.typeLabel.large = Large purchase order
# Options
validator.typeOptions.basic = v1.2.0, v1.1.0, v1.0.0
validator.typeOptions.large = v1.1.0, v1.0.0
# Validation artefacts
validator.schemaFile.basic.v1.2.0 = v1.2.0/xsd/PurchaseOrder.xsd
validator.schemaFile.basic.v1.1.0 = v1.1.0/xsd/PurchaseOrder.xsd
validator.schemaFile.basic.v1.0.0 = v1.0.0/xsd/PurchaseOrder.xsd
validator.schemaFile.large.v1.1.0 = v1.1.0/xsd/PurchaseOrder.xsd
validator.schemaFile.large.v1.0.0 = v1.0.0/xsd/PurchaseOrder.xsd
validator.schematronFile.large.v1.1.0 = v1.1.0/sch/LargePurchaseOrder.sch
validator.schematronFile.large.v1.0.0 = v1.0.0/sch/LargePurchaseOrder.sch
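
If you did want friendlier labels for these options, the three label properties listed earlier could be combined as in the following sketch. The label texts are illustrative assumptions, and it is assumed that option identifiers containing dots are written as-is in the property names, as they are for the artefact properties above.

```properties
# Label for an option that is the same across validation types.
validator.optionLabel.v1.0.0 = Version 1.0.0
validator.optionLabel.v1.1.0 = Version 1.1.0
# Label for an option specific to a given validation type.
validator.typeOptionLabel.basic.v1.2.0 = Version 1.2.0 (basic orders only)
# Label expressing the combination of type plus option.
validator.completeTypeOptionLabel.large.v1.1.0 = Large purchase order (version 1.1.0)
```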

Note

The configuration property reference specifies per property whether it expects the validation type, option or full type (validation type plus option) as part of its definition.

Presenting validation types in groups

Similar to supporting validation type options, you can also add further organisation to your proposed validation types by means of validation type groups. These groups apply to the validator’s web user interface by presenting your validation types in separate sets. Such sets could refer to different families of specifications, different solutions, or basically anything that has a grouping meaning in the context of your validator. Configuring groups has no effect on how validation artefacts are set up, nor on other properties that apply to specific validation types.

To define groups you include in your configuration one or more validator.typeGroup.GROUP entries, set to the list of validation types the group includes. You may also provide a user-friendly name for each group through validator.typeGroupLabel.GROUP properties.

Given that the groups’ purpose is specific to your validator, you also have several options on how these are presented. Groups can be displayed as:

Inline elements included as option groups in the validation types’ dropdown list.
A separate dropdown list presented as a selection step before selecting a validation type.

To specify the groups’ presentation approach you define property validator.typeGroupPresentation, set as inline (the default, presenting groups within the validation type dropdown list), or split (presenting groups in a separate dropdown). In the latter case you would typically also want to override the label of the groups’ dropdown list through property validator.label.typeGroupLabel (the default label being “Group”).

Revisiting our EU Purchase Order example we could include groups to split the available types in “production” and “development” modes, the latter including an “experimental” configuration. The following properties illustrate how this could be achieved:

...
validator.type = basic, large, experimental
validator.typeOptions.basic = v2.1.0, v2.0.0, v1.2.0, v1.1.0
validator.typeOptions.large = v2.1.0, v2.0.0

# Define 'prod' and 'dev' groups.
validator.typeGroup.prod = basic, large
validator.typeGroup.dev = experimental
# Label the groups accordingly.
validator.typeGroupLabel.prod = Production
validator.typeGroupLabel.dev = Development
# Present a separate dropdown with the groups (as opposed to inline).
validator.typeGroupPresentation = split
# Override the groups' dropdown label.
validator.label.typeGroupLabel = Validation mode

Note

When using groups, all validation types must be mapped to groups; otherwise the domain’s configuration is considered invalid.
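
For comparison, the inline presentation would keep a single dropdown, with the groups shown as option groups within it. A sketch based on the same example follows; since inline is the default, the presentation property could equally be omitted.

```properties
validator.type = basic, large, experimental
# Every validation type is mapped to a group.
validator.typeGroup.prod = basic, large
validator.typeGroup.dev = experimental
validator.typeGroupLabel.prod = Production
validator.typeGroupLabel.dev = Development
# Present groups within the validation type dropdown list (the default).
validator.typeGroupPresentation = inline
```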

Hidden validation types

It could be the case that you want to support certain types of validation but not make them immediately apparent to users. This could be meaningful for sharing experimental configurations or supporting specific validation types besides the publicly listed ones. Such validation types would be defined as hidden, making them available via the validator’s REST, SOAP APIs and CLI, but without being listed on its user interface.

To define validation types as hidden you extend your domain configuration file with the validator.hiddenType property. This property accepts a comma-separated list of validation types, or specific validation type options.

The following example considers our example validator, for which we choose to hide the “large” type, as well as older versions of the “basic” type:

...
validator.type = basic, large
validator.typeOptions.basic = v2.1.0, v2.0.0, v1.2.0, v1.1.0
validator.typeOptions.large = v2.1.0, v2.0.0
# Hide the full "large" validation type as well as the "basic" v1.* releases.
# These can still be used but are not listed in the UI.
validator.hiddenType = basic.v1.2.0, basic.v1.1.0, large

Note

If all options under a validation type are hidden, the overall validation type will also be hidden.

Validation type aliases

Validation type aliases are alternative ways of referring to the configured validation types. They become meaningful when users refer directly to specific types, such as when using the validator’s REST API or SOAP API. Typical use cases for aliases would be:

To define an additional “latest” alias that always points to the latest version of your specifications.
To enable backwards compatibility when validation types are reorganised in a configuration update.

To define a validation type alias, add one or more validator.typeAlias.ALIAS properties where ALIAS is the alias you want to define. As the value of the property you set the target validation type.

Note

Validation type aliases refer to full validation types, meaning the combination of validation type and option (TYPE.OPTION).

As an example consider the following configuration:

validator.type = basic, large, preview
validator.typeOptions.basic = v2.1.0, v2.0.0
validator.typeOptions.large = v2.1.0, v2.0.0

The available full validation types based on these properties are basic.v2.1.0, basic.v2.0.0, large.v2.1.0, large.v2.0.0 and preview.

Based on this example we can consider that you may want to add aliases named basic_latest and large_latest for the latest versions of each supported profile. To do so extend your configuration with the following properties:

validator.typeAlias.basic_latest = basic.v2.1.0
validator.typeAlias.large_latest = large.v2.1.0

This allows API clients that always want to validate against the latest specifications to do so by referring to these aliases. Otherwise, if new versions were introduced, they would need to adapt their implementation.
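
Aliases can serve the backwards compatibility use case in the same way. As a sketch, assume (hypothetically) that an earlier version of this configuration exposed the preview type under the name experimental. Existing API clients could then keep using the old name:

```properties
# 'experimental' is a hypothetical legacy name. As 'preview' has no options,
# it is already a full validation type and can be used directly as the target.
validator.typeAlias.experimental = preview
```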

Domain aliases

Domain aliases are similar in concept to validation type aliases in that they allow you to adapt the validation to carry out based on the user’s request. Simply put, a user requests validation A and you internally carry out validation B.

Domain aliases, however, have a fundamentally different purpose. They are used to delegate requests to a completely separate validator configuration, defined as a separate domain. This means that not only requested validation types are adapted, but also that the complete validator configuration is loaded from the aliased domain. In this case, if domain A is defined as an alias for domain B, any requests to domain A are effectively redirected to domain B. This is not an actual web redirection mind you, although practically the result is similar.

The main reason to define an alias for a domain is to cover validator migrations and consolidations. This typically becomes interesting if you have defined several distinct validators over time, but now want to consolidate them into a single one to provide a unified user experience. Aggregating validation artefacts into a single validator is easily achieved, as is exposing them in an organised manner using groups, validation types and options.

While creating the consolidated validator is straightforward, you need to consider the existing users of the now legacy validator domains. This is where aliases come in, as they allow continued use of the legacy domains, while transparently delegating to the correct, consolidated, one. Importantly, this delegation covers all validator APIs (web user interface, REST API and SOAP API).

A domain alias is defined using the validator.domainAlias property, set with the name of the domain to delegate to. Complementing this, you can also define a set of validator.domainAlias.TYPE properties so that you can map your current validation types to their equivalent types in the target domain. Note that the TYPE postfix added to these properties must be the full validation type (validation type and option separated by a dot .), pointing similarly to the full validation type or an alias in the target domain. Mappings can be omitted for validation types having the same identifier in both the current and target domains.

Note

In case validation types of the current domain cannot be fully mapped to the target, the entire domain alias configuration is ignored and the validator outputs relevant warnings to its log.

To illustrate how domain aliases work with an example, let’s revisit our fictional purchase order validator, defined through an order domain with a configuration as follows:

# Title
validator.uploadTitle = Purchase Order Validator
# Validation types
validator.type = basic, large
# Options per type.
validator.typeOptions.basic = v1.1.0, v1.0.0
validator.typeOptions.large = v1.1.0, v1.0.0
# Validation artefact mappings
validator.schemaFile.basic.v1.1.0 = xsd/PurchaseOrder_v1.1.0.xsd
validator.schematronFile.basic.v1.1.0 = sch/PurchaseOrderBasic_v1.1.0.xslt
validator.schemaFile.basic.v1.0.0 = xsd/PurchaseOrder_v1.0.0.xsd
validator.schematronFile.basic.v1.0.0 = sch/PurchaseOrderBasic_v1.0.0.xslt
validator.schemaFile.large.v1.1.0 = xsd/PurchaseOrder_v1.1.0.xsd
validator.schematronFile.large.v1.1.0 = sch/PurchaseOrderBasic_v1.1.0.xslt, sch/PurchaseOrderLarge_v1.1.0.xslt
validator.schemaFile.large.v1.0.0 = xsd/PurchaseOrder_v1.0.0.xsd
validator.schematronFile.large.v1.0.0 = sch/PurchaseOrderBasic_v1.0.0.xslt, sch/PurchaseOrderLarge_v1.0.0.xslt

Now consider that alongside the purchase order validator you have also defined an invoice validator in a separate invoice domain:

# Title
validator.uploadTitle = Invoice Validator
# Validation types
validator.type = full, summary
# Validation artefact mappings
validator.schemaFile.full = xsd/Invoice.xsd
validator.schematronFile.full = sch/InvoiceFull.xslt
validator.schemaFile.summary = xsd/Invoice.xsd
validator.schematronFile.summary = sch/InvoiceSummary.xslt

In terms of filesystem configuration for the order and invoice domains, the validator’s resources are defined as follows:

validator
└── resources
    ├── order
    │   ├── sch
    │   │   ├── PurchaseOrderBasic_v1.1.0.xslt
    │   │   ├── PurchaseOrderBasic_v1.0.0.xslt
    │   │   ├── PurchaseOrderLarge_v1.1.0.xslt
    │   │   └── PurchaseOrderLarge_v1.0.0.xslt
    │   ├── xsd
    │   │   ├── PurchaseOrder_v1.1.0.xsd
    │   │   └── PurchaseOrder_v1.0.0.xsd
    │   └── config.properties
    └── invoice
        ├── sch
        │   ├── InvoiceFull.xslt
        │   └── InvoiceSummary.xslt
        ├── xsd
        │   └── Invoice.xsd
        └── config.properties

You now want to group both validators into a new, consolidated ecommerce validator, defined through an ecommerce domain. This domain includes all purchase order and invoice artefacts, introducing groups as a first selection for the document type to validate. The configuration file of this domain will be as follows:

# Title
validator.uploadTitle = eCommerce Validator
# Validation types
validator.type = order_basic, order_large, invoice_full, invoice_summary
# Options per purchase order type.
validator.typeOptions.order_basic = v1.1.0, v1.0.0
validator.typeOptions.order_large = v1.1.0, v1.0.0
# Validation type groups
validator.typeGroup.order = order_basic, order_large
validator.typeGroup.invoice = invoice_full, invoice_summary
validator.typeGroupLabel.order = Purchase order
validator.typeGroupLabel.invoice = Invoice
validator.typeGroupPresentation = split
validator.label.typeGroupLabel = Document type
# Validation artefact mappings for purchase orders
validator.schemaFile.order_basic.v1.1.0 = order/xsd/PurchaseOrder_v1.1.0.xsd
validator.schematronFile.order_basic.v1.1.0 = order/sch/PurchaseOrderBasic_v1.1.0.xslt
validator.schemaFile.order_basic.v1.0.0 = order/xsd/PurchaseOrder_v1.0.0.xsd
validator.schematronFile.order_basic.v1.0.0 = order/sch/PurchaseOrderBasic_v1.0.0.xslt
validator.schemaFile.order_large.v1.1.0 = order/xsd/PurchaseOrder_v1.1.0.xsd
validator.schematronFile.order_large.v1.1.0 = order/sch/PurchaseOrderBasic_v1.1.0.xslt, order/sch/PurchaseOrderLarge_v1.1.0.xslt
validator.schemaFile.order_large.v1.0.0 = order/xsd/PurchaseOrder_v1.0.0.xsd
validator.schematronFile.order_large.v1.0.0 = order/sch/PurchaseOrderBasic_v1.0.0.xslt, order/sch/PurchaseOrderLarge_v1.0.0.xslt
# Validation artefact mappings for invoices
validator.schemaFile.invoice_full = invoice/xsd/Invoice.xsd
validator.schematronFile.invoice_full = invoice/sch/InvoiceFull.xslt
validator.schemaFile.invoice_summary = invoice/xsd/Invoice.xsd
validator.schematronFile.invoice_summary = invoice/sch/InvoiceSummary.xslt

We will now revisit the purchase order and invoice validator configurations, to ensure existing users can seamlessly transition to the new ecommerce validator. This is achieved by defining a domain alias per case. The order domain’s configuration is extended as follows:

validator.uploadTitle = Purchase Order Validator
...
# Delegate to the ecommerce validator
validator.domainAlias = ecommerce
validator.domainAlias.basic.v1.1.0 = order_basic.v1.1.0
validator.domainAlias.basic.v1.0.0 = order_basic.v1.0.0
validator.domainAlias.large.v1.1.0 = order_large.v1.1.0
validator.domainAlias.large.v1.0.0 = order_large.v1.0.0

Similarly, the invoice domain’s configuration is extended as follows:

validator.uploadTitle = Invoice Validator
...
# Delegate to the ecommerce validator
validator.domainAlias = ecommerce
validator.domainAlias.full = invoice_full
validator.domainAlias.summary = invoice_summary

These alias definitions ensure that any requests to the legacy validators will be delegated transparently to the new consolidated domain. Revisiting the validator’s filesystem resources, the order, invoice and ecommerce domains are defined as follows:

validator
└── resources
    ├── order
    │   ├── sch
    │   │   ├── PurchaseOrderBasic_v1.1.0.xslt
    │   │   ├── PurchaseOrderBasic_v1.0.0.xslt
    │   │   ├── PurchaseOrderLarge_v1.1.0.xslt
    │   │   └── PurchaseOrderLarge_v1.0.0.xslt
    │   ├── xsd
    │   │   ├── PurchaseOrder_v1.1.0.xsd
    │   │   └── PurchaseOrder_v1.0.0.xsd
    │   └── config.properties
    ├── invoice
    │   ├── sch
    │   │   ├── InvoiceFull.xslt
    │   │   └── InvoiceSummary.xslt
    │   ├── xsd
    │   │   └── Invoice.xsd
    │   └── config.properties
    └── ecommerce
        ├── order
        │   ├── sch
        │   │   ├── PurchaseOrderBasic_v1.1.0.xslt
        │   │   ├── PurchaseOrderBasic_v1.0.0.xslt
        │   │   ├── PurchaseOrderLarge_v1.1.0.xslt
        │   │   └── PurchaseOrderLarge_v1.0.0.xslt
        │   └── xsd
        │       ├── PurchaseOrder_v1.1.0.xsd
        │       └── PurchaseOrder_v1.0.0.xsd
        ├── invoice
        │   ├── sch
        │   │   ├── InvoiceFull.xslt
        │   │   └── InvoiceSummary.xslt
        │   └── xsd
        │       └── Invoice.xsd
        └── config.properties

If your validator is self-hosted, this would be the updated resource root folder you would provide as its configuration. Alternatively, if your validators are hosted by the Interoperability Test Bed, you likely have separate repositories for each configuration. This is not an issue however, as all you need to do is ensure each domain is set up correctly and the alias definitions are in place.

Note

Domain aliases work when all domains are defined in the same validator application. If this is not the case, you will need another solution such as reverse proxy rewriting and/or redirects. As such approaches are outside the control of validators, they are outside the scope of this guide.
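
As a variation on the order domain’s mappings shown earlier, a mapping can also point to a validation type alias defined in the target domain. The following sketch assumes a hypothetical order_basic_latest alias added to the ecommerce domain; the comments indicate which domain’s configuration file each part would belong to.

```properties
# In the ecommerce domain's config.properties (hypothetical alias).
validator.typeAlias.order_basic_latest = order_basic.v1.1.0

# In the order domain's config.properties.
validator.domainAlias = ecommerce
# Map to the target domain's alias rather than to a specific full validation type.
validator.domainAlias.basic.v1.1.0 = order_basic_latest
validator.domainAlias.basic.v1.0.0 = order_basic.v1.0.0
validator.domainAlias.large.v1.1.0 = order_large.v1.1.0
validator.domainAlias.large.v1.0.0 = order_large.v1.0.0
```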

Managing remote schema references

XML schemas are often composed of several related schemas by means of import and include elements. For example, a schema describing purchase orders referring to addresses, will often not define the structure of an address, but will rather import its definition from a separate schema.

Whenever such schema references are found, the validator will gracefully load them depending on how they are defined. Absolute references are loaded unchanged, whereas relative ones are retrieved based on the schema that includes them. Moreover, references can be both local and remote, with relative ones calculated accordingly based on the “parent” schema’s location.

When the validator loads a schema from a remote source it will cache it locally by default to avoid repeated lookups, with cached schemas being cleared upon validator restart. In case you want to disable the caching of remote schemas you can set property validator.skipRemoteSchemaImportCaching to true. This might be interesting in case you are referring to schemas that are in development and can change while still being served using the same URI.

...
validator.skipRemoteSchemaImportCaching = true

Assuming remote schema caching is not disabled, the validator will look up and cache schemas lazily upon first validation. Alternatively, you can choose to eagerly load schema references and cache them at startup, by setting property validator.preloadRemoteSchemaImports to true. You can also fine-tune this behaviour for specific (full) validation types by defining one or more validator.preloadRemoteSchemaImports.TYPE properties, where the TYPE postfix corresponds to the validation type. In this case, the non-postfixed property (if present) will serve as the default for unspecified types.

...
# By default preload and cache all remote XSD imports.
validator.preloadRemoteSchemaImports = true
# Skip this for the 'experimental' validation type.
validator.preloadRemoteSchemaImports.experimental = false

Preloading and caching remote schema references can be interesting if you need to ensure maximum performance from the very first validation. In addition, the preloading process could be useful in discovering schema reference errors.

Note

If validator.skipRemoteSchemaImportCaching is true the preloading of schemas at startup is disabled.

A final tool at your disposal when managing remote schema references is the possibility to map them to local files. Besides avoiding remote lookups to improve performance, this could be necessary in case your validator is unable to read remote resources due to networking restrictions. Schema mappings are provided by specifying a series of validator.remoteSchemaImportMapping properties, indexed using a zero-based index, with each entry defining two properties:

uri, with the full remote URI for the schema.
file, with the path to the local file relative to the domain root folder.

To illustrate how this works, consider a purchase order schema that references two schemas: an address schema that is imported and an order item schema that is included.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Relative URI. -->
   <xs:include schemaLocation="OrderItem.xsd"/>
  <!-- Absolute URI. -->
   <xs:import schemaLocation="https://www.itb.ec.europa.eu/common/Address.xsd" namespace="http://itb.ec.europa.eu/sample/address"/>
  ...
</xs:schema>

To ensure that our validator makes no remote lookups we need to handle the loading of the address schema. To achieve this we will store a local copy of the schema and add a relevant mapping for the URI. Note that any other schemas included or imported from this schema would also need similar local copies and mappings. The validator’s domain configuration folder will look like this:

validator
└── resources
    └── order
        ├── imports
        │   ├── Address.xsd
        │   └── PostalCodes.xsd
        ├── xsd
        │   ├── OrderItem.xsd
        │   └── PurchaseOrder.xsd
        └── config.properties

In terms of configuration, the mappings are defined as follows:

...
# Mapping for the address schema.
validator.remoteSchemaImportMapping.0.uri = https://www.itb.ec.europa.eu/common/Address.xsd
validator.remoteSchemaImportMapping.0.file = imports/Address.xsd
# Mapping for a second schema included from the address schema.
validator.remoteSchemaImportMapping.1.uri = https://www.itb.ec.europa.eu/common/PostalCodes.xsd
validator.remoteSchemaImportMapping.1.file = imports/PostalCodes.xsd

Validation artefact pre-processing

An advanced configuration option available to you is to enable for a given validation type the pre-processing of the validator’s artefacts, both pre-configured ones (local or remote) as well as those provided by users (if enabled). Pre-processing allows you to run an XSLT transformation on a resource, in order to produce the final file to be used for the validation (either an XML Schema or Schematron). The input for such pre-processing can be any text-based file, allowing you to dynamically generate validation artefacts based on a flexible input.

XSLT pre-processing can be configured for all cases of resources:

Local pre-configured files (per file).
Remote pre-configured files (per file).
User-provided files (per file type).

Pre-processing is enabled by adding to the relevant configuration properties the following postfixes:

.preprocessor: The reference to a locally available XSLT file to be used for the transformation.
.preprocessor.output: The file extension for the resulting file (by default xsd for XML Schema and sch for Schematron).

Defining the .preprocessor.output postfix can be interesting for Schematron output in case the result is not a raw Schematron (.sch) but rather is itself an XSLT file.

To illustrate use of these properties consider the following sample for a v2 validation type that addresses all cases:

...
# A local XML file to be used as input to generate the XML Schema
validator.schemaFile.v2 = xsds/MyXSD.xml
validator.schemaFile.v2.preprocessor = resources/xsd_template.xslt
# A remote XML file to be used as input to generate the (raw) Schematron
validator.schematronFile.v2.remote.0.url = https://my.server.com/my_rules.xml
validator.schematronFile.v2.remote.0.url.preprocessor = resources/sch_template.xslt
validator.schematronFile.v2.remote.0.url.preprocessor.output = sch
# User-provided XML files to be used to generate (raw) Schematron files
validator.externalSchematronFile.v2 = required
validator.externalSchematronFile.v2.preprocessor = resources/sch_template.xslt
validator.externalSchematronFile.v2.preprocessor.output = sch

The above configuration effectively expects all pre-configured resources and user input to serve as input in generating the actual validation artefacts to use. The XML Schema will be generated using XSLT file resources/xsd_template.xslt whereas Schematrons will be generated using the resources/sch_template.xslt file.
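
As a variation, consider a pre-processing step whose result is a Schematron already in XSLT form rather than a raw .sch file. A minimal sketch follows, in which the file names are hypothetical and xslt is assumed to be an accepted extension value (this guide only names xsd and sch as the defaults):

```properties
# A local XML file used as input to generate a Schematron in XSLT form.
validator.schematronFile.v2 = rules/my_rules.xml
validator.schematronFile.v2.preprocessor = resources/sch_xslt_template.xslt
# Declare that the generated file is an XSLT rather than a raw Schematron (assumed value).
validator.schematronFile.v2.preprocessor.output = xslt
```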

Using such pre-processing can allow you to achieve powerful customisations for your XML validator. A good example would be a configuration with a fixed, predefined XML Schema, with Schematron files that are generated on-the-fly based on one or more XML files provided by the user. To complement such customisation, the label used to prompt users for Schematron files would also be adapted via the validator.label.externalSchematronLabel property to reflect the expected XML input files (see Properties related to UI labels).

Input pre-processing

A feature similar in concept to the pre-processing of validation artefacts is the option to pre-process the validator’s input. The purpose of such pre-processing is to focus the validation on a specific part of the incoming XML content rather than the entire document. The typical use case for this is when the content of interest is included within a container construct or when a metadata header is present such as the Standard Business Document Header (SBDH). In this case your validation artefacts (XSDs and Schematrons) would likely be tailored towards your business payload and should ignore such headers and container structures to focus on the payload itself. Alternatively, you may have separate validation types focusing on different types of validation such as header-only, payload-only or the complete document.

Pre-processing of input can be configured in your validator by means of XPath expressions, defined for the validation types that need them. Once your validator receives the input XML for a given validation type, it will check whether an XPath expression is defined for that type and, if so, apply it to pre-process the input before validating. Configuring input pre-processing expressions is done through validator.input.preprocessor.TYPE properties in your domain configuration file.

For example if you have XML content such as the following:

<root xmlns="http://www.foo.org/">
   <header>
      ...
   </header>
   <payload>
      <po:purchaseOrder xmlns:po="http://itb.ec.europa.eu/sample/po.xsd">
         ...
      </po:purchaseOrder>
   </payload>
</root>

You could define different types of validation to focus on the header, the payload or the complete document as follows:

...
validator.type = header, payload, full
...
# Expression to extract the header.
validator.input.preprocessor.header = //*[local-name() = 'header' and namespace-uri() = 'http://www.foo.org/']
# Expression to extract the payload.
validator.input.preprocessor.payload = //*[local-name() = 'purchaseOrder' and namespace-uri() = 'http://itb.ec.europa.eu/sample/po.xsd']
# No need to specify an expression for the "full" type as content will be validated as-is.

Input transformation

A feature similar to input pre-processing using XPath expressions is the transformation of the input using an XSLT stylesheet. With XSLT you can achieve more complex transformations than XPath, as it allows you to adapt both the structure and content of the received XML. An example scenario where this could be useful is if you expect certain users to provide data for validation that is not fully aligned to your specifications. In this case, rather than adapt your XSDs and Schematrons to accommodate such exceptions, you could configure additional validation types and options that will modify the input on the fly, resulting in the expected format.

Applying an XSLT transformation is achieved using validator.input.transformer.TYPE properties in your domain configuration file (one optional property set per validation type). The value of such a property is the path to an XSLT file, relative to the domain root folder.

To illustrate this with an example, we could extend our sample configuration to address users that are currently generating purchase orders that differ slightly from the specification. Specifically, instead of listing order items’ part numbers as attributes they are using child elements. The following XSLT can be used to identify such cases and convert them to the expected structure:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:po="http://itb.ec.europa.eu/sample/po.xsd">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- Match items defining part numbers as child elements. -->
  <xsl:template match="po:item[exists(po:partNum)]">
    <!-- Copy the item, setting the part number as an attribute and then skipping the element. -->
    <xsl:copy>
        <xsl:attribute name="partNum"><xsl:value-of select="po:partNum/text()"/></xsl:attribute>
        <xsl:apply-templates select="node() except po:partNum"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

Given this XSLT, we can place it under our domain root folder (for example in xslts/item_conversion.xslt) and configure it as an input transformation step for a new validation type called largeConverted:

...
validator.type = basic, large, largeConverted
...
validator.input.transformer.largeConverted = xslts/item_conversion.xslt

Note

Using both input pre-processing and transformation: In case you have defined both input pre-processing and transformation, the transformation is always applied after XPath pre-processing has taken place. Using both could be interesting if you want to first use XPath to extract the part of the input to focus on, and then use XSLT to transform it to what you expect.
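
To sketch such a combination, the following configuration reuses the payload extraction expression and the item conversion stylesheet from the earlier examples for a single validation type. The payloadConverted type name is hypothetical.

```properties
validator.type = basic, large, payloadConverted
# First, extract the purchase order from its container using XPath.
validator.input.preprocessor.payloadConverted = //*[local-name() = 'purchaseOrder' and namespace-uri() = 'http://itb.ec.europa.eu/sample/po.xsd']
# Then, transform the extracted content using XSLT.
validator.input.transformer.payloadConverted = xslts/item_conversion.xslt
```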

Continue validation in case of XSD errors

XML validators often use a combination of XML Schema (XSD) and Schematron to validate, respectively, the structure and content of XML data. We saw this already in our sample configuration where to validate “large” purchase orders we used a Schematron rule as an extension over the basic XSD validation.

When combining XSD and Schematron checks we typically want to first validate against the configured XSD to ensure the XML data is structurally what we expect, before proceeding with Schematron checks. Moreover, it usually makes sense to skip Schematron checks altogether in case of XSD errors, given the possibility of inconsistent results due to the data’s invalid structure.

You can override this default behaviour by forcing the validator to proceed with Schematron checks regardless of XSD failures. This is achieved through the validator.stopOnXsdErrors boolean property that can be defined as follows:

validator.stopOnXsdErrors, to set the default behaviour for all validation types.
validator.stopOnXsdErrors.TYPE, to override the behaviour for a given (full) validation type.

Coming back to our example specification, we can adapt the configuration for large purchase orders by ensuring validation proceeds even if XSD errors are found:

...
validator.type = basic, large
...
# For the "large" validation type, continue even in the presence of XSD errors.
validator.stopOnXsdErrors.large = false

Note

Property validator.stopOnXsdErrors has no effect for validation types without Schematron rules.
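The default and the type-specific property can also be combined. The following fragment is a minimal sketch that only uses the two properties described above; which type continues and which one stops is chosen purely for illustration.

```properties
# Continue with Schematron checks after XSD errors for all validation types by default.
# (This has no effect for types that define no Schematron rules.)
validator.stopOnXsdErrors = false
# For the "large" validation type, skip Schematron checks when XSD errors are found.
validator.stopOnXsdErrors.large = true
```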

Adding a custom banner and footer

The validator foresees configuration properties to further customise the validator’s web user interface by adding a rich banner and/or footer. Adding at least a banner to your validator goes a long way in making it more user friendly, providing needed context, but also references to supporting resources and support details.

A good example of a simple banner can be drawn from the Test Bed’s generic XML validator where apart from explaining the purpose of the validator, users are also informed of its alternate APIs and contact information.

Banners can nonetheless be as complex as needed by introducing more advanced layouts, styling, and images, to provide further information and theming. A good example of such a banner is from the Public Service Data validator, a demo instance of the Test Bed’s RDF validator that includes images, text formatting, and popups (upon clicking provided links).

Adding a banner and footer to your validator is achieved by setting, respectively, the domain configuration properties validator.bannerHtml and validator.footerHtml with a snippet of minified HTML code. The snippet can also include CSS, and may make use of the Bootstrap styles and components (version 3.*) that come packaged with the validator. In case you need to include JavaScript, you can add this in minified form through the separate validator.javascriptExtension property. Such scripts may also use jQuery.

Note

Banners and footers are only available for the validator’s normal user interface. If enabled, the validator’s minimal interface hides banners and footers to provide a more concise presentation.

A common approach to customise and test a new HTML snippet is to draft its general structure, run it on a browser, and use the browser’s developer tools to fine-tune it. Once it suits your needs, the snippet is then minified (e.g. using any of the various online utilities for HTML and JavaScript), before being set to the corresponding domain property (validator.bannerHtml, validator.footerHtml or validator.javascriptExtension).

As an example case, but also as a good starting point for your own banner, we can consider the banner configured for the generic XML validator. The banner’s (pretty-printed) HTML snippet is the following:

<div>
    <div style="display: table;">
        <div style="display: table-row;">
            <div style="display: table-cell; cursor: pointer;" class="validatorReload">
                <h1>XML validator</h1>
            </div>
        </div>
        <div style="display: table-row;">
            <div style="display: table-cell; padding-top: 20px;">
                <p> This service allows you to validate arbitrary XML content against <a href="https://www.w3.org/standards/xml/schema">XML Schema</a> (to validate structure) and <a href="http://schematron.com/">Schematron</a> (to validate content). It is also offered via a <a href="https://www.itb.ec.europa.eu/xml/api/validation?wsdl">SOAP API</a> and a <a href="https://www.itb.ec.europa.eu/xml-offline/xml/validator.zip">command-line tool</a> with further information available <a href="https://www.itb.ec.europa.eu/docs/guides/latest/validatingXML/">here</a>. Questions and feedback can be sent to <a href="mailto:DIGIT-ITB@ec.europa.eu">DIGIT-ITB@ec.europa.eu</a>. </p>
            </div>
        </div>
    </div>
    <hr>
</div>

The “validatorReload” class above is a marker class to allow us to attach basic JavaScript behaviour to reload the page on click. This is achieved by this script:

$(document).ready(function() {
    $(".validatorReload").off().on("click", function() { window.location.href="upload"; });
});

Taking these snippets we can then minify them, to produce single line versions ready to set in properties validator.bannerHtml and validator.javascriptExtension as follows:

...
validator.bannerHtml=<div><div style="display:table"><div style="display:table-row"><div style="display:table-cell;cursor:pointer" class="validatorReload"><h1>XML validator</h1></div></div><div style="display:table-row"><div style="display:table-cell;padding-top:20px"><p>This service allows you to validate arbitrary XML content against <a href="https://www.w3.org/standards/xml/schema">XML Schema</a> (to validate structure) and <a href="http://schematron.com/">Schematron</a> (to validate content). It is also offered via a <a href="https://www.itb.ec.europa.eu/xml/api/validation?wsdl">SOAP API</a> and a <a href="https://www.itb.ec.europa.eu/xml-offline/xml/validator.zip">command-line tool</a> with further information available <a href="https://www.itb.ec.europa.eu/docs/guides/latest/validatingXML/">here</a>. Questions and feedback can be sent to <a href="mailto:DIGIT-ITB@ec.europa.eu">DIGIT-ITB@ec.europa.eu</a>.</p></div></div></div><hr></div>
validator.javascriptExtension=$(document).ready(function() { $(".validatorReload").off().on("click", function() { window.location.href="upload"; });});

Note

Use of CSS and images: For simple content the easiest approach is to define it inline within your HTML. For images it is best to add them as remote references, although for very small ones you may also define them inline as data URLs. Keep in mind, however, that inline content is not browser-cacheable, so try to keep its size small.
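A footer is configured in the same way as the banner. The fragment below is only a sketch: the text and the support@example.org address are placeholders, and the text-muted and text-center classes are assumed to come from the Bootstrap 3 styles packaged with the validator.

```properties
# Hypothetical footer content; replace the text and address with your own.
validator.footerHtml=<div><hr><p class="text-muted text-center">Purchase Order Validator. Questions and feedback can be sent to <a href="mailto:support@example.org">support@example.org</a>.</p></div>
```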

Supporting multiple languages

Certain configuration properties we have seen up to now define texts that are visible to the validator’s users. Examples of these include the title of the validator’s user interface (validator.uploadTitle) or the labels to present for the available validation types (validator.typeLabel.TYPE), which in the sample configuration are set with English values. Depending on your validator’s audience you may want to switch to a different language or support several languages at the same time. Supporting multiple languages affects:

The texts, labels and messages presented on the validator’s user interface.
The reports produced after validating content via any of the validator’s interfaces.

The text values used by default by the validator are defined in English (see default values here), with English being the language considered by the validator if no other is selected. If your validator needs to support only a single language, a simple approach is to ensure that the domain-level configuration properties for texts presented to users are defined in the domain configuration file with the values for your selected language. Note that as long as the validator’s target language is an EU official language you need not provide translations for user interface labels and messages as these are defined by the validator itself. You are nonetheless free to redefine these to override the defaults or to define them for a non-supported language.

In case you want your validator to support multiple languages at the same time you need to adapt your configuration to define the supported languages and their specific translations. To do this adapt your domain configuration property file making use of the following properties:

validator.locale.available: The list of languages to be supported by the validator, provided as a comma-separated list of language codes (locales). The order these are defined with determines their presentation order in the validator’s user interface.
validator.locale.default: The validator’s default language, considered if no specific language has been requested. If multiple languages are supported the default needs to be set to one of these.
validator.locale.translations: The path to a folder (absolute or relative to the domain configuration file) that contains the translation property files for the validator’s supported languages.

Each language (locale) is defined as a 2-character lowercase language code (based on ISO 639), followed by an optional 2-character uppercase country code (based on ISO 3166) separated using an underscore character (_). The format is in fact identical to that used to define locales in the Java language. Valid examples include “fr” for French, “fr_FR” for French in France, or “fr_BE” for French in Belgium. Such language codes are the values expected to be used for properties validator.locale.available and validator.locale.default.

Regarding property validator.locale.translations, the value is expected to be a folder containing the translation files for your selected languages. These are defined exactly as you would define a resource bundle in a Java program, specifically:

The names of all property files start with the same prefix. Any value can be selected with good examples being “labels” or “translations”.
The common prefix is followed by the relevant locale value (language code and country code) separated with an underscore.
The files’ extension is set as “.properties”.

Considering the above, good examples of translation property file names would be “labels_de.properties”, “labels_fr.properties” and “labels_fr_FR.properties”. Note that these files are implicitly hierarchical meaning that for related locales you need not redefine all properties. For example you may have your French texts defined in “labels_fr.properties” and only override what you need in “labels_fr_BE.properties”. You can also define an overall default translation file by omitting the locale in its name (labels.properties) which will be used when no locale-specific file exists or if it exists but does not include the given property key. Note additionally that if you define translatable text values in your main domain configuration file these are considered as overall defaults if no specific translations could be found in translation files.
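The hierarchy between related locales could look as sketched below. Two files are shown in one listing, separated by comments. The texts are illustrative, the fr_BE file is hypothetical, and the sketch assumes that both fr and fr_BE are listed in validator.locale.available.

```properties
# File: labels_fr.properties (French texts shared by all French locales)
validator.typeLabel.basic = Bon de commande de base
validator.typeLabel.large = Bon de commande important
validator.uploadTitle = Validateur de bon de commande

# File: labels_fr_BE.properties (French in Belgium, overrides only the title)
validator.uploadTitle = Validateur de bon de commande (Belgique)
```

With such files a user selecting fr_BE would see the overridden title, while the validation type labels would be taken from labels_fr.properties.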

In terms of contents, the translation files are simple property files including key-value pairs. Each such pair defines as its key the property key for the given text, with the value being the translation to use. The properties that can be defined in such files are:

Any domain-level configuration properties that are marked as being a translatable String.
Any user interface labels and messages if you want to override the default translations.

Considering that you typically wouldn’t need to override labels and messages, the texts you would need to translate are the ones relevant to your specification. These are most often the following:

The title of the validator’s UI (validator.uploadTitle).
The UI’s HTML banner content (validator.bannerHtml), which can be customised as explained in Adding a custom banner and footer.
The descriptions for the validation types that you define and their options (validator.typeLabel.TYPE, validator.optionLabel.OPTION, validator.typeOptionLabel.TYPE.OPTION and validator.completeTypeOptionLabel.TYPE.OPTION).

The information up to this point covers the translation of texts, labels and messages but has not yet addressed the validator’s validation artefacts. For the XML validator these artefacts are XML Schemas and Schematron files to validate XML syntax and content respectively. For XML Schemas default translations are already in place for you but you will still need to define language-specific messages for your Schematron files. This is done by defining a diagnostics element within which you define a diagnostic for each translated message. The diagnostic messages define an identifier (attribute id) to allow referencing, and a language attribute (attribute xml:lang) to specify the relevant language. The Schematron assertions (elements assert) are then updated to reference with their diagnostics attributes the IDs of the corresponding diagnostic elements, separated with spaces.

To illustrate how all this comes together let’s revisit our Purchase Order example. In our current, single-language and English-only setup, the configuration files are structured as follows:

validator
└── resources
    └── order
        ├── sch
        │   └── LargePurchaseOrder.sch
        ├── xsd
        │   └── PurchaseOrder.xsd
        └── config.properties

The domain configuration file (config.properties) itself defines the user-presented texts (the validator.typeLabel.* and validator.uploadTitle properties):

validator.type = basic, large
validator.typeLabel.basic = Basic purchase order
validator.typeLabel.large = Large purchase order
validator.schemaFile.basic = xsd/PurchaseOrder.xsd
validator.schemaFile.large = xsd/PurchaseOrder.xsd
validator.schematronFile.large = sch/LargePurchaseOrder.sch
validator.uploadTitle = Purchase Order Validator

In addition, our Schematron file (LargePurchaseOrder.sch) includes in its assert element the text of the rule’s error message in English:

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <title>Large Purchase Order business rules</title>
    <ns prefix="po" uri="http://itb.ec.europa.eu/sample/po.xsd"/>
    <pattern name="Check order items">
        <rule context="/po:purchaseOrder/po:items/po:item">
            <assert test="number(po:quantity) > 10" flag="fatal" id="PO-01">[PO-01] The quantities of items for large orders must be greater than 10.</assert>
        </rule>
    </pattern>
</schema>

Starting from this point we will make the necessary changes to support German and French translations alongside English (which remains the default language). The first step is to adapt the config.properties file to remove the contained translations. We could have kept these here for English but as we will be adding specific translation files it is cleaner to move all translations to them. The content of config.properties now becomes as follows:

validator.type = basic, large
validator.schemaFile.basic = xsd/PurchaseOrder.xsd
validator.schemaFile.large = xsd/PurchaseOrder.xsd
validator.schematronFile.large = sch/LargePurchaseOrder.sch
validator.locale.available = en,fr,de
validator.locale.default = en
validator.locale.translations = translations

To define the translations we will introduce a new folder translations (as defined in property validator.locale.translations) that includes the property files per locale:

validator
└── resources
    └── order
        ├── sch
        │   └── LargePurchaseOrder.sch
        ├── xsd
        │   └── PurchaseOrder.xsd
        ├── translations
        │   ├── labels_en.properties
        │   ├── labels_fr.properties
        │   └── labels_de.properties
        └── config.properties

The English translations are provided in labels_en.properties (these are simply moved here from the config.properties file):

validator.typeLabel.basic = Basic purchase order
validator.typeLabel.large = Large purchase order
validator.uploadTitle = Purchase Order Validator

French translations are defined in labels_fr.properties:

validator.typeLabel.basic = Bon de commande de base
validator.typeLabel.large = Bon de commande important
validator.uploadTitle = Validateur de bon de commande

German translations are defined in labels_de.properties:

validator.typeLabel.basic = Grundbestellung
validator.typeLabel.large = Großbestellung
validator.uploadTitle = Validator für Bestellungen

Finally, we must not forget to adapt LargePurchaseOrder.sch with the error messages per language:

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <title>Large Purchase Order business rules</title>
    <ns prefix="po" uri="http://itb.ec.europa.eu/sample/po.xsd"/>
    <pattern name="Check order items">
        <rule context="/po:purchaseOrder/po:items/po:item">
            <assert test="number(po:quantity) > 10" flag="fatal" id="PO-01" diagnostics="d01_en d01_fr d01_de"/>
        </rule>
    </pattern>
    <diagnostics>
        <diagnostic id="d01_en" xml:lang="en">[PO-01] The quantities of items for large orders must be greater than 10.</diagnostic>
        <diagnostic id="d01_fr" xml:lang="fr">[PO-01] Les quantités d'articles pour les grosses commandes doivent être supérieures à 10.</diagnostic>
        <diagnostic id="d01_de" xml:lang="de">[PO-01] Die Artikelmengen für Großbestellungen müssen größer als 10 sein.</diagnostic>
    </diagnostics>
</schema>

This completes the validator’s localisation configuration. With this setup in place, the user will be able to select one of the supported languages to change the validator’s user interface and resulting report. Note that localised reports can also now be produced from the validator’s REST API, SOAP API and command-line tool.

Validation metadata in reports

The machine-processable report produced when calling the validator via its SOAP API, REST API, or downloaded from its user interface, uses the GITB Test Reporting Language (TRL). The GITB TRL is an XML format, but when using the REST API in particular, it may also be generated in JSON.

Apart from defining the report’s main content, the GITB TRL format foresees optional metadata to provide information on the validation service itself and the type of validation applied. Specifically:

An identifier and name for the report.
A name and version for the validator.
The validation profile considered as well as any type-specific customisation.

The inclusion of all such properties is driven through your domain configuration file. The report and validator metadata are optional fixed values that you may configure to apply to all produced reports. The validation profile and its customisation, however, apart from supporting overall default values, can also be set with values depending on the configured validation types. If nothing is configured, the only metadata included by default is the profile, which is set to the validation type that was used to carry out the validation (selected by the user, or implicit if there is a single defined type or a default).

The following table summarises the available report metadata, the relevant configuration properties and their configuration approach:

Report element	Configuration property	Description
`/id`	`validator.report.id`	Identifier for the overall report, set as a string value.
`/name`	`validator.report.name`	Name for the overall report, set as a string value.
`/overview/validationServiceName`	`validator.report.validationServiceName`	Name for the validator, set as a string value.
`/overview/validationServiceVersion`	`validator.report.validationServiceVersion`	Version for the validator, set as a string value.
`/overview/profileID`	`validator.report.profileId[.type]`	The applied profile (validation type). Multiple entries can be added for configured validation types added as a postfix. When defined without a postfix the value is considered as an overall default. If entirely missing this is set to the applied validation type.
`/overview/customizationID`	`validator.report.customisationId[.type]`	A customisation of the applied profile. Multiple entries can be added for configured validation types added as a postfix. When defined without a postfix the value is considered as an overall default.

To illustrate the above properties consider first the following XML report metadata, produced by default if no relevant configuration is provided. Only the profile is included, set to the validation type that was used:

<TestStepReport>
    ...
    <overview>
        <profileID>basic</profileID>
    </overview>
    ...
</TestStepReport>

Extending now our domain configuration, we can include additional metadata as follows:

# A name to display for the validator.
validator.report.validationServiceName=Purchase Order Validator
# A version to display for the validator.
validator.report.validationServiceVersion=v1.0.0
# The name for the overall report.
validator.report.name=Purchase order validation report
# The profile to display depending on the selected validation type (basic or large).
validator.report.profileId.basic=Basic purchase order
validator.report.profileId.large=Large purchase order

Applying the above configuration will result in GITB TRL reports produced with the following metadata included:

<TestStepReport name="Purchase order validation report">
    ...
    <overview>
        <profileID>Basic purchase order</profileID>
        <validationServiceName>Purchase Order Validator</validationServiceName>
        <validationServiceVersion>v1.0.0</validationServiceVersion>
    </overview>
    ...
</TestStepReport>

In a JSON report produced by the validator’s REST API the metadata would be included as follows:

{
    ...
    "overview": {
        "profileID": "Basic purchase order",
        "validationServiceName": "Purchase Order Validator",
        "validationServiceVersion": "v1.0.0"
    },
    ...
    "name": "Purchase order validation report"
}
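The remaining properties from the table (report identifier and customisation) are configured in the same way. The fragment below is a sketch with invented values. Based on the table, these would be expected to appear as the report's /id and /overview/customizationID elements.

```properties
# Hypothetical identifier for all reports produced by this validator.
validator.report.id = po-validation-report
# Default customisation, used when no type-specific value is defined.
validator.report.customisationId = Standard rules
# Customisation reported for the "large" validation type only.
validator.report.customisationId.large = Large order rules
```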

Rich text support in report items

A validation report’s items represent the findings of a given validation run. The description of report items is by default treated as simple text and displayed as such in all report outputs. If this description includes rich text (i.e. HTML) content, the validator’s user interface will escape and display it as-is without rendering it.

It is possible to configure your validator to expect report items with descriptions including rich text, and specifically HTML links (anchor elements). If enabled, links will be rendered as such in the validator’s user interface and PDF reports, so that when clicked, their target is opened in a separate window. A typical use case for this would be to link each reported finding with online documentation that provides further information or a normative reference.

To enable HTML links in report items set property validator.richTextReports to true as part of your domain configuration properties.

validator.richTextReports = true

It is important to note that with this feature enabled, the description of report items is sanitised to remove any rich content that is not specifically a link. If found, non-link HTML tags are stripped from descriptions, leaving only their contained text (if present).

Support for XML Schema version 1.1

By default the validator treats XML Schema artefacts as being at version 1.0. In case your validator needs to support XML Schema 1.1 you need to explicitly configure this in your domain configuration. The configuration can be done for your entire domain, or per validation type.

Configuring XML Schema 1.1 support is done by means of the following properties:

validator.schemaVersion, to specify the default version to consider.
validator.schemaVersion.TYPE, to specify the version to consider for a specific validation type.

In terms of values, use 1.0 for XML Schema v1.0 and 1.1 for v1.1.

The following example illustrates a scenario where your validator is using XML Schema 1.0 for an earlier version of your specifications, whereas later versions consider XML Schema 1.1:

# Consider XML Schema v1.1 by default.
validator.schemaVersion = 1.1
# In the case of your specification's initial version, consider XML Schema v1.0.
validator.schemaVersion.mySpec.1_0_0 = 1.0

Note

Why XML Schema v1.0 as the default? XML Schema v1.0 is much more widely adopted than v1.1 and is therefore treated as the default. The version needs to be configured explicitly because v1.1 is mostly, but not fully, backwards compatible.
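The opposite arrangement is equally possible: keep the v1.0 default and enable v1.1 for a single validation type. The sketch below reuses the basic and large types of the purchase order example purely for illustration. It assumes that the schema configured for large actually uses XML Schema 1.1 features.

```properties
validator.type = basic, large
# No validator.schemaVersion is set, so "basic" keeps the default (XML Schema v1.0).
# Only "large" is validated as XML Schema v1.1.
validator.schemaVersion.large = 1.1
```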

Step 4: Setup validator as Docker container

Note

When to set up a Docker container: The purpose of setting up your validator as a Docker container is to host it yourself or run it locally on workstations. If you prefer or don’t mind the validator being accessible over the Internet it is simpler to delegate hosting to the Test Bed team by reusing the Test Bed’s infrastructure. If this is the case skip this section and go directly to Step 5: Setup validator on Test Bed. Note however that even if you opt for a validator managed by the Test Bed, it may still be interesting to create a Docker image for development purposes (e.g. to test new validation artefact versions) or to make available to your users as a complementary service (i.e. use online or download and run locally).

Once the validator’s configuration is ready (configuration file and validation artefacts) you can proceed to create a Docker image.

The configuration for your image is driven by means of a Dockerfile. Create this file in the validator folder with the following contents:

FROM isaitb/xml-validator:latest
COPY resources /validator/resources/
ENV validator.resourceRoot /validator/resources/

This Dockerfile represents the simplest Docker configuration you can provide for the validator. Let’s analyse each line:

`FROM isaitb/xml-validator:latest`	This tells Docker that your image will be built over the latest version of the Test Bed’s `isaitb/xml-validator` image. This represents the validator’s core that expects configuration to drive the validation. It is available on the public Docker Hub and as such can be directly used through any Docker installation with Internet access.
`COPY resources /validator/resources/`	This copies your `resources` folder to the image under path `/validator/resources/`.
`ENV validator.resourceRoot /validator/resources/`	This instructs the validator that it should consider as the root of all its configuration resources the `/validator/resources/` folder (which was just copied into it).

The contents of the validator folder should now be as follows:

validator
├── resources
│   └── order
│       ├── sch
│       │   └── LargePurchaseOrder.sch
│       ├── xsd
│       │   └── PurchaseOrder.xsd
│       └── config.properties
└── Dockerfile

That’s it. To build the Docker image open a command prompt to the validator folder and issue:

docker build -t po-validator .

This command will create a new local Docker image named po-validator based on the Dockerfile it finds in the current directory. It will proceed to download missing images (e.g. the isaitb/xml-validator:latest image) and eventually print the following output:

Sending build context to Docker daemon  32.77kB
Step 1/3 : FROM isaitb/xml-validator:latest
---> 39ccf8d64a50
Step 2/3 : COPY resources /validator/resources/
---> 66b718872b8e
Step 3/3 : ENV validator.resourceRoot /validator/resources/
---> Running in d80d38531e11
Removing intermediate container d80d38531e11
---> 175eebf4f59c
Successfully built 175eebf4f59c
Successfully tagged po-validator:latest

The new po-validator:latest image can now be pushed to a local Docker registry or to the Docker Hub. In our case we will proceed directly to run this as follows:

docker run -d --name po-validator -p 8080:8080 po-validator:latest

This command will create a new container named po-validator based on the po-validator:latest image you just built. It is set to run in the background (-d) and expose its internal listen port through the Docker machine (-p 8080:8080). Note that by default the listen port of the container (which you can map to any available host port) is 8080.

Your validator is now online and ready to validate XML documents. If you want to try it out immediately skip to Step 6: Use the validator. Otherwise, read on to see additional configuration options for the image.

Running without a custom Docker image

The discussed approach involved building a custom Docker image for your validator. Doing so allows you to run the validator yourself but also potentially push it to a public registry such as the Docker Hub. This would then allow anyone else to pull it and run a self-contained copy of your validator.

If such a use case is not important for you, or if you want to only use Docker for your local artefact development, you could also skip creating a custom image and use the base isaitb/xml-validator image directly. To do so you would need to:

Define a named or unnamed Docker volume pointing to your configuration files.
Run your container by configuring it with the volume.

Considering the same file structure of the /validator folder you can launch your validator by mounting your resources folder as a volume (in this case a bind mount of the host folder) as follows:

docker run -d --name po-validator -p 8080:8080 \
           -v /validator/resources:/validator/resources/ \
           -e validator.resourceRoot=/validator/resources/  \
           isaitb/xml-validator

As you see here we create the validator directly from the base image and pass it as a volume our resource root folder. When doing so you need to also make sure that the validator.resourceRoot environment variable is set to the path within the container.

Using this approach to run your validator has the drawback of being unable to share it as-is with other users. The benefit however is one of simplicity given that there is no need to build intermediate images. As such, updating the validator for configuration changes means that you only need to restart your container.

Note

Running the default Docker image can also be done without providing a validator.resourceRoot. If you decide to do this, a generic instance with the “any” validator will automatically be set up for you and you will be able to access it on http://localhost:8080/any/upload.

Configuring additional validation domains

Up to this point you have configured validation for purchase orders which defines one or more validation types (basic and large). This configuration can be extended by providing additional types to reflect:

Additional profiles with different business rules (e.g. minimal).
Specification versions (e.g. basic_v1.0, large_v1.0, basic_v1.1_beta).
Other types of content that are linked to purchase orders (e.g. purchase_order_basic_v1.0 and order_receipt_v1.0).

All such extensions would involve defining potentially additional validation artefacts and updating the config.properties file accordingly.

Apart from extending the validation possibilities linked to purchase orders you may want to configure a completely separate validator to address an unrelated specification that would most likely not be aimed at the same user community. To do so you have two options:

Repeat the previous steps to define a separate configuration and a separate Docker image. In this case you would be running two separate containers that are fully independent.
Reuse your existing validator instance to define a new validation domain. The result will be two validation services that are logically separate but are running as part of a single validator instance.

The rationale behind the second option is simply one of required resources. If you are part of an organisation that needs to support validation for dozens of different XML-based specifications that are unrelated, it would probably be preferable to have a single application to host rather than one per specification.

Note

Sharing artefact files across domains: Setting the application property validator.restrictResourcesToDomain to false allows you to add paths of validation artefacts that are outside of the domain root folder. This enables sharing artefacts between different domains.

In your current single domain setup, the purchase order configuration is reflected through folder order. The name of this folder is also by default assumed to match the name of the domain. A new domain could be named invoice that is linked to XML-based invoices. This is represented by an invoice folder next to order that contains similarly its validation artefacts and domain-level configuration property file. Considering this new domain, the contents of the validator folder would be as follows:

validator
├── resources
│   ├── invoice
│   │   └── (Further contents skipped)
│   └── order
│       └── (Further contents skipped)
└── Dockerfile

If you were now to rebuild the validator’s Docker image this would set up two logically-separate validation domains (invoice and order).
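For completeness, the domain configuration file of the new invoice domain could be as simple as the sketch below. The validation type name, labels and the xsd/Invoice.xsd path are hypothetical and only mirror the structure used for the order domain.

```properties
# File: resources/invoice/config.properties (hypothetical content)
validator.type = standard
validator.typeLabel.standard = Standard invoice
validator.schemaFile.standard = xsd/Invoice.xsd
validator.uploadTitle = Invoice Validator
```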

Note

Validation domains vs types: In almost all scenarios you should be able to address your validation needs by having a single validation domain with multiple validation types. Validation types under the same domain will all be presented as options for users. Splitting in domains would make sense if you don’t want the users of one domain to see the supported validation types of other domains.

Important: Support for such configuration is only possible if you are defining your own validator as a Docker image. If you plan to use the Test Bed’s shared validator instance (see Step 5: Setup validator on Test Bed), your configuration needs to be limited to a single domain. Note of course that if you need additional domains you can in this case simply repeat the configuration process multiple times.

Additional configuration options

We have seen up to now that configuring how validation takes place is achieved through domain-level configuration properties provided in the domain configuration file (file config.properties in our example). When setting up the validator as a Docker image you may also make use of application-level configuration properties to adapt the overall validator’s operation. Such configuration properties are provided as environment variables through ENV directives in the Dockerfile.

We already saw this when defining the validator.resourceRoot property that is the only mandatory property for which no default exists. Other such properties that you may choose to override are:

validator.domain: A comma-separated list of names that are to be loaded as the validator’s domains. By default the validator scans the provided validator.resourceRoot folder and selects as domains all subfolders that contain a configuration property file (folder order in our case). You may want to configure the list of folder names to consider if you want to ensure that other folders get ignored.
validator.domainName.DOMAIN: A mapping for a domain (replacing the DOMAIN placeholder) that defines the name that should be presented to users. This would be useful if the folder name itself (order in our example) is not appropriate (e.g. if the folder was named files).
validator.acceptedMimeTypes: The mime types of received XML files to consider as acceptable input for the validator.
validator.acceptedSchematronExtensions: The file extensions to consider when looking up Schematron files.
validator.rateLimit.*: A series of properties that allow you to configure validation rate limits per client IP address and requested endpoint.
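The default domain discovery described for validator.domain can be pictured with a short Python sketch. This is only an illustration of the documented behaviour, not the validator's own code. The function names are hypothetical, the property file is assumed to be named config.properties as in this guide, and ignoring configured names that have no matching folder is an assumption.

```python
import tempfile
from pathlib import Path


def discover_domains(resource_root, configured=None):
    """Return the folder names that would be treated as domains.

    Illustration only: mirrors the documented default of selecting every
    subfolder of the resource root that contains a configuration property file.
    """
    root = Path(resource_root)
    found = sorted(
        child.name
        for child in root.iterdir()
        # Assumption: the property file is named config.properties, as in this guide.
        if child.is_dir() and (child / "config.properties").is_file()
    )
    if configured:
        # validator.domain is a comma-separated list of folder names to consider.
        wanted = [name.strip() for name in configured.split(",") if name.strip()]
        # Assumption: names without a matching domain folder are ignored.
        return [name for name in wanted if name in found]
    return found


def build_sample_root(base):
    """Create a throwaway resource root with two domains and one unrelated folder."""
    for name in ("order", "invoice"):
        (Path(base) / name).mkdir()
        (Path(base) / name / "config.properties").write_text("validator.type = basic\n")
    (Path(base) / "docs").mkdir()  # no config.properties, so not a domain
    return base


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_sample_root(tmp)
        print(discover_domains(tmp))
        print(discover_domains(tmp, configured="order"))
```

Running a check like this against your own resources folder before building the image is a quick way to confirm which folders will end up exposed as domains.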

The following example Dockerfile illustrates use of these properties. The values set correspond to the applied defaults so the resulting Docker images from this Dockerfile and the original one (see Step 4: Setup validator as Docker container) are in fact identical:

FROM isaitb/xml-validator:latest
COPY resources /validator/resources/
ENV validator.resourceRoot /validator/resources/
ENV validator.domain order
ENV validator.domainName.order order
ENV validator.acceptedMimeTypes application/xml, text/xml, text/plain
ENV validator.acceptedSchematronExtensions xsl, xslt, sch

In case you want to configure validation rate limits for your validator (per validator instance), the following configuration shows an example setup:

# Enable validation rate limits (by default no limits are enforced).
ENV validator.rateLimit.enabled true
# When a limit is exceeded return a 429 "Too Many Requests" response (including a Retry-After header).
ENV validator.rateLimit.warnOnly false
# Read the client's IP address from the X-Real-IP header (in case the validator is proxied).
ENV validator.rateLimit.ipHeader X-Real-IP
# Apply limits per minute of 100 for the UI, 500 for the REST API, 50 for the bulk REST API, and 400 for the SOAP API.
ENV validator.rateLimit.capacity.uiValidate 100
ENV validator.rateLimit.capacity.restValidate 500
ENV validator.rateLimit.capacity.restValidateMultiple 50
ENV validator.rateLimit.capacity.soapValidate 400

See Application-level configuration for the full list of supported application-level properties.
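If rate limits are enforced, clients of the machine-to-machine APIs need to cope with 429 responses. The following Python sketch shows one way a client could derive a waiting time from the Retry-After header mentioned above. The function name and the one-second fallback are assumptions made for illustration. Both forms that HTTP allows for this header (a number of seconds or an HTTP date) are handled, because this guide does not state which one the validator returns.

```python
import email.utils
import time
from datetime import timezone


def retry_delay_seconds(status, headers, now=None, fallback=1.0):
    """Return the seconds to wait before retrying, or None if no retry is needed."""
    if status != 429:
        return None
    value = headers.get("Retry-After")
    if value is None:
        return fallback  # assumption: short pause when the header is missing
    value = value.strip()
    if value.isdigit():
        # Delay expressed as a number of seconds.
        return float(value)
    try:
        # Delay expressed as an HTTP date.
        target = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return fallback
    if target.tzinfo is None:
        target = target.replace(tzinfo=timezone.utc)
    current = time.time() if now is None else now
    return max(0.0, target.timestamp() - current)


if __name__ == "__main__":
    print(retry_delay_seconds(429, {"Retry-After": "30"}))
    print(retry_delay_seconds(200, {}))
```

A caller would typically sleep for the returned number of seconds and then repeat the request, giving up after a bounded number of attempts.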

Finally, it may be the case that you need to adapt further configuration properties that relate to how the validator’s application is run. The validator is built as a Spring Boot application which means that you can override all configuration properties by means of environment variables. This is rarely needed as you can achieve most important configuration through the way you run the Docker container (e.g. defining port mappings). Nonetheless the following adapted Dockerfile shows how you could ensure the validator’s application starts up on another port (9090) and uses a specific context path (/ctx).

FROM isaitb/xml-validator:latest
COPY resources /validator/resources/
ENV validator.resourceRoot /validator/resources/
ENV server.servlet.context-path /ctx
ENV server.port 9090

Note

Custom port: Even if you set the server.port property to a value other than the default 8080, this remains internal to the Docker container. The port through which you access the validator will be the one you map on your host through the -p flag of the docker run command.

The full list of such application configuration properties, as well as their default values, is available in the Spring Boot configuration property documentation.

Environment-specific domain configuration

In the previous section you saw how you can configure the validator’s application by means of environment properties. Use of environment properties is also possible for specific validator domains, allowing you to externalise and override their configuration aspects. Typical cases where you may want to do this are:

To adapt the configuration for different instances of the same validator.
To hide sensitive properties such as internal URLs or passwords.

External configuration properties can be provided through environment variables or system properties (the latter being interesting when using a command line validator). These can contribute to the configuration in two ways:

Complete properties, by defining a variable or property with the same name as a configuration property.
Values, by referring to arbitrary variables or properties within a domain property file. This is done by defining a placeholder ${} and using within it the prefixes env: for environment variables or sys: for system properties (e.g. ${env:MY_VARIABLE}).

As a simple example of this, consider the definition of the title of the validator’s web UI. We want to adapt this title depending on the purpose of the specific validator instance. To begin with we can define the title via the validator.uploadTitle property in the config.properties file as follows:

validator.uploadTitle = Purchase Order Validator

The problem with this approach is that the upload title remains fixed across all instances of the validator. Alternatively we can define the value for the title as an environment variable named VALIDATOR_TITLE. To use this we can adapt config.properties to reference it as follows:

validator.uploadTitle = ${env:VALIDATOR_TITLE}

We can then adapt this for each Docker container we create by defining a specific value for the title:

docker run -e VALIDATOR_TITLE="Purchase Order Validator (Demo)" ...

A further alternative would be to externalise the complete validator.uploadTitle property by removing it from config.properties and specifying it as an environment variable:

docker run -e validator.uploadTitle="Purchase Order Validator (Demo)" ...

Note that such external configuration can also be used as a partial value. If we define an environment variable named VALIDATOR_ENV we could also use it within config.properties as follows:

validator.uploadTitle = Purchase Order Validator (${env:VALIDATOR_ENV})
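To make the placeholder syntax concrete, here is a minimal Python sketch of how ${env:...} and ${sys:...} references inside a property value could be expanded. It is an illustration rather than the validator's implementation. The function name is hypothetical, system properties are modelled as a plain dictionary, and leaving unresolved placeholders untouched is an assumption.

```python
import os
import re

# Matches ${env:NAME} and ${sys:NAME} placeholders.
PLACEHOLDER = re.compile(r"\$\{(env|sys):([^}]+)\}")


def expand_placeholders(value, env=None, sys_props=None):
    """Replace env: and sys: placeholders in a property value."""
    env = os.environ if env is None else env
    sys_props = {} if sys_props is None else sys_props

    def replace(match):
        source = env if match.group(1) == "env" else sys_props
        # Assumption: a placeholder with no matching variable is kept as-is.
        return source.get(match.group(2), match.group(0))

    return PLACEHOLDER.sub(replace, value)


if __name__ == "__main__":
    title = "Purchase Order Validator (${env:VALIDATOR_ENV})"
    print(expand_placeholders(title, env={"VALIDATOR_ENV": "Demo"}))
```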

Finally, you can use environment variables or system properties to override properties that are already defined in config.properties. This is useful if you use the file’s properties as default values and selectively override some of them as needed. A property’s value is looked up in sequence as follows:

Look in environment variables.
If not found look in system properties.
If not found look in the domain configuration file.
If not found consider the property’s overall default value (see Domain-level configuration).
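The lookup sequence above can be summarised in a few lines of Python. This is a sketch of the documented order only. The function name is hypothetical and each source is modelled as a dictionary.

```python
def resolve_property(name, env, sys_props, domain_config, defaults):
    """Return the first value found, following the documented lookup order."""
    # Order: environment variables, system properties, domain file, overall defaults.
    for source in (env, sys_props, domain_config, defaults):
        if name in source:
            return source[name]
    return None


if __name__ == "__main__":
    domain_config = {"validator.uploadTitle": "Purchase Order Validator"}
    env = {"validator.uploadTitle": "Purchase Order Validator (Demo)"}
    # The environment variable wins over the value in config.properties.
    print(resolve_property("validator.uploadTitle", env, {}, domain_config, {}))
```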

Step 5: Setup validator on Test Bed

Note

When to set up on Test Bed resources: Setting up your validator on the Test Bed’s resources removes hosting concerns and allows you to benefit from automatic service reloads for configuration changes. In doing so however you need to keep in mind that the validator will be exposed over the Internet. If this approach is not suitable for you (e.g. you want to expose the validator within a restricted network) you should consider setting up the validator as a Docker container (see Step 4: Setup validator as Docker container) that you can then host as you see fit.

To configure a validator using the Test Bed’s resources all you need to do is get in touch with the Test Bed team and provide the validator’s configuration. Specifically:

Send an email to DIGIT-ITB@ec.europa.eu describing your case: This step is needed for two reasons. Firstly you may want to have a further discussion and potentially a demo to better understand the available options. Secondly the Test Bed’s team would need to ensure that you qualify to use its resources (e.g. to check that you are not a private company planning to offer commercial validation services).
Share the configuration for the validator: Once contact has been established you need to provide the initial configuration for the validator.

Regarding the second step, the validator’s configuration to be shared is the contents of the validator folder as described in Step 3: Prepare validator configuration. The eventual goal here will be to have the configuration available through an accessible Git repository. This can be done in a number of ways listed below in decreasing order of preference:

Create a new Git repository: You can push all resources (the validator folder) to a new Git repository (e.g. on GitHub or the European Commission’s CITNet Bitbucket server). You can of course add any other resources to this repository as you see fit (e.g. a README file). Once done provide the repository’s URL to the Test Bed team.
Provide the resources to the Test Bed team: You can send the configuration files themselves to the Test Bed’s team (e.g. make an archive of the validator folder). Ideally you should define the configuration file but if in doubt you can simply describe the resources and the Test Bed team will prepare the initial configuration for you. When following this approach a new Git repository will be created for you on the European Commission’s CITNet Bitbucket server for which you will be assigned write access (assuming you have a CITNet user account).
Update an existing Git repository: If you already have a Git repository to maintain the validation artefacts you can reuse this by adding to it the required configuration file (config.properties in our case). When ready you will need to provide the Test Bed team with the URL to the repository and the location of the configuration file.

Following the initial configuration, the resulting Git repository will be monitored to detect any changes to the validation artefacts or the configuration file. If such a change is detected, the validation service will be automatically updated within a few minutes.

Note

Using a dedicated Git repository for the validator: Whether you define a new Git repository yourself or the Test Bed team creates one for you, the result is a repository that is dedicated to the validator. This approach is preferable to reusing an existing Git repository to avoid unwanted changes to the validator. Whether or not this is done through GitHub, CITNet’s Bitbucket or another service depends on what best suits your needs.

As part of the initial setup for the validator the Test Bed team will also configure how it is accessed. By default the name used matches the name of the folder that contains your configuration file (order in the considered example), but a different name can be used if you prefer. If this is the case make sure to inform the Test Bed team of your preferred naming.

Considering our example, for a name of order, the resulting root URL through which the validator will be accessed is https://www.itb.ec.europa.eu/order. The specific paths will depend on the supported validation channels as described in Step 6: Use the validator.

Step 6: Use the validator

Well done! At this step your validator has been successfully configured and is ready to use. Depending on which approach was followed, this may have been done either:

As a Docker container (described in Step 4: Setup validator as Docker container).
Through the Test Bed’s resources (described in Step 5: Setup validator on Test Bed).

The validation channels that are supported depend on the configuration you have supplied. This is done through the validator.channels property of your configuration file (config.properties) that defaults to form, rest_api, soap_api. The supported channels are as follows:

form: A web user interface allowing a user to upload the XML document to validate.
rest_api: A REST API allowing machine-to-machine integration using HTTP calls.
soap_api: A SOAP web service API allowing for machine-to-machine integration.

The following sub-sections describe how each channel can be used considering the example EU purchase order specification.

Validation via user interface

The validator’s user interface is available at the /upload path. The exact path depends on how this is deployed:

Via Docker: http://DOCKER_MACHINE:8080/order/upload
On the Test Bed: https://www.itb.ec.europa.eu/order/upload

The first page that you see is a simple form to upload the file to validate.

This form expects the following input:

Content to validate: The content that will be validated.
Validate as: The type of validation to apply.

The dropdown menu to the right of the Content to validate label selects the input method. For this you have three choices:

File: Content provided as a file upload (the default).
URI: Content provided as a remote URI reference.
Direct input: Content entered directly in an on-screen editor.

In case the validator is configured to support multiple languages, the form includes an additional dropdown menu to list them in the bottom-right corner. Selecting one of these languages will reload the interface and will record your choice to apply it automatically for future visits.

../_images/validator_upload_languages3.png

Note that all displayed labels can be adapted through the config.properties configuration file (see Properties related to UI labels). The available validation types match the ones defined in the validator.type property, displayed using the validator.typeLabel.TYPE labels. Moreover, the text title could be replaced by a configurable HTML banner, and further complemented with an HTML footer (see Domain-level configuration).

../_images/validator_upload_selected3.png

It is worth noting also that if your configuration defined only a single validation type, the user interface would be simplified by presenting only a single file upload input.

Finally, in case the validator’s configuration foresees user-provided validation artefacts (see User-provided validation artefacts) the interface is adapted to allow their provision. In the following example, the “large” validation type is set to allow optional external Schematron files. Any number of these can be provided by means of the provided icons.

../_images/validator_upload_external3.png

Once a file has been uploaded and the validation type is selected click the Validate button to trigger the validation. Upon completion you will be presented with the validation results:

This screen includes an overview of the result listing:

The validation timestamp (in UTC), the name of the validated file and the applied validation type.
The overall result (SUCCESS or FAILURE).
The number of errors, warnings and information messages.

This section is followed by the Details panel, where the details of each report item are listed:

Its type (whether this is an error, warning or information message).
Its description.
Its location in the provided input.

Clicking on each item’s details will open a popup that shows within the provided content the specific point that triggered the issue:

../_images/validator_result_errordetail2.png

In terms of reporting, apart from the on-screen display, buttons are available allowing you to view the validation report:

View annotated input: Opens a popup with the provided content, including annotations for the lines with relevant report items.
Download report: Download the validation report as XML (in GITB TRL syntax - sample here), PDF (sample here) or CSV.

Note that the download options are initially disabled but are enabled as soon as the respective reports become available.

In case validation has produced similar findings for multiple items, the validator offers the possibility to view reports in detailed (the default) or aggregated format. In case of an aggregated report, findings that have the same description and severity are merged to display only the first one, alongside an indication of the total number of occurrences. This indication is added as a prefix to the displayed description.

../_images/validator_result_aggregated3.png
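The aggregation rule described above (merge findings with the same description and severity, keep the first one, and indicate the number of occurrences) can be sketched in Python as follows. This is an illustration of the idea and not the validator's code. The dictionary keys, the function name and the exact format of the count prefix are hypothetical, as is the choice to leave single occurrences without a prefix.

```python
def aggregate_findings(findings):
    """Merge findings sharing severity and description, keeping the first one."""
    grouped = {}
    for finding in findings:
        key = (finding["severity"], finding["description"])
        if key not in grouped:
            # Dictionaries keep insertion order, so first occurrences stay in place.
            grouped[key] = {"first": finding, "count": 0}
        grouped[key]["count"] += 1

    aggregated = []
    for entry in grouped.values():
        merged = dict(entry["first"])
        if entry["count"] > 1:
            # Hypothetical prefix format for the number of occurrences.
            merged["description"] = f"[{entry['count']}] {merged['description']}"
        aggregated.append(merged)
    return aggregated


if __name__ == "__main__":
    sample = [
        {"severity": "error", "description": "Quantity too low", "location": "item[1]"},
        {"severity": "error", "description": "Quantity too low", "location": "item[3]"},
        {"severity": "warning", "description": "Missing comment", "location": "item[2]"},
    ]
    for item in aggregate_findings(sample):
        print(item)
```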

Aggregated reports are also available to download in XML, PDF or CSV formats. Regarding the on-screen display of findings, this can be switched between detailed and aggregated by using the provided control on the top of the “Details” panel. In case the validation report, detailed or aggregated, includes findings at different severity levels, you may also filter the displayed on-screen findings to show all items (the default), or show specifically errors, warnings or information messages.

Finally, once a validation result is produced you may trigger additional validations. To do this you may either use the form from the top of the result screen or click on the form’s title to take you back to the previous page.

Validation via minimal user interface

If you are exposing a web user interface (see Validation via user interface) for your validator you also have the option of enabling an alternative minimal interface that could be used as an embedded component in another web page (e.g. via an iframe). This is enabled through the validator.supportMinimalUserInterface property in your domain configuration (file config.properties).

...
validator.supportMinimalUserInterface = true

The result of this is to expose a /uploadm path. The path depends on how this is deployed:

Via Docker: http://DOCKER_MACHINE:8080/order/uploadm
On the Test Bed: https://www.itb.ec.europa.eu/order/uploadm

The minimal interface offers largely the same functionality as the complete one but with a more condensed layout and minimal styling. The initial input page you see for the validator is as follows:

../_images/validator_upload_minimal3.png

The most significant difference is the result page which provides initially only the overview and the relevant download controls:

../_images/validator_result_minimal3.png

You can switch to display the detailed findings by clicking the View details button in which case you will also see the relevant findings’ filtering controls. All controls and displayed information for the input as well as the summary and detailed result pages are identical to the complete user interface (see Validation via user interface).

Validation via embedded interface

In case you want to use the validator through an existing web application the typical approach would be to use the validator’s machine-to-machine interface (via REST API or SOAP API). This allows you to present your own user interface to users while integrating with the validator in the background to validate provided data.

An alternative to this integration is to embed the validator’s user interface directly within your own user interface. This is possible for web applications but also for simple websites that would collect input data and provide it to the validator. When embedding the validator in this way you define an iframe in your interface’s HTML and set its source to point to the validator. Doing so results in the validator’s user interface being displayed within your own, placed inside the defined iframe.

When embedding the validator in this way you have the following options available:

You may embed the validator’s complete user interface as-is. To do this set the source of the iframe to the validator’s interface (e.g. https://www.itb.ec.europa.eu/order/upload).
Alternatively you may embed the validator’s minimal user interface (if enabled) for a more concise presentation. To do this set the source of the iframe to the validator’s minimal interface variant (/uploadm instead of /upload).
If you want to manage data input yourself you may skip the validator’s input form, using it only to display validation results. In this case you would typically use the validator’s minimal user interface as its result display does not include an input form.

If you are using the validator only to display results (i.e. the last option above), you will need to manage data input yourself and provide it to the validator as it expects. To do this you include your inputs in a form that will need to be submitted to the validator via an HTTP POST request. The request parameters you may provide are listed in the following table:

Input name	Input type	Description	Required?
`file`	file	The file to validate. If provided the form must be set as `multipart/form-data`.	One of `file`, `uri` or `text` must be provided.
`uri`	text	The URI from which to load the content to validate.	One of `file`, `uri` or `text` must be provided.
`text`	text	The text to validate.	One of `file`, `uri` or `text` must be provided.
`contentType`	text	The type of input to consider (use `fileType` for `file`, `uriType` for `uri` or `stringType` for `text`).	Skip if only one of `file`, `uri` or `text` is submitted.
`validationType`	text	The type of validation to perform (as defined in the validator’s configuration).	Required unless the validator only defines a single validation type.

The following HTML sample is a simple web page that provides a file input control for its users:

<html>
    <head><title>Simple validator</title></head>
    <body>
        <h1>Validate your data</h1>
        <form method="POST" enctype="multipart/form-data" action="https://www.itb.ec.europa.eu/order/uploadm" target="output">
            <input type="file" name="file">
            <input type="hidden" name="validationType" value="large">
            <button type="submit">Validate</button>
        </form>
        <iframe name="output" style="width:100%; height:50%;" src='about:blank'></iframe>
    </body>
</html>

In the above sample take note of the following points:

Our input control is a file upload whereas the validation type is fixed and hidden. As we have a file upload, the enclosing form is set to make a multipart submission.
The validator interface to be used to display the results is the minimal interface (identified by the /uploadm path).
The validation result is displayed in an iframe named output. This is set to be initially empty.
The validation submission, an HTTP POST, is set to target the iframe. Once the validation is complete the iframe will display the output.

The following screenshot shows how the above configuration would appear once a validation has taken place.

Note

Disable validator embedding: For the validator to be embedded it needs to allow itself to be presented in iframes. If you prefer to disable this feature (see why here) you may set property validator.supportUserInterfaceEmbedding to false.

Validation via REST web service API

The validator’s REST API is available under the /rest/DOMAIN/api path. The exact path depends on how this is deployed:

Via Docker: http://DOCKER_MACHINE:8080/rest/order/api
On the Test Bed: https://www.itb.ec.europa.eu/xml/rest/order/api

The operations that the REST API supports are the following:

Operation	Description	HTTP method	Request payload type
`info`	Retrieve the available validation types for a given domain (or for all domains).	`GET`	None
`validate`	Validate one XML document.	`POST`	`application/json`
`validateMultiple`	Validate multiple XML documents.	`POST`	`application/json`

The supported operations as well as their input and output are thoroughly documented using OpenAPI and Swagger. The documentation can be accessed online at the /swagger-ui.html path.

The Swagger UI is notable as this provides rich, interactive documentation that can also be used to call the underlying operations. To access this navigate to:

If running via Docker: http://DOCKER_MACHINE:8080/swagger-ui.html
If running on the Test Bed: https://www.itb.ec.europa.eu/xml/swagger-ui.html

Note that before using the Swagger UI to execute any of the operations you will also need to specify the {domain} path parameter. In the example we have been following this would be order.

Coming back to the specific operations supported, the first one to address is the info operation. This can be useful if the validator is configured with multiple validation types in which case this service returns each type’s name and description.

Considering our order example making a GET request to http://DOCKER_MACHINE:8080/rest/order/api/info (or https://www.itb.ec.europa.eu/xml/rest/order/api/info on the Test Bed), you receive a JSON response as follows:

{
    "domain": "order",
    "validationTypes": [
        {
            "type": "basic",
            "description": "Basic purchase order"
        },
        {
            "type": "large",
            "description": "Large purchase order"
        }
    ]
}
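As a rough sketch of how a client could consume this response, the following Python snippet turns the info output into a mapping of validation types to descriptions. Only the JSON structure shown above is taken from this guide. The function names are hypothetical, and fetch_info simply assumes the API root URL is passed in (for example http://DOCKER_MACHINE:8080/rest/order/api).

```python
import json
import urllib.request


def fetch_info(api_url):
    """Call the info operation of the REST API rooted at api_url."""
    with urllib.request.urlopen(f"{api_url}/info") as response:
        return json.load(response)


def validation_types(info):
    """Map each validation type to its description."""
    return {entry["type"]: entry["description"] for entry in info["validationTypes"]}


# The response documented for the order example.
SAMPLE_INFO = """
{
    "domain": "order",
    "validationTypes": [
        {"type": "basic", "description": "Basic purchase order"},
        {"type": "large", "description": "Large purchase order"}
    ]
}
"""

if __name__ == "__main__":
    for name, description in validation_types(json.loads(SAMPLE_INFO)).items():
        print(f"{name}: {description}")
```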

To trigger validation of an XML document you use the validate operation by making a POST request of type application/json to http://DOCKER_MACHINE:8080/rest/order/api/validate (or https://www.itb.ec.europa.eu/xml/rest/order/api/validate on the Test Bed). The payload of the validate operation defines the following properties:

Property	Description	Required?	Type	Default value
`contentToValidate`	The content to validate.	Yes	A string that is interpreted based on the `embeddingMethod` input.
`embeddingMethod`	The way in which to interpret the `contentToValidate` value. It is advised to always provide this for best performance. If not provided the validator will attempt to infer the method from the content.	No	One of `STRING` (for content provided directly), `BASE64` (for content embedded as a BASE64 encoded string) or `URL` (for a reference to remote content).
`validationType`	The type of validation to perform.	Yes, unless a single validation type is defined.	String	The single configured validation type (if defined).
`externalSchemas`	An array of user-provided XML schemas to be considered with any predefined ones. These are accepted only if explicitly allowed in the configuration for the validation type in question.	No	An array of `SchemaInfo` entries (see below for content).
`externalSchematrons`	An array of user-provided Schematrons to be considered with any predefined ones. These are accepted only if explicitly allowed in the configuration for the validation type in question.	No	An array of `SchemaInfo` entries (see below for content).
`locationAsPath`	Whether the locations reported for returned errors will be XPath expressions (default true). False will return the line number in the input.	No	Boolean	`true`
`showLocationPaths`	When `locationAsPath` is false and locations are line numbers, whether a simplified XPath path expression will be added to report item locations.	No	Boolean	`false`
`locale`	Locale (language code) to use for reporting of results (e.g. “fr”, “fr_FR”). If the provided locale is not supported by the validator the default locale will be used instead. See Supporting multiple languages for details.	No	String
`addInputToReport`	Whether to include the validated input in the resulting report’s context section.	No	Boolean	`false`
`wrapReportDataInCDATA`	Whether to wrap the input (see addInputToReport) in a CDATA block if producing an XML report. False results in adding the input via XML escaping.	No	Boolean	`false`
`contextFiles`	An array of user-provided XML files to be considered as context files (usable by pre-configured XSLTs).	No	An array of `ContextFileInfo` entries (see below for content).

In case user-provided XML schemas and/or Schematrons are supported, these are provided as SchemaInfo entries as elements of the externalSchemas and externalSchematrons arrays. The content of each SchemaInfo is as follows:

Property	Description	Required?	Type	Default value
`schema`	The artifact’s content.	Yes	A string that is interpreted based on the `embeddingMethod` input.
`embeddingMethod`	The way in which to interpret the value for `schema`. It is advised to always provide this for best performance. If not provided the validator will attempt to infer the method from the `schema` value.	No	One of `STRING` (for content provided directly), `BASE64` (for content embedded as a BASE64 encoded string) or `URL` (for a reference to remote content).

In case user-provided context files are supported, these are provided as ContextFileInfo entries as elements of the contextFiles array. The content of each ContextFileInfo is as follows:

Property	Description	Required?	Type	Default value
`content`	The file’s content.	Yes	A string that is interpreted based on the `embeddingMethod` input.
`embeddingMethod`	The way in which to interpret the value for `content`. It is advised to always provide this for best performance. If not provided the validator will attempt to infer the method from the `content` value.	No	One of `STRING` (for content provided directly), `BASE64` (for content embedded as a BASE64 encoded string) or `URL` (for a reference to remote content).

To illustrate how this operation can be used, consider a purchase order that fails validation as the large type because the quantity of its first item is not greater than 10 (you can download the sample here):

<?xml version="1.0"?>
<purchaseOrder xmlns="http://itb.ec.europa.eu/sample/po.xsd" orderDate="2018-01-22">
  <shipTo country="BE">
    <name>John Doe</name>
    <street>Europa Avenue 123</street>
    <city>Brussels</city>
    <zip>1000</zip>
  </shipTo>
  <billTo country="BE">
    <name>Jane Doe</name>
    <street>Europa Avenue 210</street>
    <city>Brussels</city>
    <zip>1000</zip>
  </billTo>
  <comment>Send in one package please</comment>
  <items>
    <item partNum="XYZ-123876">
      <productName>Mouse</productName>
      <quantity>5</quantity>
      <priceEUR>15.99</priceEUR>
      <comment>Confirm this is wireless</comment>
    </item>
    <item partNum="ABC-32478">
      <productName>Keyboard</productName>
      <quantity>15</quantity>
      <priceEUR>25.50</priceEUR>
    </item>
  </items>
</purchaseOrder>

As a first validation example we will provide the content to validate as a URI to be looked up:

{
    "contentToValidate": "https://www.itb.ec.europa.eu/files/samples/xml/sample-invalid.xml",
    "validationType": "large"
}

In the contentToValidate parameter we include the URI to the file whereas in the validationType parameter we select the desired validation type. Note that we need to define the validation type as we have more than one configured for purchase orders. If there was only one this parameter would be optional.
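For machine-to-machine use, a request like this can be assembled and sent from code. The Python sketch below builds the payload from the properties documented above and posts it with the standard library. The function names are hypothetical, and the sketch assumes the API root URL (for example http://DOCKER_MACHINE:8080/rest/order/api) is supplied by the caller. It also sets embeddingMethod explicitly, as advised in the property table.

```python
import base64
import json
import urllib.request


def build_validate_payload(content, validation_type=None, embedding_method=None):
    """Assemble the JSON payload for the validate operation."""
    payload = {"contentToValidate": content}
    if embedding_method is not None:
        # One of STRING, BASE64 or URL.
        payload["embeddingMethod"] = embedding_method
    if validation_type is not None:
        payload["validationType"] = validation_type
    return payload


def payload_from_bytes(xml_bytes, validation_type=None):
    """Embed local XML content as a BASE64 encoded string."""
    encoded = base64.b64encode(xml_bytes).decode("ascii")
    return build_validate_payload(encoded, validation_type, "BASE64")


def post_validate(api_url, payload):
    """POST the payload to the validate operation and return the raw report text."""
    request = urllib.request.Request(
        f"{api_url}/validate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    by_url = build_validate_payload(
        "https://www.itb.ec.europa.eu/files/samples/xml/sample-invalid.xml",
        validation_type="large",
        embedding_method="URL",
    )
    print(json.dumps(by_url, indent=4))
```

Building the payload separately from sending it keeps the request easy to inspect or log before any network call is made.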

The response returned for this call will be the validation report in the XML GITB TRL syntax:

<?xml version="1.0" encoding="UTF-8"?>
<TestStepReport xmlns="http://www.gitb.com/tr/v1/"
                 xmlns:ns2="http://www.gitb.com/core/v1/"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:type="TAR">
    <date>2022-11-14T16:42:42.973Z</date>
    <result>FAILURE</result>
    <counters>
        <nrOfAssertions>0</nrOfAssertions>
        <nrOfErrors>1</nrOfErrors>
        <nrOfWarnings>0</nrOfWarnings>
    </counters>
    <reports>
        <error xsi:type="BAR">
            <description>[PO-01] The quantities of items for large orders must be greater than 10.</description>
            <location>/purchaseOrder/items/item[1]</location>
            <test>number(po:quantity) &gt; 10</test>
        </error>
    </reports>
</TestStepReport>

In this report we can see the overall validation result (FAILURE), its timestamp, as well as the individual report items (one in this case). Each such item includes:

The item’s description.
The item’s location (expressed in this case as an XPath expression).
The test that was evaluated to produce the item.
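
If you process such reports programmatically, the following minimal sketch shows one way to extract the overall result and the report items using Python's standard library. The function name summarise_report and the trimmed sample report are illustrative assumptions; only the element names and namespace are taken from the report shown above.

```python
import xml.etree.ElementTree as ET

TR_NS = "{http://www.gitb.com/tr/v1/}"

# Trimmed copy of the report shown above (illustration only).
SAMPLE_REPORT = """<TestStepReport xmlns="http://www.gitb.com/tr/v1/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="TAR">
    <result>FAILURE</result>
    <reports>
        <error xsi:type="BAR">
            <description>[PO-01] The quantities of items for large orders must be greater than 10.</description>
            <location>/purchaseOrder/items/item[1]</location>
            <test>number(po:quantity) &gt; 10</test>
        </error>
    </reports>
</TestStepReport>"""


def summarise_report(report_xml):
    """Return the overall result and one dictionary per report item."""
    root = ET.fromstring(report_xml)
    items = []
    reports = root.find(f"{TR_NS}reports")
    if reports is not None:
        for item in reports:
            items.append({
                # Strip the namespace to keep only the level (e.g. "error").
                "level": item.tag.split("}", 1)[-1],
                "description": item.findtext(f"{TR_NS}description"),
                "location": item.findtext(f"{TR_NS}location"),
                "test": item.findtext(f"{TR_NS}test"),
            })
    return root.findtext(f"{TR_NS}result"), items


result, items = summarise_report(SAMPLE_REPORT)
```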

In addition to the validation’s result you may also include the input data that was considered as context information in the produced report. To do this, set addInputToReport to true in the call:

{
    "contentToValidate": "https://www.itb.ec.europa.eu/files/samples/xml/sample-invalid.xml",
    "validationType": "large",
    "addInputToReport": true
}

Doing so will include a context section in the report with the validated data:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<TestStepReport xmlns="http://www.gitb.com/tr/v1/" xmlns:ns2="http://www.gitb.com/core/v1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="TAR">
    ...
    <context>
        <ns2:item embeddingMethod="STRING" mimeType="application/xml" name="xml">
            <ns2:value>&lt;?xml version="1.0" encoding="UTF-8"?&gt;&lt;purchaseOrder&gt;...&lt;/purchaseOrder&gt;</ns2:value>
        </ns2:item>
    </context>
    <reports>
        ...
    </reports>
</TestStepReport>

Note

CDATA vs XML-escaping: By default context data is added to the report using XML escaping. If you would prefer that CDATA blocks are used instead you may specify wrapReportDataInCDATA as true.

In case you would want to display the report’s XML in a user-friendly manner, the Test Bed makes available XSL stylesheets to transform it to HTML. This provides an alternative to using the validator’s user interface, when you need to work with a REST API but still want to present the resulting report as-is. Stylesheets for all official EU languages are available in a ZIP bundle, whereas individual stylesheets per language can also be accessed directly using the following URL (replace LANGUAGE with your desired language’s two-letter ISO code):

https://www.itb.ec.europa.eu/files/stylesheets/gitb_trl/gitb_trl_stylesheet_v1.0.LANGUAGE.xsl

The validation report may also be obtained in JSON format by specifying the HTTP Accept header and setting it to application/json. The JSON report corresponding to the previous validation would be as follows:

{
    "date": "2022-11-14T16:43:03.540+0000",
    "result": "FAILURE",
    "counters": {
        "nrOfAssertions": 0,
        "nrOfErrors": 1,
        "nrOfWarnings": 0
    },
    "reports": {
        "error": [
            {
                "description": "[PO-01] The quantities of items for large orders must be greater than 10.",
                "location": "/purchaseOrder/items/item[1]",
                "test": "number(po:quantity) > 10"
            }
        ]
    }
}
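
A client preferring JSON could handle this report along the following lines. This is only a sketch: the helper name summarise_json_report is hypothetical, and it assumes that any other finding levels appear under reports as lists shaped like the error list shown above.

```python
import json

# Headers for a call that both sends JSON and asks for a JSON report.
JSON_HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}

# Trimmed copy of the JSON report shown above (illustration only).
SAMPLE_JSON_REPORT = """{
    "result": "FAILURE",
    "counters": {"nrOfAssertions": 0, "nrOfErrors": 1, "nrOfWarnings": 0},
    "reports": {
        "error": [
            {
                "description": "[PO-01] The quantities of items for large orders must be greater than 10.",
                "location": "/purchaseOrder/items/item[1]",
                "test": "number(po:quantity) > 10"
            }
        ]
    }
}"""


def summarise_json_report(report_json):
    """Return the overall result, the error count and the list of findings."""
    report = json.loads(report_json)
    findings = []
    # Assumption: each key under "reports" maps a level to a list of items.
    for level, items in report.get("reports", {}).items():
        for item in items:
            findings.append((level, item["description"], item["location"]))
    return report["result"], report["counters"]["nrOfErrors"], findings


result, error_count, findings = summarise_json_report(SAMPLE_JSON_REPORT)
```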

When calling the validator we used the contentToValidate property to provide the input via remote URL. You may also choose to embed the content to be validated within the request itself either as a string (with appropriate JSON escaping) or as a BASE64 encoded string. When passing a string, the request would resemble the following:

{
  "contentToValidate": "<?xml version=\"1.0\"?>\r\n<purchaseOrder xmlns=\"http:\/\/itb.ec.europa.eu\/sample\/po.xsd\" orderDate=\"2018-01-22\">\r\n  <shipTo country=\"BE\">\r\n    <name>John Doe<\/name>\r\n    <street>Europa Avenue 123<\/street>\r\n    <city>Brussels<\/city>\r\n    <zip>1000<\/zip>\r\n  <\/shipTo>\r\n  <billTo country=\"BE\">\r\n    <name>Jane Doe<\/name>\r\n    <street>Europa Avenue 210<\/street>\r\n    <city>Brussels<\/city>\r\n    <zip>1000<\/zip>\r\n  <\/billTo>\r\n  <comment>Send in one package please<\/comment>\r\n  <items>\r\n    <item partNum=\"XYZ-123876\">\r\n      <productName>Mouse<\/productName>\r\n      <quantity>5<\/quantity>\r\n      <priceEUR>15.99<\/priceEUR>\r\n      <comment>Confirm this is wireless<\/comment>\r\n    <\/item>\r\n    <item partNum=\"ABC-32478\">\r\n      <productName>Keyboard<\/productName>\r\n      <quantity>15<\/quantity>\r\n      <priceEUR>25.50<\/priceEUR>\r\n    <\/item>\r\n  <\/items>\r\n<\/purchaseOrder>",
  "validationType": "large"
}

In case of a BASE64 encoded string, the request would be as follows:

{
  "contentToValidate": "PD94bWwgdmVyc2lvbj0iMS4wIj8+CjxwdXJjaGFzZU9yZGVyIHhtbG5zPSJodHRwOi8vaXRiLmVjLmV1cm9wYS5ldS9zYW1wbGUvcG8ueHNkIiBvcmRlckRhdGU9IjIwMTgtMDEtMjIiPgogIDxzaGlwVG8gY291bnRyeT0iQkUiPgogICAgPG5hbWU+Sm9obiBEb2U8L25hbWU+CiAgICA8c3RyZWV0PkV1cm9wYSBBdmVudWUgMTIzPC9zdHJlZXQ+CiAgICA8Y2l0eT5CcnVzc2VsczwvY2l0eT4KICAgIDx6aXA+MTAwMDwvemlwPgogIDwvc2hpcFRvPgogIDxiaWxsVG8gY291bnRyeT0iQkUiPgogICAgPG5hbWU+SmFuZSBEb2U8L25hbWU+CiAgICA8c3RyZWV0PkV1cm9wYSBBdmVudWUgMjEwPC9zdHJlZXQ+CiAgICA8Y2l0eT5CcnVzc2VsczwvY2l0eT4KICAgIDx6aXA+MTAwMDwvemlwPgogIDwvYmlsbFRvPgogIDxjb21tZW50PlNlbmQgaW4gb25lIHBhY2thZ2UgcGxlYXNlPC9jb21tZW50PgogIDxpdGVtcz4KICAgIDxpdGVtIHBhcnROdW09IlhZWi0xMjM4NzYiPgogICAgICA8cHJvZHVjdE5hbWU+TW91c2U8L3Byb2R1Y3ROYW1lPgogICAgICA8cXVhbnRpdHk+NTwvcXVhbnRpdHk+CiAgICAgIDxwcmljZUVVUj4xNS45OTwvcHJpY2VFVVI+CiAgICAgIDxjb21tZW50PkNvbmZpcm0gdGhpcyBpcyB3aXJlbGVzczwvY29tbWVudD4KICAgIDwvaXRlbT4KICAgIDxpdGVtIHBhcnROdW09IkFCQy0zMjQ3OCI+CiAgICAgIDxwcm9kdWN0TmFtZT5LZXlib2FyZDwvcHJvZHVjdE5hbWU+CiAgICAgIDxxdWFudGl0eT4xNTwvcXVhbnRpdHk+CiAgICAgIDxwcmljZUVVUj4yNS41MDwvcHJpY2VFVVI+CiAgICA8L2l0ZW0+CiAgPC9pdGVtcz4KPC9wdXJjaGFzZU9yZGVyPg==",
  "validationType": "large"
}

In each of these cases, the validator will attempt to determine how the provided contentToValidate should be treated based on its format. To speed up this process it is advised to specify the embeddingMethod input that makes the input processing approach explicit.

{
    "contentToValidate": "QHBy...CAu",
    "embeddingMethod": "BASE64",
    "validationType": "large"
}
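
The BASE64 variant of the request can be produced from a local document as in the following sketch. The helper name build_base64_request and the tiny stand-in XML document are assumptions for illustration; the property names are those used in the requests above.

```python
import base64
import json


def build_base64_request(xml_text, validation_type):
    """Build a request body that embeds the XML content as a BASE64 string."""
    encoded = base64.b64encode(xml_text.encode("utf-8")).decode("ascii")
    return {
        "contentToValidate": encoded,
        # Stating the method explicitly spares the validator from inferring it.
        "embeddingMethod": "BASE64",
        "validationType": validation_type,
    }


# Stand-in document; in practice read your purchase order from a file.
xml_text = '<?xml version="1.0"?>\n<purchaseOrder xmlns="http://itb.ec.europa.eu/sample/po.xsd"/>'
body = build_base64_request(xml_text, "large")
payload = json.dumps(body)
```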

Finally, recall that in Step 3: Prepare validator configuration the possibility was mentioned to allow for a given validation type, user-provided XML schemas and Schematrons as part of the validator’s input. To do this through the REST API you provide the externalSchemas and externalSchematrons arrays containing objects with two properties:

schema for the content of the XML schema or Schematron to consider.
embeddingMethod (STRING, URL or BASE64) to determine how the schema value is to be considered. As in the case of the input, this can be omitted but making it explicit is advised to speed up validation.

The following example illustrates how a user could provide an XML Schema and two Schematron files to consider for the validation:

{
  "contentToValidate": "ewo...Qp9",
  "embeddingMethod": "BASE64",
  "externalSchemas": [
      {
        "schema": "ewo...gfQ==",
        "embeddingMethod": "BASE64"
      }
  ],
  "externalSchematrons": [
      {
        "schema": "ewo...gXQ==",
        "embeddingMethod": "BASE64"
      },
      {
          "schema": "ewo...CK6",
          "embeddingMethod": "BASE64"
      }
  ]
}

Note

Blocking user-provided schemas: If you provide schemas where this has not been explicitly allowed the call will fail.
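
To avoid typos in the embeddingMethod values when assembling such requests, a small helper could check them before the call is made. The following is a sketch under stated assumptions: the function names and the example.org addresses are hypothetical, while the property names and the three allowed values come from the description above.

```python
import json

ALLOWED_METHODS = {"STRING", "URL", "BASE64"}


def external_artifact(schema, embedding_method):
    """Describe one user-provided XML Schema or Schematron file."""
    if embedding_method not in ALLOWED_METHODS:
        raise ValueError(f"Unsupported embeddingMethod: {embedding_method}")
    return {"schema": schema, "embeddingMethod": embedding_method}


def build_request(content_uri, validation_type, schemas=(), schematrons=()):
    """Build a request body for remote content plus optional external artefacts."""
    body = {
        "contentToValidate": content_uri,
        "embeddingMethod": "URL",
        "validationType": validation_type,
    }
    if schemas:
        body["externalSchemas"] = list(schemas)
    if schematrons:
        body["externalSchematrons"] = list(schematrons)
    return body


# Hypothetical addresses used purely for illustration.
body = build_request(
    "https://www.itb.ec.europa.eu/files/samples/xml/sample-invalid.xml",
    "large",
    schemas=[external_artifact("https://example.org/extra.xsd", "URL")],
    schematrons=[external_artifact("https://example.org/extra.sch", "URL")],
)
payload = json.dumps(body, indent=2)
```

Remember that the call only succeeds if user-provided artefacts are allowed for the validation type in question.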

The remaining operation that is available is validateMultiple that can be used for batch validation. In this case the input to the service uses the same JSON structure but this time as an array. To use the operation submit a POST request of type application/json to http://DOCKER_MACHINE:8080/rest/order/api/validateMultiple (or https://www.itb.ec.europa.eu/xml/rest/order/api/validateMultiple on the Test Bed).

In the following example two distinct validations are requested:

[
    {
        "contentToValidate": "https://www.itb.ec.europa.eu/files/samples/xml/sample.xml",
        "validationType": "basic"
    },
    {
        "contentToValidate": "https://www.itb.ec.europa.eu/files/samples/xml/sample-invalid.xml",
        "validationType": "large"
    }
]

The resulting response in this case always includes the reports as BASE64 encoded strings:

[
    {
        "report": "PD9...Qo="
    },
    {
        "report": "PD9...Qo="
    }
]

Validation via SOAP web service API

The validator’s SOAP API is available under the /api path. The exact path depends on how this is deployed (path to WSDL provided):

Via Docker: http://DOCKER_MACHINE:8080/api/order/validation?wsdl
On the Test Bed: https://www.itb.ec.europa.eu/order/api/validation?wsdl

Note that the api and order path elements are intentionally inverted in the Docker case. This inconsistency is due to a technical restriction that however doesn’t apply when running on the Test Bed.

The SOAP API used is the GITB validation service API, meaning that the validator is a GITB-compliant validation service. The importance of this is that apart from using it directly, this SOAP API allows integration of the validator in more complex conformance testing scenarios as a validation step in GITB TDL test cases. This potential is covered further in Step 7: Use the validator in GITB TDL test cases.

The operations supported are as follows:

getModuleDefinition: Called to return information on how to call the service (i.e. what inputs are expected).
validate: Called to trigger validation for provided content.

You can download this SOAP UI project that includes sample calls of these operations (make sure to change the service URL to match your setup).

Regarding the getModuleDefinition operation, a request of:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:v1="http://www.gitb.com/vs/v1/">
   <soapenv:Header/>
   <soapenv:Body>
      <v1:GetModuleDefinitionRequest/>
   </soapenv:Body>
</soapenv:Envelope>

Will produce a response as follows:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
   <soap:Body>
      <ns4:GetModuleDefinitionResponse xmlns:ns2="http://www.gitb.com/core/v1/" xmlns:ns3="http://www.gitb.com/tr/v1/" xmlns:ns4="http://www.gitb.com/vs/v1/">
         <module operation="V" id="ValidatorService">
            <ns2:metadata>
               <ns2:name>ValidatorService</ns2:name>
               <ns2:version>1.0.0</ns2:version>
            </ns2:metadata>
            <ns2:inputs>
               <ns2:param type="string" name="type" use="R" kind="SIMPLE"/>
               <ns2:param type="object" name="xml" use="R" kind="SIMPLE"/>
               <ns2:param type="string" name="embeddingMethod" use="O" kind="SIMPLE"/>
               <ns2:param type="list[map]" name="externalSchema" use="O" kind="SIMPLE"/>
               <ns2:param type="list[map]" name="externalSchematron" use="O" kind="SIMPLE"/>               
            </ns2:inputs>
         </module>
      </ns4:GetModuleDefinitionResponse>
   </soap:Body>
</soap:Envelope>

This response can be customised through configuration properties in config.properties to provide descriptions specific to your setup. For example, extending config.properties with the following:

...
validator.webServiceId = PurchaseOrderValidator
validator.webServiceDescription.xml = The purchase order XML file to validate
validator.webServiceDescription.type = The type of validation to perform ('basic' or 'large')

Will produce a response as follows:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
        <ns4:GetModuleDefinitionResponse xmlns:ns2="http://www.gitb.com/core/v1/" xmlns:ns3="http://www.gitb.com/tr/v1/" xmlns:ns4="http://www.gitb.com/vs/v1/">
            <module operation="V" id="PurchaseOrderValidator">
                <ns2:metadata>
                    <ns2:name>PurchaseOrderValidator</ns2:name>
                    <ns2:version>1.0.0</ns2:version>
                </ns2:metadata>
                <ns2:inputs>
                    <ns2:param type="object" name="xml" use="R" kind="SIMPLE" description="The purchase order XML file to validate"/>
                    <ns2:param type="string" name="type" use="R" kind="SIMPLE" description="The type of validation to perform ('basic' or 'large')"/>
                    <ns2:param type="string" name="embeddingMethod" use="O" kind="SIMPLE"/>
                    <ns2:param type="list[map]" name="externalSchema" use="O" kind="SIMPLE"/>
                    <ns2:param type="list[map]" name="externalSchematron" use="O" kind="SIMPLE"/>
                </ns2:inputs>
            </module>
        </ns4:GetModuleDefinitionResponse>
    </soap:Body>
</soap:Envelope>

Running the validation itself is done through the validate operation. This expects the following inputs:

Input	Description	Required?	Type	Default value
`xml`	The XML content to validate.	Yes	A string that is interpreted based on the `embeddingMethod` input.
`embeddingMethod`	The way in which to interpret the `xml` value.	No; it should be skipped in favour of the `embeddingMethod` attribute (see below).	One of `BASE64` (for content embedded as a BASE64 encoded string), `URI` (for a reference to remote content) or `STRING` (for content embedded as-is).
`type`	The type of validation to perform.	Yes, unless a single validation type is defined.	String	The single configured validation type (if defined).
`locationAsPath`	Whether the location of report items linked to Schematron validation should be expressed as an XPath expression (if `true`) or as the relevant line and column number (if `false`).	No	`boolean`	`false`
`showLocationPaths`	When `locationAsPath` is `false` and locations are line numbers, whether a simplified XPath expression will be added to report item locations.	No	`boolean`	`false`
`addInputToReport`	Whether the validated input should be included in the produced response as the report’s context.	No	`boolean`	`true`
`externalSchema`	A list of user-provided XML Schemas to be considered with any predefined ones. These are accepted only if explicitly allowed in the configuration for the validation type in question.	No	A list of map entries (see below for content).
`externalSchematron`	A list of user-provided Schematron extensions to be considered with any predefined ones. These are accepted only if explicitly allowed in the configuration for the validation type in question.	No	A list of map entries (see below for content).
`locale`	Locale (language code) to use for reporting of results (e.g. “fr”, “fr_FR”). If the provided locale is not supported by the validator the default locale will be used instead. See Supporting multiple languages for details.	No	String
`contextFiles`	A list of user-provided XML files to be considered as context files (usable by pre-configured XSLTs).	No	A list of map entries (see below for content).

Note

Configuration for increased throughput: If you expect to be validating large XML files and/or with high frequency it would be advised to call the validator setting locationAsPath to true as determining the position of each finding is costly. In addition, you can set addInputToReport to false to avoid including the validated content (which could be large) in the resulting report’s context.
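
As an illustration of the throughput-oriented settings, the following sketch assembles a ValidateRequest envelope in Python with locationAsPath set to true and addInputToReport set to false. It mirrors the structure of the sample envelope presented later in this section; the function name build_validate_envelope and the session identifier are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
VS_NS = "http://www.gitb.com/vs/v1/"
CORE_NS = "http://www.gitb.com/core/v1/"


def build_validate_envelope(inputs, session_id="session-1"):
    """Build a SOAP ValidateRequest from (name, embeddingMethod, value) tuples."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{VS_NS}}}ValidateRequest")
    # sessionId and input are unqualified, as in the guide's sample envelope.
    ET.SubElement(request, "sessionId").text = session_id
    for name, method, value in inputs:
        attributes = {"name": name, "embeddingMethod": method}
        input_element = ET.SubElement(request, "input", attributes)
        ET.SubElement(input_element, f"{{{CORE_NS}}}value").text = value
    return ET.tostring(envelope, encoding="unicode")


envelope = build_validate_envelope([
    ("xml", "URI", "https://www.itb.ec.europa.eu/files/samples/po-sample-invalid.xml"),
    ("type", "STRING", "large"),
    ("locationAsPath", "STRING", "true"),     # skip costly line/column lookup
    ("addInputToReport", "STRING", "false"),  # keep the report small
])
```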

Regarding the externalSchema and externalSchematron inputs, each item is treated as a map with three named properties:

Input	Description	Required?	Type
`content`	The content to consider.	Yes	A string that is interpreted based on the `embeddingMethod` input.
`embeddingMethod`	The way in which to interpret the `content` input value.	No	One of `BASE64` (for content embedded as a BASE64 encoded string), `URI` (for a reference to remote content) or `STRING` (the default - for content embedded as-is).
`type`	The specific type of artifact.	No	A string with value `xsd` (the default) for an XML Schema, or `sch` (the default) or `xsl` for Schematron.

With respect to the contextFiles input, the structure is similar, with each item treated as a map with two named properties:

Input	Description	Required?	Type
`content`	The content to consider.	Yes	A string that is interpreted based on the `embeddingMethod` input.
`embeddingMethod`	The way in which to interpret the `content` input value.	No	One of `BASE64` (for content embedded as a BASE64 encoded string), `URI` (for a reference to remote content) or `STRING` (the default - for content embedded as-is).

As an alternative to the embeddingMethod inputs, the GITB SOAP API foresees also the embeddingMethod attribute that is defined on each input element. The values it supports are:

Value	Description
`STRING`	The value is interpreted as-is as an embedded text.
`BASE64`	The value is interpreted as an embedded BASE64 string that will need to be decoded before processing.
`URI`	The value is interpreted as a remote URI reference that will be looked up before processing.

Note

embeddingMethod values: There are two approaches to provide the embeddingMethod value (as an input or as an attribute) due to a known issue in the GITB software where not all embedding methods can be leveraged within test cases (see Step 7: Use the validator in GITB TDL test cases).

The sample SOAP UI project includes sample requests per case. As an example, validating via URI would be done using the following envelope:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:v1="http://www.gitb.com/vs/v1/" xmlns:v11="http://www.gitb.com/core/v1/">
   <soapenv:Header/>
   <soapenv:Body>
      <v1:ValidateRequest>
         <sessionId>?</sessionId>
         <input name="xml" embeddingMethod="URI">
            <v11:value>https://www.itb.ec.europa.eu/files/samples/po-sample-invalid.xml</v11:value>
         </input>
         <input name="type" embeddingMethod="STRING">
            <v11:value>large</v11:value>
         </input>
<!-- 
   Sample illustrating use of a user-provided Schematron

	      <input name="externalSchematron">
        	  <v11:item>
	           <v11:item name="content" embeddingMethod="URI">
	        	    <v11:value>http://a.server/rules.xml</v11:value>
	        	  </v11:item>
	        	  <v11:item name="type" embeddingMethod="STRING">
	        	    <v11:value>sch</v11:value>
	        	  </v11:item>
        	  </v11:item>
         </input>
-->        
      </v1:ValidateRequest>
   </soapenv:Body>
</soapenv:Envelope>

With the resulting report provided as follows:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
   <soap:Body>
      <ns4:ValidationResponse xmlns:ns2="http://www.gitb.com/core/v1/" xmlns:ns3="http://www.gitb.com/tr/v1/" xmlns:ns4="http://www.gitb.com/vs/v1/">
         <report>
            <ns3:date>2019-06-05T15:08:21.537Z</ns3:date>
            <ns3:result>FAILURE</ns3:result>
            <ns3:counters>
               <ns3:nrOfAssertions>0</ns3:nrOfAssertions>
               <ns3:nrOfErrors>1</ns3:nrOfErrors>
               <ns3:nrOfWarnings>0</ns3:nrOfWarnings>
            </ns3:counters>
            <ns3:context>
               <ns2:item name="xml" embeddingMethod="STRING">
                  <ns2:value><![CDATA[<?xml version="1.0"?>
<purchaseOrder xmlns="http://itb.ec.europa.eu/sample/po.xsd" orderDate="2018-01-22">
  <shipTo country="BE">
    <name>John Doe</name>
    <street>Europa Avenue 123</street>
    <city>Brussels</city>
    <zip>1000</zip>
  </shipTo>
  <billTo country="BE">
    <name>Jane Doe</name>
    <street>Europa Avenue 210</street>
    <city>Brussels</city>
    <zip>1000</zip>
  </billTo>
  <comment>Send in one package please</comment>
  <items>
    <item partNum="XYZ-123876">
      <productName>Mouse</productName>
      <quantity>5</quantity>
      <priceEUR>15.99</priceEUR>
      <comment>Confirm this is wireless</comment>
    </item>
    <item partNum="ABC-32478">
      <productName>Keyboard</productName>
      <quantity>15</quantity>
      <priceEUR>25.50</priceEUR>
    </item>
  </items>
</purchaseOrder>]]></ns2:value>
               </ns2:item>
            </ns3:context>
            <ns3:reports>
               <ns3:error xsi:type="ns3:BAR" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
                  <ns3:description>[PO-01] The quantities of items for large orders must be greater than 10.</ns3:description>
                  <ns3:location>xml:17:0</ns3:location>
                  <ns3:test>number(po:quantity) > 10</ns3:test>
               </ns3:error>
            </ns3:reports>
         </report>
      </ns4:ValidationResponse>
   </soap:Body>
</soap:Envelope>

The returned report uses the GITB TRL syntax and is the same as the XML report you can download from the user interface (see Validation via user interface). It includes:

The validation timestamp (in UTC).
The overall result (SUCCESS or FAILURE).
The count of errors, warnings and information messages.
The context for the validation (i.e. the XML that was validated).
The list of report items displaying per item its description, location in the validated content and performed test.
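
When locations are reported as line and column positions, a client may want to split them into their parts. The sketch below assumes the NAME:LINE:COLUMN layout seen in the sample response (xml:17:0) and applies only when locationAsPath is false; the helper names and the trimmed response are illustrative.

```python
import xml.etree.ElementTree as ET

TR_NS = "{http://www.gitb.com/tr/v1/}"

# Trimmed copy of the SOAP response shown above (illustration only).
SAMPLE_RESPONSE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
   <soap:Body>
      <ns4:ValidationResponse xmlns:ns3="http://www.gitb.com/tr/v1/" xmlns:ns4="http://www.gitb.com/vs/v1/">
         <report>
            <ns3:result>FAILURE</ns3:result>
            <ns3:reports>
               <ns3:error>
                  <ns3:location>xml:17:0</ns3:location>
               </ns3:error>
            </ns3:reports>
         </report>
      </ns4:ValidationResponse>
   </soap:Body>
</soap:Envelope>"""


def parse_location(location):
    """Split a location such as 'xml:17:0' into (input name, line, column)."""
    name, line, column = location.rsplit(":", 2)
    return name, int(line), int(column)


def error_positions(soap_response):
    """Return the parsed location of every error in a SOAP validation response."""
    root = ET.fromstring(soap_response)
    return [
        parse_location(error.findtext(f"{TR_NS}location"))
        for error in root.iter(f"{TR_NS}error")
    ]


positions = error_positions(SAMPLE_RESPONSE)
```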

Validation via email

Validation via email takes place by configuring a mailbox for the validator (at the level of a validation domain) and validating any documents sent to it. The validator processes the XML attachments contained in an incoming message, validating each one against a configured validation type determined by how the attachment is named. Specifically, the validation type is added as a prefix to the attached file name, followed by a dot (.). For example, an attachment named basic.myFile.xml will be validated against the basic validation type.

Note

Validation type names when email validation is enabled: The target validation type is determined by the part of the attachment’s name up to the first dot (.). This means that an attachment named v1.2.myFile.xml will attempt validation for a v1 validation type rather than a v1.2 type. To avoid such issues, ensure your validation type identifiers don’t include dot (.) separators if you plan to enable validation via email (e.g. use v1_2 instead of v1.2).
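
The naming rule can be expressed in a few lines of Python. This is only an illustration of the rule as described in this guide, not the validator's actual implementation, and the function name is hypothetical.

```python
def validation_type_for(attachment_name):
    """Return the part of the attachment name before the first dot, if any."""
    prefix, separator, _ = attachment_name.partition(".")
    # Without a dot there is no prefix to interpret as a validation type.
    return prefix if separator else None


basic_type = validation_type_for("basic.myFile.xml")
dotted_type = validation_type_for("v1.2.myFile.xml")
safe_type = validation_type_for("v1_2.myFile.xml")
```

Here dotted_type resolves to v1 rather than v1.2, which is why identifiers such as v1_2 are the safer choice.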

Once validation is completed for all attachments, the results are returned as a reply addressed to the received email’s sender. This email includes for each validated document the XML and PDF version of the validation report as attachments. These are complemented by a summary message in the email’s body such as:

Validation report for [basic.myFile.xml]:
- Date: 14/06/2019 12:43:20
- Result: FAILURE
- Errors: 5
- Warnings: 2
- Messages: 0
- Detailed report in: XML [basic.myFile.report.xml] and PDF [basic.myFile.report.pdf]

Note

Enabling email validation: Validation via email is by default disabled. To enable it you will need to ensure availability of a mail server and mailbox, and configure its inbound (IMAP/POP) and outbound (SMTP) parameters. See Properties related to email for the properties to configure.

Validation via command-line tool

Note

Command-line tool availability: Command-line tools are supported only for validators hosted on Test Bed resources and if generation of such a tool has been requested (see Step 5: Setup validator on Test Bed).

When a command line tool is set up for your validator it is available as an executable JAR file, packaged alongside a README file in a ZIP archive. This ZIP archive is downloadable from a URL of the form https://www.itb.ec.europa.eu/xml-offline/DOMAIN/validator.zip, where DOMAIN is the name of the validator’s domain. Considering our purchase order example, the command-line validator would be available at https://www.itb.ec.europa.eu/xml-offline/order/validator.zip.

To use it you need to:

Ensure you have Java running on your workstation (minimum version 17).
Download and extract the validator’s ZIP archive.
Open a command prompt and change to the directory in which you extracted the JAR file.
View the validator’s help message by issuing java -jar validator.jar

> java -jar validator.jar

Expected usage: java -jar validator.jar -input FILE_OR_URI_1 ... [-input FILE_OR_URI_N] [-noreports] [-xsd SCHEMA_FILE_OR_URI] [-sch SCHEMATRON_FILE_OR_URI_1] ... [-sch SCHEMATRON_FILE_OR_URI_N]  [-locale LOCALE]
Where:
    - FILE_OR_URI_X is the full file path or URI to the content to validate.
    - SCHEMA_FILE_OR_URI is the full file path or URI to a schema for the validation.
    - SCHEMATRON_FILE_OR_URI_X is the full file path or URI to a schematron file for the validation.
    - LOCALE is the language code to consider for reporting of results. If the provided locale is not supported by the validator the default locale will be used instead (e.g. 'fr', 'fr_FR').

The summary of each validation will be printed and the detailed reports produced in the current directory (as "report.X.xml", "report.X.pdf" and "report.X.csv").

Running the validator will produce a summary output on the command console as well as the detailed validation report(s) (unless flag -noreports has been specified). To resolve potential problems during execution, an output log is also generated with a detailed log trace.

> java -jar validator.jar -input sample.xml -xsd PurchaseOrder.xsd -sch LargePurchaseOrder.sch

Validating 1 of 1 ... Done.

Validation report summary [sample.xml]:
- Date: 2020-12-07T17:53:03.396+01:00
- Result: FAILURE
- Errors: 1
- Warnings: 0
- Messages: 0
- Detailed reports in [D:\tmp\report.0.xml], [D:\tmp\report.0.pdf] and [D:\tmp\report.0.csv]
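
If you want to drive the command-line tool from a script, for instance to validate files in a build pipeline, the command could be assembled as in the following sketch. It relies only on the flags listed in the tool's help message above; the function name build_command is hypothetical, and the call that runs the tool is left commented out as it requires Java and the downloaded validator.jar.

```python
import subprocess


def build_command(inputs, xsd=None, schematrons=(), locale=None,
                  no_reports=False, jar="validator.jar"):
    """Assemble the argument list for the offline validator."""
    command = ["java", "-jar", jar]
    for item in inputs:
        command += ["-input", item]
    if no_reports:
        command.append("-noreports")
    if xsd:
        command += ["-xsd", xsd]
    for schematron in schematrons:
        command += ["-sch", schematron]
    if locale:
        command += ["-locale", locale]
    return command


command = build_command(
    ["sample.xml"],
    xsd="PurchaseOrder.xsd",
    schematrons=["LargePurchaseOrder.sch"],
)

# To run the validator from the directory containing validator.jar:
# completed = subprocess.run(command, capture_output=True, text=True)
# print(completed.stdout)
```

Passing the arguments as a list avoids shell quoting issues with file paths that contain spaces.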

Note

Offline validator use and remote files: Depending on the validator’s configuration, one or more of its XSD or Schematron files may be defined as URIs (see Step 3: Prepare validator configuration). In addition, you may also provide the content to validate as a reference to a remote file. In these cases the workstation running the validator would need access to the remote resources. Any proxy settings applicable for the workstation will automatically be used for the connections.

Step 7: Use the validator in GITB TDL test cases

As a next step over the standalone XML validator you may consider using it from within GITB TDL test cases running in the Test Bed. You would typically do this for the following reasons:

You want to control access to the validation service based on user accounts.
You prefer to record all data linked to validations (for e.g. subsequent inspection).
You want to build complete conformance testing scenarios that are either focused on the validator or that use it as part of validation steps.

As described in Validation via SOAP web service API, the standalone XML validator offers by default a SOAP API for machine-to-machine integration that realises the GITB validation service specification. In short this means that the service can be easily included in any GITB TDL test case as the handler of a verify step. This is done by supplying as the handler value the full URL to the service’s WSDL, as illustrated in the following example that requests the user to upload the file to validate:

<?xml version="1.0" encoding="UTF-8"?>
<testcase id="testCase1_upload" xmlns="http://www.gitb.com/tdl/v1/" xmlns:gitb="http://www.gitb.com/core/v1/">
    <metadata>
        <gitb:name>[TC1] Validate user-provided data</gitb:name>
        <gitb:version>1.0</gitb:version>
        <gitb:description>Test case that allows the developer of an EU retailer system to upload a purchase order for validation.</gitb:description>
    </metadata>
    <actors>
        <gitb:actor id="Retailer" name="Retailer" role="SUT"/>
    </actors>
    <steps>
        <!--
            Request from the user the content to validate.
        -->
        <interact id="userInput" desc="Upload content">
            <request name="content" desc="Purchase order to validate:" inputType="UPLOAD"/>
        </interact>
        <!--
            Trigger the validation.
        -->
        <verify handler="https://www.itb.ec.europa.eu/order/api/validation?wsdl" desc="Validate purchase order">
            <input name="xml">$userInput{content}</input>
            <input name="type">"basic"</input>
        </verify>
    </steps>
</testcase>

The supported inputs for the verify step match those expected by the validator’s SOAP API (see Validation via SOAP web service API). We included the type input here because we define two validation types (basic and large) but this could be omitted if only a single validation type is supported. For string or binary content you typically don’t need to provide the embeddingMethod input as this is determined automatically by the Test Bed. If however you are using string variables that contain BASE64 or URI references you would need to define this explicitly.

An example of this is a test case that requests the content from the user as a URI:

<testcase id="testCase1_upload" xmlns="http://www.gitb.com/tdl/v1/" xmlns:gitb="http://www.gitb.com/core/v1/">
    ...
    <steps>
        <!--
            Request from the user the content to validate as a URI.
        -->
        <interact id="userInput" desc="Upload content">
            <request name="content" desc="URI of the purchase order to validate:"/>
        </interact>
        <!--
            Trigger the validation.
        -->
        <verify handler="https://www.itb.ec.europa.eu/order/api/validation?wsdl" desc="Validate purchase order">
            <input name="xml">$userInput{content}</input>
            <!--
                Explicitly define the embeddingMethod to consider.
            -->
            <input name="embeddingMethod">"URI"</input>
            <input name="type">"basic"</input>
        </verify>
    </steps>
</testcase>

A more complicated example involves externally provided Schematron files to apply alongside the validator’s built-in configuration (if supported for the validation type in question). The following example shows two such files being provided, one as a URI and one from a file included in the test suite itself:

<testcase id="testCase1_upload" xmlns="http://www.gitb.com/tdl/v1/" xmlns:gitb="http://www.gitb.com/core/v1/">
    ...
    <imports>
        <!--
            Import the additional Schematron rules to consider from the test suite.
        -->
        <artifact type="schema" encoding="UTF-8" name="additionalSchematron">resources/additional-schematron.sch</artifact>
    </imports>
    ...
    <steps>
        <!--
            Request from the user the content to validate.
        -->
        <interact id="userInput" desc="Upload content">
            <request name="content" desc="Purchase order to validate:" inputType="UPLOAD"/>
        </interact>
        <!--
            Configure the remotely loaded Schematron file.
        -->
        <assign to="schematron1{content}">"https://path.to.rules/schematron_extensions.sch"</assign>
        <assign to="schematron1{type}">"sch"</assign>
        <assign to="schematron1{embeddingMethod}">"URI"</assign>
        <!--
            Configure the imported Schematron file.
        -->
        <assign to="schematron2{content}">$additionalSchematron</assign>
        <assign to="schematron2{type}">"sch"</assign>
        <!--
            Add them to the input list.
        -->
        <assign to="schematronsToUse" append="true">$schematron1</assign>
        <assign to="schematronsToUse" append="true">$schematron2</assign>
        <!--
            Trigger the validation.
        -->
        <verify handler="https://www.itb.ec.europa.eu/order/api/validation?wsdl" desc="Validate purchase order">
            <input name="xml">$userInput{content}</input>
            <input name="type">"basic"</input>
            <input name="externalSchematron">$schematronsToUse</input>
        </verify>
    </steps>
</testcase>

When referring to the service’s address you may have noticed that we have used directly its WSDL URL. A better approach to improve portability is to define this in the Test Bed’s domain configuration as a domain parameter. Defining a validationService parameter in the domain you could thus redefine the verify step as:

...
<verify handler="$DOMAIN{validationService}" desc="Validate purchase order">
    ...
</verify>
...

Using an external service is not the only way to support XML validation from within a test case. The Test Bed also offers several built-in validators that cover most XML validation needs:

The XmlValidator to check syntax against XSDs and Schematrons.
The XPathValidator to validate using XPath expressions (resulting in true or false values).
The XmlMatchValidator to check an XML document against a given template (potentially skipping certain elements or complete sections).

Given these built-in validators, you may wonder why you would use an external service to begin with. The following are reasons you may choose to do so:

To allow data validation also as a separate public, anonymous and stateless service. If you need to define such a service you might as well reuse it within test cases.
To simplify management of validation artefacts. Artefacts used in test cases such as XSDs and Schematrons need to either be included in test suites or referred to remotely through URIs. Given multiple test suites and artefacts it is likely simpler to bundle resources once in a separate validation service where you can have a single update reflected automatically across all relevant test cases.

Summary

Congratulations! You have just set up a validation service for your XML specification. In doing so you considered your needs and defined your service through configuration on the DIGIT Test Bed or as a Docker container. In addition, you used this service via its different APIs and considered how this could be used as part of complete conformance testing scenarios.

References

This section contains additional references linked to this guide.

Validator configuration properties

The following sections list the configuration properties you can use to customise the operation of your validation service.

Domain-level configuration

The properties in this section are to be provided in the configuration property file (one per configured validation domain) you define as part of your validator configuration. The properties marked as being translatable (in the listed property Type) can also be defined in translation property files if the validator is configured to support multiple languages (see Supporting multiple languages).

Note

Property placeholders: Numerous configuration properties listed below are presented with placeholders. These are marked in uppercase and have the following meaning:

TYPE: The validation type to perform.
OPTION: An option for a given validation type (e.g. a specific version).
FULL_TYPE: For a validation type with options this is equal to TYPE.OPTION, otherwise it is the TYPE value itself.
N: A zero-based integer (used as a counter).
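
As an illustration of how these placeholders resolve, the following sketch assumes a domain with two validation types, one of which defines two options. The type names, option names, file paths and URLs are hypothetical and serve only to show where each placeholder value ends up in a property key.

```properties
# Hypothetical domain: types "order" and "invoice"; "order" has options "v1" and "v2".
validator.type = order, invoice
validator.typeOptions.order = v1, v2

# TYPE and OPTION used in label properties.
validator.typeLabel.order = Purchase order
validator.typeLabel.invoice = Invoice
validator.typeOptionLabel.order.v1 = Version 1
validator.typeOptionLabel.order.v2 = Version 2

# FULL_TYPE is TYPE.OPTION for "order" (order.v1, order.v2) ...
validator.schemaFile.order.v1 = xsd/order_v1.xsd
validator.schemaFile.order.v2 = xsd/order_v2.xsd
# ... and simply TYPE for "invoice", which defines no options.
validator.schemaFile.invoice = xsd/invoice.xsd

# N is a zero-based counter: the first remote entry is 0, the next is 1.
validator.schematronFile.invoice.remote.0.url = https://example.org/rules/invoice_core.sch
validator.schematronFile.invoice.remote.1.url = https://example.org/rules/invoice_extra.sch
```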

Property	Description	Type	Default value
`validator.addBOMToCSVExports`	Whether or not a UTF-8 BOM (Byte Order Mark) should be added when generating validation reports in CSV format.	Boolean	true
`validator.bannerHtml`	Configurable HTML banner replacing the text title.	String (translatable)
`validator.channels`	Comma-separated list of validation channels to have enabled. Possible values are (`form`, `rest_api`, `soap_api`).	Comma-separated Strings	form, rest_api, soap_api
`validator.completeTypeOptionLabel.TYPE.OPTION`	Label to display for the full validation type and option combination (visible in the validator’s result screen).	String (translatable)
`validator.contextFile.FULL_TYPE.N.combinationPlaceholder`	The placeholder value to refer to the context file in combination templates.	String
`validator.contextFile.FULL_TYPE.N.path`	The relative path under the domain root folder where the corresponding context file will be placed during validation.	String
`validator.contextFile.FULL_TYPE.N.schema`	The path to an XSD used to validate the corresponding type-specific context file, provided as a relative path from the domain root.	String
`validator.defaultContextFile.FULL_TYPE.N.combinationPlaceholder`	The placeholder value to refer to the default context file in combination templates.	String
`validator.defaultContextFile.N.path`	The relative path under the domain root folder where the corresponding default context file will be placed during validation.	String
`validator.defaultContextFile.N.schema`	The path to an XSD used to validate the corresponding default context file, provided as a relative path from the domain root.	String
`validator.defaultPlugins.N.class`	Paired with its corresponding jar property (see below), this defines the fully qualified names of the classes to be loaded as plugin implementations.	Comma-separated Strings
`validator.defaultPlugins.N.jar`	Configuration for custom plugins that can be used to extend the validation. This is a default plugin definition that applies to all validation types. This property is the relative path pointing to the all-in-one JAR file that contains the plugin implementation(s).	String
`validator.defaultType`	The default validation type, considered if no type is indicated.	String
`validator.domainAlias`	In case the current domain is to be considered an alias, this is the target domain to delegate processing to.	String
`validator.domainAlias.FULL_TYPE`	Mapping for the given validation type (set as a postfix) in the current domain, to a validation type or alias in the target domain.	String
`validator.externalSchematronFile.FULL_TYPE`	Whether or not user-provided Schematron files are allowed for the given validation type (added as a postfix). Possible values are (`required`, `optional`, `none`).	String	none
`validator.externalSchematronFile.FULL_TYPE.preprocessor`	The relative path to an XSLT file that will be used for Schematron pre-processing.	String
`validator.externalSchematronFile.FULL_TYPE.preprocessor.output`	The file extension for the file resulting from the Schematron pre-processing.	String	sch
`validator.externalSchemaFile.FULL_TYPE`	Whether or not user-provided XML Schemas are allowed for the given validation type (added as a postfix). Possible values are (`required`, `optional`, `none`).	String	none
`validator.externalSchemaFile.FULL_TYPE.preprocessor`	The relative path to an XSLT file that will be used for XSD pre-processing.	String
`validator.externalSchemaFile.FULL_TYPE.preprocessor.output`	The file extension for the file resulting from the XSD pre-processing.	String	xsd
`validator.footerHtml`	Configurable HTML banner for the footer.	String (translatable)
`validator.hiddenType`	Comma-separated list of validation types to hide in the web UI. Validation types listed here will not be displayed in the web UI and will only be accessible through the CLI and the API.	String	none
`validator.httpVersion`	The HTTP protocol version to use when loading remote resources. Possible values are `1.1` and `2` (the default). This can be set if you are experiencing connection errors (e.g. unexpected 403 forbidden errors) when accessing remote resources.	String	2
`validator.importProperties`	The path of a file declaring additional domain properties. If the application property `validator.restrictResourcesToDomain` is set to `true` or omitted, only relative paths of files under the domain root folder are accepted.	String
`validator.includeAssertionID`	Whether Schematron rule IDs (if available) should be included in the resulting reports.	Boolean	true
`validator.includeLocationPath`	Whether by default a simplified XPath path expression will be added to Schematron report items in addition to the line number. When used via the REST or SOAP API this can be forced on or off through the `showLocationPaths` input.	Boolean	false
`validator.includeTestDefinition`	Whether test expressions (if available) should be included in the resulting reports.	Boolean	true
`validator.input.preprocessor.FULL_TYPE`	An XPath expression to be used for the preprocessing of input before proceeding to validate. The result of the applied expression must be an XML node.	String
`validator.input.transformer.FULL_TYPE`	The path (relative to the domain root) to an XSLT stylesheet file for the transformation of input before proceeding to validate.	String
`validator.javascriptExtension`	Configurable JavaScript content to support HTML banners and footers.	String (translatable)
`validator.locale.available`	The list of locales (language codes) that will be available for the user to select. This is provided as a comma-separated set of String values, the order of which determines the order that they will be listed in the UI’s language selection control.	Comma-separated Strings
`validator.locale.default`	The default locale (language code) to consider for the validator. This can be provided alone or with a country variant (e.g. “en”, “en_US”, “en_GB”).	String	en
`validator.locale.translations`	The path to a folder (absolute or relative to the domain configuration file) that contains the translation property files for the validator’s supported languages.	String
`validator.maximumReportsForDetailedOutput`	The maximum number of report items for which a PDF validation report will be generated (no report will be produced if exceeded).	Integer	5000
`validator.maximumReportsForXmlOutput`	The maximum number of report items to include in the XML validation report.	Integer	50000
`validator.optionLabel.OPTION`	Label to display in the web form for an option across all validation types (added as a postfix of `validator.optionLabel`). Only displayed if there are options defined.	String (translatable)
`validator.plugins.FULL_TYPE.N.class`	Paired with its corresponding jar property (see below), this defines the fully qualified names of the classes to be loaded as plugin implementations.	Comma-separated Strings
`validator.plugins.FULL_TYPE.N.jar`	Configuration for custom plugins that can be used to extend the validation specific to a given type (FULL_TYPE) and extending any default plugins (see above). This property is the relative path pointing to the all-in-one JAR file that contains the plugin implementation(s).	String
`validator.preloadRemoteSchemaImports`	The default setting determining whether remote schema imports should be preloaded and cached at startup.	Boolean	false
`validator.preloadRemoteSchemaImports.FULL_TYPE`	The setting for a given validation type determining whether remote schema imports should be preloaded and cached at startup.	Boolean	false
`validator.remoteArtefactLoadErrors`	The default handling approach for errors raised when downloading pre-configured remote artefacts. Possible values are `fail` (log, stop processing and report error), `warn` (log, continue processing but report a warning), `log` (log and continue processing).	String	log
`validator.remoteArtefactLoadErrors.FULL_TYPE`	The handling approach for a given validation type for errors raised when downloading pre-configured remote artefacts. Possible values are `fail` (log, stop processing and report error), `warn` (log, continue processing but report a warning), `log` (log and continue processing).	String	log
`validator.remoteSchemaImportMapping.N.file`	An entry mapping a remote schema URI to a local file (this is the local file to map to).	String
`validator.remoteSchemaImportMapping.N.uri`	An entry mapping a remote schema URI to a local file (this is the full remote URI to map).	String
`validator.report.id`	A report identifier to include as metadata in produced GITB TRL reports (in XML or JSON format).	String
`validator.report.name`	A report name to include as metadata in produced GITB TRL reports (in XML or JSON format).	String
`validator.report.customisationId`	The default value for a profile customisation ID to include as metadata in produced GITB TRL reports (in XML or JSON format).	String
`validator.report.customisationId.FULL_TYPE`	The profile customisation ID for a given (full) validation type to include as metadata in produced GITB TRL reports (in XML or JSON format).	String
`validator.report.profileId`	The default value for a profile ID to include as metadata in produced GITB TRL reports (in XML or JSON format).	String	The applied (full) validation type.
`validator.report.profileId.FULL_TYPE`	The profile ID for a given (full) validation type to include as metadata in produced GITB TRL reports (in XML or JSON format).	String
`validator.report.validationServiceName`	A name for the validator to include as metadata in produced GITB TRL reports (in XML or JSON format).	String
`validator.report.validationServiceVersion`	A version for the validator to include as metadata in produced GITB TRL reports (in XML or JSON format).	String
`validator.reportsOrdered`	Whether the report items are to be ordered (errors first, then warnings, then messages). Otherwise the items will appear based on where they were raised in the validated content.	Boolean	false
`validator.richTextReports`	Whether the report items are expected to contain rich text and will be rendered as such (currently links).	Boolean	false
`validator.schematronFile.FULL_TYPE`	Comma-separated list of Schematron files loaded for a given validation type (added as a postfix). These can be files or folders.	Comma-separated Strings
`validator.schematronFile.FULL_TYPE.preprocessor`	The relative path to an XSLT file that will be used for Schematron pre-processing.	String
`validator.schematronFile.FULL_TYPE.preprocessor.output`	The file extension for the file resulting from the Schematron pre-processing.	String	sch
`validator.schematronFile.FULL_TYPE.remote.N.preprocessor`	The relative path to an XSLT file that will be used for Schematron pre-processing.	String
`validator.schematronFile.FULL_TYPE.remote.N.preprocessor.output`	The file extension for the file resulting from the Schematron pre-processing.	String	sch
`validator.schematronFile.FULL_TYPE.remote.N.url`	Reference for a remotely loaded Schematron file for a given validation type (added as the `FULL_TYPE` placeholder). One or more such entries can be defined by incrementing the zero-based `N` counter.	String
`validator.schemaFile.FULL_TYPE`	Comma-separated list of XSD files loaded for a given validation type (added as a postfix). These can be files or folders.	Comma-separated Strings
`validator.schemaFile.FULL_TYPE.preprocessor`	The relative path to an XSLT file that will be used for XSD pre-processing.	String
`validator.schemaFile.FULL_TYPE.preprocessor.output`	The file extension for the file resulting from the XSD pre-processing.	String	xsd
`validator.schemaFile.FULL_TYPE.remote.N.preprocessor`	The relative path to an XSLT file that will be used for XSD pre-processing.	String
`validator.schemaFile.FULL_TYPE.remote.N.preprocessor.output`	The file extension for the file resulting from the XSD pre-processing.	String	xsd
`validator.schemaFile.FULL_TYPE.remote.N.url`	Reference for a remotely loaded XML Schema file for a given validation type (added as the `FULL_TYPE` placeholder). One or more such entries can be defined by incrementing the zero-based `N` counter.	String
`validator.schemaVersion`	The XML Schema specification version to consider when treating XML Schemas. Possible values are `1.0` (the default) and `1.1`.	String	1.0
`validator.schemaVersion.FULL_TYPE`	The XML Schema specification version to consider when treating XML Schemas for a given validation type (added as the `FULL_TYPE` placeholder). Possible values are `1.0` (the default) and `1.1`.	String	1.0
`validator.showAbout`	Whether or not to show the about panel on the web UI.	Boolean	true
`validator.skipRemoteSchemaImportCaching`	Whether or not to skip the caching of remotely loaded schema files.	Boolean	false
`validator.stopOnXsdErrors`	Whether validation should stop if XSD failures are encountered.	Boolean	true
`validator.stopOnXsdErrors.FULL_TYPE`	Whether validation for the given validation type should stop if XSD failures are encountered.	Boolean	true
`validator.supportMinimalUserInterface`	Enable a minimal user interface useful for embedding in other UIs or portals (applies only if the `form` validation channel is enabled).	Boolean	false
`validator.supportUserInterfaceEmbedding`	Allow the validator to be embedded within other user interfaces by displaying it in iframes.	Boolean	true
`validator.type`	Comma-separated list of supported validation types. Values need to be reflected in the other properties’ `TYPE` and `FULL_TYPE` placeholders. `FULL_TYPE` equals `TYPE.OPTION` (in case type options are defined) or simply `TYPE` if there are no options (see also `validator.typeOptions`).	Comma-separated Strings
`validator.typeGroup.GROUP`	Comma-separated list of the validation types included in the group (see `validator.type`).	Comma-separated Strings
`validator.typeGroupLabel.GROUP`	Label to display in the web form for a given validation type group (added as a postfix of `validator.typeGroupLabel`).	String (translatable)
`validator.typeGroupPresentation`	The presentation approach for validation type groups (if defined). Accepted values are `inline` (display as option groups) and `split` (display in a separate dropdown list).	String	inline
`validator.typeLabel.TYPE`	Label to display in the web form for a given validation type (added as a postfix of `validator.typeLabel`). Only displayed if there are multiple types.	String (translatable)
`validator.typeOptions.TYPE`	Comma-separated list of options defined for a given validation type (added as a postfix). Values need to be reflected in the other properties’ `OPTION` and `FULL_TYPE` placeholders. `FULL_TYPE` equals `TYPE.OPTION` (in case type options are defined) or simply `TYPE` if there are no options (see also `validator.type`).	Comma-separated Strings
`validator.typeAlias.ALIAS`	An alias that points to a full validation type and will resolve to it when used.	String
`validator.typeOptionLabel.TYPE.OPTION`	Label to display for an option for a specific validation type.	String (translatable)
`validator.webServiceDescription.addInputToReport`	The description of the SOAP web service for element “addInputToReport”.	String	Whether the returned XML validation report should also include the validated input as context information.
`validator.webServiceDescription.embeddingMethod`	The description of the SOAP web service for element “embeddingMethod”.	String	The embedding method to consider for the ‘xml’ input (‘BASE64’, ‘URL’ or ‘STRING’).
`validator.webServiceDescription.externalSchema`	The description of the SOAP web service for element “externalSchema”.	String	A list of maps that defines external XSDs to consider in addition to any preconfigured ones.
`validator.webServiceDescription.externalSchematron`	The description of the SOAP web service for element “externalSchematron”.	String	A list of maps that defines external Schematrons to consider in addition to any preconfigured ones.
`validator.webServiceDescription.locale`	The description of the SOAP web service for element “locale”.	String	Locale (language code) to use for reporting of results. If the provided locale is not supported by the validator the default locale will be used instead (e.g. “fr”, “fr_FR”).
`validator.webServiceDescription.locationAsPath`	The description of the SOAP web service for element “locationAsPath”.	String	Whether error locations should be XPath expressions or resolve their line and column locations in the provided input.
`validator.webServiceDescription.type`	The description of the web service for element “type”. Only displayed if there are multiple types.	String	The type of validation to perform (if multiple types are supported).
`validator.webServiceDescription.xml`	The description of the web service for element “xml”.	String	The XML content to validate, provided as a string, BASE64 or a URI.
`validator.webServiceId`	The ID of the web service.	String	ValidatorService
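
To show how a few of the properties above combine, here is a minimal sketch of a domain configuration file. It assumes a single validation type named basic (the type passed in the test case earlier in this guide); the label and file path are hypothetical. Going by the property's description, setting validator.externalSchematronFile.basic to optional is what allows callers to supply their own Schematron files for this type.

```properties
# Assumed: one validation type named "basic"; the label and path are hypothetical.
validator.type = basic
validator.typeLabel.basic = Basic purchase order
validator.schemaFile.basic = xsd/PurchaseOrder.xsd
# Accept, but do not require, user-provided Schematron files for this type (default: none).
validator.externalSchematronFile.basic = optional
# Do not stop validation for this type when XSD failures are encountered (default: true).
validator.stopOnXsdErrors.basic = false
# Order report items as errors, then warnings, then messages (default: false).
validator.reportsOrdered = true
# Enable only the web form and the SOAP API.
validator.channels = form, soap_api
```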

Properties related to email

In case the email channel is enabled (i.e. validator.channels includes email) the following properties need to be provided:

Property	Description	Type	Default value
`validator.mailAuthEnable`	Whether authentication is needed.	Boolean	false
`validator.mailAuthPassword`	The password to authenticate with.	String
`validator.mailAuthUsername`	The username to authenticate with.	String
`validator.mailInboundFolder`	The folder to read emails from.	String	INBOX
`validator.mailInboundHost`	The server’s host name to read emails from.	String
`validator.mailInboundPort`	The server’s port to read emails from.	String
`validator.mailInboundSSLEnable`	Whether SSL is needed to connect to the inbound service.	Boolean	false
`validator.mailFrom`	The FROM address to use.	String
`validator.mailOutboundHost`	The SMTP server’s host to send emails with.	String
`validator.mailOutboundPort`	The SMTP server’s port to send emails with.	Integer
`validator.mailOutboundSSLEnable`	Whether SSL is needed to connect to the SMTP server.	Boolean	false
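
As a rough sketch, an email-enabled configuration could look as follows. The host names, ports, address and credentials are placeholders and not values taken from this guide; replace them with the details of your own mail servers.

```properties
# All values below are placeholders for illustration only.
validator.channels = form, email
validator.mailFrom = validator@example.org
validator.mailAuthEnable = true
validator.mailAuthUsername = validator-user
validator.mailAuthPassword = change-me
# Inbound server and folder from which emails are read.
validator.mailInboundHost = mail.example.org
validator.mailInboundPort = 993
validator.mailInboundSSLEnable = true
validator.mailInboundFolder = INBOX
# Outbound (SMTP) server used to send emails.
validator.mailOutboundHost = smtp.example.org
validator.mailOutboundPort = 465
validator.mailOutboundSSLEnable = true
```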

Properties related to UI labels

To override labels on the web UI you can use the following properties. All properties listed here may also be defined in translation property files if the validator is configured to support multiple languages (see Supporting multiple languages).

Property	Description	Default value
`validator.about`	The about message displayed below the validator’s interface.	This service is powered by the Interoperability Test Bed, a conformance testing service offered by the European Commission’’s DG DIGIT for projects involved in the delivery of cross-border public services. Find out more <a href=”{0}” target=”_blank”>here</a>.
`validator.contextFile.FULL_TYPE.N.label`	The label to display on the UI and error messages relevant to type-specific context files.	The value of `validator.label.contextFileLabel` (Context file)
`validator.contextFile.FULL_TYPE.N.placeholder`	The placeholder text to display on the UI file upload control relevant to type-specific context files.	The value of `validator.label.contextFilePlaceholder` (Select file…)
`validator.defaultContextFile.N.label`	The label to display on the UI and error messages relevant to default context files.	The value of `validator.label.contextFileLabel` (Context file)
`validator.defaultContextFile.N.placeholder`	The placeholder text to display on the UI file upload control relevant to default context files.	The value of `validator.label.contextFilePlaceholder` (Select file…)
`validator.label.additionalInfoLabel`	The label for Schematron rule IDs reported in the validation report.	Rule:
`validator.label.backButton`	The text for back button displayed on the minimal UI results page (if enabled).	Back
`validator.label.contextFileLabel`	The default text to display for the label of the UI file upload control for context files.	Context file
`validator.label.contextFilePlaceholder`	The default text to display for the placeholder text within the UI file upload control for context files.	Select file…
`validator.label.csvHeaderAdditionalInfo`	The text for the header in CSV reports for a report item’s Schematron rule ID.	Rule
`validator.label.csvHeaderLevel`	The text for the header in CSV reports for a report item’s severity level.	Level
`validator.label.csvHeaderDescription`	The text for the header in CSV reports for a report item’s description.	Description
`validator.label.csvHeaderLocation`	The text for the header in CSV reports for a report item’s location path.	Location
`validator.label.csvHeaderTest`	The text for the header in CSV reports for a report item’s test definition.	Test
`validator.label.csvLevelError`	The text for the error status listed in CSV reports.	error
`validator.label.csvLevelWarning`	The text for the warning status listed in CSV reports.	warning
`validator.label.csvLevelMessage`	The text for the information message status listed in CSV reports.	message
`validator.label.downloadReportButton`	The text of the button to download the validation report.	Download report
`validator.label.externalArtefactsTooltip`	The tooltip text for the external artefacts option.	Additional artefacts that will be considered for the validation
`validator.label.externalArtefactsTooltip.TYPE`	The tooltip text for the external artefacts option when a specific validation type is selected.	Additional artefacts that will be considered for the validation
`validator.label.externalArtefactsTooltip.TYPE.OPTION`	The tooltip text for the external artefacts option when a specific validation type option is selected.	Additional artefacts that will be considered for the validation
`validator.label.externalSchemaLabel`	The label displayed next to included external artefacts.	External XML Schema
`validator.label.externalSchemaLabel.TYPE`	The label displayed next to included external artefacts when a specific validation type is selected.	External XML Schema
`validator.label.externalSchemaLabel.TYPE.OPTION`	The label displayed next to included external artefacts when a specific validation type option is selected.	External XML Schema
`validator.label.externalSchemaPlaceholder`	Placeholder text displayed within the Schema input’s text field.	Select file…
`validator.label.externalSchemaPlaceholder.TYPE`	Placeholder text displayed within the Schema input’s text field when a specific validation type is selected.	Select file…
`validator.label.externalSchemaPlaceholder.TYPE.OPTION`	Placeholder text displayed within the Schema input’s text field when a specific validation type option is selected.	Select file…
`validator.label.externalSchematronLabel`	The label displayed next to included external artefacts.	External Schematron
`validator.label.externalSchematronLabel.TYPE`	The label displayed next to included external artefacts when a specific validation type is selected.	External Schematron
`validator.label.externalSchematronLabel.TYPE.OPTION`	The label displayed next to included external artefacts when a specific validation type option is selected.	External Schematron
`validator.label.externalSchematronPlaceholder`	Placeholder text displayed within the Schematron input’s text field.	Select file…
`validator.label.externalSchematronPlaceholder.TYPE`	Placeholder text displayed within the Schematron input’s text field when a specific validation type is selected.	Select file…
`validator.label.externalSchematronPlaceholder.TYPE.OPTION`	Placeholder text displayed within the Schematron input’s text field when a specific validation type option is selected.	Select file…
`validator.label.fileInputLabel`	Label for the file input.	File to validate
`validator.label.fileInputPlaceholder`	Placeholder text displayed within the file input’s text field.	Select file…
`validator.label.includeExternalArtefacts`	Text appearing next to the option to include external artefacts.	Include external artefacts
`validator.label.includeExternalArtefacts.TYPE`	Text appearing next to the option to include external artefacts when a specific validation type is selected.	Include external artefacts
`validator.label.includeExternalArtefacts.TYPE.OPTION`	Text appearing next to the option to include external artefacts when a specific validation type option is selected.	Include external artefacts
`validator.label.maximumReportsExceededForDetailedOutputMessage`	Message to display when a PDF report is not generated due to the number of report items.	Findings are not listed here due to their large number. Download the validation report to view further details.
`validator.label.maximumReportsExceededForXmlOutputMessage`	Tooltip to display on the XML report download button when report items have been truncated after having reached their maximum threshold.	The validation report is limited to include the first X items.
`validator.label.ofLabel`	Text for the “(page X) of (Y)” label in the PDF reports’ footer.	of
`validator.label.optionContentDirectInput`	The text for the direct input content option.	Direct input
`validator.label.optionContentFile`	The text for the file content option.	File
`validator.label.optionContentURI`	The text for the URI content option.	URI
`validator.label.optionLabel`	The label for the selection of the target validation type option (if defined).	Option
`validator.label.optionLabel.TYPE`	The label for the selection of the target validation type option (if defined) when a specific validation type is selected.	Option
`validator.label.pageLabel`	Text for the page label in the PDF reports’ footer.	Page
`validator.label.popupCloseButton`	The text of the button on the annotated input popup to close.	Close
`validator.label.popupTitle`	The title to display on the popup displaying the annotated input.	XML content
`validator.label.reportAggregatedCSV`	The text of the option to download the validation report in aggregated CSV format.	CSV (aggregated)
`validator.label.reportAggregatedPDF`	The text of the option to download the validation report in aggregated PDF format.	PDF (aggregated)
`validator.label.reportAggregatedXML`	The text of the option to download the validation report in aggregated XML format.	XML (aggregated)
`validator.label.reportDetailedCSV`	The text of the option to download the validation report in CSV format.	CSV
`validator.label.reportDetailedPDF`	The text of the option to download the validation report in PDF format.	PDF
`validator.label.reportDetailedXML`	The text of the option to download the validation report in XML format.	XML
`validator.label.reportItemTotalOccurrences`	The text displayed for aggregated report items when viewing aggregated findings to highlight the total number of occurrences.	First of {0} occurrences
`validator.label.result.failure`	The text for the overall validation result when failed.	FAILURE
`validator.label.result.success`	The text for the overall validation result when successful.	SUCCESS
`validator.label.result.undefined`	The text for the overall validation result when undefined.	UNDEFINED
`validator.label.result.warning`	The text for the overall validation result when successful with warnings.	WARNING
`validator.label.resultSectionTitle`	Title of the panel displaying validation results.	Validation result
`validator.label.resultDateLabel`	The label for the result panel’s date.	Date:
`validator.label.resultFileNameLabel`	The label for the result panel’s file name.	File name:
`validator.label.resultFindingsLabel`	The label for the result findings display.	Findings:
`validator.label.resultFindingsDetailsLabel`	The label for the display of the finding counts per severity (0: errors, 1: warnings, 2: messages).	{0} error(s), {1} warning(s), {2} message(s)
`validator.label.resultLocationLabel`	The label preceding a report item’s location description.	Location:
`validator.label.resultResultLabel`	The label for the result panel’s overall validation result.	Result:
`validator.label.resultSubSectionDetailsTitle`	The title of the panel displaying the validation report’s details.	Details
`validator.label.resultSubSectionOverviewTitle`	The title for the overview section of the results.	Overview
`validator.label.resultTestLabel`	The label preceding a report item’s test description.	Test:
`validator.label.resultValidationTypeLabel`	The label for the result panel’s validation type display.	Validation type:
`validator.label.typeGroupLabel`	The label for the selection of the validation type group (if displayed as a separate dropdown list).	Group
`validator.label.typeLabel`	The label for the selection of the target validation type.	Validate as
`validator.label.uploadButton`	The upload button’s text.	Validate
`validator.label.validatingInputMessage`	The message text when the validator is carrying out the validation.	Validating input
`validator.label.viewAnnotatedInputButton`	The text for the button to open the annotated input.	View annotated input
`validator.label.viewDetailsButton`	The text of the button displayed in the minimal UI to view the report’s details.	View details
`validator.label.viewReportItemsAggregated`	The text of the option to switch the report displayed on the UI to an aggregated report.	Aggregated report
`validator.label.viewReportItemsDetailed`	The text of the option to switch the report displayed on the UI to a detailed report.	Detailed report
`validator.label.viewReportItemsShowAll`	The text of the filter option to show all report items in the UI.	Show all
`validator.label.viewReportItemsShowErrors`	The text of the filter option to show only errors in the UI.	Show errors
`validator.label.viewReportItemsShowMessages`	The text of the filter option to show only messages in the UI.	Show messages
`validator.label.viewReportItemsShowWarnings`	The text of the filter option to show only warnings in the UI.	Show warnings
`validator.label.viewSummaryButton`	The text of the button displayed in the minimal UI to view the report’s summary.	View summary
`validator.reportTitle`	The title for the produced validation report.	Validation report
`validator.uploadTitle`	Title for the validator web form.	Validator
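
The numbered placeholders in `validator.label.resultFindingsDetailsLabel` are filled positionally with the error, warning and message counts. As a rough illustration only (the validator itself is a Spring Boot application, and the function name below is hypothetical), Python's `str.format` accepts the same `{0}`, `{1}`, `{2}` syntax, which makes it easy to preview a customised label before deploying it:

```python
# Illustrative preview of a customised findings label; this is not the validator's own code.
DEFAULT_LABEL = "{0} error(s), {1} warning(s), {2} message(s)"


def preview_findings_label(pattern: str, errors: int, warnings: int, messages: int) -> str:
    """Fill placeholders 0, 1 and 2 with the error, warning and message counts."""
    return pattern.format(errors, warnings, messages)


print(preview_findings_label(DEFAULT_LABEL, 2, 1, 0))
```

A translated or reordered pattern can be checked the same way, as the placeholders may appear in any order within the label text.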

Application-level configuration

These properties govern the validator’s application instance itself. They apply only when you are defining your own validator as a Docker image, in which case they are supplied as environment variables (ENV directives in a Dockerfile). Apart from these properties, any Spring Boot configuration property can also be supplied.

Note

The only property that is mandatory for a custom validator setup is validator.resourceRoot. If you don’t provide this, a generic validator will be configured for validation against user-provided validation artefacts.

Property	Description	Type	Default value
`logging.file.path`	Path to a folder that will hold the validator’s log output.	String	/validator/logs
`validator.acceptedMimeTypes`	Accepted mime-types for input files.	Comma-separated Strings	application/xml, text/xml, text/plain
`validator.acceptedSchemaMimeType`	Accepted XML Schema files mime type.	Comma-separated Strings	application/xml, text/xml, text/plain
`validator.acceptedSchematronExtensions`	Accepted Schematron file extensions. All other files found in `validator.schematronFile.FULL_TYPE` (when folders are defined) are ignored.	Comma-separated Strings	xsl, xslt, sch
`validator.acceptedSchematronMimeType`	Accepted Schematron files mime type.	Comma-separated Strings	application/xslt+xml, application/xml, text/xml, text/plain
`validator.acceptedZipMimeType`	Accepted ZIP files mime type.	Comma-separated Strings	application/zip, application/octet-stream, application/x-zip-compressed, multipart/x-zip
`validator.baseSoapEndpointUrl`	The full public base URL at which SOAP endpoints will be published (up to, but not including, the domain name).	String
`validator.cleanupPollingRate`	The rate at which the `validator.reportFolder` folder is polled for forced cleanup (in ms).	Integer	60000
`validator.disablePreprocessingCache`	Whether to disable caching for pre-processing XSLTs.	Boolean	false
`validator.docs.host`	The host to display as the root for the REST API Swagger documentation.	String	localhost:8080
`validator.docs.licence.description`	Description of the licence in the Swagger UI.	String	European Union Public Licence (EUPL) 1.2
`validator.docs.licence.url`	URL to the licence for the Swagger UI.	String	https://eupl.eu/1.2/en/
`validator.docs.schemes`	Comma-separated scheme values for the Swagger documentation.	Comma-separated Strings	http
`validator.docs.server.url`	Comma-separated server URL values for the Swagger documentation.	Comma-separated Strings
`validator.docs.title`	Title to display in the Swagger UI.	String	XML Validator REST API
`validator.docs.version`	Version number to display in the Swagger UI.	String	1.0.0
`validator.domain`	The names of the domain subfolders to consider. By default all folders under `validator.resourceRoot` will be considered.	Comma-separated Strings
`validator.domainName.DOMAIN`	The name to display for a given domain folder (the folder name replacing the `DOMAIN` placeholder). This value will also be used in request paths.	String	The folder name is used.
`validator.identifier`	The validator’s identifier to be sent for usage statistics reporting.	String	xml
`validator.inputFilePrefix`	Prefix of input files in the report folder.	String	ITB-
`validator.mailPollingRate`	The rate at which the configured email addresses (if configured) are polled for received input files (in ms).	Integer	60000
`validator.minimumCachedInputFileAge`	Time to keep XML input files in milliseconds (600000 = 10 minutes).	Integer	600000
`validator.rateLimit.capacity.uiValidate`	The maximum validations per minute through the web user interface.	Integer	60
`validator.rateLimit.capacity.restValidate`	The maximum validations per minute through the REST API (operation `validate`).	Integer	60
`validator.rateLimit.capacity.restValidateMultiple`	The maximum validations per minute through the REST API (operation `validateMultiple`).	Integer	30
`validator.rateLimit.capacity.soapValidate`	The maximum validations per minute through the SOAP API.	Integer	60
`validator.rateLimit.enabled`	Whether validation rate limiting is enabled.	Boolean	false
`validator.rateLimit.ipHeader`	The name of a HTTP header from which to read the client’s IP address.	String
`validator.rateLimit.warnOnly`	Whether exceeding the rate limit will result in a logged warning as opposed to a blocked request.	Boolean	false
`validator.reportFilePrefix`	Prefix of report files in the report folder.	String	TAR-
`validator.reportFolder`	Path to a folder that contains temporary data and reports.	String	/validator/reports
`validator.resourceRoot`	The root folder under which domain subfolders will be loaded from.	String
`validator.restrictResourcesToDomain`	Whether local validation artefacts must be located within the domain root folder. If `false`, artefacts can also be loaded from outside it, with both relative and absolute paths accepted.	Boolean	true
`validator.webhook.ipheader`	The HTTP header to use to retrieve the user’s IP address for usage statistics if the validator is behind a proxy. This property is only used to detect the user’s country, when such feature is enabled (see `validator.webhook.statisticsEnableCountryDetection`).	String	X-Real-IP
`validator.webhook.statistics`	The URL of the backend service that will collect usage statistics. This property is optional and, if (and only if) present, the validator will report usage statistics.	String
`validator.webhook.statisticsCountryDetectionDbFile`	The path to the .mmdb file with the geolocation database that is used to resolve the user’s country from an IP address. This database will only be used when the property `validator.webhook.statisticsEnableCountryDetection` takes the value `true`.	String
`validator.webhook.statisticsEnableCountryDetection`	Whether the usage statistics reporting service can detect users’ countries from their IP addresses. If `false` no information about the user’s country or IP address will be reported.	Boolean	false
`validator.webhook.statisticsSecret`	The validator client secret to be passed to the backend service for usage statistics reporting.	String
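
Since these properties are supplied as ENV directives in a Dockerfile, a small helper can keep them in one place and render the directives for you. The sketch below is only an illustration: the function name and the chosen values (including the resource path) are hypothetical, while the property names are taken from the table above. It assumes values contain no double quotes.

```python
# Hypothetical helper: render application-level properties as Dockerfile ENV lines.
def render_env_directives(properties: dict) -> str:
    lines = []
    for name, value in properties.items():
        # Booleans are written in lowercase, matching the true/false values in the table.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f'ENV {name}="{value}"')
    return "\n".join(lines)


example = {
    "validator.resourceRoot": "/validator/resources/",  # assumed path, for illustration only
    "validator.rateLimit.enabled": True,
    "validator.rateLimit.capacity.restValidate": 30,
}
print(render_env_directives(example))
```

The printed lines can be pasted into the Dockerfile of your custom validator image, after the instruction that copies your domain resources.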

Change history

This section lists the change history for the XML validator core component, applicable to all validator instances built over it. The core validator is provided through a latest release that is updated for every change, as well as milestone releases for deployments where ensuring stability is more important than rapid updates. Note that all updates, be it on the release milestones or the latest snapshot, are guaranteed to always be backwards compatible with configurations defined for earlier builds.

The table that follows lists all changes by build date and release milestone. Users operating a snapshot “latest” release can use the build dates to determine what updates are available. When a release number is listed next to a change, that release includes all changes up to and including it.

Note

All validators managed by the Test Bed are always operating with the very latest build. In addition, release milestones are currently only offered for Docker images.

Release	Date	Type	Description
	2026/02/23	Improvement	Upgraded Spring Boot to v3.5.11.
	2026/02/13	Feature	Support for validation of XML Schema v1.1 if configured for a domain (default remaining XML Schema 1.0).
	2026/02/13	Bug fix	When using the UI and cancelling a file browser popup, clear any currently selected file to prevent failed submissions.
1.10.0	2026/02/04	Release	Milestone release 1.10.0.
	2026/01/26	Improvement	Upgraded Spring Boot to v3.5.10.
	2026/01/08	Feature	Configurable IP-based validation rate limiting.
	2026/01/08	Improvement	Upgraded Spring Boot to v3.5.9 and CXF to v4.1.4.
	2025/11/28	Improvement	When using validation type groups and a split group display, automatically select the first validation type upon group selection.
	2025/11/03	Improvement	Upgraded Spring Boot to v3.5.7 and Tomcat to v10.1.48.
	2025/09/16	Improvement	Upgraded ph-schematron (v9.0.1) and Saxon (v12.8).
1.9.0	2025/08/22	Release	Milestone release 1.9.0.
	2025/08/22	Improvement	Upgraded Spring Boot (v3.5.5) and Apache Tika (v3.2.2).
	2025/08/19	Improvement	Tomcat upgrade (v10.1.44).
	2025/08/19	Improvement	Remove log warnings about unused PDF fonts.
	2025/08/18	Feature	Support the mapping of remote schemas to local files.
	2025/08/18	Feature	Cache remotely loaded schema references by default, allowing cache disabling.
	2025/08/18	Improvement	Gracefully handle remote schema references.
	2025/07/25	Feature	Support for domain aliases, allowing seamless migration between separate validator configurations.
	2025/07/25	Improvement	Upgraded Spring Boot (v3.5.4) and Apache Tika (v3.2.1).
	2025/07/25	Bug fix	Unable to embed a results-only view of the validator’s UI in iframes.
	2025/07/24	Improvement	Added default placeholder texts for the file upload controls for user-provided XSD and Schematron files.
	2025/07/24	Improvement	Improved the error message related to using functions in Schematron rules while in pure Schematron format.
	2025/06/30	Improvement	Migrated com.openhtmltopdf (v1.0.10) to io.github.openhtmltopdf (v1.1.28) to ensure continued maintenance.
	2025/06/24	Bug fix	Extended the limit of processed multipart request parts to avoid errors when many custom inputs are provided through the UI.
	2025/06/24	Improvement	Added publiccode.yml metadata file.
	2025/06/23	Improvement	Upgraded Spring Boot (v3.5.3) and Apache CXF (v4.1.2).
	2025/06/19	Improvement	List in full third-party libraries licences and copyright notices in NOTICE.md, and add copyright to all source files.
	2025/06/17	Improvement	Tomcat upgrade (version 10.1.42).
	2025/06/17	Feature	Support adding a simplified XPath path expression in the location of reported items from schematron validation.
	2025/06/10	Bug fix	XML parsing failing for unicode documents containing BOM (Byte Order Mark).
	2025/05/26	Improvement	Spring Boot upgrade (version 3.5.0).
	2025/05/07	Improvement	Improved displayed error message when processing invalid XML content.
	2025/05/06	Improvement	Support XSLT functions and resource lookups when validating against SCH files (rather than compiled XSLT).
	2025/04/25	Improvement	Upgraded Apache Tika (v2.9.3), Spring Boot (v3.4.5) and Apache HTTP Client5 (v5.4.4) to resolve CVE-2025-27820, CVE-2025-22234 and CVE-2025-31672 (as a precaution).
1.8.0	2025/04/08	Release	Milestone release 1.8.0.
	2025/03/24	Improvement	Spring Boot upgrade (version 3.4.4).
	2025/03/11	Improvement	Upgrade Schematron validation library (ph-schematron) to v8.0.6.
	2025/03/10	Feature	Support optionally continuing with Schematron rule validation even if XSD errors are found.
	2025/03/05	Improvement	Spring Boot upgrade (version 3.4.3).
	2025/02/19	Bug fix	When validating against remote schemas DTDs may not be resolved correctly.
	2025/02/19	Bug fix	When an invalid validation type is used it does not get reported correctly in the resulting error message.
	2025/02/17	Improvement	Improve image anti-aliasing in PDF reports across PDF viewers.
	2025/01/31	Feature	Support the presentation of validation types in named groups when using the validator’s web user interface.
	2025/01/27	Improvement	Spring Boot upgrade (version 3.4.2).
	2025/01/21	Improvement	Library upgrade (CXF) to resolve CVE-2025-23184 (as a precaution).
	2025/01/20	Bug fix	Using the UI, unable to switch back to the detailed report once the aggregated report has been toggled.
	2025/01/15	Improvement	Upgrade Schematron validation library (ph-schematron) to v8.0.5.
	2024/12/06	Improvement	Upgrade Schematron validation library (ph-schematron) to v8.0.4.
1.7.0	2024/11/29	Release	Milestone release 1.7.0.
	2024/11/29	Improvement	Spring Boot upgrade (version 3.4.0).
	2024/11/22	Feature	Support forcing server URL value in OpenAPI documentation via configuration property.
	2024/10/03	Feature	Support forcing the HTTP protocol version when accessing remote resources (in case unexpected errors are reported).
	2024/09/16	Feature	New REST API health-check endpoint for specific domains and overall validator.
	2024/07/30	Bug fix	When loading remote URIs for the content to validate or user-provided artefacts, redirects are not followed.
	2024/07/08	Improvement	REST API documentation (OpenAPI/Swagger) includes several executable examples for different validations.
	2024/07/05	Improvement	Include decoded message for exception stack traces in the log.
	2024/07/02	Improvement	Upgrade to Java 21.
1.6.0	2024/06/24	Release	Milestone release 1.6.0.
	2024/06/20	Feature	Support for validation type aliases to facilitate backwards compatibility of validator API clients.
	2024/06/18	Bug fix	Occasional error (StringIndexOutOfBoundsException) when loading REST API documentation through Swagger UI.
	2024/06/06	Improvement	Added the Referrer-Policy and Permissions-Policy HTTP headers to all responses.
	2024/05/24	Improvement	Support by default RDF/XML files as an accepted XML type for validations.
	2024/04/19	Improvement	Made available multi-architecture Docker image variants on the Docker Hub (supporting linux/amd64 and linux/arm64).
	2024/04/09	Bug fix	Schematron assertions set with CAUTION role should be treated as information messages not warnings.
	2024/04/08	Improvement	Support also RFC 2045 (MIME) for inputs provided as Base64-encoded strings.
	2024/02/09	Feature	Support direct input via editor for user-provided schemas, schematrons and context files.
	2024/02/09	Bug fix	Unable to provide external schemas or schematrons via URI through the minimal UI.
	2024/01/29	Feature	Support for input transformation before validation using XSLT stylesheets.
	2024/01/24	Feature	Support for HTML links in report item descriptions (if rich text support is enabled).
	2024/01/23	Feature	Support for Schematron validation after combining user-provided context files with the validator’s main input.
	2023/11/13	Bug fix	When multiple Schematrons are configured and a URI fails to be resolved the base path is not correctly reported in the error logs.
	2023/11/10	Bug fix	Upload errors from user-provided files are not translated correctly.
	2023/11/10	Bug fix	When a single user-provided context file is expected an incorrect error may be reported that it is missing.
1.5.0	2023/11/07	Release	Milestone release 1.5.0.
	2023/10/20	Feature	Support for user-provided context files that can be used from Schematron files and XSLTs.
	2023/10/20	Improvement	Spring Boot upgrade (version 3.1.5) to resolve published Tomcat CVEs.
	2023/10/20	Bug fix	Removal of Spring-related logging startup message when using a CLI validator.
	2023/09/22	Improvement	Library upgrades to resolve (non-exploitable) CVE-2023-41080.
	2023/09/14	Bug fix	A validator with types having both options and no options may show the validate button as disabled for a no-options type.
	2023/08/17	Improvement	Switched to a Jammy-based JRE Docker image to support installation on hosts with ARM processors.
	2023/08/02	Bug fix	Restored the Swagger UI interface for the validator’s REST API that was missing.
	2023/08/01	Bug fix	Corrected the lack of results being displayed when using the validator’s minimal UI in embedded mode.
1.4.0	2023/07/23	Release	Milestone release 1.4.0.
	2023/06/30	Feature	Include Schematron rule IDs in the validation report’s items (if present).
	2023/06/26	Improvement	Improved design of PDF validation reports.
	2023/06/19	Improvement	Stricter Content Security Policy (CSP) to limit script execution based on a nonce value.
	2023/06/19	Improvement	Upgrade to Java 17 and Spring Boot 3.
	2023/05/22	Feature	Support validation types and options that are available but hidden on the user interface.
	2023/05/12	Feature	Allow different labels for options and external validation artefacts upon selection of specific validation types and/or options.
	2023/05/05	Bug fix	Corrected styling of overall result when validating in non-English languages.
	2023/03/22	Improvement	Environment variable for base SOAP API publishing URL to simplify proxying.
	2023/03/20	Feature	Support for optional validator metadata included in produced GITB TRL reports (in XML and JSON formats).
	2023/02/14	Bug fix	Files uploaded through the user interface may not be automatically cleaned.
	2023/02/14	Improvement	Consider a generic validator instance (expecting user-provided artefacts) if the validator.resourceRoot property is not set.
1.3.0	2022/11/24	Release	Milestone release 1.3.0.
	2022/11/17	Improvement	Better use of colours to distinguish validation results and report item severity levels on the UI.
	2022/11/17	Improvement	When using the REST API to generate an XML GITB TRL report allow an input option on the use of CDATA blocks vs XML escaping for context data.
	2022/11/14	Feature	New REST API with operations to validate documents and query supported validation options.
	2022/10/12	Bug fix	User-provided XSLT Schematron files via the UI are reported as invalid.
	2022/09/30	Improvement	Distinguish the minimal from the regular web UI in collected statistics.
	2022/09/20	Bug fix	Trim provided URI strings before attempting to load their resources.
	2022/08/03	Feature	Configurable handling of errors while loading remote (pre-configured) validation artefacts.
	2022/07/25	Feature	Allowing configuration of default validation type.
	2022/06/07	Bug fix	Content provided via the UI and direct input via editor is not recognised correctly.
	2022/06/07	Bug fix	When viewing the results’ UI page the submit button is disabled when changing input method.
1.2.0	2022/05/31	Release	Milestone release 1.2.0.
	2022/05/17	Feature	Allow using the validator embedded in another UI (within an iframe), as-is or only for result display.
	2022/05/10	Improvement	The minimal UI now allows the user to toggle between the report’s summary and detailed display.
	2022/05/09	Improvement	Following validation through the UI the provided input parameters remain populated when viewing results.
	2022/05/05	Bug fix	Remotely loaded resources should ignore querystring URI extensions.
	2022/04/08	Feature	Allow preprocessing of input before validation using configured XPath expressions.
	2022/04/05	Feature	Allow filtering of the presented report items on the validator’s UI based on severity level.
	2022/04/05	Improvement	Highlight displayed report items when hovered to indicate they can be clicked for details.
	2022/04/05	Feature	When using the validator via its UI allow switching between detailed and aggregated display of reported findings.
	2022/04/05	Bug fix	Language selection menu not correctly positioned when showing a minimal UI.
	2022/03/30	Feature	CSV validation reports available when using the validator via its UI or CLI.
	2022/01/20	Improvement	Prevent remote artefact loading issues from affecting other validation types in a given domain.
1.1.0	2022/01/17	Release	Milestone release 1.1.0.
	2021/12/16	Feature	Support for internationalisation of validators and default translations for all EU languages.
	2021/12/03	Improvement	Dependency updates and resolution of CVE-2021-43466.
	2021/11/08	Improvement	Third-party library upgrades.
	2021/10/25	Improvement	Resolution of CVE-2021-42340.
	2021/10/05	Improvement	Third-party library upgrades.
	2021/09/27	Improvement	Third-party library upgrades.
	2021/08/18	Feature	Allow SOAP clients to choose whether Schematron-produced report item locations are line/column pairs or XPath expressions.
	2021/08/18	Feature	Allow SOAP clients to choose whether to include the validated content in the resulting report’s context.
	2021/08/17	Improvement	SOAP API update to explicitly set the mime type of the validated content in the resulting report.
	2021/07/23	Bug fix	Clean-up and release class-loaders of custom plugins at application shutdown.
	2021/07/23	Bug fix	Preprocessing of preconfigured artifacts in cases of a single artifact fails.
	2021/07/23	Bug fix	Lookup of preconfigured local artifacts may fail on Linux-based file systems.
	2021/06/29	Feature	Allow validator usage statistics reporting via webhook.
	2021/05/21	Improvement	Context data is no longer included in the PDF validation report.
	2021/05/03	Improvement	Add previous page button in error page and specific error handling for document downloads.
1.0.0	2021/04/21	Release	Milestone release 1.0.0.
	2021/04/21	Improvement	Support for publishing of milestone releases.
	2021/04/19	Improvement	Support for unicode properties in configuration files.
	2021/04/07	Feature	Add build timestamp in all JAR files’ manifests.
	2021/03/11	Feature	Allow the reuse of validation artefacts across domains and the split of domain configuration in several files.
	2021/02/12	Feature	Update internal Schematron processing library and simplify Schematron validation approach.
	2021/01/27	Improvement	Improve the detection of user-provided validation artefacts based on their embedding method.
	2021/01/21	Improvement	Consider content as plain-text if not possible to parse as a URL or BASE64.
	2021/01/19	Feature	Allow any configuration property to be provided via environment variable or system property.
	2021/01/18	Bug fix	Correctly define and clean up log and work folders when running via CLI tool.
	2020/12/07	Feature	Set a maximum threshold for the items included in an XML validation report.
	2020/12/07	Improvement	Improve reporting when used as a CLI tool.
	2020/12/04	Improvement	Improve the internal definition of common dependencies.
	2020/12/04	Improvement	Allow PDF validation reports to be produced by command line tools.
	2020/12/04	Bug fix	Correctly clean up temporary files on CLI tool exit.
	2020/10/27	Bug fix	Correct the handling of XSDs provided as a ZIP archive when using the SOAP API.
	2020/10/23	Improvement	Display error messages from custom plugins.
	2020/09/30	Feature	Set a maximum threshold for report items displayed on the web user interface.
	2020/09/30	Improvement	Add spinners on report download buttons of the web user interface to indicate a pending download.
	2020/09/28	Feature	Support validation type options as an extra configuration level over validation types.
	2020/09/28	Improvement	Always display on the web user interface external artefact inputs if supported the same way by all validation types.
	2020/09/24	Improvement	Improved reporting for unexpected errors.
	2020/09/23	Bug fix	Validation report may fail to open when accessing via the minimal web user interface.
	2020/09/21	Bug fix	The web interface’s code editor may fail to correctly highlight the offending line in the validated content.
	2020/09/21	Bug fix	The web interface should not display a report item as clickable if it cannot be highlighted in the validated content.
	2020/09/21	Bug fix	Correct the configuration property setting the validator’s log output folder.
	2020/09/16	Improvement	Improve internal support for handling UI events and configuration loading.
	2020/09/15	Improvement	Upgrade library dependencies.
	2020/09/14	Bug fix	Replace the mention of “schemas” with “artefacts” in error messages.
	2020/09/14	Bug fix	Correct the handling of externally-provided validation artefacts provided via the SOAP API when the embedding method is defined as an input.
	2020/07/13	Improvement	Improve the handling of user-provided validation artefacts.
	2020/07/02	Feature	Support custom validator plugins to extend the capabilities of the validator beyond the limitations of its validation artefacts.
	2020/06/30	Improvement	Extract common validator components to shared libraries.
	2020/05/15	Improvement	Improve the handling of user-provided validation artefacts.
	2020/02/05	Improvement	Enable wildcard CORS for the validator to simplify its integration.
	2020/02/03	Improvement	Improve the documentation and input validation of inputs provided via the SOAP API.
	2020/02/03	Feature	Support external validation artefacts being provided via URI and as a ZIP archive.
	2020/02/03	Feature	Support pre-processing of validation artefacts.
	2020/02/03	Feature	Allow a single validator instance to support multiple domains.
	2019/12/04	Feature	Support user-provided validation artefacts through the validator’s web UI.
	2019/11/29	Feature	Support configurable banner and footer HTML blocks for the validator’s web UI.
	2019/11/29	Feature	Support a configurable minimal web user interface.
	2019/05/14	Feature	Allow XSD definitions to be optional.
	2019/04/10	Improvement	Upgrade of library dependencies.
	2019/03/28	Improvement	Migration to Java 11.
	2019/03/26	Improvement	Increased the default maximum upload limit for the validator’s web application.
	2019/03/20	Improvement	Allow a user-friendly domain name and include the validation type in the report display.
	2019/02/20	Improvement	Upgrade of library dependencies.
	2019/02/08	Bug fix	Corrections to prevent integration errors due to anti-CSRF checks.
	2019/02/07	Feature	Allow validation channels to be configured.
	2019/02/06	Feature	Support multiple domains in a single validator instance.
	2018/08/27	Improvement	Improved logging configuration.
	2018/08/24	Bug fix	Corrected the URI resolution for XSLT Schematron resources.
	2018/07/27	Improvement	Upgrade of library dependencies.
	2018/07/12	Feature	Support configurable ordering of report items and inclusion of test expressions.
	2018/05/28	Improvement	Upgrade of library dependencies.
	2018/04/23	Bug fix	Allow resolution of `document('')` definitions.
	2018/03/21	Improvement	Upgrade of library dependencies and improved Docker support.
	2018/03/02	Feature	Allow validation report to be downloaded in PDF format.
	2018/02/28	Bug fix	Adapt resource URLs to support reverse proxies.
	2018/02/22	Bug fix	Ensure that Schematron files can be optional.
	2018/02/22	Bug fix	Correctly resolve imported resources when validating with XSLT Schematrons.
	2018/02/22	Feature	Allow the exclusion of certain resources from the loaded Schematron folders.
	2017/09/19	Feature	Allow Schematron files to be optional in the validation.
	2017/05/23	Feature	Include assertions messages for rules that have succeeded.
	2017/05/09	Improvement	Improvements to internal resource configurations.
	2017/03/07	Improvement	Improvements to internal resource configurations.
	2016/12/09	Improvement	Improvements to logging output and the handling of emails.
	2016/08/16	Bug fix	Correct the use of the temp folder when using the validator as a CLI tool.
	2016/08/13	Bug fix	Correct Docker property definitions and the handling of empty email attachments.
	2016/08/12	Feature	Support for the validator to be used as a CLI tool.
	2016/08/11	Bug fix	Correctly handle BOM in BASE64 content.
	2016/08/11	Bug fix	Correct location detection for errors when Schematrons are in XSLT format.
	2016/08/11	Feature	Allow a target validation type to be defined when the validator is used via email submission.
	2016/08/10	Feature	Support for raw Schematron files and validation against multiple XSDs.
	2016/08/09	Feature	Support for multiple validation types.
	2016/08/08	Bug fix	Corrected the XSD resolver to ensure it correctly looks up nested resources.
	2016/06/24	Bug fix	Corrected the SOAP API implementation to ensure it is GITB-compliant.
	2016/03/22	Feature	Adapted initial implementation to ensure it can be used as a generic XML validator.
	2016/03/16	Feature	Support for email as an input channel.
	2016/03/09	Feature	Initial version.
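
The change table above can also be processed programmatically to answer the two questions it is meant for: which changes are newer than the build you are running, and which milestone release first includes a given change. The following is a minimal sketch with hypothetical function names, using a handful of rows copied from the table. It assumes that a build already contains the changes listed for its own build date.

```python
from datetime import date

# A few rows copied from the table above: (release, date, type, description).
ROWS = [
    ("", "2026/02/23", "Improvement", "Upgraded Spring Boot to v3.5.11."),
    ("1.10.0", "2026/02/04", "Release", "Milestone release 1.10.0."),
    ("", "2026/01/08", "Feature", "Configurable IP-based validation rate limiting."),
    ("1.9.0", "2025/08/22", "Release", "Milestone release 1.9.0."),
    ("", "2025/08/18", "Feature", "Support the mapping of remote schemas to local files."),
]


def parse_date(text: str) -> date:
    """Parse the table's YYYY/MM/DD date notation."""
    year, month, day = (int(part) for part in text.split("/"))
    return date(year, month, day)


def updates_after(rows, build_date: date):
    """Changes dated strictly after the build date of a 'latest' image."""
    return [row for row in rows
            if row[2] != "Release" and parse_date(row[1]) > build_date]


def first_release_including(rows, change_date: date):
    """Earliest milestone dated on or after the change, or None if not yet released."""
    candidates = [row for row in rows
                  if row[2] == "Release" and parse_date(row[1]) >= change_date]
    if not candidates:
        return None
    return min(candidates, key=lambda row: parse_date(row[1]))[0]


print(updates_after(ROWS, date(2026, 1, 8)))
print(first_release_including(ROWS, date(2025, 8, 18)))
```

With the sample rows, a change dated after the newest milestone yields `None`, meaning it is so far only available in the latest build.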

How to determine my validator’s version or build timestamp?

The version information of your validator (be it a full web application instance or a command line tool) can be inspected in any of the following ways:

If the validator UI is enabled, viewing the HTML source of the validator’s web page. This includes, at the end, hidden divs with ids `build.version` and `build.timestamp`.
If using via Docker, inspecting the Docker image tag and history. For the build timestamp, consider the top line returned by `docker history --format "{{.CreatedAt}}" isaitb/xml-validator:latest`.
If using via JAR file, inspecting its manifest file. This can be found in the JAR at `/META-INF/MANIFEST.MF`, where you can locate properties `Implementation-Version` and `Build-Timestamp`.
If using via JAR file, inspecting the validator’s configuration file. This can be found in the JAR at `/BOOT-INF/lib/xmlvalidator-common-1.0.0-SNAPSHOT.jar/application.properties`, where you can locate properties `validator.buildVersion` and `validator.buildTimestamp`.
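
For the JAR manifest option, the lookup can be scripted, since a JAR is a ZIP archive. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the demo builds an in-memory stand-in for a JAR with made-up values, and manifest values are assumed to be short enough not to be wrapped onto continuation lines.

```python
import io
import zipfile


def read_manifest_versions(jar) -> dict:
    """Return Implementation-Version and Build-Timestamp from a JAR's manifest.

    `jar` may be a file path or a binary file-like object.
    """
    wanted = {"Implementation-Version", "Build-Timestamp"}
    found = {}
    with zipfile.ZipFile(jar) as archive:
        manifest = archive.read("META-INF/MANIFEST.MF").decode("utf-8")
    for line in manifest.splitlines():
        name, separator, value = line.partition(": ")
        if separator and name in wanted:
            found[name] = value.strip()
    return found


# Self-contained demo: an in-memory stand-in for a validator JAR (values are made up).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr(
        "META-INF/MANIFEST.MF",
        "Manifest-Version: 1.0\r\n"
        "Implementation-Version: 0.0.0-demo\r\n"
        "Build-Timestamp: 2021-03-20 14:50:30 Z\r\n",
    )
demo.seek(0)
print(read_manifest_versions(demo))
```

To inspect a real validator, pass the path of the JAR file you downloaded or extracted from the Docker image in place of the in-memory demo.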

If the validator UI is enabled you may also view in its source two additional timestamps which may be of interest:

The `startup.timestamp` is the time when the validator was last started.
The `resource.timestamp` is the time when the configuration of the relevant domain was last updated or last checked for updates (if hosted on the Test Bed).
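
When the validator UI is enabled, the hidden divs mentioned above can be collected from the page source with a small parser. This is a sketch only: the class and function names are hypothetical, the sample markup is invented for the demo (the real page may differ in attributes and layout), and each div is assumed to contain plain text without nested elements.

```python
from html.parser import HTMLParser


class HiddenDivReader(HTMLParser):
    """Collect the text of the version and timestamp divs by their id."""

    IDS = {"build.version", "build.timestamp", "startup.timestamp", "resource.timestamp"}

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if tag == "div" and element_id in self.IDS:
            self._current = element_id

    def handle_data(self, data):
        if self._current:
            self.values[self._current] = self.values.get(self._current, "") + data

    def handle_endtag(self, tag):
        if tag == "div":
            self._current = None


def read_version_divs(page_source: str) -> dict:
    reader = HiddenDivReader()
    reader.feed(page_source)
    reader.close()
    return {key: value.strip() for key, value in reader.values.items()}


# Invented sample markup, for illustration only.
SAMPLE = (
    '<html><body><p>Validator</p>'
    '<div id="build.version" hidden>0.0.0-demo</div>'
    '<div id="startup.timestamp" hidden> 2021-03-20 14:50:30 Z </div>'
    '</body></html>'
)
print(read_version_divs(SAMPLE))
```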

In all cases, timestamps are formatted as `yyyy-MM-dd HH:mm:ss` (YEAR-MONTH-DAY HOUR:MINUTE:SECOND) such as “2021-03-20 14:50:30”, followed by the relevant timezone (“Z” for UTC).
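
If you need to compare such timestamps in a script, they can be parsed as follows. This is a minimal sketch with a hypothetical function name. It handles only the “Z” (UTC) marker, and because the exact separator between the time and the timezone is an assumption, it accepts the marker with or without a preceding space.

```python
from datetime import datetime, timezone


def parse_validator_timestamp(text: str) -> datetime:
    """Parse 'yyyy-MM-dd HH:mm:ss' followed by a timezone marker; only 'Z' (UTC) is handled."""
    # The date and time part always spans the first 19 characters.
    stamp, zone = text[:19], text[19:].strip()
    if zone != "Z":
        raise ValueError(f"Unsupported timezone marker: {zone!r}")
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)


print(parse_validator_timestamp("2021-03-20 14:50:30 Z").isoformat())
```

The returned values are timezone-aware, so a build timestamp and a startup timestamp can be compared directly.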
