Guide: Setting up YAML validation
This guide walks you through the points to consider when setting up YAML validation and the steps to bring your validation service online.
What you will achieve
By the end of this guide you will understand what you need to consider when starting to implement validation services for your YAML-based specification. You will also have gone through the steps to bring a validator online and make it available to your users.
A YAML validation service can be created using multiple approaches depending on your needs. You can have an on-premise (or local to your workstation) service through Docker or use the test bed’s resources and, with minimal configuration, bring online a public service that is automatically kept up-to-date.
For the purpose of this guide you will be presented with the options to consider and will start with a Docker-based instance that could be replaced (or complemented) by a setup through the test bed. Interestingly, the configuration relevant to the validator is the same regardless of the approach you choose to follow.
What you will need
About 30 minutes.
A text editor.
A web browser.
Access to the Internet.
Docker installed on your machine (only if you want to run the validator as a Docker container).
A basic understanding of YAML, JSON and JSON schema. A good source for more information here is the Understanding JSON schema tutorial site.
How to complete this guide
A YAML validator uses the same underlying software as a JSON validator and, for the most part, shares the same documentation. Going through the steps of this guide you will often be directed to the JSON validator documentation, with additional options presented when specifically applicable to a YAML validator.
The configuration steps are hands-on and will result in you creating a fully operational validation service. For these practical steps there are no prerequisites and the content for all files to be created is provided in each step. In addition, if you choose to try your setup as a Docker container you will also be issuing commands on a command line interface (all commands are provided and explained as you proceed).
Steps
To complete this guide follow the steps described in this section. Certain steps include sub-sections with optional configurations that are discussed but are not required to create a basic validator.
Step 1: Determine your testing needs
Before proceeding to set up your validator you need to clearly determine your testing needs. A first outline of the approach to follow can be drawn by answering the following questions:
Will the validator be available to your users as a tool to be used on an ad-hoc basis?
Do you plan on measuring the conformance of your community’s members to the YAML-based specification?
Is the validator expected to be used in a larger conformance testing context (e.g. during testing of a message exchange protocol)?
Should the validator be publicly accessible?
Should test data and validation reports be treated as confidential?
The first choice to make is on the type of solution to power your validation service:
Standalone validator: A service allowing validation of YAML content based on a predefined configuration of JSON schemas. The service supports fine-grained customisation and configuration of different validation types (e.g. specification versions) and supported communication channels. Importantly, use of the validator is anonymous and it is fully stateless in that none of the test data or validation reports are maintained once validation completes.
Complete test bed: The test bed is used to realise a full conformance testing campaign. It supports the definition of test scenarios as test cases, organised in test suites that are linked to specifications. Access is account-based allowing users to claim conformance to specifications and execute in a self-service manner their defined test cases. All results are recorded to allow detailed reporting, monitoring and eventually certification. Test cases can address YAML validation but are not limited to that, allowing validation of any complex exchange of information.
It is important to note that these two approaches are by no means exclusive. It is often the case that a standalone validator is defined as a first step that is subsequently used from within test cases in the test bed. The former solution offers a community tool to facilitate work towards compliance supporting ad-hoc data validation, whereas the latter allows for rigorous conformance testing to take place where proof of conformance is required. This could apply in cases where conformance is a qualification criterion before receiving funding or before being accepted as a partner in a distributed system. Finally, it is interesting to consider that non-trivial YAML validation may involve multiple validation artefacts (e.g. different schemas for different message types). In such a case, even if ad-hoc data validation is not needed, defining a separate validator simplifies management of the validation artefacts by consolidating them in a single location, as opposed to bundling them within test suites.
Regardless of the choice of solution, the next point to consider will be the type of access. If public access is important then the obvious choice is to allow access over the Internet. An alternative would be an installation that allows access only through a restricted network, be it an organisation’s internal network or a virtual private network accessible only by your community’s members. Finally, an extreme case would be access limited to individual workstations where each community member would be expected to run the service locally (albeit of course without the expectation to test message exchanges with remote parties).
If access to your validation services over the Internet is preferred or at least acceptable, the simplest case is to opt for using the shared DIGIT test bed resources, both regarding the standalone validator and the test bed itself. If such access is not acceptable or is technically not possible (e.g. access to private resources is needed), the proposed approach would be to go for a Docker-based on-premise installation of all components.
Summarising the options laid out in this section, you will first want to choose:
Whether you will be needing a standalone validator, a complete test bed or both.
Whether the validator and/or test bed will be accessible over the Internet or not.
Your choices here can help you better navigate the remaining steps of this guide.
Step 2: Prepare validation artefacts
Generally speaking, YAML content is semantically equivalent to JSON and can easily be converted to and from it. As such, the most common means of validating the content of a YAML file is to use JSON schema.
As an example validation case we will consider a variation of the EU purchase order case first seen in Guide: Creating a test suite. In short, for the purposes of this guide you are considered to be leading an EU cross-border initiative to define a new common specification for the exchange of purchase orders between retailers.
To specify the content of purchase orders your experts have created the following JSON schema:
{
  "$id": "http://itb.ec.europa.eu/sample/PurchaseOrder.schema.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "description": "A JSON representation of EU Purchase Orders",
  "type": "object",
  "required": [ "shipTo", "billTo", "orderDate", "items" ],
  "properties": {
    "orderDate": { "type": "string" },
    "shipTo": { "$ref": "#/definitions/address" },
    "billTo": { "$ref": "#/definitions/address" },
    "comment": { "type": "string" },
    "items": {
      "type": "array",
      "items": { "$ref": "#/definitions/item" },
      "minItems": 1,
      "additionalItems": false
    }
  },
  "definitions": {
    "address": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "street": { "type": "string" },
        "city": { "type": "string" },
        "zip": { "type": "number" }
      },
      "required": ["name", "street", "city", "zip"]
    },
    "item": {
      "type": "object",
      "properties": {
        "partNum": { "type": "string" },
        "productName": { "type": "string" },
        "quantity": { "type": "number", "minimum": 0 },
        "priceEUR": { "type": "number", "minimum": 0 },
        "comment": { "type": "string" }
      },
      "required": ["partNum", "productName", "quantity", "priceEUR"]
    }
  }
}
Based on this, a sample purchase order would be as follows:
shipTo:
  name: John Doe
  street: Europa Avenue 123
  city: Brussels
  zip: 1000
billTo:
  name: Jane Doe
  street: Europa Avenue 210
  city: Brussels
  zip: 1000
orderDate: '2020-01-22'
items:
  - partNum: XYZ-123876
    productName: Mouse
    quantity: 20
    priceEUR: 8.99
    comment: Confirm this is wireless
  - partNum: ABC-32478
    productName: Keyboard
    quantity: 5
    priceEUR: 25.5
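As noted earlier, YAML converts readily to and from JSON, and you can see this equivalence for yourself by converting the sample order. The following is a minimal sketch assuming Python 3 with the PyYAML package is available (any YAML-aware converter would do; the order.yaml and order.json file names are illustrative):
# Convert the sample purchase order from YAML to JSON (assumes PyYAML is installed).
python3 -c "import sys, yaml, json; json.dump(yaml.safe_load(sys.stdin), sys.stdout, indent=2)" \
  < order.yaml > order.json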
A first obvious validation for purchase orders would be against the defined JSON schema. However, your business requirements also define the concept of a large purchase order, which is one that includes at least 10 order items. This restriction is not reflected in the JSON schema, which is considered a base for all purchase orders, but rather in a separate JSON schema file that checks this only for orders that are supposed to be “large”. Such a rule file would be as follows:
{
  "$id": "http://itb.ec.europa.eu/sample/PurchaseOrder-large.schema.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "description": "Business rules for large EU Purchase Orders expressed in JSON",
  "type": "object",
  "required": [ "items" ],
  "properties": {
    "items": {
      "type": "array",
      "items": {
        "type": "object"
      },
      "minItems": 10
    }
  }
}
As you see in the content of the two schemas, the first one defines the structure of the expected YAML data, whereas the second one does not replicate structural checks, focusing only on the number of items. In this case a valid large purchase order would be expected to validate against both schemas. Note that the sample order above, containing only two items, would satisfy the basic schema but fail the large-order rules.
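If you want to sanity-check these schemas before configuring the validator, you can validate the JSON form of the order produced earlier against each of them. The sketch below assumes Node.js with the ajv-cli package, an illustrative tooling choice that this guide’s validator does not require:
# Validate the converted order against each schema (assumes Node.js and ajv-cli).
npx ajv-cli validate -s PurchaseOrder.schema.json -d order.json
# Expected to fail for the sample order, which contains only two items.
npx ajv-cli validate -s PurchaseOrder-large.schema.json -d order.json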
Given these requirements and validation artefacts we want to support two types of validation (or profiles):
basic: For all purchase orders, acting as a common base. This is realised by validating against PurchaseOrder.schema.json.
large: For large purchase orders. This is realised by validating against both PurchaseOrder.schema.json and PurchaseOrder-large.schema.json.
As the first configuration step for the validator we will prepare a folder with the required resources. For this purpose create a root folder named validator with the following subfolders and files:
validator
└── resources
└── order
└── schemas
├── PurchaseOrder.schema.json
└── PurchaseOrder-large.schema.json
Regarding the PurchaseOrder.schema.json and PurchaseOrder-large.schema.json files, you can create them from the above content or download them (here: PurchaseOrder.schema.json and PurchaseOrder-large.schema.json).
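If you prefer the command line, this structure can be created as follows (standard shell commands; adapt paths as needed):
# Create the folder structure for the validator's resources.
mkdir -p validator/resources/order/schemas
# Then place (or download) the two schema files into the schemas folder.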
Step 3: Prepare validator configuration
Note
YAML validator configuration options are discussed in detail in the JSON validation guide.
YAML validation is supported through the same underlying validator component used to validate JSON. Moreover, the exact same configuration steps and options for JSON validators also apply to YAML validation.
This allows you to go beyond what a basic YAML validator would cover, by leveraging advanced options such as the pre-processing of input using JSONPath expressions (for example, an expression such as $.items could be used to extract and validate only an order's items). Features that would normally apply to JSON work for the YAML validator by first automatically converting the received YAML input to JSON for validation, and then converting back to YAML for result reporting.
While going through the configuration options of the JSON validation guide you may have noticed that you can configure support for YAML input as an alternative to JSON. If you are defining a validator explicitly for YAML however, it would make sense to accept only YAML input. To enforce this set validator.yamlSupport to force as illustrated in the following configuration:
...
# Require YAML input for all validation types.
validator.yamlSupport = force
By setting this property, only YAML input will be accepted by the validator. To support JSON alongside YAML you can alternatively set the property’s value to support.
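To make this more concrete, the following is an indicative sketch of how the order domain’s configuration file could combine the two validation types from Step 2 with the YAML-only setting. The property names (validator.type and validator.schemaFile) and the configuration file’s name and location follow the JSON validation guide; treat this as a sketch to verify against that guide rather than a definitive configuration:
# Indicative sketch of the domain configuration (e.g. resources/order/config.properties).
# The two validation types (profiles) defined in Step 2.
validator.type = basic, large
# The schema(s) each type validates against.
validator.schemaFile.basic = schemas/PurchaseOrder.schema.json
validator.schemaFile.large = schemas/PurchaseOrder.schema.json, schemas/PurchaseOrder-large.schema.json
# Require YAML input for all validation types.
validator.yamlSupport = force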
Step 4: Deploy the validator
Note
The validator’s deployment options match those available for the JSON validator, be it as a self-hosted Docker container or hosting on the test bed.
When considering a self-hosted approach, a possible extension that may be interesting for a YAML validator is replacing the validator’s context root. Given that the YAML validator is a configuration of the isaitb/json-validator image, the context root defaults to /json.
If you want to change this to e.g. /yaml you can do so using the server.servlet.context-path environment variable. In case you are defining your own Docker image this would be done in your Dockerfile as follows:
# Extend the JSON validator image, which also powers YAML validation.
FROM isaitb/json-validator:latest
# Include the validation artefacts prepared in Step 2.
COPY resources /validator/resources/
ENV validator.resourceRoot /validator/resources/
# Serve the validator under /yaml instead of the default /json.
ENV server.servlet.context-path /yaml
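Having defined such a Dockerfile, building and starting the image uses standard Docker commands (the po-validator image and container names are just examples):
# Build the custom validator image and start a container from it.
docker build -t po-validator .
docker run -d --name po-validator -p 8080:8080 po-validator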
In case you are using the isaitb/json-validator image directly without extending it, the configuration would be similar to the following:
docker run -d --name po-validator -p 8080:8080 \
   -v /validator/resources:/validator/resources/ \
   -e validator.resourceRoot=/validator/resources/ \
   -e server.servlet.context-path=/yaml \
   isaitb/json-validator
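In either case, you can check that the validator started correctly before sharing it with your users. The web interface path below is an assumption based on the JSON validator’s /<context root>/<domain>/upload convention, which with the above settings would give /yaml/order/upload:
# Follow the container logs to confirm a clean start-up.
docker logs -f po-validator
# The web user interface should then be reachable at (assumed path):
#   http://localhost:8080/yaml/order/upload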
Step 5: Use the validator
Note
Detailed usage instructions for the validator are provided in the JSON validation guide.
A YAML validator shares the same underlying software as JSON validators, and is used in the same way. For detailed usage instructions you may refer to the relevant sections from the JSON validation guide depending on your use case, specifically:
The web user interface (also in minimal and embedded modes) for manual usage.
The REST API for simple web service use (see the example following this list).
The SOAP API for contract-based web service use.
The command-line tool for scripting.
Use in GITB TDL test cases to leverage the validator in conformance test scenarios.
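As an illustration of the REST API option referenced above, a validation call against a locally running validator could look as follows. This is a hedged sketch: the endpoint path and the request fields (contentToValidate, embeddingMethod, validationType) are assumed based on the JSON validation guide’s REST API section, so verify the exact contract there before use.
# Sketch of a REST API validation call (endpoint and field names assumed from
# the JSON validation guide - verify them there before use).
curl -X POST http://localhost:8080/yaml/api/validate \
  -H 'Content-Type: application/json' \
  -d '{
    "contentToValidate": "shipTo:\n  name: John Doe\n...",
    "embeddingMethod": "STRING",
    "validationType": "basic"
  }'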
Summary
Congratulations! You have just set up a validation service for your YAML specification. In doing so you considered your needs and defined your service through configuration on the DIGIT test bed or as a Docker container. In addition, you used this service via its different APIs and considered how this could be used as part of complete conformance testing scenarios.
See also
The test bed provides a generic YAML validator that you can use while providing your own JSON Schema(s) for validation. This can be useful for quick validations, but also to try out your JSON Schemas during development.
In case you plan to use a YAML validator as a validation step in complete conformance testing scenarios, several additional guides are available with further information:
Guide: Creating a test suite on how to create a simple GITB TDL test suite.
Guide: Defining your test configuration on how to configure a GITB TDL test suite in the test bed as part of your overall test setup.
Guide: Executing a test case on how to execute tests and monitor results.
Guide: Installing the test bed for development use on how to install your own test bed instance to test with.
For the full information on GITB TDL test cases check out the GITB TDL documentation, the reference for all test step constructs as well as a source of numerous complete examples.
In case you need to consider validation of further content types, be aware that the test bed provides similar support for:
XML validation using XML Schema and Schematron.
RDF validation using SHACL shapes.
CSV validation using Table Schema.
JSON validation using JSON Schema.
If you are planning on operating a validator on your own premises for production use, check the validator production installation guide for the steps to follow and guidelines to consider.
Finally, for more information on Docker and the commands used in this guide, check out the Docker online documentation.