Introduction

This article explains how to validate an XML document against an XSD schema. The validation is performed programmatically with the Java XML packages, which check that the XML document is well-formed and conforms to the schema. The check runs inside a Java transformation in Informatica PowerCenter 8.6.

Issue

We create the XML in an Expression transformation, with the XML tags hard-coded. No XSD was defined when the project started, so the generated XML has to be validated afterwards against a new XSD provided by the client. For this reason we chose a Java transformation.

Informatica 8.6: limitation

Apart from the session-level property that validates against an XSD, there is no user-friendly validation option.

Informatica 9.1: support

The XML Parser transformation has a validate option for the XML tags and can send the errors to a separate pipeline.

XML document

An XML document contains elements, attributes, and values of primitive data types. For example, consider the following XML document:

Sample.xml

XSD schema

An XSD schema defines elements, attributes, and the relationships between them. It conforms to the W3C XML Schema recommendations. The XSD schema for the above XML document, Sample.xsd, is as follows:

Sample.xsd

The following example shows how to validate an XML document against an XSD schema by using Java.

Mapping flow diagram:

Informatica process steps:

The above mapping takes the XML and XSD file names as input and sends a validation status to the target file.

1. Create the mapping variables $$XML_FILE and $$XSD_FILE and declare them in your parameter file.
2. Assign the XML and XSD file names, including their folder paths, to these parameters.
3. Create two ports, XML_FILE and XSD_FILE, in an Expression transformation.
4. Load the $$XML_FILE and $$XSD_FILE values into the output ports (XML_FILE, XSD_FILE) of the Expression transformation.
5. Pass the Expression transformation output ports (XML_FILE, XSD_FILE) to the Java transformation as input.
6. Create the output ports STATUS and ERROR in the Java transformation.
7. Place the validation code listed below in the On Input Row tab under the Java Code tab of the Java transformation.
8. Place the Java package imports listed below in the Import Packages tab under the Java Code tab of the Java transformation.
9. If the XML is valid against the XSD, STATUS is set to Valid; otherwise it is set to InValid.
10. In a Filter transformation, check for a STATUS of 'InValid' and send the error records to the target.

Place the following code in the On Input Row tab of the Java transformation:

try {
    // 1. Look up a factory for the W3C XML Schema language.
    SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

    // 2. Compile the schema from the XSD file named by the XSD_FILE input port.
    Schema schema = factory.newSchema(new StreamSource(XSD_FILE));

    // 3. Get a validator from the schema.
    Validator validator = schema.newValidator();

    // 4. Validate the XML file named by the XML_FILE input port.
    validator.validate(new StreamSource(XML_FILE));
    STATUS = "Valid";
} catch (Exception e) {
    // Any parsing or validation failure lands here; pass the message downstream.
    ERROR = e.getMessage();
    STATUS = "InValid";
}

The following Java packages are used for the XSD validation. Place these imports in the 'Import Packages' tab of the Java transformation:

import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
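To try the same validation logic outside PowerCenter, it can be wrapped in a small standalone class. The following is only a minimal sketch: the class name XsdCheckSketch, the method check, and the inline note schema are assumptions made for illustration and are not part of the mapping. In-memory strings stand in for the XML_FILE and XSD_FILE ports.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

class XsdCheckSketch {
    // Hypothetical schema: a <note> element with exactly one string child <to>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns "Valid", or "InValid: <message>" when parsing or validation fails.
    static String check(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return "Valid";
        } catch (Exception e) {
            return "InValid: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The first document matches the schema; the second uses an undeclared element.
        System.out.println(check("<note><to>Client</to></note>"));
        System.out.println(check("<note><from>Client</from></note>"));
    }
}
```

The status strings mirror the STATUS and ERROR ports of the Java transformation, so the behaviour can be checked on a workstation before the code is pasted into the mapping.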

Sample Param file entry

$$XML_FILE =C:\xml\Sample.xml
$$XSD_FILE =C:\xml\Sample.xsd

Output (Target File) Out.txt:

Test.xml InValid Error_Message.
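The catch block in the transformation code reports only the first problem, because validation stops at the first exception. If every schema violation is wanted in the error output, one option is to register an org.xml.sax.ErrorHandler on the Validator and collect the messages. This is a sketch under assumptions: the class CollectErrorsSketch, the method validate, and the order schema are hypothetical and would need adapting to the real ports and XSD.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class CollectErrorsSketch {
    // Hypothetical schema: <order> with an integer <qty> and a decimal <price>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='qty' type='xs:int'/>"
      + "<xs:element name='price' type='xs:decimal'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns every problem found; an empty list means the document is valid.
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    // Warnings do not affect validity in this sketch.
                }
                public void error(SAXParseException e) {
                    // Recoverable schema violation: record it and keep validating.
                    problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    // Not well-formed: stop and let the catch block record it.
                    throw e;
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        // Both values violate their declared types, so more than one message is collected.
        for (String p : validate("<order><qty>abc</qty><price>xyz</price></order>")) {
            System.out.println(p);
        }
    }
}
```

In the mapping, the collected messages could be joined into the ERROR port, with STATUS set to InValid whenever the list is not empty.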

The following Java classes are used in the Java transformation.

• Schema Class

This object represents a set of constraints that can be checked or enforced against an XML document.

Two kinds of validators can be created from a Schema object. One is Validator, which provides high-level validation operations that cover typical use cases. The other is ValidatorHandler, which works on top of SAX for better modularity.
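The transformation code uses the Validator route. For comparison, the ValidatorHandler route can be sketched as follows: the handler is plugged into a SAX XMLReader as its content handler, so validation happens while the parser streams events. The class ValidatorHandlerSketch, the method isValid, and the note schema are assumptions made for this illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ValidatorHandlerSketch {
    // Hypothetical schema: a <note> element with exactly one string child <to>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));

            // The handler validates the SAX events it receives.
            ValidatorHandler handler = schema.newValidatorHandler();
            handler.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // treat every schema violation as a failure
                }
            });

            // Schema validation needs a namespace-aware parser.
            SAXParserFactory parserFactory = SAXParserFactory.newInstance();
            parserFactory.setNamespaceAware(true);
            XMLReader reader = parserFactory.newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<note><to>Client</to></note>"));
        System.out.println(isValid("<note><from>Client</from></note>"));
    }
}
```

For a simple pass or fail check on a file, the Validator used in the mapping is the shorter option; the handler form is mainly of interest when the document is already being processed through a SAX pipeline.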

• SchemaFactory class

SchemaFactory is a schema compiler. It reads external representations of schemas and prepares them for validation. The class is capable of locating other implementations for other schema languages at run-time.

To find a SchemaFactory object for a given schema language, the newInstance method looks in a fixed sequence of places, where "the class loader" refers to the context class loader. The sequence itself is listed in the SchemaFactory Javadoc.

• Validator Class

A processor that checks an XML document against a Schema.

A validator is a thread-unsafe and non-reentrant object. In other words, it is the application’s responsibility to make sure that one Validator object is not used from more than one thread at any given time, and while the validate method is invoked, applications may not recursively call the validate method.
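One common way to respect this rule is to compile the Schema once, since the Schema Javadoc describes Schema objects as thread safe, and to create a fresh Validator inside each task. The sketch below assumes a hypothetical class PerThreadValidatorSketch with a method validateAll, an inline note schema, and a pool of four threads; none of these come from the mapping.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class PerThreadValidatorSketch {
    // Hypothetical schema: a <note> element with exactly one string child <to>.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static Schema compileSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Validates each document on a worker thread; results keep the input order.
    static List<Boolean> validateAll(List<String> docs) {
        final Schema schema = compileSchema(); // shared by all tasks
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> pending = new ArrayList<>();
            for (final String doc : docs) {
                Callable<Boolean> task = () -> {
                    try {
                        // A new Validator per task: never shared between threads.
                        Validator validator = schema.newValidator();
                        validator.validate(new StreamSource(new StringReader(doc)));
                        return true;
                    } catch (SAXException e) {
                        return false;
                    }
                };
                pending.add(pool.submit(task));
            }
            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> f : pending) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        docs.add("<note><to>Client</to></note>");
        docs.add("<note><from>Client</from></note>");
        System.out.println(validateAll(docs));
    }
}
```

The transformation code in this article creates its Validator inside the On Input Row block, so each row gets its own instance and the restriction is not a concern there.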

• XMLConstants

Utility class to contain basic XML values as constants.

Conclusion

This article described the XML document, the XSD schema, and how to validate an XML document against an XSD schema using Java in Informatica.

Posted by Sreenivasa Rao B
February 7th, 2012