Problem Statement: Data to be processed in Informatica were XML files in nature. The number of XML files to be processed was dynamic in nature. The need was also to ensure that the XML file name from which data is being processed is to be captured.

Resolution:
Option 1 – Using File list as part of Indirect File Sources in session
Option 2 – Using Parameter File and workflow variable

Implementation Details for option 1: Using File list

XML file names to be processed were read using batch script and file list was created containing XML file. This file list name was set under source properties at session level. XML file were read sequentially and data pertaining to every XML file was processed. Since the number of XML files to be processed was dynamic the need of the hour was to achieve looping in Informatica.

Challenge in using File List – Created in a session to run multiple source files for one source instance in the mapping. When file list is used in a mapping as multiple source files for one source instance, the properties of all files must match the source definition. File list are configured in session properties by mentioning the file name of the file list in the Source Filename field and location of the file list in the Source File Directory field. When the session starts, the Integration Service reads the file list, then locates and reads the first file source in the list. After the Integration Service reads the first file, it locates and reads the next file in the list. The issue using XML file names in file list was further compounded by Informatica grouping records pertaining to similar XML node together. This lead to difficultly in identifying which record belonged to which XML file.

Batch Script – batch scripts controlled over all looping in Informatica by encompassing below mentioned tasks:
• Reading XML file names from staging location and creating file list containing XML file names.
• Moving XML files from staging location to archive location.
• Verifying whether there are any more XML files to be processed and depending on the outcome either loop the process by invoking first workflow or end the process
• Using PMCMD commands invoke appropriate workflows.

Workflow Details –
There were two Informatica workflows designed to achieve looping:
• First workflow –created indirect file to be used as source in session properties and will trigger second workflow. Details of workflow are:
o Command task will execute a DOS batch script which will create indirect file after reading XML filenames from a pre-defined location on server.
o Command task which will execute the second workflow to process data within XML files.

image1

• Second workflow will read process XML files and populate staging tables. Details of workflow are:
o A session will read XML file names using indirect file and load into staging tables.
o A command task will move the XML file just processed in file into an archive folder. Using batch script
o A command task will execute a batch script which will:
 Check whether there are any more XML files to be processed.
 If yes then it will trigger the first workflow. This will ensure all XML files are processed and loaded into staging tables.
 If no then process will complete.

image2

Thanks for reading, pls let me know have you faced any similar situation.

Posted by Sivaram
Comments (11)
March 3rd, 2010

Comments (11)

Swetha - April 21st, 2011

Hi Sivaram, This is really brilliant. Can you share the script which is used for looping. Thanks, Swetha

Sujay - March 8th, 2011

Hi Sivaram, Let me first thank you for the wonderful explanation of the solution. I have to use a similar approach in my code. However, in my staging table I also need to document the name of the XML file. Is there a way I can obtain the XML file name and update it in the Satging table. Thanks. Sujay

Sri - December 10th, 2010

Hi Siva, It's really useful info and i have a similar issue where i have to read multiple xml files and my file is getting generated from batch scripts,but the problem is i'm getting an error like below FR_3018:The indirect input file does not contain filenames. But everything is set fine and same scenario is working fine with FlatFiles.I tried all options but failed to find the error can anyone please help me on this?

Madhav - November 12th, 2010

Hi Sivaram, This was really nice. I need to work on an xml output file created by a webservice. When I use the exact file, the file is not recognized as input but when I remove the soap header and Body I'm able to process. Can you please help me on this? Thank You, Madhav

Devidas Kanchetti - August 10th, 2010

Thanks boss ..very usefull information

Vimal - July 27th, 2010

Really the contents in this blog very useful for both learners and expericences.

Prasad - May 24th, 2010

Hey nice article man... I know how to read multiple files.. but i want to read the files one by one, i mean loop. Plz give the looping scrpt....

Saurabh Kale - May 13th, 2010

HI , I have similar situation as yours . Can you please share the batch script used for overall looping of the workflow. Actually when we use file list it throws eror for multiple files .so we need to process files one by one.can u tell how we can do this. Thanks, saurabh

Murugan - March 24th, 2010

Hi Mr.Sivaram, This blog is really useful. Actually I am new to use XML Source in informatica and I have the similar kind of requirement in my project. I would like to discuss with you for get more information. If you don't mind please share your contact number, so that I'll explain my requirement. Regards Murug

Sivaram Sivanarayanan - March 23rd, 2010

Chander, Thanks for your comments. As requested the script details are mentioned below: 1. Reading XML file names from staging location and creating file list containing XML file names. Both these options are achieved using a single batch script as detailed below: @echo off cd D:\ cd D:\InformaticaRoot\SrcFiles\SourceXML if exist D:\InformaticaRoot\SrcFiles\SourceXML\pXMLFileNames.txt del D:\InformaticaRoot\SrcFiles\SourceXML\pXMLFileNames.txt set var2= for %%a in ("D:\InformaticaRoot\SrcFiles\SourceXML\*.xml") do set var2=%%a if not "%var2%"=="" ( echo %var2:~40,62% >> D:\InformaticaRoot\SrcFiles\SourceXML\pXMLFileNames.txt exit /b 0 ) 2.Moving XML files from staging location to archive location @echo off cd D:\ cd D:\InformaticaRoot\SrcFiles\SourceXML set filecontent= for /f "delims=" %%a in (D:\InformaticaRoot\SrcFiles\SourceXML\pXMLFileNames.txt) do set filecontent=%%a move D:\InformaticaRoot\SrcFiles\SourceXML\%filecontent% D:\InformaticaRoot\SrcFiles\SourceXML\Archive exit /b In the above scripts, pXMLFileNames refers to file list name [indirect file source]. The XML files are assumed to be present at location D:\InformaticaRoot\SrcFiles\SourceXML [which are read sequentially] and the archive file location will be D:\InformaticaRoot\SrcFiles\SourceXML\Archive respectively on Informatica PowerCenter Server. Do let me know in case you need any further details from me. Regards, Sivaram.

chander - March 22nd, 2010

Hi, Your blog is really useful and I definetly recommand all my friends to use it for Infa reference. Could you please share the batch script for the following. • Reading XML file names from staging location and creating file list containing XML file names. • Moving XML files from staging location to archive location. Thanks Chander K

Comments are closed.