Ab Initio Online Training
Ab Initio Online Training Course
Introduction to Ab Initio
• Ab Initio Architecture
• Graph Programming
• Introduction to .dat and .dml files
Partition Components
• Broadcast
• Partition by Expression
• Partition by Range
• Partition by community
• Partition by percentage
• Partition by Round Robin
Departition Components
• Concatenate
• Gather
• Interleave
• Merge
Multifile System (MFS)
Types of parallelism
Layouts
Sort Components
• Sort
• Sort within Groups
• Sample
• Partition by Key and Sort
Transform Components
• Filter by expression
• Aggregate
• Scan
• Rollup
• Denormalize Sorted
• Normalize
• Reformat
• Match sorted
• Dedup sorted
Working with Databases
Database components
• Run SQL
• Input Table
• Output Table
• Truncate Table
• Update Table
Phases and Checkpoints
Miscellaneous Components
• Gather Logs
• Run Program
• Redefine Format
• Trash
• Replicate
Dataset Components
• Input File
• Output File
• Lookup File
• Intermediate File
FTP Components
• FTP From
• FTP To
Compress Components
• Compress
• Uncompress
• Gzip
• Gunzip
Validate Components
• Check Order
• Generate Records
• Generate Random Bytes
• Compare Records
• Compute Checksum
• Compare Checksums
Translate Components
• Read XML
• Write XML
Project and Sandbox
Performance Tuning
Overview
Ab Initio is Latin for “from the beginning”. Ab Initio software works on a client-server model.
The client is called the “Graphical Development Environment” (GDE). It resides on the user's desktop. The server, or back end, is called the “Co-Operating System”. The Co-Operating System can reside on a mainframe or on a remote Unix machine.
FAQs
What is the relation between the EME, the GDE and the Co-Operating System?
Ans. EME is the Enterprise Meta Environment, GDE is the Graphical Development Environment, and the Co-Operating System is the Ab Initio server. The relation between them is as follows. The Co-Operating System is installed on a particular operating system platform, which is called the native OS. The EME is like the repository in Informatica: it holds the metadata, transformations, database configuration files, and source and target information. The GDE is the end-user environment where graphs are developed (comparable to mappings in Informatica).
The designer uses the GDE to design graphs and saves them to the EME or to a sandbox. The sandbox is on the user side, whereas the EME is on the server side.
What is the use of Aggregate when we have Rollup? The Rollup component in Ab Initio is used to summarize groups of data records, so where would we use Aggregate?
Ans. Aggregate and Rollup can both summarize data, but Rollup is much more convenient to use. When you need to understand how a particular summarization is performed, Rollup is much more explanatory than Aggregate. Rollup also offers additional functionality, such as input and output filtering of records.
Aggregate and Rollup perform the same action, but Rollup can hold intermediate results in main memory, whereas Aggregate does not support intermediate results.
What kinds of layouts does Ab Initio support?
Ab Initio supports serial and parallel layouts, and a graph can use both at the same time. A parallel layout depends on the degree of data parallelism. If the multifile system is 4-way parallel, a component in the graph can run 4-way parallel, provided its layout is defined with the same degree of parallelism.
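The idea of a 4-way parallel layout can be sketched in Python. This is only an illustration of data parallelism, not Ab Initio code: the names round_robin_partition, run_component and run_parallel are hypothetical, and the "component" is an ordinary function applied to each partition separately.

```python
from concurrent.futures import ThreadPoolExecutor


def round_robin_partition(records, ways):
    """Deal records out to `ways` partitions in turn."""
    partitions = [[] for _ in range(ways)]
    for index, record in enumerate(records):
        partitions[index % ways].append(record)
    return partitions


def run_component(partition):
    """Stand-in for one component instance working on its own partition."""
    return [value * 2 for value in partition]


def run_parallel(records, ways=4):
    """Run one component instance per partition and collect the results."""
    partitions = round_robin_partition(records, ways)
    with ThreadPoolExecutor(max_workers=ways) as pool:
        # pool.map keeps the results in partition order.
        results = list(pool.map(run_component, partitions))
    return partitions, results
```

With ways=4, each of the four partitions is processed by its own copy of the component, which mirrors how a component with a 4-way layout works on a 4-way multifile.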
How can you run a graph infinitely?
To run a graph infinitely, the end script of the graph should call the graph's .ksh file. For example, if the graph is named abc.mp, the end script should contain a call to abc.ksh.
In this way the graph starts again each time it finishes, so it runs infinitely.
How do you add default rules in a transform?
Double-click the transform parameter on the Parameters tab of the component properties; this opens the Transform Editor. In the Transform Editor, open the Edit menu and select Add Default Rules. It shows two options: 1) Match Names 2) Wildcard.
Do you know what a local lookup is?
If your lookup file is a multifile and is partitioned and sorted on a particular key, the local lookup function can be used in preference to the regular lookup function. The lookup is then local to a particular partition, depending on the key.
A Lookup File consists of data records that can be held in main memory. This lets a transform function retrieve the records much faster than it could from disk, and it allows a transform component to process data records from multiple files quickly.
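The difference between searching one partition and searching everything can be sketched in Python. This is an illustration only, not Ab Initio's implementation: partition_of, build_partitioned_lookup and lookup_local_sketch are hypothetical names, and the partitioning rule (sum of character codes modulo the number of partitions) is an assumption chosen to keep the example deterministic.

```python
def partition_of(key, ways):
    """Pick a partition for a key (assumed rule: character codes modulo ways)."""
    return sum(ord(ch) for ch in str(key)) % ways


def build_partitioned_lookup(rows, key_field, ways):
    """Build one in-memory table per partition, keyed on key_field."""
    tables = [{} for _ in range(ways)]
    for row in rows:
        key = row[key_field]
        tables[partition_of(key, ways)][key] = row
    return tables


def lookup_local_sketch(tables, partition, key):
    """Search only the table that belongs to the given partition."""
    return tables[partition].get(key)
```

A local lookup only finds a record if the data being processed is partitioned on the same key as the lookup file; a key that lives in another partition is simply not found.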
What is the difference between look-up file and look-up, with a relevant example?
Generally, a Lookup File represents one or more serial files (flat files). The amount of data is small enough to be held in memory. This allows transform functions to retrieve records much more quickly than they could from disk.
A lookup is a component of an Ab Initio graph in which we can store data and retrieve it by using a key parameter.
A lookup file is the physical file where the data for the lookup is stored.
How many components are in your most complicated graph?
It depends on the type of components you use. Usually, avoid using overly complicated transform functions in a graph.
Explain what a lookup is.
A lookup is basically a specific dataset that is keyed. It can be used to map values according to the data present in a particular file (serial or multifile). The dataset can be static or dynamic (for example, when the lookup file is generated in a previous phase and used as a lookup file in the current phase). Sometimes a hash join can be replaced by a Reformat with a lookup, if one of the inputs to the join contains few records with a small record length.
Ab Initio has built-in functions to retrieve values from the lookup by key.
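The idea of replacing a join with a keyed in-memory lookup can be sketched in Python. This is an illustration only, not Ab Initio DML: build_lookup and reformat_with_lookup are hypothetical names, and the record fields in the usage note are made-up sample data.

```python
def build_lookup(rows, key_field):
    """Load the small dataset into memory, keyed on key_field."""
    return {row[key_field]: row for row in rows}


def reformat_with_lookup(records, lookup, key_field, value_field, default=None):
    """Copy each record and add value_field taken from the matching lookup row."""
    output = []
    for record in records:
        match = lookup.get(record[key_field])
        enriched = dict(record)
        enriched[value_field] = match[value_field] if match else default
        output.append(enriched)
    return output
```

For example, a small table of country codes and names can be loaded with build_lookup(countries, "code") and then used to add a "name" field to each record of a much larger stream, without sorting or joining the two inputs.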
What is a ramp limit?
The limit parameter contains an integer that represents a number of reject events.
The ramp parameter contains a real number that represents a rate of reject events relative to the number of records processed.
Number of bad records allowed = limit + (number of records processed * ramp).
The ramp is a fraction between 0 and 1.
Together, these two parameters give the threshold for the number of bad records.
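The threshold formula can be worked through in a short Python sketch. The function names allowed_rejects and should_abort are hypothetical, and treating the threshold as exceeded only when the reject count is strictly greater than it is an assumption made for this example.

```python
def allowed_rejects(limit, ramp, records_processed):
    """Threshold of bad records: limit + records_processed * ramp."""
    return limit + records_processed * ramp


def should_abort(rejects, limit, ramp, records_processed):
    """Assumed rule: stop once the rejects exceed the threshold."""
    return rejects > allowed_rejects(limit, ramp, records_processed)
```

With limit = 10 and ramp = 0.25, after 200 records the threshold is 10 + 200 * 0.25 = 60 bad records. Setting ramp to 0 gives a fixed threshold equal to limit, regardless of how many records have been processed.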
Have you worked with packages?
Multistage transform components use packages by default. However, a user can also create their own set of functions in a transform function and include it in other transform functions.