Interview Questions on Aggregator Transformation

Q. What is aggregator transformation?
Ans: The Aggregator transformation allows us to perform aggregate calculations, such as averages and sums. Unlike expression transformation (performs calculations on a row-by-row basis), an Aggregator transformation allows us to perform calculations on groups.
RXVV7UF8ZQ6P
Q. What are the different types of aggregate functions?
Ans: The transformation language includes the following aggregate functions:
  • AVG, COUNT , MAX, MIN, SUM
  • FIRST, LAST
  • MEDIAN, PERCENTILE, STDDEV, VARIANCE
  • Single Level Aggregate Function: MAX(SAL)
  • Nested Aggregate Function: MAX( COUNT( ITEM ))

Q. What is Nested Aggregate Functions?
Ans: In Aggregator transformation, there can be multiple single level functions or multiple nested functions. An Aggregator transformation cannot have both types of functions together.
E.g: MAX( COUNT( ITEM )) is correct.
MIN(MAX( COUNT( ITEM ))) is not correct. It can also include one aggregate function nested within another aggregate function

Q. What is  the Conditional Clauses
Ans: We can use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE.
SUM( COMMISSION, COMMISSION > QUOTA )

Q. Why cannot you use both single level and nested aggregate functions in a single aggregate transformation?
Ans: The nested aggregate function returns only one output row, whereas the single level aggregate function returns more than one row. Since the number of rows returned are not same, you cannot use both single level and nested aggregate functions in the same transformation. If you include both the single level and nested functions in the same aggregator, the designer marks the mapping or mapplet as invalid. So, you need to create separate aggregator transformations.

Q. Up to how many levels, you can nest the aggregate functions?
Ans: We can nest up to two levels only.
Example: MAX( SUM( ITEM ) )

Q. What is aggregate cache?
Ans: The PowerCenter Server stores data in the aggregate cache until it completes aggregate calculations. It stores group values in an index cache and row data in the data cache. If the PowerCenter Server requires more space, it stores overflow values in cache files.

  • Aggregator Index Cache:
    • The index cache holds group information from the group by ports. If we are using Group By on DEPTNO, then this cache stores values 10, 20, 30 etc.
    • All Group By Columns are in AGGREGATOR INDEX CACHE. Eg. DEPTNO
  • Aggregator Data Cache:
    • DATA CACHE is generally larger than the AGGREGATOR INDEX CACHE. The columns in Data Cache:
    • Variable ports if any
    • Non group by input/output ports.
    • Non group by input ports used in non-aggregate output expression

Q. How can we improve performance of aggregate transformation?
Ans:

  • Use sorted input: Sort the data before passing into aggregator. To use sorted input, we must pass data to the Aggregator transformation sorted by group by port, in ascending or descending order. When we use this option, we tell Aggregator that data coming to it is already sorted. We check the Sorted Input Option in Properties Tab of the transformation. If the option is checked but we are not passing sorted data to the transformation, then the session fails.
  • Filter the unwanted data before aggregating: Limit the number of input/output or output ports to reduce the amount of data the aggregator transformation stores in the data cache.

Q. What is incremental aggregation?
Ans: The integration service performs aggregate calculations and then stores the data in historical cache. Next time when you run the session, the integration service reads only new data and uses the historical cache to perform new aggregation calculations incrementally.

Q. Why cannot we use sorted input option for incremental aggregation?
Ans: In incremental aggregation, the aggregate calculations are stored in historical cache on the server. In this historical cache the data need not be in sorted order. If you give sorted input, the records come as presorted for that particular run but in the historical cache the data may not be in the sorted order. That is why this option is not allowed.

Q. How the NULL values are handled in Aggregator?
Ans: You can configure the integration service to treat null values in aggregator functions as NULL or zero. By default the integration service treats null values as NULL in aggregate functions.

3 comments :

> Related Posts with Thumbnails
 

Copyright © 2012. GS dot net - All Rights Reserved - Design by BTDesigner - Proudly powered by Blogger