Monday, August 24, 2015

Oracle Informatica Interview Questions and Answers


1. What is Data warehouse?

  According to Bill Inmon, known as the father of data warehousing, “A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process.”


2.         What are the types of data warehouses?

    There are three types of data warehouses:

    Enterprise Data Warehouse

    ODS (operational data store)

    Data Mart


3.         What is Data mart?

            A data mart is a subset of a data warehouse that is designed for a particular line of business, such as sales, marketing, or finance. In a dependent data mart, data is derived from an enterprise-wide data warehouse. In an independent data mart, data is collected directly from the sources.


4.         What is star schema?

            A star schema is the simplest form of data warehouse schema. It consists of one or more fact tables, each joined directly to a set of dimension tables.
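As a rough illustration of the star layout, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product, dim_date) are made up for this example and do not come from any particular warehouse.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    amount      REAL
);
INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Book');
INSERT INTO dim_date VALUES (10, 2014), (11, 2015);
INSERT INTO fact_sales VALUES (1, 10, 5.0), (1, 11, 7.0), (2, 11, 20.0);
""")

# A typical star join: the fact table joins directly to each dimension.
rows = conn.execute("""
    SELECT p.product_name, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY p.product_name, d.year
    ORDER BY p.product_name, d.year
""").fetchall()
print(rows)
```

Every dimension is one join away from the fact table, which is what keeps star-schema queries simple.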


5.         What is snowflake schema?

            A snowflake schema is a fact table connected to a number of dimension tables, where the dimension tables are themselves normalized into further related tables. Both the snowflake and star schemas are methods of storing data that is multidimensional in nature.


6.         What are ETL Tools?

            ETL stands for Extraction, Transformation, and Loading. ETL tools extract data from the sources, transform it, and load it into the data warehouse for decision making. ETL refers to the methods involved in accessing and manipulating source data and loading it into a target database.


7.         What is a dimension table?

            Dimension tables contain attributes that describe fact records in the fact table.


8.         What is data Modelling? 

    Data modeling is the representation of real-world data structures or entities and their relationships in the form of data models, as required for a database. Data modeling consists of various types:

    Conceptual data modeling

    Logical data modeling

    Physical data modeling

    Enterprise data modeling

    Relational data modeling

    Dimensional data modeling.


9.         What is Surrogate key?

            A surrogate key is a substitute for the natural primary key. It is just a unique identifier or number for each row that can be used as the primary key of the table.
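A surrogate key can be pictured as a counter kept on the warehouse side. The sketch below is only an illustration: the class name, the source system names, and the choice to key on (source system, natural key) are assumptions made for the example.

```python
from itertools import count


class SurrogateKeyGenerator:
    """Hands out one warehouse-side integer key per distinct source key."""

    def __init__(self):
        self._next = count(1)   # simple increasing sequence: 1, 2, 3, ...
        self._keys = {}         # (source_system, natural_key) -> surrogate key

    def key_for(self, source_system, natural_key):
        lookup = (source_system, natural_key)
        if lookup not in self._keys:
            self._keys[lookup] = next(self._next)
        return self._keys[lookup]


gen = SurrogateKeyGenerator()
a = gen.key_for("crm", "C-001")
b = gen.key_for("billing", "C-001")  # same natural key, different system
c = gen.key_for("crm", "C-001")      # seen before, same surrogate key
print(a, b, c)
```

The target table uses the generated integer as its primary key, so it no longer depends on how each source system formats or reuses its own keys.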


10.       What is Data Mining?

            Data mining is the process of analyzing data from different perspectives and summarizing it into useful information.


11.       What is Operational Data Store?

            An ODS (operational data store) comes as a second layer in a data warehouse architecture. It has the characteristics of both OLTP and DSS systems.


12.       What is the Difference between OLTP and OLAP?

            OLTP (Online Transaction Processing) systems hold current transactional data in normalized tables.

OLAP (Online Analytical Processing) systems hold the history of OLTP data, which is non-volatile, and act as a decision support system.


13.       How many types of dimensions are available in Informatica?

    There are three types of dimensions:

    Junk dimension

    Degenerate Dimension

    Conformed Dimension


14.       What is Difference between ER Modeling and Dimensional Modeling?

            ER Modeling is used for normalizing the OLTP database design.

Dimensional modeling is used for de-normalizing the ROLAP / MOLAP design.


15.       What is a mapplet?

            A mapplet is a set of transformations that you build in the Mapplet Designer and can use in multiple mappings.


16.       What are sessions and batches?

            Session: A session is a set of instructions that tells the server how to move data to the target.

Batch: A batch is a set of tasks that may include one or more tasks (sessions, event wait, email, command, etc.).


17.       What are slowly changing dimensions?

    Dimensions that change over time are called Slowly Changing Dimensions (SCD).

    Slowly Changing Dimension Type 1: keeps only current records.

    Slowly Changing Dimension Type 2: keeps current records plus all historical records.

    Slowly Changing Dimension Type 3: keeps current records plus one previous value.
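The three SCD types can be sketched on a single customer row as follows. This is a simplified illustration, not how a PowerCenter mapping is built: the column names (start_date, end_date, is_current, previous_city) are assumptions chosen for the example.

```python
def scd_type1(row, new_city):
    # Type 1: overwrite in place, so no history is kept.
    return [{**row, "city": new_city}]


def scd_type2(row, new_city, change_date):
    # Type 2: close the current row and add a new current row.
    closed = {**row, "end_date": change_date, "is_current": False}
    opened = {**row, "city": new_city, "start_date": change_date,
              "end_date": None, "is_current": True}
    return [closed, opened]


def scd_type3(row, new_city):
    # Type 3: keep only the previous value, in an extra column.
    return [{**row, "previous_city": row["city"], "city": new_city}]


customer = {"customer_id": 7, "city": "Pune", "start_date": "2014-01-01",
            "end_date": None, "is_current": True}

t1 = scd_type1(customer, "Delhi")
t2 = scd_type2(customer, "Delhi", "2015-08-24")
t3 = scd_type3(customer, "Delhi")
print(len(t1), len(t2), len(t3))
```

Only Type 2 adds rows to the dimension. Types 1 and 3 keep one row per customer.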


18.       What are 2 modes of data movement in Informatica Server?

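            Strictly speaking, the two data movement modes of the Informatica Server are ASCII and Unicode. The two target load types, which are often given as the answer to this question, are:

Normal mode, in which a separate DML statement is prepared and executed for every record.

Bulk mode, in which a DML statement is prepared and executed for multiple records at a time, which improves performance.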


19.       What is the difference between Active and Passive transformation?

            Active transformation: An active transformation can change the number of rows that pass through it from source to target, i.e. it can eliminate rows that do not meet the condition in the transformation.

Passive transformation: A passive transformation does not change the number of rows that pass through it, i.e. it passes all rows through the transformation.
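The difference shows up in row counts. Below is a small sketch with two plain Python functions standing in for a passive Expression-style step and an active Filter-style step. The function names and sample rows are made up for the example.

```python
rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 150},
    {"id": 3, "amount": 300},
]


def passive_expression(rows):
    # Passive: every input row produces exactly one output row.
    return [{**r, "double_amount": r["amount"] * 2} for r in rows]


def active_filter(rows, minimum):
    # Active: rows failing the condition are dropped, so the count can change.
    return [r for r in rows if r["amount"] >= minimum]


passed = passive_expression(rows)
kept = active_filter(rows, 100)
print(len(rows), len(passed), len(kept))
```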


20.       What is the difference between connected and unconnected transformation?

            Connected transformation: A connected transformation is connected to other transformations or directly to the target table in the mapping.

Unconnected transformation: An unconnected transformation is not connected to other transformations in the mapping. It is called from within another transformation and returns a value to that transformation.


21.       What are different types of transformations available in Informatica? 

There are various types of transformations available in Informatica :

    Aggregator

    Application Source Qualifier

    Custom

    Expression

    External Procedure

    Filter

    Input

    Joiner

    Lookup

    Normalizer

    Output

    Rank

    Router

    Sequence Generator

    Sorter

    Source Qualifier

    Stored Procedure

    Transaction Control

    Union

    Update Strategy

    XML Generator

    XML Parser

    XML Source Qualifier


22.       What is an Aggregator transformation?

            Aggregator transformation is an Active and Connected transformation. This transformation is useful to perform calculations such as averages and sums (mainly to perform calculations on multiple rows or groups).
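What an Aggregator does can be sketched as a group-by in plain Python. The function and field names are illustrative only.

```python
from collections import defaultdict


def aggregate(rows, group_by, value):
    # Collect the values per group, then emit one summary per group.
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_by]].append(r[value])
    return {k: {"sum": sum(v), "avg": sum(v) / len(v)} for k, v in groups.items()}


sales = [
    {"region": "east", "amount": 10},
    {"region": "east", "amount": 30},
    {"region": "west", "amount": 5},
]
summary = aggregate(sales, "region", "amount")
print(summary)
```

Three input rows become two output rows, one per group, which is why the transformation counts as active.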


23.       What is an Expression transformation?

            Expression transformation is a Passive and Connected transformation. This can be used to calculate values in a single row before writing to the target.


24.       What is a Filter transformation?

            Filter transformation is an Active and Connected transformation. It is used to filter out rows in a mapping that do not meet the condition.


25.       What is a Joiner transformation?

            Joiner Transformation is an Active and Connected transformation. This can be used to join two sources coming from two different locations or from same location.


26.       Why do we use Lookup transformations?

            Lookup Transformations can access data from relational tables that are not sources in mapping.


27.       What is a Normalizer transformation?

            Normalizer Transformation is an Active and Connected transformation. It is used mainly with COBOL sources where most of the time data is stored in denormalized format. Also, Normalizer transformation can be used to create multiple rows from a single row of data.
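The "multiple rows from a single row" behaviour can be sketched like this. The store and quarter columns are invented for the example.

```python
def normalize(row, repeating_columns, key="store"):
    # Emit one output row per repeating column of the input row.
    return [
        {key: row[key], "quarter": i, "sales": row[col]}
        for i, col in enumerate(repeating_columns, start=1)
    ]


wide = {"store": "S1", "q1": 100, "q2": 120, "q3": 90, "q4": 150}
tall = normalize(wide, ["q1", "q2", "q3", "q4"])
for r in tall:
    print(r)
```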


28.       What is a Rank transformation?

            Rank transformation is an Active and Connected transformation. It is used to select the top or bottom rank of data.


29.       What is a Router transformation?

            Router transformation is an Active and Connected transformation. It is similar to the Filter transformation. The only difference is that the Filter transformation drops the data that does not meet the condition, whereas the Router has an option to capture the data that does not meet the condition. It is useful for testing multiple conditions.
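A Router-like step can be sketched as a function that tests each row against several named conditions and sends unmatched rows to a default group. The group names and conditions below are hypothetical.

```python
def route(rows, groups):
    # Test each row against every group condition.
    # Rows that match no condition go to the DEFAULT group.
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for r in rows:
        matched = False
        for name, condition in groups.items():
            if condition(r):
                out[name].append(r)
                matched = True
        if not matched:
            out["DEFAULT"].append(r)
    return out


orders = [{"id": 1, "amount": 50}, {"id": 2, "amount": 500}, {"id": 3, "amount": -5}]
routed = route(orders, {
    "SMALL": lambda r: 0 <= r["amount"] < 100,
    "LARGE": lambda r: r["amount"] >= 100,
})
print({name: [r["id"] for r in rs] for name, rs in routed.items()})
```

A Filter would simply lose order 3. Here it lands in the default group and can still be written somewhere.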


30.       What is a Sorter transformation?

            Sorter transformation is an Active and Connected transformation. It sorts data in either ascending or descending order according to a specified field.


31.       Name four output files that the Informatica server creates during a session run?

    Session log

    Workflow log

    Error log

    Bad file (reject file)


32.       Why do we use the Stored Procedure transformation?

            A Stored Procedure transformation is an important tool for populating and maintaining databases.


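33.       What is the difference between static cache and dynamic cache?

            Static cache: the lookup cache is built once and is not changed while the session runs, so a key that is missing from the cache is treated as new every time it arrives and the row is inserted as many times as it comes.

Dynamic cache: the cache is updated as rows are inserted into or updated in the target, so later rows see those changes. This extra maintenance makes a dynamic cache slower than a static cache.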
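The effect of the two cache types on duplicate source rows can be sketched as below. This is a toy model of the idea, not Informatica's cache implementation. The function and variable names are made up.

```python
def load(source_keys, target_keys, dynamic):
    cache = set(target_keys)      # cache built once from the lookup table
    inserted = []
    for key in source_keys:
        if key not in cache:
            inserted.append(key)  # row is flagged for insert
            if dynamic:
                cache.add(key)    # dynamic cache is updated as rows are inserted
    return inserted


src = ["A", "B", "B", "C"]
static_result = load(src, ["A"], dynamic=False)
dynamic_result = load(src, ["A"], dynamic=True)
print(static_result, dynamic_result)
```

With the static cache the duplicate "B" is inserted twice. With the dynamic cache the second "B" is found in the cache and skipped.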


34.       Define mapping and session?

            Mapping: It is a set of source and target definitions linked by transformation objects that define the rules for transformation.

Session: It is a set of instructions that describe how and when to move data from sources to targets.


35.       Which command is used to run a batch?

            pmcmd is used to start a batch.


36.       What is Data Driven?

            With the Data Driven option, the Informatica server follows the instructions coded into the Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject.


37.       What is the PowerCenter repository?

            The PowerCenter repository allows you to share metadata across repositories to create a data mart domain.


38.       What is a parameter file?

            A parameter file is a file created with a text editor such as WordPad or Notepad. You can define the following values in a parameter file:

Mapping parameters

Mapping variables

Session parameters


39.       What are the types of lookup caches?

    Static cache

    Dynamic cache

    Persistent cache

    Shared cache

    Recache


40.       What is a Stored Procedure transformation?

            A Stored Procedure transformation is a Passive transformation that can be Connected or Unconnected. It is useful for automating time-consuming tasks, and it is also used for error handling, for dropping and recreating indexes, for determining the space available in a database, and for specialized calculations.


41.       What is a fact table?

    The central table in a star schema is called the fact table. The facts (measures) it holds are of three types:

    additive

    non-additive

    semi additive
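The difference between additive and semi-additive facts can be sketched with a small bank example. The data is invented: deposits are treated as additive, while balances are semi-additive because they can be summed across accounts but not across days. A non-additive fact, such as a ratio or percentage, cannot be meaningfully summed along any dimension.

```python
rows = [
    {"day": 1, "account": "A", "deposit": 100, "balance": 100},
    {"day": 1, "account": "B", "deposit": 50, "balance": 50},
    {"day": 2, "account": "A", "deposit": 20, "balance": 120},
    {"day": 2, "account": "B", "deposit": 0, "balance": 50},
]

# Additive: deposits can be summed across every dimension.
total_deposits = sum(r["deposit"] for r in rows)

# Semi-additive: balances can be summed across accounts for a single day...
balance_day2 = sum(r["balance"] for r in rows if r["day"] == 2)

# ...but summing them across days gives a number with no business meaning.
balance_summed_over_time = sum(r["balance"] for r in rows)

print(total_deposits, balance_day2, balance_summed_over_time)
```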


42.       What is Data warehouse?

            According to Bill Inmon, known as the father of data warehousing, “A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process.”


43.       What is Data Transformation Manager(DTM)?

            After the load manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run.


44.       How can you define a transformation?

            A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions.


45.       What is a Lookup transformation?

            A Lookup transformation is Passive and can be either Connected or Unconnected. It is used to look up data in a relational table, view, or synonym. The lookup definition can be imported from either source or target tables.


46.       What is a Source Qualifier transformation?

            Source Qualifier transformation is an Active and Connected transformation. When you add a relational or flat file source definition to a mapping, you must connect it to a Source Qualifier transformation. The Source Qualifier performs tasks such as overriding the default SQL query, filtering records, and joining data from two or more tables.


47.       What is the difference between a mapplet and a reusable transformation?

            A mapplet consists of a set of transformations that is reusable.

A reusable transformation is a single transformation that can be reused.


48.       What is an Update Strategy transformation?

            Update Strategy transformation is an Active and Connected transformation. It is used to update data in the target table, either to maintain a history of the data or to keep only recent changes. You can specify how to treat source rows: insert, update, delete, or data driven.
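The row-flagging idea can be sketched in Python as follows. The numeric values mirror Informatica's DD_INSERT, DD_UPDATE, DD_DELETE and DD_REJECT constants (0 to 3). The decision rules and the row layout are assumptions invented for this example. In a real mapping the logic is written as an expression inside the Update Strategy transformation.

```python
# Values mirror the Informatica update strategy constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(row, existing_ids):
    # Hypothetical rules: no id -> reject, marked deleted -> delete,
    # already in target -> update, otherwise -> insert.
    if row.get("id") is None:
        return DD_REJECT
    if row.get("deleted"):
        return DD_DELETE
    if row["id"] in existing_ids:
        return DD_UPDATE
    return DD_INSERT


existing = {1, 2}
incoming = [{"id": 1}, {"id": 3}, {"id": 2, "deleted": True}, {"id": None}]
flags = [flag_row(r, existing) for r in incoming]
print(flags)
```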


49.       How many types of schemas are used in dimensional modeling?

    There are three types of schemas.

    Star schema

    Snowflake schema

    Galaxy schema


50.       What is the difference between a mapplet and a reusable transformation?

            Mapplet: one or more transformations.

A set of transformations that is reusable.

Reusable transformation: only one transformation.

A single transformation that is reusable.


51.       What are different types of parsing?

    Quick parsing

    Thorough parsing


52.       What are Lookup and Fact Tables?

            A lookup (Dimension) table contains information about the entities. In general the Dimension and details objects are derived from lookup tables. A fact table contains the statistical information about transactions.


53.       What is Designer?

            Designer is the BusinessObjects product that is used to develop universes. A universe is the semantic layer over the database structure that isolates users from technical issues.


54.       What is Surrogate Key?

            Surrogate keys are keys that are maintained within the data warehouse instead of keys taken from source data systems.


55.       What are the pitfalls of DWH?

    Limited value of data (Historical data not current data)

    DW solutions complicate business processes

    DW solutions may have too long a learning curve

    Costs of cleaning, capturing and delivering data


56.       How do you handle large datasets?

            By using bulk utility mode at the session level and, if possible, by disabling constraints after consulting with the DBA. Using bulk utility mode means that nothing is written to the rollback segment, so loading is faster. The pitfall, however, is that recovery is not possible.


57.       What are the limitations of handling long datatypes?

            When the length of a datatype (e.g. varchar2(4000)) goes beyond 4000, Informatica treats it as varchar2(2000).


58.       What are the types of OLAP?

            ROLAP (Relational OLAP) - Users see their data organized in cubes and dimensions, but the data is really stored in an RDBMS. It is a storage mode that uses tables in a relational database to store multidimensional structures. Performance is slow.

MOLAP (Multidimensional OLAP) - Users see their data organized in cubes and dimensions, but the data is really stored in an MDBMS. Query performance is fast.

HOLAP (Hybrid OLAP) - It is a combination of ROLAP and MOLAP, e.g. HOLOs. It supports queries on aggregated data as well as detailed data.


59.       What is the difference between data mart and data warehouse?

            A data mart is used at the business division/department level, whereas a data warehouse is used at the enterprise level.


60.       What is metadata?

            Metadata is data about the data. It contains the location and description of data warehouse system components, such as names, definitions, and end-user views.


61.       How does the recovery mode work in informatica?

            In case of load failure an entry is made in OPB_SERV_ENTRY(?) table from where the extent of loading can be determined.


62.       What is Aggregate Awareness?

            Aggregate awareness is a feature of DESIGNER that makes use of aggregate tables in a database. These are tables that contain pre-calculated data. The purpose of these tables is to enhance the performance of SQL transactions; they are thus used to speed up the execution of queries.



63.       What is the difference between OLTP and OLAP?

    OLTP

    It focuses on day-to-day transactions.

    Data stability: dynamic.

    Highly normalized.

    Access frequency: high.

    OLAP

    It focuses on future predictions and decisions.

    Data stability: static until refreshed.

    Denormalized and replicated data.

    Access frequency: medium to low.


64.       When should you use a star schema and when a snowflake schema?

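            A star schema is the simplest data warehouse schema. A snowflake schema is similar to the star schema, but it normalizes the dimension tables to save storage space, and it can be used to represent hierarchies of information. Use a star schema when simple queries matter most, and a snowflake schema when saving dimension storage or modeling hierarchies matters more.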


65.       What parameters can be tweaked to get better performance from a session?

            DTM shared memory, index cache memory, data cache memory, indexing, using a persistent cache, increasing the commit interval, etc.


66.       What are the benefits of DWH?

    Immediate information delivery

    Data Integration from across, even outside the organization

    Future vision of historical trends

    Tools for looking at data in new ways

    Enhanced customer service.


67.       Is it possible to invoke an Informatica batch or session outside the Informatica UI?

            Yes, by using pmcmd.


68.       Why do we go for surrogate keys?

    Data tables in various source systems may use different keys for the same entity.

    Keys may change or be reused in the source data systems.

    Changes in organizational structures may move keys in the hierarchy.


69.       When is it more convenient to join in the database or in Informatica?

    Definitely at the database level, in the Source Qualifier query itself, rather than with a Joiner transformation.


70.       How do you measure session performance?

            By checking the Collect Performance Data check box.


71.       What is Dimension Table?

    It contains data used to reference data stored in the fact table.

    Fewer rows

    Primarily character data

    One primary key (dimensional key)

    Updatable data


72.       What is a database connection?

            A connection is a set of parameters that provides access to an RDBMS. These parameters include system information such as the data account, user identification, and the path to the database. Designer provides three types of connections: secured, shared, and personal.


73.       What are all the types of dimensions?

  Informational Dimension

    Structural Dimension

    Categorical Dimension

    Partitioning Dimension
