1. What is Data warehouse?
According to Bill Inmon, known as father of Data
warehousing. “A Data warehouse is a subject oriented, integrated ,time variant,
non volatile collection of data in support of management’s decision making
process”.
2. what
are the types of data warehouses?
There are three types of data
warehouses
Enterprise Data Warehouse
ODS (operational data store)
Data Mart
3. What
is Data mart?
A data mart is a subset of data warehouse that is designed for a particular
line of business, such as sales, marketing, or finance. In a dependent data
mart, data can be derived from an enterprise wide data warehouse. In an
independent data mart can be collected directly from sources.
4. What
is star schema?
A star schema is the simplest form of data warehouse schema that consists of
one or more dimensional and fact tables.
5. What
is snow flake schema?
A Snowflake schema is nothing but one Fact table which is connected to a number
of dimension tables, The snowflake and star schema are methods of storing data
which are multidimensional in nature.
6. What
are ETL Tools?
ETL Tools are stands for Extraction, Transformation, and Loading the data into
the data warehouse for decision making. ETL refers to the methods involved in
accessing and manipulating source data and loading it into target database.
7. What
are Dimensional table?
Dimension tables contain attributes that describe fact records in the fact
table.
8. What
is data Modelling?
Data Modeling is representing the real
world set of data structures or entities and their relationship in their of
data models, required for a database.Data Modelling consists of various types
like :
Conceptual data modeling
Logical data modeling
Physical data modeling
Enterprise data modeling
Relation data modeling
Dimensional data modeling.
9. What
is Surrogate key?
Surrogate key is a substitution for the natural primary key. It is just a unique
identifier or number of each row that can be used for the primary key to the
table.
10. What is Data
Mining?
A Data Mining is the process of analyzing data from different perpectives and
summarizing it into useful information
11. What is
Operational Data Store?
A ODS is an operational data store which comes as a second layer in a
datawarehouse architecture. It has got the characteristics of both OLTP and DSS
systems.
12. What is the
Difference between OLTP and OLAP?
OLTP is nothing but OnLine Transaction Processing which contains a normalised
tables .
But OLAP(Online Analtical Programming) contains the
history of OLTP data which is non-volatile acts as a Decisions Support System.
13. How many types of
dimensions are available in Informatica?
There are three types of dimensions
available are :
Junk dimension
Degenerative Dimension
Conformed Dimension
14. What is
Difference between ER Modeling and Dimensional Modeling?
ER Modeling is used for normalizing the OLTP database design.
Dimesional modeling is used for de-normalizing the ROLAP
/ MOLAP design.
15. What is the
maplet?
Maplet is a set of transformations that you build in the maplet designer and
you can use in multiple mapings.
16. What is Session
and Batches?
Session: A session is a set of commands that describes the server to move data
to the target.
Batch: A Batch is set of tasks that may include one or
more numbar of tasks (sessions, ewent wait, email, command, etc).
17. What are slowly
changing dimensions?
Dimensions that change overtime are
called Slowly Changing Dimensions(SCD).
Slowly Changing Dimension-Type1 :
Which has only current records.
Slowly Changing Dimension-Type2 :
Which has current records + historical records.
Slowly Changing Dimension-Type3 :
Which has current records + one previous records.
18. What are 2 modes
of data movement in Informatica Server?
There are two modes of data movement are:
Normal Mode in which for every record a separate DML stmt
will be prepared and executed.
Bulk Mode in which for multiple records DML stmt will be
preapred and executed thus improves performance.
19. What is the
difference between Active and Passive transformation?
Active Transformation:An active transformation can change the number of rows
that pass through it from source to target i.e it eliminates rows that do not
meet the condition in transformation.
Passive Transformation:A passive transformation does not
change the number of rows that pass through it i.e it passes all rows through
the transformation.
20. What is the
difference between connected and unconnected transformation?
Connected Transformation:Connected transformation is connected to other
transformations or directly to target table in the mapping.
UnConnected Transformation:An unconnected transformation
is not connected to other transformations in the mapping. It is called within
another transformation, and returns a value to that transformation.
21. What are
different types of transformations available in Informatica?
There are various types of transformations available in
Informatica :
Aggregator
Application Source Qualifier
Custom
Expression
External Procedure
Filter
Input
Joiner
Lookup
Normalizer
Output
Rank
Router
Sequence Generator
Sorter
Source Qualifier
Stored Procedure
Transaction Control
Union
Update Strategy
XML Generator
XML Parser
XML Source Qualifier
22. What are
Aggregator Transformation?
Aggregator transformation is an Active and Connected transformation. This
transformation is useful to perform calculations such as averages and sums (mainly
to perform calculations on multiple rows or groups).
23. What are
Expression transformation?
Expression transformation is a Passive and Connected transformation. This can
be used to calculate values in a single row before writing to the target.
24. What are Filter
transformation?
Filter transformation is an Active and Connected transformation. This can be
used to filter rows in a mapping that do not meet the condition.
25. What are Joiner
transformation?
Joiner Transformation is an Active and Connected transformation. This can be
used to join two sources coming from two different locations or from same
location.
26. Why we use lookup
transformations?
Lookup Transformations can access data from relational tables that are not
sources in mapping.
27. What are
Normalizer transformation?
Normalizer Transformation is an Active and Connected transformation. It is used
mainly with COBOL sources where most of the time data is stored in denormalized
format. Also, Normalizer transformation can be used to create multiple rows
from a single row of data.
28. What are Rank
transformation?
Rank transformation is an Active and Connected transformation. It is used to
select the top or bottom rank of data.
29. What are Router
transformation?
Router transformationis an Active and Connected transformation. It is similar
to filter transformation. The only difference is, filter transformation drops
the data that do not meet the condition whereas router has an option to capture
the data that do not meet the condition. It is useful to test multiple
conditions.
30. What are Sorter
transformation?
Sorter transformation is a Connected and an Active transformation. It allows to
sort data either in ascending or descending order according to a specified
field.
31. Name four output
files that informations server creates during session running?
Session Log
Workflow Log
Errors Log
Badfile
32. Why we use stored
procedure transformation?
A stored procedure transformation is an important tool for populating and
maintaing databases.
33. What are the
difference between static cache and dynamic cache?
Dynamic cache decreases the performance in comparision to static cache.
Static cache do not see such things just insert data as
many times as it is coming
34. Define maping and
sessions?
Maping: It is a set of source and target definitions linked by transformation
objects that define the rules for transformation.
Session : It is a set of instructions that describe how
and when to move data from source to targets.
35. What is a command
that used to run a batch?
pmcmd is used to start a batch.
36. What is
Datadriven?
The informatica server follows instructions coded into update strategy
transformations with in the session maping determine how to flag records for
insert, update, delete or reject.
37. What is power center
repository?
The PowerCenter repository allows you to share metadata across repositories to
create a data mart domain.
38. What is parameter
file?
A parameter file is a file created by text editor such as word pad or notepad.
U can define the following values in parameter file.
Maping parameters
Maping variables
session parameters.
39. What are the
types of lookup caches?
Static cache
Dynamic cache
Persistent cache
Shared cache
Recache
40. What are Stored
Procedure transformation?
Stored Procedure transformation is an Passive & Connected or
UnConnected transformation. It is useful to automate time-consuming tasks and
it is also used in error handling, to drop and recreate indexes and to
determine the space in database, a specialized calculation.
41. What is fact
table?
The centralized table in a star schema
is called as fact table. Fact tables are three types
additive
non-additive
semi additive
42. What is Data
warehouse?
According to Bill Inmon, known as father of Data warehousing. “A Data warehouse
is a subject oriented, integrated ,time variant, non volatile collection of
data in support of management’s decision making process”.
43. What is Data
Transformation Manager(DTM)?
After the load manager performs validations for the session, it creates the DTM
process. The DTM process is the second process associated with the session run.
44. How can you
define a transformation?
A transformation is a repository object that generates, modifies, or passes
data. The Designer provides a set of transformations that perform specific
functions.
45. What are Lookup
transformation?
Lookup transformation is Passive and it can be both Connected and UnConnected
as well. It is used to look up data in a relational table, view, or synonym.
Lookup definition can be imported either from source or from target tables.
46. What are Source
Qualifier transformation?
Source Qualifier transformation is an Active and Connected transformation. When
adding a relational or a flat file source definition to a mapping, it is must
to connect it to a Source Qualifier transformation. The Source Qualifier
performs the various tasks such as overriding default SQL query, filtering
records; join data from two or more tables etc.
47. What is
difference between maplet and reusable transformation?
Maplet consists of set of transformations that is reusable.
A reusable transformation is a single transformation that
can be reusable.
48. What are Update
Strategy transformation?
Update strategy transformation is an active and connected transformation. It is
used to update data in target table, either to maintain history of data or
recent changes. You can specify how to treat source rows in table, insert,
update, delete or data driven.
49. How many types of
dimensions are available in informatica?
There are three types of dimensions.
Star schema
Snowflake schema
Glaxy schema
50. What is
difference between maplet and reusable transformation?
Maplet : one or more transformations.
set of transformations that are reusable.
Reusable transformation: only one transformation.
Single transformation which is reusable.
51. What are
different types of parsing?
Quick parsing
Thorough parsing
52. What are Lookup
and Fact Tables?
A lookup (Dimension) table contains information about the entities. In general
the Dimension and details objects are derived from lookup tables. A fact table
contains the statistical information about transactions.
53. What is Designer?
Designer is the Business objects product that is intended to develop the
universes. These universe is the semantic - layer of the database structure
that isolates from technical issues.
54. What is Surrogate
Key?
Surrogate keys are keys that are maintained within the data warehouse instead
of keys taken from source data systems.
55. What are the
pitfalls of DWH?
Limited value of data (Historical data
not current data)
DW solutions complicate business
processes
DW solutions may have too long a
learning curve
Costs of cleaning, capturing and
delivering data
56. How do you handle
large datasets?
By Using Bulk utility mode at the session level and if possible by disabling
constraints after consulting with DBA; Using Bulk utility mode would mean that
no writing is taking place in Roll Back Segment so loading is faster. However
the pitfall is that recovery is not possible.
57. What are the
limitations of handling long datatypes?
When the length of a datatype (e.g varchar2(4000)) goes beyond 4000,
Informatica makes this as varchar2(2000).
58. What are the
types of OLAP?
ROLAP (Relational OLAP) - Users see their data organized in cubes and
dimensions but the data is really stored in RDBMS. The performance is slow. A
storage mode that uses tables in a relational database to store
multidimensional structures.
MOLAP (Multidimensional OLAP) - Users see their data
organized in cubes and dimensions but the data is really stored in MDBMS. Query
performance is fast.
HOLAP (Hybrid OLAP) - It is a combination of ROLAP and
HOLAP. EG: HOLOs. In this one will find data queries on aggregated data as well
as detailed data.
59. What is the
difference between data mart and data warehouse?
Data mart used on a business division/department level where as data warehouse
is used on enterprise level.
60. What is Meta
data?
Data about the data, contains the location and description of data warehouse
system components such as name, definitions and end user views.
61. How does the
recovery mode work in informatica?
In case of load failure an entry is made in OPB_SERV_ENTRY(?) table from where
the extent of loading can be determined.
62. What is Aggregate
Awareness?
Aggregate awareness is a feature of DESIGNER that makes use of aggregate tables
in a database. These are tables that contain pre-calculated data. The purpose
of these tables is to enhance the performance of SQL transactions; they are
thus used to speed up the execution of queries.
63. What is a
difference between OLTP and OLAP?
OLTP
It focus on day to day transaction.
Data Stability
Dynamic
Highly normalized.
Access Frequency High.
OLAP
It focus on future predictions and
decisions
Static until refreshed
Demoralized and replicated data
Medium to low.
64. When should you
use a star schema and when a snowflake schema?
A star schema is a simplest data warehouse schema. Snowflake schema is similar
to the star schema. It normalizes dimension table to save data storage space.
It can be used to represent hierarchies of information.
65. What parameters can
be tweaked to get better performance from a session?
DTM shared memory, Index cache memory, Data cache memory, by indexing, using
persistent cache, increasing commit interval etc.
66. What are the
benefits of DWH?
Immediate information delivery
Data Integration from across, even
outside the organization
Future vision of historical trends
Tools for looking at data in new ways
Enhanced customer service.
67. Is It Possible to
invoke Informatica batch or session outside Informatica UI?
PMCMD.
68. Why we are going
for surrogate keys?
Data tables in various source systems
may use different keys for the same entity.
Keys may change or be reused in the
source data systems.
Changes in organizational structures
may move keys in the hierarchy.
69. When is more
convenient to join in the database or in
Informatica?
Definitely at the database level
at the source Qualifier query itself
rather than using Joiner
transformation
70. How do you
measure session performance?
By checking Collect performance Data check box.
71. What is Dimension
Table?
It contains data used to reference
data stored in the fact table.
Fewer rows
Primarily character data
One primary key (dimensional key)
Updatable data
72. What is a
database connection?
A connection is a set of parameters that provides access to an RDBMS. These
parameters include system information such as the data account, user
identification, and the path to the database. Designer provides three types of
connections: secured, shared, and personal.
73. What are all the
types of dimensions?
Informational Dimension
Structural Dimension
Categorical Dimension
Partitioning Dimension
No comments:
Post a Comment