Thursday, April 21, 2011
ETL Process
ETL process
ETL (Extract, Transform and Load) is a process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse. ETL involves the following tasks:
- extracting the data from source systems (SAP, ERP, other oprational systems), data from different source systems is converted into one consolidated data warehouse format which is ready for transformation processing.
# - transforming the data may involve the following tasks: applying business rules (so-called derivations, e.g., calculating new measures and dimensions),
# cleaning (e.g., mapping NULL to 0 or "Male" to "M" and "Female" to "F" etc.),
# filtering (e.g., selecting only certain columns to load),
# splitting a column into multiple columns and vice versa,
# joining together data from multiple sources (e.g., lookup, merge),
# transposing rows and columns,
# applying any kind of simple or complex data validation (e.g., if the first 3 columns in a row are empty then reject the row from processing)
- loading the data into a data warehouse or data repository other reporting applications
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment