This post will explain the following:
- Slowly changing dimensions (SCD0, SCD1, SCD2 and SCD3)
- Inferred Members
- How to configure an SCD component in a data flow using SSIS 2012
Slowly changing dimensions are a data warehouse modeling technique that lets you manage how your ETL process resolves updates and inserts into a dimension, so that you can tell your packages what to do when an attribute value changes.
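To make the difference between overwriting a value and keeping its history more concrete, here is a minimal Python sketch of SCD1 and SCD2 handling for a single attribute. It is only an illustration, not the way the SSIS component works internally: the `DimRow` class, the `city` attribute, the validity columns and both functions are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DimRow:
    # Hypothetical dimension row; column names are assumptions for this sketch.
    surrogate_key: int
    business_key: str
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the row is still open
    is_current: bool = True


def apply_scd1(rows: List[DimRow], business_key: str, new_city: str) -> None:
    """SCD1: overwrite the attribute in place, so no history is kept."""
    for row in rows:
        if row.business_key == business_key and row.is_current:
            row.city = new_city


def apply_scd2(rows: List[DimRow], business_key: str, new_city: str,
               change_date: date) -> None:
    """SCD2: expire the current row and append a new version of the member."""
    current = next(
        (r for r in rows if r.business_key == business_key and r.is_current),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return  # nothing changed, keep the current version
        current.valid_to = change_date
        current.is_current = False
    # New surrogate key for the new version (or for a brand new member).
    next_key = max((r.surrogate_key for r in rows), default=0) + 1
    rows.append(DimRow(next_key, business_key, new_city, change_date))
```

With SCD1 the dimension always holds one row per business key, while SCD2 adds one row per change and marks only the latest one as current.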
Last weekend I had an interesting challenge: creating a staging table from 13 Excel files that share the same columns and differ only in their data, according to the year they refer to. My friend's Excel files hold soccer statistics since 2000, one file for each season.
He challenged me to create a Business Intelligence solution in just 2 hours, so that he could analyze a portion of the data and decide whether to invest in a full BI solution. I came up with this solution:
1. Create a linked server for each Excel file
2. Create a query that selects the data from each linked server, unions all the records fetched and inserts them into a new table
3. Create five dimensions based on the staging table data
4. Create one fact table
5. Create all packages for dimensions and facts
6. Create a cube
7. Show the data in Excel 2013
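Step 2 is the heart of the staging load, so here is a small Python sketch of the same idea. It does not use linked servers or SQL Server: in-memory SQLite tables stand in for the per-season Excel sources, and the table names (`season_2000`, `stg_matches`), the columns and the `build_staging` helper are all hypothetical, chosen only for this illustration.

```python
import sqlite3


def build_staging(conn, source_tables):
    """Union all per-season source tables into one staging table."""
    # One SELECT per source; every source is assumed to share the same columns.
    select_parts = [
        f'SELECT season, team, goals FROM "{name}"' for name in source_tables
    ]
    union_query = " UNION ALL ".join(select_parts)
    conn.execute("DROP TABLE IF EXISTS stg_matches")
    conn.execute(f"CREATE TABLE stg_matches AS {union_query}")
    return conn.execute("SELECT COUNT(*) FROM stg_matches").fetchone()[0]


# Stand-ins for the Excel files: one small table per season.
conn = sqlite3.connect(":memory:")
for season in (2000, 2001):
    table = f"season_{season}"
    conn.execute(
        f'CREATE TABLE "{table}" (season INTEGER, team TEXT, goals INTEGER)'
    )
    conn.executemany(
        f'INSERT INTO "{table}" VALUES (?, ?, ?)',
        [(season, "Team A", 10), (season, "Team B", 7)],
    )

row_count = build_staging(conn, ["season_2000", "season_2001"])
```

`UNION ALL` is used rather than `UNION` so that rows are not deduplicated across seasons, which matches the "union all records" wording of step 2.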
Business Intelligence is a very trendy term these days: managers want it, IT companies are selling it, and users love the colors of the charts and go crazy over the ability to drill down and roll up through hierarchies. However, most of them don't know what Business Intelligence really is. They often refer to it as building a data warehouse where all company data is integrated with ETL tools like SQL Server Integration Services and analyzed with analytical tools such as SQL Server Analysis Services or even Excel. I am sorry, guys: if you think BI is only about this, you are either using it or building it wrong. Start by thinking about this: although email is a great technology, if you need to talk to a colleague seated right next to you, will you send him an email? What is the core of your problem here, the communication channel or the message being sent?
While working with SSIS 2012, you might want to configure connections to other platforms, like SQL Server, Oracle DB or even Access or Excel, inside your packages' data flows. SSIS uses what Microsoft calls Connection Managers to configure these connections, and it's easy to think of several scenarios in which your source or destination needs the Excel Connection Manager to select or insert data.
The problem this post wants to clarify applies only to 64-bit environments, with which the Excel Connection Manager isn't compatible. This can be a problem in integration projects, and I speak for myself when I say that several of them have an initial loading stage in which we get data from several Excel files and integrate it into a SQL Server database, for instance.