Staging table from several Excel files (SQL Server 2012)

The last weekend I had an interesting challenge, creating a staging table from 13 excel files that have the same columns, only its data changed, according to the year they were referring to. This way, since 2000, my friends excel files, had data from soccer statistics, one for each season.

He challenged me to create a Business Intelligence solution in just 2 hours so that he could analyze some portion of the data in order to decide if he would invest in a full BI solution or not. This way I came up with this solution:

1.Create a linked server for each excel file

2.Create a query that would select the data from each linked server and then union all records fetched and insert them into a new table.

3.Create five dimensions based of the staging table data

4.Create one fact table

5.Create all packages for dimensions and facts

6.Create a cube

7.Show the data in excel 2013

The full solution took me about 2,5 hours to make however I think I was successful convincing my friend about the benefits of using Business Intelligence. What I want to show you with this post is how to create that staging table from the excel files, which means step 1 and 2 from my solution, described above.

So, step one, is creating the linked server. To do so you can run the following SQL command if you are using Excel 2013 (The provider I used is for Excel 2007+). I ran it 13 times, one for each file I had, just changing the path to the data source (datascr) and the name of the linked server (server).


EXEC sp_addlinkedserver

@server = 'NAME FOR LINKED SERVER',

@srvproduct = 'Excel',

@provider = 'Microsoft.ACE.OLEDB.12.0',

@datasrc = 'C:\BI_DS\file_name_1.xls',

@provstr = 'Excel 12.0;IMEX=1;HDR=YES;'

Then you need to create the query, which looks like the following, in which I created a select query for each of them and then union between all records fetched. Some of the columns didn’t exist in some excel files, so I needed to treat those special cases. The final select, inserts the data into a new SQL table. The result was, in just 15 minutes I had the data from all excel sheets, integrated in a single staging table.

select * into STG_FOOT_STAT from (

select '2000/2001' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,HO,AO,HY,AY,HR,AR from [SRC20002001]...[E0$]

union

select '2001/2002' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,HO,AO,HY,AY,HR,AR from [SRC20012002]...[E0$]

union

select '2002/2003' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20022003]...[E0$]

union

select '2003/2004' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20032004]...[E0$]

union

select '2004/2005' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20042005]...[E0$]

union

select '2005/2006' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20052006]...[E0$]

union

select '2006/2007' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20062007]...[E0$]

union

select '2007/2008' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20072008]...[E0$]

union

select '2008/2009' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20082009]...[E0$]

union

select '2009/2010' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20092010]...[E0$]

union

select '2010/2011' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20102011]...[E0$]

union

select '2011/2012' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20112012]...[E0$]

union

select '2012/2013' Season,[Date],HomeTeam,AwayTeam,FTHG,FTAG,FTR,'N/A' as Referee,HS,[AS],HC,AC,HF,0 as HO,0 as AO,HY,AY,HR,AR from [SRC20122013]...[E0$]

) x

As you can see this is a very simple solution for integrating several files into one table in SQL Server.

Thank you,

Rui Machado

Anúncios

Deixe uma Resposta

Preencha os seus detalhes abaixo ou clique num ícone para iniciar sessão:

Logótipo da WordPress.com

Está a comentar usando a sua conta WordPress.com Terminar Sessão / Alterar )

Imagem do Twitter

Está a comentar usando a sua conta Twitter Terminar Sessão / Alterar )

Facebook photo

Está a comentar usando a sua conta Facebook Terminar Sessão / Alterar )

Google+ photo

Está a comentar usando a sua conta Google+ Terminar Sessão / Alterar )

Connecting to %s