Dynamically filtering data before importing from SQL Server in Excel
I'm trying to figure out whether there is a way to filter the data from an SQL Server connection before it is imported into Excel. The initial import doesn't matter (when the connection is set up), but I need Excel to have little to no data when opened subsequently and import the relevant data only after a parameter is chosen. The SQL tables could have a lot of data and performance is an issue.
So if I import the data into a table, apply a filter for, say, Country to be UK, close the file and open it again next week, ideally it should only query for and import the data for UK. And when I choose another country it should import the data for that country alone. I need the choices to be enumerated from the table.
Now, I don't believe tables have that capability. But how about Slicers, PivotTables, PowerView or PowerPivot?
Appreciate any ideas.
You can absolutely do this with Tables, as well as Powerpivot and Data Explorer. The easiest, and my preferred method currently, is to create a parameterized query in Excel. To do this use the Query Wizard and at the end add constraints for each parameter you want to pass. Once this is done, you can edit the Connection Properties and assign the Parameter to be returned as a static value, dialog box or a spreadsheet cell.
- Select your data source using MS Query-this is required to be able to pass parameters back to SQL;
Data>Other Sources>Microsoft Query
. - Choose
Data Source
, select your SQL data source here. I'll generally leave all of these options blank until I get to theFinish
window, then I'll selectView data or edit query in MS Query.
- Note: I typically pre-process the info I want from SQL into a single view that I can connect to here, but you can generally pass any select statement. I haven't tried calling a Stored Procedure, so I'm not sure how that would work.
- In the MS Query Window, click the
Show/Hide Criteria button
and choose a Criteria field. This will be what calls your parameters to pass back to SQL.- Example: If you have a date field and select it in the Criteria field, then add a Value of Between [start] and [end]. This will create a parameter and MS Query will ask you to fill both values and immediately return the appropriate data set. If you check the SQL statement it has created, it will have added a WHERE(datasource.date Between ? and ?) clause.
- Select File>Return Data to Excel, then Choose Table.
- To have the table pull your query parameter automatically, select a cell in your table, the go to Data>Properties>Connection Properties>Definition>Parameters.
- For each parameter, you can choose for it to
- Prompt you to provide a value; or
- Use a specific value; or
- Get a value from a spreadsheet cell, with the option to automatically update your table data whenever that cell value changes.
- As an additional option, you can use Data Validation on the cell you choose for your parameter, allowing you to limit the possible values to choose from. You could even have the validation cells come from a table that is fed from values from your database, ensuring that only values held in the database could be selected.
- Set your paramater value for each parameter and click OK through all of the dialog boxes.
- Finally, to minimize the file size, in the Connection Properties>Usage tab, you can select Refresh Data when opening file and Remove data from the external range before saving the workbook. This will dump the connected data when the file isn't actively in use, minimizing file size.
I'll often create a Pivot Table based upon a Table fed from a paramterized query to return values for automatically updated, specific date ranges (e.g. prior 6 months).
EDIT: To return multiple values for use in a SQL IN clause, you just need to modify the Criteria window in the MS Query box. For example, if you have a field name of Country, select it in the Criteria Field, then for Value enter IN([First],[Second],[Third]). Choose Return Data to Excel and MS Query will ask to manually input values. Once you're back in Excel you can modify the parameters again, and point them to three cells in your spreadsheet, each of which can be constrained by Data Validation. While there's probably a limit to how many parameters you can pass like this, Excel easily handles three, so a few more isn't likely out of the question.
If you don't mind using VBA then a 'light' way to do this is by dynamically generating the SQL query.
I use something similar to the below pretty extensively for generating ad-hoc dashboards which don't need anything as complex as QlikView or Crystal
E.g.
Hidden sheet 'Variables' contains a table 'Table_SQL_SALES_EMPLOYEES'. This is populated by SQL to just return the minimum required for param selection. Something like SELECT DISTINCT AccountDirector FROM InvoicedSales ORDER BY AccountDirector
.
I then have a named range 'Sales_Employees' which his its Refers To as =Table_SQL_SALES_EMPLOYEES[AccountDirector]
. This is mapped to a combo box, which has the Input Range set to 'Sales_Employees'. This will send a number to a cell of your choice (E5 in my case) of the item in the list selected. To return the actual value as per original SQL extract you need to use =OFFSET(Table_SQL_SALES_EMPLOYEES[[#Headers],[AccountDirector]],$E$5,0)
. This is named 'Slp'.
.: We now have a drop-down selection box of all possible account directors. There's also another one for the financial quarter, but same principle :.
When the selection is changed in the combo box, it calls a macro 'changeFilter()'
Sub changeFilter()
Dim wb As Workbook
Set wb = Excel.ActiveWorkbook
Dim ws As Worksheet
Set ws = Excel.ActiveWorkbook.Sheets("Lookups")
Dim conn As Excel.WorkbookConnection
Dim slp As Name
Set slp = wb.Names("Slp")
Dim qtr As Integer
qtr = wb.Names("qtr").RefersToRange.Value2
Dim query As String
' Adjusted Sales Consolidated
Set conn = wb.Connections("SQL_ADJUSTEDSALES_CONSOLIDATED")
query = "SELECT * FROM InvoicedSales WHERE AccountDirector=" & "'" & slp.RefersToRange.Value2 & "'"
conn.OLEDBConnection.CommandText = query
conn.Refresh
I basically pre-configure the tables / pivots by hooking them up to the database with hardcoded variables and select the options 'Refresh data when opening file' and 'Remove data from the external data range before saving the workbook' - this keeps the filesize tiny as nothing is actually saved.
The filter change then overwrites the hardcoded variable with a dynamically generated SQL query using the parameter selected and then refreshes the table.