Jeff Prom's SQL Server Blog

Sharing knowledge and tips on SQL Server, Business Intelligence and Data Management.

Archive for the ‘SSIS’ Category

Jet Driver Error When Importing Data from Excel Files

Posted by jeffprom on March 24, 2017

Importing data from files is common. Csv file formats are also common but can cause issues with characters such as commas. Excel files offer a nice solution as they are usually formatted properly and do not have issues with characters. However, there can also be a common problem scheduling a job to load the data due to a jet driver issue.

ImportData:Error: The requested OLE DB provider Microsoft.Jet.OLEDB.4.0 is not registered. If the 64-bit driver is not installed, run the package in 32-bit mode. Error code 0x00000000. An OLE DB record is available. Source: “Microsoft OLE DB Service Components” Hresult: 0x80040154 Description: “Class not registered”.

Here is how to overcome this error. First, install the ‘Microsoft Access Database Engine 2010 Redistributable‘ file which will provide the required jet driver.

Next, open the SSIS package/solution in visual studio and click on Project, <project name> Properties. On the Configuration Properties, Debugging tab, Set Run64BitRuntime = False. The jet driver is only 32 bit.

There is one final step and that is a configuration setting on the SQL Agent job itself. Without setting this, it will fail. Edit the job, click on Steps and edit the step. On the General tab, click on Configuration, Advanced and check the box that says 32-bit runtime.

That’s it! Execute the job and it should now finish successfully.

Posted in Excel, SSIS, Uncategorized | Tagged: , | Leave a Comment »

Importing Stock Quotes Into SQL Server Through SSIS

Posted by jeffprom on January 23, 2015

I’ve worked at a publicly traded company for the past year and a half and one thing I always thought would be fun to do was to download stock values through SSIS and use it on reports or display it in a dashboard. Sounds easy enough, but as it turns out it was a bit harder than I expected for a few reasons:

1. Finding a stock quote provider that worked well with SSIS was a challenge.
2. I wasn’t finding any other good tutorials on how to download or process stock quotes using SSIS.
3. Lack of time to research and investigate a solution.

However, all three of the above have finally been overcome and that is why I am posting this article in case others were hoping to do this as well. So let’s take a look at how this can be done. In my examples, I am using Visual Studio 2012 and SQL Server 2012. If you are going to load data for just one particular stock, there won’t be a massive amount of data, so it’s easy to just truncate the destination table and reload each day which is what I’ll show here.

First, decide what stock symbol you would like to use. In my examples we will use Microsoft (MSFT). Their stock dates back to March 13th, 1986 and provides us with some good data. To get historical stock quotes, we are going to use an API from Yahoo Finance. We will make a URL call to download a csv file which accepts a stock quote parameter.

Here is what it looks like: http://ichart.finance.yahoo.com/table.csv?s=MSFT
The csv file includes data starting the day before (yesterday) and goes back to the fist IPO.

Now that we have a way to get stock quotes, let’s create an SSIS package and work on processing the data. Retrieving data through an HTTP connection is the next challenge we need to overcome. SSIS has an option to create a HTTP connection under Connection Managers but unfortunately it will not accomplish what we need. Let’s take a different approach. Start by creating two variables:
1. LocalFilename (point to a local folder and filename)
2. RemoteFilename (http://ichart.finance.yahoo.com/table.csv?s=MSFT)

variables used

Next, add a Script Task to the Control Flow. Edit the Script Task and add the two variables we just created to the ReadOnlyVariables.

add script task

Click on ‘Edit Script’. Under the public void Main() section, add the following:

Variables varCollection = null;

Dts.VariableDispenser.LockForRead(“User::RemoteFilename”);
Dts.VariableDispenser.LockForRead(“User::LocalFilename”);
Dts.VariableDispenser.GetVariables(ref varCollection);

System.Net.WebClient myWebClient = new System.Net.WebClient();
string webResource = varCollection[“User::RemoteFilename”].Value.ToString();
string fileName = varCollection[“User::LocalFilename”].Value.ToString();
myWebClient.DownloadFile(webResource, fileName);

Dts.TaskResult = (int)ScriptResults.Success;

script task code

At this point you can run the package and it should download the csv file to your local folder.

You can open the csv file and see that there are 7 columns (Date, Open, High, Low, Close, Volume, Adj Close).
(CAUTION: If you leave the csv file open and try to run the package again, you will get an obscure error because the script task cannot get a lock on the file)

error message

Now is also a good time to create a table where we can store the incoming data. Here is an example of the table I created. Feel free to modify it to fit your needs.

CREATE TABLE [dbo].[FactStockValue](
[StockValueKey] [int] IDENTITY(1,1) NOT NULL,
[DateKey] [int] NULL,
[StockSymbol] [varchar](5) NULL,
[OpenAmount] [money] NULL,
[HighAmount] [money] NULL,
[LowAmount] [money] NULL,
[CloseAmount] [money] NULL,
[Volume] [int] NULL,
[AdjustedCloseAmount] [money] NULL,
CONSTRAINT [PK_FactStockValue] PRIMARY KEY CLUSTERED
([StockValueKey] ASC)
)

Instead of storing just the date value, I convert the date to a smart key value which can then join to a date dimension table. Feel free to skip this step and store just the date if you like.

Now let’s hop back into the SSIS package and add a Data Flow Task. Within the Data Flow Task, add a Flat File Source and create a new connection that points to the local csv file. Since the csv file doesn’t include the stock symbol, I added a Derived Column transform to specify the symbol we are processing.

derived column

Next, I added a Data Conversion transform to convert Date to a database date [DT_DBDATE] which was needed to do a lookup on my date dimension table, and to convert the stock symbol value from unicode to string [DT_STR].

data conversion

In my example, I am doing a date lookup on a date dimension table to get a DateKey value. Finally, end with an OLE DB Destination which points to the destination table. Run the package and verify that it works.

data flow success

Check the destination table and verify that everything looks correct.

table results

Now that we have a way to retrieve and store stock data values, we can use it on dashboards and reports!

stock value chart

Posted in SSIS | Tagged: , | 5 Comments »

Cleansing Data Through SSIS with DQS

Posted by jeffprom on January 13, 2015

While Data Quality Services (DQS) takes some time to build up an effective knowledge base through manual data review, there are times when it sure would be handy to use SSIS. This is only recommended when you need quick results and are feeling pretty good about your knowledge base or are perhaps using an external knowledge base through reference data. Should you decide to cleanse data through SSIS using DQS, it’s important to note that it can export some quality control metrics such as Status and Confidence score with the final data which can later be used to review the results and make final decisions. So let’s take a look at using the DQS Cleansing transform in SSIS.

In this example, I created a basic SSIS package with one Data Flow. I selected an Excel Source and pointed it to a file that has some data to be cleansed. In my example I am going to send in some bad addresses and attempt to cleanse them using Melissa Data. To learn more about the Melissa Data Address Check add-in and reference data in general, view my other blog post here.

Create a new package, add a Data Flow Task, and then add a DQS Cleansing transform object.

dqs cleansing transform setup

Click on the New button next to the Data quality connection manager to setup a new connection to an instance of DQS and then select a knowledge base.

dqs cleansing transform setup with connection

Click on the mappings tab. Here you will map input columns from your data source to the domains on the knowledge base.

dqs cleansing transform with mappings

If you would like to see additional columns on the final results such as the Confidence score and the Reason then click on the Advanced tab.

dql cleansing transform advanced tab

When finished, click OK and test it out. In my example I am just going to a Union All object as my destination for demo purposes but I can still see the resulting cleansed data through the Data Viewer.

dqs cleansing results from transform

As you can see from the results, two addresses were corrected, two had suggestions, and one failed to be cleansed. We could get fancy and split up the results based on the Status values. Corrected ones may be considered ready to ship. Auto Suggest records may want to be reviewed, and the New ones would need to be reviewed since it was not found in the knowledge base.

Using the DQS Cleansing transform in SSIS is useful but doesn’t eliminate the need to review the final cleansed results. Cleansing data through SSIS is dependent on the quality of the underlying knowledge base but does provide a quick and efficient way to do an initial data cleanse. The resulting cleansed data can then be evaluated using the Status, Reason and Confidence score values. Once the data has been processed and reviewed, only then should it be considered cleansed and ‘ready for use’.

Posted in DQS, SSIS | Tagged: , , | Leave a Comment »

Passing parameters into SSIS from a SQL Job

Posted by jeffprom on August 30, 2010

Instead of hard coding values into your SSIS package, you could pass in parameter values into your SSIS variables using the following method. Here is an example of passing in a value of either a 1 or 0.
1. In your SSIS package, create a new variable with a data type of Int16. I’ll call the variable ‘YourVariable’ for this example.
2. If you want, you can use the variable on Precedence Constraints to control the flow of your package based on your variable value. So your expression might look like @YourVariable == 1.
3. Now in SSMS you can create your scheduled job and point it to your SSIS package.
4. On the ‘Set Values’ tab, under ‘Property Path’ you can put in your variable name of ‘\Package.Variables[YourVariable].Value’ and set the ‘Value’ field to what you want (ie. 0 or 1, etc).

SSIS Parameters

Posted in SSIS | 2 Comments »

Null values when importing from Excel with SSIS

Posted by jeffprom on October 28, 2008

I’ve written SSIS packages that import data from Excel spreadsheets and load them into SQL. However, I recently came across a situation where half the data in one of my columns were having their data values ignored and was written to the DB as Nulls. I tried changing the excel column format, etc with no luck. Perplexed I did some searching and came across an interesting little quirk. Basically when you load a column it reads the first 8 rows and tries to determine what it should load the data values as. Now if your data has a mix of characters and numbers, you need to make sure you reflect this in the first 8 rows or it will not load correctly. The official explanation and workarounds can be found here.

Posted in SSIS | Tagged: , | Leave a Comment »