Workaround for Self-Joining Table Limitations on Indexed Views

Posted on March 27, 2017 by jeffprom

Indexed views are special views that can offer huge performance improvements over regular views. They perform faster because they are stored in the database the same way a table is stored. They also have a unique clustered index which greatly improves performance. However, these benefits come at a price. Indexed views have a lot of limitations to consider before implementation.

To see the full list of limitations and to learn more about indexed views, click here:
https://docs.microsoft.com/en-us/sql/relational-databases/views/create-indexed-views

Recently, I was creating several indexed views and came across a limitation of joining to the same dimension table multiple times (role-playing dimension). I received the following error message: Cannot create index on view “x_DW.dbo.vw_SchemaBoundView_Test”. The view contains a self join on “x_DW.dbo.DimUser”.

Searching Google turned up several not-so-great suggestions for overcoming this limitation. The best of them, I believe, was to create a new table and essentially write duplicate data values to it; for example, create a dbo.DimUser2 table. I really do not like this approach because I would have to create another table for one single purpose and maintain the ETL and data for it.

After some thinking, I came up with another solution which requires less maintenance and seems to have good performance with less overhead. I broke my views up into two views: ViewName and ViewName_Base. The Base view has essentially everything minus the second join to the same table. I identified which of the two joins to the same table returned the greatest number of columns or caused the biggest performance hit. This was the join I included in my Base view because it will get stored like a table and indexed. I created the Base view with schemabinding and created the unique clustered index on it.

Next, I created the other, non-base, view. This was nearly identical to the Base view. However, its primary source is the Base view. I then joined the Base view to the other reference to the self-joined table to get the final columns I needed in my select statement.
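Here is a minimal sketch of what the two views could look like. The fact table dbo.FactSale, the view names, and every column name are made up for illustration (only dbo.DimUser comes from my error message), so treat this as an outline of the pattern rather than my actual views. It also assumes the session SET options that indexed views require (the SSMS defaults normally satisfy them).

```sql
-- Sketch only: dbo.FactSale, the view names, and all column names are hypothetical.

-- Base view: schema-bound and contains only ONE of the two joins to dbo.DimUser.
CREATE VIEW dbo.vw_Sale_Base
WITH SCHEMABINDING
AS
SELECT f.SaleKey,
       f.ModifiedByUserKey,              -- kept so the outer view can join on it
       f.SaleAmount,
       cu.UserName AS CreatedByUserName
FROM dbo.FactSale AS f
INNER JOIN dbo.DimUser AS cu
    ON cu.UserKey = f.CreatedByUserKey;
GO

-- The unique clustered index is what stores the base view like a table.
CREATE UNIQUE CLUSTERED INDEX IX_vw_Sale_Base
    ON dbo.vw_Sale_Base (SaleKey);
GO

-- Outer view: a regular (non-indexed) view that adds the second join to dbo.DimUser.
CREATE VIEW dbo.vw_Sale
AS
SELECT b.SaleKey,
       b.SaleAmount,
       b.CreatedByUserName,
       mu.UserName AS ModifiedByUserName
FROM dbo.vw_Sale_Base AS b WITH (NOEXPAND)  -- read the stored view data directly
INNER JOIN dbo.DimUser AS mu
    ON mu.UserKey = b.ModifiedByUserKey;
GO
```

The NOEXPAND hint is optional in this sketch. It tells the optimizer to use the indexed view as stored instead of expanding it back to the underlying tables, which matters most on editions that do not match indexed views automatically.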

By splitting the original view into two views, I was able to work around the self-joining table limitation in indexed views but was still able to have a really nice performance improvement. I did not need to create additional ETL or need to create another table with duplicate data that I would need to maintain.

Generating Dynamic CRUD Queries Within Microsoft Excel

Posted on February 21, 2014 by jeffprom

Last year I learned a neat trick from Ross McNeely. We kept getting ad-hoc Excel documents sent to us and we needed to update the databases with these values. Traditionally you might go through the import/export data wizard, or create an SSIS package to import the data. This usually leads to data conversion issues, possible temp tables, writing extra queries, and sometimes all-around headaches.

This method shows how to write CRUD SQL statements within Excel that can then update the database. This is a really quick way to take a handful of fields and update data with little effort. If you have a lot of fields you may want to consider going the more traditional route because it could take some time to write long query statements in Excel. That said, I have used this technique probably 100 times so I decided to post about it. Let’s take a look at how it works.

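I have two examples here that will update two tables in the Adventure Works DW database. Let’s pretend that someone has just sent us an Excel file and needs product information updated in the DimProduct table. The sheet has a header row followed by the data, with these columns: ProductAlternateKey (A), EnglishProductName (B), StandardCost (C), FinishedGoodsFlag (D), and Color (E).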

First, find an open cell off to the right (F2, on the first data row) and begin writing the SQL query as shown here. Begin with =" and place single quotes where the SQL statement will require them. To reference other data from the row, close the text with a double quote, add an ampersand, and then the cell location. To continue the rest of the statement, simply place another ampersand and an opening double quote.

Continue until the rest of the statement is complete. Here is what it should look like:
="update DimProduct set EnglishProductName='"&B2&"', StandardCost='"&C2&"', FinishedGoodsFlag="&D2&", Color='"&E2&"' where ProductAlternateKey='"&A2&"'"

Once finished, click off of the query statement. If it worked correctly you should now see the query populated with values. If you received some errors, go back and double check the statement. Syntax issues are the most common mistakes.

Now that we have one statement that looks correct, let’s quickly fill in all the other rows. Single click the cell with the query (F2), and then double click the green square on the lower right. This will populate the rest of the rows with the same query template.

After the rest of the rows are populated, you should now have a complete set of queries.

One last trick to note is when using dates. You will need to use the TEXT() function as shown here and specify a date format.
="update FactInternetSales set ShipDate='"&TEXT(C2,"yyyy-mm-dd")&"' where SalesOrderNumber='"&A2&"' and SalesOrderLineNumber="&B2

The final step is to copy the query statements from Excel and paste them into SQL Server Management Studio where you can simply execute them to update the database. In the previous screenshot you would click on column D, copy, and then paste into Management Studio, and execute.
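When I paste a batch of generated statements, it can help to wrap them in a transaction so the result can be checked before it becomes permanent. The sketch below shows the idea; the product keys and values are invented for illustration and are not taken from the spreadsheet above. Also keep in mind that a value containing a single quote (O'Brien, for example) would break the generated statement unless the quote is doubled.

```sql
BEGIN TRANSACTION;

-- Statements pasted from the Excel column (the values here are made up)
update DimProduct set EnglishProductName='Adjustable Race', StandardCost='12.50', FinishedGoodsFlag=0, Color='Black' where ProductAlternateKey='AR-5381';
update DimProduct set EnglishProductName='Bearing Ball', StandardCost='3.75', FinishedGoodsFlag=0, Color='Silver' where ProductAlternateKey='BA-8327';

-- Spot-check the rows that were just changed
SELECT ProductAlternateKey, EnglishProductName, StandardCost, FinishedGoodsFlag, Color
FROM DimProduct
WHERE ProductAlternateKey IN ('AR-5381', 'BA-8327');

-- Run exactly one of the following once the results have been reviewed
-- COMMIT TRANSACTION;    -- keep the changes
ROLLBACK TRANSACTION;     -- undo the changes
```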

And there you have it. Writing queries directly within Excel provides a quick way to take data from a spreadsheet and update your database with very little effort.

View Running Queries by CPU Time

Posted on December 9, 2012 by jeffprom

If you have ever done performance tuning you know it can be a bit of an art and you need your detective hat on. I was recently working on a server that was performing poorly. After looking at resource monitor it was clear that the CPU usage was unusually high. As it turns out SQL Server was utilizing most of the CPU. The hunt was on. I grabbed my magnifying glass and followed the trail. While I could run sp_who2 to find some relevant info, I instead ran the query below. sys.dm_exec_requests returns information about each request that is executing within SQL Server. sys.dm_exec_sql_text returns the text of the SQL batch that is identified by the specified sql_handle.

-- check for queries running. sort by cpu time
SELECT a.session_id, db_name(a.database_id) as db_name, a.start_time, a.command, a.status, a.cpu_time, a.total_elapsed_time, a.reads, a.writes, b.text as query
FROM sys.dm_exec_requests a
OUTER APPLY sys.dm_exec_sql_text(a.sql_handle) b
WHERE a.session_id > 50 -- filter out background tasks
AND a.session_id <> @@spid -- filter out this query session
ORDER BY a.cpu_time DESC

This query shows open queries sorted in descending order by CPU time. I was able to nab a few culprits in the act. I copied the queries to new windows and checked the execution plans. After some intense interrogation it was clear that several tables needed an index and the sub-queries should be turned into joins. Updates were put in place, and the CPU usage went way down. Case closed!
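The session_id > 50 filter is a rule of thumb for skipping system sessions. A variation, sketched here as an alternative and not the query I actually ran that day, joins to sys.dm_exec_sessions and uses its is_user_process column instead of relying on the session number.

```sql
-- Same idea as the query above, but user sessions are identified by is_user_process
SELECT r.session_id,
       DB_NAME(r.database_id) AS db_name,
       r.start_time,
       r.command,
       r.status,
       r.cpu_time,
       r.total_elapsed_time,
       r.reads,
       r.writes,
       t.text AS query
FROM sys.dm_exec_requests AS r
INNER JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE s.is_user_process = 1       -- filter out system sessions
  AND r.session_id <> @@SPID      -- filter out this query session
ORDER BY r.cpu_time DESC;
```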

Using RowVersion and Timestamp

Posted on October 5, 2011 by jeffprom

If you do any kind of batch ETL processing, it can be very useful to know if any records in your database have been updated. One way to do this is to use a rowversion or timestamp column in your tables.

Books online has a good definition for timestamp:
Each database has a counter that is incremented for each insert or update operation that is performed on a table that contains a timestamp column within the database.
This counter is the database timestamp. This tracks a relative time within a database, not an actual time that can be associated with a clock. A table can have only one timestamp column. This property makes a timestamp column a poor candidate for keys, especially primary keys. Any update made to the row changes the timestamp value and, therefore, changes the key value.

Below are a couple of examples I came up with to demonstrate their potential. The first set uses timestamp. The second set uses rowversion. However, timestamp is just a deprecated synonym for rowversion, so the two behave the same way.

---------------------------------------
-- TIMESTAMP
---------------------------------------

-- 1. create a test table and put a few records in it
CREATE TABLE Test_TimeStamp (RowID int PRIMARY KEY, Value int, TS timestamp);
GO
INSERT INTO Test_TimeStamp (RowID, Value) VALUES (1, 0);
GO
INSERT INTO Test_TimeStamp (RowID, Value) VALUES (2, 0);
GO
INSERT INTO Test_TimeStamp (RowID, Value) VALUES (3, 0);
GO

-- 2. store the latest timestamp that is currently in the table
DECLARE @TS AS timestamp;
SET @TS = (SELECT @@DBTS AS TS)
--SELECT @TS

-- update a couple of records
UPDATE Test_TimeStamp SET Value=2 WHERE RowID=1
UPDATE Test_TimeStamp SET Value=3 WHERE RowID=2

-- show all of the records that have changed
SELECT * FROM Test_TimeStamp WHERE TS > @TS

-- show the new latest timestamp stored in the table
SELECT @@DBTS AS TS

-- 3. re-run step 2 and see the changes again and again

--DROP TABLE Test_TimeStamp
--GO
---------------------------------------
-- ROWVERSION
---------------------------------------

-- 1. create a test table and put a few records in it
CREATE TABLE Test_RowVersion (RowID int PRIMARY KEY, Value int, RV rowversion);
GO
INSERT INTO Test_RowVersion (RowID, Value) VALUES (1, 0);
GO
INSERT INTO Test_RowVersion (RowID, Value) VALUES (2, 0);
GO
INSERT INTO Test_RowVersion (RowID, Value) VALUES (3, 0);
GO

-- 2. store the latest timestamp that is currently in the table
DECLARE @TS AS timestamp;
SET @TS = (SELECT @@DBTS AS TS)
--SELECT @TS

-- update a couple of records
UPDATE Test_RowVersion SET Value=2 WHERE RowID=1
UPDATE Test_RowVersion SET Value=3 WHERE RowID=2

-- show all of the records that have changed
SELECT * FROM Test_RowVersion WHERE RV > @TS

-- show the new latest timestamp stored in the table
SELECT @@DBTS AS TS

-- 3. re-run step 2 and see the changes again and again

--DROP TABLE Test_RowVersion
--GO
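To tie this back to batch ETL, here is a rough sketch of how the same comparison could drive an incremental load. The dbo.Etl_Watermark table and its columns are hypothetical names I made up for this example; the idea is simply to remember the highest rowversion already processed and pick up everything after it on the next run.

```sql
-- Hypothetical watermark table: one row per source table
CREATE TABLE dbo.Etl_Watermark (TableName sysname PRIMARY KEY, LastRV binary(8) NOT NULL);
GO
INSERT INTO dbo.Etl_Watermark (TableName, LastRV)
VALUES (N'Test_RowVersion', 0x0000000000000000);
GO

-- One ETL run: read the window (last watermark, current database timestamp]
DECLARE @From binary(8), @To binary(8);

SELECT @From = LastRV
FROM dbo.Etl_Watermark
WHERE TableName = N'Test_RowVersion';

SET @To = @@DBTS;

-- rows inserted or updated since the previous run
SELECT RowID, Value, RV
FROM Test_RowVersion
WHERE RV > @From AND RV <= @To;

-- remember where this run stopped
UPDATE dbo.Etl_Watermark
SET LastRV = @To
WHERE TableName = N'Test_RowVersion';
```

One caveat with this sketch: rows changed by transactions that are still open when @@DBTS is read can end up below the saved watermark and be missed on the next run. SQL Server provides MIN_ACTIVE_ROWVERSION() for that situation, so it is worth a look if the source table is written to while the ETL runs.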

MDX vs T-SQL in SSAS

Posted on August 30, 2010 by jeffprom

When working with SSAS in BIDS, remember to use MDX instead of T-SQL when designing Calculated Members. Here is an example of how each would be used to prevent a divide by zero error.

T-SQL
-- Comment Block
CASE
WHEN [Measures].[Measure2] = 0 THEN 0
ELSE [Measures].[Measure1] / [Measures].[Measure2]
END

MDX
// Comment Block
IIF([Measures].[Measure2] = 0, 0, [Measures].[Measure1] / [Measures].[Measure2])

Both will work, but the T-SQL one will give you some warnings. It’s best to remember to use MDX instead.

Property Owner is not available for Database

Posted on April 16, 2010 by jeffprom

If a database becomes orphaned and has no database owner, you will get the following error message when you try to view the database properties in SSMS:

Cannot show requested dialog. (SqlMgmt)
Property Owner is not available for Database ‘[database]’. This property may not exist for this object, or may not be retrievable due to insufficient access rights.
(Microsoft.SqlServer.Smo)

You can use the following code to see orphaned databases. It lists every database with its owner; orphaned databases show NULL in the User_Name column.
SELECT databases.NAME AS DB_Name, server_Principals.NAME AS User_Name
FROM sys.[databases]
LEFT OUTER JOIN sys.[server_principals]
ON [databases].owner_sid = [server_principals].sid

To assign a new owner, run this in the context of the affected database:
EXEC sp_changedbowner 'newuser'

If you get the following error message:

Msg 15110, Level 16, State 1, Line 1
The proposed new database owner is already a user or aliased in the database.

Open up the user’s Login Properties window under Security\Logins and uncheck the Map checkbox for the database and click OK. Basically the code won’t assign them as the new owner because it thinks they are already associated with that database. Now run the sp_changedbowner command again and it will work.
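As a variation on the code above, the sketch below narrows the list to orphaned databases only and uses ALTER AUTHORIZATION, which Microsoft recommends over sp_changedbowner in newer versions. YourDatabaseName and newuser are placeholders, not names from a real server.

```sql
-- List only the databases whose owner SID no longer maps to a server login
SELECT d.name AS DB_Name
FROM sys.databases AS d
LEFT OUTER JOIN sys.server_principals AS sp
    ON d.owner_sid = sp.sid
WHERE sp.sid IS NULL;

-- Assign a new owner (placeholder database and login names)
ALTER AUTHORIZATION ON DATABASE::YourDatabaseName TO newuser;
```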

Failed Job Steps That Didn’t Notify An Operator

Posted on March 16, 2010 by jeffprom

I often use jobs that have numerous steps. One example would be a job that has some initial prep work, multiple steps that run similar code on separate databases for multiple stores, and then finally some cleanup steps. I break the job down into individual steps per store so that if one step fails, the rest still run. To do this, go to the Advanced tab of the job step properties and set ‘On failure action:’ to ‘Go to the next step’. This allows the job to continue processing if there is an error on one of the steps.

However, this raises the question: how do we know if a step failed within a job? Unfortunately, Microsoft doesn’t have built-in failure notification for steps like they do for the overall job. You can view the job’s history and spot a failed step by its yellow icon, but that is not practical to check every day, especially if you have multiple jobs set up this way. A better solution is to use the code below, which shows jobs that recently had failed steps and did not notify an operator. It can be handy to set this up in an SSRS report to keep an eye on all of your jobs that had failed steps.

-- FAILED JOB STEPS THAT DIDN'T NOTIFY AN OPERATOR VIA EMAIL
USE msdb
GO

DECLARE @DateStringToday      VARCHAR(8);
DECLARE @DateStringYesterday  VARCHAR(8);

SET @DateStringToday = convert(varchar, getdate(), 112);
SET @DateStringYesterday = convert(varchar, getdate()-1, 112);

SELECT
    job_name = sj.name,
    sj.enabled,
    sjh.step_id,
    sjh.step_name,
    sjh.sql_message_id,
    sjh.sql_severity,
    sjh.message,
    sjh.run_status,
    sjh.run_date,
    sjh.run_time,
    sjh.run_duration,
    operator_emailed = so.name
FROM msdb.dbo.sysjobhistory as sjh
    INNER JOIN msdb.dbo.sysjobs_view sj ON sj.job_id = sjh.job_id
    LEFT OUTER JOIN msdb.dbo.sysoperators so ON (sjh.operator_id_emailed = so.id)
WHERE sjh.run_status = 0 -- 0 = failed
AND sjh.run_date IN(@DateStringToday, @DateStringYesterday) -- show today and yesterday
AND sj.enabled = 1 -- make sure it's enabled
AND sj.category_id != '101' -- remove SSRS report process jobs
AND so.name IS NULL -- show jobs that didn't already email an operator
ORDER BY sjh.run_date DESC, sjh.run_time DESC

Moving SQL 2000 Logins to SQL 2005

Posted on December 10, 2009 by jeffprom

If you back up a database on one server and restore it to another, you can run into the problem where the database has users tied to logins, but the instance does not have those logins. The catch is that the database stores a unique SID for each login name. If you create a new instance login with the same name, it will generate a different SID than the one in the database. I’ve read that if you are using 2005 and above, you simply create the new login for the instance and run the following to sync up the SIDs.
USE YourDatabaseName
GO
EXEC sp_change_users_login 'Update_One', 'UserName', 'UserName'
EXEC sp_change_users_login 'Auto_Fix', 'UserName'

However, in my case I was moving a database from SQL 2000 to SQL 2005 and these did not work.
The Update_One produced:
Msg 15063, Level 16, State 1, Procedure sp_change_users_login, Line 143
The login already has an account under a different user name.

What I had to do was:
1. Restore the database to the new server (SQL 2005)
2. Do NOT create a new login yet.
3. Run the following to find the unique SID associated to the database user account. Next we create a new instance login with the same SID as the database account.

-- Look up the SID from the database
USE YourDatabaseName
GO

SELECT D.name AS [DB_LoginName], D.sid AS [DB_SID], S.name AS [Server_LoginName], S.sid AS [Server_SID]
FROM sys.database_principals AS D
LEFT OUTER JOIN sys.server_principals AS S ON D.name = S.name

-- Next, take that SID and put it in here to create the login account
CREATE LOGIN UserName WITH PASSWORD = 'Password', SID = 0xB0A2667BAEDE1B4AB93EAA0F9525DD21

You can re-run the SELECT query to verify that they are indeed the same.
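For completeness, here is a sketch of two related snippets. The first lists database users whose SID has no matching server login, which is a quick way to see who is affected after a restore. The second uses ALTER USER ... WITH LOGIN, which Microsoft recommends in place of sp_change_users_login on later builds (SQL Server 2005 SP2 and up, as far as I know). I did not use it for the move described above, so it may well run into the same conflict I hit with Update_One. UserName and YourDatabaseName are placeholders.

```sql
USE YourDatabaseName
GO

-- SQL-authenticated database users with no matching server login SID
SELECT dp.name AS OrphanedUser, dp.sid AS DB_SID
FROM sys.database_principals AS dp
LEFT OUTER JOIN sys.server_principals AS sp
    ON dp.sid = sp.sid
WHERE dp.type = 'S'          -- SQL users only
  AND dp.principal_id > 4    -- skip dbo, guest, INFORMATION_SCHEMA, sys
  AND sp.sid IS NULL;

-- Re-map an existing database user to an existing login of the same name
ALTER USER UserName WITH LOGIN = UserName;
```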

Date Formatting Functions

Posted on October 14, 2009 by jeffprom

Different date formats are often necessary. Here are some scalar-valued functions you can use to help. Simply call a function, give it the full datetime field, and it will return the formatted value depending on which function you call. For example, dbo.fn_dateYM(fullDateField) would return YYYY-MM.

-- Enter in full date. Return Year and Month: 2009-10
CREATE FUNCTION [dbo].[fn_dateYM]
(
	@dateMDYTime		smalldatetime
)
RETURNS varchar(7)
AS
BEGIN
	DECLARE @dateYM		varchar(7)
	SELECT @dateYM = CAST(YEAR(@dateMDYTime) AS CHAR(4)) + N'-' + RIGHT('00' + LTRIM(RTRIM(CAST(MONTH(@dateMDYTime) AS CHAR(2)))),2)
	RETURN @dateYM
END
GO

-- Enter in full date. Return Year: 2009
CREATE FUNCTION [dbo].[fn_dateY]
(
	@dateMDYTime		smalldatetime
)
RETURNS varchar(4)
AS
BEGIN
	DECLARE @dateY		varchar(4)
	SELECT @dateY = CAST(YEAR(@dateMDYTime) AS CHAR(4))
	RETURN @dateY
END
GO

-- Enter in full date. Return Month, Day, and Year: 10/14/2009
CREATE FUNCTION [dbo].[fn_dateMDY]
(
	@dateMDYTime		smalldatetime
)
RETURNS varchar(25)
AS
BEGIN
	DECLARE @dateMDY		varchar(25)
	-- style 101 = mm/dd/yyyy
	SELECT @dateMDY = convert(varchar(25), cast(@dateMDYTime as smalldatetime), 101)
	RETURN @dateMDY
END
GO

-- Enter in full date. Return Month: 10
CREATE FUNCTION [dbo].[fn_dateM]
(
	@dateMDYTime		smalldatetime
)
RETURNS varchar(2)
AS
BEGIN
	DECLARE @dateM		varchar(2)
	SELECT @dateM = CAST(MONTH(@dateMDYTime) AS CHAR(2))
	RETURN @dateM
END
GO

-- Enter in full date. Return Day: 14
CREATE FUNCTION [dbo].[fn_dateD]
(
	@dateMDYTime		smalldatetime
)
RETURNS varchar(2)
AS
BEGIN
	DECLARE @dateD		varchar(2)
	SELECT @dateD = CAST(DAY(@dateMDYTime) AS CHAR(2))
	RETURN @dateD
END
GO
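Here is a quick usage sketch, assuming the functions above have been created in the current database. The sample date is arbitrary. The last query shows that, for a couple of these formats, the built-in CONVERT styles can produce the same strings without a function call, which may be worth considering on large result sets.

```sql
DECLARE @d smalldatetime;
SET @d = '20091014 08:30';   -- arbitrary sample value

-- call each function with the full date
SELECT dbo.fn_dateYM(@d)  AS YM,    -- 2009-10
       dbo.fn_dateY(@d)   AS Y,     -- 2009
       dbo.fn_dateMDY(@d) AS MDY,   -- 10/14/2009
       dbo.fn_dateM(@d)   AS M,     -- 10
       dbo.fn_dateD(@d)   AS D;     -- 14

-- built-in alternatives for two of the formats
SELECT CONVERT(char(7), @d, 120)  AS YM_builtin,    -- 2009-10
       CONVERT(char(10), @d, 101) AS MDY_builtin;   -- 10/14/2009
```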

Jeff Prom's Data, Database, and SQL Blog

Sharing knowledge and tips on Data, Databases, SQL, Snowflake, Azure, Business Intelligence, and Data Management.

Category Archives: T-SQL