rated by 0 users
This post has 0 Replies | 1 Follower

Top 25 Contributor
Male
Posts 11
Mahmud Posted: 06-28-2009 11:15 AM

In one of my previous post I tried to share my experience on SQL trigger. Here I am adding some additional points that may help us in writing better sql in our projects.

Analyzing SQL:

Analyzing the performance of your SQL at best you can in parallel to the development is very important. If you perform this analysis for every query/operational statement you write before moving to the next query/operational statement of the procedure you can easily trace where the performance is suffering. You should always remember, the time that your procedure takes is the summation of all the query and operational statement that executes inside it. Normally if any of your queries takes more than 0-20 millisecond, you must revise the SQL to see if there is better way to write it.   

Here I am providing one utility function that I have written and used in some of my optimization work and found very helpful for tracing the execution time of any SQL in millisecond level:

CREATE FUNCTION [dbo].[GetCurrentTime]()

RETURNS varchar(100)

AS

BEGIN

 

DECLARE @time varchar(100)

 

SET @time = (CAST(DATEPART(mi,getDate()) AS varchar(50)) + ' Min: ' + CAST(DATEPART(ss,getDate()) AS varchar(50)) + ' Sec: ' + CAST(DATEPART(ms,getDate()) AS varchar(50)) + ' Ms')

 

RETURN @time

END

You can use the above function in the following way:

DECLARE @time varchar(100)

SET @time = ( dbo.GetCurrentTime())

print 'Current Time: ' + @time

PRINT '----------------------------------------------------------'

 

--Here write your sql statement.

 

PRINT '----------------------------------------------------------'

SET @time = ( dbo.GetCurrentTime())

print 'Current Time: ' + @time

This will print the execution start time and execution end time of your SQL in milliseconds while you run the SQL for tracing in the query analyzer. So, from this result you can easily find the time your procedure is taking to execute. Instead of using this function you can utilize SQL profiler as well. But this function will help you get the timing with in the same window that you are working in. However, SQL profiler provides few more information about your query that also might help you to analyze the performance of the SQL. So, do not keep it away from you.

 1.     Database Design:

The Design of your database is very important for writing best performance SQL. You must ensure the database is normalized to some extent. Be sure that normalization does not affect the performance of your database. For large and frequently accessed database normalization can cause performance downfall in some cases. So, you must analyze the design carefully to decide the level of normalization you require.

3. Using INNER JOIN Vs WHERE clause:

This is true that from performance perspective a query written using WHERE clause and a query written using INNER JOIN do not have any difference. If we watch the execution plan of both the query we will see it actually executes in the same way. Then what actually you should use for joining 2 tables? You must always use INNER JOIN not WHERE clause. That means you have to ensure that if you have n number of tables in your query you must have at least n-1 number of joins. Why? Well, there are several reasons:

    i. INNER JOIN makes your query more readable and logical looking.

   ii. It makes your query more manageable.

  iii. It separates the joining condition from filtering condition of your query.

 iv. It will make your life easier when you have complex queries with several joins.

  v. It will minimize the possibility of error in the result of the query

  vi. Above all, trust me, it will reduce the time for writing the query.   

4. Index Creation

Creating necessary indexes on your database is very important. In one of my optimization work I found just creating 2 indexes on a table gives my query 50% performance boost. Determining the correct indexing policies is very crucial for your database. You must remember that proper indexing is good for your database performance but on the other hand creating indexes unwisely can affect your database performance significantly. Unnecessary index creates overheads on database server memory and also could result extremely slow (Sometimes never ending) query performance. So what are the basics you should follow to consider creating index? Here are some key features that makes database columns to be candidate for indexing:

 i. Ensure that you have at least one primary key in your table. Sql Server automatically indexes the primary keys.

 ii. All foreign keys are good candidates to be indexed.

iii. Columns on which ordering operation is done very frequently are very good candidate for indexing.

iv. Columns on which a large table is filtered or joined in different queries are also good candidates for indexing.

Remember for creating index no rule can be considered as Bible. You should always analyze the database and go for extensive testing to find if an index is helpful for your database. For better understanding you may like to do further study on it.

5. Using Temporary Table

In many stored procedures I have found we have used temporary tables without giving much thought on it. We should know that temporary tables are created in the tempdb and and any insert/update/delete operation or queering from temporary table has extra overhead as you have to access a table of out side of your database internally. So, try to avoid temporary table as long as you can. If you see there is no alternative of using this, you must define the table with CREATE TABLE statement with proper primary/unique key constraints and if required create indexes on it. This perhaps can improve the performance of your query.

6. Using Table Variable

In SQL server we have an alternative of temporary table called Table variable. Perhaps many of you are not aware of this. Table variable is created in memory and performs lot better than temporary table reducing the overhead on database. But If you need a table that will contain huge amount of data temporary table actually could be better for you. Because you cannot create index on Table variable but in temporary table you can.

7. Avoid Using Dynamic SQL:

Think twice before writing any dynamic SQL in your procedure. If possible try to avoid this. Here are few drawbacks of writing dynamic SQL:

   i. It does not allow your stored procedure to be pre compiled and thus affects performance.

  ii. It makes your query less manageable.

iii. It makes your query less readable.

iv. It can open your procedure for possible XSS attack.

In many cases we can avoid writing dynamic SQL. For example, in most of the cases you write dynamic SQL for using variable with TOP key word. Yes you needed to do it in SQL 2000. But now in SQL 2005 it can be achieved without dynamic query. Just put a bracket around your variable with the TOP key word and it will work. Example: SELECT TOP (@size) marks FROM result

So, the key point is you will not write dynamic query as long as it can be avoided.

8. Using Like Operator:

Filtering your query using Like key word could be very slow in performance if you query on a table with huge amount of data specially if you have other joins with this filtering criteria. So when you need to achieve the functionality of “Like” operator on a high volume table consider using Full Text query instead of that.

9. Using OR in your WHERE condition:

Be very careful in using OR in your filtering criteria’s. It can make your query significantly slow. In one of my optimization work I got a query of the following format:

FROM

      Action

      ,Relationship rel

where

      (Action.GROUP_ID=@GroupID

            OR (rel.GROUP_ID=@GroupID

                  AND rel.CONTACT_ID= Action.CONTACT_ID

                  )

      )

A query written in above format was taking almost 7-8 sec to execute. I just spited the query in 2 queries and united them using UNION operator. It improved the query to be executed just in few milliseconds. You see the significant performance gain you can achieve in this way? So always try to utilize UNION instead of using OR filtering criteria.

10. Use of cursor:

Probably cursor is the most dangerous tool exists in SQL that can kill the performance of your procedure. Remember in most of the cases cursors can be avoided by different kind of JOINS, sub queries etc. So think, analyze and if needed take help of others before writing any cursor in your procedure. In one of the optimization task that I worked on I improved the query performance more than 40% by removing cursor from the query.

11. Never Use SELECT * FROM TABLE:

Many of us have tendency of writing SQL using “SELECT * FROM TABLE” format. Do not do it. Even if you need all the columns to be returned, you specify all the columns by their name in your query. Moreover always ensure you are returning only those columns that you really need for your purpose – nothing more than that. This is actually a matter of habit that you can achieve just by not becoming lazy during writing SQL.

12. Print statement:

Sometimes you write lot of print statements to trace your procedure execution. Then after completing the procedure you do not delete those print statements. You should remember event for printing a single digit your processor needs some time and memory. It could be less then micro second but do not forget summation of microseconds become millisecond and summation of milliseconds become second. Yes I have gained almost 1 second performance gain just by deleting all the print statements in one of my optimization task.  

13. Using SCOPE_IDENTITY, IDENT_CURRENT, and @@IDENTITY:

SCOPE_IDENTITY, IDENT_CURRENT, and @@IDENTITY are similar functions in that they return values inserted into IDENTITY columns.

IDENT_CURRENT is not limited by scope and session; it is limited to a specified table. IDENT_CURRENT returns the value generated for a specific table in any session and any scope. For more information, see IDENT_CURRENT.

SCOPE_IDENTITY and @@IDENTITY will return last identity values generated in any table in the current session. However, SCOPE_IDENTITY returns values inserted only within the current scope; @@IDENTITY is not limited to a specific scope.

For example, you have two tables, T1 and T2, and an INSERT trigger defined on T1. When a row is inserted to T1, the trigger fires and inserts a row in T2. This scenario illustrates two scopes: the insert on T1, and the insert on T2 as a result of the trigger.

Assuming that both T1 and T2 have IDENTITY columns, @@IDENTITY and SCOPE_IDENTITY will return different values at the end of an INSERT statement on T1.

14.  Using @@ERROR:

Make sure that the @@ERROR global variable is checked after every statement which causes an update to the database (INSERT, UPDATE, DELETE). Make sure that rollbacks (if appropriate) are performed prior to inserting rows into an exception table

15. Minimize using Not Equal operator:

Minimize the use of not equal operations, <> or !=. SQL Server has to scan a table or index to find all values to see if they are not equal to the value given in the expression. Try rephrasing the expression using ranges: WHERE KeyColumn < 'TestValue' AND KeyColumn > 'TestValue'

16.  DONOT start the name of a stored procedure with SP_: This is because all the system related stored procedures follow this convention. Hence a valid procedure today may clash with the naming convention of a system procedure that gets bundled with a Service pack / Security patch tomorrow. Hence do not follow this convention.

17. Using Indexed View:

Sometimes we would require an view to be indexed. This feature is bundled with SQL Server 2000. The result set of the indexed view is persist in the database and indexed for fast access. Because indexed views depend on base tables, you should create indexed views with SCHEMABINDING option to prevent the table or column modification that would invalidate the view. Hence using them can reduce a lot of load on the base tables but increases the maintainability.

18. Readability:

  i. Ensure the readability of your SQL.

 ii. Capitalize all key word like SELECT, WHERE etc.

iii. Use camel casing in naming the variables.

iv. Breakdown your SQL in multiple lines. Never write too long line that can become tough to trace. A very well indented query could be like the following:

SELECT

   c.CONTACT_ID

   ,u.USER_ID

FROM

    Cac_Contact c

INNER JOIN

    Cac_User u

ON

  c.CONTACT_ID = u.CONTACT_ID

End Words:

The points I have discussed here are not something that I learned from any book. All of these I came to realize while working in real projects. So, it would be great if my friends in the community adds additional points into this. It would be even more useful if anybody shares different experience than my once in any case. I would be eagerly looking for that.

Page 1 of 1 (1 items) | RSS