Ten, nine, eight, seven… only five days left until the launch of your death-march project. When the time allocated to the project is counted in hours, and programmers are dragged, kicking and screaming, into extra features and extended functionality, they have very little time to watch the speed limit, and they often run orange lights while zooming toward the deadlines. If you keep the following points in perspective, they may save you some speeding tickets from the methodology police.
Once the application is tested and about to go live with real users, a few things should be checked to make it more robust, fast, and sleek. In the development and testing phase you were so concerned with the application's basic functionality, its bugs, and debugging that you hardly had any time to think about the other jazz and frills of the system. Check these nuts and bolts; if they are tightened properly, your basic car can perform like a souped-up machine.
The first and most important source of horsepower and speed for your application is indexing. Improper indexing is like driving your car with the hand brake on; this cannot be emphasized enough. You have to pay close attention to indexing, both clustered and non-clustered, and it is important to understand the difference between the two.
When a clustered index is created on a table, the data in the table is physically sorted in clustered-index order. SQL Server allows only one clustered index per table because there is only one way to physically sort the data in the table.
A non-clustered index is a separate index structure, independent of the physical sort order of the data. You can think of a non-clustered index like the index at the back of a book. The keywords are not located in any sorted order throughout the book; in fact, they are spread quite randomly all over it. To find a specific word, you look it up in the index, which lists all the keywords in alphabetical order, and the entry points you to the pages where the word appears.
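The two index types can be sketched as follows; the table and index names here are illustrative, not from the original article:

```sql
-- Only one clustered index is allowed per table: the data pages
-- themselves are kept sorted by this key.
CREATE CLUSTERED INDEX cx_titles_title
    ON titles (title)

-- A non-clustered index is a separate structure, like the index
-- at the back of a book; a table can have many of these.
CREATE NONCLUSTERED INDEX ncx_titles_price
    ON titles (price)
```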
Indexes can help you speed up data retrieval. However, indexes have an adverse effect on update performance, because SQL Server must maintain them to keep the proper sort order. Before you get down to tuning the engine, you should know a few of its subtle intricacies.
With a clustered index, an inserted row must be placed at the appropriate location to maintain the sort order. This is an overhead on the system for insert operations.
Because clustered indexes maintain a sort order and insert data into the appropriate location within a table, they make use of free space in pages throughout the table, resulting in less wasted space. They can help improve performance for certain types of queries, such as range retrievals and queries with an ORDER BY clause. They take less space than non-clustered indexes defined on the same column, and they provide a faster access path to the data than a similarly defined non-clustered index, unless the non-clustered index can cover the query. Clustered indexes are also updated less often than non-clustered indexes: every time you perform an operation on a data row, you have to update the non-clustered indexes, whereas a clustered index needs to be updated only if you delete the row or modify the key field.
Most often, the clustered index is assigned to the primary key of the table. This is fine if the primary key is the main access path for that table, but other good candidates for a clustered index are range searches, columns containing many duplicate values, columns frequently referenced in an ORDER BY clause, and columns other than the primary key referenced in join clauses. In most applications, the primary key column of a table is almost always retrieved in single-row lookups, and for single-row lookups a non-clustered index usually costs you only one more I/O than a clustered index, a difference you will never notice. Clustered indexes can improve performance for range retrievals because the clustered index can be used to set the bounds of the search, even if the query involves a large percentage of the rows in the table, since the data is in sorted order. The same concept holds true for indexes on columns with a large number of duplicates: with a clustered index, the duplicate values are grouped together, minimizing the number of pages that need to be searched. Another good candidate for a clustered index is a column used frequently to sort the result set.
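The query shapes that benefit most from a clustered index look something like the following sketch, assuming a hypothetical clustered index on titles(price):

```sql
-- Range retrieval: the clustered index sets the bounds of the
-- search, and the qualifying rows sit on contiguous pages.
SELECT title, price
FROM   titles
WHERE  price BETWEEN 10 AND 20

-- ORDER BY on the clustered key: the rows are already stored
-- in sorted order, so no worktable sort is needed.
SELECT title, price
FROM   titles
ORDER  BY price
```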
Also, when defining non-clustered indexes, you typically want to define them on columns with few duplicates (that is, with low-density values) so that they can be used effectively by the optimizer. With a high number of duplicates, processing a query through a non-clustered index can often be more expensive (in terms of I/O) than a table scan.
If you have a search clause on the table that will be satisfied by a non-clustered index, but you are ordering by the clustered index column, SQL Server will need to use a worktable to sort the result. This is because the data in this case is retrieved in non-clustered index order, not clustered index order, so the rows are not already sorted in the order specified by the ORDER BY clause.
With or without a clustered index on a table, every time you insert or delete a row, or update a row in a way that causes it to move within the table, all non-clustered indexes on the table must be updated to reflect the new row location. With a large number of indexes on a table, this overhead during data modification can become significant.
The most important aspect of improving SQL Server performance is proper indexing.
These facts may send you back to the index design board, where you may want to reorganize your indexing. For example, suppose you have a form that must show all the items, from an inventory of millions, with some particular quality or grade; it would be a good idea to create a clustered index on that field, as retrieval will then be really fast.
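That inventory example might look like this sketch (the table and column names are hypothetical):

```sql
-- With millions of rows, a clustered index on grade groups all
-- the duplicates of each grade together on adjacent pages, so a
-- lookup by grade touches the minimum number of pages.
CREATE CLUSTERED INDEX cx_inventory_grade
    ON inventory (grade)

SELECT item_id, item_name
FROM   inventory
WHERE  grade = 'A'
```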
Another important thing to remember, though take it with a grain of salt: indexes do tax updates and inserts, but who would you rather keep waiting, an obscure data-entry operator whose last name nobody knows, or your boss, or the CEO of the company, who uses your system to check reports?
You need not be a good engineer to understand that proper oiling is vital to the functioning of your engine. UPDATE STATISTICS is like getting new oil for your car every 2,000 miles. Run this command often; it will keep your index statistics up to date.
If you have updated the statistics for a table and want the query optimizer to reexamine the new statistics and generate new query plans for all the stored procedures that reference that table, use the sp_recompile stored procedure. You can also use sp_recompile when an index has been added to a table and you want the optimizer to consider the new index and generate new query plans for the stored procedures.
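The two maintenance commands above are run like this (the table name is illustrative):

```sql
-- Refresh the distribution statistics the optimizer relies on.
UPDATE STATISTICS titles

-- Mark every stored procedure that references the table for
-- recompilation, so the fresh statistics (or a new index) are
-- actually considered the next time each procedure runs.
EXEC sp_recompile titles
```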
A good understanding of the SQL Server query optimizer is needed to know how it uses indexes and index statistics to develop query plans. You should also have a good feel for what makes a good SARG (search argument). The general goal is to match a SARG with an index to avoid a table scan. A search argument is defined as a WHERE clause comparing a column to a constant.
The valid operators for a SARG are =, >, <, >=, and <=. The inequality operators (!= and <>) are not valid for a SARG. If you use an inequality operator, the optimizer ignores the clause as a search statement because it cannot be used to match a value against an index. If you have a search clause containing an inequality operator, try to rewrite it so that it can be recognized as a search argument by the query optimizer. For example, the query

SELECT title FROM titles WHERE price != 0

can be rewritten as

SELECT title FROM titles WHERE price > 0
Both queries return the same result, but the second version has a valid SARG that the optimizer will recognize and consider for matching with an index to satisfy the query. Keep an eye out for invalid SARGs; look for the word 'reformatting' in the query execution plan.
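Another common way to lose a SARG, beyond the inequality case above, is to bury the column inside an expression; rewriting so the bare column is compared to a constant keeps the index usable (the rewrite below is an illustrative addition, not from the original text):

```sql
-- Not a SARG: the column is wrapped in an expression, so the
-- optimizer cannot match it against an index.
SELECT title FROM titles WHERE price * 2 > 50

-- Rewritten as a SARG: the column stands alone against a constant.
SELECT title FROM titles WHERE price > 25
```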
For a big report, writing one big query with many joins over many tables can be more time-consuming than filling the report and fetching the other values from the other tables on the fly. A query that looks quite fast in the development phase can fall flat in production.
On the first execution of a stored procedure, the optimizer generates a query plan for every SELECT statement in it, regardless of conditional branching, and the plan for each SELECT is based on the parameter values supplied on that first call. Consider a procedure containing both of these statements:

SELECT * FROM orders WHERE price = @value
SELECT * FROM orders WHERE qty = @value

Having distinctly different queries in the same stored procedure can result in the wrong query plan being generated for one of the SELECT statements, since its plan may be based on values inappropriate for that query.
A better approach is to break the different SELECT statements into two separate stored procedures and execute the appropriate one.
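The split might look like this sketch (the procedure names are hypothetical):

```sql
-- Each procedure now holds one query, so each gets its own plan
-- based on parameter values appropriate to that query.
CREATE PROCEDURE get_orders_by_price @value money AS
    SELECT * FROM orders WHERE price = @value
GO

CREATE PROCEDURE get_orders_by_qty @value int AS
    SELECT * FROM orders WHERE qty = @value
GO
```

The application then calls whichever procedure matches the branch, instead of branching inside a single procedure that holds both queries.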
Use server cursors when an application needs to fetch a small subset or block of rows from an arbitrarily large result set. But if all the results of a given SELECT statement are going to be used by your application, a server cursor is always more expensive than a default result set. If a particular task requests a single row by primary key, use a default result set. If another task requires an unpredictably large or updatable result set, use a server cursor and fetch rows in reasonably sized blocks (for example, one screen of rows at a time).
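In T-SQL the server-cursor pattern looks roughly like this (cursor and table names are illustrative; the block size per round trip is actually controlled by the client API's fetch settings rather than by the FETCH statement itself):

```sql
-- A server cursor pulls rows as they are needed rather than
-- materializing the whole result set on the client.
DECLARE order_cur CURSOR FOR
    SELECT order_id, price FROM orders ORDER BY order_id

OPEN order_cur
FETCH NEXT FROM order_cur   -- the client loop fetches a
FETCH NEXT FROM order_cur   -- screenful of rows at a time
CLOSE order_cur
DEALLOCATE order_cur
```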
Commit transactional changes as soon as possible. Committing changes quickly means that locks are released quickly. Because SQL Server uses page locks for most operations, releasing locks quickly minimizes their impact on other users who need access to rows on the same data pages. The locking mechanism changed significantly in SQL Server 7, but this advice still holds.
Minimize network round trips to the server during a transaction. This "fat server" approach is like adding 1,000 horsepower to your car; it will yield outstanding performance. Move more of the code from the client to the server as stored procedures.
Use CHECK constraints to ensure that column values are valid, and DEFAULT and NOT NULL constraints to avoid the complexities caused by missing column values. Normalize the data, removing redundancies.
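Declared together with the table, those constraints keep bad data out at the source; a minimal sketch with hypothetical names:

```sql
CREATE TABLE orders (
    order_id int   NOT NULL PRIMARY KEY,   -- no missing keys
    qty      int   NOT NULL DEFAULT 1,     -- sensible default
    price    money NOT NULL,
    CONSTRAINT ck_orders_price CHECK (price >= 0)  -- valid values only
)
```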
Don't join more than four or five tables; better still, break the query up into multiple queries.
The DISTINCT keyword causes a worktable to be created in tempdb for sorting and removing duplicate rows, even if a unique index is present on the column. Avoid DISTINCT as much as possible; it does make a difference to performance.
If you want to check for the existence of a row in a table without needing the exact count, use an IF EXISTS statement rather than COUNT(*). IF EXISTS stops processing as soon as it finds the first matching row, whereas COUNT(*) continues processing to find every matching row.
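Side by side, the two versions look like this (table and column names are illustrative):

```sql
-- Stops at the first matching row.
IF EXISTS (SELECT * FROM orders WHERE price > 100)
    PRINT 'expensive orders found'

-- Scans every matching row just to produce a number.
IF (SELECT COUNT(*) FROM orders WHERE price > 100) > 0
    PRINT 'expensive orders found'
```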
Use IF UPDATE (columnname) in your table triggers to save unnecessary processing. An update trigger fires every time, but you may want to do certain things only if a certain column has changed, so save all that processing time by testing for the particular column change early in the trigger.
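The pattern looks like this sketch (trigger, table, and column names are hypothetical):

```sql
CREATE TRIGGER tr_orders_upd ON orders FOR UPDATE AS
    -- The trigger fires on every UPDATE, but the expensive work
    -- runs only when the price column was actually touched.
    IF UPDATE(price)
    BEGIN
        -- recalculation, auditing, etc. would go here
        PRINT 'price changed'
    END
```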
I am sure most programmers are already familiar with these points, but unfortunately they are easily overlooked. Check them before you begin your journey from better to best. Enjoy!