I drafted a document and submitted it to a company in an attempt to help eliminate some very common timeout issues that plague their internal software processes. Since I don’t have anyone I know who is a SQL guru, I had to rely on what I already know. I’m posting it here in hopes that it will either help someone or I can get a SQL guru to tell me I’m a dolt. To give you an idea of the magnitude of the issue the database is 300GB and running on a single disk drive with millions of records coming in each day.
Some of you might read this and ask, “Why not suggest a SAN?” and that would be a great question. SANs are expensive and everything is hosted in house. A SAN would make most sense in this situation but due to cost it was a suggestion that was frowned upon. I chose the following recommendations as ‘stage 1’ tasks. I had planned to draft up stage 2 and 3 which moved into index optimizations and then query optimizations.
I chose to start by solving the hardware bottle necks because even if you spend an hour shaving 3 minutes off of a long running query, with the amount of data being housed it would only really be a band-aid and a cost that wouldn’t give a very good ROI. Starting as close to the metal as possible will bubble up performance increases to the upper layers automatically and would have a far wider touch that focusing on specific queries.
The goal of these recommendations is to significantly reduce or eliminate SQL timeouts by decreasing the execution times of queries and increasing overall performance of the database.
The biggest bottleneck of any server will always be the physical disk drive(s). The recommendations in this section are based on scaling out instead of scaling up.
Problem – Single physical drive serving all IO requests
The databases are sharing a single physical drive for all SQL data files. This means that every IO read (table, index) and IO write (data, transaction logs) is served by the same drive causing longer queue times for all requests.
Solution: Moving transaction logs to dedicated physical drive
Moving transaction logs to a dedicated physical drive will allow SQL to send all transaction log IO reads and IO writes to the dedicated log drive while increasing the number of read requests that the data drive can serve. Both are done in parallel.
Result: Faster inserts and updates. Increased read throughput from data drive.
Solution: Moving tempdb to a dedicated physical drive
SQL uses the tempdb for many things including storing results of joins. By moving the tempdb to a dedicated physical drive, SQL can send both IO reads and IO writes to be handled by the tempdb drive thus freeing the data drive from serving these requests.
Result: Faster joins, improved performance while working with temp tables and increased read throughput from the data drive.
Solution: Moving non-clustered indexes to dedicated physical drive
Non clustered index pages are not the same as the data pages (as with clustered indexes) so each index seek requires 2 IO reads (one to find the index and the other to find the data row the index points to). By moving the non-clustered indexes to separate file groups on a dedicated physical drive, the required IO reads required by the non clustered indexes can be split between the data drive and the index drive.
Result: Increased performance for non clustered index lookups, reduced MDF file size.
Problem: Historical data
Historical data (data that is no longer relevant to the current time context) is important for statistical and performance reporting over wide time spans but requires storage. With data increasing every hour, the overhead of maintaining the historical data grows significantly and causes longer execution times for queries.
Solution: Archive historical data
By moving historical data out of the production environment, the overhead of storage, indexing and IO reads drops significantly and increases performance on clustered indexes.
- Separate server – Moving historical data to a dedicated, separate server will reduce the overhead of storage and some or most historical reporting and processing, thus reducing load on the production server.
- Separate table and file group – Keeping historical data in a separate table, isolated from the production tables will still require storage, but will no longer be a burden on production processing. Moving the historical data table into a file group and storing on a physical drive other than where production data is stored will improve performance of queries and reporting.
Result: Reduces number of rows in very large tables which will increase overall query performance, reduced IO workload, smaller index files, faster rebuilding of clustered indexes
Alternative Solution: Table partition
An alternative to moving historical data to separate storage would be to partition very large tables into different file groups and moving the file groups containing older data to a dedicated physical drive.
Result: Improved read performance against large tables