PERFORMANCE COMPARISON OF
MICROSOFT'S COM ADO
AND ADO.NET DATABASE CONNECTORS
SE655 - Performance Analysis of Real-time Systems - Spring 2008
Jason McGuire
Introduction
In the current era of computing, society has become ever more dependent on
the processing of vast quantities of data at near instantaneous speeds. In
order to handle this ever increasing demand, a tremendous effort has gone into
developing relational database systems since the late 1960's when E.F. Codd
first presented the concept. Today, these database systems handle terabytes of
information across the globe providing the lifeblood for the majority of the
commerce that takes place where delays measured in mere seconds can result in
the loss of incredible sums of money. With this in mind, a significant focus
has been placed on the transactional performance of the major database systems
on the market and this project looks at the software connection between the
client, perhaps a desktop application for a single user or a website servicing
millions of users, and the database system. With Microsoft placed as a major contender
in the database industry through their SQL Server family, the
Purpose and Scope
The purpose of this report is to outline the experiments performed in testing the read and write performance of Microsoft's ADO.NET connector and ADO COM connector against SQL Server 2000 and SQL Server 2005. The initial hypothesis of the analyst is that the performance for the latest ADO.NET connector used in conjunction with the .NET 2.0 platform is significantly higher. Through the application of various statistical tools it is the aim of this project to prove or disprove the stated hypothesis.
This report is intended as an assessment of the query execution performance
between the two
Executive Summary
For this project two types of experiments were performed with the first one testing the read (SELECT) execution time of an SQL query across both the ADO COM and ADO.NET connectors against Microsoft SQL Server 2000 and Microsoft SQL Server 2005. The second experiment type tested the write (INSERT) execution time of an SQL query using the same database connector setup against the same two database servers. The actual queries used were obtained by tracking all queries executed on a production helpdesk trouble ticket system used by a customer and selecting the most used SELECT SQL statement and the most used INSERT SQL statement. From seven major systems that makeup this helpdesk trouble ticket system, the query required for searching trouble tickets was used as the statement to test read execution time while the query required to insert new service cost estimates into the system was used to represent the test write execution time.
Initially the SELECT and INSERT SQL queries were repeated twenty times each to establish baseline statistical values as noted in the following table. From this point the total number of measurements required using a 95% confidence interval with a 5% (+/- 2.5%) error margin was calculated for both the SELECT and INSERT query statements against the database connection type and database servers. Results indicated that a wide range of measurements were going to be required and was noted with the lowest being one measurement and the highest being 1506 measurements. To err on the side of accuracy, the highest value was rounded to 2,000 measurements as the number of experiments required to be conducted for all tests in both SELECT and INSERT query execution time performance experiments.
Upon completion of all execution time tests using 2,000 measurements each, another set of basic statistical tests were taken along with a before-and-after comparison to check for statistical significance of the average performance between the two database connectors for both SELECT and INSERT SQL queries. Next a full ANOVA analysis comparing not only the average execution time of the two connectors in isolation but also any differences in average execution time as a result of SQL Server 2000 and SQL Server 2005. Results from the SELECT SQL query experiment indicated that the ADO.NET connector had an average execution time of 0.6184 milliseconds and the ADO COM connector had an average execution time of 2.4158 milliseconds against the SQL Server 2000 database. The same experiment was repeated using the same SELECT SQL query on SQL Server 2005 and the ADO.NET connector had an average execution time of 0.7630 milliseconds and ADO COM came in at 3.0513 milliseconds. Moving to the INSERT performance experiment on SQL Server 2000, ADO.NET had an average execution time of 13.7553 milliseconds and ADO COM had an average of 33.8934 milliseconds.
In order to substantiate the above values additional analysis was performance using first the before and after comparison to test the performance significance between the database connectors via the difference in execution times of the two for each iteration of the experiment and then calculating a confidence interval at the 95% confidence level. Results for this analysis indicated that there is indeed a significant difference in performance between the ADO.NET connector and the ADO COM connector on SQL Server 2000 and SQL Server 2005 for both the SELECT and INSERT SQL queries. To further refine the analysis ANOVA testing was applied to both SELECT and INSERT SQL queries which gave better insight to the average execution time difference not only between connector types but also between database systems.
Test Environment - SUT & CUT
The system under test is comprised of two physically distinct computer
systems both based on the Intel x86 processor architecture connected through an
Ethernet network. The first system, using the NETBIOS name Odie on a Windows
network, is assigned the role of "client" which executes the client
application running the SQL performance tests and reporting the execution
times. The second system, using the NETBIOS name Scooby on the same Windows
network, is assigned the role of "server" running a Windows Server
2003 OS (operating system) with both SQL Server 2000
The client system, Odie, is an Intel x86 based system running Windows XP 32 bit with service pack 2 and all critical updates as of 20 April 2008 applied. The processor is a Core 2 Duo running at 2.13 GHz with two separate cores combined with 2 GBs of DDR2 SDRAM and a SCSI320 disk subsystem running in RAID 0 configuration. A clean install of the OS was performed prior to the experiment and the only applications installed are SQL Server 2005 Studio Manager used to remotely manipulate the SQL Server 2000 and SQL Server 2005 running on Scooby and Visual Studio 2005 with service pack 1 for the development of the client application that executes the test SQL statement and records the execution time of the ADO connector.
The server system, Scooby, uses the same processor, main board and memory as Odie however the disk subsystem is composed of four SATA II drives. The first pair of drives are 160 GBs running in RAID 1 and contains the OS and all required programs for operation. The second pair of drives are 320 GBs running also in RAID 1 and contain the actual SQL databases used by SQL Server 2000 and SQL Server 2005. The OS is Windows Server 2003 with service pack 2 and all critical updates as of 20 April 2008 applied.
The component under test is the abstract software connector used between the
client application and the SQL database server. For this experiment Microsoft's
previous generation
The metric for the experiment is the execution time required for the SQL query measured in milliseconds. For an SQL query to be executed the client application submits the query via the Ethernet network to the SQL Server where the server accepts the query, evaluates it and processes it. As the query is processed a data record set is created and returned to the client. All other factors including opening and closing the database connector are not considered.
Analysis - Initial Runs
Initial runs were conducted on the SELECT and INSERT SQL queries twenty times to acquire some basic statistical data in the following table. From that point the number of measurements required to obtain a 95% confidence interval with 5% (+/- 2.5%) error margin were calculated for each connector type against each SQL Server system. The largest number rounded up was taken and used in all subsequent experiments.
Select query on SQL Server 2000
|
|
COM |
.NET |
|
Average Execution Time (ms) |
44.1573 |
13.7651 |
|
Standard Deviation (ms) |
0.9488 |
0.0735 |
|
|
|
|
Select query on SQL Server 2005
|
|
COM |
.NET |
|
Average Execution Time (ms) |
7.2470 |
2.5795 |
|
Standard Deviation (ms) |
2.3694 |
0.1699 |
|
|
|
|
The next table summarizes the execution time averages and standard deviation for the initial run of INSERT SQL query used in this experiment executed twenty times.
Insert query on SQL Server 2000
|
|
COM |
.NET |
|
Average Execution Time (ms) |
1.3666 |
0.4496 |
|
Standard Deviation (ms) |
0.0891 |
0.0070 |
|
|
|
|
Insert query on SQL Server 2005
|
|
COM |
.NET |
|
Average Execution Time (ms) |
2.1096 |
0.6744 |
|
Standard Deviation (ms) |
0.6446 |
0.0187 |
|
|
|
|
The initial runs created a baseline set of summary statistics including a solid average and standard deviation at the 95% confidence level to start off with. From this point the number of measurements required was calculated for both SELECT and INSERT queries executing on the ADO.NET and ADO COM connectors against SQL Server 2000 and SQL Server 2005. By rounding the largest number obtained the results stated that 2,000 measurements would be adequate given a 95% confidence level with 5% error margin (+/- 2.5%).
The full analysis using before and after comparison along with ANOVA for all 2,000 data points is available for download in the following links.
Full
report - PDF
Analysis - Microsoft Excel
Conclusion
In concluding these experiments, it can be stated that Microsoft has made a
significant performance enhancement in their latest version of the
Lessons Learned
During the course of this semester long project a number of new concepts were introduced and presented a significant learning curve through trial and error. As interesting as the project was a number of key points were observed and noted for future projects of a similar nature.
- Establish a clear understanding of exactly it is you are going assess. In this case a number of trials were attempted using all possible SQL queries that were not related and created extreme differences in execution times.
- Identify milestones and create a comprehensive schedule. Like most projects of any significance, a substantial amount of time needs to be invested in the project and the only way to prevent last minute changes is to make progress on a continuous basis and update the schedule to reflect changes on a regular basis.
- Make sure you understand your tools. A good portion of the analysis was done with Microsoft Excel which contains a host of powerful data analysis tool packs. Be sure that you have a good understanding of how they work and, more importantly, what the results mean.