PERFORMANCE COMPARISON OF MICROSOFT'S COM ADO AND ADO.NET DATABASE CONNECTORS

SE655 - Performance Analysis of Real-time Systems - Spring 2008

Jason McGuire

 


Introduction

In the current era of computing, society has become ever more dependent on the processing of vast quantities of data at near instantaneous speeds. In order to handle this ever increasing demand, a tremendous effort has gone into developing relational database systems since the late 1960's when E.F. Codd first presented the concept. Today, these database systems handle terabytes of information across the globe providing the lifeblood for the majority of the commerce that takes place where delays measured in mere seconds can result in the loss of incredible sums of money. With this in mind, a significant focus has been placed on the transactional performance of the major database systems on the market and this project looks at the software connection between the client, perhaps a desktop application for a single user or a website servicing millions of users, and the database system. With Microsoft placed as a major contender in the database industry through their SQL Server family, the ADO connection based on Microsoft's legacy COM platform is pitted against the newest rendition based on the .NET 2.0 platform. With the many advances in architecture between these two platforms the newer .NET is expected to show a significant improvement in performance.

 

Purpose and Scope

The purpose of this report is to outline the experiments performed in testing the read and write performance of Microsoft's ADO.NET connector and ADO COM connector against SQL Server 2000 and SQL Server 2005. The initial hypothesis of the analyst is that the performance for the latest ADO.NET connector used in conjunction with the .NET 2.0 platform is significantly higher. Through the application of various statistical tools it is the aim of this project to prove or disprove the stated hypothesis.

This report is intended as an assessment of the query execution performance between the two ADO connectors provided by Microsoft using their legacy COM platform and the current .NET 2.0 platform. The information contained within is not intended to determine which connector performs better in all applications but rather provide another piece in assessing performance for a particular write (INSERT) scenario as well as a particular read (SELECT) scenario.

 

Executive Summary

For this project two types of experiments were performed with the first one testing the read (SELECT) execution time of an SQL query across both the ADO COM and ADO.NET connectors against Microsoft SQL Server 2000 and Microsoft SQL Server 2005. The second experiment type tested the write (INSERT) execution time of an SQL query using the same database connector setup against the same two database servers. The actual queries used were obtained by tracking all queries executed on a production helpdesk trouble ticket system used by a customer and selecting the most used SELECT SQL statement and the most used INSERT SQL statement. From seven major systems that makeup this helpdesk trouble ticket system, the query required for searching trouble tickets was used as the statement to test read execution time while the query required to insert new service cost estimates into the system was used to represent the test write execution time.

Initially the SELECT and INSERT SQL queries were repeated twenty times each to establish baseline statistical values as noted in the following table. From this point the total number of measurements required using a 95% confidence interval with a 5% (+/- 2.5%) error margin was calculated for both the SELECT and INSERT query statements against the database connection type and database servers. Results indicated that a wide range of measurements were going to be required and was noted with the lowest being one measurement and the highest being 1506 measurements. To err on the side of accuracy, the highest value was rounded to 2,000 measurements as the number of experiments required to be conducted for all tests in both SELECT and INSERT query execution time performance experiments.

Upon completion of all execution time tests using 2,000 measurements each, another set of basic statistical tests were taken along with a before-and-after comparison to check for statistical significance of the average performance between the two database connectors for both SELECT and INSERT SQL queries. Next a full ANOVA analysis comparing not only the average execution time of the two connectors in isolation but also any differences in average execution time as a result of SQL Server 2000 and SQL Server 2005. Results from the SELECT SQL query experiment indicated that the ADO.NET connector had an average execution time of 0.6184 milliseconds and the ADO COM connector had an average execution time of 2.4158 milliseconds against the SQL Server 2000 database. The same experiment was repeated using the same SELECT SQL query on SQL Server 2005 and the ADO.NET connector had an average execution time of 0.7630 milliseconds and ADO COM came in at 3.0513 milliseconds. Moving to the INSERT performance experiment on SQL Server 2000, ADO.NET had an average execution time of 13.7553 milliseconds and ADO COM had an average of 33.8934 milliseconds.

In order to substantiate the above values additional analysis was performance using first the before and after comparison to test the performance significance between the database connectors via the difference in execution times of the two for each iteration of the experiment and then calculating a confidence interval at the 95% confidence level. Results for this analysis indicated that there is indeed a significant difference in performance between the ADO.NET connector and the ADO COM connector on SQL Server 2000 and SQL Server 2005 for both the SELECT and INSERT SQL queries. To further refine the analysis ANOVA testing was applied to both SELECT and INSERT SQL queries which gave better insight to the average execution time difference not only between connector types but also between database systems.

 

Test Environment - SUT & CUT

The system under test is comprised of two physically distinct computer systems both based on the Intel x86 processor architecture connected through an Ethernet network. The first system, using the NETBIOS name Odie on a Windows network, is assigned the role of "client" which executes the client application running the SQL performance tests and reporting the execution times. The second system, using the NETBIOS name Scooby on the same Windows network, is assigned the role of "server" running a Windows Server 2003 OS (operating system) with both SQL Server 2000 Enterprise and SQL Server 2005 Enterprise database systems installed. Scooby receives the SQL query statement from the client application on Odie and executes the statement on both database systems.

The client system, Odie, is an Intel x86 based system running Windows XP 32 bit with service pack 2 and all critical updates as of 20 April 2008 applied. The processor is a Core 2 Duo running at 2.13 GHz with two separate cores combined with 2 GBs of DDR2 SDRAM and a SCSI320 disk subsystem running in RAID 0 configuration. A clean install of the OS was performed prior to the experiment and the only applications installed are SQL Server 2005 Studio Manager used to remotely manipulate the SQL Server 2000 and SQL Server 2005 running on Scooby and Visual Studio 2005 with service pack 1 for the development of the client application that executes the test SQL statement and records the execution time of the ADO connector.

The server system, Scooby, uses the same processor, main board and memory as Odie however the disk subsystem is composed of four SATA II drives. The first pair of drives are 160 GBs running in RAID 1 and contains the OS and all required programs for operation. The second pair of drives are 320 GBs running also in RAID 1 and contain the actual SQL databases used by SQL Server 2000 and SQL Server 2005. The OS is Windows Server 2003 with service pack 2 and all critical updates as of 20 April 2008 applied.

The component under test is the abstract software connector used between the client application and the SQL database server. For this experiment Microsoft's previous generation ADO database connector used under the COM platform is tested against the current generation ADO connector for the .NET platform executing a SELECT and INSERT query.

The metric for the experiment is the execution time required for the SQL query measured in milliseconds. For an SQL query to be executed the client application submits the query via the Ethernet network to the SQL Server where the server accepts the query, evaluates it and processes it. As the query is processed a data record set is created and returned to the client. All other factors including opening and closing the database connector are not considered.

 

Analysis - Initial Runs

Initial runs were conducted on the SELECT and INSERT SQL queries twenty times to acquire some basic statistical data in the following table. From that point the number of measurements required to obtain a 95% confidence interval with 5% (+/- 2.5%) error margin were calculated for each connector type against each SQL Server system. The largest number rounded up was taken and used in all subsequent experiments.

Select query on SQL Server 2000

 

COM

.NET

Average Execution Time (ms)

44.1573

13.7651

Standard Deviation (ms)

0.9488

0.0735

 

 

 

 

Select query on SQL Server 2005

 

COM

.NET

Average Execution Time (ms)

7.2470

2.5795

Standard Deviation (ms)

2.3694

0.1699

 

 

 

 

The next table summarizes the execution time averages and standard deviation for the initial run of INSERT SQL query used in this experiment executed twenty times.

Insert query on SQL Server 2000

 

COM

.NET

Average Execution Time (ms)

1.3666

0.4496

Standard Deviation (ms)

0.0891

0.0070

 

 

 

 

Insert query on SQL Server 2005

 

COM

.NET

Average Execution Time (ms)

2.1096

0.6744

Standard Deviation (ms)

0.6446

0.0187

 

 

 

The initial runs created a baseline set of summary statistics including a solid average and standard deviation at the 95% confidence level to start off with. From this point the number of measurements required was calculated for both SELECT and INSERT queries executing on the ADO.NET and ADO COM connectors against SQL Server 2000 and SQL Server 2005. By rounding the largest number obtained the results stated that 2,000 measurements would be adequate given a 95% confidence level with 5% error margin (+/- 2.5%).

The full analysis using before and after comparison along with ANOVA for all 2,000 data points is available for download in the following links.

Full report - PDF
Analysis - Microsoft Excel

Conclusion

In concluding these experiments, it can be stated that Microsoft has made a significant performance enhancement in their latest version of the ADO connector: ADO.NET. Regardless of the results though, Microsoft is quickly putting the COM platform out to pasture along with the ADO COM connector. This particular connector is still an option for the time being but a serious push by the company is being made for all developers to convert to the .NET platform. The good news in addition to a performance enhancement is the fact that the ADO.NET connector is object oriented and provides a host of new features that will make software development faster and less prone to errors. Before wrapping up, one thing that must be kept in mind is the fact that only one SQL statement representing a SELECT SQL query and one SQL statement representing the INSERT SQL query were used. The SQL update query was not examined in this case and there remains the fact that SQL queries are very dynamic and have a diverse range in complexity. This one factor alone could have an impact on performance results as well.

Lessons Learned

During the course of this semester long project a number of new concepts were introduced and presented a significant learning curve through trial and error. As interesting as the project was a number of key points were observed and noted for future projects of a similar nature.

  1. Establish a clear understanding of exactly it is you are going assess. In this case a number of trials were attempted using all possible SQL queries that were not related and created extreme differences in execution times.
  2.  
  3. Identify milestones and create a comprehensive schedule. Like most projects of any significance, a substantial amount of time needs to be invested in the project and the only way to prevent last minute changes is to make progress on a continuous basis and update the schedule to reflect changes on a regular basis.
  4.  
  5. Make sure you understand your tools. A good portion of the analysis was done with Microsoft Excel which contains a host of powerful data analysis tool packs. Be sure that you have a good understanding of how they work and, more importantly, what the results mean.