Projects/Cloudstone/Database Performance
From RAD Lab
The goals of benchmarking database performance across multiple database implementations are to:
- Compare the performance between a single server and a replicated cluster.
- Compare the performance between MySQL and PostgreSQL.
- Discover the bottlenecks of configurations varying in the number of databases and application processes.
The statistics are collected using Cloudstone and are measured in operations per second.
Introduction
When tests were run to study the effect of adding replicated database slaves, the data did not confirm the expectation that more slaves would improve performance. We suspected that the application tier, at the time 16 Thin processes on one c1.xlarge instance, was the bottleneck (i.e. the data showed the limits of the application and not the database). This would explain why neither the database type (MySQL or PostgreSQL) nor the number of slaves significantly affected performance.
The following tests attempt to discover how many application servers are needed to test the effects of database replication with up to 3 slaves. Tests were run with a varying number of application servers while holding the number of MySQL slaves (or servers) fixed. The expected outcome of each experiment is that performance keeps increasing with each additional application server up to a certain point, at which the database becomes the bottleneck.
Cluster Configuration
The Cloudstone cluster used to test the application bottlenecks was:
- Varying # of c1.xlarge instances with 16 Thin (application) processes.
- 1 m1.small instance acting as an Nginx load balancing server.
- 1 c1.xlarge instance running Faban (load generator).
- Database Servers (Pick One)
  - Single MySQL
    - 1 c1.xlarge instance running MySQL.
  - MySQL Replicated
    - 1 c1.xlarge instance acting as the MySQL master server.
    - Varying # of c1.xlarge instances acting as MySQL slaves.
    - A deployment of MySQL acting as the "initial slave" is colocated on a Rails instance.
  - PostgreSQL
    - 1 c1.xlarge instance running PostgreSQL.
  - PostgreSQL with PGPool
    - 1 c1.xlarge instance running PGPool.
    - Varying # of c1.xlarge instances running PostgreSQL.
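The result tables on this page are labelled by Thin count rather than by instance count. As a minimal sketch, assuming every application instance runs exactly 16 Thin processes as in the configuration above, the Thin counts map back to c1.xlarge instances as follows (the function and constant names are illustrative, not part of Cloudstone):

```python
# Assumption: each application instance hosts exactly 16 Thin processes,
# as in the cluster configuration described above.
THINS_PER_INSTANCE = 16


def app_instances_for(thin_count: int) -> int:
    """Return how many application instances host `thin_count` Thin processes."""
    if thin_count <= 0 or thin_count % THINS_PER_INSTANCE != 0:
        raise ValueError(
            f"{thin_count} is not a positive multiple of {THINS_PER_INSTANCE}"
        )
    return thin_count // THINS_PER_INSTANCE


# Thin counts that appear as column headings in the result tables.
for thins in (16, 48, 80, 112):
    print(f"{thins} Thins -> {app_instances_for(thins)} c1.xlarge instance(s)")
```

Under that assumption, the 16, 48, 80, and 112 Thin columns correspond to 1, 3, 5, and 7 application instances.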
Application Bottleneck with Single MySQL Server
| # Users | 16 Thins | 48 Thins | 80 Thins | Maximum |
|---|---|---|---|---|
| 50 | 10.0554 | 10* | 10* | 10 |
| 100 | 19.9844 | 20* | 20* | 20 |
| 200 | 39.5256 | 38.5144 | 40.038 | 40 |
| 500 | 46.4524 | 54.9614 | 50.3926 | 100 |
| 750 | 46.6134 | 50.11 | 54.9106 | 150 |
| 1000 | 46.2876 | 52.5768 | 48.6913 | 200 |
*Values marked with an asterisk (shown in gray in the original table) are approximations, not measurements.
The results show that increasing the number of application servers has a negligible impact on performance: from 500 users upward, throughput stays between roughly 46 and 55 ops/sec whether 16, 48, or 80 Thins are used. This suggests that, with a single MySQL server, no more than 16 Thins are needed for the database to become the bottleneck.
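To make the plateau concrete, the sketch below takes the fully measured rows of the single-MySQL table and reports the best throughput in each row as a fraction of the Maximum column. The dictionary layout and the helper name `peak_fraction` are illustrative choices, not part of the original study.

```python
# Fully measured rows from the single-MySQL table.
# users -> (16 Thins, 48 Thins, 80 Thins, Maximum), all in ops/sec.
# Rows containing asterisked (approximated) cells are left out.
single_mysql = {
    200: (39.5256, 38.5144, 40.038, 40),
    500: (46.4524, 54.9614, 50.3926, 100),
    750: (46.6134, 50.11, 54.9106, 150),
    1000: (46.2876, 52.5768, 48.6913, 200),
}


def peak_fraction(row):
    """Best measured throughput in the row divided by the row's Maximum."""
    *measured, maximum = row
    return max(measured) / maximum


for users, row in single_mysql.items():
    best = max(row[:-1])
    print(f"{users:>5} users: best {best:.1f} ops/sec, "
          f"{peak_fraction(row):.0%} of Maximum")
```

The best measured throughput never reaches 55 ops/sec in any row, so the fraction of the Maximum falls steadily as the user count grows, regardless of how many Thins are running.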
Application Bottleneck with 3 MySQL Slaves
| # Users | 16 Thins | 48 Thins | 80 Thins | 112 Thins | Maximum |
|---|---|---|---|---|---|
| 50 | 9.9098 | 10.0412 | 10* | 10* | 10 |
| 100 | 20.3032 | 20.3476 | 20* | 20* | 20 |
| 200 | 37.2888 | 39.8954 | 40* | 40* | 40 |
| 500 | 34.929 | 82.4532 | 100* | 100* | 100 |
| 750 | 36.9054 | 96.279 | 129.5672 | 134.2426 | 150 |
| 1000 | 40.719 | 94.1156 | 154.0638 | 156.828 | 200 |
*Values marked with an asterisk (shown in gray in the original table) are approximations, not measurements.
The results confirmed the suspected effect of increasing the number of application servers: the performance of Cloudstone increased as application servers were added, up to 80 Thins. At 1000 users, throughput rose from 40.7 ops/sec (16 Thins) to 94.1 (48 Thins) to 154.1 (80 Thins), while 112 Thins added less than 3 ops/sec (156.8). Therefore, 80 Thins is the number of application servers needed for the database slave count to become the bottleneck in Cloudstone with 3 MySQL slaves.
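One way to locate that point mechanically is to look for the first step at which adding application servers stops paying off. The sketch below applies this to the 1000-user row of the table above. The 5% cutoff (`min_gain`) and the function name are assumptions made for illustration; the original study does not state a numeric criterion.

```python
# Throughput (ops/sec) at 1000 users with 3 MySQL slaves, keyed by Thin count.
throughput_by_thins = {16: 40.719, 48: 94.1156, 80: 154.0638, 112: 156.828}


def saturation_point(results, min_gain=0.05):
    """Return the smallest count whose next step improves by less than `min_gain`.

    `min_gain` is an assumed relative-improvement cutoff (5% by default).
    Returns None if every step still improves by at least `min_gain`.
    """
    counts = sorted(results)
    for current, following in zip(counts, counts[1:]):
        gain = (results[following] - results[current]) / results[current]
        if gain < min_gain:
            return current
    return None


print(saturation_point(throughput_by_thins))  # prints 80
```

With these figures the relative gains are about 131% (16 to 48 Thins), 64% (48 to 80), and 1.8% (80 to 112), so any cutoff between roughly 2% and 60% selects 80 Thins.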
Application Bottleneck with Single PostgreSQL Server
(Data)
(Results and Conclusion)
Irrelevant Results
Archive 1: These results are irrelevant because the upper bound on performance may be measuring the capacities of the application servers instead of the database servers. This was accounted for in the study of application server limits.
