Projects/Cloudstone/Database Performance

From RAD Lab

The goals of benchmarking database performance across multiple database implementations are to:

  • Compare the performance between a single server and a replicated cluster.
  • Compare the performance between MySQL and PostgreSQL.
  • Discover the bottlenecks of configurations varying in the number of databases and application processes.

The statistics are collected using Cloudstone and are measured in operations per second.

Introduction

When tests were run to study the effect of adding replicated database slaves, the data did not confirm the expectation that more slaves would improve performance. We suspected that the application servers, at the time 16 Thin processes on a c1.xlarge instance, were the bottleneck (i.e. the data showed the limits of the application and not of the database). This would explain why the database type (MySQL or PostgreSQL) and the number of slaves did not significantly affect performance.

The following tests were done to discover how many application servers are needed to test the effects of database replication with up to 3 slaves. Tests were run with varying numbers of application servers while the number of MySQL servers (or slaves) was held fixed. The expected outcome of each experiment is that performance keeps increasing with each additional application server up to a certain point: the point at which the database becomes the bottleneck.
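The expected shape of each experiment can be illustrated with a simple "slowest tier wins" model. This is only a sketch: the function name and all capacity figures below are hypothetical and chosen for illustration, not measured Cloudstone values.

```python
def expected_throughput(offered_load, app_servers, per_app_capacity, db_capacity):
    """Return throughput in ops/s, capped by whichever tier saturates first.

    All arguments are in ops/s except app_servers, which is a count.
    """
    app_capacity = app_servers * per_app_capacity
    return min(offered_load, app_capacity, db_capacity)


# Hypothetical numbers, for illustration only (not measurements).
OFFERED_LOAD = 200.0      # what the load generator asks for
PER_APP_CAPACITY = 45.0   # what one application server can deliver
DB_CAPACITY = 150.0       # what the database tier can deliver

curve = [
    expected_throughput(OFFERED_LOAD, n, PER_APP_CAPACITY, DB_CAPACITY)
    for n in range(1, 7)
]
print(curve)  # [45.0, 90.0, 135.0, 150.0, 150.0, 150.0]
```

Under these assumed numbers, throughput grows linearly with the number of application servers and then flattens once the database capacity is reached; the experiments below look for that flattening in the measured data.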

Cluster Configuration

The Cloudstone cluster used to test the application bottlenecks was:

  • Varying # of c1.xlarge instances, each running 16 Thin (application) processes.
  • 1 m1.small instance acting as an Nginx load balancing server.
  • 1 c1.xlarge instance running Faban (load generator).
  • Database Servers (Pick One)
    • Single MySQL
      • 1 c1.xlarge instance running MySQL.
    • MySQL Replicated
      • 1 c1.xlarge instance acting as the MySQL master server.
      • Varying # of c1.xlarge instances acting as MySQL slaves.
      • A deployment of MySQL acting as the "initial slave" is colocated on a Rails instance.
    • PostgreSQL
      • 1 c1.xlarge instance running PostgreSQL.
    • PostgreSQL with PGPool
      • 1 c1.xlarge instance running PGPool.
      • Varying # of c1.xlarge instances running PostgreSQL.

Application Bottleneck with Single MySQL Server

[Figure: Application Bottleneck with Single MySQL Server]

Operations per second:

  # Users   16 Thins   48 Thins   80 Thins   Maximum
  50        10.0554    10*        10*        10
  100       19.9844    20*        20*        20
  200       39.5256    38.5144    40.038     40
  500       46.4524    54.9614    50.3926    100
  750       46.6134    50.11      54.9106    150
  1000      46.2876    52.5768    48.6913    200

*Values marked with an asterisk (shown in gray in the original) are approximations, not measurements.

The results show that increasing the number of application servers has a negligible impact on performance. This suggests that, with a single MySQL server, the database is already the bottleneck at 16 Thins, so no more than 16 Thins are needed to reach the database limit.
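One way to read the single-server table is to compare the peak measured throughput of each configuration against the Maximum column. The sketch below uses the measured values transcribed from the table (asterisked approximations are left out); the dictionary and function names are illustrative only.

```python
# Maximum achievable ops/s per user count (the "Maximum" column).
MAXIMUM = {50: 10, 100: 20, 200: 40, 500: 100, 750: 150, 1000: 200}

# Measured ops/s for a single MySQL server, keyed by Thin count, then users.
SINGLE_MYSQL = {
    16: {50: 10.0554, 100: 19.9844, 200: 39.5256,
         500: 46.4524, 750: 46.6134, 1000: 46.2876},
    48: {200: 38.5144, 500: 54.9614, 750: 50.11, 1000: 52.5768},
    80: {200: 40.038, 500: 50.3926, 750: 54.9106, 1000: 48.6913},
}


def peak_throughput(thins):
    """Highest measured ops/s for a given number of Thin processes."""
    return max(SINGLE_MYSQL[thins].values())


def fraction_of_maximum(thins, users):
    """Measured ops/s divided by the maximum for that user count."""
    return SINGLE_MYSQL[thins][users] / MAXIMUM[users]


for thins in sorted(SINGLE_MYSQL):
    print(thins, peak_throughput(thins), round(fraction_of_maximum(thins, 1000), 3))
```

The peaks are 46.6134, 54.9614 and 54.9106 ops/s for 16, 48 and 80 Thins respectively, and at 1000 users every configuration reaches roughly a quarter of the 200 ops/s maximum.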

Application Bottleneck with 3 MySQL Slaves

[Figure: Application Bottleneck with 3 MySQL Slaves]

Operations per second:

  # Users   16 Thins   48 Thins   80 Thins   112 Thins   Maximum
  50        9.9098     10.0412    10*        10*         10
  100       20.3032    20.3476    20*        20*         20
  200       37.2888    39.8954    40*        40*         40
  500       34.929     82.4532    100*       100*        100
  750       36.9054    96.279     129.5672   134.2426    150
  1000      40.719     94.1156    154.0638   156.828     200

*Values marked with an asterisk (shown in gray in the original) are approximations, not measurements.

The results confirmed the expected effect of adding application servers: Cloudstone's performance increased as application servers were added, up to about 80 Thins, beyond which (at 112 Thins) the gains were small. This is therefore the number of application servers needed for the database, rather than the application tier, to become the bottleneck with 3 MySQL slaves.
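The saturation point can also be located mechanically. The sketch below uses the measured 750-user and 1000-user rows from the 3-slave table and reports the first Thin count after which the next step up yields less than a chosen relative gain. The 5% threshold and the function name are assumptions made for illustration, not part of the original analysis.

```python
# Measured ops/s with 3 MySQL slaves, keyed by user count, then Thin count.
THREE_SLAVES = {
    750: {16: 36.9054, 48: 96.279, 80: 129.5672, 112: 134.2426},
    1000: {16: 40.719, 48: 94.1156, 80: 154.0638, 112: 156.828},
}


def saturation_point(throughput_by_thins, min_gain=0.05):
    """Return the first Thin count whose next step improves ops/s by less
    than min_gain (relative), or None if every step clears the threshold."""
    counts = sorted(throughput_by_thins)
    for prev, nxt in zip(counts, counts[1:]):
        gain = (throughput_by_thins[nxt] - throughput_by_thins[prev]) / throughput_by_thins[prev]
        if gain < min_gain:
            return prev
    return None


for users, row in THREE_SLAVES.items():
    print(users, saturation_point(row))
```

With the assumed 5% threshold, both rows give 80 Thins. A stricter or looser threshold could move the result, so the number is a reading aid rather than a precise limit.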

Application Bottleneck with Single PostgreSQL Server

[Figure: Application Bottleneck with Single PostgreSQL Server]

(Data)

(Results and Conclusion)


Irrelevant Results

Archive 1: These results are irrelevant because the upper bound on performance may be measuring the capacities of the application servers instead of the database servers. This was accounted for in the study of application server limits.