Friday, August 21, 2015

Sysbench File I/O tests on SSD

Here's an indication of why the I/O scheduler for SSDs should be noop or deadline. I ran a sysbench File I/O test on a 100G SSD, using the following commands:

sysbench --test=fileio --file-total-size=40G prepare

echo "cfq" > /sys/block/sda/queue/scheduler
sysbench --num-threads=16 --test=fileio --file-total-size=40G --file-test-mode=rndrw --max-time=300 --max-requests=0 run

echo "deadline" > /sys/block/sda/queue/scheduler
sysbench --num-threads=16 --test=fileio --file-total-size=40G --file-test-mode=rndrw --max-time=300 --max-requests=0 run

echo "noop" > /sys/block/sda/queue/scheduler
sysbench --num-threads=16 --test=fileio --file-total-size=40G --file-test-mode=rndrw --max-time=300 --max-requests=0 run
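
The three runs above can also be wrapped in a loop. This is only a sketch: the device name sda and the sysbench options come from the commands above, while the `DRY_RUN` switch, the `run_step` and `run_all` helpers, and the log file names are additions for illustration. With `DRY_RUN=0` it has to run as root from the directory holding the prepared test files.

```shell
#!/bin/sh
# Sketch: run the same sysbench fileio test under each I/O scheduler.
# DEV defaults to sda (the device used above); DRY_RUN and the log names are illustrative.
DEV="${DEV:-sda}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them (needs root)

run_step() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$1"
    else
        sh -c "$1"
    fi
}

run_all() {
    for sched in cfq deadline noop; do
        run_step "echo $sched > /sys/block/$DEV/queue/scheduler"
        run_step "sysbench --num-threads=16 --test=fileio --file-total-size=40G --file-test-mode=rndrw --max-time=300 --max-requests=0 run > sysbench-$sched.log"
    done
}

run_all
```

By default this only prints the six commands it would run, which makes it easy to check the device name before switching schedulers for real.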

Preparing the files:
sysbench --test=fileio --file-total-size=40G prepare
42949672960 bytes written in 310.16 seconds (132.06 MB/sec).

* Sequential write speed is about 132 MB/sec for this disk
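
The 132 MB/sec figure follows from the prepare output: bytes written divided by elapsed seconds, with 1 MB taken as 1048576 bytes (which is what reproduces the number sysbench printed). A quick check with awk, where the function name `mb_per_sec` is made up for this example:

```shell
# Throughput in MB/sec, counting 1 MB as 1048576 bytes.
mb_per_sec() {
    # $1 = bytes written, $2 = elapsed seconds
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", bytes / 1048576 / secs }'
}

mb_per_sec 42949672960 310.16   # prints 132.06
```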

root@fisher-All-Series:/mnt/ssd/sysbench-tests# echo "cfq" > /sys/block/sda/queue/scheduler
root@fisher-All-Series:/mnt/ssd/sysbench-tests# sysbench --num-threads=16 --test=fileio --file-total-size=40G --file-test-mode=rndrw --max-time=300 --max-requests=0 run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16
Random number generator seed is 0 and will be ignored


Extra file open flags: 0
128 files, 320Mb each
40Gb total file size
Block size 16Kb
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!

Operations performed:  357906 reads, 238594 writes, 763476 Other = 1359976 Total
Read 5.4612Gb  Written 3.6407Gb  Total transferred 9.1019Gb  (31.067Mb/sec)
 1988.30 Requests/sec executed

General statistics:
    total time:                          300.0044s
    total number of events:              596500
    total time taken by event execution: 683.7045s
    response time:
         min:                                  0.00ms
         avg:                                  1.15ms
         max:                                221.08ms
         approx.  95 percentile:               6.49ms

Threads fairness:
    events (avg/stddev):           37281.2500/9609.62
    execution time (avg/stddev):   42.7315/11.04

root@fisher-All-Series:/mnt/ssd/sysbench-tests# 
root@fisher-All-Series:/mnt/ssd/sysbench-tests# echo "deadline" > /sys/block/sda/queue/scheduler
root@fisher-All-Series:/mnt/ssd/sysbench-tests# sysbench --num-threads=16 --test=fileio --file-total-size=40G --file-test-mode=rndrw --max-time=300 --max-requests=0  run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16
Random number generator seed is 0 and will be ignored


Extra file open flags: 0
128 files, 320Mb each
40Gb total file size
Block size 16Kb
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!

Operations performed:  371466 reads, 247634 writes, 792422 Other = 1411522 Total
Read 5.6681Gb  Written 3.7786Gb  Total transferred 9.4467Gb  (32.244Mb/sec)
 2063.62 Requests/sec executed

General statistics:
    total time:                          300.0069s
    total number of events:              619100
    total time taken by event execution: 73.9194s
    response time:
         min:                                  0.00ms
         avg:                                  0.12ms
         max:                                 31.94ms
         approx.  95 percentile:               0.04ms

Threads fairness:
    events (avg/stddev):           38693.7500/825.77
    execution time (avg/stddev):   4.6200/0.16

root@fisher-All-Series:/mnt/ssd/sysbench-tests# echo "noop" > /sys/block/sda/queue/scheduler
root@fisher-All-Series:/mnt/ssd/sysbench-tests# sysbench --num-threads=16 --test=fileio --file-total-size=40G --file-test-mode=rndrw --max-time=300 --max-requests=0 run
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16
Random number generator seed is 0 and will be ignored


Extra file open flags: 0
128 files, 320Mb each
40Gb total file size
Block size 16Kb
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!

Operations performed:  444665 reads, 296435 writes, 948587 Other = 1689687 Total
Read 6.785Gb  Written 4.5232Gb  Total transferred 11.308Gb  (38.597Mb/sec)
 2470.19 Requests/sec executed

General statistics:
    total time:                          300.0173s
    total number of events:              741100
    total time taken by event execution: 199.5376s
    response time:
         min:                                  0.00ms
         avg:                                  0.27ms
         max:                                 22.52ms
         approx.  95 percentile:               0.81ms

Threads fairness:
    events (avg/stddev):           46318.7500/1365.75
    execution time (avg/stddev):   12.4711/0.15

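Clearly, noop wins on throughput: 2470 requests/sec against 2064 for deadline and 1988 for cfq. Deadline had the lowest average response time (0.12ms), and cfq was the slowest on both counts.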
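
To compare runs side by side, the key numbers can be pulled out of saved sysbench output with a small awk filter. This is a sketch that assumes the sysbench 0.5 output layout shown above and that each run was saved to a file or piped in; the function name `summarize` is hypothetical.

```shell
# Print requests/sec, average and 95th-percentile response time
# from sysbench 0.5 fileio output read on stdin.
summarize() {
    awk '
        /Requests\/sec executed/ { rps = $1 }    # e.g. " 1988.30 Requests/sec executed"
        /^ *avg:/                { avg = $2 }    # response time average
        /95 percentile:/         { p95 = $NF }   # last field on the percentile line
        END { printf "%s req/s, avg %s, p95 %s\n", rps, avg, p95 }
    '
}

# Example using three lines from the cfq run above.
summarize <<'EOF'
 1988.30 Requests/sec executed
         avg:                                  1.15ms
         approx.  95 percentile:               6.49ms
EOF
```

For the cfq lines above this prints `1988.30 req/s, avg 1.15ms, p95 6.49ms`.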

Just for kicks, I added the nobarrier and noatime options to the SSD mount. Do note that nobarrier is unsafe on a disk subsystem that does not have a working battery backup unit, since a power loss can then lose or corrupt data. Surprisingly, cfq wins in this benchmark:

cfq
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16
Random number generator seed is 0 and will be ignored


Extra file open flags: 0
128 files, 320Mb each
40Gb total file size
Block size 16Kb
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!

Operations performed:  1706155 reads, 1137429 writes, 3639680 Other = 6483264 Total
Read 26.034Gb  Written 17.356Gb  Total transferred 43.39Gb  (148.1Mb/sec)
 9478.49 Requests/sec executed

General statistics:
    total time:                          300.0038s
    total number of events:              2843584
    total time taken by event execution: 3621.4250s
    response time:
         min:                                  0.00ms
         avg:                                  1.27ms
         max:                                 47.11ms
         approx.  95 percentile:               7.53ms

Threads fairness:
    events (avg/stddev):           177724.0000/1206.64
    execution time (avg/stddev):   226.3391/0.40

deadline
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16
Random number generator seed is 0 and will be ignored


Extra file open flags: 0
128 files, 320Mb each
40Gb total file size
Block size 16Kb
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!

Operations performed:  1527184 reads, 1018116 writes, 3257894 Other = 5803194 Total
Read 23.303Gb  Written 15.535Gb  Total transferred 38.838Gb  (132.57Mb/sec)
 8484.28 Requests/sec executed

General statistics:
    total time:                          300.0019s
    total number of events:              2545300
    total time taken by event execution: 3884.6784s
    response time:
         min:                                  0.00ms
         avg:                                  1.53ms
         max:                                 46.72ms
         approx.  95 percentile:               7.59ms

Threads fairness:
    events (avg/stddev):           159081.2500/578.04
    execution time (avg/stddev):   242.7924/0.39

noop
sysbench 0.5:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 16
Random number generator seed is 0 and will be ignored


Extra file open flags: 0
128 files, 320Mb each
40Gb total file size
Block size 16Kb
Number of IO requests: 0
Read/Write ratio for combined random IO test: 1.50
Periodic FSYNC enabled, calling fsync() each 100 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test
Threads started!

Operations performed:  1498986 reads, 999314 writes, 3197715 Other = 5696015 Total
Read 22.873Gb  Written 15.248Gb  Total transferred 38.121Gb  (130.12Mb/sec)
 8327.61 Requests/sec executed

General statistics:
    total time:                          300.0022s
    total number of events:              2498300
    total time taken by event execution: 3877.6835s
    response time:
         min:                                  0.00ms
         avg:                                  1.55ms
         max:                                 49.04ms
         approx.  95 percentile:               7.62ms

Threads fairness:
    events (avg/stddev):           156143.7500/628.55
    execution time (avg/stddev):   242.3552/0.51
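
For reference, the mount options used for the three runs above can be applied with a remount and then verified. This is a sketch under assumptions: the mount point /mnt/ssd is inferred from the shell prompt earlier, the filesystem type is not stated in this post, and how a disabled barrier shows up in /proc/mounts depends on the filesystem and kernel (here it is assumed to appear literally as `nobarrier`). The helper name `has_mount_opt` is made up.

```shell
# Sketch: check whether a mount point carries a given option.
# Reads a file in /proc/mounts format (device, mount point, type, options, ...).
has_mount_opt() {
    # $1 = mount point, $2 = option, $3 = mounts file (defaults to /proc/mounts)
    awk -v mnt="$1" -v opt="$2" '
        $2 == mnt {
            n = split($4, o, ",")
            for (i = 1; i <= n; i++) if (o[i] == opt) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${3:-/proc/mounts}"
}

# As root, the options could be applied without unmounting (mount point assumed):
#   mount -o remount,nobarrier,noatime /mnt/ssd
#   has_mount_opt /mnt/ssd nobarrier && echo "barriers disabled"
```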

For now, we can conclude that noop and deadline are better I/O schedulers than cfq for SSDs when barriers are enabled. With the nobarrier option set, throughput increased roughly 3.4x to 4.8x, depending on the scheduler. I am curious why cfq came out on top with this setting. In the next post, we will run benchmarks on MySQL directly.
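
The per-scheduler speedup from disabling barriers can be worked out from the Requests/sec figures in the runs above. A small awk calculation (the function name `speedups` is just for this example):

```shell
# Ratio of requests/sec with nobarrier to requests/sec with barriers, per scheduler.
# All figures are copied from the sysbench output earlier in this post.
speedups() {
    awk 'BEGIN {
        printf "cfq      %.2fx\n", 9478.49 / 1988.30
        printf "deadline %.2fx\n", 8484.28 / 2063.62
        printf "noop     %.2fx\n", 8327.61 / 2470.19
    }'
}

speedups
```

This gives about 4.77x for cfq, 4.11x for deadline and 3.37x for noop, so cfq gained the most from nobarrier and noop the least.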





