Trying out blktrace and btt

Mount debugfs, which blktrace needs for its trace buffers.

[root@yazekats-linux ~]# mount -t debugfs debugfs /sys/kernel/debug
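
If the session may be re-run, it is harmless to guard the mount. The helper below is just a sketch (`mount_debugfs` is a made-up name, and the optional file argument exists only so the check can be exercised against a copy of a mounts table without root):

```shell
# Guarded debugfs mount for blktrace (actually mounting needs root).
# mount_debugfs is a hypothetical helper; the optional argument lets the
# "already mounted?" check be tested against any mounts-table file.
mount_debugfs() {
  mounts="${1:-/proc/mounts}"
  if grep -q ' debugfs ' "$mounts"; then
    echo "debugfs already mounted"
  else
    mount -t debugfs debugfs /sys/kernel/debug
  fi
}

# Real use, as root:
#   mount_debugfs
```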

Capture a trace with blktrace (-w 30 traces for 30 seconds, -d names the device, -o sets the output file basename).

[root@yazekats-linux work]# blktrace -w 30 -d /dev/sda -o sda
=== sda ===
  CPU  0:               160503 events,     7524 KiB data
  CPU  1:                 2788 events,      131 KiB data
  CPU  2:                   31 events,        2 KiB data
  CPU  3:                  114 events,        6 KiB data
  Total:                163436 events (dropped 0),     7662 KiB data
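
Before analyzing, it is worth checking the dropped-event count in that summary: a nonzero value means the trace is incomplete and the btt numbers cannot be trusted. A small parse of the Total line, as a sketch (the sample string below is copied from the run above):

```shell
# Extract the event and drop counts from blktrace's "Total:" summary line
summary='  Total:                163436 events (dropped 0),     7662 KiB data'
total=$(echo "$summary" | awk '{print $2}')
dropped=$(echo "$summary" | awk '{gsub(/[),]/, "", $5); print $5}')
echo "$total events, $dropped dropped"
```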

While blktrace is running, generate load with fio.

[yazekats@yazekats-linux ~]$ fio -rw=randwrite -bs=4k -size=100m -directory=/tmp -direct=1 -numjobs=1 -name=file1 -group_reporting
file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.0.13
Starting 1 process

...

file1: (groupid=0, jobs=1): err= 0: pid=3918: Mon Oct  7 22:48:56 2013
  write: io=102400KB, bw=17098KB/s, iops=4274 , runt=  5989msec

...

Disk stats (read/write):
    dm-1: ios=65/25174, merge=0/0, ticks=17/5017, in_queue=5033, util=83.76%, aggrios=65/25642, aggrmerge=0/0, aggrticks=17/5058, aggrin_queue=5074, aggrutil=83.06%
    dm-0: ios=65/25642, merge=0/0, ticks=17/5058, in_queue=5074, util=83.06%, aggrios=65/25642, aggrmerge=0/0, aggrticks=14/3944, aggrin_queue=3934, aggrutil=64.50%
  sda: ios=65/25642, merge=0/0, ticks=14/3944, in_queue=3934, util=64.50%

Feed the binary trace files captured by blktrace to btt for aggregation. The row labels below are stages in a request's life: Q=queued, G=get request, I=inserted, M=merged, D=dispatched, C=completed. So D2C is the device service time and Q2C the whole round trip through the block layer.

[root@yazekats-linux work]# btt -i sda.blktrace.* 
==================== All Devices ====================

            ALL           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

Q2Q               0.000041798   0.000933868   5.205773456       25246
Q2G               0.000000252   0.000000766   0.000057532       25247
G2I               0.000000288   0.000000838   0.000018782       25247
I2D               0.000000194   0.000000574   0.000013993       25247
D2C               0.000029740   0.000155021   0.004134574       25247
Q2C               0.000032080   0.000157199   0.004136896       25247
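
To pull a single average out of saved btt output programmatically, for instance D2C as a proxy for device service time, a small awk filter is enough (`sda_btt.txt` is a hypothetical file holding the text above):

```shell
# Print the AVG column for a given btt stage (e.g. D2C), reading btt text
# from stdin; column 3 is AVG in btt's "ALL MIN AVG MAX N" table
btt_avg() {
  awk -v stage="$1" '$1 == stage { print $3 }'
}

# Usage (assuming the btt output above was saved to a file):
#   btt_avg D2C < sda_btt.txt
```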

==================== Device Overhead ====================

       DEV |       Q2G       G2I       Q2M       I2D       D2C
---------- | --------- --------- --------- --------- ---------
 (  8,  0) |   0.4875%   0.5330%   0.0000%   0.3650%  98.6145%
---------- | --------- --------- --------- --------- ---------
   Overall |   0.4875%   0.5330%   0.0000%   0.3650%  98.6145%

==================== Device Merge Information ====================

       DEV |       #Q       #D   Ratio |   BLKmin   BLKavg   BLKmax    Total
---------- | -------- -------- ------- | -------- -------- -------- --------
 (  8,  0) |    25247    25247     1.0 |        8        8        8   206240

==================== Device Q2Q Seek Information ====================

       DEV |          NSEEKS            MEAN          MEDIAN | MODE           
---------- | --------------- --------------- --------------- | ---------------
 (  8,  0) |           25247        241742.5               0 | 0(273)
---------- | --------------- --------------- --------------- | ---------------
   Overall |          NSEEKS            MEAN          MEDIAN | MODE           
   Average |           25247        241742.5               0 | 0(273)

==================== Device D2D Seek Information ====================

       DEV |          NSEEKS            MEAN          MEDIAN | MODE           
---------- | --------------- --------------- --------------- | ---------------
 (  8,  0) |           25247        241742.5               0 | 0(273)
---------- | --------------- --------------- --------------- | ---------------
   Overall |          NSEEKS            MEAN          MEDIAN | MODE           
   Average |           25247        241742.5               0 | 0(273)

==================== Plug Information ====================

       DEV |    # Plugs # Timer Us  | % Time Q Plugged
---------- | ---------- ----------  | ----------------

       DEV |    IOs/Unp   IOs/Unp(to)
---------- | ----------   ----------
 (  8,  0) |        0.0          0.0

==================== Active Requests At Q Information ====================

       DEV |  Avg Reqs @ Q
---------- | -------------
 (  8,  0) |           0.0

# Total System
#     Total System : q activity
967.404847527   0.0
967.404847527   0.4
973.392224775   0.4
973.392224775   0.0
975.360740631   0.0
975.360740631   0.4
975.362222205   0.4
975.362222205   0.0
979.394058685   0.0
979.394058685   0.4
979.394058685   0.4
979.394058685   0.0
980.567205366   0.0
980.567205366   0.4
980.567989392   0.4
980.567989392   0.0
985.773762848   0.0
985.773762848   0.4
985.775139780   0.4
985.775139780   0.0
990.979986421   0.0
990.979986421   0.4
990.981267284   0.4
990.981267284   0.0

#     Total System : c activity
967.405038820   0.5
967.405038820   0.9
973.392340003   0.9
973.392340003   0.5
975.361929208   0.5
975.361929208   0.9
975.362591937   0.9
975.362591937   0.5
975.504717388   0.5
975.504717388   0.9
975.504717388   0.9
975.504717388   0.5
979.395101268   0.5
979.395101268   0.9
979.395101268   0.9
979.395101268   0.5
980.504929360   0.5
980.504929360   0.9
980.568306111   0.9
980.568306111   0.5
985.774974636   0.5
985.774974636   0.9
985.775488386   0.9
985.775488386   0.5
990.981123044   0.5
990.981123044   0.9
990.981638615   0.9
990.981638615   0.5

...

There it is!


Let's run a quick experiment.
With noop:

[root@yazekats-linux work]# echo noop > /sys/block/sda/queue/scheduler
[root@yazekats-linux work]# blktrace -w 10 -d /dev/sda -o sda
[root@yazekats-linux work]# btt -i sda.blktrace.* 
==================== All Devices ====================

            ALL           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

Q2Q               0.000016545   0.000150502   0.058013758       66441
Q2G               0.000000176   0.000000438   0.000362092       66442
G2I               0.000000142   0.000000440   0.000014414       66442
I2D               0.000000105   0.000000345   0.001247547       66442
D2C               0.000032505   0.000309451   0.057950194       66442
Q2C               0.000034012   0.000310674   0.057951719       66442

==================== Device Overhead ====================

       DEV |       Q2G       G2I       Q2M       I2D       D2C
---------- | --------- --------- --------- --------- ---------
 (  8,  0) |   0.1410%   0.1415%   0.0000%   0.1111%  99.6064%
---------- | --------- --------- --------- --------- ---------
   Overall |   0.1410%   0.1415%   0.0000%   0.1111%  99.6064%

With cfq:

[root@yazekats-linux work]# echo cfq > /sys/block/sda/queue/scheduler
[root@yazekats-linux work]# blktrace -w 10 -d /dev/sda -o sda
[root@yazekats-linux work]# btt -i sda.blktrace.* 
==================== All Devices ====================

            ALL           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

Q2Q               0.000015123   0.000130726   0.061972012       76496
Q2G               0.000000268   0.000001068   0.000599402       76301
G2I               0.000000151   0.000000540   0.000035957       76301
Q2M               0.000000100   0.000000128   0.000000571         196
I2D               0.000000264   0.000004083   0.001405822       76300
M2D               0.000013899   0.000575561   0.001230303         196
D2C               0.000025953   0.000194480   0.061873548       76494
Q2C               0.000029128   0.000201655   0.061908653       76495

==================== Device Overhead ====================

       DEV |       Q2G       G2I       Q2M       I2D       D2C
---------- | --------- --------- --------- --------- ---------
 (  8,  0) |   0.5282%   0.2672%   0.0002%   2.0196%  96.4406%
---------- | --------- --------- --------- --------- ---------
   Overall |   0.5282%   0.2672%   0.0002%   2.0196%  96.4406%

Since the disk is an SSD, the I/O scheduler overhead stands out with cfq: I2D accounts for 2.02% of the total time versus 0.11% with noop, and the device's share (D2C) drops from 99.61% to 96.44%.
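
As a rough cross-check of that impression, the share of each request's round trip spent above the device (Q2C minus D2C) can be computed straight from the AVG values in the two runs:

```shell
# AVG latencies (seconds) copied from the two btt runs above
noop_q2c=0.000310674; noop_d2c=0.000309451
cfq_q2c=0.000201655;  cfq_d2c=0.000194480

# Percentage of Q2C spent above the device (scheduler/queueing time)
overhead() {
  awk -v q2c="$1" -v d2c="$2" 'BEGIN { printf "%.2f\n", (q2c - d2c) / q2c * 100 }'
}

echo "noop: $(overhead "$noop_q2c" "$noop_d2c")%"   # well under 1%
echo "cfq:  $(overhead "$cfq_q2c" "$cfq_d2c")%"     # several times larger
```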

During the blktrace runs, fio was generating the load.

[yazekats@yazekats-linux ~]$ fio -rw=randwrite -bs=4k -size=500m -directory=/tmp -direct=1 -numjobs=3 -name=file1 -group_reporting