latch系の待機時間にはCPUでスピンしている時間は含まれない

Oracle Database には latch(ラッチ) という排他制御があります。latch はスピンロックで、CPUを使って一定回数スピン(ループ)しながらロックを獲得しようとします。一定回数スピンして獲得できないとスリープします。AWR レポートや Statspack レポートに出ている latch 系の待機イベントの待ち時間はスリープしていた時間で、スピンした時間は含まれません。CPUを使ってスピンしていた時間は DB CPU に含まれます。mutex*1も同様です。

参考

Oracle Performance Firefighting: Craig Shallahamer: 9780984102303: Amazon.com: Books
P.83

Oracle latch wait time reflects only the sleep time, not the spin time.

P.84

When you look at a wait event report or response time report, all latch wait event time is latch sleep time, and not the related CPU time. In other words, when you see latch wait time or latch wait event, you know associated process has already been spinning and consuming CPU before the wait event was posted!


絵で見てわかるITインフラの仕組み

絵で見てわかるITインフラの仕組み

P.102

DBMS排他制御では、極めて短時間しかロックを獲得しない場合に用いられるラッチなどのCPU で特に意味のない処理をしながら待つスピンロックと、比較的長時間ロックを獲得する場合に用いられるキューで管理してスリープして待つロックなどがあり、保護する共有データの特性や待機する時間の長さによって使い分けられています。スピンロックはCPU でスピン(ループ)して特に意味のない処理をして待ちますが、単純にスピンし続けるのではなく、しばらくスピンしてロックを獲得できない場合はスリープしたり、状況に応じてスピンするかスリープするか判断する(アダプティブロック)といった方式があります。スリープして待つとコストの高い操作であるコンテキストスイッチが発生するため、極めて短時間のロックではCPU でスピンして待つスピンロックが使われます。


Oracle8I Internal Services for Waits, Latches, Locks, and Memory

Oracle8I Internal Services for Waits, Latches, Locks, and Memory

P.26

Another task that the process performs before it goes to sleep is to update the session wait information visible in the V$SESSION_WAIT view to indicate that the process is waiting on a latch free wait . The wait parameters are shown in Table 3.1.
Table 3.1. Wait Parameters (latch free waits)

Parameter Description
p1 The SGA address of the latch required; corresponds to the ADDR column of the V$LATCH_PARENT and V$LATCH_CHILDREN views (but not V$LATCH itself)
p2 The type of the latch; corresponds to the LATCH# column of the V$LATCH family of views
p3 The number of times that the process has slept during this attempt to acquire the latch

When the process wakes up again, it will update the session wait information to indicate that the wait is over, and if timed statistics are enabled, it will record the time for which it slept. The cumulative statistics for latch free waits that are visible in the V$SESSION_EVENT and V$SYSTEM_EVENT views are also updated at this time. Note that consecutive sleeps during a single attempt to acquire a latch are recorded as separate waits. However, the latching statistics visible in the V$LATCH family of views are only updated once the latch has been acquired.


Oracle Core: Essential Internals for DBAs and Developers (Expert's Voice in Databases)

Oracle Core: Essential Internals for DBAs and Developers (Expert's Voice in Databases)

P.70

My initial description of latch activity included the directive “loop a few thousand times.” It’s now time to consider what happens if you fail to acquire the latch even after running around this loop.
Historically the session would set an alarm to wake itself up after a short interval and take itself off the operating system run queue. When it was put back on the run queue by the operating system scheduler and got to the top of the queue, it would go into a loop again to try and acquire the latch, and put itself back to sleep if it failed again. The sleep time was designed to use an exponential backoff algorithm—meaning the more times the session went to sleep, the longer the sleep interval became—and this could, occasionally, lead to very long delays as a process tried to get a latch. Here’s one of the extreme examples I once saw on an Oracle8i system when a session spent over 8 seconds trying to acquire one of the library cache latches:

WAIT #4: nam='latch free' ela= 1 p1=-1351741396 p2=62 p3=0
WAIT #4: nam='latch free' ela= 1 p1=-1351741396 p2=62 p3=1
WAIT #4: nam='latch free' ela= 1 p1=-1351741396 p2=62 p3=2
WAIT #4: nam='latch free' ela= 3 p1=-1351741396 p2=62 p3=3
WAIT #4: nam='latch free' ela= 3 p1=-1351741396 p2=62 p3=4
WAIT #4: nam='latch free' ela= 7 p1=-1351741396 p2=62 p3=5
WAIT #4: nam='latch free' ela= 9 p1=-1351741396 p2=62 p3=6
WAIT #4: nam='latch free' ela= 18 p1=-1351741396 p2=62 p3=7
WAIT #4: nam='latch free' ela= 15 p1=-1351741396 p2=62 p3=8
WAIT #4: nam='latch free' ela= 55 p1=-1351741396 p2=62 p3=9
WAIT #4: nam='latch free' ela= 33 p1=-1351741396 p2=62 p3=10
WAIT #4: nam='latch free' ela= 69 p1=-1351741396 p2=62 p3=11
WAIT #4: nam='latch free' ela= 100 p1=-1351741396 p2=62 p3=12
WAIT #4: nam='latch free' ela= 150 p1=-1351741396 p2=62 p3=13
WAIT #4: nam='latch free' ela= 151 p1=-1351741396 p2=62 p3=14
WAIT #4: nam='latch free' ela= 205 p1=-1351741396 p2=62 p3=15

In principle the elapsed time (ela= nnnn, reported here in hundredths of a second) should double every other wait until it hits the maximum of 2 seconds,1 but the extreme CPU overload on the machine made the queue times very unstable in this case.

...

It’s important to note, though, that things no longer work like this: in outline, a process that doesn’t acquire its target latch almost immediately will attach itself to a latch wait list and go to sleep until woken. There are variations in the amount of effort a session will use before doing this, and Andrey Nikolaev, mentioned in an earlier note, has used DTrace on a Solaris system to track down the details, which are summarized in Table 4-3.

1 Steve Adams, Oracle8i Internal Services for Waits, Latches, Locks and Memory (Sebastopol, CA: O'Reilly Media, 1999).

P.24

Generic “latch free” and 27 latch specific “latch free: %” wait events are associated with sleep only. Spin is invisible to Oracle Wait Interface.



*1:Oracle Databaseの