Notes on the hard and soft NFS mount options

Notes from looking into the hard and soft NFS mount options (Linux only).

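Summary

hard behavior
  • The client retries the NFS request indefinitely until the NFS server responds.
  • After issuing an I/O, the application keeps sleeping while it waits for completion.
  • When hard is combined with intr, the I/O can be stopped by sending a signal*1
kill -s SIGINT <PID>   # SIGQUIT or SIGHUP also work
soft behavior
  • If a request fails the number of times given by retrans, an error is returned to the application that issued the I/O.
Which is better
  • Use hard when the mount is used to read or write data that needs integrity.
    • With soft, incomplete writes*2 or reads*3 can occur.
  • Use hard when executables are placed on the mount as well.
    • If the NFS server crashes while executable data is being loaded into memory, or while a paged-out page is being read back in, the program may behave unexpectedly*4.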
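
To check which of the two behaviors an existing mount uses, the mount options can be inspected. The following is a minimal sketch, assuming the usual /proc/mounts line layout (device, mount point, filesystem type, comma-separated options) and assuming that NFS mounts list either hard or soft among their options. The function name nfs_recovery_modes and the sample lines are made up for illustration.

```python
from typing import Dict

# Filesystem types treated as NFS in this sketch.
NFS_TYPES = {"nfs", "nfs4"}


def nfs_recovery_modes(mounts_text: str) -> Dict[str, str]:
    """Map each NFS mount point to "hard" or "soft".

    mounts_text is expected to look like the contents of /proc/mounts.
    """
    modes: Dict[str, str] = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in NFS_TYPES:
            continue
        options = fields[3].split(",")
        # hard is the default, so report soft only when it is listed explicitly.
        modes[fields[1]] = "soft" if "soft" in options else "hard"
    return modes


# Hypothetical sample input; real data would come from /proc/mounts.
SAMPLE = (
    "srv:/export /mnt/data nfs4 rw,relatime,hard,proto=tcp,timeo=600,retrans=2 0 0\n"
    "srv:/scratch /mnt/scratch nfs rw,soft,timeo=600,retrans=3 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)

for mount_point, mode in nfs_recovery_modes(SAMPLE).items():
    print(f"{mount_point}: {mode}")
```

On a Linux client, passing the text of /proc/mounts instead of SAMPLE should list the real NFS mounts.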

References

soft / hard
Determines the recovery behavior of the NFS client after an NFS request times out. If neither option is specified (or if the hard option is specified), NFS requests are retried indefinitely. If the soft option is specified, then the NFS client fails an NFS request after retrans retransmissions have been sent, causing the NFS client to return an error to the calling application.
NB: A so-called "soft" timeout can cause silent data corruption in certain cases. As such, use the soft option only when client responsiveness is more important than data integrity. Using NFS over TCP or increasing the value of the retrans option may mitigate some of the risks of using the soft option.

retrans=n
The number of times the NFS client retries a request before it attempts further recovery action. If the retrans option is not specified, the NFS client tries each request three times.
The NFS client generates a "server not responding" message after retrans retries, then attempts further recovery (depending on whether the hard mount option is in effect).

intr / nointr
This option is provided for backward compatibility. It is ignored after kernel 2.6.25.

nfs(5) - Linux manual page


Managing NFS and NIS

  • 6.3. Mounting filesystems - Mount options

hard/soft
By default, NFS filesystems are hard mounted, and operations on them are retried until they are acknowledged by the server. If the soft option is specified, an NFS RPC call returns a timeout error if it fails the number of times specified by the retrans option.

  • 6.3. Mounting filesystems - Mounting filesystems - Hard and soft mounts

Hard and soft mounts
The hard and soft mount options determine how a client behaves when the server is excessively loaded for a long period or when it crashes. By default, all NFS filesystems are mounted hard, which means that an RPC call that times out will be retried indefinitely until a response is received from the server. This makes the NFS server look as much like a local disk as possible — the request that needs to go to disk completes at some point in the future. An NFS server that crashes looks like a disk that is very, very slow.
A side effect of hard-mounting NFS filesystems is that processes block (or “hang”) in a high-priority disk wait state until their NFS RPC calls complete. If an NFS server goes down, the clients using its filesystems hang if they reference these filesystems before the server recovers. Using intr in conjunction with the hard mount option allows users to interrupt system calls that are blocked waiting on a crashed server. The system call is interrupted when the process making the call receives a signal, usually sent by the user typing CTRL-C (interrupt) or using the kill command. CTRL-\ (quit) is another way to generate a signal, as is logging out of the NFS client host. When using kill, only SIGINT, SIGQUIT, and SIGHUP will interrupt NFS operations.
When an NFS filesystem is soft-mounted, repeated RPC call failures eventually cause the NFS operation to fail as well. Instead of emulating a painfully slow disk, a server exporting a soft-mounted filesystem looks like a failing disk when it crashes: system calls referencing the soft-mounted NFS filesystem return errors. Sometimes the errors can be ignored or are preferable to blocking at high priority; for example, if you were doing an ls -l when the NFS server crashed, you wouldn’t really care if the ls command returned an error as long as your system didn’t hang.
The other side to this “failing disk” analogy is that you never want to write data to an unreliable device, nor do you want to try to load executables from it. You should not use the soft option on any filesystem that is writable, nor on any filesystem from which you load executables. Furthermore, because many applications do not check return value of the read(2) system call when reading regular files (because those programs were written in the days before networking was ubiquitous, and disks were reliable enough that reads from disks virtually never failed), you should not use the soft option on any filesystem that is supplying input to applications that are in turn using the data for a mission-critical purpose. NFS only guarantees the consistency of data after a server crash if the NFS filesystem was hard-mounted by the client. Unless you really know what you are doing, never use the soft option.
We’ll come back to hard- and soft-mount issues when we discuss modifying client behavior in the face of slow NFS servers in Chapter 18.
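
The point about unchecked read(2) return values can be illustrated with a small reader that refuses to continue on an error or a short read. This is only a sketch: read_exact is a hypothetical helper, and it assumes the caller knows how many bytes to expect. In Python, a failing read(2) surfaces as an OSError raised by os.read, so the only extra check needed is for an early end of file.

```python
import os
import tempfile


def read_exact(path: str, expected_size: int, chunk_size: int = 65536) -> bytes:
    """Read exactly expected_size bytes from path, or raise.

    OSError from os.read (for example EIO after a timeout on a soft mount)
    propagates to the caller instead of being ignored.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        remaining = expected_size
        while remaining > 0:
            chunk = os.read(fd, min(chunk_size, remaining))
            if not chunk:
                # End of file before all expected data arrived: treat as an error.
                got = expected_size - remaining
                raise EOFError(f"{path}: got {got} of {expected_size} bytes")
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


# Demonstration on a local temporary file (no NFS needed to run this).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1000)
    demo_path = tmp.name

data = read_exact(demo_path, 1000)
print(len(data))
os.remove(demo_path)
```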

  • 18.2. Soft mount issues

Repeated retransmission cycles only occur for hard-mounted filesystems. When the soft option is supplied in a mount, the RPC retransmission sequence ends at the first major timeout, producing messages like:

NFS write failed for server wahoo: error 5 (RPC: Timed out)
NFS write error on host wahoo: error 145.
(file handle: 800000 2 a0000 114c9 55f29948 a0000 11494 5cf03971)

The NFS operation that failed is indicated, the server that failed to respond before the major timeout, and the filehandle of the file affected. RPC timeouts may be caused by extremely slow servers, or they can occur if a server crashes and is down or rebooting while an RPC retransmission cycle is in progress.
With soft-mounted filesystems, you have to worry about damaging data due to incomplete writes, losing access to the text segment of a swapped process, and making soft-mounted filesystems more tolerant of variances in server response time. If a client does not give the server enough latitude in its response time, the first two problems impair both the performance and correct operation of the client. If write operations fail, data consistency on the server cannot be guaranteed. The write error is reported to the application during some later call to write( ) or close( ), which is consistent with the behavior of a local filesystem residing on a failing or overflowing disk. When the actual write to disk is attempted by the kernel device driver, the failure is reported to the application as an error during the next similar or related system call.
A well-conditioned application should exit abnormally after a failed write, or retry the write if possible. If the application ignores the return code from write( ) or close( ), then it is possible to corrupt data on a soft-mounted filesystem. Some write operations may fail and never be retried, leaving holes in the open file.
To guarantee data integrity, all filesystems mounted read-write should be hard-mounted. Server performance as well as server reliability determine whether a request eventually succeeds on a soft-mounted filesystem, and neither can be guaranteed. Furthermore, any operating system that maps executable images directly into memory (such as Solaris) should hard-mount filesystems containing executables. If the filesystem is soft-mounted, and the NFS server crashes while the client is paging in an executable (during the initial load of the text segment or to refill a page frame that was paged out), an RPC timeout will cause the paging to fail. What happens next is system-dependent; the application may be terminated or the system may panic with unrecoverable swap errors.
A common objection to hard-mounting filesystems is that NFS clients remain catatonic until a crashed server recovers, due to the infinite loop of RPC retransmissions and timeouts. By default, Solaris clients allow interrupts to break the retransmission loop. Use the intr mount option if your client doesn’t specify interrupts by default. Unfortunately, some older implementations of NFS do not process keyboard interrupts until a major timeout has occurred: with even a small timeout period and retransmission count, the time required to recognize an interrupt can be quite large.
If you choose to ignore this advice, and choose to use soft-mounted NFS filesystems, you should at least make NFS clients more tolerant of soft-mounted NFS fileservers by increasing the retrans mount option. Increasing the number of attempts to reach the server makes the client less likely to produce an RPC error during brief periods of server loading.
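
The advice that a well-behaved application should notice failed writes can be sketched as follows. write_all_checked is a hypothetical helper, not something from the book or the man page. It loops over short writes, lets any OSError from os.write or os.close propagate, and also calls os.fsync before closing, on the assumption that flushing makes deferred write errors surface while the caller can still react to them.

```python
import os
import tempfile


def write_all_checked(path: str, data: bytes) -> None:
    """Write data to path and raise OSError if any step fails."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may write fewer bytes than requested; continue from there.
            written = os.write(fd, view)
            view = view[written:]
        # Assumption: flushing here reports errors from earlier cached writes.
        os.fsync(fd)
    except OSError:
        # Release the descriptor, but report the original failure.
        try:
            os.close(fd)
        except OSError:
            pass
        raise
    # An error from close is also a write failure, so it is not swallowed.
    os.close(fd)


# Demonstration on a local temporary directory (no NFS needed to run this).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "out.bin")
write_all_checked(demo_path, b"payload")
with open(demo_path, "rb") as f:
    print(f.read() == b"payload")
os.remove(demo_path)
os.rmdir(demo_dir)
```

A caller would typically wrap the call in try/except OSError and then either retry or exit with an error, rather than carrying on with a possibly incomplete file.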

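Additional notes

  • This memo does not address whether NFS should be used at all for reading and writing data that needs integrity, or for storing executables.

*1: The nfs(5) man page says this option is ignored after kernel 2.6.25.

*2: Only part of the data that the I/O request was meant to write has been written.

*3: Only part of the data that the I/O request was meant to read has been read.

*4: Depends on the OS implementation; for example, abnormal termination of the application or a kernel panic.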