From b38f3aaba153414eb357dce18611772d2cffa1f6 Mon Sep 17 00:00:00 2001
From: Richard Yao
Date: Sat, 29 Aug 2015 12:59:05 -0400
Subject: Solid state drives should use noop IO elevator

It is often suggested that users set noop on SSDs and it turns out that
udev can do this for users. Setting noop disables the IO prioritization
and IO reordering logic inside the kernel, but leaves front/back merging
in place. This reduction in overhead should increase the number of
requests sent to solid state media to the maximum possible, which is
said to improve performance on SSDs. Unfortunately, few benchmarks try
real-world workloads with a clear cache to measure the actual
difference.

The benchmarks conducted by Daniel Nashed cleared the cache. They favor
noop, although the workload seems somewhat unrealistic:

http://blog.nashcom.de/nashcomblog.nsf/dx/linux-io-performance-tweek.htm

The BFQ developers' benchmarks on SSDs appear to account for both a
clear cache and realistic workloads. They show noop as being far better
than CFQ and second only to BFQ, which is out of tree:

https://lwn.net/Articles/600366/

In addition, I have experienced lockup-like effects on ext4 on an OCZ
Vertex 2 SSD with the discard mount option enabled when recursively
unlinking a subdirectory path that contains millions of files. The
system was useless for hours. Setting noop allowed the unlink to finish
in minutes. This is because the reordering from CFQ interleaved the TRIM
command with write IOs, effectively putting barriers between them
because TRIM is a non-queued command prior to SATA 3.1.

A good default should perform well in general and should minimize poor
performance in the worst case scenarios. The previous examples
contradict CFQ's ability to achieve that on solid state media. I believe
that we should implement a udev rule to set noop on solid state media by
default.

It should be said that Milan Broz wrote it first, although there is only
one way to write this rule in a manner consistent with the codebase:

http://permalink.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/6045

It should be said that this will be a regression for those that rely on
the "Block IO Controller" cgroup because it is only supported by CFQ
when CONFIG_CFQ_GROUP_IOSCHED=y. My experience as a ZoL developer is
that very few users rely on this behavior and consequently, I believe
that the benefit from enabling this far outweighs the harm to the few
that need it. Those that do need it should be able to disable this rule
themselves. Container management software that expects the Block IO
Controller to be supported should be modified to enable CFQ explicitly
if it does not already do that.

This has been tested against both a SATA mechanical drive and a SATA
solid state drive. It changes the elevator to noop on the solid state
drive, but does not touch it on the mechanical drive.

Signed-off-by: Richard Yao
---
 rules/60-block.rules | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/rules/60-block.rules b/rules/60-block.rules
index c74caca49f..3d1e1c0207 100644
--- a/rules/60-block.rules
+++ b/rules/60-block.rules
@@ -9,3 +9,6 @@ ACTION=="change", SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_device", TEST=="block",
 
 # watch metadata changes, caused by tools closing the device node which was opened for writing
 ACTION!="remove", SUBSYSTEM=="block", KERNEL=="loop*|nvme*|sd*|vd*|xvd*", OPTIONS+="watch"
+
+# set noop on solid state drives
+SUBSYSTEM=="block", ACTION=="add", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="noop"
-- 
cgit v1.2.3-54-g00ecf
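
The test described in the commit message (noop on the solid state drive,
elevator untouched on the mechanical drive) can be repeated by reading
sysfs. The following is a minimal sketch and not part of the patch. The
helper names active_scheduler and report are hypothetical. It assumes
the usual sysfs layout under /sys/block, where queue/rotational holds 0
or 1 and queue/scheduler lists the available elevators with the active
one in square brackets.

```python
from pathlib import Path


def active_scheduler(scheduler_text: str) -> str:
    """Return the active elevator from a queue/scheduler string.

    The kernel lists the available elevators and wraps the active one in
    square brackets, for example "noop deadline [cfq]".
    """
    for token in scheduler_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Assumption: with no bracketed token, treat the raw text as the value.
    return scheduler_text.strip()


def report(sys_block: Path = Path("/sys/block")) -> dict:
    """Map each block device name to (is_rotational, active_scheduler)."""
    result = {}
    if not sys_block.is_dir():
        return result  # not a Linux system, or sysfs is not mounted
    for dev in sorted(sys_block.iterdir()):
        queue = dev / "queue"
        try:
            rotational = (queue / "rotational").read_text().strip() == "1"
            scheduler = active_scheduler((queue / "scheduler").read_text())
        except OSError:
            continue  # queue attributes missing or unreadable for this entry
        result[dev.name] = (rotational, scheduler)
    return result


if __name__ == "__main__":
    for name, (rotational, scheduler) in report().items():
        kind = "rotational" if rotational else "non-rotational"
        print(f"{name}: {kind}, scheduler={scheduler}")
```

With the rule installed, a non-rotational device would be expected to
report noop, while a rotational device keeps whatever elevator it had
before.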