diff options
Diffstat (limited to 'Documentation/sysctl')
-rw-r--r-- | Documentation/sysctl/kernel.txt | 72 | ||||
-rw-r--r-- | Documentation/sysctl/net.txt | 11 | ||||
-rw-r--r-- | Documentation/sysctl/vm.txt | 50 |
3 files changed, 123 insertions, 10 deletions
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt index f77a2514e..01e0a4a86 100644 --- a/Documentation/sysctl/kernel.txt +++ b/Documentation/sysctl/kernel.txt @@ -59,12 +59,18 @@ show up in /proc/sys/kernel: - panic_on_stackoverflow - panic_on_unrecovered_nmi - panic_on_warn +- panic_on_rcu_stall +- perf_cpu_time_max_percent +- perf_event_paranoid +- perf_event_max_stack +- perf_event_max_contexts_per_stack - pid_max - powersave-nap [ PPC only ] - printk - printk_delay - printk_ratelimit - printk_ratelimit_burst +- pty ==> Documentation/filesystems/devpts.txt - randomize_va_space - real-root-dev ==> Documentation/initrd.txt - reboot-cmd [ SPARC only ] @@ -398,7 +404,7 @@ kernel stack. ============================================================== -iso_cpu: (BFS CPU scheduler only). +iso_cpu: (MuQSS CPU scheduler only). This sets the percentage cpu that the unprivileged SCHED_ISO tasks can run effectively at realtime priority, averaged over a rolling five @@ -625,6 +631,17 @@ a kernel rebuild when attempting to kdump at the location of a WARN(). ============================================================== +panic_on_rcu_stall: + +When set to 1, calls panic() after RCU stall detection messages. This +is useful to define the root cause of RCU stalls using a vmcore. + +0: do not panic() when RCU stall takes place, default behavior. + +1: panic() after printing RCU stall messages. + +============================================================== + perf_cpu_time_max_percent: Hints to the kernel how much CPU time it should be allowed to @@ -651,6 +668,43 @@ allowed to execute. ============================================================== +perf_event_paranoid: + +Controls use of the performance events system by unprivileged +users (without CAP_SYS_ADMIN). The default value is 2. + + -1: Allow use of (almost) all events by all users +>=0: Disallow raw tracepoint access by users without CAP_IOC_LOCK +>=1: Disallow CPU event access by users without CAP_SYS_ADMIN +>=2: Disallow kernel profiling by users without CAP_SYS_ADMIN + +============================================================== + +perf_event_max_stack: + +Controls maximum number of stack frames to copy for (attr.sample_type & +PERF_SAMPLE_CALLCHAIN) configured events, for instance, when using +'perf record -g' or 'perf trace --call-graph fp'. + +This can only be done when no events are in use that have callchains +enabled, otherwise writing to this file will return -EBUSY. + +The default value is 127. + +============================================================== + +perf_event_max_contexts_per_stack: + +Controls maximum number of stack frame context entries for +(attr.sample_type & PERF_SAMPLE_CALLCHAIN) configured events, for +instance, when using 'perf record -g' or 'perf trace --call-graph fp'. + +This can only be done when no events are in use that have callchains +enabled, otherwise writing to this file will return -EBUSY. + +The default value is 8. + +============================================================== pid_max: @@ -722,6 +776,20 @@ send before ratelimiting kicks in. ============================================================== +printk_devkmsg: + +Control the logging to /dev/kmsg from userspace: + +ratelimit: default, ratelimited +on: unlimited logging to /dev/kmsg from userspace +off: logging to /dev/kmsg disabled + +The kernel command line parameter printk.devkmsg= overrides this and is +a one-time setting until next reboot: once set, it cannot be changed by +this sysctl interface anymore. + +============================================================== + randomize_va_space: This option can be used to select the type of process address @@ -762,7 +830,7 @@ rebooting. ??? ============================================================== -rr_interval: (BFS CPU scheduler only) +rr_interval: (MuQSS CPU scheduler only) This is the smallest duration that any cpu process scheduling unit will run for. Increasing this value can increase throughput of cpu diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt index 809ab6efc..f0480f7ea 100644 --- a/Documentation/sysctl/net.txt +++ b/Documentation/sysctl/net.txt @@ -43,6 +43,17 @@ Values : 1 - enable the JIT 2 - enable the JIT and ask the compiler to emit traces on kernel log. +bpf_jit_harden +-------------- + +This enables hardening for the Berkeley Packet Filter Just in Time compiler. +Supported are eBPF JIT backends. Enabling hardening trades off performance, +but can mitigate JIT spraying. +Values : + 0 - disable JIT hardening (default value) + 1 - enable JIT hardening for unprivileged users only + 2 - enable JIT hardening for all users + dev_weight -------------- diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt index 89a887c76..95ccbe6d7 100644 --- a/Documentation/sysctl/vm.txt +++ b/Documentation/sysctl/vm.txt @@ -57,9 +57,11 @@ Currently, these files are in /proc/sys/vm: - panic_on_oom - percpu_pagelist_fraction - stat_interval +- stat_refresh - swappiness - user_reserve_kbytes - vfs_cache_pressure +- watermark_scale_factor - zone_reclaim_mode ============================================================== @@ -581,15 +583,16 @@ Specify "[Nn]ode" for node order "Zone Order" orders the zonelists by zone type, then by node within each zone. Specify "[Zz]one" for zone order. -Specify "[Dd]efault" to request automatic configuration. Autoconfiguration -will select "node" order in following case. -(1) if the DMA zone does not exist or -(2) if the DMA zone comprises greater than 50% of the available memory or -(3) if any node's DMA zone comprises greater than 70% of its local memory and - the amount of local memory is big enough. +Specify "[Dd]efault" to request automatic configuration. -Otherwise, "zone" order will be selected. Default order is recommended unless -this is causing problems for your system/application. +On 32-bit, the Normal zone needs to be preserved for allocations accessible +by the kernel, so "zone" order will be selected. + +On 64-bit, devices that require DMA32/DMA are relatively rare, so "node" +order will be selected. + +Default order is recommended unless this is causing problems for your +system/application. ============================================================== @@ -754,6 +757,19 @@ is 1 second. ============================================================== +stat_refresh + +Any read or write (by root only) flushes all the per-cpu vm statistics +into their global totals, for more accurate reports when testing +e.g. cat /proc/sys/vm/stat_refresh /proc/meminfo + +As a side-effect, it also checks for negative totals (elsewhere reported +as 0) and "fails" with EINVAL if any are found, with a warning in dmesg. +(At time of writing, a few stats are known sometimes to be found negative, +with no ill effects: errors and warnings on these stats are suppressed.) + +============================================================== + swappiness This control is used to define how aggressive the kernel will swap @@ -803,6 +819,24 @@ performance impact. Reclaim code needs to take various locks to find freeable directory and inode objects. With vfs_cache_pressure=1000, it will look for ten times more freeable objects than there are. +============================================================= + +watermark_scale_factor: + +This factor controls the aggressiveness of kswapd. It defines the +amount of memory left in a node/system before kswapd is woken up and +how much memory needs to be free before kswapd goes back to sleep. + +The unit is in fractions of 10,000. The default value of 10 means the +distances between watermarks are 0.1% of the available memory in the +node/system. The maximum value is 1000, or 10% of memory. + +A high rate of threads entering direct reclaim (allocstall) or kswapd +going to sleep prematurely (kswapd_low_wmark_hit_quickly) can indicate +that the number of free pages kswapd maintains for latency reasons is +too small for the allocation bursts occurring in the system. This knob +can then be used to tune kswapd aggressiveness accordingly. + ============================================================== zone_reclaim_mode: |