Diffstat (limited to 'Documentation/sysctl')
-rw-r--r-- Documentation/sysctl/kernel.txt | 72
-rw-r--r-- Documentation/sysctl/net.txt    | 11
-rw-r--r-- Documentation/sysctl/vm.txt     | 50
3 files changed, 123 insertions, 10 deletions
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index f77a2514e..01e0a4a86 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -59,12 +59,18 @@ show up in /proc/sys/kernel:
- panic_on_stackoverflow
- panic_on_unrecovered_nmi
- panic_on_warn
+- panic_on_rcu_stall
+- perf_cpu_time_max_percent
+- perf_event_paranoid
+- perf_event_max_stack
+- perf_event_max_contexts_per_stack
- pid_max
- powersave-nap [ PPC only ]
- printk
- printk_delay
- printk_ratelimit
- printk_ratelimit_burst
+- pty ==> Documentation/filesystems/devpts.txt
- randomize_va_space
- real-root-dev ==> Documentation/initrd.txt
- reboot-cmd [ SPARC only ]
@@ -398,7 +404,7 @@ kernel stack.
==============================================================
-iso_cpu: (BFS CPU scheduler only).
+iso_cpu: (MuQSS CPU scheduler only).
This sets the percentage cpu that the unprivileged SCHED_ISO tasks can
run effectively at realtime priority, averaged over a rolling five
@@ -625,6 +631,17 @@ a kernel rebuild when attempting to kdump at the location of a WARN().
==============================================================
+panic_on_rcu_stall:
+
+When set to 1, calls panic() after printing RCU stall detection messages.
+This is useful for determining the root cause of RCU stalls from a vmcore.
+
+0: do not panic() when RCU stall takes place, default behavior.
+
+1: panic() after printing RCU stall messages.
+
+==============================================================
+
perf_cpu_time_max_percent:
Hints to the kernel how much CPU time it should be allowed to
@@ -651,6 +668,43 @@ allowed to execute.
==============================================================
+perf_event_paranoid:
+
+Controls use of the performance events system by unprivileged
+users (without CAP_SYS_ADMIN). The default value is 2.
+
+ -1: Allow use of (almost) all events by all users
+>=0: Disallow raw tracepoint access by users without CAP_SYS_ADMIN
+>=1: Disallow CPU event access by users without CAP_SYS_ADMIN
+>=2: Disallow kernel profiling by users without CAP_SYS_ADMIN
+
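The thresholds above read as cumulative: each level keeps the restrictions
of the lower levels and adds its own. A minimal Python sketch of that
reading follows. The helper name restrictions_for and the short strings are
illustrative assumptions that paraphrase the list above; they are not kernel
interfaces.

```python
# Hypothetical helper, not a kernel API: list what a given
# perf_event_paranoid level disallows for unprivileged users.
RESTRICTIONS = [
    (0, "raw tracepoint access"),
    (1, "CPU event access"),
    (2, "kernel profiling"),
]


def restrictions_for(level):
    """Return the restrictions in force at this level (cumulative)."""
    return [what for threshold, what in RESTRICTIONS if level >= threshold]


# At -1 nothing in the list applies; at the default of 2 all three entries do.
print(restrictions_for(-1))
print(restrictions_for(2))
```
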
+==============================================================
+
+perf_event_max_stack:
+
+Controls maximum number of stack frames to copy for (attr.sample_type &
+PERF_SAMPLE_CALLCHAIN) configured events, for instance, when using
+'perf record -g' or 'perf trace --call-graph fp'.
+
+This value can only be changed while no events with callchains enabled
+are in use; otherwise writing to this file returns -EBUSY.
+
+The default value is 127.
+
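A tool that adjusts this limit probably wants to treat the busy case as a
retryable condition rather than a hard failure. The Python sketch below
shows one way to do that. The function name write_sysctl is hypothetical,
and the sketch assumes the value is written through the usual /proc/sys
file; Python reports the kernel's -EBUSY as an OSError whose errno is
errno.EBUSY.

```python
import errno


def write_sysctl(path, value):
    """Write value to a sysctl file.

    Returns True on success and False if the kernel reports EBUSY
    (for this knob: callchain events are still in use). Any other
    error is re-raised. Hypothetical helper, for illustration only.
    """
    try:
        with open(path, "w") as f:
            f.write(f"{value}\n")
        return True
    except OSError as e:
        if e.errno == errno.EBUSY:
            return False
        raise


# Example call (needs root, so it is left commented out):
# write_sysctl("/proc/sys/kernel/perf_event_max_stack", 64)
```
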
+==============================================================
+
+perf_event_max_contexts_per_stack:
+
+Controls maximum number of stack frame context entries for
+(attr.sample_type & PERF_SAMPLE_CALLCHAIN) configured events, for
+instance, when using 'perf record -g' or 'perf trace --call-graph fp'.
+
+This value can only be changed while no events with callchains enabled
+are in use; otherwise writing to this file returns -EBUSY.
+
+The default value is 8.
+
+==============================================================
pid_max:
@@ -722,6 +776,20 @@ send before ratelimiting kicks in.
==============================================================
+printk_devkmsg:
+
+Control the logging to /dev/kmsg from userspace:
+
+ratelimit: default, ratelimited
+on: unlimited logging to /dev/kmsg from userspace
+off: logging to /dev/kmsg disabled
+
+The kernel command line parameter printk.devkmsg= overrides this and is
+a one-time setting until next reboot: once set, it cannot be changed by
+this sysctl interface anymore.
+
+==============================================================
+
randomize_va_space:
This option can be used to select the type of process address
@@ -762,7 +830,7 @@ rebooting. ???
==============================================================
-rr_interval: (BFS CPU scheduler only)
+rr_interval: (MuQSS CPU scheduler only)
This is the smallest duration that any cpu process scheduling unit
will run for. Increasing this value can increase throughput of cpu
diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
index 809ab6efc..f0480f7ea 100644
--- a/Documentation/sysctl/net.txt
+++ b/Documentation/sysctl/net.txt
@@ -43,6 +43,17 @@ Values :
1 - enable the JIT
2 - enable the JIT and ask the compiler to emit traces on kernel log.
+bpf_jit_harden
+--------------
+
+This enables hardening for the Berkeley Packet Filter Just in Time compiler.
+Only eBPF JIT backends are supported. Enabling hardening costs some
+performance, but can mitigate JIT spraying.
+Values :
+ 0 - disable JIT hardening (default value)
+ 1 - enable JIT hardening for unprivileged users only
+ 2 - enable JIT hardening for all users
+
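The three values can be read as a small decision table. The Python sketch
below restates it; hardening_applies and its privileged parameter are
hypothetical names for illustration, and how the kernel decides that a
user is privileged is outside this sketch.

```python
def hardening_applies(bpf_jit_harden, privileged):
    """Return True if JIT hardening applies to a program loaded by this user.

    Illustrative restatement of the documented values, not kernel code.
    """
    if bpf_jit_harden == 0:
        return False            # hardening disabled (default)
    if bpf_jit_harden == 1:
        return not privileged   # unprivileged users only
    if bpf_jit_harden == 2:
        return True             # all users
    raise ValueError("bpf_jit_harden must be 0, 1 or 2")


print(hardening_applies(1, privileged=False))
```
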
dev_weight
--------------
diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 89a887c76..95ccbe6d7 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -57,9 +57,11 @@ Currently, these files are in /proc/sys/vm:
- panic_on_oom
- percpu_pagelist_fraction
- stat_interval
+- stat_refresh
- swappiness
- user_reserve_kbytes
- vfs_cache_pressure
+- watermark_scale_factor
- zone_reclaim_mode
==============================================================
@@ -581,15 +583,16 @@ Specify "[Nn]ode" for node order
"Zone Order" orders the zonelists by zone type, then by node within each
zone. Specify "[Zz]one" for zone order.
-Specify "[Dd]efault" to request automatic configuration. Autoconfiguration
-will select "node" order in following case.
-(1) if the DMA zone does not exist or
-(2) if the DMA zone comprises greater than 50% of the available memory or
-(3) if any node's DMA zone comprises greater than 70% of its local memory and
- the amount of local memory is big enough.
+Specify "[Dd]efault" to request automatic configuration.
-Otherwise, "zone" order will be selected. Default order is recommended unless
-this is causing problems for your system/application.
+On 32-bit, the Normal zone needs to be preserved for allocations accessible
+by the kernel, so "zone" order will be selected.
+
+On 64-bit, devices that require DMA32/DMA are relatively rare, so "node"
+order will be selected.
+
+Default order is recommended unless this is causing problems for your
+system/application.
==============================================================
@@ -754,6 +757,19 @@ is 1 second.
==============================================================
+stat_refresh
+
+Any read or write (by root only) flushes all the per-cpu vm statistics
+into their global totals, for more accurate reports when testing
+e.g. cat /proc/sys/vm/stat_refresh /proc/meminfo
+
+As a side-effect, it also checks for negative totals (elsewhere reported
+as 0) and "fails" with EINVAL if any are found, with a warning in dmesg.
+(At time of writing, a few stats are known to go negative occasionally,
+with no ill effects: errors and warnings on these stats are suppressed.)
+
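The cat example above can also be done from a script that wants to tell
the EINVAL case apart from other failures. A minimal Python sketch follows;
the function name refresh_then_read and its return shape are assumptions
made for illustration, and the default paths only do something useful on a
kernel that provides stat_refresh.

```python
import errno


def refresh_then_read(refresh_path="/proc/sys/vm/stat_refresh",
                      report_path="/proc/meminfo"):
    """Trigger a stats flush by reading refresh_path, then read the report.

    Returns (report_text, negative_totals_found). EINVAL from the
    refresh file is treated as "negative totals were found", as the
    text above describes; other errors are re-raised.
    Hypothetical helper, for illustration only.
    """
    negative = False
    try:
        with open(refresh_path) as f:
            f.read()
    except OSError as e:
        if e.errno != errno.EINVAL:
            raise
        negative = True
    with open(report_path) as f:
        return f.read(), negative
```
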
+==============================================================
+
swappiness
This control is used to define how aggressive the kernel will swap
@@ -803,6 +819,24 @@ performance impact. Reclaim code needs to take various locks to find freeable
directory and inode objects. With vfs_cache_pressure=1000, it will look for
ten times more freeable objects than there are.
+==============================================================
+
+watermark_scale_factor:
+
+This factor controls the aggressiveness of kswapd. It defines the
+amount of memory left in a node/system before kswapd is woken up and
+how much memory needs to be free before kswapd goes back to sleep.
+
+The unit is in fractions of 10,000. The default value of 10 means the
+distances between watermarks are 0.1% of the available memory in the
+node/system. The maximum value is 1000, or 10% of memory.
+
+A high rate of threads entering direct reclaim (allocstall) or kswapd
+going to sleep prematurely (kswapd_low_wmark_hit_quickly) can indicate
+that the number of free pages kswapd maintains for latency reasons is
+too small for the allocation bursts occurring in the system. This knob
+can then be used to tune kswapd aggressiveness accordingly.
+
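The unit arithmetic described above can be checked with a few lines of
Python. This is a simplified sketch: the function name watermark_distance
and the managed_bytes parameter are hypothetical, and it only applies the
"fractions of 10,000" rule and the documented maximum of 1000, ignoring any
other inputs the kernel may use when it sets the real watermarks.

```python
def watermark_distance(managed_bytes, watermark_scale_factor=10):
    """Approximate distance between watermarks, in bytes.

    Simplified illustration of the documented unit: the factor is a
    fraction of 10,000 of the available memory in the node/system.
    """
    if watermark_scale_factor > 1000:
        raise ValueError("documented maximum is 1000 (10% of memory)")
    return managed_bytes * watermark_scale_factor // 10_000


# Default factor 10 is 0.1%; the maximum of 1000 is 10%.
print(watermark_distance(16 * 2**30))        # 16 GiB, default factor
print(watermark_distance(16 * 2**30, 1000))  # 16 GiB, maximum factor
```
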
==============================================================
zone_reclaim_mode: