GNU/Linux Keyboard Maps: xmodmap
The modmap subsystem is part of the core X11 protocol. However, it has been replaced by the X Keyboard (XKB) Extension to the protocol, which defines a facade that emulates the legacy modmap subsystem so that old programs still work—including those that manipulate the modmap directly!
For people who like to Keep It Stupid Simple, the XKB extension looks horribly complicated and gross; even ignoring protocol details, the configuration syntax is a monstrosity! There’s no way to say something like “I’d like to remap Caps-Lock to be Control”; you have to copy and edit the entire keyboard definition, which includes mucking with vector graphics of the physical keyboard layout! So it’s very tempting to pretend that XKB doesn’t exist, and that X is still using modmap.
However, this is a leaky abstraction; for instance: when running the xmodmap command to manipulate the modmap, if you have multiple keyboards plugged in, the result can depend on which keyboard you used to press “enter” after typing the command!
Even though the modmap subsystem exists only as a compatibility shim today, I think it is important to understand it in order to understand modern XKB.
Conceptual overview
There are 3 fundamental tasks that the modmap subsystem performs:
- keyboard: map keycode -> keysym (client-side)
- keyboard: map keycode -> modifier bitmask (server-side)
- pointer: map physical button -> logical button (server-side)
You’re thinking: “Great, so the X server does these things for us!” Nope! Not entirely, anyway. It does the keycode->modifier lookup, and the mouse-button lookup, but the keycode->keysym lookup must be done client-side by querying the mapping stored on the server. Generally, this is done automatically inside of libX11/libxcb, and the actual client application code doesn’t need to worry about it.
So, what’s the difference between a keycode and a keysym, and how’s the modifier bitmask work?
- keycode: A numeric ID for a hardware button; this is as close to the hardware as X11 modmaps let us get. These are conceptually identical to Linux kernel keycodes, but the numbers don’t match up. Xorg keycodes are typically linux_keycode+8.
- keysym: A 29-bit integer code that is meaningful to applications. A mapping of these to symbolic names is defined in <X11/keysymdef.h> and augmented by /usr/share/X11/XKeysymDB. See: XStringToKeysym() and XKeysymToString(). We will generally use the symbolic name in the modmap file. The symbolic names are case-sensitive.
- Modifier state: An 8-bit bitmask of modifier keys (names are case-insensitive):
  1 << 0 : shift
  1 << 1 : lock
  1 << 2 : control
  1 << 3 : mod1
  1 << 4 : mod2
  1 << 5 : mod3
  1 << 6 : mod4
  1 << 7 : mod5
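To make the bitmask concrete, here is a small sketch in Python (not something libX11 provides; the function names are made up for illustration). It decodes an 8-bit modifier state into the names listed above, and applies the usual +8 offset between Linux and Xorg keycodes. The only outside fact assumed is that the Linux kernel numbers the "A" key as 30.

```python
# Modifier bit positions, in the order given in the list above.
MODIFIER_NAMES = ["shift", "lock", "control", "mod1", "mod2", "mod3", "mod4", "mod5"]

def decode_modifiers(mask):
    """Return the names of the modifier bits that are set in an 8-bit mask."""
    return [name for bit, name in enumerate(MODIFIER_NAMES) if mask & (1 << bit)]

def xorg_keycode(linux_keycode):
    """Typical Xorg keycode for a Linux kernel keycode (offset of 8)."""
    return linux_keycode + 8

print(decode_modifiers(0b00000101))  # ['shift', 'control']
print(xorg_keycode(30))              # 38 (assuming the kernel's KEY_A is 30)
```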
With that knowledge, and the libX11/libxcb API docs, you can probably figure out how to interact with the modmap subsystem from C, but who does that? Everyone just uses the xmodmap(1) command.
The X11 protocol
As I said, the modifier and button lookup is handled server-side; each of the input events ({Key,Button}{Press,Release}, and MotionNotify) and pointer window events ({Enter,Leave}Notify) includes a bitmask of active keyboard modifiers and pointer buttons. Each is given an 8-bit bitmask, hence 8 key modifiers. For some reason, only up to Button5 is included in the button bitmask, and its upper 3 bits are always zero; but the Button{Press,Release} events will happily deliver events for up to Button255!
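As a rough illustration of those two bitmasks, the sketch below splits an event's state value into its halves. It assumes, as I understand the core protocol, that the two 8-bit masks travel together in one 16-bit field with the modifiers in the low byte and Button1 through Button5 in bits 8 through 12. The function name is hypothetical.

```python
def split_state(state):
    """Split a 16-bit event state into (modifier mask, list of pressed buttons)."""
    modifiers = state & 0xFF        # low byte: the 8 modifier bits
    buttons = (state >> 8) & 0xFF   # high byte: Button1..Button5, top 3 bits unused
    pressed = [n for n in range(1, 6) if buttons & (1 << (n - 1))]
    return modifiers, pressed

# Control (1 << 2) held while Button1 is down.
print(split_state(0x0104))  # (4, [1])
```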
The X11 protocol has 6 request types for dealing with these 3 mappings; an accessor and a mutator pair for each. Since 2 of the mappings are done server-side, most clients will only use GetKeyboardMapping. Anyway, let’s look at those 6 requests, grouped by the mappings that they work with (pardon the Java-like pseudo-code syntax for indicating logical argument and return types):
keyboard: map keycode -> keysym
- GetKeyboardMapping :: List<keycode> -> Map<keycode,List<keysym>>
- ChangeKeyboardMapping :: Map<keycode,List<keysym>> -> ()
GetKeyboardMapping returns the keycode->keysym mappings for the requested keycodes; this way clients can choose to look up only the keycodes that they need to handle (the ones that got sent to them). Each keycode gets a list of keysyms; which keysym they should use from that list depends on which modifiers are pressed.
ChangeKeyboardMapping changes the mapping for the given keycodes; not all keycodes must be given, any keycodes that aren’t included in the request aren’t changed.
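Here is a deliberately simplified sketch of how a client might pick one keysym out of a keycode's list. It assumes the column order that xmodmap's keycode command uses (plain, Shift, Mode_switch, Mode_switch+Shift) and ignores everything else the real lookup rules deal with (Lock, NumLock, automatic upper/lower case). The fallback behaviour for short lists is my own approximation, and the keycodes in the example map are only typical values.

```python
def pick_keysym(keysyms, shift=False, mode_switch=False):
    """Pick a keysym from a keycode's list; None stands in for 'no symbol'."""
    # Preferred column first, then drop Mode_switch, then fall back to plain.
    candidates = [(2 if mode_switch else 0) + (1 if shift else 0)]
    if mode_switch:
        candidates.append(1 if shift else 0)
    candidates.append(0)
    for index in candidates:
        if index < len(keysyms) and keysyms[index] is not None:
            return keysyms[index]
    return None

# A toy keycode -> keysyms map; 38 and 10 are illustrative keycodes.
keymap = {38: ["a", "A"], 10: ["1", "exclam"]}
print(pick_keysym(keymap[38]))              # a
print(pick_keysym(keymap[10], shift=True))  # exclam
```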
keyboard: map keycode -> modifier bitmask
- GetModifierMapping :: () -> Map<modifier,List<keycode>>
- SetModifierMapping :: Map<modifier,List<keycode>> -> ()
The modifier mapping is a lot smaller than the keysym mapping; you must operate on the entire mapping at once. For each modifier bit, there’s a list of keycodes that will cause that modifier bit to be set in the events that are delivered while one of those keys is pressed.
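To illustrate what the server does with that mapping, the sketch below models it as a plain dictionary and computes the modifier bitmask for a set of held-down keycodes. This is a toy model, not server code: the function name is hypothetical, and the keycodes in the example are merely typical values for the Shift, Caps Lock, and Control keys.

```python
MODIFIER_NAMES = ["shift", "lock", "control", "mod1", "mod2", "mod3", "mod4", "mod5"]

def modifier_mask(modmap, pressed_keycodes):
    """Compute the 8-bit modifier mask for the keycodes currently held down."""
    mask = 0
    for bit, name in enumerate(MODIFIER_NAMES):
        # A bit is set if any keycode listed for that modifier is pressed.
        if any(kc in pressed_keycodes for kc in modmap.get(name, [])):
            mask |= 1 << bit
    return mask

# Illustrative mapping: Map<modifier, List<keycode>>.
modmap = {"shift": [50, 62], "lock": [66], "control": [37, 105]}
print(modifier_mask(modmap, {50, 37}))  # 5  (shift | control)
```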
pointer: map physical button -> logical button
- GetPointerMapping :: () -> List<logicalButton> (indexed by physicalButton-1)
- SetPointerMapping :: List<logicalButton> -> () (indexed by physicalButton-1)
Like the modifier mapping, the button mapping is expected to be small. Most mice only have 5-7 buttons (left, middle, right, scroll up, scroll down, scroll left, scroll right; that’s right, X11 handles scroll events as button presses), though some fancy gaming mice have more than that, but not much more.
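The "indexed by physicalButton-1" detail is easy to get wrong, so here is a tiny sketch of the lookup. The function name is made up, and treating a logical button of 0 as "disabled" follows what the xmodmap pointer command allows.

```python
def logical_button(pointer_map, physical_button):
    """Translate a physical button (1-based) to its logical button, or None if disabled."""
    logical = pointer_map[physical_button - 1]  # the list is indexed by physicalButton-1
    return None if logical == 0 else logical

# Swap scroll up (4) and scroll down (5), and disable the middle button.
pointer_map = [1, 0, 3, 5, 4]
print(logical_button(pointer_map, 4))  # 5
print(logical_button(pointer_map, 2))  # None
```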
I mentioned earlier that the keycode->keysym mapping isn’t actually done by the X server, and is done in the client; whenever a client receives a key event or pointer button event, it must do a Get*Mapping request to see what that translates to. Of course, doing that for every keystroke would be crazy; but at the same time, each client is expected to know about changes to the mappings that happen at run-time. So, each of the “set”/“change” commands generates a MappingNotify event that is sent to all clients, so they know when they must dump their cache of mappings.
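The caching pattern looks roughly like the sketch below. Nothing here talks to a real X server: the class and the fetch callback are hypothetical stand-ins for a GetKeyboardMapping round trip, and on_mapping_notify is what a client's event loop would call when a MappingNotify event arrives.

```python
class KeymapCache:
    """Client-side cache of keycode -> keysym lists."""

    def __init__(self, fetch):
        self._fetch = fetch   # stand-in for a GetKeyboardMapping request
        self._cache = {}

    def keysyms(self, keycode):
        # Only ask the server the first time a keycode is seen.
        if keycode not in self._cache:
            self._cache[keycode] = self._fetch(keycode)
        return self._cache[keycode]

    def on_mapping_notify(self):
        # Someone changed a mapping; everything cached may be stale.
        self._cache.clear()

requests = []
def fake_fetch(keycode):
    requests.append(keycode)   # record each simulated round trip
    return ["a", "A"]

cache = KeymapCache(fake_fetch)
cache.keysyms(38)
cache.keysyms(38)              # served from the cache
cache.on_mapping_notify()
cache.keysyms(38)              # fetched again after the notify
print(len(requests))           # 2
```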
For completeness, if you are looking at this as background for understanding XKB, I should also mention:
The xmodmap command
The xmodmap command reads a configuration file and modifies the maps in the X server to match. The xmodmap config file has its own little quirky syntax. For one, the comment character is ! (and comments may only start at the beginning of the line, but that’s fairly common).
There are 8 commands that xmodmap recognizes. Let’s look at those, grouped by the 3 tasks that the modmap subsystem performs:
keyboard: map keycode -> keysym
- keycode KEYCODE = PLAIN [SHIFT [MODE_SWITCH [MODE_SWITCH+SHIFT ]]]
  Actually takes a list of up to 8 keysyms, but only the first 4 have standard uses.
- keysym OLD_KEYSYM = NEW_KEYSYMS...
  Takes the keycodes mapped to OLD_KEYSYM and maps them to NEW_KEYSYMS.
- keysym any = KEYSYMS...
  Finds an otherwise unused keycode, and has it map to the specified keysyms.
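As a sketch of what the two keysym forms do to the keyboard map, the Python below models the map as a dictionary from keycode to keysym list. This is not xmodmap's actual code: the function name is hypothetical, matching OLD_KEYSYM against any column is an assumption, and which unused keycode the real tool picks for "any" is not something this tries to reproduce. The 8..255 range is the widest the core protocol allows; a real server reports its own range.

```python
MIN_KEYCODE, MAX_KEYCODE = 8, 255  # widest keycode range the core protocol allows

def apply_keysym(keymap, old, new_keysyms):
    """Apply `keysym OLD = NEW...` to keymap; return the keycodes that changed."""
    if old == "any":
        # Take the first keycode that has no keysyms bound to it (if there is one).
        targets = [kc for kc in range(MIN_KEYCODE, MAX_KEYCODE + 1)
                   if not keymap.get(kc)][:1]
    else:
        # Every keycode that currently produces OLD gets the new list.
        targets = [kc for kc, syms in keymap.items() if old in syms]
    for kc in targets:
        keymap[kc] = list(new_keysyms)
    return targets

keymap = {37: ["Control_L"], 66: ["Caps_Lock"]}
print(apply_keysym(keymap, "Caps_Lock", ["Control_L"]))  # [66]
print(apply_keysym(keymap, "any", ["F13"]))              # [8]
```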
keyboard: map keycode -> modifier bitmask
- clear MODIFIERNAME
- add MODIFIERNAME = KEYSYMS...
- remove MODIFIERNAME = KEYSYMS...
Wait, the modmap subsystem maps keycodes to modifiers, but the commands take keysyms? Yup! When executing one of these commands, xmodmap first looks up those keysyms in the keyboard map to translate them into a set of keycodes, then associates those keycodes with that modifier. But how does it look up keysym->keycode, when the protocol only supports querying keycode->keysym? It loops over every keycode, finding all the matches.
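That reverse lookup, and the way an add command uses it, can be sketched as follows. Both maps are plain dictionaries here and the function names are hypothetical; the point is only the brute-force loop over every keycode. The example keycodes are typical values, not guaranteed ones.

```python
def keycodes_for(keymap, keysym):
    """Find every keycode whose keysym list contains keysym (brute-force scan)."""
    return sorted(kc for kc, syms in keymap.items() if keysym in syms)

def add_modifier(modmap, keymap, modifier, keysyms):
    """Model `add MODIFIERNAME = KEYSYMS...` on a modifier -> keycodes map."""
    keycodes = modmap.setdefault(modifier.lower(), [])  # names are case-insensitive
    for keysym in keysyms:
        for kc in keycodes_for(keymap, keysym):
            if kc not in keycodes:
                keycodes.append(kc)
    return modmap

# After remapping Caps Lock (66) to Control_L, make it act as Control too.
keymap = {37: ["Control_L"], 66: ["Control_L"], 50: ["Shift_L"]}
modmap = {"control": [37], "lock": []}
print(add_modifier(modmap, keymap, "Control", ["Control_L"]))
# {'control': [37, 66], 'lock': []}
```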
pointer: map physical button -> logical button
- pointer = default
  This is equivalent to pointer = 1 2 3 4 5 6... where the list is as long as the number of buttons that there are.
- pointer = NUMBERS...
  pointer = A B C D... sets the physical button 1 to logical button A, physical button 2 to logical button B, and so on. Setting a physical button to logical button 0 disables that button.
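For illustration, here is a minimal parser for the two pointer forms, producing the list that would be handed to SetPointerMapping. It is a sketch, not xmodmap's parser: the function name and the n_buttons parameter (the number of physical buttons, which a real tool would ask the server for) are assumptions.

```python
def parse_pointer(line, n_buttons):
    """Parse a `pointer = ...` line into a list indexed by physical_button - 1."""
    command, _, args = line.partition("=")
    if command.strip() != "pointer":
        raise ValueError("not a pointer command: %r" % line)
    words = args.split()
    if words == ["default"]:
        # Identity mapping: physical button N is logical button N.
        return list(range(1, n_buttons + 1))
    return [int(word) for word in words]

print(parse_pointer("pointer = default", 5))  # [1, 2, 3, 4, 5]
print(parse_pointer("pointer = 3 2 1", 3))    # [3, 2, 1]
```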
Appendix:
I use this snippet in my Emacs configuration to make editing xmodmap files nicer:
;; http://www.emacswiki.org/emacs/XModMapMode
(when (not (fboundp 'xmodmap-mode))
  (define-generic-mode 'xmodmap-mode
    '(?!)
    '("add" "clear" "keycode" "keysym" "pointer" "remove")
    nil
    '("[xX]modmap\\(rc\\)?\\'")
    nil
    "Simple mode for xmodmap files."))