summaryrefslogtreecommitdiff
path: root/src/shared/hashmap.c
AgeCommit message (Collapse)Author
2014-12-30tree-wide: spelling fixesVeres Lajos
https://github.com/vlajos/misspell_fixer https://github.com/torstehu/systemd/commit/b6fdeb618cf2f3ce1645b3315f15f482710c7ffa Thanks to Torstein Husebo <torstein@huseboe.net>.
2014-12-13configure.ac: add a generic --enable-debug, replace --enable-hashmap-debugMichal Schmidt
There will be more debugging options later. --enable-debug will enable them all. --enable-debug=hashmap will enable only hashmap debugging. Also rename the C #define to ENABLE_DEBUG_* pattern.
2014-12-11tree-wide: use our memset() macros instead of memset() itselfLennart Poettering
2014-11-20set: make set_consume() actually free the allocated string if the string ↵Lennart Poettering
already is in the set
2014-10-30hashmap: rewrite the implementationMichal Schmidt
This is a rewrite of the hashmap implementation. Its advantage is lower memory usage. It uses open addressing (entries are stored in an array, as opposed to linked lists). Hash collisions are resolved with linear probing and Robin Hood displacement policy. See the references in hashmap.c. Some fun empirical findings about hashmap usage in systemd on my laptop: - 98 % of allocated hashmaps are Sets. - Sets contain 78 % of all entries, plain Hashmaps 17 %, and OrderedHashmaps 5 %. - 60 % of allocated hashmaps contain only 1 entry. - 90 % of allocated hashmaps contain 5 or fewer entries. - 75 % of all entries are in hashmaps that use trivial_hash_ops. Clearly it makes sense to: - store entries in distinct entry types. Especially for Sets - their entries are the most numerous and they require the least information to store an entry. - have a way to store small numbers of entries directly in the hashmap structs, and only allocate the usual entry arrays when the direct storage is full. The implementation has an optional debugging feature (enabled by defining the ENABLE_HASHMAP_DEBUG macro), where it: - tracks all allocated hashmaps in a linked list so that one can easily find them in gdb, - tracks which function/line allocated a given hashmap, and - checks for invalid mixing of hashmap iteration and modification. Since entries are not allocated one-by-one anymore, mempools are not used for entries. Originally I meant to drop mempools entirely, but it's still worth it to use them for the hashmap structs. My testing indicates that it makes loading of units about 5 % faster (a test with 10000 units where more than 200000 hashmaps are allocated - pure malloc: 449±4 ms, mempools: 427±7 ms). Here are some memory usage numbers, taken on my laptop with a more or less normal Fedora setup after booting with SELinux disabled (SELinux increases systemd's memory usage significantly): systemd (PID 1) Original New Change dirty memory (from pmap -x 1) [KiB] 2152 1264 -41 % total heap allocations (from gdb-heap) [KiB] 1623 756 -53 %
2014-10-23hashmap: allow hashmap_move() to failMichal Schmidt
It cannot fail in the current hashmap implementation, but it may fail in alternative implementations (unless a sufficiently large reservation has been placed beforehand).
2014-10-23hashmap: introduce hashmap_reserve()Michal Schmidt
With the current hashmap implementation that uses chaining, placing a reservation can serve two purposes: - To optimize putting of entries if the number of entries to put is known. The reservation allocates buckets, so later resizing can be avoided. - To avoid having very long bucket chains after using hashmap_move(_one). In an alternative hashmap implementation it will serve an additional purpose: - To guarantee a subsequent hashmap_move(_one) will not fail with -ENOMEM (this never happens in the current implementation).
2014-10-23hashmap: return more information from resize_buckets()Michal Schmidt
Return 0 if no resize was needed, 1 if successfully resized and negative on error.
2014-10-23shared: split mempool implementation from hashmapsMichal Schmidt
2014-10-23hashmap: drop assert(h) from hashmap_next()Michal Schmidt
It's handled just fine by returning NULL.
2014-10-23hashmap: hashmap_move_one() should return -ENOENT when 'other' is NULLMichal Schmidt
-ENOENT is the same return value as if 'other' were an allocated hashmap that does not contain the key. A NULL hashmap is a possible way of expressing a hashmap that contains no key.
2014-09-15hashmap: minor hashmap_replace optimizationMichal Schmidt
When hashmap_replace detects no such key exists yet, it calls hashmap_put that performs the same check again. Avoid that by splitting the core of hashmap_put into a separate function.
2014-09-15hashmap, set: remove unused functionsMichal Schmidt
The following hashmap_* and set_* functions/macros have never had any users in systemd's history: *_iterate_backwards *_iterate_skip *_last *_FOREACH_BACKWARDS Remove this dead code.
2014-09-15hashmap: introduce hash_ops to make struct Hashmap smallerMichal Schmidt
It is redundant to store 'hash' and 'compare' function pointers in struct Hashmap separately. The functions always comprise a pair. Store a single pointer to struct hash_ops instead. systemd keeps hundreds of hashmaps, so this saves a little bit of memory.
2014-08-19hashmap: try to use the existing 64bit hash functions for dev_t if it is 64bitLennart Poettering
2014-05-15hashmap: add hashmap_remove2() to remove item from hashtable and return both ↵Lennart Poettering
value and key
2014-01-31use memzero(foo, length); for all memset(foo, 0, length); callsGreg KH
In trying to track down a stupid linker bug, I noticed a bunch of memset() calls that should be using memzero() to make it more "obvious" that the options are correct (i.e. 0 is not the length, but the data to set). So fix up all current calls to memset(foo, 0, length) to memzero(foo, length).
2013-12-22shared: switch our hash table implementation over to SipHashLennart Poettering
SipHash appears to be the new gold standard for hashing smaller strings for hashtables these days, so let's make use of it.
2013-12-10hashmap: make gcc shut up on old glibcs that lack getauxval()Lennart Poettering
2013-11-20hashmap: be a bit more conservative with pre-allocating hash tables and itemsLennart Poettering
2013-10-01hashmap: randomize hash functions a bitLennart Poettering
2013-10-01hashmap: size hashmap bucket array dynamicallyLennart Poettering
Instead of fixing the hashmap bucket array to 127 entries dynamically size it, starting with a smaller one of 31. As soon as a fill level of 75% is reached, quadruple the size, and so on. This should siginficantly optimize the lookup time in large tables (from O(n) back to O(1)), and save memory on smaller tables (which most are).
2013-08-15hashmap: remove empty linesKay Sievers
2013-02-11binfmt,tmpfiles,modules-load,sysctl: rework the various early-boot services ↵Lennart Poettering
that work on .d/ directories This unifies much of the logic behind them: - All four will now ofllow the rule that the earlier file and earlier assignment in the .d/ directories wins. Before, sysctl was the only outlier, where the later setting always won. - All four now support getopt() and --help on the command line. - All four can now handle specification of configuration file names on the command line to apply. The tools will automatically find them, and apply them. Previously only tmpfiles could do that. This is useful for %post scripts in RPMs and suchlike. - This fixes various error path issues in conf_files_list()
2013-01-06build-sys: use VALGRIND not __OPTIMIZE__ as condition for valgrind compatZbigniew Jędrzejewski-Szmek
Actually, one might want to run valgrind even on optimized code. Now the same check is used in the jenkins hash functions and hashtable.
2012-10-26journal: introduce entry array chain cacheLennart Poettering
When traversing entry array chains for a bisection or for retrieving an item by index we previously always started at the beginning of the chain. Since we tend to look at the same chains repeatedly, let's cache where we have been the last time, and maybe we can skip ahead with this the next time. This turns most bisections and index lookups from O(log(n)*log(n)) into O(log(n)). More importantly however, we seek around on disk much less, which is good to reduce buffer cache and seek times on rotational disks.
2012-10-25journal: properly serialize fields with multiple values into JSONLennart Poettering
This now matches the JSON serialization spec from: http://www.freedesktop.org/wiki/Software/systemd/json
2012-10-18journal: add ability to list values a specified field can take in all ↵Lennart Poettering
entries of the journal The new 'unique' API allows listing all unique field values that a field specified by a field name can take in all entries of the journal. This allows answering queries such as "What units logged to the journal?", "What hosts have logged into the journal?", "Which boot IDs have logged into the journal?". Ultimately this allows implementation of tools similar to lastlog based on journal data. Note that listing these field values will not work for journal files created with older journald, as the field values are not indexed in older files.
2012-09-14systemctl: show unit name when a job failsLennart Poettering
https://bugzilla.redhat.com/show_bug.cgi?id=845028 https://bugzilla.redhat.com/show_bug.cgi?id=846483
2012-08-23hashmap: hashmap_contains does not need hashmap_entryLukas Nykryn
2012-08-14service: add options RestartPreventExitStatus and SuccessExitStatusLukas Nykryn
In some cases, like wrong configuration, restarting after error does not help, so administrator can specify statuses by RestartPreventExitStatus which will not cause restart of a service. Sometimes you have non-standart exit status, so this can be specified by SuccessfulExitStatus.
2012-07-03hashmap: make hashmap_clear() work on NULL hashmapsLennart Poettering
2012-07-03load-fragment: a few modernizationsLennart Poettering
2012-04-12relicense to LGPLv2.1 (with exceptions)Lennart Poettering
We finally got the OK from all contributors with non-trivial commits to relicense systemd from GPL2+ to LGPL2.1+. Some udev bits continue to be GPL2+ for now, but we are looking into relicensing them too, to allow free copy/paste of all code within systemd. The bits that used to be MIT continue to be MIT. The big benefit of the relicensing is that closed source code may now link against libsystemd-login.so and friends.
2012-04-10util: move all to shared/ and split external dependencies in separate ↵Kay Sievers
internal libraries Before: $ ldd /lib/systemd/systemd-timestamp linux-vdso.so.1 => (0x00007fffb05ff000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f90aac57000) libcap.so.2 => /lib64/libcap.so.2 (0x00007f90aaa53000) librt.so.1 => /lib64/librt.so.1 (0x00007f90aa84a000) libc.so.6 => /lib64/libc.so.6 (0x00007f90aa494000) /lib64/ld-linux-x86-64.so.2 (0x00007f90aae90000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f90aa290000) libattr.so.1 => /lib64/libattr.so.1 (0x00007f90aa08a000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f90a9e6e000) After: $ ldd systemd-timestamp linux-vdso.so.1 => (0x00007fff3cbff000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f5eaa1c3000) librt.so.1 => /lib64/librt.so.1 (0x00007f5ea9fbb000) libc.so.6 => /lib64/libc.so.6 (0x00007f5ea9c04000) /lib64/ld-linux-x86-64.so.2 (0x00007f5eaa3fc000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f5ea9a00000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5ea97e4000)