public/index.atom



<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

	<title>Luke Shumaker's Web Log</title>
	<link rel="self"      type="application/atom+xml" href="./index.atom"/>
	<link rel="alternate" type="text/html"            href="./"/>
	<link rel="alternate" type="text/markdown"        href="./index.md"/>
	<updated>2023-07-10T00:00:00+00:00</updated>
	<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
	<id>https://lukeshu.com/blog/</id>

	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./btrfs-rec.html"/>
		<link rel="alternate" type="text/markdown" href="./btrfs-rec.md"/>
		<id>https://lukeshu.com/blog/btrfs-rec.html</id>
		<updated>2023-07-10T00:00:00+00:00</updated>
		<published>2023-07-10T00:00:00+00:00</published>
		<title>Announcing: btrfs-rec: Recover (data from) a broken btrfs filesystem</title>
		<content type="html">&lt;h1
id="announcing-btrfs-rec-recover-data-from-a-broken-btrfs-filesystem"&gt;Announcing:
btrfs-rec: Recover (data from) a broken btrfs filesystem&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;I originally sent this email to the linux-btrfs mailing list on 2023-07-10,
but it was caught by the list's bogofilter. Yes, it was &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/README.md?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;plaintext&lt;/a&gt;.
No, I didn't use GMail. Yes, I've successfully participated in vger
lists in the past. Yes, I've reached out to postmaster; no, I haven't
received a reply yet (as of 2023-07-14).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div style="font-family: monospace"&gt;
&lt;p&gt;To: linux-btrfs@vger.kernel.org&lt;br/&gt; From: Luke T. Shumaker
&amp;lt;lukeshu@lukeshu.com&amp;gt;&lt;br/&gt; Subject: btrfs-rec: Recover (data from)
a broken btrfs filesystem&lt;br/&gt; Date: Mon, 10 Jul 2023 21:23:41
-0600&lt;br/&gt; Message-ID:
&amp;lt;87jzv7uo5e.wl-lukeshu@lukeshu.com&amp;gt;&lt;br/&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Inspired by a mis-typed &lt;code&gt;dd&lt;/code&gt; command, for the last year
I've been working on a tool for recovering corrupt btrfs filesystems; at
first idly here and there, but more actively in the last few months. I
hope to get it incorporated into btrfs-progs, though perhaps that is
problematic for a few reasons I'll get to. If the code can't be
incorporated into btrfs-progs, at least the ideas and algorithms should
be.&lt;/p&gt;
&lt;p&gt;&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/"&gt;https://git.lukeshu.com/btrfs-progs-ng/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Highlights:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;In general, it's more tolerant of corrupt filesystems than
&lt;code&gt;btrfs check --repair&lt;/code&gt;, &lt;code&gt;btrfs rescue&lt;/code&gt; or
&lt;code&gt;btrfs restore&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;btrfs-rec inspect rebuild-mappings&lt;/code&gt; is a better
&lt;code&gt;btrfs rescue chunk-recover&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;btrfs-rec inspect rebuild-trees&lt;/code&gt; can re-attach lost
branches to broken B+ trees.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;btrfs-rec inspect mount&lt;/code&gt; is a read-only FUSE
implementation of btrfs. This is conceptually a replacement for
&lt;code&gt;btrfs restore&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It's entirely written in Go. I'm not saying that's a good thing,
but it's an interesting thing.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Hopefully some folks will find it useful, or at least neat!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#motivation"&gt;1. Motivation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#overview-of-use"&gt;2. Overview of use&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#prior-art"&gt;3. Prior art&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#internalsdesign"&gt;4. Internals/Design&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#overview-of-the-source-tree-layout"&gt;4.1. Overview of the
source tree layout&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#base-decisions-cli-structure-go-json"&gt;4.2. Base decisions:
CLI structure, Go, JSON&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#algorithms"&gt;4.3. Algorithms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-rebuild-mappings-algorithm"&gt;4.3.1. The
&lt;code&gt;rebuild-mappings&lt;/code&gt; algorithm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the---rebuild-algorithm"&gt;4.3.2. The &lt;code&gt;--rebuild&lt;/code&gt;
algorithm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#rebuilt-forrest-behavior-looking-up-trees"&gt;4.3.2.1.
rebuilt forrest behavior&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#rebuilt-individual-tree-behavior"&gt;4.3.2.2. rebuilt
individual tree behavior&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-rebuild-trees-algorithm"&gt;4.3.3. The
&lt;code&gt;rebuild-trees&lt;/code&gt; algorithm&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#initialization"&gt;4.3.3.1. initialization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#the-main-loop"&gt;4.3.3.2. the main loop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#graph-callbacks"&gt;4.3.3.3. graph callbacks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#future-work"&gt;5. Future work&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#problems-with-merging-this-code-into-btrfs"&gt;6. Problems
for merging this code into btrfs-progs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id="motivation"&gt;1. Motivation&lt;/h1&gt;
&lt;p&gt;Have you ever ended up with a corrupt btrfs filesystem (through no
fault of btrfs itself, but perhaps a failing drive, or a mistaken
&lt;code&gt;dd&lt;/code&gt; invocation)? Surely losing less than 100MB of data from
a drive should not render hundreds of GB of perfectly intact data
unreadable! And yet, the existing tools are unable to even attempt to
read that data:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs check --repair --force dump-zero.1.img
enabling repair mode
Opening filesystem to check...
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
bad tree block 1048576, bytenr mismatch, want=1048576, have=11553381380038442733
ERROR: cannot read chunk root
ERROR: cannot open file system&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs check --init-extent-tree --force dump-zero.1.img
Opening filesystem to check...
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
bad tree block 1048576, bytenr mismatch, want=1048576, have=11553381380038442733
ERROR: cannot read chunk root
ERROR: cannot open file system&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs check --init-csum-tree --force dump-zero.1.img
Creating a new CRC tree
Opening filesystem to check...
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
bad tree block 1048576, bytenr mismatch, want=1048576, have=11553381380038442733
ERROR: cannot read chunk root
ERROR: cannot open file system&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs rescue chunk-recover dump-zero.1.img
Scanning: DONE in dev0
corrupt node: root=1 block=160410271744 slot=0, corrupt node: root=1 block=160410271744, nritems too large, have 39 expect range [1,0]
Couldn&amp;#39;t read tree root
open with broken chunk error&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs rescue zero-log dump-zero.1.img
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
ERROR: cannot read chunk root
ERROR: could not open ctree&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ mkdir out
$ btrfs restore dump-zero.1.img out
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
bad tree block 1048576, bytenr mismatch, want=1048576, have=11553381380038442733
ERROR: cannot read chunk root
Could not open root, trying backup super
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
bad tree block 1048576, bytenr mismatch, want=1048576, have=11553381380038442733
ERROR: cannot read chunk root
Could not open root, trying backup super
ERROR: superblock bytenr 274877906944 is larger than device size 256060514304
Could not open root, trying backup super&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs restore --list-roots dump-zero.1.img
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
bad tree block 1048576, bytenr mismatch, want=1048576, have=11553381380038442733
ERROR: cannot read chunk root
Could not open root, trying backup super
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
checksum verify failed on 1048576 wanted 0xf81c950a found 0xd66a46e0
bad tree block 1048576, bytenr mismatch, want=1048576, have=11553381380038442733
ERROR: cannot read chunk root
Could not open root, trying backup super
ERROR: superblock bytenr 274877906944 is larger than device size 256060514304
Could not open root, trying backup super&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs-find-root dump-zero.1.img
WARNING: cannot read chunk root, continue anyway
Superblock thinks the generation is 6596071
Superblock thinks the level is 1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Well, have I got a tool for you!&lt;/p&gt;
&lt;p&gt;(FWIW, I also tried manipulating the filesystem and patching the tools
to try to get past those errors, only to get a different set of errors.
Some of these patches I am separately submitting to btrfs-progs.)&lt;/p&gt;
&lt;h1 id="overview-of-use"&gt;2. Overview of use&lt;/h1&gt;
&lt;p&gt;There are two &lt;code&gt;btrfs-rec&lt;/code&gt; sub-command groups:
&lt;code&gt;btrfs-rec inspect &lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt; and &lt;code&gt;btrfs-rec
repair &lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt;, and you can find out about various
sub-commands with &lt;code&gt;btrfs-rec help&lt;/code&gt;. These are both told about
devices or images with the &lt;code&gt;--pv&lt;/code&gt; flag.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;btrfs-rec inspect &lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt; commands open the
filesystem read-only, and (generally speaking) write extracted or
rebuilt information to stdout. &lt;code&gt;btrfs-rec repair
&lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt; commands open the filesystem read+write, and
consume information from &lt;code&gt;btrfs-rec inspect
&lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt; commands to actually repair the filesystem
(except I haven't actually implemented any &lt;code&gt;repair&lt;/code&gt; commands
yet... despite the lack of &lt;code&gt;repair&lt;/code&gt; commands, I believe that
&lt;code&gt;btrfs-rec&lt;/code&gt; is already useful because of the
&lt;code&gt;btrfs-rec inspect mount&lt;/code&gt; command to get data out of the
broken filesystem). This split allows you to try things without being
scared by WARNINGs about not using these tools unless you're an expert
or have been told to by a developer.&lt;/p&gt;
&lt;p&gt;In the broken &lt;code&gt;dump-zero.1.img&lt;/code&gt; example above (which has a
perfectly intact superblock, but a totally broken
&lt;code&gt;CHUNK_TREE&lt;/code&gt;), to "repair" it I'd:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;Start by using &lt;code&gt;btrfs-rec inspect rebuild-mappings&lt;/code&gt; to
rebuild the broken chunk/dev/blockgroup trees:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs-rec inspect rebuild-mappings \
    --pv=dump-zero.1.img \
    &amp;gt; mappings-1.json&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Suppose it only mostly succeeds, and on stderr tells us about a few
regions of the image that it wasn't able to figure out the chunks for.
Using some human-level knowledge, you can write those yourself,
inserting them into the generated &lt;code&gt;mappings-1.json&lt;/code&gt;, and ask
&lt;code&gt;rebuild-mappings&lt;/code&gt; to normalize what you wrote:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs-rec inspect rebuild-mappings \
    --pv=dump-zero.1.img \
    --mappings=&amp;lt;(sed &amp;lt;mappings-1.json \
        -e &amp;#39;2a{&amp;quot;LAddr&amp;quot;:5242880,&amp;quot;PAddr&amp;quot;:{&amp;quot;Dev&amp;quot;:1,&amp;quot;Addr&amp;quot;:5242880},&amp;quot;Size&amp;quot;:1},&amp;#39; \
        -e &amp;#39;2a{&amp;quot;LAddr&amp;quot;:13631488,&amp;quot;PAddr&amp;quot;:{&amp;quot;Dev&amp;quot;:1,&amp;quot;Addr&amp;quot;:13631488},&amp;quot;Size&amp;quot;:1},&amp;#39;) \
    &amp;gt; mappings-2.json&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now that it has functioning chunk/dev/blockgroup trees, we can
use &lt;code&gt;btrfs-rec inspect rebuild-trees&lt;/code&gt; to rebuild other trees
that rely on those:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ btrfs-rec inspect rebuild-trees \
    --pv=dump-zero.1.img \
    --mappings=mappings-2.json \
    &amp;gt; trees.json&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now that (hopefully) everything that was damaged has been
reconstructed, we can use &lt;code&gt;btrfs-rec inspect mount&lt;/code&gt; to mount
the filesystem read-only and copy out our data:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ mkdir mnt
$ sudo btrfs-rec inspect mount \
    --pv=dump-zero.1.img \
    --mappings=mappings-2.json \
    --trees=trees.json \
    ./mnt&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This example is fleshed out more (and the manual edits to
&lt;code&gt;mappings.json&lt;/code&gt; explained more) in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/examples/main.sh?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./examples/main.sh&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
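&lt;p&gt;The entries spliced in by &lt;code&gt;sed&lt;/code&gt; in step 2 are small JSON objects. As a minimal sketch of reading one of them from Go, assuming only the field names visible in that command (&lt;code&gt;LAddr&lt;/code&gt;, &lt;code&gt;PAddr.Dev&lt;/code&gt;, &lt;code&gt;PAddr.Addr&lt;/code&gt;, &lt;code&gt;Size&lt;/code&gt;); the Go type and function names are made up for illustration, and the real file may carry more fields:&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
)

// physAddr and mapping are hypothetical type names; only the JSON field
// names come from the hand-written entries in the example above.
type physAddr struct {
	Dev  uint64
	Addr int64
}

type mapping struct {
	LAddr int64
	PAddr physAddr
	Size  int64
}

// parseMapping decodes a single hand-written mapping entry.
func parseMapping(s string) (mapping, error) {
	m := new(mapping)
	err := json.Unmarshal([]byte(s), m)
	return *m, err
}

func main() {
	m, err := parseMapping(`{"LAddr":5242880,"PAddr":{"Dev":1,"Addr":5242880},"Size":1}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("logical %d maps to dev %d at %d (size %d)\n",
		m.LAddr, m.PAddr.Dev, m.PAddr.Addr, m.Size)
}
```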
&lt;h1 id="prior-art"&gt;3. Prior art&lt;/h1&gt;
&lt;p&gt;Comparing &lt;code&gt;btrfs-rec inspect mount&lt;/code&gt; with the existing
https://github.com/adam900710/btrfs-fuse project:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Again, mine has better fault tolerance&lt;/li&gt;
&lt;li&gt;Mine is read-only&lt;/li&gt;
&lt;li&gt;Mine supports xattrs ("TODO" in Adam's)&lt;/li&gt;
&lt;li&gt;Mine supports separate inode address spaces for subvolumes; Adam's
doesn't due to limitations in FUSE. Mine works around this by lazily
setting up separate mountpoints for each subvolume (though this does
mean that the process needs to run as root, which is a bummer).&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id="internalsdesign"&gt;4. Internals/Design&lt;/h1&gt;
&lt;h2 id="overview-of-the-source-tree-layout"&gt;4.1. Overview of the source
tree layout&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/examples?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;examples/&lt;/code&gt;&lt;/a&gt;
has example scripts showing how to use &lt;code&gt;btrfs-rec&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfs?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/btrfs/&lt;/code&gt;&lt;/a&gt;
is the core btrfs implementation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfscheck?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/btrfscheck/&lt;/code&gt;&lt;/a&gt;
and &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfsutil?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/btrfsutil/&lt;/code&gt;&lt;/a&gt;
are libraries for "btrfs-progs" type programs, that are userland-y
things that I thought should be separate from the core implementation;
something that frustrated me about libbtrfs was having to figure out "is
this thing here in support of btrfs bits-on-disk, or in support of a
higher-level 'how btrfs-progs wants to think about things'?"&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;cmd/btrfs-rec/&lt;/code&gt;&lt;/a&gt;
is where the command implementations live. If a sub-command fits in a
single file, it's
&lt;code&gt;cmd/btrfs-rec/inspect_&lt;var&gt;SUBCMD&lt;/var&gt;.go&lt;/code&gt;, otherwise, it's
in a separate &lt;code&gt;cmd/btrfs-rec/inspect/&lt;var&gt;SUBCMD&lt;/var&gt;/&lt;/code&gt;
package.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/textui?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/textui/&lt;/code&gt;&lt;/a&gt;
is reasonably central to how the commands implement a text/CLI
user-interface.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/binstruct?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/binstruct/&lt;/code&gt;&lt;/a&gt;,
&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/diskio?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/diskio/&lt;/code&gt;&lt;/a&gt;,
and &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/streamio?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/streamio/&lt;/code&gt;&lt;/a&gt;
are non-btrfs-specific libraries related to the problem domain.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/containers?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/containers/&lt;/code&gt;&lt;/a&gt;,
&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/fmtutil?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/fmtutil/&lt;/code&gt;&lt;/a&gt;,
&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/maps?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/maps/&lt;/code&gt;&lt;/a&gt;,
&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/slices?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/slices/&lt;/code&gt;&lt;/a&gt;,
and &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/profile?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;lib/profile/&lt;/code&gt;&lt;/a&gt;
are all generic Go libraries that have nothing to do with btrfs or the
problem domain, but weren't in the Go standard library and I didn't
find/know-of existing implementations that I liked. Of these, all but
&lt;code&gt;containers&lt;/code&gt; are pretty simple utility libraries. Also, some
of these things have been added to the standard library since I started
the project.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="base-decisions-cli-structure-go-json"&gt;4.2. Base decisions: CLI
structure, Go, JSON&lt;/h2&gt;
&lt;p&gt;I started with trying to enhance btrfs-progs, but ended up writing a
wholly new program in Go, for several reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;writing a new thing: I was having to learn both the btrfs-progs
codebase and how btrfs-bits-on-disk work, and it got to the point that I
decided I should just focus on learning btrfs-bits-on-disk.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;writing a new thing: It was becoming increasingly apparent to me
that it was going to be an uphill-fight of having recovery-tools share
the same code as the main-tools, as the routines used by the main-tools
rightly have validity checks, where recovery-tools want to say "yes, I
know it's invalid, can you give it to me anyway?".&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;writing it in not-C: I love me some C, but higher level languages
are good for productivity. And I was trying to write a whole lot of code
at once, I needed a productivity boost.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;writing it in not-C: This forced me to learn btrfs-bits-on-disk
better, instead of just cribbing from btrfs-progs. That knowledge is
particularly important for having ideas on how to deal with corrupt
bits-on-disk.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;writing it in Go: At the time I started, my day job was writing
Go, so I had Go swapped into my brain. And Go still feels close to C but
provides &lt;em&gt;a lot&lt;/em&gt; of niceness and safety over C.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It turned out that Go was perhaps not the best choice, but we'll come
back to that.&lt;/p&gt;
&lt;p&gt;I wanted to separate things into a pipeline. For instance: Instead of
&lt;code&gt;btrfs rescue chunk-recover&lt;/code&gt; trying to do everything to
rebuild a broken chunk tree, I wanted to separate I/O from computation
from repairs. So I have
&lt;code&gt;btrfs-rec inspect rebuild-mappings scan&lt;/code&gt; that reads all the
info necessary to rebuild the chunk tree, then dumps that as a 2GB glob
of JSON. Then I can feed that JSON to
&lt;code&gt;btrfs-rec inspect rebuild-mappings process&lt;/code&gt; which actually
rebuilds the mappings in the chunk tree, and dumps them as JSON. And
then other commands can consume that &lt;code&gt;mappings.json&lt;/code&gt; to use
that instead of trying to read the chunk tree from the actual FS, so
that you don't have to make potentially destructive writes to inspect an
FS with a broken chunk tree, and can inspect it more forensically. Or
then use &lt;code&gt;btrfs-rec repair
&lt;var&gt;SOME_SUBCMD_I_HAVENT_WRITTEN_YET&lt;/var&gt;&lt;/code&gt; to write that chunk
tree in &lt;code&gt;mappings.json&lt;/code&gt; back to the filesystem.&lt;/p&gt;
&lt;p&gt;(But also, the separate steps thing was useful just so I could
iterate on the algorithms of &lt;code&gt;rebuild-mappings process&lt;/code&gt;
separately from having to scan the entire FS)&lt;/p&gt;
&lt;p&gt;So, I made the decision that &lt;code&gt;btrfs-rec inspect
&lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt; commands should all only open the FS read-only,
and output their work to a separate file; that writing that info back to
the FS should be separate in &lt;code&gt;btrfs-rec repair
&lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;For connecting those parts of the pipeline, I chose JSON, for a few
reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;I wanted something reasonably human-readable, so that I could
debug it easier.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I wanted something reasonably human-readable, so that human
end-users could make manual edits; for example, in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/examples/main.sh?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;examples/main.sh&lt;/code&gt;&lt;/a&gt;
I have an example of manually editing &lt;code&gt;mappings.json&lt;/code&gt; to
resolve a region that the algorithm couldn't figure out, but with
knowledge of what caused the corruption a human can.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I didn't want to invent my own DSL and have to handle writing a
parser. (This part didn't pay off! See below.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I wanted something that I thought would have good support in a
variety of languages, so that if Go is problematic for getting things
merged upstream it could be rewritten in C (or maybe Rust?) piece-meal
where each subcommand can be rewritten one at a time.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It turned out that JSON was perhaps not the best choice.&lt;/p&gt;
&lt;p&gt;OK, so: Go and/or JSON maybe being mistakes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;I spent a lot of time getting the garbage collector to not just
kill performance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;btrfs-rec inspect rebuild-mappings
&lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt; subcommands all throw a lot of data through the
JSON encoder/decoder, and I learned that the Go stdlib
&lt;code&gt;encoding/json&lt;/code&gt; package has memory use that grows O(n^2)
(-ish? I didn't study the implementation, but that's what the curve
looks like just observing it) on the size of the data being shoved
through it, so I had to go take a break and go write
https://pkg.go.dev/git.lukeshu.com/go/lowmemjson which is a
mostly-drop-in-replacement that tries to be as close as possible to O(1)
memory use. So I did end up having to write my own parser anyway
:(&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="algorithms"&gt;4.3. Algorithms&lt;/h2&gt;
&lt;p&gt;There are 3 algorithms of note in &lt;code&gt;btrfs-rec&lt;/code&gt;, that I
think are worth getting into mainline btrfs-progs even if the code of
&lt;code&gt;btrfs-rec&lt;/code&gt; doesn't get in:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;btrfs-rec inspect rebuild-mappings&lt;/code&gt; algorithm
to rebuild information from the &lt;code&gt;CHUNK_TREE&lt;/code&gt;,
&lt;code&gt;DEV_TREE&lt;/code&gt;, and &lt;code&gt;BLOCK_GROUP_TREE&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;btrfs-rec --rebuild&lt;/code&gt; algorithm to cope with
reading broken B+ trees.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;btrfs-rec inspect rebuild-trees&lt;/code&gt; algorithm to
re-attach lost branches to broken B+ trees.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="the-rebuild-mappings-algorithm"&gt;4.3.1. The
&lt;code&gt;rebuild-mappings&lt;/code&gt; algorithm&lt;/h3&gt;
&lt;p&gt;(This step-zero scan is
&lt;code&gt;btrfs-rec inspect rebuild-mappings scan&lt;/code&gt;, and principally
lives in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfsutil/scan.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./lib/btrfsutil/scan.go&lt;/code&gt;&lt;/a&gt;
and &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildmappings/scan.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildmappings/scan.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;ol start="0" type="1"&gt;
&lt;li&gt;Similar to &lt;code&gt;btrfs rescue chunk-recover&lt;/code&gt;, scan each device
for things that look like nodes; keep track of:
&lt;ul&gt;
&lt;li&gt;Checksums of every block on the device&lt;/li&gt;
&lt;li&gt;Which physical addresses contain nodes that claim to be at a given
logical address.&lt;/li&gt;
&lt;li&gt;Any found Chunk items, BlockGroup items, DevExtent items, and CSum items.
Keep track of the key for each of these, and for CSum items also track
the generation.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Create a bucket of the data from Chunks, DevExtents, and BlockGroups;
since a Chunk and a DevExtent+BlockGroup store pretty much the same
information, we can use one to reconstruct the other. How
we "merge" these and handle conflicts is in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfs/btrfsvol/lvm.go?id=18e6066c241cf3d252b6521150843ffc858d8434#n121"&gt;&lt;code&gt;./lib/btrfs/btrfsvol/lvm.go:addMapping()&lt;/code&gt;&lt;/a&gt;,
I don't think this part is particularly clever, but given that
&lt;code&gt;btrfs rescue chunk-recover&lt;/code&gt; crashes if it encounters two
overlapping chunks, I suppose I should spell it out:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A "mapping" is represented as a group of 4 things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;logical address&lt;/li&gt;
&lt;li&gt;a list of 1 or more physical addresses (device ID and offset)&lt;/li&gt;
&lt;li&gt;size, and a Boolean indicator of whether the size is "locked"&lt;/li&gt;
&lt;li&gt;block group flags, and a Boolean presence-indicator&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mappings must be merged if their logical or physical regions
overlap.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If a mapping has a "locked" size, then when merging it may
subsume smaller mappings with unlocked sizes, but its size cannot be
changed; trying to merge a locked-size mapping with another mapping that
is not for a subset region should return an error.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If a mapping has block group flags present, then those flags may
not be changed; it may only be merged with another mapping that does not
have flags present, or has identical flags.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When returning an error because of overlapping non-mergeable
mappings, just log an error on stderr and keep going. That's an
important design thing that is different than normal filesystem code; if
there's an error, yeah, detect and notify about it, &lt;strong&gt;but don't
bail out of the whole routine&lt;/strong&gt;. Just skip that one item or
whatever.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
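&lt;p&gt;The merge rules above can be sketched roughly as follows. This is a simplified illustration, not the code in &lt;code&gt;addMapping()&lt;/code&gt;: it assumes the two mappings are already known to overlap, it only models the logical region (no physical addresses), and all type and function names are hypothetical.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// mapping is a simplified, hypothetical stand-in: it only models the
// logical region, the size lock, and the block group flags.
type mapping struct {
	LAddr, Size  int64
	SizeLocked   bool
	Flags        uint64
	FlagsPresent bool
}

func (m mapping) end() int64 { return m.LAddr + m.Size }

// covers reports whether m's region fully contains o's region.
func (m mapping) covers(o mapping) bool {
	if m.LAddr > o.LAddr {
		return false
	}
	return m.end() >= o.end()
}

// merge combines two overlapping mappings, or explains why it cannot.
func merge(a, b mapping) (mapping, error) {
	start, end := a.LAddr, a.end()
	if start > b.LAddr {
		start = b.LAddr
	}
	if b.end() > end {
		end = b.end()
	}
	out := mapping{LAddr: start, Size: end - start}
	for _, m := range []mapping{a, b} {
		if m.SizeLocked {
			// A locked mapping may subsume the other, but may not grow.
			if !m.covers(out) {
				return mapping{}, errors.New("merge would change a locked size")
			}
			out.SizeLocked = true
		}
		if !m.FlagsPresent {
			continue
		}
		if out.FlagsPresent {
			if out.Flags != m.Flags {
				return mapping{}, errors.New("conflicting block group flags")
			}
		}
		out.Flags, out.FlagsPresent = m.Flags, true
	}
	return out, nil
}

func main() {
	cur := mapping{LAddr: 0, Size: 100, SizeLocked: true, Flags: 1, FlagsPresent: true}
	candidates := []mapping{
		{LAddr: 10, Size: 20}, // a subset: gets subsumed
		{LAddr: 90, Size: 20}, // sticks out past the locked size: error
	}
	for _, c := range candidates {
		merged, err := merge(cur, c)
		if err != nil {
			// Report it, skip the item, and keep going.
			fmt.Fprintln(os.Stderr, "skipping mapping:", err)
			continue
		}
		cur = merged
	}
	fmt.Printf("%+v\n", cur)
}
```

&lt;p&gt;The loop in &lt;code&gt;main&lt;/code&gt; is the part that matters most: the error is logged and the offending item is skipped, rather than aborting the whole run.&lt;/p&gt;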
&lt;p&gt;Now that we know how to "add a mapping", let's do that:&lt;/p&gt;
&lt;p&gt;(The following main-steps are
&lt;code&gt;btrfs-rec inspect rebuild-mappings process&lt;/code&gt;, and principally
live in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildmappings/process.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildmappings/process.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;Add all found Chunks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add all found DevExtents.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add a physical:logical mapping of length nodesize for each node
that was found.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Any mappings from steps 2 or 3 that are missing blockgroup flags
(that is: they weren't able to be merged with a mapping from step 1),
use the found BlockGroups to fill in those flags.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now we'll merge all found CSum items into a map of the sums of
the logical address space. Sort all of the csum items by generation,
then by address. Loop over them in that order, inserting their sums into
the map. If two csum items overlap, but agree about the sums of the
overlapping region, that's fine, just take their union. For overlaps
that disagree, items with a newer generation kick out items with an
older generation. If disagreeing items have the same generation... I
don't think that can happen except by a filesystem bug (i.e. not by a
failing drive or other external corruption), so I wasn't too concerned
about it, so I just log an error on stderr and skip the later-processed
item. See &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildmappings/process_sums_logical.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildmappings/process_sums_logical.go&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Look at regions of the logical address space that meet all 3 of these
criteria:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;we have CSum items for them&lt;/li&gt;
&lt;li&gt;we have a BlockGroup for them&lt;/li&gt;
&lt;li&gt;we don't have a Chunk/DevExtent mapping them to the physical address
space.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Pair those CSums up with BlockGroups, and for each BlockGroup, search
the list of checksums of physical blocks to try to find a physical
region that matches the logical csums (and isn't already mapped to a
different logical region). I used a Knuth-Morris-Pratt search, modified
to handle holes in the logical csum list as wildcards.&lt;/p&gt;
&lt;p&gt;Insert any found mappings into our bucket of mappings.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Do the same again, but with a fuzzy search (we can re-use the
csum map of the logical address space). My implementation of this is
comparatively time and space intensive; I just walk over the entire
unmapped physical address space, noting what % of match each BlockGroup
has if placed at that location. I keep track of the best 2 matches for
each BlockGroup. If the best match is better than a 50% match, and the
second best is less than a 50% match, then I add the best match. In my
experience, the best match is &amp;gt;90% (or at whatever the maximum
percent is for how much of the BlockGroup has logical sums), and the
second best is 0% or 1%. The point of tracking both is that if there
isn't a clear-cut winner, I don't want it to commit to a potentially
wrong choice.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
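&lt;p&gt;The "best two matches" rule from the fuzzy-search step can be sketched like this. It is a toy under loud assumptions: a single device, checksums shrunk to plain integers, no filtering of already-mapped physical regions, and a made-up &lt;code&gt;hole&lt;/code&gt; sentinel for logical blocks with no known sum. Holes count as non-matching, which is why the best score tops out at the fraction of the BlockGroup that has logical sums.&lt;/p&gt;

```go
package main

import "fmt"

// hole marks a logical block for which no checksum is known
// (a sentinel chosen for this sketch only).
const hole = -1

// matchPct returns the fraction of the block group's blocks whose known
// checksum equals the physical checksum when the group is placed at off.
func matchPct(phys, logical []int, off int) float64 {
	matched := 0
	for i, want := range logical {
		if want == hole {
			continue
		}
		if phys[off+i] == want {
			matched++
		}
	}
	return float64(matched) / float64(len(logical))
}

// bestPlacement tries every offset, tracking the best two scores. It only
// reports success when the best score is above 50% and the runner-up is
// below 50%, so that it never commits without a clear-cut winner.
func bestPlacement(phys, logical []int) (int, bool) {
	bestOff, best, second := -1, 0.0, 0.0
	for off := 0; len(phys) >= off+len(logical); off++ {
		pct := matchPct(phys, logical, off)
		switch {
		case pct > best:
			second, best, bestOff = best, pct, off
		case pct > second:
			second = pct
		}
	}
	if second >= 0.5 {
		return -1, false // ambiguous: two plausible placements
	}
	return bestOff, best > 0.5
}

func main() {
	phys := []int{7, 7, 1, 2, 3, 4, 9, 9}
	blockGroup := []int{1, 2, hole, 4}
	off, ok := bestPlacement(phys, blockGroup)
	fmt.Println("placement:", off, "clear winner:", ok)

	// The same sums appear twice on disk, so neither placement is accepted.
	_, ok = bestPlacement([]int{1, 2, 1, 2}, []int{1, 2})
	fmt.Println("ambiguous case accepted:", ok)
}
```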
&lt;h3 id="the---rebuild-algorithm"&gt;4.3.2. The &lt;code&gt;--rebuild&lt;/code&gt;
algorithm&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;--rebuild&lt;/code&gt; flag is implied by the
&lt;code&gt;--trees=trees.json&lt;/code&gt; flag, and triggers an algorithm that
allows "safely" reading from a broken B+ tree, rather than the usual B+
tree lookup and search functions. I probably should have tried to
understand the &lt;code&gt;btrfs restore&lt;/code&gt; algorithm, maybe I reinvented
the wheel...&lt;/p&gt;
&lt;p&gt;This algorithm requires a list of all nodes on the filesystem; we
find these using the same scan as above (&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfsutil/scan.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./lib/btrfsutil/scan.go&lt;/code&gt;&lt;/a&gt;),
the same procedure as &lt;code&gt;btrfs rescue chunk-recover&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;We walk all of those nodes, and build a reasonably lightweight
in-memory graph of all nodes (&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfsutil/graph.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./lib/btrfsutil/graph.go&lt;/code&gt;&lt;/a&gt;),
tracking&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;each node's
&lt;ul&gt;
&lt;li&gt;logical address&lt;/li&gt;
&lt;li&gt;level&lt;/li&gt;
&lt;li&gt;generation&lt;/li&gt;
&lt;li&gt;tree&lt;/li&gt;
&lt;li&gt;each item's key and size&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;each keypointer's
&lt;ul&gt;
&lt;li&gt;source node&lt;/li&gt;
&lt;li&gt;source slot within the node&lt;/li&gt;
&lt;li&gt;tree of the source node&lt;/li&gt;
&lt;li&gt;destination node&lt;/li&gt;
&lt;li&gt;destination level implied by the level of the source node&lt;/li&gt;
&lt;li&gt;destination key&lt;/li&gt;
&lt;li&gt;destination generation&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;logical addresses and error messages for nodes that are pointed to
by a keypointer or the superblock, but can't be read (because that
logical address isn't mapped, or it doesn't look like a node,
or...)&lt;/li&gt;
&lt;li&gt;an index such that for a given node we can quickly list the
keypointers both originating at that node and pointing to that
node.&lt;/li&gt;
&lt;/ul&gt;
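&lt;p&gt;As a rough picture of the shape of that graph, here is a minimal sketch. The type and field names are illustrative guesses rather than the definitions in &lt;code&gt;graph.go&lt;/code&gt;, logical addresses are plain &lt;code&gt;int64&lt;/code&gt;s, and keys are reduced to their three fields.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

type key struct {
	ObjectID uint64
	ItemType uint8
	Offset   uint64
}

// node holds only the header-ish facts, not the item bodies.
type node struct {
	Level      uint8
	Generation uint64
	Owner      uint64 // ID of the tree the node claims to belong to
	ItemKeys   []key  // leaf nodes only
	ItemSizes  []uint32
}

// edge is one keypointer.
type edge struct {
	FromNode     int64 // logical address of the source node
	FromSlot     int
	FromTree     uint64
	ToNode       int64
	ToLevel      uint8 // implied by the level of the source node
	ToKey        key
	ToGeneration uint64
}

type graph struct {
	Nodes     map[int64]node
	BadNodes  map[int64]error  // pointed-to addresses that could not be read
	EdgesFrom map[int64][]edge // index: keypointers originating at a node
	EdgesTo   map[int64][]edge // index: keypointers pointing to a node
}

func newGraph() graph {
	return graph{
		Nodes:     map[int64]node{},
		BadNodes:  map[int64]error{},
		EdgesFrom: map[int64][]edge{},
		EdgesTo:   map[int64][]edge{},
	}
}

// addEdge records a keypointer in both directions of the index.
func (g graph) addEdge(e edge) {
	g.EdgesFrom[e.FromNode] = append(g.EdgesFrom[e.FromNode], e)
	g.EdgesTo[e.ToNode] = append(g.EdgesTo[e.ToNode], e)
}

func main() {
	g := newGraph()
	g.Nodes[100] = node{Level: 1, Generation: 7, Owner: 5}
	g.Nodes[200] = node{Level: 0, Generation: 7, Owner: 5,
		ItemKeys: []key{{ObjectID: 256, ItemType: 1}}, ItemSizes: []uint32{160}}
	g.BadNodes[300] = errors.New("logical address is not mapped")
	g.addEdge(edge{FromNode: 100, FromSlot: 0, FromTree: 5, ToNode: 200, ToGeneration: 7})
	g.addEdge(edge{FromNode: 100, FromSlot: 1, FromTree: 5, ToNode: 300, ToGeneration: 7})
	fmt.Println("keypointers out of 100:", len(g.EdgesFrom[100]))
	fmt.Println("keypointers into 300:", len(g.EdgesTo[300]), "error:", g.BadNodes[300])
}
```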
&lt;h4 id="rebuilt-forrest-behavior-looking-up-trees"&gt;4.3.2.1. rebuilt
forrest behavior (looking up trees)&lt;/h4&gt;
&lt;p&gt;(see: &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfsutil/rebuilt_forrest.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./lib/btrfsutil/rebuilt_forrest.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;ROOT_TREE&lt;/code&gt;, &lt;code&gt;CHUNK_TREE&lt;/code&gt;,
&lt;code&gt;TREE_LOG&lt;/code&gt;, and &lt;code&gt;BLOCK_GROUP_TREE&lt;/code&gt; (the trees
pointed to directly by the superblock) work as you'd expect.&lt;/li&gt;
&lt;li&gt;For other trees, we (as you'd expect) look up the root item in the
rebuilt &lt;code&gt;ROOT_TREE&lt;/code&gt;, and then (if rootitem.ParentUUID is
non-zero) eagerly also look up the parent tree (recursing on ourself).
We try to use the &lt;code&gt;UUID_TREE&lt;/code&gt; tree to help with this, but
fall back to just doing a linear scan over the &lt;code&gt;ROOT_TREE&lt;/code&gt;.
If we fail to look up the parent tree (or its parent, or a more distant
ancestor), then (depending on a flag) we either make a note of that, or
error out and fail to look up the child tree. For &lt;code&gt;--rebuild&lt;/code&gt;
and &lt;code&gt;--trees=trees.json&lt;/code&gt; we are permissive of this error, and
just make note of it; but we'll re-use this algorithm in the
&lt;code&gt;rebuild-trees&lt;/code&gt; algorithm below, and it needs the more strict
handling.&lt;/li&gt;
&lt;li&gt;When creating the rebuilt individual tree, we start by adding the
root node specified by the superblock/root-item. But we may also add
additional root nodes grafted on to the tree by the
&lt;code&gt;--trees=trees.json&lt;/code&gt; flag or by the
&lt;code&gt;rebuild-trees&lt;/code&gt; algorithm below. So a tree may have more than
1 root node.&lt;/li&gt;
&lt;/ul&gt;
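&lt;p&gt;The parent-lookup behavior described above might look something like the following. Everything here is a stand-in invented for illustration: UUIDs are strings, the &lt;code&gt;ROOT_TREE&lt;/code&gt; and &lt;code&gt;UUID_TREE&lt;/code&gt; are plain maps, and the permissive/strict switch is a single &lt;code&gt;lax&lt;/code&gt; field.&lt;/p&gt;

```go
package main

import "fmt"

type rootItem struct{ UUID, ParentUUID string }

type tree struct {
	ID        uint64
	Parent    *tree
	ParentErr error // set in lax mode when an ancestor could not be found
}

type forrest struct {
	roots map[uint64]rootItem // stand-in for the rebuilt ROOT_TREE
	uuids map[string]uint64   // stand-in for the UUID_TREE; may be incomplete
	lax   bool                // permissive handling of missing ancestors
}

// resolveUUID tries the UUID index first, then falls back to a linear scan.
func (f forrest) resolveUUID(u string) (uint64, bool) {
	if id, ok := f.uuids[u]; ok {
		return id, true
	}
	for id, item := range f.roots {
		if item.UUID == u {
			return id, true
		}
	}
	return 0, false
}

// lookup eagerly resolves the chain of parents by recursing on itself.
// (A real implementation would also need to guard against cycles.)
func (f forrest) lookup(id uint64) (*tree, error) {
	item, ok := f.roots[id]
	if !ok {
		return nil, fmt.Errorf("tree %d: no root item", id)
	}
	t := new(tree)
	t.ID = id
	if item.ParentUUID == "" {
		return t, nil
	}
	var err error
	if parentID, found := f.resolveUUID(item.ParentUUID); found {
		t.Parent, err = f.lookup(parentID)
	} else {
		err = fmt.Errorf("tree %d: parent %q not found", id, item.ParentUUID)
	}
	if err != nil {
		if !f.lax {
			return nil, err
		}
		t.ParentErr = err // make a note of it and carry on
	}
	return t, nil
}

func main() {
	roots := map[uint64]rootItem{
		5:   {UUID: "a"},
		256: {UUID: "b", ParentUUID: "a"},
		257: {UUID: "c", ParentUUID: "missing"},
	}
	f := forrest{roots: roots, uuids: map[string]uint64{}, lax: true}
	t, err := f.lookup(257)
	fmt.Println(t.ID, t.ParentErr, err)
}
```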
&lt;h4 id="rebuilt-individual-tree-behavior"&gt;4.3.2.2. rebuilt individual
tree behavior&lt;/h4&gt;
&lt;p&gt;(see: &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfsutil/rebuilt_tree.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./lib/btrfsutil/rebuilt_tree.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;In order to read from a tree, we first have to build a few indexes.
We store these indexes in an Adaptive Replacement Cache; they are all
re-buildable based on the tree's list of roots and the above graph; if
we have a bunch of trees we don't need to keep all of this in memory at
once. Note that this is done 100% with the in-memory graph, we don't
need to read anything from the filesystem during these procedures.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The first index we build is the "node index". This is an index
that for every node tells us what root(s) the tree would need to have in
order for the tree to include that node, and also what the highest
acceptable item key in the node would be if the tree includes that root. We
track both a &lt;code&gt;loMaxItem&lt;/code&gt; and a &lt;code&gt;hiMaxItem&lt;/code&gt;, in
case the tree is real broken and there are multiple paths from the root
to the node; as these different paths may imply different max-item
constraints. Put more concretely, the type of the index is:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;map[ nodeID → map[ rootNodeID → {loMaxItem, hiMaxItem} ] ]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We'll do a loop over the graph, using dynamic-programming memoization
to figure out ordering and avoid processing the same node twice; for
each node we'll&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Check whether the owner-tree is this tree or one of this tree's
ancestors (and if it's an ancestor, that the node's generation isn't
after the point that the child tree was forked from the parent tree). If
not, we are done processing that node (record an empty/nil set of roots
for it).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Create an empty map of &lt;code&gt;rootID&lt;/code&gt; →
{&lt;code&gt;loMaxItem&lt;/code&gt;, &lt;code&gt;hiMaxItem&lt;/code&gt;}.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Look at each keypointer that points at the node and:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Skip the keypointer if its expectations of the node aren't met:
if the level, generation, and min-key constraints don't match up. If the
keypointer isn't in the last slot in the source node, we also go ahead
and include checking that the destination node's max-key is under the
min-key of the keypointer in the next slot, since that's cheap to do
now.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Skip the keypointer if its source node's owner-tree isn't this
tree or one of this tree's ancestors (and if it's an ancestor, that the
node's generation isn't after the point that the child tree was forked
from the parent tree).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;dynamic-programming recurse and index the keypointer's source
node.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;for every root that would result in the keypointer's source node
being included in the tree:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If the keypointer is in the last slot, look at what the
source node's last-item constraints would be if that root is included;
we can now check the max-item of our destination node. We check against
the &lt;code&gt;hiMaxItem&lt;/code&gt;; as if there is any valid path from the root
to this node, then we want to be permissive and include it. If that
check fails, then we're done with this keypointer. Also, make note of
those &lt;code&gt;loMaxItem&lt;/code&gt; and &lt;code&gt;hiMaxItem&lt;/code&gt; values, we'll
use them again in just a moment.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Otherwise, set both &lt;code&gt;loMaxItem&lt;/code&gt; and
&lt;code&gt;hiMaxItem&lt;/code&gt; to 1-under the min-item of the keypointer in the
next slot.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Insert that &lt;code&gt;loMaxItem&lt;/code&gt; and &lt;code&gt;hiMaxItem&lt;/code&gt; pair
into the &lt;code&gt;rootID&lt;/code&gt; → {&lt;code&gt;loMaxItem&lt;/code&gt;,
&lt;code&gt;hiMaxItem&lt;/code&gt;} map we created above. If an entry already exists
for this root (since a broken tree might have multiple paths from the
root to our node), then set &lt;code&gt;loMaxItem&lt;/code&gt; to the min of the
existing entry and our value, and &lt;code&gt;hiMaxItem&lt;/code&gt; to the
max.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If that &lt;code&gt;rootID&lt;/code&gt; → {&lt;code&gt;loMaxItem&lt;/code&gt;,
&lt;code&gt;hiMaxItem&lt;/code&gt;} map is still empty, then consider this node to
be a (potential) root, and insert &lt;code&gt;rootID=thisNode&lt;/code&gt; →
{&lt;code&gt;loMaxItem=maxKey&lt;/code&gt;, &lt;code&gt;hiMaxItem=maxKey&lt;/code&gt;} (where
&lt;code&gt;maxKey&lt;/code&gt; is the maximum value of the key datatype).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Take that &lt;code&gt;rootID&lt;/code&gt; → {&lt;code&gt;loMaxItem&lt;/code&gt;,
&lt;code&gt;hiMaxItem&lt;/code&gt;} map and insert it into the index as the entry
for this node.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The next index we build is the "item index". This is a "sorted
map" (implemented as a red-black tree, supporting sub-range iteration)
of &lt;code&gt;key&lt;/code&gt; → {&lt;code&gt;nodeID&lt;/code&gt;, &lt;code&gt;slotNumber&lt;/code&gt;}; a
map that for each key tells us where to find the item with that key.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Loop over the node index, and for each node check if both (a) it
has &lt;code&gt;level==0&lt;/code&gt; (is a leaf node containing items), and (b) its
set of roots that would include it has any overlap with the tree's set
of roots.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Loop over each of those included leaf nodes, and loop over the
items in each node. Insert the &lt;code&gt;key&lt;/code&gt; → {&lt;code&gt;nodeID&lt;/code&gt;,
&lt;code&gt;slotNumber&lt;/code&gt;} into our sorted map. If there is already an entry for
that key, decide which one wins by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Use the one from the node with the owner-tree that is closer to
this tree; node with owner=thisTree wins over a node with
owner=thisTree.parent, which would win over a node with
owner=thisTree.parent.parent. If that's a tie, then...&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use the one from the node with the higher generation. If that's a
tie, then...&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I don't know, I have the code &lt;code&gt;panic&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// TODO: This is a panic because I&amp;#39;m not really sure what the
// best way to handle this is, and so if this happens I want the
// program to crash and force me to figure out how to handle it.
panic(fmt.Errorf(&amp;quot;dup nodes in tree=%v: old=%v=%v ; new=%v=%v&amp;quot;,
    tree.ID,
    oldNode, tree.forrest.graph.Nodes[oldNode],
    newNode, tree.forrest.graph.Nodes[newNode]))&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note that this algorithm means that for a given node we may use a few
items from that node, while having other items from that same node be
overridden by another node.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The final index we build is the "error index". This is an index
of what errors correspond to which range of keys, so that we can report
them, and give an idea of "there may be entries missing from this
directory" and similar.&lt;/p&gt;
&lt;p&gt;For each error, we'll track the min-key and max-key of the range it
applies to, the node it came from, and what the error string is. We'll
store these into an interval tree keyed on that min-key/max-key
range.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Create an empty set &lt;code&gt;nodesToProcess&lt;/code&gt;. Now populate
it:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Once again, we'll loop over the node index, but this time we'll
only check that there's overlap between the set of roots that would
include the node and the tree's set of roots. For each node that is
included in this tree, insert both the node itself and all node IDs
that it has keypointers pointing to into the &lt;code&gt;nodesToProcess&lt;/code&gt;
set.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Also insert all of the tree's roots into
&lt;code&gt;nodesToProcess&lt;/code&gt;; this is in case the superblock/root-item
points to an invalid node that we couldn't read.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Now loop over &lt;code&gt;nodesToProcess&lt;/code&gt;. For each node, create
an empty list of errors. Use the keypointers pointing to it and the min
&lt;code&gt;loMaxItem&lt;/code&gt; from the node index to construct a set of
expectations for the node; this should be reasonably straight-forward,
given:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;If different keypointers have disagreeing levels, insert an error
into the list, and don't bother with checking the node's
level.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If different keypointers have disagreeing generations, insert an
error into the list, and don't bother with checking the node's
generation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If different keypointers have different min-item expectations,
use the max of them.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Then:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If the node is a "bad node" in the graph, insert the error message
associated with it. Otherwise, check those expectations against the node
in the graph.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If the list of error messages is non-empty, then insert their
concatenation into the interval tree, with the range set to the min of
the min-item expectations from the keypointers through the max of the
&lt;code&gt;hiMaxItem&lt;/code&gt;s from the node index. If the min min-item
expectation turns out to be higher than the max &lt;code&gt;hiMaxItem&lt;/code&gt;,
then set the range to the zero-key through the max-key.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;From there, it should be trivial to implement the usual B+ tree
operations using those indexes; exact-lookup using the item index, and
range-lookups and walks using the item index together with the error
index. Efficiently searching the &lt;code&gt;CSUM_TREE&lt;/code&gt; requires knowing
item sizes, so that's why we recorded the item sizes into the graph.&lt;/p&gt;
&lt;h3 id="the-rebuild-trees-algorithm"&gt;4.3.3. The
&lt;code&gt;rebuild-trees&lt;/code&gt; algorithm&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;btrfs inspect rebuild-trees&lt;/code&gt; algorithm finds nodes to
attach as extra roots to trees. I think that conceptually it's the
simplest of the 3 algorithms, but it turned out to be the hardest to get
right. So... maybe more than for the others, reference the source code too
(&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildtrees/&lt;/code&gt;&lt;/a&gt;)
because I might forget some small but important detail.&lt;/p&gt;
&lt;p&gt;The core idea here is that we're just going to walk each tree,
inspecting each item in the tree, and checking for any items that are
implied by other items (e.g.: a dir entry item implies the existence of
an inode item for the inode that it points at). If an implied item is not
in the tree, but is in some other node, then we look at which potential
roots we could add to the tree that would add that other node. Then,
after we've processed all of the items in the filesystem, we go add
those various roots to the various trees, keeping track of which items
are added or updated. If any of those added/updated items have a version
with a newer generation on a different node, see what roots we could add
to get that newer version. Then add those roots, keeping track of items
that are added/updated. Once we reach a steady-state where the newest
version of each item has been added, loop back and inspect all
added/updated items for implied items, keeping track of roots we could
add. Repeat until a steady-state is reached.&lt;/p&gt;
&lt;p&gt;There are lots of little details in that process, some of which are
for correctness, and some of which are for "it should run in hours
instead of weeks."&lt;/p&gt;
&lt;h4 id="initialization"&gt;4.3.3.1. initialization&lt;/h4&gt;
&lt;p&gt;First up, we're going to build an in-memory graph, same as above.
But this time, while we're reading the nodes to do that, we're also
going to watch for some specific items and record a few things about
them.&lt;/p&gt;
&lt;p&gt;(see: &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees/scan.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildtrees/scan.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;For each {&lt;code&gt;nodeID&lt;/code&gt;, &lt;code&gt;slotNumber&lt;/code&gt;} pair that
matches one of these item types, we're going to record:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;flags:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;INODE_ITEM&lt;/code&gt;s: whether it has the
&lt;code&gt;INODE_NODATASUM&lt;/code&gt; flag set&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;names:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;DIR_INDEX&lt;/code&gt; items: the file's name&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;sizes:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;EXTENT_CSUM&lt;/code&gt; items: the number of bytes that this is a
sum for (i.e. the item size over the checksum size, times the block
size)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;EXTENT_DATA&lt;/code&gt; items: the number of bytes in this extent
(i.e. either the item size minus
&lt;code&gt;offsetof(btrfs_file_extent_item.disk_bytenr)&lt;/code&gt; if
&lt;code&gt;FILE_EXTENT_INLINE&lt;/code&gt;, or else the item's
&lt;code&gt;num_bytes&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;data backrefs:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;EXTENT_ITEM&lt;/code&gt;s and &lt;code&gt;METADATA_ITEM&lt;/code&gt;s: a list of
the same length as the number of refs embedded in the item; for embedded
ExtentDataRefs, the list entry is the subvolume tree ID that the
ExtentDataRef points at, otherwise it is zero.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;EXTENT_DATA_REF&lt;/code&gt; items: a list of length 1, with the
sole member being the subvolume tree ID that the ExtentDataRef points
at.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="the-main-loop"&gt;4.3.3.2. the main loop&lt;/h4&gt;
&lt;p&gt;(see: &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees/rebuild.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildtrees/rebuild.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;Start with that scan data (graph + info about items), and also a
rebuilt forrest from the above algorithm, but with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;the flag set so that it refuses to look up a tree if it can't
look up all of that tree's ancestors&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;an additional "potential-item index" that is similar to the item
index. It is generated the same way and can cache/evict the same way;
the difference is that we invert the check for if the set of roots for a
node has overlap with the tree's set of roots; we're looking for
&lt;em&gt;potential&lt;/em&gt; nodes that we could add to this tree.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;some callbacks; we'll get to what we do in these callbacks in a
bit, but for now, what the callbacks are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;a callback that is called for each added/updated item when we add
a root.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;a callback that is called whenever we add a root&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;a callback that intercepts looking up a root item&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;a callback that intercepts resolving an UUID to an object
ID.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;(The callbacks are in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees/rebuild_treecb.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildtrees/rebuild_treecb.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;We have 5 unordered queues ("work lists"?); these are sets whose
members we'll sort and then process in that order when it's time to
drain them.&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;the tree queue: a list of tree IDs that we need to crawl&lt;/li&gt;
&lt;li&gt;the retry-item queue: for each tree ID, a set of items that we
should re-process if we add a root to that tree&lt;/li&gt;
&lt;li&gt;the added-item queue: a set of key/tree pairs identifying items that
have been added by adding a root to a tree&lt;/li&gt;
&lt;li&gt;the settled-item queue: a set of key/tree pairs that have not
just been added by adding a root, but we've also verified that they are
the newest-generation item with that key that we could add to the
tree.&lt;/li&gt;
&lt;li&gt;the augment queue: for each item that we want to add to a tree, the
list of roots that we could add to get that item.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The queues all start out empty, except for the tree queue, which we
seed with the &lt;code&gt;ROOT_TREE&lt;/code&gt;, the &lt;code&gt;CHUNK_TREE&lt;/code&gt;, and
the &lt;code&gt;BLOCK_GROUP_TREE&lt;/code&gt; (It is a "TODO" task that it should
probably also be seeded with the &lt;code&gt;TREE_LOG&lt;/code&gt;, but as I will
say below in the "future work" section, I don't actually understand the
&lt;code&gt;TREE_LOG&lt;/code&gt;, so I couldn't implement it).&lt;/p&gt;
&lt;p&gt;Now we're going to loop until the tree queue, added-item queue,
settled-item queue, and augment queue are all empty (all queues except
for the retry-item queue). Each loop "pass" has 3 substeps:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;Crawl the trees (drain the tree queue, fill the added-item
queue).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Either:&lt;/p&gt;
&lt;ol type="a"&gt;
&lt;li&gt;&lt;p&gt;if the added-item queue is non-empty: "settle" those items (drain
the added-item queue, fill the augment queue and the settled-item
queue).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;otherwise: process items (drain the settled-item queue, fill the
augment queue and the tree queue)&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Apply augments (drain the augment queue and maybe the retry-item
queue, fill the added-item queue).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;OK, let's look at those 3 substeps in more detail:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;Crawl the trees; drain the tree queue, fill the added-item
queue.&lt;/p&gt;
&lt;p&gt;We just look up the tree in the rebuilt forrest, which will (per the
above &lt;code&gt;--rebuild&lt;/code&gt; algorithm) either fail to look up the
tree, or succeed, and add to that tree the root node from the
superblock/root-item. Because we set an item-added callback, when adding
that root it will loop over the nodes added by that root, and call our
callback for each item in one of the added nodes. Our callback inserts
each item into the added-item queue. The forrest also calls our
root-added callback, but because of the way this algorithm works, that
turns out to be a no-op at this step.&lt;/p&gt;
&lt;p&gt;I mentioned that we added callbacks to intercept the forrest's
looking up of root items and resolving UUIDs; we override the forrest's
"lookup root item" routine and "resolve UUID" routine so that, instead of
doing normal lookups on the &lt;code&gt;ROOT_TREE&lt;/code&gt; and
&lt;code&gt;UUID_TREE&lt;/code&gt;, they use the &lt;code&gt;Want&lt;var&gt;XXX&lt;/var&gt;&lt;/code&gt;
routines that we'll define below in the "graph callbacks" section.&lt;/p&gt;
&lt;p&gt;It shouldn't matter what order this queue is processed in, but I sort
tree IDs numerically.&lt;/p&gt;
&lt;p&gt;The crawling is fairly fast because it's just in-memory; the only
accesses to disk are looking up root items and resolving UUIDs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Either:&lt;/p&gt;
&lt;ol type="a"&gt;
&lt;li&gt;&lt;p&gt;Settle items from the added-item queue to the settled-item queue
(and fill the augment queue).&lt;/p&gt;
&lt;p&gt;For each item in the queue, we look in the tree's item index to get
the {node, slot} pair for it, then we do the same in the tree's
potential-item index. If the potential-item index contains an entry for
the item's key, then we check if the potential-item's node should "win"
over the queue item's node, deciding the "winner" using the same routine
as when building the item index. If the potential-item's node wins, then
we add the potential node's set of roots to the augment queue. If the
queue-item's node wins, then we add the item to the settled-item queue
(except, as an optimization, if the item is of a type that cannot
possibly imply the existence of another item, then we just drop it and
don't add it to the settled-item queue).&lt;/p&gt;
&lt;p&gt;It shouldn't matter what order this queue is processed in, but I sort
it numerically by treeID and then by item key.&lt;/p&gt;
&lt;p&gt;This step is fairly fast because it's entirely in-memory, making no
accesses to disk.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Process items from the settled-item queue (drain the settled-item
queue, fill the augment queue and the tree queue).&lt;/p&gt;
&lt;p&gt;This step accesses disk, and so the order we process the queue in
turns out to be pretty important in order to keep our disk access
patterns cache-friendly. For the most part, we just sort each queue item
by tree, then by key. But, we have special handling for
&lt;code&gt;EXTENT_ITEM&lt;/code&gt;s, &lt;code&gt;METADATA_ITEM&lt;/code&gt;s, and
&lt;code&gt;EXTENT_DATA_REF&lt;/code&gt; items: We break &lt;code&gt;EXTENT_ITEM&lt;/code&gt;s
and &lt;code&gt;METADATA_ITEM&lt;/code&gt;s into "sub-items", treating each ref
embedded in them as a separate item. For those embedded items that are
&lt;code&gt;EXTENT_DATA_REF&lt;/code&gt;s, and for stand-alone
&lt;code&gt;EXTENT_DATA_REF&lt;/code&gt; items, we sort them not with the
&lt;code&gt;EXTENT_TREE&lt;/code&gt; items, but with the items of the tree that the
extent data ref points at. Recall that during the initial scan step, we
took note of which tree every extent data ref points at, so we can
perform this sort without accessing disk yet. This splitting does mean
that we may visit/read an &lt;code&gt;EXTENT_ITEM&lt;/code&gt; or
&lt;code&gt;METADATA_ITEM&lt;/code&gt; multiple times as we process the queue, but
to do otherwise is to solve MinLA, which is NP-hard, and I think that even
an optimal MinLA solution would still perform worse than this; there is a
reasonably lengthy discussion of this in a comment in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees/rebuild.go?id=18e6066c241cf3d252b6521150843ffc858d8434#n251"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildtrees/rebuild.go:sortSettledItemQueue()&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Now we loop over that sorted queue. In the code, this loop is
deceptively simple. Read the item, then pass it to a function that tells
us what other items are implied by it. That function is large, but
simple; it's just a giant table. The trick is how it tells us about
implied items; we give it a set of callbacks that it calls to tell us
these things; the real complexity is in the callbacks. These "graph
callbacks" will be discussed in detail below, but as an illustrative
example: It may call &lt;code&gt;.WantOff()&lt;/code&gt; with a tree ID, object ID,
item type, and offset to specify a precise item that it believes should
exist.&lt;/p&gt;
&lt;p&gt;If we encounter a &lt;code&gt;ROOT_ITEM&lt;/code&gt;, add the tree described by
that item to the tree queue.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;(Both the "can this item even imply the existence of another item"
check and the "what items are implied by this item" routine are in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/btrfscheck/graph.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./lib/btrfscheck/graph.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Apply augments; drain the augment queue (and maybe the retry-item
queue), fill the added-item queue.&lt;/p&gt;
&lt;p&gt;It is at this point that I call out that the augment queue isn't
implemented as a simple map/set like the others; the
&lt;code&gt;treeAugmentQueue struct&lt;/code&gt; has special handling for sets of
different sizes, optimizing the space for empty and len()==1 sized sets,
and falling back to the usual implementation for larger sets;
this is important because those small sets are the overwhelming
majority, and otherwise there's no way the program would be able to run
on my 32GB RAM laptop. Now that I think about it, I bet it would even be
worth it to add optimized storage for len()==2 sized sets.&lt;/p&gt;
&lt;p&gt;The reason is that each "want" from above is tracked in the queue
separately; if we were OK merging them, then this optimized storage
wouldn't be necessary. But we keep them separate, so that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;For all "wants", including ones with empty sets, graph callbacks
can check if a want has already been processed; avoiding re-doing any
work (see the description of the graph callbacks below).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For "wants" with non-empty sets, we can see how many different
"wants" could be satisfied with a given root, in order to decide which
root to choose.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Anyway, we loop over the trees in the augment queue. For each tree we
look at that tree's augment queue and look at all the choices of root
nodes to add (below), and decide on a list to add. Then we add each of
those roots to the tree; the adding of each root triggers several calls
to our item-added callback (filling the added-item queue), and our
root-added callback. The root-added callback moves any items from the
retry-item queue for this tree to the added-item queue.&lt;/p&gt;
&lt;p&gt;How do we decide between choices of root nodes to add? &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees/rebuild.go?id=18e6066c241cf3d252b6521150843ffc858d8434#n528"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildtrees/rebuild.go:resolveTreeAugments()&lt;/code&gt;&lt;/a&gt;
has a good comment explaining the criteria we'd like to optimize for,
and then code that does an OK-ish job of actually optimizing for
that:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;It loops over the augment queue for that tree, building a list of
possible roots, for each possible root making note of 3 things:&lt;/p&gt;
&lt;ol type="a"&gt;
&lt;li&gt;&lt;p&gt;how many "wants" that root satisfies,&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;how far from the tree the root's owner is (owner=tree is a distance
of 0, owner=tree.parent is a distance of 1, owner=tree.parent.parent is
a distance of 2, and so on), and&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;what the generation of that root is.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We sort that list first by highest-count-first, then by
lowest-distance-first, then by highest-generation-first.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We create a "return" set and an "illegal" set. We loop over the
sorted list; for each possible root if it is in the illegal set, we skip
it, otherwise we insert it into the return set and for each "want" that
includes this root we add all roots that satisfy that want to the
illegal set.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;It is important that the rebuilt forrest have the flag set so that it
refuses to look up a tree if it can't look up all of that tree's
ancestors; otherwise the potential-items index would be garbage as we
wouldn't have a good idea of which nodes are OK to consider; but this
does have the downside that it won't even attempt to improve a tree with
a missing parent. Perhaps the algorithm should flip the flag once the
loop terminates, and then re-seed the tree queue with each
&lt;code&gt;ROOT_ITEM&lt;/code&gt; from the &lt;code&gt;ROOT_TREE&lt;/code&gt;?&lt;/p&gt;
&lt;h4 id="graph-callbacks"&gt;4.3.3.3. graph callbacks&lt;/h4&gt;
&lt;p&gt;(see: &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees/rebuild_wantcb.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;./cmd/btrfs-rec/inspect/rebuildtrees/rebuild_wantcb.go&lt;/code&gt;&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;The graph callbacks are what tie the above together.&lt;/p&gt;
&lt;p&gt;For each of these callbacks, whenever I say that it looks up
something in a tree's item index or potential-item index, that implies
looking the tree up from the forrest; if the forrest cannot look up that
tree, then the callback returns early, after either:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;if we are in substep 1 and are processing a tree: we add the tree
that is being processed to the tree queue. (TODO: Wait, this assumes
that an augment will be applied to the &lt;code&gt;ROOT_TREE&lt;/code&gt; before the
next pass... if that isn't the case, this will result in the loop never
terminating... I guess I need to add a separate retry-tree
queue?)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;if we are in substep 2 and are processing an item: we add the
item that is being processed to the retry-item queue for the tree that
cannot be looked up&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The 6 methods in the &lt;code&gt;btrfscheck.GraphCallbacks&lt;/code&gt; interface
are:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;FSErr()&lt;/code&gt;: There's an error with the filesystem; this
callback just spits it out on stderr. I mention such a trivial matter
because, again, for a recovery tool I think it's worth putting care
into how you handle errors and where you expect them: We expect them here,
so we have to check for them to avoid reading invalid data or whatever,
but we don't actually need to do anything other than watch our
step.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Want()&lt;/code&gt;: We want an item in a given tree with a given
object ID and item type, but we don't care about what the item's offset
is.&lt;/p&gt;
&lt;p&gt;The callback works by searching the item index to see if it can find
such an item; if so, it has nothing else to do and returns. Otherwise,
it searches the potential-item index; for each matching item it finds, it
looks in the node index for the node containing that item, and adds the
roots that would add that node to a set. Once it
has finished searching the potential-item index, it adds that set to the
augment queue (even if that set is still empty).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;WantOff()&lt;/code&gt;: The same, but we want a specific
offset.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;WantDirIndex()&lt;/code&gt;: We want a &lt;code&gt;DIR_INDEX&lt;/code&gt;
item for a given inode and filename, but we don't know what the offset
of that item is.&lt;/p&gt;
&lt;p&gt;First we scan over the item index, looking at all
&lt;code&gt;DIR_INDEX&lt;/code&gt; items for that inode number. For each item, we
can check the scan data to see what the filename in that
&lt;code&gt;DIR_INDEX&lt;/code&gt; is, so we can see if the item satisfies this want
without accessing the disk. If there's a match, then there is nothing
else to do, so we return. Otherwise, we do that same search over the
potential-item index; if we find any matches, then we build the set of
roots to add to the augment queue the same as in
&lt;code&gt;Want&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;WantFileExt()&lt;/code&gt;: We want 1 or more
&lt;code&gt;EXTENT_DATA&lt;/code&gt; items in the given tree for the given inode,
and we want them to cover from 0 to a given size bytes of that file.&lt;/p&gt;
&lt;p&gt;First we walk that range in the item index, to build a list of the
gaps that we need to fill ("Step 1" in &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees/rebuild_wantcb.go?id=18e6066c241cf3d252b6521150843ffc858d8434#n260"&gt;&lt;code&gt;rebuild_wantcb.go:_wantRange()&lt;/code&gt;&lt;/a&gt;).
This walk (&lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/cmd/btrfs-rec/inspect/rebuildtrees/rebuild_wantcb.go?id=18e6066c241cf3d252b6521150843ffc858d8434#n195"&gt;&lt;code&gt;rebuild_wantcb.go:_walkRange()&lt;/code&gt;&lt;/a&gt;)
requires knowing the size of each file extent; so doing this quickly
without hitting disk is why we recorded the size of each file extent in
our initialization step.&lt;/p&gt;
&lt;p&gt;Then ("Step 2" in &lt;code&gt;_wantRange()&lt;/code&gt;) we iterate over each of
the gaps, and for each gap do a very similar walk (again, by calling
&lt;code&gt;_walkRange()&lt;/code&gt;), but this time over the potential-item index.
For each file extent we find that is entirely within the gap, we
"want" that extent, and move the beginning of the gap forward to the
end of that extent. This algorithm is dumb and greedy, potentially
making sub-optimal selections; and so could probably stand to be
improved; but in my real-world use, it seems to be "good
enough".&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;WantCSum()&lt;/code&gt;: We want 1 or more
&lt;code&gt;EXTENT_CSUM&lt;/code&gt; items to cover the half-open interval
[&lt;code&gt;lo_logical_addr&lt;/code&gt;, &lt;code&gt;hi_logical_addr&lt;/code&gt;). Well,
maybe. It also takes a subvolume ID and an inode number; and looks up in
the scan data whether that inode has the &lt;code&gt;INODE_NODATASUM&lt;/code&gt;
flag set; if it does have the flag set, then it returns early without
looking for any &lt;code&gt;EXTENT_CSUM&lt;/code&gt; items. If it doesn't return
early, then it performs the same want-range routine as
&lt;code&gt;WantFileExt&lt;/code&gt;, but with the appropriate tree, object ID, and
item types for csums as opposed to data extents.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For each of these callbacks, we generate a "wantKey", a tuple
representing the function and its arguments; we check the augment-queue
to see if we've already enqueued a set of roots for that want, and if
so, that callback can return early without checking the potential-item
index.&lt;/p&gt;
&lt;h1 id="future-work"&gt;5. Future work&lt;/h1&gt;
&lt;p&gt;It's in a reasonably useful place, I think; and so now I'm going to
take a break from it for a while. But there's still lots of work to
do:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;RAID almost certainly doesn't work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Encryption is not implemented.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It doesn't understand (ignores) the &lt;code&gt;TREE_LOG&lt;/code&gt;
(because I don't understand the &lt;code&gt;TREE_LOG&lt;/code&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;btrfs-rec inspect mount&lt;/code&gt; should add "lost+found"
directories for inodes that are included in the subvolume's tree but
aren't reachable from the tree's root inode&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I still need to implement &lt;code&gt;btrfs-rec repair
&lt;var&gt;SUBCMD&lt;/var&gt;&lt;/code&gt; subcommands to write rebuilt-information from
&lt;code&gt;btrfs-rec inspect&lt;/code&gt; back to the filesystem.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I need to figure out the error handling/reporting story for
&lt;code&gt;mount&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It needs a lot more tests&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I'd like to get the existing btrfs-progs fsck tests to run on
it.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the process of writing this email, I realized that I probably
need to add a retry-tree queue; see the "graph callbacks" section in the
description of the &lt;code&gt;rebuild-trees&lt;/code&gt; algorithm above.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There are a number of "TODO" comments or panics in the code:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Some of them definitely need done.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Some of them are &lt;code&gt;panic("TODO")&lt;/code&gt; on the basis that if
it's seeing something on the filesystem that it doesn't recognize, it's
probably that I didn't get to implementing that thing/situation, but
it's possible that the thing is just corrupt. This should only be for
situations that the node passed the checksum test, so it being corrupt
would have to be caused by a bug in btrfs rather than a failing drive or
other corruption; I wasn't too worried about btrfs bugs.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;btrfs-rec inspect rebuild-trees&lt;/code&gt; is slow, and can
probably be made a lot faster.&lt;/p&gt;
&lt;p&gt;Just to give you an idea of the speeds, the run-times for the various
steps on my ThinkPad E15 for a 256GB disk image are as follows:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt; btrfs-rec inspect rebuild-mappings scan       :     7m 31s
 btrfs-rec inspect rebuild-mappings list-nodes :        47s
 btrfs-rec inspect rebuild-mappings process    :     8m 22s
 btrfs-rec inspect rebuild-trees               : 1h  4m 55s
 btrfs-rec inspect ls-files                    :    29m 55s
 btrfs-rec inspect ls-trees                    :     8m 40s&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For the most part, it's all single-threaded (with the main exception
that in several places I/O has been moved to a separate thread from the
main CPU-heavy thread), but a lot of the algorithms could be
parallelized.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There are a lot of "tunable" values that I haven't really spent
time tuning. These are all annotated with &lt;a
href="https://git.lukeshu.com/btrfs-progs-ng/tree/lib/textui/tunable.go?id=18e6066c241cf3d252b6521150843ffc858d8434"&gt;&lt;code&gt;textui.Tunable()&lt;/code&gt;&lt;/a&gt;.
I sort-of intended for them to be adjustable on the CLI.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Perhaps the &lt;code&gt;btrfs inspect rebuild-trees&lt;/code&gt; algorithm
could be adjusted to also try to rebuild trees with missing parents; see
the above discussion of the algorithm.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id="problems-for-merging-this-code-into-btrfs-progs"&gt;6. Problems for
merging this code into btrfs-progs&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;It's written in Go, not C.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It's effectively GPLv3+ (not GPLv2-only or GPLv2+) because of use
of some code under the Apache 2.0 license (2 files in the codebase
itself that are based off of Apache-licensed code, and use of unmodified
3rd-party libraries).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It uses ARC (Adaptive Replacement Cache), which is patented by
IBM, and the patent doesn't expire for another 7 months. An important
property of ARC over LRU is that it is scan-resistant; the above
algorithms do a lot of scanning. On that note, now that Red Hat is owned
by IBM: who in the company do we need to get to talk to each other so
that we can get ARC into the Linux kernel before then?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div style="font-family: monospace"&gt;
&lt;p&gt;-- &lt;br/&gt; Happy hacking,&lt;br/&gt; ~ Luke Shumaker&lt;br/&gt;&lt;/p&gt;
&lt;/div&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2023 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./posix-pricing.html"/>
		<link rel="alternate" type="text/markdown" href="./posix-pricing.md"/>
		<id>https://lukeshu.com/blog/posix-pricing.html</id>
		<updated>2018-02-09T00:00:00+00:00</updated>
		<published>2018-02-09T00:00:00+00:00</published>
		<title>POSIX pricing and availability; or: Do you really need the PDF?</title>
		<content type="html">&lt;h1
id="posix-pricing-and-availability-or-do-you-really-need-the-pdf"&gt;POSIX
pricing and availability; or: Do you really need the PDF?&lt;/h1&gt;
&lt;p&gt;The Open Group and IEEE are weird about POSIX pricing. They’re
protective of the PDF, making you pay &lt;a
href="http://standards.ieee.org/findstds/standard/1003.1-2008.html"&gt;hundreds
of dollars&lt;/a&gt; for the PDF; but will happily post an HTML version for
free both &lt;a
href="http://pubs.opengroup.org/onlinepubs/9699919799/"&gt;online&lt;/a&gt;, and
(with free account creation) download as &lt;a
href="https://www2.opengroup.org/ogsys/catalog/t101"&gt;a .zip&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;They also offer a special license to the “Linux man-pages” project,
allowing them to &lt;a
href="https://www.kernel.org/pub/linux/docs/man-pages/man-pages-posix/"&gt;distribute&lt;/a&gt;
the man page portions of POSIX (most of it is written as a series of man
pages) for free; so on a GNU/Linux box, you probably have most of POSIX
already downloaded in manual sections 0p, 1p, and 3p.&lt;/p&gt;
&lt;p&gt;Anyway, the only thing you aren’t getting with the free HTML version
is a line number next to every line of text. It’s generated from the
same troff sources. So, in an article or in a discussion, I’m not
cheating you out of specification details by citing the webpage.&lt;/p&gt;
&lt;p&gt;If you’re concerned that you’re looking at the correct version of the
webpage or man pages, the current version (as of February 2018) of POSIX
is “POSIX-2008, 2016 edition.”&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2018 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./kbd-xmodmap.html"/>
		<link rel="alternate" type="text/markdown" href="./kbd-xmodmap.md"/>
		<id>https://lukeshu.com/blog/kbd-xmodmap.html</id>
		<updated>2018-02-09T00:00:00+00:00</updated>
		<published>2018-02-09T00:00:00+00:00</published>
		<title>GNU/Linux Keyboard Maps: xmodmap</title>
		<content type="html">&lt;h1 id="gnulinux-keyboard-maps-xmodmap"&gt;GNU/Linux Keyboard Maps:
xmodmap&lt;/h1&gt;
&lt;p&gt;The modmap subsystem is part of the core &lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html"&gt;X11
protocol&lt;/a&gt;. However, it has been replaced by the &lt;a
href="https://www.x.org/releases/current/doc/kbproto/xkbproto.html"&gt;X
Keyboard (XKB) Extension&lt;/a&gt; to the protocol, which defines a facade
that emulates the legacy modmap subsystem so that old programs still
work—including those that manipulate the modmap directly!&lt;/p&gt;
&lt;p&gt;For people who like to Keep It Stupid Simple, the XKB extension looks
horribly complicated and gross—even ignoring protocol details, the
configuration syntax is a monstrosity! There’s no way to say something
like “I’d like to remap Caps-Lock to be Control”, you have to copy and
edit the entire keyboard definition, which includes mucking with vector
graphics of the physical keyboard layout! So it’s very tempting to
pretend that XKB doesn’t exist, and it’s still using modmap.&lt;/p&gt;
&lt;p&gt;However, this is a leaky abstraction; for instance: when running the
&lt;code&gt;xmodmap&lt;/code&gt; command to manipulate the modmap, if you have
multiple keyboards plugged in, the result can depend on which keyboard
you used to press “enter” after typing the command!&lt;/p&gt;
&lt;p&gt;Despite only existing as a compatibility shim today, I think it is
important to understand the modmap subsystem to understand modern
XKB.&lt;/p&gt;
&lt;h2 id="conceptual-overview"&gt;Conceptual overview&lt;/h2&gt;
&lt;p&gt;There are 3 fundamental tasks that the modmap subsystem performs:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;code&gt;keyboard: map  keycode         -&amp;gt; keysym&lt;/code&gt;
(client-side)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;keyboard: map  keycode         -&amp;gt; modifier bitmask&lt;/code&gt;
(server-side)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pointer:  map  physical button -&amp;gt; logical button&lt;/code&gt;
(server-side)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;You’re thinking: “Great, so the X server does these things for us!”
Nope! Not entirely, anyway. It does the keycode-&amp;gt;modifier lookup, and
the mouse-button lookup, but the keycode-&amp;gt;keysym lookup must be done
client-side by querying the mapping stored on the server. Generally,
this is done automatically inside of libX11/libxcb, and the actual
client application code doesn’t need to worry about it.&lt;/p&gt;
&lt;p&gt;So, what’s the difference between a keycode and a keysym, and how’s
the modifier bitmask work?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;keycode: A numeric ID for a hardware button; this is as close to
the hardware as X11 modmaps let us get. These are conceptually identical
to Linux kernel keycodes, but the numbers don’t match up. Xorg keycodes
are typically &lt;code&gt;linux_keycode+8&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;keysym: A 29-bit integer code that is meaningful to applications.
A mapping of these to symbolic names is defined in
&lt;code&gt;&amp;lt;X11/keysymdef.h&amp;gt;&lt;/code&gt; and augmented by
&lt;code&gt;/usr/share/X11/XKeysymDB&lt;/code&gt;. See:
&lt;code&gt;XStringToKeysym()&lt;/code&gt; and &lt;code&gt;XKeysymToString()&lt;/code&gt;. We
will generally use the symbolic name in the modmap file. The symbolic
names are case-sensitive.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Modifier state: An 8-bit bitmask of modifier keys (names are
case-insensitive):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;1 &amp;lt;&amp;lt; 0 : shift
1 &amp;lt;&amp;lt; 1 : lock
1 &amp;lt;&amp;lt; 2 : control
1 &amp;lt;&amp;lt; 3 : mod1
1 &amp;lt;&amp;lt; 4 : mod2
1 &amp;lt;&amp;lt; 5 : mod3
1 &amp;lt;&amp;lt; 6 : mod4
1 &amp;lt;&amp;lt; 7 : mod5&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ul&gt;
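&lt;p&gt;To make the modifier bitmask concrete, here is a minimal Python sketch that decodes an 8-bit modifier state into the names listed above. The function name &lt;code&gt;active_modifiers&lt;/code&gt; is made up for this illustration; it is not part of libX11 or libxcb.&lt;/p&gt;

```python
# Modifier names in bit order, lowest bit first (bit 0 is shift, bit 7 is mod5).
MODIFIER_NAMES = ("shift", "lock", "control", "mod1", "mod2", "mod3", "mod4", "mod5")


def active_modifiers(state):
    """Return the names of the modifier bits that are set in an 8-bit state."""
    names = []
    for name in MODIFIER_NAMES:
        # Peel off the lowest bit; it says whether this modifier is active.
        state, bit = divmod(state, 2)
        if bit:
            names.append(name)
    return names


if __name__ == "__main__":
    # 0b00000101 has bits 0 and 2 set.
    print(active_modifiers(0b00000101))
```

&lt;p&gt;Which physical keys sit behind, say, mod1 or mod4 is exactly what the modifier mapping decides.&lt;/p&gt;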
&lt;p&gt;With that knowledge, and the libX11/libxcb API docs, you can probably
figure out how to interact with the modmap subsystem from C, but who
does that? Everyone just uses the &lt;code&gt;xmodmap(1)&lt;/code&gt; command.&lt;/p&gt;
&lt;h2 id="the-x11-protocol"&gt;The X11 protocol&lt;/h2&gt;
&lt;p&gt;As I said, the modifier and button lookup is handled server-side;
each of the &lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#events:input"&gt;input
events&lt;/a&gt; ({Key,Button}{Press,Release}, and MotionNotify) and &lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#events:pointer_window"&gt;pointer
window events&lt;/a&gt; ({Enter,Leave}Notify) include a bitmask of active
keyboard modifiers and pointer buttons. Each are given an 8-bit
bitmask—hence 8 key modifiers. For some reason, only up to Button5 is
included in the bitmask; the upper 3 bits are always zero; but the
Button{Press,Release} events will happily deliver events for up to
Button255!&lt;/p&gt;
&lt;p&gt;The X11 protocol has 6 request types for dealing with these 3
mappings; an accessor and a mutator pair for each. Since 2 of the
mappings are done server-side, of these, most clients will only use
GetKeyboardMapping. Anyway, let’s look at those 6 requests, grouped by
the mappings that they work with (pardon the Java-like pseudo-code
syntax for indicating logical argument and return types):&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;keyboard: map  keycode         -&amp;gt; keysym&lt;/code&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:GetKeyboardMapping"&gt;GetKeyboardMapping&lt;/a&gt;
::
&lt;code&gt;List&amp;lt;keycode&amp;gt; -&amp;gt; Map&amp;lt;keycode,List&amp;lt;keysym&amp;gt;&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:ChangeKeyboardMapping"&gt;ChangeKeyboardMapping&lt;/a&gt;
:: &lt;code&gt;Map&amp;lt;keycode,List&amp;lt;keysym&amp;gt;&amp;gt; -&amp;gt; ()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;GetKeyboardMapping&lt;/code&gt; returns the keycode-&amp;gt;keysym
mappings for the requested keycodes; this way clients can choose to look
up only the keycodes that they need to handle (the ones that got sent to
them). Each keycode gets a list of keysyms; which keysym they should use
from that list depends on which modifiers are pressed.
&lt;code&gt;ChangeKeyboardMapping&lt;/code&gt; changes the mapping for the given
keycodes; not all keycodes must be given, any keycodes that aren’t
included in the request aren’t changed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;keyboard: map  keycode         -&amp;gt; modifier bitmask&lt;/code&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:GetModifierMapping"&gt;GetModifierMapping&lt;/a&gt;
:: &lt;code&gt;() -&amp;gt; Map&amp;lt;modifier,List&amp;lt;keycode&amp;gt;&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:SetModifierMapping"&gt;SetModifierMapping&lt;/a&gt;
:: &lt;code&gt;Map&amp;lt;modifier,List&amp;lt;keycode&amp;gt;&amp;gt; -&amp;gt; ()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The modifiers mapping is a lot smaller than the keysym mapping; you
must operate on the entire mapping at once. For each modifier bit,
there’s a list of keycodes that will cause that modifier bit to be
flipped in the events that are delivered while it is pressed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;pointer:  map  physical button -&amp;gt; logical button&lt;/code&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:GetPointerMapping"&gt;GetPointerMapping&lt;/a&gt;
:: &lt;code&gt;() -&amp;gt; List&amp;lt;logicalButton&amp;gt;&lt;/code&gt; (indexed by
&lt;code&gt;physicalButton-1&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:SetPointerMapping"&gt;SetPointerMapping&lt;/a&gt;
:: &lt;code&gt;List&amp;lt;logicalButton&amp;gt; -&amp;gt; ()&lt;/code&gt; (indexed by
&lt;code&gt;physicalButton-1&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Like the modifier mapping, the button mapping is expected to be
small, most mice only have 5-7 buttons (left, middle, right, scroll up,
scroll down, scroll left, scroll right—that’s right, X11 handles scroll
events as button presses), though some fancy gaming mice have more than
that, but not much more.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I mentioned earlier that the keycode-&amp;gt;keysym mapping isn’t
actually done by the X server, and is done in the client; whenever a
client receives a key event or pointer button event, it must do a
&lt;code&gt;Get*Mapping&lt;/code&gt; request to see what that translates to. Of
course, doing that for every keystroke would be crazy; but at the same
time, each client is expected to know about changes to the mappings
that happen at run-time. So, each of the “set”/“change” commands
generate a &lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#events:MappingNotify"&gt;MappingNotify&lt;/a&gt;
event that is sent to all clients, so they know when they must dump
their cache of mappings.&lt;/p&gt;
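&lt;p&gt;The caching behavior described above might look roughly like this in Python. This is only a sketch: &lt;code&gt;KeymapCache&lt;/code&gt; and its &lt;code&gt;fetch_mapping&lt;/code&gt; callable are hypothetical stand-ins for what libX11/libxcb do internally, and no real X connection is involved.&lt;/p&gt;

```python
class KeymapCache:
    """Client-side cache of keycode-to-keysym lists, dropped on MappingNotify."""

    def __init__(self, fetch_mapping):
        # fetch_mapping is a hypothetical callable standing in for a
        # GetKeyboardMapping round trip: keycode in, list of keysym names out.
        self._fetch = fetch_mapping
        self._cache = {}

    def keysyms_for(self, keycode):
        """Return the keysym list for keycode, asking the server only on a miss."""
        if keycode not in self._cache:
            self._cache[keycode] = self._fetch(keycode)
        return self._cache[keycode]

    def on_mapping_notify(self):
        """Call when a MappingNotify event arrives: forget all cached mappings."""
        self._cache.clear()
```

&lt;p&gt;The point of the design is that the expensive round trip happens once per keycode, and again only after the server says that something changed.&lt;/p&gt;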
&lt;p&gt;For completeness, if you are looking at this as background for
understanding XKB, I should also mention:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:GetKeyboardControl"&gt;GetKeyboardControl&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:ChangeKeyboardControl"&gt;ChangeKeyboardControl&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:GetPointerControl"&gt;GetPointerControl&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://www.x.org/releases/current/doc/xproto/x11protocol.html#requests:ChangePointerControl"&gt;ChangePointerControl&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="the-xmodmap-command"&gt;The &lt;code&gt;xmodmap&lt;/code&gt; command&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;xmodmap&lt;/code&gt; command reads a configuration file and
modifies the maps in the X server to match. The &lt;code&gt;xmodmap&lt;/code&gt;
config file has its own little quirky syntax. For one, the comment
character is &lt;code&gt;!&lt;/code&gt; (and comments may only start at the
&lt;em&gt;beginning&lt;/em&gt; of the line, but that’s fairly common).&lt;/p&gt;
&lt;p&gt;There are 8 commands that &lt;code&gt;xmodmap&lt;/code&gt; recognizes. Let’s look
at those, grouped by the 3 tasks that the modmap subsystem performs:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;keyboard: map  keycode         -&amp;gt; keysym&lt;/code&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;keycode KEYCODE = PLAIN [SHIFT [MODE_SWITCH [MODE_SWITCH+SHIFT ]]]&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Actually takes a list of up to 8 keysyms, but only the first 4 have
standard uses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;keysym OLD_KEYSYM = NEW_KEYSYMS...&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Takes the keycodes mapped to &lt;code&gt;OLD_KEYSYM&lt;/code&gt; and maps them to
&lt;code&gt;NEW_KEYSYMS&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;keysym any = KEYSYMS...&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Finds an otherwise unused keycode, and has it map to the specified
keysyms.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;keyboard: map  keycode         -&amp;gt; modifier bitmask&lt;/code&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;clear MODIFIERNAME&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;add MODIFIERNAME = KEYSYMS...&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;remove MODIFIERNAME = KEYSYMS...&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Wait, the modmap subsystem maps &lt;em&gt;keycodes&lt;/em&gt; to modifiers, but
the commands take &lt;em&gt;keysyms&lt;/em&gt;? Yup! When executing one of these
commands, it first looks up those keysyms in the keyboard map to
translate them into a set of keycodes, then associates those keycodes
with that modifier. But how does it look up keysym-&amp;gt;keycode; the
protocol only supports querying keycode-&amp;gt;keysym? It &lt;a
href="https://cgit.freedesktop.org/xorg/app/xmodmap/tree/handle.c?h=xmodmap-1.0.9#n59"&gt;loops&lt;/a&gt;
over &lt;em&gt;every&lt;/em&gt; keycode finding all the matches.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;pointer:  map  physical button -&amp;gt; logical button&lt;/code&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;pointer = default&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;This is equivalent to &lt;code&gt;pointer = 1 2 3 4 5 6...&lt;/code&gt; where the
list is as long as the number of buttons that there are.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;pointer = NUMBERS...&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;pointer = A B C D...&lt;/code&gt; sets the physical button 1 to
logical button A, physical button 2 to logical button B, and so on.
Setting a physical button to logical button 0 disables that
button.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ol&gt;
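&lt;p&gt;Two of the behaviors above are easy to sketch in Python: the brute-force keysym-to-keycode scan, and the pointer list lookup where logical button 0 means "disabled". Both function names are invented for this example, and the keyboard map is modeled as a plain dictionary from keycode to a list of keysym names rather than anything fetched from a real server.&lt;/p&gt;

```python
def keycodes_for_keysym(keymap, keysym):
    """Scan every keycode and return those whose keysym list contains keysym."""
    return sorted(keycode for keycode, keysyms in keymap.items() if keysym in keysyms)


def logical_button(pointer_map, physical_button):
    """Translate a 1-based physical button number through the pointer map.

    pointer_map is indexed by physical_button - 1; an entry of 0 means the
    physical button is disabled, which is reported here as None.
    """
    logical = pointer_map[physical_button - 1]
    return None if logical == 0 else logical


if __name__ == "__main__":
    # Hypothetical map in which two keycodes both produce Control_L.
    keymap = {37: ["Control_L"], 66: ["Control_L"], 50: ["Shift_L"]}
    print(keycodes_for_keysym(keymap, "Control_L"))
    # "pointer = 3 2 1" swaps the outer buttons.
    print(logical_button([3, 2, 1], 1))
```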
&lt;h2 id="appendix"&gt;Appendix:&lt;/h2&gt;
&lt;p&gt;I use this snippet in my Emacs configuration to make editing xmodmap
files nicer:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;;; http://www.emacswiki.org/emacs/XModMapMode
(when (not (fboundp &amp;#39;xmodmap-mode))
  (define-generic-mode &amp;#39;xmodmap-mode
    &amp;#39;(?!)
    &amp;#39;(&amp;quot;add&amp;quot; &amp;quot;clear&amp;quot; &amp;quot;keycode&amp;quot; &amp;quot;keysym&amp;quot; &amp;quot;pointer&amp;quot; &amp;quot;remove&amp;quot;)
    nil
    &amp;#39;(&amp;quot;[xX]modmap\\(rc\\)?\\&amp;#39;&amp;quot;)
    nil
    &amp;quot;Simple mode for xmodmap files.&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2018 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./crt-sh-architecture.html"/>
		<link rel="alternate" type="text/markdown" href="./crt-sh-architecture.md"/>
		<id>https://lukeshu.com/blog/crt-sh-architecture.html</id>
		<updated>2018-02-09T00:00:00+00:00</updated>
		<published>2018-02-09T00:00:00+00:00</published>
		<title>The interesting architecture of crt.sh</title>
		<content type="html">&lt;h1 id="the-interesting-architecture-of-crt.sh"&gt;The interesting
architecture of crt.sh&lt;/h1&gt;
&lt;p&gt;A while back I wrote myself a little dashboard for monitoring TLS
certificates for my domains. Right now it works by talking to &lt;a
href="https://crt.sh/" class="uri"&gt;https://crt.sh/&lt;/a&gt;. Sometimes this
works great, but sometimes crt.sh is really slow. Plus, it’s another
thing that could be compromised.&lt;/p&gt;
&lt;p&gt;So, I started looking at how crt.sh works. It’s kinda cool.&lt;/p&gt;
&lt;p&gt;There are only 3 separate processes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Cron
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://github.com/crtsh/ct_monitor"&gt;&lt;code&gt;ct_monitor&lt;/code&gt;&lt;/a&gt;
is a program that uses libcurl to get CT log changes and libpq to put them
into the database.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;PostgreSQL
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://github.com/crtsh/certwatch_db"&gt;&lt;code&gt;certwatch_db&lt;/code&gt;&lt;/a&gt;
is the core web application, written in PL/pgSQL. It even includes the
HTML templating and query parameter handling. Of course, there are a
couple of things not entirely done in pgSQL…&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://github.com/crtsh/libx509pq"&gt;&lt;code&gt;libx509pq&lt;/code&gt;&lt;/a&gt;
adds a set of &lt;code&gt;x509_*&lt;/code&gt; functions callable from pgSQL for
parsing X509 certificates.&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://github.com/crtsh/libcablintpq"&gt;&lt;code&gt;libcablintpq&lt;/code&gt;&lt;/a&gt;
adds the &lt;code&gt;cablint_embedded(bytea)&lt;/code&gt; function to pgSQL.&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://github.com/crtsh/libx509lintpq"&gt;&lt;code&gt;libx509lintpq&lt;/code&gt;&lt;/a&gt;
adds the &lt;code&gt;x509lint_embedded(bytea,integer)&lt;/code&gt; function to
pgSQL.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Apache HTTPD
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://github.com/crtsh/mod_certwatch"&gt;&lt;code&gt;mod_certwatch&lt;/code&gt;&lt;/a&gt;
is a pretty thin wrapper that turns every HTTP request into an SQL
statement sent to PostgreSQL, via…&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://github.com/crtsh/mod_pgconn"&gt;&lt;code&gt;mod_pgconn&lt;/code&gt;&lt;/a&gt;,
which manages PostgreSQL connections.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The interface exposes HTML, ATOM, and JSON. All from code written in
SQL.&lt;/p&gt;
&lt;p&gt;And then I guess it’s behind an nginx-based load-balancer or somesuch
(based on the 504 Gateway Timeout messages it’s given me). But that’s not
interesting.&lt;/p&gt;
&lt;p&gt;The actual website is &lt;a
href="https://groups.google.com/d/msg/mozilla.dev.security.policy/EPv_u9V06n0/gPJY5T7ILlQJ"&gt;run
from a read-only slave&lt;/a&gt; of the master DB that the
&lt;code&gt;ct_monitor&lt;/code&gt; cron-job updates; which makes several security
considerations go away, and makes horizontal scaling easy.&lt;/p&gt;
&lt;p&gt;Anyway, I thought it was neat that so much of it runs inside the
database; you don’t see that terribly often. I also thought the little
shims to make that possible were neat. I didn’t get deep enough into it
to end up running my own instance or clone, but I thought my notes on it
were worth sharing.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2018 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./http-notes.html"/>
		<link rel="alternate" type="text/markdown" href="./http-notes.md"/>
		<id>https://lukeshu.com/blog/http-notes.html</id>
		<updated>2016-09-30T00:00:00+00:00</updated>
		<published>2016-09-30T00:00:00+00:00</published>
		<title>Notes on subtleties of HTTP implementation</title>
		<content type="html">&lt;h1 id="notes-on-subtleties-of-http-implementation"&gt;Notes on subtleties
of HTTP implementation&lt;/h1&gt;
&lt;p&gt;I may add to this as time goes on, but I’ve written up some notes on
subtleties of HTTP/1.1 message syntax as specified in RFC 7230.&lt;/p&gt;
&lt;h2 id="why-the-absolute-form-is-used-for-proxy-requests"&gt;Why the
absolute-form is used for proxy requests&lt;/h2&gt;
&lt;p&gt;&lt;a
href="https://tools.ietf.org/html/rfc7230#section-5.3.2"&gt;RFC7230§5.3.2&lt;/a&gt;
says that a (non-CONNECT) request to an HTTP proxy should look like&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;GET http://authority/path HTTP/1.1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;rather than the usual&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;GET /path HTTP/1.1
Host: authority&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And doesn’t give a hint as to why the message syntax is different
here.&lt;/p&gt;
&lt;p&gt;&lt;a
href="https://parsiya.net/blog/2016-07-28-thick-client-proxying---part-6-how-https-proxies-work/#3-1-1-why-not-use-the-host-header"&gt;A
blog post by Parsia Hakimian&lt;/a&gt; claims that the reason is that it’s a
legacy behavior inherited from HTTP/1.0, which had proxies, but not the
Host header field. Which is mostly true. But we can also realize that
the usual syntax does not allow specifying a URI scheme, which means
that we cannot specify a transport. Sure, the only two HTTP transports
we might expect to use today are TCP (scheme: http) and TLS (scheme:
https), and TLS requires we use a CONNECT request to the proxy, meaning
that the only option left is a TCP transport; but that is no reason to
avoid building generality into the protocol.&lt;/p&gt;
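&lt;p&gt;As a small illustration of the two request forms, here is a Python sketch that builds the head of a request either way. The function name and its parameters are made up for this example. Note that it emits a Host field in both cases, since RFC 7230 §5.4 requires clients to send one in every HTTP/1.1 request, even though the absolute-form example above leaves it out for brevity.&lt;/p&gt;

```python
def request_head(method, scheme, authority, path, via_proxy):
    """Build the request line and Host field for a body-less HTTP/1.1 request.

    With via_proxy=True the target is in absolute-form (scheme included);
    otherwise it is in origin-form (path only).
    """
    if via_proxy:
        target = f"{scheme}://{authority}{path}"
    else:
        target = path
    return f"{method} {target} HTTP/1.1\r\nHost: {authority}\r\n\r\n"


if __name__ == "__main__":
    print(request_head("GET", "http", "authority", "/path", via_proxy=True))
    print(request_head("GET", "http", "authority", "/path", via_proxy=False))
```

&lt;p&gt;Only the absolute-form carries the scheme, which is the piece of information the origin-form has no place for.&lt;/p&gt;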
&lt;h2 id="on-taking-short-cuts-based-on-early-header-field-values"&gt;On
taking short-cuts based on early header field values&lt;/h2&gt;
&lt;p&gt;&lt;a
href="https://tools.ietf.org/html/rfc7230#section-3.2.2"&gt;RFC7230§3.2.2&lt;/a&gt;
says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;The order in which header fields with differing field names are
received is not significant.  However, it is good practice to send
header fields that contain control data first, such as Host on
requests and Date on responses, so that implementations can decide
when not to handle a message as early as possible.&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;Which is great! We can make an optimization!&lt;/p&gt;
&lt;p&gt;This is only a valid optimization for deciding to &lt;em&gt;not handle&lt;/em&gt;
a message. You cannot use it to decide to route to a backend early based
on this. Part of the reason is that &lt;a
href="https://tools.ietf.org/html/rfc7230#section-5.4"&gt;§5.4&lt;/a&gt; tells us
we must inspect the entire header field set to know if we need to
respond with a 400 status code:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;A server MUST respond with a 400 (Bad Request) status code to any
HTTP/1.1 request message that lacks a Host header field and to any
request message that contains more than one Host header field or a
Host header field with an invalid field-value.&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;However, if I decide not to handle a request based on the Host header
field, the correct thing to do is to send a 404 status code. Which
implies that I have parsed the remainder of the header field set to
validate the message syntax. We need to parse the entire field-set to
know if we need to send a 400 or a 404. Did this just kill the
possibility of using the optimization?&lt;/p&gt;
&lt;p&gt;Well, there are a number of “A server MUST respond with a XXX code
if” rules that can all be triggered on the same request. So we get to
choose which to use. And fortunately for optimizing implementations, &lt;a
href="https://tools.ietf.org/html/rfc7230#section-3.2.5"&gt;§3.2.5&lt;/a&gt; gave
us:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;A server that receives a            ...           set of fields,
larger than it wishes to process MUST respond with an appropriate 4xx
(Client Error) status code.&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;Since the header field set is longer than we want to process (since
we want to short-cut processing), we are free to respond with whichever
4XX status code we like!&lt;/p&gt;
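&lt;p&gt;For contrast, here is a minimal Python sketch of the non-short-cut path: the §5.4 check, which can only be answered after the whole header field set has been read. The function name is hypothetical, the header fields are assumed to be already parsed into (name, value) pairs, and validation of the Host field-value itself is deliberately left out.&lt;/p&gt;

```python
def host_check_status(header_fields):
    """Return 400 if the Host field is missing or repeated, otherwise None.

    header_fields is a list of (name, value) pairs covering the complete
    header section; field names are compared case-insensitively.
    """
    hosts = [value for name, value in header_fields if name.lower() == "host"]
    if len(hosts) != 1:
        return 400
    return None


if __name__ == "__main__":
    print(host_check_status([("Host", "a.example"), ("Accept", "*/*")]))
    print(host_check_status([("Host", "a.example"), ("host", "b.example")]))
```

&lt;p&gt;A server taking the short-cut would stop reading before this check could be completed, and lean on §3.2.5 to justify whichever 4XX code it sends.&lt;/p&gt;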
&lt;h2 id="on-normalizing-target-uris"&gt;On normalizing target URIs&lt;/h2&gt;
&lt;p&gt;An implementer is tempted to normalize URIs all over the place, just
for safety and sanitation. After all, &lt;a
href="https://tools.ietf.org/html/rfc3986#section-6.1"&gt;RFC3986§6.1&lt;/a&gt;
says it’s safe!&lt;/p&gt;
&lt;p&gt;Unfortunately, most URI normalization implementations will normalize
an empty path to “/”. Which is not always safe; &lt;a
href="https://tools.ietf.org/html/rfc7230#section-2.7.3"&gt;RFC7230§2.7.3&lt;/a&gt;,
which defines this “equivalence”, actually says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;                                        When not being used in
absolute form as the request target of an OPTIONS request, an empty
path component is equivalent to an absolute path of &amp;quot;/&amp;quot;, so the
normal form is to provide a path of &amp;quot;/&amp;quot; instead.&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;Which means we can’t use the usual normalization implementation if we
are making an OPTIONS request!&lt;/p&gt;
&lt;p&gt;Why is that? Well, if we turn to &lt;a
href="https://tools.ietf.org/html/rfc7230#section-5.3.4"&gt;§5.3.4&lt;/a&gt;, we
find the answer. One of the special cases for when the request target is
not a URI, is that we may use “*” as the target for an OPTIONS request
to request information about the origin server itself, rather than a
resource on that server.&lt;/p&gt;
&lt;p&gt;However, as discussed above, the target in a request to a proxy must
be an absolute URI (and &lt;a
href="https://tools.ietf.org/html/rfc7230#section-5.3.2"&gt;§5.3.2&lt;/a&gt; says
that the origin server must also understand this syntax). So, we must
define a way to map “*” to an absolute URI.&lt;/p&gt;
&lt;p&gt;Naively, one might be tempted to use “/*” as the path. But that would
make it impossible to have a resource actually named “/*”. So, we must
define a special case in the URI syntax that doesn’t obstruct a real
path.&lt;/p&gt;
&lt;p&gt;If we didn’t have this special case in the URI normalization rules,
and we handled the “/” path as the same as empty in the OPTIONS handler
of the last proxy server, then it would be impossible to request OPTIONS
for the “/” resource, as it would get translated into “*” and treated
as OPTIONS for the entire server.&lt;/p&gt;
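&lt;p&gt;The round trip can be sketched in Python as follows. The two function names are invented for this illustration; the logic follows §5.3.4, under which the last proxy turns an OPTIONS request whose absolute-form URI has an empty path and no query back into “*”. Everything else about proxying is ignored here.&lt;/p&gt;

```python
from urllib.parse import urlsplit


def to_absolute_form(method, scheme, authority, target):
    """Client side: build the absolute-form target to send to a proxy."""
    if method == "OPTIONS" and target == "*":
        # The server-wide "*" is represented by an empty path.
        return f"{scheme}://{authority}"
    return f"{scheme}://{authority}{target}"


def from_absolute_form(method, uri):
    """Last proxy: recover the target to send to the origin server."""
    parts = urlsplit(uri)
    if parts.path == "" and parts.query == "" and method == "OPTIONS":
        return "*"
    # Outside that special case, an empty path is equivalent to "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


if __name__ == "__main__":
    print(from_absolute_form("OPTIONS", "http://example.org:8001"))
    print(from_absolute_form("OPTIONS", "http://example.org:8001/"))
```

&lt;p&gt;Because the empty path and “/” stay distinct for OPTIONS, both the server-wide request and a request for the “/” resource survive the trip through the proxy.&lt;/p&gt;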
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2016 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./x11-systemd.html"/>
		<link rel="alternate" type="text/markdown" href="./x11-systemd.md"/>
		<id>https://lukeshu.com/blog/x11-systemd.html</id>
		<updated>2016-02-28T00:00:00+00:00</updated>
		<published>2016-02-28T00:00:00+00:00</published>
		<title>My X11 setup with systemd</title>
		<content type="html">&lt;h1 id="my-x11-setup-with-systemd"&gt;My X11 setup with systemd&lt;/h1&gt;
&lt;p&gt;Somewhere along the way, I decided that using systemd user sessions to
manage the various parts of my X11 environment would be a good idea.
Whether that was a good idea or not… we’ll see.&lt;/p&gt;
&lt;p&gt;I’ve sort-of been running this setup as my daily-driver for &lt;a
href="https://lukeshu.com/git/dotfiles.git/commit/?id=a9935b7a12a522937d91cb44a0e138132b555e16"&gt;a
bit over a year&lt;/a&gt;, continually tweaking it though.&lt;/p&gt;
&lt;p&gt;My setup is substantially different than the one on &lt;a
href="https://wiki.archlinux.org/index.php/Systemd/User"&gt;ArchWiki&lt;/a&gt;,
because the ArchWiki solution assumes that there is only ever one X
server for a user; I like the ability to run &lt;code&gt;Xorg&lt;/code&gt; on my
real monitor, and also have &lt;code&gt;Xvnc&lt;/code&gt; running headless, or start
my desktop environment on a remote X server. Though, I would like to
figure out how to use systemd socket activation for the X server, as the
ArchWiki solution does.&lt;/p&gt;
&lt;p&gt;This means that all of my graphical units take &lt;code&gt;DISPLAY&lt;/code&gt;
as an &lt;code&gt;@&lt;/code&gt; argument. To get this to all work out, this goes in
each &lt;code&gt;.service&lt;/code&gt; file, unless otherwise noted:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[Unit]
After=X11@%i.target
Requisite=X11@%i.target
[Service]
Environment=DISPLAY=%I&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We’ll get to &lt;code&gt;X11@.target&lt;/code&gt; later; what this says is “I
should only be running if X11 is running”.&lt;/p&gt;
&lt;p&gt;I eschew complex XDMs or &lt;code&gt;startx&lt;/code&gt; wrapper scripts, opting
for the simpler &lt;code&gt;xinit&lt;/code&gt;, which I either run on login for
some boxes (my media station), or type &lt;code&gt;xinit&lt;/code&gt; when I want
X11 on others (most everything else). Essentially, what
&lt;code&gt;xinit&lt;/code&gt; does is run &lt;code&gt;~/.xserverrc&lt;/code&gt; (or
&lt;code&gt;/etc/X11/xinit/xserverrc&lt;/code&gt;) to start the server, then once
the server is started (which it takes a substantial amount of magic to
detect) it runs &lt;code&gt;~/.xinitrc&lt;/code&gt; (or
&lt;code&gt;/etc/X11/xinit/xinitrc&lt;/code&gt;) to start the clients. Once
&lt;code&gt;.xinitrc&lt;/code&gt; finishes running, it stops the X server and exits.
Now, when I say “run”, I don’t mean execute, it passes each file to the
system shell (&lt;code&gt;/bin/sh&lt;/code&gt;) as input.&lt;/p&gt;
&lt;p&gt;Xorg requires a TTY to run on; if we log in to a TTY with
&lt;code&gt;logind&lt;/code&gt;, it will give us the &lt;code&gt;XDG_VTNR&lt;/code&gt; variable
to tell us which one we have, so I pass this to &lt;code&gt;X&lt;/code&gt; in &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/X11/serverrc"&gt;my
&lt;code&gt;.xserverrc&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/hint/sh
if [ -z &amp;quot;$XDG_VTNR&amp;quot; ]; then
  exec /usr/bin/X -nolisten tcp &amp;quot;$@&amp;quot;
else
  exec /usr/bin/X -nolisten tcp &amp;quot;$@&amp;quot; vt$XDG_VTNR
fi&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This was the default for &lt;a
href="https://projects.archlinux.org/svntogit/packages.git/commit/trunk/xserverrc?h=packages/xorg-xinit&amp;amp;id=f9f5de58df03aae6c8a8c8231a83327d19b943a1"&gt;a
while&lt;/a&gt; in Arch, to support &lt;code&gt;logind&lt;/code&gt;, but was &lt;a
href="https://projects.archlinux.org/svntogit/packages.git/commit/trunk/xserverrc?h=packages/xorg-xinit&amp;amp;id=5a163ddd5dae300e7da4b027e28c37ad3b535804"&gt;later
removed&lt;/a&gt; in part because &lt;code&gt;startx&lt;/code&gt; (which calls
&lt;code&gt;xinit&lt;/code&gt;) started adding it as an argument as well, so
&lt;code&gt;vt$XDG_VTNR&lt;/code&gt; was being listed as an argument twice, which is
an error. IMO, that was a problem in &lt;code&gt;startx&lt;/code&gt;, and they
shouldn’t have removed it from the default system
&lt;code&gt;xserverrc&lt;/code&gt;, but that’s just me. So I copy/pasted it into my
user &lt;code&gt;xserverrc&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That’s the boring part, though. Where the magic starts happening is
in &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/X11/clientrc"&gt;my
&lt;code&gt;.xinitrc&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/hint/sh

if [ -z &amp;quot;$XDG_RUNTIME_DIR&amp;quot; ]; then
    printf &amp;quot;XDG_RUNTIME_DIR isn&amp;#39;t set\n&amp;quot; &amp;gt;&amp;amp;2
    exit 6
fi

_DISPLAY=&amp;quot;$(systemd-escape -- &amp;quot;$DISPLAY&amp;quot;)&amp;quot;
trap &amp;quot;rm -f $(printf &amp;#39;%q&amp;#39; &amp;quot;${XDG_RUNTIME_DIR}/x11-wm@${_DISPLAY}&amp;quot;)&amp;quot; EXIT
mkfifo &amp;quot;${XDG_RUNTIME_DIR}/x11-wm@${_DISPLAY}&amp;quot;

cat &amp;lt; &amp;quot;${XDG_RUNTIME_DIR}/x11-wm@${_DISPLAY}&amp;quot; &amp;amp;
systemctl --user start &amp;quot;X11@${_DISPLAY}.target&amp;quot; &amp;amp;
wait
systemctl --user stop &amp;quot;X11@${_DISPLAY}.target&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are two contracts/interfaces here: the
&lt;code&gt;X11@DISPLAY.target&lt;/code&gt; systemd target, and the
&lt;code&gt;${XDG_RUNTIME_DIR}/x11-wm@DISPLAY&lt;/code&gt; named pipe. The systemd
&lt;code&gt;.target&lt;/code&gt; should be pretty self-explanatory; the most
important part is that it starts the window manager. The named pipe is
just a hacky way of blocking until the window manager exits
(“traditional” &lt;code&gt;.xinitrc&lt;/code&gt; files end with the line
&lt;code&gt;exec your-window-manager&lt;/code&gt;, so this mimics that behavior). It
works by assuming that the window manager will open the pipe at startup,
and keep it open (without necessarily writing anything to it); when the
window manager exits, the pipe will get closed, sending EOF to the
&lt;code&gt;wait&lt;/code&gt;ed-for &lt;code&gt;cat&lt;/code&gt;, allowing it to exit, letting
the script resume. The window manager (WMII) is made to have the pipe
opened by executing it this way in &lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/wmii@.service"&gt;its
&lt;code&gt;.service&lt;/code&gt; file&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ExecStart=/usr/bin/env bash -c &amp;#39;exec 8&amp;gt;${XDG_RUNTIME_DIR}/x11-wm@%I; exec wmii&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;which just opens the file on file descriptor 8, then launches the
window manager normally. The only further logic required by the window
manager with regard to the pipe is that in the window manager &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/wmii-hg/config.sh"&gt;configuration&lt;/a&gt;,
I should close that file descriptor after forking any process that isn’t
“part of” the window manager:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;runcmd() (
    ...
    exec 8&amp;gt;&amp;amp;- # xinit/systemd handshake
    ...
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, back to the &lt;code&gt;X11@DISPLAY.target&lt;/code&gt;; I configure what it
“does” with symlinks in the &lt;code&gt;.requires&lt;/code&gt; and
&lt;code&gt;.wants&lt;/code&gt; directories:&lt;/p&gt;
&lt;ul class="tree"&gt;
&lt;li&gt;
&lt;p&gt;&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user"&gt;.config/systemd/user/&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/X11@.target"&gt;X11@.target&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/X11@.target.requires"&gt;X11@.target.requires&lt;/a&gt;/
&lt;ul&gt;
&lt;li&gt;wmii@.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/wmii@.service"&gt;wmii@.service&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/X11@.target.wants"&gt;X11@.target.wants&lt;/a&gt;/
&lt;ul&gt;
&lt;li&gt;xmodmap@.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/xmodmap@.service"&gt;xmodmap@.service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;xresources-dpi@.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/xresources-dpi@.service"&gt;xresources-dpi@.service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;xresources@.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/xresources@.service"&gt;xresources@.service&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code&gt;.requires&lt;/code&gt; directory is how I configure which window
manager it starts. This would allow me to configure different window
managers on different displays, by creating a &lt;code&gt;.requires&lt;/code&gt;
directory with the &lt;code&gt;DISPLAY&lt;/code&gt; included,
e.g. &lt;code&gt;X11@:2.requires&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;.wants&lt;/code&gt; directory is for general X display setup;
it’s analogous to &lt;code&gt;/etc/X11/xinit/xinitrc.d/&lt;/code&gt;. All of the
files in it are simple &lt;code&gt;Type=oneshot&lt;/code&gt; service files. The &lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/xmodmap@.service"&gt;xmodmap&lt;/a&gt;
and &lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/xresources@.service"&gt;xresources&lt;/a&gt;
files are pretty boring, they’re just systemd versions of the couple
lines that just about every traditional &lt;code&gt;.xinitrc&lt;/code&gt; contains,
the biggest difference being that they look at &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/X11/modmap"&gt;&lt;code&gt;~/.config/X11/modmap&lt;/code&gt;&lt;/a&gt;
and &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/X11/resources"&gt;&lt;code&gt;~/.config/X11/resources&lt;/code&gt;&lt;/a&gt;
instead of the traditional locations &lt;code&gt;~/.xmodmap&lt;/code&gt; and
&lt;code&gt;~/.Xresources&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;What’s possibly of note is &lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/xresources-dpi@.service"&gt;&lt;code&gt;xresources-dpi@.service&lt;/code&gt;&lt;/a&gt;.
In X11, there are two sources of DPI information, the X display
resolution, and the XRDB &lt;code&gt;Xft.dpi&lt;/code&gt; setting. It isn’t defined
which takes precedence (to my knowledge), and even if it were (is),
application authors wouldn’t be arsed to actually do the right thing.
For years, Firefox (well, Iceweasel) happily listened to the X display
resolution, but recently it decided to only look at
&lt;code&gt;Xft.dpi&lt;/code&gt;, which objectively seems a little silly, since the
X display resolution is always present, but &lt;code&gt;Xft.dpi&lt;/code&gt; isn’t.
Anyway, Mozilla’s change drove me to create a &lt;a
href="https://lukeshu.com/git/dotfiles/tree/.local/bin/xrdb-set-dpi"&gt;script&lt;/a&gt;
to make the &lt;code&gt;Xft.dpi&lt;/code&gt; setting match the X display resolution.
Disclaimer: I have no idea if it works if the X server has multiple
displays (with possibly varying resolution).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/usr/bin/env bash
dpi=$(LC_ALL=C xdpyinfo|sed -rn &amp;#39;s/^\s*resolution:\s*(.*) dots per inch$/\1/p&amp;#39;)
xrdb -merge &amp;lt;&amp;lt;&amp;lt;&amp;quot;Xft.dpi: ${dpi}&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since we want XRDB to be set up before any other programs launch, we
give both of the &lt;code&gt;xresources&lt;/code&gt; units
&lt;code&gt;Before=X11@%i.target&lt;/code&gt; (instead of &lt;code&gt;After=&lt;/code&gt; like
everything else). Also, two programs writing to &lt;code&gt;xrdb&lt;/code&gt; at the
same time has the same problem as two programs writing to the same file;
one might trash the other’s changes. So, I stuck
&lt;code&gt;Conflicts=xresources@%i.service&lt;/code&gt; into
&lt;code&gt;xresources-dpi@.service&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;And that’s the “core” of my X11 systemd setup. But, you generally
want more things running than just the window manager, like a desktop
notification daemon, a system panel, and an X composition manager
(unless your window manager is bloated and has a composition manager
built in). Since these things are probably window-manager specific, I’ve
stuck them in a directory &lt;code&gt;wmii@.service.wants&lt;/code&gt;:&lt;/p&gt;
&lt;ul class="tree"&gt;
&lt;li&gt;
&lt;p&gt;&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user"&gt;.config/systemd/user/&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/wmii@.service.wants"&gt;wmii@.service.wants&lt;/a&gt;/
&lt;ul&gt;
&lt;li&gt;dunst@.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/dunst@.service"&gt;dunst@.service&lt;/a&gt;       
# a notification daemon&lt;/li&gt;
&lt;li&gt;lxpanel@.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/lxpanel@.service"&gt;lxpanel@.service&lt;/a&gt;   
# a system panel&lt;/li&gt;
&lt;li&gt;rbar@97_acpi.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/rbar@.service"&gt;rbar@.service&lt;/a&gt;  
# wmii stuff&lt;/li&gt;
&lt;li&gt;rbar@99_clock.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/rbar@.service"&gt;rbar@.service&lt;/a&gt; 
# wmii stuff&lt;/li&gt;
&lt;li&gt;xcompmgr@.service -&amp;gt; ../&lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/xcompmgr@.service"&gt;xcompmgr@.service&lt;/a&gt; 
# an X composition manager&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For the window manager &lt;code&gt;.service&lt;/code&gt;, I &lt;em&gt;could&lt;/em&gt; just
say &lt;code&gt;Type=simple&lt;/code&gt; and call it a day (and I did for a while).
But, I like to have &lt;code&gt;lxpanel&lt;/code&gt; show up on all of my WMII tags
(desktops), so I have &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/wmii-hg/config.sh"&gt;my
WMII configuration&lt;/a&gt; stick this in the WMII &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/wmii-hg/rules"&gt;&lt;code&gt;/rules&lt;/code&gt;&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/panel/ tags=/.*/ floating=always&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unfortunately, for this to work, &lt;code&gt;lxpanel&lt;/code&gt; must be started
&lt;em&gt;after&lt;/em&gt; that gets inserted into WMII’s rules. That wasn’t a
problem pre-systemd, because &lt;code&gt;lxpanel&lt;/code&gt; was started by my WMII
configuration, so ordering was simple. For systemd to get this right, I
must have a way of notifying systemd that WMII’s fully started, and it’s
safe to start &lt;code&gt;lxpanel&lt;/code&gt;. So, I stuck this in &lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/wmii@.service"&gt;my
WMII &lt;code&gt;.service&lt;/code&gt; file&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# This assumes that you write READY=1 to $NOTIFY_SOCKET in wmiirc
Type=notify
NotifyAccess=all&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and this in &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/wmii-hg/wmiirc"&gt;my
WMII configuration&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;systemd-notify --ready || true&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, this setup means that &lt;code&gt;NOTIFY_SOCKET&lt;/code&gt; is set for all
the children of &lt;code&gt;wmii&lt;/code&gt;; I’d rather not have it leak into the
applications that I start from the window manager, so I also stuck
&lt;code&gt;unset NOTIFY_SOCKET&lt;/code&gt; after forking a process that isn’t part
of the window manager:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;runcmd() (
    ...
    unset NOTIFY_SOCKET # systemd
    ...
    exec 8&amp;gt;&amp;amp;- # xinit/systemd handshake
    ...
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unfortunately, because of a couple of &lt;a
href="https://github.com/systemd/systemd/issues/2739"&gt;bugs&lt;/a&gt; and &lt;a
href="https://github.com/systemd/systemd/issues/2737"&gt;race
conditions&lt;/a&gt; in systemd, &lt;code&gt;systemd-notify&lt;/code&gt; isn’t reliable.
If systemd can’t receive the &lt;code&gt;READY=1&lt;/code&gt; signal from my WMII
configuration, there are two consequences:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;code&gt;lxpanel&lt;/code&gt; will never start, because it will always be
waiting for &lt;code&gt;wmii&lt;/code&gt; to be ready, which will never happen.&lt;/li&gt;
&lt;li&gt;After a couple of minutes, systemd will consider &lt;code&gt;wmii&lt;/code&gt;
to be timed out, which is a failure, so then it will kill
&lt;code&gt;wmii&lt;/code&gt;, and exit my X11 session. That’s no good!&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Using &lt;code&gt;socat&lt;/code&gt; to send the message to systemd instead of
&lt;code&gt;systemd-notify&lt;/code&gt; “should” always work, because it tries to
read from both ends of the bi-directional stream, and I can’t imagine
that getting EOF from the &lt;code&gt;UNIX-SENDTO&lt;/code&gt; end will ever be
faster than the systemd manager handling the datagram that got
sent. Which is to say, “we work around the race condition by being slow
and shitty.”&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;socat STDIO UNIX-SENDTO:&amp;quot;$NOTIFY_SOCKET&amp;quot; &amp;lt;&amp;lt;&amp;lt;READY=1 || true&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But, I don’t like that. I’d rather write my WMII configuration to the
world as I wish it existed, and have workarounds encapsulated elsewhere;
&lt;a
href="http://blog.robertelder.org/interfaces-most-important-software-engineering-concept/"&gt;“If
you have to cut corners in your project, do it inside the
implementation, and wrap a very good interface around it.”&lt;/a&gt;. So, I
wrote a &lt;code&gt;systemd-notify&lt;/code&gt; compatible &lt;a
href="https://lukeshu.com/git/dotfiles.git/tree/.config/wmii-hg/workarounds.sh"&gt;function&lt;/a&gt;
that ultimately calls &lt;code&gt;socat&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;##
# Just like systemd-notify(1), but slower, which is a shitty
# workaround for a race condition in systemd.
##
systemd-notify() {
    local args
    args=&amp;quot;$(getopt -n systemd-notify -o h -l help,version,ready,pid::,status:,booted -- &amp;quot;$@&amp;quot;)&amp;quot;
    ret=$?; [[ $ret == 0 ]] || return $ret
    eval set -- &amp;quot;$args&amp;quot;

    local arg_ready=false
    local arg_pid=0
    local arg_status=
    while [[ $# -gt 0 ]]; do
        case &amp;quot;$1&amp;quot; in
            -h|--help) command systemd-notify --help; return $?;;
            --version) command systemd-notify --version; return $?;;
            --ready) arg_ready=true; shift 1;;
            --pid) arg_pid=${2:-$$}; shift 2;;
            --status) arg_status=$2; shift 2;;
            --booted) command systemd-notify --booted; return $?;;
            --) shift 1; break;;
        esac
    done

    local our_env=()
    if $arg_ready; then
        our_env+=(&amp;quot;READY=1&amp;quot;)
    fi
    if [[ -n &amp;quot;$arg_status&amp;quot; ]]; then
        our_env+=(&amp;quot;STATUS=$arg_status&amp;quot;)
    fi
    if [[ &amp;quot;$arg_pid&amp;quot; -gt 0 ]]; then
        our_env+=(&amp;quot;MAINPID=$arg_pid&amp;quot;)
    fi
    our_env+=(&amp;quot;$@&amp;quot;)
    local n
    printf -v n &amp;#39;%s\n&amp;#39; &amp;quot;${our_env[@]}&amp;quot;
    socat STDIO UNIX-SENDTO:&amp;quot;$NOTIFY_SOCKET&amp;quot; &amp;lt;&amp;lt;&amp;lt;&amp;quot;$n&amp;quot;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, one day when the systemd bugs have been fixed (and presumably the
Linux kernel supports passing the cgroup of a process as part of its
credentials), I can remove that from &lt;code&gt;workarounds.sh&lt;/code&gt;, and
not have to touch anything else in my WMII configuration (I do use
&lt;code&gt;systemd-notify&lt;/code&gt; in a couple of other, non-essential, places
too; this wasn’t to avoid having to change just 1 line).&lt;/p&gt;
&lt;p&gt;So, now that &lt;code&gt;wmii@.service&lt;/code&gt; properly has
&lt;code&gt;Type=notify&lt;/code&gt;, I can just stick
&lt;code&gt;After=wmii@.service&lt;/code&gt; into my &lt;code&gt;lxpanel@.service&lt;/code&gt;,
right? Wrong! Well, I &lt;em&gt;could&lt;/em&gt;, but my &lt;code&gt;lxpanel&lt;/code&gt;
service has nothing to do with WMII; why should I couple them? Instead,
I create &lt;a
href="https://lukeshu.com/git/dotfiles/tree/.config/systemd/user/wm-running@.target"&gt;&lt;code&gt;wm-running@.target&lt;/code&gt;&lt;/a&gt;
that can be used as a synchronization point:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# wmii@.service
Before=wm-running@%i.target

# lxpanel@.service
After=X11@%i.target wm-running@%i.target
Requires=wm-running@%i.target&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Finally, I have my desktop started and running. Now, I’d like for
programs that aren’t part of the window manager to not dump their stdout
and stderr into WMII’s part of the journal, like to have a record of
which graphical programs crashed, and like to have a prettier
cgroup/process graph. So, I use &lt;code&gt;systemd-run&lt;/code&gt; to run external
programs from the window manager:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;runcmd() (
    ...
    unset NOTIFY_SOCKET # systemd
    ...
    exec 8&amp;gt;&amp;amp;- # xinit/systemd handshake
    exec systemd-run --user --scope -- sh -c &amp;quot;$*&amp;quot;
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I run them as a scope instead of a service so that they inherit
environment variables, and don’t have to mess with getting
&lt;code&gt;DISPLAY&lt;/code&gt; or &lt;code&gt;XAUTHORITY&lt;/code&gt; into their units (as I
&lt;em&gt;don’t&lt;/em&gt; want to make them global variables in my systemd user
session).&lt;/p&gt;
&lt;p&gt;I’d like to get &lt;code&gt;lxpanel&lt;/code&gt; to also use
&lt;code&gt;systemd-run&lt;/code&gt; when launching programs, but it’s a low
priority because I don’t really actually use &lt;code&gt;lxpanel&lt;/code&gt; to
launch programs, I just have the menu there to make sure that I didn’t
break the icons for programs that I package (I did that once back when I
was Parabola’s packager for Iceweasel and IceCat).&lt;/p&gt;
&lt;p&gt;And that’s how I use systemd with X11.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2016 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./java-segfault-redux.html"/>
		<link rel="alternate" type="text/markdown" href="./java-segfault-redux.md"/>
		<id>https://lukeshu.com/blog/java-segfault-redux.html</id>
		<updated>2016-02-28T00:00:00+00:00</updated>
		<published>2016-02-28T00:00:00+00:00</published>
		<title>My favorite bug: segfaults in Java (redux)</title>
		<content type="html">&lt;h1 id="my-favorite-bug-segfaults-in-java-redux"&gt;My favorite bug:
segfaults in Java (redux)&lt;/h1&gt;
&lt;p&gt;Two years ago, I &lt;a href="./java-segfault.html"&gt;wrote&lt;/a&gt; about one
of my favorite bugs that I’d squashed two years before that. About a
year after that, someone posted it &lt;a
href="https://news.ycombinator.com/item?id=9283571"&gt;on Hacker
News&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There was some fun discussion about it, but also some confusion.
After finishing a season of mentoring team 4272, I’ve decided that it
would be fun to re-visit the article, and dig up the old actual code,
instead of pseudo-code, hopefully improving the clarity (and providing a
light introduction for anyone wanting to get into modifying the current
SmartDashboard).&lt;/p&gt;
&lt;h2 id="the-context"&gt;The context&lt;/h2&gt;
&lt;p&gt;In 2012, I was a high school senior, and lead programmer
on the FIRST Robotics Competition team 1024. For the unfamiliar, the
relevant part of the setup is that there are 2 minute and 15 second
matches in which you have a 120 pound robot that sometimes runs
autonomously, and sometimes is controlled over WiFi from a person at a
laptop running stock “driver station” software and modifiable
“dashboard” software.&lt;/p&gt;
&lt;p&gt;That year, we mostly used the dashboard software to allow the human
driver and operator to monitor sensors on the robot, one of them being a
video feed from a web-cam mounted on it. This was really easy because
the new standard dashboard program had a click-and-drag interface to add
stock widgets; you just had to make sure the code on the robot was
actually sending the data.&lt;/p&gt;
&lt;p&gt;That’s great, until, while debugging things, the dashboard would
suddenly vanish. If it was run manually from a terminal (instead of
letting the driver station software launch it), you would see a core
dump indicating a segmentation fault.&lt;/p&gt;
&lt;p&gt;This wasn’t just us either; I spoke with people on other teams, and
everyone who was streaming video had this issue. But, because it only
happened every couple of minutes, and a match is only 2:15, the
dashboard didn’t need to run very long; they just crossed their fingers
and hoped it didn’t happen during a match.&lt;/p&gt;
&lt;p&gt;The dashboard was written in Java, and the source was available
(under a 3-clause BSD license) via read-only SVN at
&lt;code&gt;http://firstforge.wpi.edu/svn/repos/smart_dashboard/trunk&lt;/code&gt;
(which is unfortunately no longer online; fortunately, I’d posted some
snapshots on the web). So I dove in, hunting for the bug.&lt;/p&gt;
&lt;p&gt;The repository was divided into several NetBeans projects (not
exhaustively listed):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a
href="https://gitorious.org/absfrc/sources.git/?p=absfrc:sources.git;a=blob_plain;f=smartdashboard-client-2012-1-any.src.tar.xz;hb=HEAD"&gt;&lt;code&gt;client/smartdashboard&lt;/code&gt;&lt;/a&gt;:
The main dashboard program, has a plugin architecture.&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://gitorious.org/absfrc/sources.git/?p=absfrc:sources.git;a=blob_plain;f=wpijavacv-208-1-any.src.tar.xz;hb=HEAD"&gt;&lt;code&gt;WPIJavaCV&lt;/code&gt;&lt;/a&gt;:
A higher-level wrapper around JavaCV, itself a Java Native Interface
(JNI) wrapper to talk to OpenCV (C and C++).&lt;/li&gt;
&lt;li&gt;&lt;a
href="https://gitorious.org/absfrc/sources.git/?p=absfrc:sources.git;a=blob_plain;f=smartdashboard-extension-wpicameraextension-210-1-any.src.tar.xz;hb=HEAD"&gt;&lt;code&gt;extensions/camera/WPICameraExtension&lt;/code&gt;&lt;/a&gt;:
The standard camera feed plugin, processes the video through
WPIJavaCV.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I figured that the bug must be somewhere in the C or C++ code that
was being called by JavaCV, because those are the languages where segfaults
happen. It was especially a pain to track down the pointers that were
causing the issue, because it was hard with native debuggers to see
through all of the JVM stuff to the OpenCV code, and the OpenCV stuff is
opaque to Java debuggers.&lt;/p&gt;
&lt;p&gt;Eventually the issue led me back into the WPICameraExtension, then
into WPIJavaCV—there was a native pointer being stored in a Java
variable; Java code called the native routine to &lt;code&gt;free()&lt;/code&gt; the
structure, but then tried to feed it to another routine later. This led
to difficulty again—tracking objects with Java debuggers was hard
because they don’t expect the program to suddenly segfault; it’s Java
code, Java doesn’t segfault, it throws exceptions!&lt;/p&gt;
&lt;p&gt;With the help of &lt;code&gt;println()&lt;/code&gt; I was eventually able to see
that some code was executing in an order that straight didn’t make
sense.&lt;/p&gt;
&lt;h2 id="the-bug"&gt;The bug&lt;/h2&gt;
&lt;p&gt;The basic flow of WPIJavaCV is you have a &lt;code&gt;WPICamera&lt;/code&gt;, and
you call &lt;code&gt;.getNewImage()&lt;/code&gt; on it, which gives you a
&lt;code&gt;WPIImage&lt;/code&gt;, which you could do all kinds of fancy OpenCV
things on, but then ultimately call &lt;code&gt;.getBufferedImage()&lt;/code&gt;,
which gives you a &lt;code&gt;java.awt.image.BufferedImage&lt;/code&gt; that you can
pass to Swing to draw on the screen. You do this for every frame. Which
is exactly what &lt;code&gt;WPICameraExtension.java&lt;/code&gt; did, except that
“all kinds of fancy OpenCV things” consisted only of:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;public WPIImage processImage(WPIColorImage rawImage) {
    return rawImage;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The idea was that you would extend the class, overriding that one
method, if you wanted to do anything fancy.&lt;/p&gt;
&lt;p&gt;One of the neat things about WPIJavaCV was that every OpenCV object
class had a &lt;code&gt;finalize()&lt;/code&gt; method (via inheriting from
the abstract class &lt;code&gt;WPIDisposable&lt;/code&gt;) that freed the underlying
C/C++ memory, so you didn’t have to worry about memory leaks like in
plain JavaCV. To inherit from &lt;code&gt;WPIDisposable&lt;/code&gt;, you had to
write a &lt;code&gt;disposed()&lt;/code&gt; method that actually freed the memory.
This was better than writing &lt;code&gt;finalize()&lt;/code&gt; directly, because
it added some safety around NULL pointers and made freeing idempotent, in
case you wanted to manually free something early.&lt;/p&gt;
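&lt;p&gt;A rough sketch of what a base class like that might look like. This
is &lt;em&gt;not&lt;/em&gt; the real &lt;code&gt;WPIDisposable&lt;/code&gt; source; the names
&lt;code&gt;DisposableSketch&lt;/code&gt;, &lt;code&gt;CountingResource&lt;/code&gt;, and
&lt;code&gt;dispose()&lt;/code&gt; are made up for illustration, and only
&lt;code&gt;disposed()&lt;/code&gt;, &lt;code&gt;validateDisposed()&lt;/code&gt;, and
&lt;code&gt;finalize()&lt;/code&gt; come from the description above:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Hypothetical sketch; not the real WPIDisposable source.
abstract class DisposableSketch {
    private boolean isDisposed = false;

    // Subclasses free their native memory here.
    protected abstract void disposed();

    // Safe to call more than once; only the first call frees anything.
    public final void dispose() {
        if (!isDisposed) {
            isDisposed = true;
            disposed();
        }
    }

    // Guard for methods that are about to touch the native memory.
    protected final void validateDisposed() {
        if (isDisposed) {
            throw new IllegalStateException();
        }
    }

    // finalize() is deprecated in modern Java; it is shown here only
    // because it is the mechanism described above.
    @Override
    protected void finalize() {
        dispose();
    }
}

// Hypothetical subclass that only counts how often it gets freed.
class CountingResource extends DisposableSketch {
    int freeCount = 0;

    @Override
    protected void disposed() {
        freeCount++;
    }

    int read() {
        validateDisposed();
        return 42;
    }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Calling &lt;code&gt;dispose()&lt;/code&gt; twice on a
&lt;code&gt;CountingResource&lt;/code&gt; frees it once, and &lt;code&gt;read()&lt;/code&gt;
throws afterwards instead of touching freed memory.&lt;/p&gt;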
&lt;p&gt;Now, &lt;code&gt;edu.wpi.first.WPIImage.disposed()&lt;/code&gt; called &lt;code&gt;&lt;a
href="https://github.com/bytedeco/javacv/blob/svn/src/com/googlecode/javacv/cpp/opencv_core.java#L398"&gt;com.googlecode.javacv.cpp.opencv_core.IplImage&lt;/a&gt;.release()&lt;/code&gt;,
which called (via JNI) &lt;code&gt;IplImage::release()&lt;/code&gt;, which called
libc &lt;code&gt;free()&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;@Override
protected void disposed() {
    image.release();
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Elsewhere, the C buffer for the image was copied into a Java buffer
via a similar chain kicked off by
&lt;code&gt;edu.wpi.first.WPIImage.getBufferedImage()&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;/**
 * Copies this {@link WPIImage} into a {@link BufferedImage}.
 * This method will always generate a new image.
 * @return a copy of the image
 */
public BufferedImage getBufferedImage() {
    validateDisposed();

    return image.getBufferedImage();
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;println()&lt;/code&gt; output I saw that didn’t make sense was
that &lt;code&gt;someFrame.finalize()&lt;/code&gt; was running before
&lt;code&gt;someFrame.getBufferedImage()&lt;/code&gt; had returned!&lt;/p&gt;
&lt;p&gt;You see, if the JVM is waiting for the return value of a method
&lt;code&gt;m()&lt;/code&gt; of object &lt;code&gt;a&lt;/code&gt;, and the code in &lt;code&gt;m()&lt;/code&gt;
that is yet to be executed doesn’t access any other methods or
properties of &lt;code&gt;a&lt;/code&gt;, then the JVM will go ahead and consider
&lt;code&gt;a&lt;/code&gt; eligible for garbage collection before &lt;code&gt;m()&lt;/code&gt;
has finished running.&lt;/p&gt;
&lt;p&gt;Put another way, &lt;code&gt;this&lt;/code&gt; is passed to a method just like
any other argument. If a method is done accessing &lt;code&gt;this&lt;/code&gt;,
then it’s “safe” for the JVM to go ahead and garbage collect it.&lt;/p&gt;
&lt;p&gt;That is normally a safe “optimization” to make… except for when a
destructor method (&lt;code&gt;finalize()&lt;/code&gt;) is defined for the object;
the destructor can have side effects, and Java has no way to know
whether it is safe for them to happen before &lt;code&gt;m()&lt;/code&gt; has
finished running.&lt;/p&gt;
&lt;p&gt;I’m not entirely sure if this is a “bug” in the compiler or the
language specification, but I do believe that it’s broken behavior.&lt;/p&gt;
&lt;p&gt;Anyway, in this case it’s unsafe with WPI’s code.&lt;/p&gt;
&lt;h2 id="my-work-around"&gt;My work-around&lt;/h2&gt;
&lt;p&gt;My work-around was to change this function in
&lt;code&gt;WPIImage&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;public BufferedImage getBufferedImage() {
    validateDisposed();

    return image.getBufferedImage(); // `this` may get garbage collected before it returns!
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the above code, &lt;code&gt;this&lt;/code&gt; is a &lt;code&gt;WPIImage&lt;/code&gt;, and
it may get garbage collected between the time that
&lt;code&gt;image.getBufferedImage()&lt;/code&gt; is dispatched, and the time that
&lt;code&gt;image.getBufferedImage()&lt;/code&gt; accesses native memory. When it is
garbage collected, it calls &lt;code&gt;image.release()&lt;/code&gt;, which
&lt;code&gt;free()&lt;/code&gt;s that native memory. That seems pretty unlikely to
happen; that’s a very small gap of time. However, running 30 times a
second, eventually bad luck with the garbage collector happens, and the
program crashes.&lt;/p&gt;
&lt;p&gt;The work-around was to insert a bogus method call that keeps
&lt;code&gt;this&lt;/code&gt; around until after we were also done with
&lt;code&gt;image&lt;/code&gt;, changing the function to this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;public BufferedImage getBufferedImage() {
    validateDisposed();
    BufferedImage ret = image.getBufferedImage();
    getWidth(); // bogus call to keep `this` around
    return ret;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Yeah. After spending weeks wading through thousands of lines
of Java, C, and C++, a bogus call to a method I didn’t care about was
the fix.&lt;/p&gt;
&lt;p&gt;TheLoneWolfling on Hacker News noted that they’d be worried about the
JVM optimizing out the call to &lt;code&gt;getWidth()&lt;/code&gt;. I’m not, because
&lt;code&gt;WPIImage.getWidth()&lt;/code&gt; calls &lt;code&gt;IplImage.width()&lt;/code&gt;,
which is declared as &lt;code&gt;native&lt;/code&gt;; the JVM must run it because it
might have side effects. On the other hand, looking back, I think I just
shrank the window for things to go wrong: it may be possible for the
garbage collection to trigger in the time between
&lt;code&gt;getWidth()&lt;/code&gt; being dispatched and &lt;code&gt;width()&lt;/code&gt;
running. Perhaps there was something in the C/C++ code that made it
safe, I don’t recall, and don’t care quite enough to dig into OpenCV
internals again. Or perhaps I’m mis-remembering the fix (which I don’t
actually have a file of), and I called some other method that
&lt;em&gt;could&lt;/em&gt; get optimized out (though I &lt;em&gt;do&lt;/em&gt; believe that it
was either &lt;code&gt;getWidth()&lt;/code&gt; or &lt;code&gt;getHeight()&lt;/code&gt;).&lt;/p&gt;
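&lt;p&gt;For what it’s worth, Java 9 and later have a standard-library way
to say “keep this object alive until here”:
&lt;code&gt;java.lang.ref.Reference.reachabilityFence()&lt;/code&gt;. A minimal
sketch of how it could stand in for the bogus call; the class names
&lt;code&gt;FakeNativeImage&lt;/code&gt; and &lt;code&gt;FencedImage&lt;/code&gt; are made up,
standing in for &lt;code&gt;IplImage&lt;/code&gt; and &lt;code&gt;WPIImage&lt;/code&gt;, and
none of this is the actual WPIJavaCV code:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import java.lang.ref.Reference;

// Hypothetical stand-in for a JNI-backed image such as IplImage.
class FakeNativeImage {
    private boolean released = false;

    // Stands in for the native free().
    void release() {
        released = true;
    }

    // Stands in for a native read; fails loudly instead of segfaulting.
    int[] copyPixels() {
        if (released) {
            throw new IllegalStateException();
        }
        return new int[] { 1, 2, 3 };
    }
}

// Hypothetical stand-in for WPIImage.
class FencedImage {
    private final FakeNativeImage image = new FakeNativeImage();

    int[] getPixels() {
        try {
            return image.copyPixels();
        } finally {
            // Keeps `this` strongly reachable up to this point, so the
            // finalizer cannot release `image` while copyPixels() runs.
            Reference.reachabilityFence(this);
        }
    }

    // finalize() is deprecated in modern Java; kept to mirror the setup
    // described in this article.
    @Override
    protected void finalize() {
        image.release();
    }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Unlike a bogus &lt;code&gt;getWidth()&lt;/code&gt;, the fence is documented as
existing for exactly this purpose, so there is no need to guess at what
the optimizer will or won’t remove.&lt;/p&gt;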
&lt;h2 id="wpis-fix"&gt;WPI’s fix&lt;/h2&gt;
&lt;p&gt;Four years later, the SmartDashboard is still being used! But it no
longer has this bug, and it’s not using my workaround. So, how did the
WPILib developers fix it?&lt;/p&gt;
&lt;p&gt;Well, the code now lives &lt;a
href="https://usfirst.collab.net/gerrit/#/admin/projects/"&gt;in git at
collab.net&lt;/a&gt;, so I decided to take a look.&lt;/p&gt;
&lt;p&gt;They stripped out WPIJavaCV from the main video feed widget, and now
use a purely Java implementation of MPJPEG streaming.&lt;/p&gt;
&lt;p&gt;However, the old video feed widget is still available as an extension
(so that you can still do cool things with &lt;code&gt;processImage&lt;/code&gt;),
and it also no longer has this bug. Their fix was to put a mutex around
all accesses to &lt;code&gt;image&lt;/code&gt;, which should have been the obvious
solution to me.&lt;/p&gt;
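&lt;p&gt;A minimal sketch of that idea, assuming Java’s built-in
&lt;code&gt;synchronized&lt;/code&gt; as the mutex. This is not WPI’s actual patch;
&lt;code&gt;NativeBuffer&lt;/code&gt; and &lt;code&gt;LockedImage&lt;/code&gt; are hypothetical
names, and the width value is arbitrary:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Hypothetical stand-in for a JNI-backed image such as IplImage.
class NativeBuffer {
    void release() {
        // would call the native free() here
    }

    int width() {
        return 640; // arbitrary value for the sketch
    }
}

// Hypothetical stand-in for WPIImage, with every access to `image`
// guarded by the same lock.
class LockedImage {
    private NativeBuffer image = new NativeBuffer();

    synchronized int getWidth() {
        if (image == null) {
            throw new IllegalStateException();
        }
        return image.width();
    }

    // Idempotent: only the first call releases the buffer.
    synchronized void dispose() {
        if (image != null) {
            image.release();
            image = null;
        }
    }

    // finalize() is deprecated in modern Java; kept to mirror the setup
    // described in this article.
    @Override
    protected void finalize() {
        dispose();
    }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because &lt;code&gt;dispose()&lt;/code&gt; has to take the same lock as
&lt;code&gt;getWidth()&lt;/code&gt;, the finalizer can’t release the buffer while
another thread is still in the middle of using it.&lt;/p&gt;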
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2016 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./nginx-mediawiki.html"/>
		<link rel="alternate" type="text/markdown" href="./nginx-mediawiki.md"/>
		<id>https://lukeshu.com/blog/nginx-mediawiki.html</id>
		<updated>2015-05-19T00:00:00+00:00</updated>
		<published>2015-05-19T00:00:00+00:00</published>
		<title>An Nginx configuration for MediaWiki</title>
		<content type="html">&lt;h1 id="an-nginx-configuration-for-mediawiki"&gt;An Nginx configuration for
MediaWiki&lt;/h1&gt;
&lt;p&gt;There are &lt;a href="http://wiki.nginx.org/MediaWiki"&gt;several&lt;/a&gt; &lt;a
href="https://wiki.archlinux.org/index.php/MediaWiki#Nginx"&gt;example&lt;/a&gt;
&lt;a
href="https://www.mediawiki.org/wiki/Manual:Short_URL/wiki/Page_title_--_nginx_rewrite--root_access"&gt;Nginx&lt;/a&gt;
&lt;a
href="https://www.mediawiki.org/wiki/Manual:Short_URL/Page_title_-_nginx,_Root_Access,_PHP_as_a_CGI_module"&gt;configurations&lt;/a&gt;
&lt;a href="http://wiki.nginx.org/RHEL_5.4_%2B_Nginx_%2B_Mediawiki"&gt;for&lt;/a&gt;
&lt;a
href="http://stackoverflow.com/questions/11080666/mediawiki-on-nginx"&gt;MediaWiki&lt;/a&gt;
floating around the web. Many of them don’t block the user from
accessing things like &lt;code&gt;/serialized/&lt;/code&gt;. Many of them also &lt;a
href="https://labs.parabola.nu/issues/725"&gt;don’t correctly handle&lt;/a&gt; a
wiki page named &lt;code&gt;FAQ&lt;/code&gt;, since that is the name of a file in the
MediaWiki root! In fact, the configuration used on the official Nginx
Wiki has both of those issues!&lt;/p&gt;
&lt;p&gt;This is because most of the configurations floating around basically
try to pass all requests through, and blacklist certain requests, either
denying them, or passing them through to &lt;code&gt;index.php&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;It’s my view that blacklisting is inferior to whitelisting in
situations like this. So, I developed the following configuration that
instead works by whitelisting certain paths.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;root /path/to/your/mediawiki; # obviously, change this line

index index.php;
location /                     { try_files /var/empty @rewrite; }
location /images/              { try_files $uri $uri/ @rewrite; }
location /skins/               { try_files $uri $uri/ @rewrite; }
location /api.php              { try_files /var/empty @php; }
location /api.php5             { try_files /var/empty @php; }
location /img_auth.php         { try_files /var/empty @php; }
location /img_auth.php5        { try_files /var/empty @php; }
location /index.php            { try_files /var/empty @php; }
location /index.php5           { try_files /var/empty @php; }
location /load.php             { try_files /var/empty @php; }
location /load.php5            { try_files /var/empty @php; }
location /opensearch_desc.php  { try_files /var/empty @php; }
location /opensearch_desc.php5 { try_files /var/empty @php; }
location /profileinfo.php      { try_files /var/empty @php; }
location /thumb.php            { try_files /var/empty @php; }
location /thumb.php5           { try_files /var/empty @php; }
location /thumb_handler.php    { try_files /var/empty @php; }
location /thumb_handler.php5   { try_files /var/empty @php; }
location /wiki.phtml           { try_files /var/empty @php; }

location @rewrite {
    rewrite ^/(.*)$ /index.php?title=$1&amp;amp;$args;
}

location @php {
    # obviously, change this according to your PHP setup
    include fastcgi.conf;
    fastcgi_pass unix:/run/php-fpm/wiki.sock;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We are now using this configuration on &lt;a
href="https://wiki.parabola.nu/"&gt;ParabolaWiki&lt;/a&gt;, but with an alias for
&lt;code&gt;location = /favicon.ico&lt;/code&gt; to the correct file in the skin,
and with FastCGI caching for PHP.&lt;/p&gt;
&lt;p&gt;The only thing I don’t like about this is the
&lt;code&gt;try_files /var/empty&lt;/code&gt; bits; surely there is a better way to
have it go to one of the &lt;code&gt;@&lt;/code&gt; location blocks, but I couldn’t
figure it out.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2015 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./lp2015-videos.html"/>
		<link rel="alternate" type="text/markdown" href="./lp2015-videos.md"/>
		<id>https://lukeshu.com/blog/lp2015-videos.html</id>
		<updated>2015-03-22T00:00:00+00:00</updated>
		<published>2015-03-22T00:00:00+00:00</published>
		<title>I took some videos at LibrePlanet</title>
		<content type="html">&lt;h1 id="i-took-some-videos-at-libreplanet"&gt;I took some videos at
LibrePlanet&lt;/h1&gt;
&lt;p&gt;I’m at &lt;a href="https://libreplanet.org/2015/"&gt;LibrePlanet&lt;/a&gt;, and
have been loving the talks. For most of yesterday, there was a series of
short “lightning” talks in room 144. I decided to hang out in that room
for the later part of the day, because while most of the talks were live
streamed and recorded, there were no cameras in room 144; so I couldn’t
watch them later.&lt;/p&gt;
&lt;p&gt;Way too late in the day, I remembered that I have the capability to
record videos, so I caught the last two talks in 144.&lt;/p&gt;
&lt;p&gt;I apologize for the changing orientation.&lt;/p&gt;
&lt;p&gt;&lt;a
href="https://lukeshu.com/dump/lp-2015-last-2-short-talks.ogg"&gt;Here’s
the video I took&lt;/a&gt;.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2015 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./build-bash-1.html"/>
		<link rel="alternate" type="text/markdown" href="./build-bash-1.md"/>
		<id>https://lukeshu.com/blog/build-bash-1.html</id>
		<updated>2015-03-18T00:00:00+00:00</updated>
		<published>2015-03-18T00:00:00+00:00</published>
		<title>Building Bash 1.14.7 on a modern system</title>
		<content type="html">&lt;h1 id="building-bash-1.14.7-on-a-modern-system"&gt;Building Bash 1.14.7 on
a modern system&lt;/h1&gt;
&lt;p&gt;In a previous revision of my &lt;a href="./bash-arrays.html"&gt;Bash arrays
post&lt;/a&gt;, I wrote:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Bash 1.x won’t compile with modern GCC, so I couldn’t verify how it
behaves.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I recall spending a little time fighting with it, but apparently I
didn’t try very hard: getting Bash 1.14.7 to build on a modern box is
mostly just adjusting it to use &lt;code&gt;stdarg&lt;/code&gt; instead of the
no-longer-implemented &lt;code&gt;varargs&lt;/code&gt;. There’s also a little
fiddling with the pre-autoconf automatic configuration.&lt;/p&gt;
&lt;h2 id="stdarg"&gt;stdarg&lt;/h2&gt;
&lt;p&gt;Converting to &lt;code&gt;stdarg&lt;/code&gt; is pretty simple: For each variadic
function (functions that take a variable number of arguments), follow
these steps:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;Replace &lt;code&gt;#include &amp;lt;varargs.h&amp;gt;&lt;/code&gt; with
&lt;code&gt;#include &amp;lt;stdarg.h&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Replace &lt;code&gt;function_name (va_alist) va_dcl&lt;/code&gt; with
&lt;code&gt;function_name (char *format, ...)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Remove the declaration and assignment for &lt;code&gt;format&lt;/code&gt; from
the function body.&lt;/li&gt;
&lt;li&gt;Replace &lt;code&gt;va_start (args);&lt;/code&gt; with
&lt;code&gt;va_start (args, format);&lt;/code&gt; in the function bodies.&lt;/li&gt;
&lt;li&gt;Replace &lt;code&gt;function_name ();&lt;/code&gt; with
&lt;code&gt;function_name (char *, ...)&lt;/code&gt; in header files and/or at the
top of C files.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;There’s one function that uses the variable name &lt;code&gt;control&lt;/code&gt;
instead of &lt;code&gt;format&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I’ve prepared &lt;a href="./bash-1.14.7-gcc4-stdarg.patch"&gt;a patch&lt;/a&gt;
that does this.&lt;/p&gt;
&lt;h2 id="configuration"&gt;Configuration&lt;/h2&gt;
&lt;p&gt;Instead of using autoconf-style tests to test for compiler and
platform features, Bash 1 used the file &lt;code&gt;machines.h&lt;/code&gt; that had
&lt;code&gt;#ifdefs&lt;/code&gt; and a huge database of different operating
systems for different platforms. It’s gross. And quite likely won’t
handle your modern operating system.&lt;/p&gt;
&lt;p&gt;I made these two small changes to &lt;code&gt;machines.h&lt;/code&gt; to get it
to work correctly on my box:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;Replace &lt;code&gt;#if defined (i386)&lt;/code&gt; with
&lt;code&gt;#if defined (i386) || defined (__x86_64__)&lt;/code&gt;. The purpose of
this is obvious.&lt;/li&gt;
&lt;li&gt;Add &lt;code&gt;#define USE_TERMCAP_EMULATION&lt;/code&gt; to the section for
Linux [sic] on i386
(&lt;code&gt;#  if !defined (done386) &amp;amp;&amp;amp; (defined (__linux__) || defined (linux))&lt;/code&gt;).
What this does is tell it to link against libcurses to use curses
termcap emulation, instead of linking against libtermcap (which doesn’t
exist on modern GNU/Linux systems).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Again, I’ve prepared &lt;a href="./bash-1.14.7-machines-config.patch"&gt;a
patch&lt;/a&gt; that does this.&lt;/p&gt;
&lt;h2 id="building"&gt;Building&lt;/h2&gt;
&lt;p&gt;With those adjustments, it should build, but with quite a few
warnings. Making a couple of changes to &lt;code&gt;CFLAGS&lt;/code&gt; should fix
that:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;make CFLAGS=&amp;#39;-O -g -Werror -Wno-int-to-pointer-cast -Wno-pointer-to-int-cast -Wno-deprecated-declarations -include stdio.h -include stdlib.h -include string.h -Dexp2=bash_exp2&amp;#39;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That’s a doozy! Let’s break it down:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;-O -g&lt;/code&gt; The default value for CFLAGS (defined in
&lt;code&gt;cpp-Makefile&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-Werror&lt;/code&gt; Treat warnings as errors; force us to deal with
any issues.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-Wno-int-to-pointer-cast -Wno-pointer-to-int-cast&lt;/code&gt; Allow
casting between integers and pointers. Unfortunately, the way this
version of Bash was designed requires this.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-Wno-deprecated-declarations&lt;/code&gt; The &lt;code&gt;getwd&lt;/code&gt;
function in &lt;code&gt;unistd.h&lt;/code&gt; is considered deprecated (use
&lt;code&gt;getcwd&lt;/code&gt; instead). However, if &lt;code&gt;getcwd&lt;/code&gt; is
available, Bash uses its own &lt;code&gt;getwd&lt;/code&gt; wrapper around
&lt;code&gt;getcwd&lt;/code&gt; (implemented in &lt;code&gt;general.c&lt;/code&gt;), and only
uses the signature from &lt;code&gt;unistd.h&lt;/code&gt;, not the actual
implementation from libc.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-include stdio.h -include stdlib.h -include string.h&lt;/code&gt;
Several files are missing these header file includes. If not for
&lt;code&gt;-Werror&lt;/code&gt;, the default function signature fallbacks would
work.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-Dexp2=bash_exp2&lt;/code&gt; Avoid a conflict between the parser’s
&lt;code&gt;exp2&lt;/code&gt; helper function and &lt;code&gt;math.h&lt;/code&gt;’s base-2
exponential function.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Have fun, software archaeologists!&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2015 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./purdue-cs-login.html"/>
		<link rel="alternate" type="text/markdown" href="./purdue-cs-login.md"/>
		<id>https://lukeshu.com/blog/purdue-cs-login.html</id>
		<updated>2015-02-06T00:00:00+00:00</updated>
		<published>2015-02-06T00:00:00+00:00</published>
		<title>Customizing your login on Purdue CS computers (WIP, but updated)</title>
		<content type="html">&lt;h1
id="customizing-your-login-on-purdue-cs-computers-wip-but-updated"&gt;Customizing
your login on Purdue CS computers (WIP, but updated)&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;This article is currently a Work-In-Progress. Other than the one
place where I say “I’m not sure”, the GDM section is complete. The
network shares section is a mess, but has some good information.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Most CS students at Purdue spend a lot of time on the lab boxes, but
don’t know a lot about them. This document tries to fix that.&lt;/p&gt;
&lt;p&gt;The lab boxes all run Gentoo.&lt;/p&gt;
&lt;h2 id="gdm-the-gnome-display-manager"&gt;GDM, the Gnome Display
Manager&lt;/h2&gt;
&lt;p&gt;The boxes run &lt;code&gt;gdm&lt;/code&gt; (Gnome Display Manager) 2.20.11 for
the login screen. This is an old version, and has a couple behaviors
that are slightly different than new versions, but here are the
important bits:&lt;/p&gt;
&lt;p&gt;System configuration:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;/usr/share/gdm/defaults.conf&lt;/code&gt; (lower precedence)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;/etc/X11/gdm/custom.conf&lt;/code&gt; (higher precedence)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;User configuration:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;~/.dmrc&lt;/code&gt; (more recent versions use
&lt;code&gt;~/.desktop&lt;/code&gt;, but Purdue boxes aren’t running more recent
versions)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="purdues-gdm-configuration"&gt;Purdue’s GDM configuration&lt;/h3&gt;
&lt;p&gt;Now, &lt;code&gt;custom.conf&lt;/code&gt; sets&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;BaseXsession=/usr/local/share/xsessions/Xsession
SessionDesktopDir=/usr/local/share/xsessions/&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is important, because there are &lt;em&gt;multiple&lt;/em&gt; locations that
look like these files; I take it that they were used at some time in the
past. Don’t get tricked into thinking that it looks at
&lt;code&gt;/etc/X11/gdm/Xsession&lt;/code&gt; (which exists, and is where it would
look by default).&lt;/p&gt;
&lt;p&gt;If you look at the GDM login screen, it has a “Sessions” button that
opens a prompt where you can select any of several sessions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Last session&lt;/li&gt;
&lt;li&gt;1. MATE (&lt;code&gt;mate.desktop&lt;/code&gt;;
&lt;code&gt;Exec=mate-session&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;2. CS Default Session (&lt;code&gt;default.desktop&lt;/code&gt;;
&lt;code&gt;Exec=default&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;3. Custom Session (&lt;code&gt;custom.desktop&lt;/code&gt;;
&lt;code&gt;Exec=custom&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;4. FVWM2 (&lt;code&gt;fvwm2.desktop&lt;/code&gt;; &lt;code&gt;Exec=fvwm2&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;5. gnome.desktop (&lt;code&gt;gnome.desktop&lt;/code&gt;;
&lt;code&gt;Exec=gnome-session&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;6. KDE (&lt;code&gt;kde.desktop&lt;/code&gt;, &lt;code&gt;Exec=startkde&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Failsafe MATE (&lt;code&gt;ShowGnomeFailsafeSession=true&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Failsafe Terminal (&lt;code&gt;ShowXtermFailsafeSession=true&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The main 6 are configured by the &lt;code&gt;.desktop&lt;/code&gt; files in
&lt;code&gt;SessionDesktopDir=/usr/local/share/xsessions&lt;/code&gt;; the last 2
are auto-generated. The reason &lt;code&gt;ShowGnomeFailsafeSession&lt;/code&gt;
correctly generates a Mate session instead of a Gnome session is because
of the patch
&lt;code&gt;/p/portage/*/overlay/gnome-base/gdm/files/gdm-2.20.11-mate.patch&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I’m not sure why Gnome shows up as &lt;code&gt;gnome.desktop&lt;/code&gt; instead
of &lt;code&gt;GNOME&lt;/code&gt; as specified by &lt;code&gt;gnome.desktop:Name&lt;/code&gt;. I
imagine it might be something related to the aforementioned patch, but I
can’t find anything in the patch that looks like it would screw that up;
at least not without a better understanding of GDM’s code.&lt;/p&gt;
&lt;p&gt;Which of the main 6 is used by default (“Last Session”) is configured
with &lt;code&gt;~/.dmrc:Session&lt;/code&gt;, which contains the basename of the
associated &lt;code&gt;.desktop&lt;/code&gt; file (that is, without any directory
part or file extension).&lt;/p&gt;
&lt;p&gt;Every one of the &lt;code&gt;.desktop&lt;/code&gt; files sets
&lt;code&gt;Type=XSession&lt;/code&gt;, which means that instead of running the
argument in &lt;code&gt;Exec=&lt;/code&gt; directly, it passes it as arguments to
the &lt;code&gt;Xsession&lt;/code&gt; program (in the location configured by
&lt;code&gt;BaseXsession&lt;/code&gt;).&lt;/p&gt;
&lt;h4 id="xsession"&gt;Xsession&lt;/h4&gt;
&lt;p&gt;So, now we get to read
&lt;code&gt;/usr/local/share/xsessions/Xsession&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Before it does anything else, it:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;code&gt;. /etc/profile.env&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;unset ROOTPATH&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Try to set up logging to one of &lt;code&gt;~/.xsession-errors&lt;/code&gt;,
&lt;code&gt;$TMPDIR/xses-$USER&lt;/code&gt;, or &lt;code&gt;/tmp/xses-$USER&lt;/code&gt; (it
tries them in that order).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;xsetroot -default&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Fiddles with the maximum number of processes.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;After that, it handles these 3 “special” arguments that were given to
it by various &lt;code&gt;.desktop&lt;/code&gt; &lt;code&gt;Exec=&lt;/code&gt; lines:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;failsafe&lt;/code&gt;: Runs a single xterm window. NB: This is NOT
run by either of the failsafe options. It is likely a vestige from a
prior configuration.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;startkde&lt;/code&gt;: Displays a message saying KDE is no longer
available.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gnome-session&lt;/code&gt;: Displays a message saying GNOME has been
replaced by MATE.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Assuming that none of those were triggered, it then does:&lt;/p&gt;
&lt;ol type="1"&gt;
&lt;li&gt;&lt;code&gt;source ~/.xprofile&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;xrdb -merge ~/.Xresources&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;xmodmap ~/.xmodmaprc&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Finally, it has a switch statement over the arguments given to it by
the various &lt;code&gt;.desktop&lt;/code&gt; &lt;code&gt;Exec=&lt;/code&gt; lines:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;custom&lt;/code&gt;: Executes &lt;code&gt;~/.xsession&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;default&lt;/code&gt;: Executes &lt;code&gt;~/.Xrc.cs&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;mate-session&lt;/code&gt;: It has this whole script to start DBus,
run the &lt;code&gt;mate-session&lt;/code&gt; command, then cleanup when it’s
done.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;*&lt;/code&gt; (&lt;code&gt;fvwm2&lt;/code&gt;): Runs
&lt;code&gt;eval exec "$@"&lt;/code&gt;, which results in it executing the
&lt;code&gt;fvwm2&lt;/code&gt; command.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="network-shares"&gt;Network Shares&lt;/h2&gt;
&lt;p&gt;Your data is on various hosts. I believe most undergrads have their
data on &lt;code&gt;data.cs.purdue.edu&lt;/code&gt; (or just &lt;a
href="https://en.wikipedia.org/wiki/Data_%28Star_Trek%29"&gt;&lt;code&gt;data&lt;/code&gt;&lt;/a&gt;).
Others have theirs on &lt;a
href="http://swfanon.wikia.com/wiki/Antor"&gt;&lt;code&gt;antor&lt;/code&gt;&lt;/a&gt; or &lt;a
href="https://en.wikipedia.org/wiki/Tux"&gt;&lt;code&gt;tux&lt;/code&gt;&lt;/a&gt; (that I
know of).&lt;/p&gt;
&lt;p&gt;Most of the boxes with tons of storage have many network cards, each
with a different IP; a single host’s IPs are mostly the same, but with
varying 3rd octets. For example, &lt;code&gt;data&lt;/code&gt; is 128.10.X.13. If
you need a particular value of X, but don’t want to remember the other
octets, they are individually addressed with
&lt;code&gt;BASENAME-NUMBER.cs.purdue.edu&lt;/code&gt;. For example,
&lt;code&gt;data-25.cs.purdue.edu&lt;/code&gt; is 128.10.25.13.&lt;/p&gt;
&lt;p&gt;They use &lt;a
href="https://www.kernel.org/pub/linux/daemons/autofs/"&gt;AutoFS&lt;/a&gt; quite
extensively. The maps are generated dynamically by
&lt;code&gt;/etc/autofs/*.map&lt;/code&gt;, which are all symlinks to
&lt;code&gt;/usr/libexec/amd2autofs&lt;/code&gt;. As far as I can tell,
&lt;code&gt;amd2autofs&lt;/code&gt; is custom to Purdue. Its source lives in
&lt;code&gt;/p/portage/*/overlay/net-fs/autofs/files/amd2autofs.c&lt;/code&gt;. The
name appears to be a misnomer; it seems to claim to dynamically translate
from the configuration of &lt;a href="http://www.am-utils.org/"&gt;Auto
Mounter Daemon (AMD)&lt;/a&gt; to AutoFS, but it actually talks to NIS. It
does so using the &lt;code&gt;yp&lt;/code&gt; interface, which is in Glibc for
compatibility, but is undocumented. For documentation for that
interface, look at one of the BSDs, or Mac OS X. From the comments
in the file, it appears that it once did look at the AMD configuration,
but has since been changed.&lt;/p&gt;
&lt;p&gt;There are 3 mountpoints using AutoFS: &lt;code&gt;/homes&lt;/code&gt;,
&lt;code&gt;/p&lt;/code&gt;, and &lt;code&gt;/u&lt;/code&gt;. &lt;code&gt;/homes&lt;/code&gt; creates
symlinks on-demand from &lt;code&gt;/homes/USERNAME&lt;/code&gt; to
&lt;code&gt;/u/BUCKET/USERNAME&lt;/code&gt;. &lt;code&gt;/u&lt;/code&gt; mounts NFS shares to
&lt;code&gt;/u/SERVERNAME&lt;/code&gt; on-demand, and creates symlinks from
&lt;code&gt;/u/BUCKET&lt;/code&gt; to &lt;code&gt;/u/SERVERNAME/BUCKET&lt;/code&gt; on-demand.
&lt;code&gt;/p&lt;/code&gt; mounts on-demand various NFS shares that are organized
by topic; the Xinu/MIPS tools are in &lt;code&gt;/p/xinu&lt;/code&gt;, the Portage
tree is in &lt;code&gt;/p/portage&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I’m not sure how &lt;code&gt;scratch&lt;/code&gt; works; it seems to be
heterogeneous between different servers and families of lab boxes.
Sometimes it’s in &lt;code&gt;/u&lt;/code&gt;, sometimes it isn’t.&lt;/p&gt;
&lt;p&gt;This 3rd-party documentation was very helpful to me: &lt;a
href="http://www.linux-consulting.com/Amd_AutoFS/"
class="uri"&gt;http://www.linux-consulting.com/Amd_AutoFS/&lt;/a&gt; It’s where
Gentoo points for the AutoFS homepage, as it doesn’t have a real
homepage. Arch just points to FreshMeat. Debian points to
kernel.org.&lt;/p&gt;
&lt;h3 id="lore"&gt;Lore&lt;/h3&gt;
&lt;p&gt;&lt;a
href="https://en.wikipedia.org/wiki/List_of_Star_Trek:_The_Next_Generation_characters#Lore"&gt;&lt;code&gt;lore&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Lore is a SunOS 5.10 box running on Sun-Fire V445 (sun4u) hardware.
SunOS is NOT GNU/Linux, and sun4u is NOT x86.&lt;/p&gt;
&lt;p&gt;Instead of &lt;code&gt;/etc/fstab&lt;/code&gt; it is
&lt;code&gt;/etc/mnttab&lt;/code&gt;.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2015 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./make-memoize.html"/>
		<link rel="alternate" type="text/markdown" href="./make-memoize.md"/>
		<id>https://lukeshu.com/blog/make-memoize.html</id>
		<updated>2014-11-20T00:00:00+00:00</updated>
		<published>2014-11-20T00:00:00+00:00</published>
		<title>A memoization routine for GNU Make functions</title>
		<content type="html">&lt;h1 id="a-memoization-routine-for-gnu-make-functions"&gt;A memoization
routine for GNU Make functions&lt;/h1&gt;
&lt;p&gt;I’m a big fan of &lt;a href="https://www.gnu.org/software/make/"&gt;GNU
Make&lt;/a&gt;. I’m pretty knowledgeable about it, and was pretty active on
the help-make mailing list for a while. Something that many experienced
make-ers know of is John Graham-Cumming’s “GNU Make Standard Library”,
or &lt;a href="http://gmsl.sourceforge.net/"&gt;GMSL&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I don’t like to use it, as I’m capable of defining macros myself as I
need them instead of pulling in a 3rd party dependency (and generally
like to stay away from the kind of Makefile that would lean heavily on
something like GMSL).&lt;/p&gt;
&lt;p&gt;However, one really neat thing that GMSL offers is a way to memoize
expensive functions (such as those that shell out). I was considering
pulling in GMSL for one of my projects, almost just for the
&lt;code&gt;memoize&lt;/code&gt; function.&lt;/p&gt;
&lt;p&gt;However, John’s &lt;code&gt;memoize&lt;/code&gt; has a couple short-comings that
made it unsuitable for my needs.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Only allows functions that take one argument.&lt;/li&gt;
&lt;li&gt;Considers empty values to be unset; for my needs, an empty string is
a valid value that should be cached.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;So, I implemented my own, more flexible memoization routine for
Make.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# This definition of `rest` is equivalent to that in GMSL
rest = $(wordlist 2,$(words $1),$1)

# How to use: Define 2 variables (the type you would pass to $(call)):
# `_&lt;var&gt;NAME&lt;/var&gt;_main` and `_&lt;var&gt;NAME&lt;/var&gt;_hash`.  Now, `_&lt;var&gt;NAME&lt;/var&gt;_main` is the function getting
# memoized, and _&lt;var&gt;NAME&lt;/var&gt;_hash is a function that hashes the function arguments
# into a string suitable for a variable name.
#
# Then, define the final function like:
#
#     &lt;var&gt;NAME&lt;/var&gt; = $(foreach func,&lt;var&gt;NAME&lt;/var&gt;,$(memoized))

_main = $(_$(func)_main)
_hash = __memoized_$(_$(func)_hash)
memoized = $(if $($(_hash)),,$(eval $(_hash) := _ $(_main)))$(call rest,$($(_hash)))&lt;/code&gt;&lt;/pre&gt;
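&lt;p&gt;As a usage sketch, a two-argument function could be memoized along
these lines. The &lt;code&gt;greet&lt;/code&gt; function and its two helper
variables are hypothetical names made up for illustration (they are not
from the original Makefile), and the fragment assumes the definitions
above are already present in the same Makefile:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical example; `greet` is an illustrative name.
# The function to memoize: it shells out, so it is worth caching.
_greet_main = $(shell echo $1 $2)
# The hash joins both arguments into a string that is usable as part
# of a variable name (this assumes the arguments contain no spaces,
# and a real hash would also need to avoid collisions).
_greet_hash = $1_$2
greet = $(foreach func,greet,$(memoized))

# The first call runs the shell and stores the result in the variable
# `__memoized_hello_world`; the second call reads that variable back.
$(info $(call greet,hello,world))
$(info $(call greet,hello,world))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the cached value is stored with a leading &lt;code&gt;_&lt;/code&gt;
word that &lt;code&gt;rest&lt;/code&gt; strips off again, an empty result still
counts as cached.&lt;/p&gt;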
&lt;p&gt;However, I later removed it from the Makefile, as I &lt;a
href="https://projects.parabola.nu/~lukeshu/maven-dist.git/commit/?id=fec5a7281b3824cb952aa0bb76bbbaa41eaafdf9"&gt;re-implemented&lt;/a&gt;
the bits that it memoized in a more efficient way, such that memoization
was no longer needed, and the whole thing was faster.&lt;/p&gt;
&lt;p&gt;Later, I realized that my memoized routine could have been improved
by replacing &lt;code&gt;func&lt;/code&gt; with &lt;code&gt;$0&lt;/code&gt;, which would
simplify how the final function is declared:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# This definition of `rest` is equivalent to that in GMSL
rest = $(wordlist 2,$(words $1),$1)

# How to use:
#
#     _&lt;var&gt;NAME&lt;/var&gt;_main = &lt;var&gt;your main function to be memoized&lt;/var&gt;
#     _&lt;var&gt;NAME&lt;/var&gt;_hash = &lt;var&gt;your hash function for parameters&lt;/var&gt;
#     &lt;var&gt;NAME&lt;/var&gt; = $(memoized)
#
# The output of your hash function should be a string following
# the same rules that variable names follow.

_main = $(_$0_main)
_hash = __memoized_$(_$0_hash)
memoized = $(if $($(_hash)),,$(eval $(_hash) := _ $(_main)))$(call rest,$($(_hash)))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, I’m pretty sure that should work, but I have only actually
tested the first version.&lt;/p&gt;
&lt;h2 id="tldr"&gt;TL;DR&lt;/h2&gt;
&lt;p&gt;Avoid doing things in Make that would make you lean on complex
solutions like an external memoize function.&lt;/p&gt;
&lt;p&gt;However, if you do end up needing a more flexible memoize routine, I
wrote one that you can use.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2014 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="http://www.wtfpl.net/txt/copying/"&gt;WTFPL-2&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./ryf-routers.html"/>
		<link rel="alternate" type="text/markdown" href="./ryf-routers.md"/>
		<id>https://lukeshu.com/blog/ryf-routers.html</id>
		<updated>2014-09-12T00:00:00+00:00</updated>
		<published>2014-09-12T00:00:00+00:00</published>
		<title>I'm excited about the new RYF-certified routers from ThinkPenguin</title>
		<content type="html">&lt;h1
id="im-excited-about-the-new-ryf-certified-routers-from-thinkpenguin"&gt;I’m
excited about the new RYF-certified routers from ThinkPenguin&lt;/h1&gt;
&lt;p&gt;I just learned that on Wednesday, the FSF &lt;a
href="https://www.fsf.org/resources/hw/endorsement/thinkpenguin"&gt;awarded&lt;/a&gt;
the &lt;abbr title="Respects Your Freedom"&gt;RYF&lt;/abbr&gt; certification to the
&lt;a href="https://www.thinkpenguin.com/TPE-NWIFIROUTER"&gt;Think Penguin
TPE-NWIFIROUTER&lt;/a&gt; wireless router.&lt;/p&gt;
&lt;p&gt;I didn’t find this information directly published up front, but
simply: It is a re-branded &lt;strong&gt;TP-Link TL-841ND&lt;/strong&gt; modded to
be running &lt;a href="http://librecmc.com/"&gt;libreCMC&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’ve been a fan of the TL-841/740 line of routers for several years
now. They are dirt cheap (if you go to Newegg and sort by “cheapest,”
it’s frequently the TL-740N), are extremely reliable, and run OpenWRT
like a champ. They are my go-to routers.&lt;/p&gt;
&lt;p&gt;(And they sure beat the snot out of the Arris TG862 that it seems
like everyone has in their homes now. I hate that thing, it even has
buggy packet scheduling.)&lt;/p&gt;
&lt;p&gt;So this announcement is &lt;del&gt;doubly&lt;/del&gt;triply exciting for me:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I have a solid recommendation for a router that doesn’t require me
or them to manually install an after-market firmware (buy it from
ThinkPenguin).&lt;/li&gt;
&lt;li&gt;If it’s for me, or someone technical, I can cut costs by getting a
stock TP-Link from Newegg, installing libreCMC ourselves.&lt;/li&gt;
&lt;li&gt;I can install a 100% libre distribution on my existing routers
(until recently, they were not supported by any of the libre
distributions, not for technical reasons, but lack of manpower).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I hope to get libreCMC installed on my boxes this weekend!&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2014 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./what-im-working-on-fall-2014.html"/>
		<link rel="alternate" type="text/markdown" href="./what-im-working-on-fall-2014.md"/>
		<id>https://lukeshu.com/blog/what-im-working-on-fall-2014.html</id>
		<updated>2014-09-11T00:00:00+00:00</updated>
		<published>2014-09-11T00:00:00+00:00</published>
		<title>What I'm working on (Fall 2014)</title>
		<content type="html">&lt;h1 id="what-im-working-on-fall-2014"&gt;What I’m working on (Fall
2014)&lt;/h1&gt;
&lt;p&gt;I realized today that I haven’t updated my log in a while, and I
don’t have any “finished” stuff to show off right now, but I should just
talk about all the cool stuff I’m working on right now.&lt;/p&gt;
&lt;h2 id="static-parsing-of-subshells"&gt;Static parsing of subshells&lt;/h2&gt;
&lt;p&gt;Last year I wrote a shell (for my Systems Programming class);
however, I went above-and-beyond and added some really novel features.
In my opinion, the most significant is that it parses arbitrarily deep
subshells in one pass, instead of deferring them until execution. No
shell that I know of does this.&lt;/p&gt;
&lt;p&gt;At first this sounds like a really difficult but minor feature,
until you think about scripting, and maintenance of those scripts. Being
able to do a full syntax check of a script is &lt;em&gt;crucial&lt;/em&gt; for
long-term maintenance, yet it’s something that is missing from every
major shell. I’d love to get this code merged into bash. It would be
incredibly useful for &lt;a
href="/git/mirror/parabola/packages/libretools.git"&gt;some software I
maintain&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Anyway, I’m trying to publish this code, but because of a recent
kerfuffle with a student publishing all of his projects on the web (and
other students trying to pass it off as their own), I’m being cautious
with this and making sure Purdue is alright with what I’m putting
online.&lt;/p&gt;
&lt;h2 id="stateless-user-configuration-for-pamnss"&gt;&lt;a
href="https://lukeshu.com/git/mirror/parabola/hackers.git/log/?h=lukeshu/restructure"&gt;Stateless
user configuration for PAM/NSS&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Parabola GNU/Linux-libre users know that over this summer, we had a
&lt;em&gt;mess&lt;/em&gt; with server outages. One of the servers is still out (due
to things out of our control), and we don’t have some of the data on it
(because volunteer developers are terrible about back-ups,
apparently).&lt;/p&gt;
&lt;p&gt;This has caused us to look at how we manage our servers, back-ups,
and several other things.&lt;/p&gt;
&lt;p&gt;One thing that I’ve taken on as my pet project is making sure that if
a server goes down, or we need to migrate (for example, Jon is telling
us that he wants us to hurry up and switch to the new 64 bit hardware so
he can turn off the 32 bit box), we can spin up a new server from
scratch pretty easily. Part of that is making configurations stateless,
and dynamic based on external data; having data be located in one place
instead of duplicated across 12 config files and 3 databases… on the
same box.&lt;/p&gt;
&lt;p&gt;Right now, that’s looking like some custom software interfacing with
OpenLDAP and OpenSSH via sockets (OpenLDAP being a middle-man between us
and PAM (Linux) and NSS (libc)). However, the OpenLDAP documentation is…
inconsistent and frustrating. I might end up hacking up the LDAP modules
for NSS and PAM to talk to our system directly, and cut OpenLDAP out of
the picture. We’ll see!&lt;/p&gt;
&lt;p&gt;PS: Pablo says that tomorrow we should be getting out-of-band access
to the drive of the server that is down, so that we can finally restore
those services on a different server.&lt;/p&gt;
&lt;h2 id="project-leaguer"&gt;&lt;a
href="https://lukeshu.com/git/mirror/leaguer.git/"&gt;Project
Leaguer&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;Last year, some friends and I began writing some “eSports tournament
management software”, primarily targeting League of Legends (though it
has a module system that will allow it to support tons of different data
sources). We mostly got it done last semester, but it had some rough
spots and sharp edges we need to work out. Because we were all out of
communication for the summer, we didn’t work on it very much (but we did
a little!). It’s weird that I care about this, because I’m not a gamer.
Huh, I guess coding with friends is just fun.&lt;/p&gt;
&lt;p&gt;Anyway, this year, &lt;a
href="https://github.com/AndrewMurrell"&gt;Andrew&lt;/a&gt;, &lt;a
href="https://github.com/DavisLWebb"&gt;Davis&lt;/a&gt;, and I are planning to
get it to a polished state by the end of the semester. We could probably
do it faster, but we’d all also like to focus on classes and other
projects a little more.&lt;/p&gt;
&lt;h2 id="c1"&gt;C+=1&lt;/h2&gt;
&lt;p&gt;People tend to lump C and C++ together, which upsets me, because I
love C, but have a dislike for C++. That’s not to say that C++ is
entirely bad; it has some good features. My “favorite” code is actually
code that is basically C, but takes advantage of a couple C++ features,
while still being idiomatic C, not C++.&lt;/p&gt;
&lt;p&gt;Anyway, with the perspective of history (what worked and what
didn’t), and a slightly opinionated view on language design (I’m pretty
much a Rob Pike fan-boy), I thought I’d try to tackle “object-oriented
C” with roughly the same design criteria as Stroustrup had when
designing C++. I’m calling mine C+=1, for obvious reasons.&lt;/p&gt;
&lt;p&gt;I haven’t published anything yet, because calling it “working” would
be stretching the truth. But I am using it for my assignments in CS 334
(Intro to Graphics), so it should move along fairly quickly, as my grade
depends on it.&lt;/p&gt;
&lt;p&gt;I’m not taking it too seriously; I don’t expect it to be much more
than a toy language, but it is an excuse to dive into the GCC
internals.&lt;/p&gt;
&lt;h2 id="projects-that-ive-put-on-the-back-burner"&gt;Projects that I’ve put
on the back-burner&lt;/h2&gt;
&lt;p&gt;I’ve got several other projects that I’m putting on hold for a
while.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;maven-dist&lt;/code&gt; (was hosted with Parabola, apparently I
haven’t pushed it anywhere except the server that is down): A tool to
build Apache Maven from source. That sounds easy, it’s open source,
right? Well, except that Maven is the build system from hell. It doesn’t
support cyclic dependencies, yet uses them internally to build itself.
It &lt;em&gt;loves&lt;/em&gt; to just get binaries from Maven Central to “optimize”
the build process. It depends on code that depends on compiler bugs that
no longer exist (which I guess means that &lt;em&gt;no one&lt;/em&gt; has tried to
build it from source after it was originally published). I’ve been
working on-and-off on this for more than a year. My favorite part of it
was writing a &lt;a href="/dump/jflex2jlex.sed.txt"&gt;sed script&lt;/a&gt; that
translates a JFlex grammar specification into a JLex grammar, which is
used to bootstrap JFlex; it’s both gross and delightful at the same
time.&lt;/li&gt;
&lt;li&gt;Integration between &lt;code&gt;dbscripts&lt;/code&gt; and
&lt;code&gt;abslibre&lt;/code&gt;. If you search IRC logs, mailing lists, and
ParabolaWiki, you can find numerous rants by me against &lt;a
href="/git/mirror/parabola/dbscripts.git/tree/db-sync"&gt;&lt;code&gt;dbscripts:db-sync&lt;/code&gt;&lt;/a&gt;.
I just hate the data-flow, it is almost designed to make things get out
of sync, and broken. I mean, does &lt;a
href="/dump/parabola-data-flow.svg"&gt;this&lt;/a&gt; look like a simple diagram?
For contrast, &lt;a href="/dump/parabola-data-flow-xbs.svg"&gt;here’s&lt;/a&gt; a
rough (slightly incomplete) diagram of what I want to replace it
with.&lt;/li&gt;
&lt;li&gt;Git backend for MediaWiki (or, pulling out the rendering module of
MediaWiki). I’ve made decent progress on that front, but there is
&lt;em&gt;crazy&lt;/em&gt; de-normalization going on in the MediaWiki schema that
makes this very difficult. I’m sure some of it is for historical
reasons, and some of it for performance, but either way it is a mess for
someone trying to neatly gut that part of the codebase.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="other"&gt;Other&lt;/h2&gt;
&lt;p&gt;I should consider doing a write-up of deterministic-&lt;code&gt;tar&lt;/code&gt;
behavior (something that I’ve been implementing in Parabola for a while,
meanwhile the Debian people have also been working on it).&lt;/p&gt;
&lt;p&gt;I should also consider doing a “post-mortem” of &lt;a
href="https://lukeshu.com/git/mirror/parabola/packages/pbs-tools.git/"&gt;PBS&lt;/a&gt;,
which never actually got used, but launched XBS (part of the
&lt;code&gt;dbscripts&lt;/code&gt;/&lt;code&gt;abslibre&lt;/code&gt; integration mentioned
above), as well as serving as a good test-bed for features that did get
implemented.&lt;/p&gt;
&lt;p&gt;I over-use the word “anyway.”&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2014 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./rails-improvements.html"/>
		<link rel="alternate" type="text/markdown" href="./rails-improvements.md"/>
		<id>https://lukeshu.com/blog/rails-improvements.html</id>
		<updated>2014-05-08T00:00:00+00:00</updated>
		<published>2014-05-08T00:00:00+00:00</published>
		<title>Miscellaneous ways to improve your Rails experience</title>
		<content type="html">&lt;h1
id="miscellaneous-ways-to-improve-your-rails-experience"&gt;Miscellaneous
ways to improve your Rails experience&lt;/h1&gt;
&lt;p&gt;Recently, I’ve been working on &lt;a
href="https://github.com/LukeShu/leaguer"&gt;a Rails web application&lt;/a&gt;,
that’s really the baby of a friend of mine. Anyway, through its
development, I’ve come up with a couple things that should make your
interactions with Rails more pleasant.&lt;/p&gt;
&lt;h2
id="auto-reload-classes-from-other-directories-than-app"&gt;Auto-(re)load
classes from other directories than &lt;code&gt;app/&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;The development server automatically loads and reloads files from the
&lt;code&gt;app/&lt;/code&gt; directory, which is extremely nice. However, most web
applications are going to involve modules that aren’t in that directory;
and editing those files requires re-starting the server for the changes
to take effect.&lt;/p&gt;
&lt;p&gt;Adding the following lines to your &lt;a
href="https://github.com/LukeShu/leaguer/blob/c846cd71411ec3373a5229cacafe0df6b3673543/config/application.rb#L15"&gt;&lt;code&gt;config/application.rb&lt;/code&gt;&lt;/a&gt;
will allow it to automatically load and reload files from the
&lt;code&gt;lib/&lt;/code&gt; directory. You can of course change this to whichever
directory/ies you like.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;module YourApp
    class Application &amp;lt; Rails::Application
        …
        config.autoload_paths += [&amp;quot;#{Rails.root}/lib&amp;quot;]
        config.watchable_dirs[&amp;quot;#{Rails.root}/lib&amp;quot;] = [:rb]
        …
    end
end&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id="have-submit_tag-generate-a-button-instead-of-an-input"&gt;Have
&lt;code&gt;submit_tag&lt;/code&gt; generate a button instead of an input&lt;/h2&gt;
&lt;p&gt;In HTML, the &lt;code&gt;&amp;lt;input type="submit"&amp;gt;&lt;/code&gt; tag styles
slightly differently than other inputs or buttons. It is impossible to
precisely control the height via CSS, which makes designing forms a
pain. This is particularly noticeable if you use Bootstrap 3, and put it
next to another button; the submit button will be slightly shorter
vertically.&lt;/p&gt;
&lt;p&gt;The obvious fix here is to use
&lt;code&gt;&amp;lt;button type="submit"&amp;gt;&lt;/code&gt; instead. The following code
will modify the default Rails form helpers to generate a button tag
instead of an input tag. Just stick the code in &lt;a
href="https://github.com/LukeShu/leaguer/blob/521eae01be1ca3f69b47b3170a0548c3268f4a22/config/initializers/form_improvements.rb"&gt;&lt;code&gt;config/initializers/form_improvements.rb&lt;/code&gt;&lt;/a&gt;;
it will override
&lt;code&gt;ActionView::Helpers::FormTagHelper#submit_tag&lt;/code&gt;. It is mostly
the standard definition of the function, except for the last line, which
has changed.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# -*- ruby-indent-level: 2; indent-tabs-mode: nil -*-
module ActionView
  module Helpers
    module FormTagHelper

      # This is modified from actionpack-4.0.2/lib/action_view/helpers/form_tag_helper.rb#submit_tag
      def submit_tag(value = &amp;quot;Save changes&amp;quot;, options = {})
        options = options.stringify_keys

        if disable_with = options.delete(&amp;quot;disable_with&amp;quot;)
          message = &amp;quot;:disable_with option is deprecated and will be removed from Rails 4.1. &amp;quot; \
                    &amp;quot;Use &amp;#39;data: { disable_with: \&amp;#39;Text\&amp;#39; }&amp;#39; instead.&amp;quot;
          ActiveSupport::Deprecation.warn message

          options[&amp;quot;data-disable-with&amp;quot;] = disable_with
        end

        if confirm = options.delete(&amp;quot;confirm&amp;quot;)
          message = &amp;quot;:confirm option is deprecated and will be removed from Rails 4.1. &amp;quot; \
                    &amp;quot;Use &amp;#39;data: { confirm: \&amp;#39;Text\&amp;#39; }&amp;#39; instead&amp;#39;.&amp;quot;
          ActiveSupport::Deprecation.warn message

          options[&amp;quot;data-confirm&amp;quot;] = confirm
        end

        content_tag(:button, value, { &amp;quot;type&amp;quot; =&amp;gt; &amp;quot;submit&amp;quot;, &amp;quot;name&amp;quot; =&amp;gt; &amp;quot;commit&amp;quot;, &amp;quot;value&amp;quot; =&amp;gt; value }.update(options))
      end

    end
  end
end&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I’ll probably update this page as I tweak other things I don’t
like.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2014 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./bash-redirection.html"/>
		<link rel="alternate" type="text/markdown" href="./bash-redirection.md"/>
		<id>https://lukeshu.com/blog/bash-redirection.html</id>
		<updated>2014-02-13T00:00:00+00:00</updated>
		<published>2014-02-13T00:00:00+00:00</published>
		<title>Bash redirection</title>
		<content type="html">&lt;h1 id="bash-redirection"&gt;Bash redirection&lt;/h1&gt;
&lt;p&gt;Apparently, too many people don’t understand Bash redirection. They
might get the basic syntax, but they think of the process as
declarative; in Bourne-ish shells, it is procedural.&lt;/p&gt;
&lt;p&gt;In Bash, streams are handled in terms of “file descriptors” or “FDs”.
FD 0 is stdin, FD 1 is stdout, and FD 2 is stderr. The equivalence (or
lack thereof) between using a numeric file descriptor, and using the
associated file in &lt;code&gt;/dev/*&lt;/code&gt; and &lt;code&gt;/proc/*&lt;/code&gt; is
interesting, but beyond the scope of this article.&lt;/p&gt;
&lt;h2 id="step-1-pipes"&gt;Step 1: Pipes&lt;/h2&gt;
&lt;p&gt;To quote the Bash manual:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;A &amp;#39;pipeline&amp;#39; is a sequence of simple commands separated by one of the
control operators &amp;#39;|&amp;#39; or &amp;#39;|&amp;amp;&amp;#39;.

   The format for a pipeline is
     [time [-p]] [!] COMMAND1 [ [| or |&amp;amp;] COMMAND2 ...]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, &lt;code&gt;|&amp;amp;&lt;/code&gt; is just shorthand for
&lt;code&gt;2&amp;gt;&amp;amp;1 |&lt;/code&gt;; the pipe part happens here, but the
&lt;code&gt;2&amp;gt;&amp;amp;1&lt;/code&gt; part doesn’t happen until step 2.&lt;/p&gt;
&lt;p&gt;First, if the command is part of a pipeline, the pipes are set up.
For every instance of the &lt;code&gt;|&lt;/code&gt; metacharacter, Bash creates a
pipe (&lt;code&gt;pipe(3)&lt;/code&gt;), and duplicates (&lt;code&gt;dup2(3)&lt;/code&gt;) the
write end of the pipe to FD 1 of the process on the left side of the
&lt;code&gt;|&lt;/code&gt;, and duplicates the read end of the pipe to FD 0 of the
process on the right side.&lt;/p&gt;
&lt;h2 id="step-2-redirections"&gt;Step 2: Redirections&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;After&lt;/em&gt; the initial FD 0 and FD 1 fiddling by pipes is done,
Bash looks at the redirections. &lt;strong&gt;This means that redirections can
override pipes.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Redirections are read left-to-right, and are executed as they are
read, using &lt;code&gt;dup2(right-side, left-side)&lt;/code&gt;. This is where most
of the confusion comes from: people think of them as declarative, which
leads to them doing the first of these, when they mean to do the
second:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cmd 2&amp;gt;&amp;amp;1 &amp;gt;file # stdout goes to file, stderr goes to stdout
cmd &amp;gt;file 2&amp;gt;&amp;amp;1 # both stdout and stderr go to file&lt;/code&gt;&lt;/pre&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2014 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./java-segfault.html"/>
		<link rel="alternate" type="text/markdown" href="./java-segfault.md"/>
		<id>https://lukeshu.com/blog/java-segfault.html</id>
		<updated>2014-01-13T00:00:00+00:00</updated>
		<published>2014-01-13T00:00:00+00:00</published>
		<title>My favorite bug: segfaults in Java</title>
		<content type="html">&lt;h1 id="my-favorite-bug-segfaults-in-java"&gt;My favorite bug: segfaults in
Java&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;Update: Two years later, I wrote a more detailed version of this
article: &lt;a href="./java-segfault-redux.html"&gt;My favorite bug: segfaults
in Java (redux)&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I’ve told this story orally a number of times, but realized that I
have never written it down. This is my favorite bug story; it might not
be my hardest bug, but it is the one I most like to tell.&lt;/p&gt;
&lt;h2 id="the-context"&gt;The context&lt;/h2&gt;
&lt;p&gt;In 2012, I was a Senior programmer on the FIRST Robotics Competition
team 1024. For the unfamiliar, the relevant part of the setup is that
there are 2 minute and 15 second matches in which you have a 120 pound
robot that sometimes runs autonomously, and sometimes is controlled over
WiFi from a person at a laptop running stock “driver station” software
and modifiable “dashboard” software.&lt;/p&gt;
&lt;p&gt;That year, we mostly used the dashboard software to allow the human
driver and operator to monitor sensors on the robot, one of them being a
video feed from a web-cam mounted on it. This was really easy because
the new standard dashboard program had a click-and-drag interface to add
stock widgets; you just had to make sure the code on the robot was
actually sending the data.&lt;/p&gt;
&lt;p&gt;That was great until, while we were debugging things, the dashboard would
suddenly vanish. If it was run manually from a terminal (instead of
letting the driver station software launch it), you would see a core
dump indicating a segmentation fault.&lt;/p&gt;
&lt;p&gt;This wasn’t just us either; I spoke with people on other teams, and
everyone who was streaming video had this issue. But because it only
happened every couple of minutes, and a match is only 2:15, the dashboard
didn’t need to run very long; they just crossed their fingers and hoped
it didn’t happen during a match.&lt;/p&gt;
&lt;p&gt;The dashboard was written in Java, and the source was available
(under a 3-clause BSD license), so I dove in, hunting for the bug. Now,
the program did use Java Native Interface to talk to OpenCV, which the
video ran through; so I figured that it must be a bug in the C/C++ code
that was being called. It was especially a pain to track down the
pointers that were causing the issue, because it was hard with native
debuggers to see through all of the JVM stuff to the OpenCV code, and
the OpenCV stuff is opaque to Java debuggers.&lt;/p&gt;
&lt;p&gt;Eventually the issue led me back into the Java code: there was a
native pointer being stored in a Java variable; Java code called the
native routine to &lt;code&gt;free()&lt;/code&gt; the structure, but then tried to
feed it to another routine later. This led to difficulty again: tracking
objects with Java debuggers was hard because they don’t expect the
program to suddenly segfault; it’s Java code, Java doesn’t segfault, it
throws exceptions!&lt;/p&gt;
&lt;p&gt;With the help of &lt;code&gt;println()&lt;/code&gt; I was eventually able to see
that some code was executing in an order that straight didn’t make
sense.&lt;/p&gt;
&lt;h2 id="the-bug"&gt;The bug&lt;/h2&gt;
&lt;p&gt;The issue was that Java was making an unsafe optimization (I never
bothered to figure out if it is the compiler or the JVM making the
mistake, I was satisfied once I had a work-around).&lt;/p&gt;
&lt;p&gt;Java was doing something similar to tail-call optimization with
regard to garbage collection. You see, if it is waiting for the return
value of a method &lt;code&gt;m()&lt;/code&gt; of object &lt;code&gt;o&lt;/code&gt;, and code in
&lt;code&gt;m()&lt;/code&gt; that is yet to be executed doesn’t access any other
methods or properties of &lt;code&gt;o&lt;/code&gt;, then it will go ahead and
consider &lt;code&gt;o&lt;/code&gt; eligible for garbage collection before
&lt;code&gt;m()&lt;/code&gt; has finished running.&lt;/p&gt;
&lt;p&gt;That is normally a safe optimization to make… except for when a
destructor method (&lt;code&gt;finalize()&lt;/code&gt;) is defined for the object;
the destructor can have side effects, and Java has no way to know
whether it is safe for them to happen before &lt;code&gt;m()&lt;/code&gt; has
finished running.&lt;/p&gt;
&lt;h2 id="the-work-around"&gt;The work-around&lt;/h2&gt;
&lt;p&gt;The routine that the segmentation fault was occurring in was
something like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;public type1 getFrame() {
    type2 child = this.getChild();
    type3 var = this.something();
    // `this` may now be garbage collected
    return child.somethingElse(var); // segfault comes here
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Where the destructor method of &lt;code&gt;this&lt;/code&gt; calls a method that
will &lt;code&gt;free()&lt;/code&gt; native memory that is also accessed by
&lt;code&gt;child&lt;/code&gt;; if &lt;code&gt;this&lt;/code&gt; is garbage collected before
&lt;code&gt;child.somethingElse()&lt;/code&gt; runs, the backing native code will
try to access memory that has been &lt;code&gt;free()&lt;/code&gt;ed, and receive a
segmentation fault. That usually didn’t happen, as the routines were
pretty fast. However, running 30 times a second, eventually bad luck
with the garbage collector happens, and the program crashes.&lt;/p&gt;
&lt;p&gt;The work-around was to insert a bogus method call on
&lt;code&gt;this&lt;/code&gt;, keeping it around until after we were also done with
&lt;code&gt;child&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;public type1 getFrame() {
    type2 child = this.getChild();
    type3 var = this.something();
    type1 ret = child.somethingElse(var);
    this.getSize(); // bogus call to keep `this` around
    return ret;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Yeah. After spending weeks wading through thousands of lines
of Java, C, and C++, a bogus call to a method I didn’t care about was
the fix.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2014 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./bash-arrays.html"/>
		<link rel="alternate" type="text/markdown" href="./bash-arrays.md"/>
		<id>https://lukeshu.com/blog/bash-arrays.html</id>
		<updated>2013-10-13T00:00:00+00:00</updated>
		<published>2013-10-13T00:00:00+00:00</published>
		<title>Bash arrays</title>
		<content type="html">&lt;h1 id="bash-arrays"&gt;Bash arrays&lt;/h1&gt;
&lt;p&gt;Way too many people don’t understand Bash arrays. Many of them argue
that if you need arrays, you shouldn’t be using Bash. If we reject the
notion that one should never use Bash for scripting, then thinking you
don’t need Bash arrays is what I like to call “wrong”. I don’t even mean
real scripting; even these little stubs in &lt;code&gt;/usr/bin&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/sh
java -jar /…/something.jar $* # WRONG!&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Command line arguments are exposed as an array, that little
&lt;code&gt;$*&lt;/code&gt; is accessing it, and is doing the wrong thing (for the
lazy, the correct thing is &lt;code&gt;-- "$@"&lt;/code&gt;). Arrays in Bash offer a
safe way to preserve field separation.&lt;/p&gt;
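&lt;p&gt;To make the difference concrete, here is a minimal sketch. The
&lt;code&gt;count_args&lt;/code&gt; helper and the file names are made up for
illustration; &lt;code&gt;set --&lt;/code&gt; just simulates the script being
invoked with two arguments, one of which contains a space.&lt;/p&gt;

```shell
#!/bin/bash
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() { echo "$#"; }

# Simulate a script that was invoked with two arguments (assumed names).
set -- "my file.txt" "other.txt"

unquoted_star=$(count_args $*)   # re-split on whitespace
quoted_at=$(count_args "$@")     # one word per original argument

echo "unquoted \$* passed ${unquoted_star} arguments"
echo "quoted \$@ passed ${quoted_at} arguments"
```

&lt;p&gt;The unquoted &lt;code&gt;$*&lt;/code&gt; hands over three arguments, because
the space inside the first one is treated as a separator;
&lt;code&gt;"$@"&lt;/code&gt; hands over the original two.&lt;/p&gt;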
&lt;p&gt;One of the main sources of bugs (and security holes) in shell scripts
is field separation. That’s what arrays are about.&lt;/p&gt;
&lt;h2 id="what-field-separation"&gt;What? Field separation?&lt;/h2&gt;
&lt;p&gt;Field separation is just splitting a larger unit into a list of
“fields”. The most common case is when Bash splits a “simple command”
(in the Bash manual’s terminology) into a list of arguments.
Understanding how this works is an important prerequisite to
understanding arrays, and even why they are important.&lt;/p&gt;
&lt;p&gt;Dealing with lists is something that is very common in Bash scripts;
from dealing with lists of arguments, to lists of files; they pop up a
lot, and each time, you need to think about how the list is separated.
In the case of &lt;code&gt;$PATH&lt;/code&gt;, the list is separated by colons. In
the case of &lt;code&gt;$CFLAGS&lt;/code&gt;, the list is separated by whitespace.
In the case of actual arrays, it’s easy, there’s no special character to
worry about, just quote it, and you’re good to go.&lt;/p&gt;
&lt;h2 id="bash-word-splitting"&gt;Bash word splitting&lt;/h2&gt;
&lt;p&gt;When Bash reads a “simple command”, it splits the whole thing into a
list of “words”. “The first word specifies the command to be executed,
and is passed as argument zero. The remaining words are passed as
arguments to the invoked command.” (to quote &lt;code&gt;bash(1)&lt;/code&gt;)&lt;/p&gt;
&lt;p&gt;It is often hard for those unfamiliar with Bash to understand when
something is multiple words, and when it is a single word that just
contains a space or newline. To help gain an intuitive understanding, I
recommend using the following command to print a bullet list of words,
to see how Bash splits them up:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;printf ' -&gt; %s\n' &lt;var&gt;words…&lt;/var&gt;&lt;hr&gt; -&amp;gt; word one
 -&amp;gt; multiline
word
 -&amp;gt; third word
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In a simple command, in absence of quoting, Bash separates the “raw”
input into words by splitting on spaces and tabs. In other places, such
as when expanding a variable, it uses the same process, but splits on
the characters in the &lt;code&gt;$IFS&lt;/code&gt; variable (which has the default
value of space/tab/newline). This process is, creatively enough, called
“word splitting”.&lt;/p&gt;
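&lt;p&gt;Here is a small sketch of word splitting on an expanded variable,
assuming &lt;code&gt;$IFS&lt;/code&gt; still has its default value when the script
starts. The &lt;code&gt;count_words&lt;/code&gt; helper and the
&lt;code&gt;path_like&lt;/code&gt; variable are hypothetical names used only for
this example.&lt;/p&gt;

```shell
#!/bin/bash
# count_words is a hypothetical helper: it reports how many words it received.
count_words() { echo "$#"; }

path_like="/usr/local/bin:/usr/bin:/bin"

# Default IFS (space/tab/newline): the value contains none of those, so 1 word.
default_split=$(count_words $path_like)

# IFS set to a colon inside the subshell: the expansion splits into 3 words.
colon_split=$(IFS=:; count_words $path_like)

# Double quotes inhibit word splitting regardless of IFS: 1 word.
quoted=$(IFS=:; count_words "$path_like")

echo "default IFS: ${default_split}, colon IFS: ${colon_split}, quoted: ${quoted}"
```

&lt;p&gt;Setting &lt;code&gt;IFS&lt;/code&gt; inside the &lt;code&gt;$(…)&lt;/code&gt; subshell
keeps the change from leaking into the rest of the script.&lt;/p&gt;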
&lt;p&gt;In most discussions of Bash arrays, one of the frequent criticisms is
all the footnotes and “gotchas” about when to quote things. That’s
because they usually don’t set the context of word splitting.
&lt;strong&gt;Double quotes (&lt;code&gt;"&lt;/code&gt;) inhibit Bash from doing word
splitting.&lt;/strong&gt; That’s it, that’s all they do. Arrays are already
split into words; without wrapping them in double quotes Bash re-word
splits them, which is almost &lt;em&gt;never&lt;/em&gt; what you want; otherwise,
you wouldn’t be working with an array.&lt;/p&gt;
&lt;h2 id="normal-array-syntax"&gt;Normal array syntax&lt;/h2&gt;
&lt;table&gt;
  &lt;caption&gt;
    &lt;h1&gt;Setting an array&lt;/h1&gt;
    &lt;p&gt;&lt;var&gt;words…&lt;/var&gt; is expanded and subject to word splitting
       based on &lt;code&gt;$IFS&lt;/code&gt;.&lt;/p&gt;
  &lt;/caption&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;array=(&lt;var&gt;words…&lt;/var&gt;)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Set the contents of the entire array.&lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;array+=(&lt;var&gt;words…&lt;/var&gt;)&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Appends &lt;var&gt;words…&lt;/var&gt; to the end of the array.&lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;array[&lt;var&gt;n&lt;/var&gt;]=&lt;var&gt;word&lt;/var&gt;&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Sets an individual entry in the array, the first entry is at
          &lt;var&gt;n&lt;/var&gt;=0.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
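&lt;p&gt;As a rough illustration of the three forms in that table (the
array contents are arbitrary), the following sets, appends to, and
overwrites part of an array, then prints one entry per line:&lt;/p&gt;

```shell
#!/bin/bash
array=(foo "bar baz")      # two entries; the quotes keep "bar baz" as one word
array+=(qux)               # append a third entry
array[0]="first entry"     # overwrite the entry at n=0

# Print one bullet per entry.
printf ' - %s\n' "${array[@]}"
```

&lt;p&gt;This should print three bullets: &lt;code&gt;first entry&lt;/code&gt;,
&lt;code&gt;bar baz&lt;/code&gt;, and &lt;code&gt;qux&lt;/code&gt;.&lt;/p&gt;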
&lt;p&gt;Now, for accessing the array. The most important things for
understanding arrays are quoting them, and understanding the difference
between &lt;code&gt;@&lt;/code&gt; and &lt;code&gt;*&lt;/code&gt;.&lt;/p&gt;
&lt;table&gt;
  &lt;caption&gt;
    &lt;h1&gt;Getting an entire array&lt;/h1&gt;
    &lt;p&gt;Unless these are wrapped in double quotes, they are subject to
       word splitting, which defeats the purpose of arrays.&lt;/p&gt;
    &lt;p&gt;I guess it's worth mentioning that if you don't quote them, and
       word splitting is applied, &lt;code&gt;@&lt;/code&gt; and &lt;code&gt;*&lt;/code&gt;
       end up being equivalent.&lt;/p&gt;
    &lt;p&gt;With &lt;code&gt;*&lt;/code&gt;, when joining the elements into a single
       string, the elements are separated by the first character in
       &lt;code&gt;$IFS&lt;/code&gt;, which is, by default, a space.&lt;/p&gt;
  &lt;/caption&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${array[@]}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Evaluates to every element of the array, as separate
          words.&lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${array[*]}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Evaluates to every element of the array, as a single
          word.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;It’s really that simple—that covers most usages of arrays, and most
of the mistakes made with them.&lt;/p&gt;
&lt;p&gt;To help you understand the difference between &lt;code&gt;@&lt;/code&gt; and
&lt;code&gt;*&lt;/code&gt;, here is a sample of each:&lt;/p&gt;
&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;&lt;th&gt;&lt;code&gt;@&lt;/code&gt;&lt;/th&gt;&lt;th&gt;&lt;code&gt;*&lt;/code&gt;&lt;/th&gt;&lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Input:&lt;pre&gt;&lt;code&gt;#!/bin/bash
array=(foo bar baz)
for item in "${array[@]}"; do
        echo " - &amp;lt;${item}&amp;gt;"
done&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
      &lt;td&gt;Input:&lt;pre&gt;&lt;code&gt;#!/bin/bash
array=(foo bar baz)
for item in "${array[*]}"; do
        echo " - &amp;lt;${item}&amp;gt;"
done&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Output:&lt;pre&gt;&lt;code&gt; - &amp;lt;foo&amp;gt;
 - &amp;lt;bar&amp;gt;
 - &amp;lt;baz&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
     &lt;td&gt;Output:&lt;pre&gt;&lt;code&gt; - &amp;lt;foo bar baz&amp;gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;In most cases, &lt;code&gt;@&lt;/code&gt; is what you want, but &lt;code&gt;*&lt;/code&gt;
comes up often enough too.&lt;/p&gt;
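&lt;p&gt;One place where &lt;code&gt;*&lt;/code&gt; is handy is joining an array into a
single delimited string. This is only a sketch with made-up variable
names; it relies on the rule above that the elements are joined with the
first character of &lt;code&gt;$IFS&lt;/code&gt;.&lt;/p&gt;

```shell
#!/bin/bash
array=(foo bar baz)

# With the default IFS, the elements are joined with spaces.
space_joined="${array[*]}"

# With IFS set to a comma (inside a subshell), they are joined with commas.
comma_joined=$(IFS=,; echo "${array[*]}")

echo "$space_joined"
echo "$comma_joined"
```

&lt;p&gt;The first line of output is &lt;code&gt;foo bar baz&lt;/code&gt; and the second
is &lt;code&gt;foo,bar,baz&lt;/code&gt;.&lt;/p&gt;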
&lt;p&gt;To get individual entries, the syntax is
&lt;code&gt;${array[&lt;var&gt;n&lt;/var&gt;]}&lt;/code&gt;, where &lt;var&gt;n&lt;/var&gt; starts at 0.&lt;/p&gt;
&lt;table&gt;
  &lt;caption&gt;
    &lt;h1&gt;Getting a single entry from an array&lt;/h1&gt;
    &lt;p&gt;Also subject to word splitting if you don't wrap it in
       quotes.&lt;/p&gt;
  &lt;/caption&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${array[&lt;var&gt;n&lt;/var&gt;]}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Evaluates to the &lt;var&gt;n&lt;/var&gt;&lt;sup&gt;th&lt;/sup&gt; entry of the
          array, where the first entry is at &lt;var&gt;n&lt;/var&gt;=0.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;To get a subset of the array, there are a few options:&lt;/p&gt;
&lt;table&gt;
  &lt;caption&gt;
    &lt;h1&gt;Getting subsets of an array&lt;/h1&gt;
    &lt;p&gt;Substitute &lt;code&gt;*&lt;/code&gt; for &lt;code&gt;@&lt;/code&gt; to get the subset
       as a &lt;code&gt;$IFS&lt;/code&gt;-separated string instead of separate
       words, as described above.&lt;/p&gt;
    &lt;p&gt;Again, if you don't wrap these in double quotes, they are
       subject to word splitting, which defeats the purpose of
       arrays.&lt;/p&gt;
  &lt;/caption&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${array[@]:&lt;var&gt;start&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Evaluates to the entries from &lt;var&gt;n&lt;/var&gt;=&lt;var&gt;start&lt;/var&gt; to the end
          of the array.&lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${array[@]:&lt;var&gt;start&lt;/var&gt;:&lt;var&gt;count&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Evaluates to &lt;var&gt;count&lt;/var&gt; entries, starting at
          &lt;var&gt;n&lt;/var&gt;=&lt;var&gt;start&lt;/var&gt;.&lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${array[@]::&lt;var&gt;count&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Evaluates to &lt;var&gt;count&lt;/var&gt; entries from the beginning of
          the array.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Notice that &lt;code&gt;"${array[@]}"&lt;/code&gt; is equivalent to
&lt;code&gt;"${array[@]:0}"&lt;/code&gt;.&lt;/p&gt;
&lt;table&gt;
  &lt;caption&gt;
    &lt;h1&gt;Getting the length of an array&lt;/h1&gt;
    &lt;p&gt;This is the only situation with arrays where quoting doesn't
       make a difference.&lt;/p&gt;
    &lt;p&gt;True to my earlier statement, when unquoted, there is no
       difference between &lt;code&gt;@&lt;/code&gt; and &lt;code&gt;*&lt;/code&gt;.&lt;/p&gt;
  &lt;/caption&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;
        &lt;code&gt;${#array[@]}&lt;/code&gt;
        &lt;br&gt;or&lt;br&gt;
        &lt;code&gt;${#array[*]}&lt;/code&gt;
      &lt;/td&gt;
      &lt;td&gt;
        Evaluates to the length of the array
      &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
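&lt;p&gt;As a quick check of the table above, here is a small sketch with
made-up values; the first element contains a space, and the count is the
same with either form, quoted or not.&lt;/p&gt;

```shell
#!/bin/bash
array=("one two" three)

echo "${#array[@]}"    # prints: 2
echo ${#array[*]}      # prints: 2
```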
&lt;h2 id="argument-array-syntax"&gt;Argument array syntax&lt;/h2&gt;
&lt;p&gt;Accessing the arguments is mostly that simple, but that array doesn’t
actually have a variable name. It’s special. Instead, it is exposed
through a series of special variables (normal variables can only start
with letters and underscore), that &lt;em&gt;mostly&lt;/em&gt; match up with the
normal array syntax.&lt;/p&gt;
&lt;p&gt;Setting the arguments array, on the other hand, is pretty different.
That’s fine, because setting the arguments array is less useful
anyway.&lt;/p&gt;
&lt;table&gt;
  &lt;caption&gt;
    &lt;h1&gt;Accessing the arguments array&lt;/h1&gt;
    &lt;aside&gt;Note that for values of &lt;var&gt;n&lt;/var&gt; with more than 1
           digit, you need to wrap it in &lt;code&gt;{}&lt;/code&gt;.
           Otherwise, &lt;code&gt;"$10"&lt;/code&gt; would be parsed
           as &lt;code&gt;"${1}0"&lt;/code&gt;.&lt;/aside&gt;
  &lt;/caption&gt;
  &lt;tbody&gt;
    &lt;tr&gt;&lt;th colspan=2&gt;Individual entries&lt;/th&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;${array[0]}&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;$0&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;${array[1]}&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;$1&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td colspan=2 style="text-align:center"&gt;…&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;${array[9]}&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;$9&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;${array[10]}&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;${10}&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td colspan=2 style="text-align:center"&gt;…&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;${array[&lt;var&gt;n&lt;/var&gt;]}&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;${&lt;var&gt;n&lt;/var&gt;}&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th colspan=2&gt;Subset arrays (array)&lt;/th&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[@]}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"${@:0}"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[@]:1}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"$@"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[@]:&lt;var&gt;pos&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"${@:&lt;var&gt;pos&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[@]:&lt;var&gt;pos&lt;/var&gt;:&lt;var&gt;len&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"${@:&lt;var&gt;pos&lt;/var&gt;:&lt;var&gt;len&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[@]::&lt;var&gt;len&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"${@::&lt;var&gt;len&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th colspan=2&gt;Subset arrays (string)&lt;/th&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[*]}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"${*:0}"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[*]:1}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"$*"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[*]:&lt;var&gt;pos&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"${*:&lt;var&gt;pos&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[*]:&lt;var&gt;pos&lt;/var&gt;:&lt;var&gt;len&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"${*:&lt;var&gt;pos&lt;/var&gt;:&lt;var&gt;len&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;"${array[*]::&lt;var&gt;len&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;"${*::&lt;var&gt;len&lt;/var&gt;}"&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th colspan=2&gt;Array length&lt;/th&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;${#array[@]}&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;$#&lt;/code&gt; + 1&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;th colspan=2&gt;Setting the array&lt;/th&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;array=("${array[0]}" &lt;var&gt;words…&lt;/var&gt;)&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;set -- &lt;var&gt;words…&lt;/var&gt;&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;array=("${array[0]}" "${array[@]:2}")&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;shift&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
    &lt;tr&gt;&lt;td&gt;&lt;code&gt;array=("${array[0]}" "${array[@]:&lt;var&gt;n+1&lt;/var&gt;}")&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;shift &lt;var&gt;n&lt;/var&gt;&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Did you notice what was inconsistent? The variables &lt;code&gt;$*&lt;/code&gt;,
&lt;code&gt;$@&lt;/code&gt;, and &lt;code&gt;$#&lt;/code&gt; behave like the &lt;var&gt;n&lt;/var&gt;=0
entry doesn’t exist.&lt;/p&gt;
&lt;table&gt;
  &lt;caption&gt;
    &lt;h1&gt;Inconsistencies&lt;/h1&gt;
  &lt;/caption&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;th colspan=3&gt;&lt;code&gt;@&lt;/code&gt; or &lt;code&gt;*&lt;/code&gt;&lt;/th&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${array[@]}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;→&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;"${array[@]:0}"&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${@}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;→&lt;/td&gt;
      &lt;td&gt;&lt;code&gt;"${@:1}"&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;th colspan=3&gt;&lt;code&gt;#&lt;/code&gt;&lt;/th&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${#array[@]}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;→&lt;/td&gt;
      &lt;td&gt;length&lt;/td&gt;
    &lt;/tr&gt;&lt;tr&gt;
      &lt;td&gt;&lt;code&gt;"${#}"&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;→&lt;/td&gt;
      &lt;td&gt;length-1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;These make sense because argument 0 is the name of the script, which we
almost never want when parsing arguments. If they included argument 0,
you’d need extra code to get the values they currently give you.&lt;/p&gt;
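&lt;p&gt;The following sketch shows the effect. The function name
&lt;code&gt;describe_args&lt;/code&gt; is hypothetical; a function is used only because
it gets its own arguments array, which makes the example self-contained.&lt;/p&gt;

```shell
#!/bin/bash
# describe_args is a hypothetical function; inside a function, "$@" and $#
# refer to the function's own arguments.
describe_args() {
    echo "count: $#"
    printf '[%s]' "$@"; echo
    printf '[%s]' "${@:1}"; echo     # same words as "$@"
}

describe_args "a b" c
# prints:
#   count: 2
#   [a b][c]
#   [a b][c]
```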
&lt;p&gt;Now, for an explanation of setting the arguments array. You cannot
set argument &lt;var&gt;n&lt;/var&gt;=0. The &lt;code&gt;set&lt;/code&gt; command is used to
manipulate the arguments passed to Bash after the fact; similarly, you
could use &lt;code&gt;set -x&lt;/code&gt; to make Bash behave like you ran it as
&lt;code&gt;bash -x&lt;/code&gt;. As with most GNU programs, the &lt;code&gt;--&lt;/code&gt; tells
it not to parse any of the words that follow as flags. The &lt;code&gt;shift&lt;/code&gt;
command shifts each entry &lt;var&gt;n&lt;/var&gt; spots to the left, using
&lt;var&gt;n&lt;/var&gt;=1 if no value is specified, and leaving argument 0
alone.&lt;/p&gt;
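&lt;p&gt;Here is a minimal sketch of &lt;code&gt;set --&lt;/code&gt; and &lt;code&gt;shift&lt;/code&gt;
working together. The function name &lt;code&gt;demo&lt;/code&gt; and the words are made
up; the function exists only to give the example a private arguments
array.&lt;/p&gt;

```shell
#!/bin/bash
# demo is a hypothetical function used only to get a private arguments array.
demo() {
    set -- one "two three" four    # arguments 1..3 are replaced
    shift                          # drops "one"; argument 0 is untouched
    echo "$#: $1 | $2"
}

demo ignored    # prints: 2: two three | four
```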
&lt;h2 id="but-you-mentioned-gotchas-about-quoting"&gt;But you mentioned
“gotchas” about quoting!&lt;/h2&gt;
&lt;p&gt;But I explained that quoting simply inhibits word splitting, which
you pretty much never want when working with arrays. If, for some odd
reason, you do want word splitting, then that’s when you don’t quote.
Simple, easy to understand.&lt;/p&gt;
&lt;p&gt;I think possibly the only case where you do want word splitting with
an array is when you didn’t want an array, but it’s what you get
(arguments are, by necessity, an array). For example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Usage: path_ls PATH1 PATH2…
# Description:
#   Takes any number of PATH-style values; that is,
#   colon-separated lists of directories, and prints a
#   newline-separated list of executables found in them.
# Bugs:
#   Does not correctly handle programs with a newline in the name,
#   as the output is newline-separated.
path_ls() {
    local IFS dirs
    IFS=:
    dirs=($@) # The odd-ball time that it needs to be unquoted
    find -L &amp;quot;${dirs[@]}&amp;quot; -maxdepth 1 -type f -executable \
        -printf &amp;#39;%f\n&amp;#39; 2&amp;gt;/dev/null | sort -u
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Logically, there shouldn’t be multiple arguments, just a single
&lt;code&gt;$PATH&lt;/code&gt; value; but, we can’t enforce that, as the array can
have any size. So, we do the robust thing, and just act on the entire
array, not really caring about the fact that it is an array. Alas, there
is still a field-separation bug in the program, with the output.&lt;/p&gt;
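&lt;p&gt;To look at the splitting step on its own, here is a cut-down sketch.
The function name &lt;code&gt;split_paths&lt;/code&gt; and the directory names are
hypothetical, and nothing is searched; the words are only printed.&lt;/p&gt;

```shell
#!/bin/bash
# split_paths is a hypothetical cut-down of path_ls that only shows the
# splitting step; the directory names below are made up.
split_paths() {
    local IFS dirs
    IFS=:
    dirs=($@)                       # unquoted on purpose: split on ":"
    printf '[%s]' "${dirs[@]}"; echo
}

split_paths /usr/bin:/bin /opt/bin    # prints: [/usr/bin][/bin][/opt/bin]
```

&lt;p&gt;Keep in mind that an unquoted expansion is also subject to pathname
expansion, so a directory name containing &lt;code&gt;*&lt;/code&gt; could be expanded
as a glob.&lt;/p&gt;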
&lt;h2 id="i-still-dont-think-i-need-arrays-in-my-scripts"&gt;I still don’t
think I need arrays in my scripts&lt;/h2&gt;
&lt;p&gt;Consider the common code:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ARGS=&amp;#39; -f -q&amp;#39;
…
command $ARGS  # unquoted variables are a bad code-smell anyway&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here, &lt;code&gt;$ARGS&lt;/code&gt; is field-separated by &lt;code&gt;$IFS&lt;/code&gt;,
which we are assuming has the default value. This is fine, as long as
&lt;code&gt;$ARGS&lt;/code&gt; is known to never need an embedded space, which you
can be sure of as long as it isn’t based on anything outside of the
program. But wait until you want to do this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ARGS=&amp;#39; -f -q&amp;#39;
…
if [[ -f &amp;quot;$filename&amp;quot; ]]; then
    ARGS+=&amp;quot; -F $filename&amp;quot;
fi
…
command $ARGS&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now you’re hosed if &lt;code&gt;$filename&lt;/code&gt; contains a space! More
than just breaking, it could have unwanted side effects, such as when
someone figures out how to make
&lt;code&gt;filename='foo --dangerous-flag'&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Compare that with the array version:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ARGS=(-f -q)
…
if [[ -f &amp;quot;$filename&amp;quot; ]]; then
    ARGS+=(-F &amp;quot;$filename&amp;quot;)
fi
…
command &amp;quot;${ARGS[@]}&amp;quot;&lt;/code&gt;&lt;/pre&gt;
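&lt;p&gt;To make the difference visible, the sketch below runs both versions
side by side. &lt;code&gt;show_args&lt;/code&gt; is a hypothetical stand-in for
&lt;code&gt;command&lt;/code&gt;, and the two variable names are made up to keep the
versions apart.&lt;/p&gt;

```shell
#!/bin/bash
# show_args is a hypothetical stand-in for "command"; it brackets each
# argument it receives.
show_args() { printf '[%s]' "$@"; echo; }

filename='foo --dangerous-flag'

ARGS_STRING=" -f -q -F $filename"
ARGS_ARRAY=(-f -q -F "$filename")

show_args $ARGS_STRING          # prints: [-f][-q][-F][foo][--dangerous-flag]
show_args "${ARGS_ARRAY[@]}"    # prints: [-f][-q][-F][foo --dangerous-flag]
```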
&lt;h2 id="what-about-portability"&gt;What about portability?&lt;/h2&gt;
&lt;p&gt;Except for the little stubs that call another program with
&lt;code&gt;"$@"&lt;/code&gt; at the end, trying to write for multiple shells
(including the ambiguous &lt;code&gt;/bin/sh&lt;/code&gt;) is not a task for mere
mortals. If you do try that, your best bet is probably sticking to
POSIX. Arrays are not POSIX; except for the arguments array, which is;
though getting subset arrays from &lt;code&gt;$@&lt;/code&gt; and &lt;code&gt;$*&lt;/code&gt; is
not (tip: use &lt;code&gt;set --&lt;/code&gt; to re-purpose the arguments
array).&lt;/p&gt;
&lt;p&gt;Writing for various versions of Bash, though, is pretty do-able.
Everything here works all the way back in bash-2.0 (December 1996), with
the following exceptions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;+=&lt;/code&gt; operator wasn’t added until Bash 3.1.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;As a work-around, use
&lt;code&gt;array[${#array[*]}]=&lt;var&gt;word&lt;/var&gt;&lt;/code&gt; to append a single
element.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Accessing subset arrays of the arguments array is inconsistent if
&lt;var&gt;pos&lt;/var&gt;=0 in &lt;code&gt;${@:&lt;var&gt;pos&lt;/var&gt;:&lt;var&gt;len&lt;/var&gt;}&lt;/code&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In Bash 2.x and 3.x, it works as expected, except that argument 0 is
silently missing. For example, &lt;code&gt;${@:0:3}&lt;/code&gt; gives arguments 1
and 2, whereas &lt;code&gt;${@:1:3}&lt;/code&gt; gives arguments 1, 2, and 3. This
means that if &lt;var&gt;pos&lt;/var&gt;=0, then only &lt;var&gt;len&lt;/var&gt;-1 arguments are
given back.&lt;/li&gt;
&lt;li&gt;In Bash 4.0, argument 0 can be accessed, but if &lt;var&gt;pos&lt;/var&gt;=0,
then it only gives back &lt;var&gt;len&lt;/var&gt;-1 arguments. So,
&lt;code&gt;${@:0:3}&lt;/code&gt; gives arguments 0 and 1.&lt;/li&gt;
&lt;li&gt;In Bash 4.1 and higher, it works in the way described in the main
part of this document.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
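&lt;p&gt;The append work-around from the list above can be sketched as follows,
with made-up values.&lt;/p&gt;

```shell
#!/bin/bash
array=(a b)
array[${#array[*]}]="c d"    # index 2 is the current length, so this appends

echo "${#array[@]}"    # prints: 3
echo "${array[2]}"     # prints: c d
```

&lt;p&gt;This assumes the array’s indexes run contiguously from 0; with a sparse
array, the length is not necessarily the next free index.&lt;/p&gt;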
&lt;p&gt;Now, Bash 1.x doesn’t have arrays at all. &lt;code&gt;$@&lt;/code&gt; and
&lt;code&gt;$*&lt;/code&gt; work, but using &lt;code&gt;:&lt;/code&gt; to select a range of
elements from them doesn’t. Good thing most boxes have been updated
since 1996!&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2013 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./git-go-pre-commit.html"/>
		<link rel="alternate" type="text/markdown" href="./git-go-pre-commit.md"/>
		<id>https://lukeshu.com/blog/git-go-pre-commit.html</id>
		<updated>2013-10-12T00:00:00+00:00</updated>
		<published>2013-10-12T00:00:00+00:00</published>
		<title>A git pre-commit hook for automatically formatting Go code</title>
		<content type="html">&lt;h1 id="a-git-pre-commit-hook-for-automatically-formatting-go-code"&gt;A
git pre-commit hook for automatically formatting Go code&lt;/h1&gt;
&lt;p&gt;One of the (many) wonderful things about the Go programming language
is the &lt;code&gt;gofmt&lt;/code&gt; tool, which formats your source in a canonical
way. I thought it would be nice to integrate this in my &lt;code&gt;git&lt;/code&gt;
workflow by adding it in a pre-commit hook to automatically format my
source code when I committed it.&lt;/p&gt;
&lt;p&gt;The Go distribution contains a git pre-commit hook that checks
whether the source code is formatted, and aborts the commit if it isn’t.
I don’t remember if I was aware of this at the time (or if it even
existed at the time, or if it is new), but I wanted it to go ahead and
format the code for me.&lt;/p&gt;
&lt;p&gt;I found a few solutions online, but they were all missing
something: support for partial commits. I frequently use
&lt;code&gt;git add -p&lt;/code&gt;/&lt;code&gt;git gui&lt;/code&gt; to commit a subset of the
changes I’ve made to a file, and the existing solutions would end up adding
the entire set of changes to my commit.&lt;/p&gt;
&lt;p&gt;I ended up writing a solution that only formats the version of the
file that is staged for commit; here’s my
&lt;code&gt;.git/hooks/pre-commit&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#!/bin/bash

# This would only loop over files that are already staged for commit.
#     git diff --cached --numstat |
#     while read add del file; do
#         …
#     done

shopt -s globstar
for file in **/*.go; do
    tmp=&amp;quot;$(mktemp &amp;quot;$file.bak.XXXXXXXXXX&amp;quot;)&amp;quot;
    mv &amp;quot;$file&amp;quot; &amp;quot;$tmp&amp;quot;
    git checkout &amp;quot;$file&amp;quot;
    gofmt -w &amp;quot;$file&amp;quot;
    git add &amp;quot;$file&amp;quot;
    mv &amp;quot;$tmp&amp;quot; &amp;quot;$file&amp;quot;
done&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It’s still not perfect. It will try to operate on every
&lt;code&gt;*.go&lt;/code&gt; file—which might do weird things if you have a file
that hasn’t been checked in at all. This also has the effect of
formatting files that were checked in without being formatted, but
weren’t modified in this commit.&lt;/p&gt;
&lt;p&gt;I don’t remember why I did that—as you can see from the comment, I
knew how to only select files that were staged for commit. I haven’t
worked on any projects in Go in a while—if I return to one of them, and
remember why I did that, I will update this page.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2013 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="http://www.wtfpl.net/txt/copying/"&gt;WTFPL-2&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./fd_printf.html"/>
		<link rel="alternate" type="text/markdown" href="./fd_printf.md"/>
		<id>https://lukeshu.com/blog/fd_printf.html</id>
		<updated>2013-10-12T00:00:00+00:00</updated>
		<published>2013-10-12T00:00:00+00:00</published>
		<title>`dprintf`: print formatted text directly to a file descriptor</title>
		<content type="html">&lt;h1
id="dprintf-print-formatted-text-directly-to-a-file-descriptor"&gt;&lt;code&gt;dprintf&lt;/code&gt;:
print formatted text directly to a file descriptor&lt;/h1&gt;
&lt;p&gt;This already existed as &lt;code&gt;dprintf(3)&lt;/code&gt;. I now feel stupid
for having implemented &lt;code&gt;fd_printf&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The original post is as follows:&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;I wrote this while debugging some code, and thought it might be
useful to others:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#define _GNU_SOURCE     /* vasprintf() */
#include &amp;lt;stdarg.h&amp;gt;     /* va_start()/va_end() */
#include &amp;lt;stdio.h&amp;gt;      /* vasprintf() */
#include &amp;lt;stdlib.h&amp;gt;     /* free() */
#include &amp;lt;unistd.h&amp;gt;     /* write() */

int
fd_printf(int fd, const char *format, ...)
{
    va_list arg;
    int len;
    char *str;

    va_start(arg, format);
    len = vasprintf(&amp;amp;str, format, arg);
    va_end(arg);
    if (len &amp;lt; 0)
        return -1;      /* vasprintf() failed; str is undefined */

    write(fd, str, len);

    free(str);
    return len;
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It is a version of &lt;code&gt;printf&lt;/code&gt; that prints to a file
descriptor—where &lt;code&gt;fprintf&lt;/code&gt; prints to a &lt;code&gt;FILE*&lt;/code&gt;
data structure.&lt;/p&gt;
&lt;p&gt;The appeal of this is that &lt;code&gt;FILE*&lt;/code&gt; I/O is buffered—which
means mixing it with raw file descriptor I/O is going to produce weird
results.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2013 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="http://www.wtfpl.net/txt/copying/"&gt;WTFPL-2&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./emacs-as-an-os.html"/>
		<link rel="alternate" type="text/markdown" href="./emacs-as-an-os.md"/>
		<id>https://lukeshu.com/blog/emacs-as-an-os.html</id>
		<updated>2013-08-29T00:00:00+00:00</updated>
		<published>2013-08-29T00:00:00+00:00</published>
		<title>Emacs as an operating system</title>
		<content type="html">&lt;h1 id="emacs-as-an-operating-system"&gt;Emacs as an operating system&lt;/h1&gt;
&lt;p&gt;This was originally published on &lt;a
href="https://news.ycombinator.com/item?id=6292742"&gt;Hacker News&lt;/a&gt; on
2013-08-29.&lt;/p&gt;
&lt;p&gt;Calling Emacs an OS is dubious; it certainly isn’t a general purpose
OS, and won’t run on real hardware. But, let me make the case that Emacs
is an OS.&lt;/p&gt;
&lt;p&gt;Emacs has two parts, the C part, and the Emacs Lisp part.&lt;/p&gt;
&lt;p&gt;The C part isn’t just a Lisp interpreter, it is a Lisp Machine
emulator. It doesn’t particularly resemble any of the real Lisp
machines. The TCP, Keyboard/Mouse, display support, and filesystem are
done at the hardware level (the operations to work with these things are
among the primitive operations provided by the hardware). Of these, the
display being handled by the hardware isn’t particularly uncommon,
historically; the filesystem is a little stranger.&lt;/p&gt;
&lt;p&gt;The Lisp part of Emacs is the operating system that runs on that
emulated hardware. It’s not a particularly powerful OS; it’s not a
multitasking system. It has many packages available for it (though not
until recently was there an official package manager). It has reasonably
powerful IPC mechanisms. It has shells, mail clients (MUAs and MSAs),
web browsers, web servers and more, all written entirely in Emacs
Lisp.&lt;/p&gt;
&lt;p&gt;You might say, “but a lot of that is being done by the host operating
system!” Sure, some of it is, but all of it is sufficiently low level.
If you wanted to share the filesystem with another OS running in a VM,
you might do it by sharing it as a network filesystem; this is necessary
when the VM OS is not designed around running in a VM. However, because
Emacs OS will always be running in the Emacs VM, we can optimize it by
having the Emacs VM include processor features mapping the native OS,
and have the Emacs OS be aware of them. It would be slower and more code
to do that all over the network.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2013 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./emacs-shells.html"/>
		<link rel="alternate" type="text/markdown" href="./emacs-shells.md"/>
		<id>https://lukeshu.com/blog/emacs-shells.html</id>
		<updated>2013-04-09T00:00:00+00:00</updated>
		<published>2013-04-09T00:00:00+00:00</published>
		<title>A summary of Emacs' bundled shell and terminal modes</title>
		<content type="html">&lt;h1 id="a-summary-of-emacs-bundled-shell-and-terminal-modes"&gt;A summary
of Emacs’ bundled shell and terminal modes&lt;/h1&gt;
&lt;p&gt;This is based on a post on &lt;a
href="http://www.reddit.com/r/emacs/comments/1bzl8b/how_can_i_get_a_dumbersimpler_shell_in_emacs/c9blzyb"&gt;reddit&lt;/a&gt;,
published on 2013-04-09.&lt;/p&gt;
&lt;p&gt;Emacs comes bundled with a few different shell and terminal modes. It
can be hard to keep them straight. What’s the difference between
&lt;code&gt;M-x term&lt;/code&gt; and &lt;code&gt;M-x ansi-term&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;Here’s a good breakdown of the different bundled shells and terminals
for Emacs, from dumbest to most Emacs-y.&lt;/p&gt;
&lt;h2 id="term-mode"&gt;term-mode&lt;/h2&gt;
&lt;p&gt;Your VT100-esque terminal emulator; it does what most terminal
programs do. Ncurses-things work OK, but dumping large amounts of text
can be slow. By default it asks you which shell to run, defaulting to
the environment variable &lt;code&gt;$SHELL&lt;/code&gt; (&lt;code&gt;/bin/bash&lt;/code&gt;
for me). There are two modes of operation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;char mode: Keys are sent immediately to the shell (including keys
that are normally Emacs keystrokes), with the following exceptions:
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;(term-escape-char) (term-escape-char)&lt;/code&gt; sends
&lt;code&gt;(term-escape-char)&lt;/code&gt; to the shell (see below for what the
default value is).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;(term-escape-char) &amp;lt;anything-else&amp;gt;&lt;/code&gt; equates to
&lt;code&gt;C-x &amp;lt;anything-else&amp;gt;&lt;/code&gt; in normal
Emacs.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;(term-escape-char) C-j&lt;/code&gt; switches to line mode.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;line mode: Editing is done like in a normal Emacs buffer,
&lt;code&gt;&amp;lt;enter&amp;gt;&lt;/code&gt; sends the current line to the shell. This is
useful for working with a program’s output.
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;C-c C-k&lt;/code&gt; switches to char mode.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This mode is activated with&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;; Creates or switches to an existing &amp;quot;*terminal*&amp;quot; buffer.
; The default &amp;#39;term-escape-char&amp;#39; is &amp;quot;C-c&amp;quot;
M-x term&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;; Creates a new &amp;quot;*ansi-term*&amp;quot; or &amp;quot;*ansi-term*&amp;lt;n&amp;gt;&amp;quot; buffer.
; The default &amp;#39;term-escape-char&amp;#39; is &amp;quot;C-c&amp;quot; and &amp;quot;C-x&amp;quot;
M-x ansi-term&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id="shell-mode"&gt;shell-mode&lt;/h2&gt;
&lt;p&gt;The name is a misnomer; shell-mode is a terminal emulator, not a
shell; it’s called that because it is used for running a shell (bash,
zsh, …). The idea of this mode is to use an external shell, but make it
Emacs-y. History is not handled by the shell, but by Emacs;
&lt;code&gt;M-p&lt;/code&gt; and &lt;code&gt;M-n&lt;/code&gt; access the history, while
arrows/&lt;code&gt;C-p&lt;/code&gt;/&lt;code&gt;C-n&lt;/code&gt; move the point (which is
consistent with other Emacs REPL-type interfaces). It ignores VT100-type
terminal colors, and colorizes things itself (it inspects words to see
if they are directories, in the case of &lt;code&gt;ls&lt;/code&gt;). This has the
benefit that it does syntax highlighting on the command currently being
typed. Ncurses programs will of course not work. This mode is
activated with:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;M-x shell&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id="eshell-mode"&gt;eshell-mode&lt;/h2&gt;
&lt;p&gt;This is a shell+terminal, entirely written in Emacs Lisp.
(Interestingly, it doesn’t set &lt;code&gt;$SHELL&lt;/code&gt;, so that will be
whatever it was when you launched Emacs). This won’t even be running zsh
or bash, it will be running “esh”, part of Emacs.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2013 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./term-colors.html"/>
		<link rel="alternate" type="text/markdown" href="./term-colors.md"/>
		<id>https://lukeshu.com/blog/term-colors.html</id>
		<updated>2013-03-21T00:00:00+00:00</updated>
		<published>2013-03-21T00:00:00+00:00</published>
		<title>An explanation of common terminal emulator color codes</title>
		<content type="html">&lt;h1 id="an-explanation-of-common-terminal-emulator-color-codes"&gt;An
explanation of common terminal emulator color codes&lt;/h1&gt;
&lt;p&gt;This is based on a post on &lt;a
href="http://www.reddit.com/r/commandline/comments/1aotaj/solarized_is_a_sixteen_color_palette_designed_for/c8ztxpt?context=1"&gt;reddit&lt;/a&gt;,
published on 2013-03-21.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;So all terminals support the same 256 colors? What about 88 color
mode: is that a subset?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;TL;DR: yes&lt;/p&gt;
&lt;p&gt;Terminal compatibility is crazy complex, because nobody actually
reads the spec, they just write something that is compatible for their
tests. Then things have to be compatible with that terminal’s
quirks.&lt;/p&gt;
&lt;p&gt;But, here’s how 8-color, 16-color, and 256 color work. IIRC, 88 color
is a subset of the 256 color scheme, but I’m not sure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;8 colors: (actually 9)&lt;/strong&gt; First we had 8 colors (9 with
“default”, which doesn’t have to be one of the 8). These are always
roughly the same color: black, red, green, yellow/orange, blue, purple,
cyan, and white, which are colors 0–7 respectively. Color 9 is
default.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;16 colors: (actually 18)&lt;/strong&gt; Later, someone wanted to
add more colors, so they added a “bright” attribute. So when bright is
on, you get “bright red” instead of “red”. Hence 8*2=16 (plus two more
for “default” and “bright default”).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;256 colors: (actually 274)&lt;/strong&gt; You may have noticed,
colors 0–7 and 9 are used, but 8 isn’t. So, someone decided that color 8
should put the terminal into 256 color mode. In this mode, it reads
another value, which is an 8-bit color index (in xterm’s scheme, 0–15
repeat the 16 basic colors, 16–231 form a 6×6×6 RGB cube, and 232–255 are
a grayscale ramp). The bright property has no effect on these colors.
However, a terminal can display 256-color-mode colors and 16-color-mode
colors at the same time, so you actually get 256+18 colors.&lt;/p&gt;
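&lt;p&gt;As a rough sketch of what this looks like on the wire, assuming an
xterm-compatible terminal: foreground colors are selected with the escape
sequence &lt;code&gt;ESC [ 3&lt;var&gt;n&lt;/var&gt; m&lt;/code&gt;, and “color 8” takes the form
&lt;code&gt;ESC [ 38;5;&lt;var&gt;index&lt;/var&gt; m&lt;/code&gt;. The helper names below are
hypothetical.&lt;/p&gt;

```shell
#!/bin/bash
# fg8, fg256 and reset_color are hypothetical helper names.
fg8()         { printf '\033[3%dm' "$1"; }        # colors 0-7, or 9 for default
fg256()       { printf '\033[38;5;%dm' "$1"; }    # "color 8", then a color index
reset_color() { printf '\033[0m'; }

fg8 1;     printf 'basic red\n'
fg256 208; printf 'index 208\n'
reset_color
```

&lt;p&gt;How each index is actually rendered depends on the terminal and its
palette.&lt;/p&gt;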
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2013 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./fs-licensing-explanation.html"/>
		<link rel="alternate" type="text/markdown" href="./fs-licensing-explanation.md"/>
		<id>https://lukeshu.com/blog/fs-licensing-explanation.html</id>
		<updated>2013-02-21T00:00:00+00:00</updated>
		<published>2013-02-21T00:00:00+00:00</published>
		<title>An explanation of how "copyleft" licensing works</title>
		<content type="html">&lt;h1 id="an-explanation-of-how-copyleft-licensing-works"&gt;An explanation
of how “copyleft” licensing works&lt;/h1&gt;
&lt;p&gt;This is based on a post on &lt;a
href="http://www.reddit.com/r/freesoftware/comments/18xplw/can_software_be_free_gnu_and_still_be_owned_by_an/c8ixwq2"&gt;reddit&lt;/a&gt;,
published on 2013-02-21.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;While reading the man page for readline I noticed the copyright
section said “Readline is Copyright (C) 1989-2011 Free Software
Foundation Inc”. How can software be both licensed under GNU and
copyrighted to a single group? It was my understanding that once code
became free it didn’t belong to any particular group or individual.&lt;/p&gt;
&lt;p&gt;[LiveCode is GPLv3, but also sells non-free licenses] Can you really
have the same code under two conflicting licenses? Once licensed under
GPL3 wouldn’t they too be required to adhere to its rules?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I believe that GNU/the FSF has an FAQ that addresses this, but I
can’t find it, so here we go.&lt;/p&gt;
&lt;h3 id="glossary"&gt;Glossary:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;“&lt;em&gt;Copyright&lt;/em&gt;” is the right to control how copies are made of
something.&lt;/li&gt;
&lt;li&gt;Something for which no one holds the copyright is in the “&lt;em&gt;public
domain&lt;/em&gt;”, because anyone (“the public”) is allowed to do
&lt;em&gt;anything&lt;/em&gt; with it.&lt;/li&gt;
&lt;li&gt;A “&lt;em&gt;license&lt;/em&gt;” is basically a legal document that says “I
promise not to sue you if you make copies in these specific ways.”&lt;/li&gt;
&lt;li&gt;A “&lt;em&gt;non-free&lt;/em&gt;” license basically says “There are no
conditions under which you can make copies without me suing you.”&lt;/li&gt;
&lt;li&gt;A “&lt;em&gt;permissive&lt;/em&gt;” (type of free) license basically says “You
can do whatever you want, BUT you have to give me credit”, and is very
similar to the public domain. If the copyright holder didn’t have the
copyright, they couldn’t sue you to make sure that you gave them credit,
and nobody would have to give them credit.&lt;/li&gt;
&lt;li&gt;A “&lt;em&gt;copyleft&lt;/em&gt;” (type of free) license basically says, “You
can do whatever you want, BUT anyone who gets a copy from you has to be
able to do whatever they want too.” If the copyright holder didn’t have
the copyright, they couldn’t sue you to make sure that you gave the
source to people who got it from you, and non-free versions of these
programs would start to exist.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="specific-questions"&gt;Specific questions:&lt;/h3&gt;
&lt;p&gt;Readline: The GNU GPL is a copyleft license. If you make a modified
version of Readline, and give it to others without letting them have the
source code, the FSF will sue you. They can do this because they have
the copyright on Readline, and in the GNU GPL (the license they used) it
only says that they won’t sue you if you distribute the source with the
modified version. If they didn’t have the copyright, they couldn’t sue
you, and the GNU GPL would be worthless.&lt;/p&gt;
&lt;p&gt;LiveCode: The copyright holder for something is not required to obey
the license—the license is only a promise not to sue you; of course they
won’t sue themselves. They can also offer different terms to different
people. They can tell most people “I won’t sue you as long as you share
the source,” but if someone gave them a little money, they might say, “I
also promise not to sue this guy, even if he doesn’t give out the
source.”&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2013 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./pacman-overview.html"/>
		<link rel="alternate" type="text/markdown" href="./pacman-overview.md"/>
		<id>https://lukeshu.com/blog/pacman-overview.html</id>
		<updated>2013-01-23T00:00:00+00:00</updated>
		<published>2013-01-23T00:00:00+00:00</published>
		<title>A quick overview of usage of the Pacman package manager</title>
		<content type="html">&lt;h1 id="a-quick-overview-of-usage-of-the-pacman-package-manager"&gt;A quick
overview of usage of the Pacman package manager&lt;/h1&gt;
&lt;p&gt;This was originally published on &lt;a
href="https://news.ycombinator.com/item?id=5101416"&gt;Hacker News&lt;/a&gt; on
2013-01-23.&lt;/p&gt;
&lt;p&gt;Note: I’ve over-done quotation marks to make it clear when precise
wording matters.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;pacman&lt;/code&gt; is a little awkward, but I prefer it to apt/dpkg,
which have sub-commands, each with their own flags, some of which are
undocumented. pacman, on the other hand, has ALL options documented in
one fairly short man page.&lt;/p&gt;
&lt;p&gt;The trick to understanding pacman is to understand how it maintains
databases of packages, and what it means to “sync”.&lt;/p&gt;
&lt;p&gt;There are several “databases” that pacman deals with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;“the database”, (&lt;code&gt;/var/lib/pacman/local/&lt;/code&gt;)&lt;br&gt; The
database of currently installed packages&lt;/li&gt;
&lt;li&gt;“package databases”,
(&lt;code&gt;/var/lib/pacman/sync/${repo}.db&lt;/code&gt;)&lt;br&gt; There is one of these
for each repository. It is a file that is fetched over plain http(s)
from the server; it is not modified locally, only updated.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The “operation” of pacman is set with a capital flag, one of “DQRSTU”
(plus &lt;code&gt;-V&lt;/code&gt; and &lt;code&gt;-h&lt;/code&gt; for version and help). Of
these, “DTU” are “low-level” (analogous to dpkg) and “QRS” are
“high-level” (analogous to apt).&lt;/p&gt;
&lt;p&gt;To give a brief explanation of the “high-level” operations, and
which databases they deal with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;“Q” Queries “the database” of locally installed packages.&lt;/li&gt;
&lt;li&gt;“S” deals with “package databases”, and Syncing “the database” with
them; meaning it installs/updates packages that are in package
databases, but not installed on the local system.&lt;/li&gt;
&lt;li&gt;“R” Removes packages from “the database”; removing them from the local
system.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The biggest “gotcha” is that “S” deals with all operations with
“package databases”, not just syncing “the database” with them.&lt;/p&gt;
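&lt;p&gt;As a rough illustration of how the “high-level” operations map onto
the two kinds of databases (a sketch, not an exhaustive list;
&lt;code&gt;foo&lt;/code&gt; is a placeholder package name):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pacman -Q        # list installed packages, read from "the database"
pacman -Qs foo   # search installed packages for "foo"

pacman -Sy       # refresh the "package databases" from the servers
pacman -Ss foo   # search the "package databases" for "foo"
pacman -Si foo   # show what the "package databases" say about foo
pacman -S foo    # install foo, syncing "the database" with them
pacman -Syu      # refresh, then upgrade everything that is out of date

pacman -R foo    # remove foo from "the database" and the local system&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that &lt;code&gt;-Ss&lt;/code&gt; and &lt;code&gt;-Si&lt;/code&gt; only read the
package databases and never touch the installed system, even though they
live under “S”.&lt;/p&gt;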
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2013 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./poor-system-documentation.html"/>
		<link rel="alternate" type="text/markdown" href="./poor-system-documentation.md"/>
		<id>https://lukeshu.com/blog/poor-system-documentation.html</id>
		<updated>2012-09-12T00:00:00+00:00</updated>
		<published>2012-09-12T00:00:00+00:00</published>
		<title>Why documentation on GNU/Linux sucks</title>
		<content type="html">&lt;h1 id="why-documentation-on-gnulinux-sucks"&gt;Why documentation on
GNU/Linux sucks&lt;/h1&gt;
&lt;p&gt;This is based on a post on &lt;a
href="http://www.reddit.com/r/archlinux/comments/zoffo/systemd_we_will_keep_making_it_the_distro_we_like/c66uu57"&gt;reddit&lt;/a&gt;,
published on 2012-09-12.&lt;/p&gt;
&lt;p&gt;The documentation situation on GNU/Linux based operating systems is
right now a mess. In the world of documentation, there are basically 3
camps, the “UNIX” camp, the “GNU” camp, and the “GNU/Linux” camp.&lt;/p&gt;
&lt;p&gt;The UNIX camp is the &lt;code&gt;man&lt;/code&gt; page camp; they have quality,
terse but informative man pages, on &lt;em&gt;everything&lt;/em&gt;, including the
system’s design and all system files. If it was up to the UNIX camp,
&lt;code&gt;man grub.cfg&lt;/code&gt;, &lt;code&gt;man grub.d&lt;/code&gt;, and
&lt;code&gt;man grub-mkconfig_lib&lt;/code&gt; would exist and actually be helpful.
The man page would either include inline examples, or point you to a
directory. If I were to print off all of the man pages, it would
actually be a useful manual for the system.&lt;/p&gt;
&lt;p&gt;The GNU camp is the &lt;code&gt;info&lt;/code&gt; camp. They basically thought
that each piece of software was more complex than a man page could
handle. They essentially think that some individual pieces of software
warrant a book. So, they developed the &lt;code&gt;info&lt;/code&gt; system. The
info pages are usually quite high quality, but are very long, and a pain
if you just want a quick look. The &lt;code&gt;info&lt;/code&gt; system can generate
good HTML (and PDF, etc.) documentation. But the standard
&lt;code&gt;info&lt;/code&gt; is awkward as hell to use for non-Emacs users.&lt;/p&gt;
&lt;p&gt;Then we have the “GNU/Linux” camp; they use GNU software, but want to
use &lt;code&gt;man&lt;/code&gt; pages. This means that we get low-quality man pages
for GNU software, and because we then don’t have a good baseline for
documentation, developers each try to create their own. The
documentation that gets written is frequently either low-quality, or
non-standard. A lot of man pages are auto-generated from
&lt;code&gt;--help&lt;/code&gt; output or info pages, meaning they are either not
helpful, or overly verbose with low information density. This camp gets
the worst of both worlds, and a few problems of its own.&lt;/p&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2012 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
	<entry xmlns="http://www.w3.org/2005/Atom">
		<link rel="alternate" type="text/html"     href="./arch-systemd.html"/>
		<link rel="alternate" type="text/markdown" href="./arch-systemd.md"/>
		<id>https://lukeshu.com/blog/arch-systemd.html</id>
		<updated>2012-09-11T00:00:00+00:00</updated>
		<published>2012-09-11T00:00:00+00:00</published>
		<title>What Arch Linux's switch to systemd means for users</title>
		<content type="html">&lt;h1 id="what-arch-linuxs-switch-to-systemd-means-for-users"&gt;What Arch
Linux’s switch to systemd means for users&lt;/h1&gt;
&lt;p&gt;This is based on a post on &lt;a
href="http://www.reddit.com/r/archlinux/comments/zoffo/systemd_we_will_keep_making_it_the_distro_we_like/c66nrcb"&gt;reddit&lt;/a&gt;,
published on 2012-09-11.&lt;/p&gt;
&lt;p&gt;systemd is a replacement for UNIX System V-style init; instead of
having &lt;code&gt;/etc/init.d/*&lt;/code&gt; or &lt;code&gt;/etc/rc.d/*&lt;/code&gt; scripts,
systemd runs in the background to manage them.&lt;/p&gt;
&lt;p&gt;This has the &lt;strong&gt;advantages&lt;/strong&gt; that there is proper
dependency tracking, easing the life of the administrator and allowing
for things to be run in parallel safely. It also uses “targets” instead
of “init levels”, which just makes more sense. It also means that a
target can be started or stopped on the fly, such as mounting or
unmounting a drive, which has in the past only been done at boot up and
shut down.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;downside&lt;/strong&gt; is that it is (allegedly) big,
bloated&lt;a href="#fn1" class="footnote-ref" id="fnref1"
role="doc-noteref"&gt;&lt;sup&gt;1&lt;/sup&gt;&lt;/a&gt;, and does (arguably) more than it
should. Why is there a dedicated systemd-fsck? Why does systemd
encapsulate the functionality of syslog? That, and it means somebody is
standing on my lawn.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;change&lt;/strong&gt; an Arch user needs to worry about is
that everything is being moved out of &lt;code&gt;/etc/rc.conf&lt;/code&gt;. Arch
users will still have the choice between systemd and SysV-init, but
rc.conf is becoming the SysV-init configuration file, rather than the
general system configuration file. If you will still be using SysV-init,
basically the only thing in rc.conf will be &lt;code&gt;DAEMONS&lt;/code&gt;.&lt;a
href="#fn2" class="footnote-ref" id="fnref2"
role="doc-noteref"&gt;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt; For now there is compatibility for
the variables that used to be there, but that is going away.&lt;/p&gt;
&lt;aside id="footnotes" class="footnotes footnotes-end-of-document"
role="doc-endnotes"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id="fn1"&gt;&lt;p&gt;&lt;em&gt;I&lt;/em&gt; don’t think it’s bloated, but that is the
criticism. Basically, I discount any argument that uses “bloated”
without backing it up. I was trying to say that it takes a lot of heat
for being bloated, and that there may be some truth to that (the
systemd-fsck and syslog comments), but that these claims are largely
unsubstantiated, and more along the lines of “I would have done it
differently”. Maybe your ideas are better, but you haven’t written the
code.&lt;/p&gt;
&lt;p&gt;I personally don’t have an opinion either way about SysV-init vs
systemd. I recently migrated my boxes to systemd, but that was because
the SysV init scripts for NFSv4 in Arch are problematic. I suppose this
is another &lt;strong&gt;advantage&lt;/strong&gt; I missed: &lt;em&gt;people generally
consider systemd “units” to be more robust and easier to write than SysV
“scripts”.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I’m actually not a fan of either. If I had more time on my hands, I’d
be running a &lt;code&gt;make&lt;/code&gt;-based init system based on a research
project IBM did a while ago. So I consider myself fairly objective; my
horse isn’t in this race.&lt;a href="#fnref1" class="footnote-back"
role="doc-backlink"&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id="fn2"&gt;&lt;p&gt;You can still have &lt;code&gt;USEDMRAID&lt;/code&gt;,
&lt;code&gt;USELVM&lt;/code&gt;, &lt;code&gt;interface&lt;/code&gt;, &lt;code&gt;address&lt;/code&gt;,
&lt;code&gt;netmask&lt;/code&gt;, and &lt;code&gt;gateway&lt;/code&gt;. But those are minor.&lt;a
href="#fnref2" class="footnote-back" role="doc-backlink"&gt;↩︎&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/aside&gt;
</content>
		<author><name>Luke Shumaker</name><uri>https://lukeshu.com/</uri><email>lukeshu@sbcglobal.net</email></author>
		<rights type="html">&lt;p&gt;The content of this page is Copyright © 2012 &lt;a href="mailto:lukeshu@sbcglobal.net"&gt;Luke Shumaker&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This page is licensed under the &lt;a href="https://creativecommons.org/licenses/by-sa/3.0/"&gt;CC BY-SA-3.0&lt;/a&gt; license.&lt;/p&gt;</rights>
	</entry>
	
</feed>