core/s-nail/mimeheader.patch



s-nail-14_5_2-mimeheader.patch, 2014-02-05:

Apply:
  $ cd s-nail-14.5.2
  $ patch -bu < s-nail-14_5_2-mimeheader.patch

Description:
    mime_fromhdr(): fix my rewrite again..
    
    My hasty rewrite [0f9ad93] (mime_fromhdr(): partial rewrite using
    n_iconv_str(), 2013-03-12), just about ninety (90) minutes before
    the release of S-nail v14.1 already caused the bugfix [b608c6b]
    (mime_fromhdr(): never return NULL output.., 2013-03-14), which
    was the sole reason for the release of S-nail v14.2.
    
    Well, about a year later, after tens of thousands of mails,
    including multibyte ones, i wrote myself a message that has shown
    that the rewrite was still buggy -- the header
    
      Subject: ehm, .getElementById("blink") needs <span
       =?US-ASCII?Q?id=3D"blink">,?= not =?US-ASCII?Q?class=3D"id"?=
    
    cannot be viewed correctly, the ", not" will be lost.
    The reason is now understood and this changeset should fix
    mime_fromhdr() so that it'll do what it is assumed to do in the
    current codebase, unless i'm terribly mistaken.
    
    Because i bickered some time in private, i WANT to add that the
    real problem is that the codebase is weird INSOFAR as that i still
    don't really understand the WAY it works, because THAT IS SICK.
    I.e., in my brain i assume this function effectively is
    rfc_2047_decode(), meant to decode encoded words as specified in
    RFC 2047, but that's simply not true, and FOR QUITE SOME TIME,
    because of the embedded newlines that may be in the data and need
    to be passed through for at least the case that we send data to the
    display.  I slowly get around that schizophrenic codebase while
    also converting it to a straight one, but that will take years.
    Until then we need to strip whitespace in between multiple
    adjacent encoded words, while passing through newlines and
    whitespace that follows newlines, regardless of whatever.
    I hope this will do it until we are sane.
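
    The rule stated above (drop blanks between adjacent encoded words,
    pass everything else through, newlines included) can be sketched in
    isolation.  This is not the S-nail code: sketch_rfc2047_decode(),
    qword() and hexval() are hypothetical names, only the "Q" encoding
    is handled, the charset is ignored, and no iconv(3) conversion is
    attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of a hex digit, or -1. */
static int hexval(int c)
{
   if (c >= '0' && c <= '9') return c - '0';
   if (c >= 'A' && c <= 'F') return c - 'A' + 10;
   if (c >= 'a' && c <= 'f') return c - 'a' + 10;
   return -1;
}

/* Hypothetical helper: if s starts with "=?charset?Q?text?=", return the
 * start of text and store the position of the closing "?=" in *end. */
static const char *qword(const char *s, const char **end)
{
   const char *p, *q;

   if (s[0] != '=' || s[1] != '?')
      return NULL;
   p = strchr(s + 2, '?');                /* end of the charset name */
   if (p == NULL || (p[1] != 'Q' && p[1] != 'q') || p[2] != '?')
      return NULL;
   p += 3;
   if ((q = strstr(p, "?=")) == NULL)
      return NULL;
   *end = q;
   return p;
}

/* Decode in into out (at most outsz - 1 bytes plus NUL); returns the
 * number of bytes written.  Only space and tab count as blanks, so a
 * newline between two encoded words is passed through. */
size_t sketch_rfc2047_decode(const char *in, char *out, size_t outsz)
{
   size_t ol = 0;       /* current output length */
   size_t lastenc = 0;  /* output length right after the last encoded word */
   int afterenc = 0;    /* only blanks were copied since lastenc */

   assert(outsz > 0);
   while (*in != '\0') {
      const char *txt, *end;

      if ((txt = qword(in, &end)) != NULL) {
         if (afterenc)
            ol = lastenc;                 /* discard the blanks-only gap */
         for (; txt < end; ++txt) {
            int c = *txt, h, l;

            if (c == '_')
               c = ' ';
            else if (c == '=' && end - txt > 2 &&
                  (h = hexval(txt[1])) >= 0 && (l = hexval(txt[2])) >= 0) {
               c = h * 16 + l;
               txt += 2;
            }
            if (ol + 1 < outsz)
               out[ol++] = (char)c;
         }
         in = end + 2;
         lastenc = ol;
         afterenc = 1;
      } else {
         int onlyws = afterenc;

         do {                             /* copy up to the next "=?" */
            if (*in != ' ' && *in != '\t')
               onlyws = 0;
            if (ol + 1 < outsz)
               out[ol++] = *in;
            ++in;
         } while (*in != '\0' && !(in[0] == '=' && in[1] == '?'));
         if (!onlyws)
            afterenc = 0;
      }
   }
   out[ol] = '\0';
   return ol;
}
```

    Fed the Subject: body shown above, this sketch keeps the " not "
    between the two encoded words and yields

      ehm, .getElementById("blink") needs <span
       id="blink">, not class="id"

    whereas two encoded words separated only by spaces or tabs are
    joined without a gap.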
---
 mime.c | 63 +++++++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 39 insertions(+), 24 deletions(-)

diff --git a/mime.c b/mime.c
index ccb0061..6ee55cc 100644
--- a/mime.c
+++ b/mime.c
@@ -863,20 +863,26 @@ jclear:
 	goto jleave;
 }
 
-/*
- * Convert header fields from RFC 1522 format
- * TODO mime_fromhdr(): NO error handling, fat; REWRITE **ASAP**
- */
 FL void
 mime_fromhdr(struct str const *in, struct str *out, enum tdflags flags)
 {
-	/* TODO mime_fromhdr(): is called with strings that contain newlines;
-	 * TODO this is the usual newline problem all around the codebase;
-	 * TODO i.e., if we strip it, then the display misses it ;} */
+   /* TODO mime_fromhdr(): is called with strings that contain newlines;
+    * TODO this is the usual newline problem all around the codebase;
+    * TODO i.e., if we strip it, then the display misses it ;>
+    * TODO this is why it is so messy and why S-nail v14.2 plus additional
+    * TODO patch for v14.5.2 (and maybe even v14.5.3 subminor) occurred, and
+    * TODO why our display reflects what is contained in the message: the 1:1
+    * TODO relationship of message content and display!
+    * TODO instead a header line should be decoded to what it is (a single
+    * TODO line that is) and it should be objective to the backend whether
+    * TODO it'll be folded to fit onto the display or not, e.g., for search
+    * TODO purposes etc.  then the only condition we have to honour in here
+    * TODO is that whitespace in between multiple adjacent MIME encoded words
+    * TODO á la RFC 2047 is discarded; i.e.: this function should deal with
+    * TODO RFC 2047 and be renamed: mime_fromhdr() -> mime_rfc2047_decode() */
 	struct str cin, cout;
 	char *p, *op, *upper, *cs, *cbeg;
-	int convert;
-	size_t lastoutl = (size_t)-1;
+   ui32_t convert, lastenc, lastoutl;
 #ifdef HAVE_ICONV
 	char const *tcs;
 	iconv_t fhicd = (iconv_t)-1;
@@ -894,6 +900,7 @@ mime_fromhdr(struct str const *in, struct str *out, enum tdflags flags)
 #endif
 	p = in->s;
 	upper = p + in->l;
+   lastenc = lastoutl = 0;
 
 	while (p < upper) {
 		op = p;
@@ -949,8 +956,7 @@ mime_fromhdr(struct str const *in, struct str *out, enum tdflags flags)
 					--cout.l;
 			} else
 				(void)qp_decode(&cout, &cin, NULL);
-			if (lastoutl != (size_t)-1)
-				out->l = lastoutl;
+			out->l = lastenc;
 #ifdef HAVE_ICONV
 			if ((flags & TD_ICONV) && fhicd != (iconv_t)-1) {
 				cin.s = NULL, cin.l = 0; /* XXX string pool ! */
@@ -966,21 +972,30 @@ mime_fromhdr(struct str const *in, struct str *out, enum tdflags flags)
 #ifdef HAVE_ICONV
 			}
 #endif
-			lastoutl = out->l;
+			lastenc = lastoutl = out->l;
 			free(cout.s);
-		} else {
-jnotmime:
-			p = op;
-			convert = 1;
-			while ((op = p + convert) < upper &&
-					(op[0] != '=' || op[1] != '?'))
-				++convert;
-			out = n_str_add_buf(out, p, convert);
-			p += convert;
-			if (! blankchar(p[-1]))
-				lastoutl = (size_t)-1;
-		}
+		} else
+jnotmime: {
+         bool_t onlyws;
+
+         p = op;
+         onlyws = (lastenc > 0);
+         for (;;) {
+            if (++op == upper)
+               break;
+            if (op[0] == '=' && (PTRCMP(op + 1, ==, upper) || op[1] == '?'))
+               break;
+            if (onlyws && !blankchar(*op))
+               onlyws = FAL0;
+         }
+
+         out = n_str_add_buf(out, p, PTR2SIZE(op - p));
+         p = op;
+         if (!onlyws || lastoutl != lastenc)
+            lastenc = out->l;
+         lastoutl = out->l;
 	}
+  }
 	out->s[out->l] = '\0';
 
 	if (flags & TD_ISPR) {