Diffstat (limited to 'public/message-threading.md')
-rw-r--r-- public/message-threading.md 76
1 files changed, 76 insertions, 0 deletions
diff --git a/public/message-threading.md b/public/message-threading.md
new file mode 100644
index 0000000..eb83705
--- /dev/null
+++ b/public/message-threading.md
@@ -0,0 +1,76 @@
+Notes on email message threading
+================================
+---
+date: "2024-06-08"
+markdown_options: "-smart"
+---
+
+> I sent an email to Jamie Zawinski with feedback on his venerable
+> email threading algorithm. Perhaps my commentary will be a useful
+> reference to others implementing email threading.
+>
+> You can see my implementation of his algorithm at
+> <https://git.lukeshu.com/www/tree/cmd/generate/mailstuff/thread_alg.go>
+> (and a use of it at
+> <https://git.lukeshu.com/www/tree/cmd/generate/mailstuff/thread.go>).
+
+<div style="font-family: monospace">
+To: [Jamie Zawinski] [&lt;jwz@jwz.org&gt;]<br/>
+Subject: message threading<br/>
+Date: Sat, 08 Jun 2024 22:34:41 -0600<br/>
+Message-ID: &lt;87tti2ybry.wl-lukeshu@lukeshu.com&gt;
+</div>
+
+Hi,
+
+I'm implementing message threading, and have been referencing both your
+document [&lt;https://www.jwz.org/doc/threading.html&gt;] and [RFC 5256].
+I'm not sure whether you're interested in updating a document that's
+more than 25 years old, but if you are, I hope you find the following
+feedback valuable.
+
+You write that the algorithm in RFC 5256 is merely a <q>restating</q> of
+your algorithm, but I noticed 3 (minor) differences:
+
+1. In your step 1.C, the RFC says to check whether this would create a
+ loop and, if it would, to skip creating the link; your version only
+ says to perform this check in step 1.B.
+
+2. The RFC says to sort the messages by date between your steps 4 and
+ 5; that is: when grouping by subject, containers in the root set
+ should be processed in date-order (you do not specify an order),
+ and that if a container in the root set is empty then the subject
+ should be taken from the earliest-date child (you say to use an
+ arbitrary child).
+
+3. The RFC precisely states how to trim a subject down to a "base
+ subject," rather than simply saying <q>Strip \`\`Re:'', \`\`RE:'',
+ \`\`RE[5]:'', \`\`Re: Re[4]: Re:'' and so on.</q>
+
+Additionally, there are two minor points on which I found their
+version to be clearer:
+
+1. The RFC specifies how to handle messages without a Message-Id or
+ with a duplicate Message-Id (on [page 9]), as well as how to
+ normalize a Message-Id (by referring to [RFC 2822]). This is perhaps
+ out of scope for your algorithm document, but I feel that it would
+ be worth mentioning in your background or definitions section.
+
+2. In your step 1.B, I did not understand what <q>If they are already
+ linked, don't change the existing links</q> meant until I read the
+ RFC, which words it as <q>If a message already has a parent, don't
+ change the existing link.</q> It was unclear to me what <q>they</q> was
+ referring to in your version.
+
+<div style="font-family: monospace">
+-- <br/>
+Happy hacking,<br/>
+~ Luke T. Shumaker<br/>
+</div>
+
+[Jamie Zawinski]: https://www.jwz.org/
+[&lt;jwz@jwz.org&gt;]: https://www.jwz.org/about.html
+[&lt;https://www.jwz.org/doc/threading.html&gt;]: https://www.jwz.org/doc/threading.html
+[RFC 5256]: https://datatracker.ietf.org/doc/html/rfc5256
+[RFC 2822]: https://datatracker.ietf.org/doc/html/rfc2822
+[page 9]: https://datatracker.ietf.org/doc/html/rfc5256#page-9