Diffstat (limited to 'public/message-threading.md')
-rw-r--r-- | public/message-threading.md | 76 |
1 files changed, 76 insertions, 0 deletions
diff --git a/public/message-threading.md b/public/message-threading.md
new file mode 100644
index 0000000..eb83705
--- /dev/null
+++ b/public/message-threading.md
@@ -0,0 +1,76 @@
+Notes on email message threading
+================================
+---
+date: "2024-06-08"
+markdown_options: "-smart"
+---
+
+> I sent an email to Jamie Zawinski with feedback on his venerable
+> email threading algorithm. Perhaps my commentary will be a useful
+> reference to others implementing email threading.
+>
+> You can see my implementation of his algorithm at
+> <https://git.lukeshu.com/www/tree/cmd/generate/mailstuff/thread_alg.go>
+> (and a use of it at
+> <https://git.lukeshu.com/www/tree/cmd/generate/mailstuff/thread.go>).
+
+<div style="font-family: monospace">
+To: [Jamie Zawinski] [<jwz@jwz.org>]<br/>
+Subject: message threading<br/>
+Date: Sat, 08 Jun 2024 22:34:41 -0600
+Message-ID: <87tti2ybry.wl-lukeshu@lukeshu.com>
+</div>
+
+Hi,
+
+I'm implementing message threading, and have been referencing both
+your document [<https://www.jwz.org/doc/threading.html>]; and [RFC 5256].
+I'm not sure whether you're interested in updating a document that's
+more than 25 years old, but if you are: I hope you find the following
+feedback valuable.
+
+You write that the algorithm in RFC 5256 is merely a <q>restating</q> of
+your algorithm, but I noticed 3 (minor) differences:
+
+1. In your step 1.C, the RFC says to check whether this would create a
+   loop, and if it would to skip creating the link; your version only
+   says to perform this check in step 1.B.
+
+2. The RFC says to sort the messages by date between your steps 4 and
+   5; that is: when grouping by subject, containers in the root set
+   should be processed in date-order (you do not specify an order),
+   and that if container in the root set is empty then the subject
+   should be taken from the earliest-date child (you say to use an
+   arbitrary child).
+
+3. The RFC precisely states how to trim a subject down to a "base
+   subject," rather than simply saying <q>Strip \`\`Re:'', \`\`RE:'',
+   \`\`RE[5]:'', \`\`Re: Re[4]: Re:'' and so on.</q>
+
+Additionally, there are two minor points on which I found their
+version to be clearer:
+
+1. The RFC specifies how to handle messages without a Message-Id or
+   with a duplicate Message-Id (on [page 9]), as well as how to
+   normalize a Message-Id (by referring to [RFC 2822]). This is perhaps
+   out-of-scope of your algorithm document, but I feel that it would
+   be worth mentioning in your background or definitions section.
+
+2. In your step 1.B, I did not understand what <q>If they are already
+   linked, don't change the existing links</q> meant until I read the
+   RFC, which words it as <q>If a message already has a parent, don't
+   change the existing link.</q> It was unclear to me what <q>they</q> was
+   referring to in your version.
+
+<div style="font-family: monospace">
+-- <br/>
+Happy hacking,<br/>
+~ Luke T. Shumaker<br/>
+</div>
+
+[Jamie Zawinski]: https://www.jwz.org/
+[<jwz@jwz.org>]: https://www.jwz.org/about.html
+[<https://www.jwz.org/doc/threading.html>]: https://www.jwz.org/doc/threading.html
+[RFC 5256]: https://datatracker.ietf.org/doc/html/rfc5256
+[RFC 2822]: https://datatracker.ietf.org/doc/html/rfc2822
+[page 9]: https://datatracker.ietf.org/doc/html/rfc5256#page-9
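
The first difference listed in the email (the loop check applying to step 1.C as well as step 1.B) can be illustrated with a minimal Go sketch. This is not taken from the linked `thread_alg.go`; the `Container` type and the `isAncestor`, `link`, and `reparent` names are hypothetical, chosen only for illustration. `link` models the step 1.B behavior described in the email (keep an existing parent, skip on loop), and `reparent` models step 1.C with the loop check the RFC adds.

```go
package main

import "fmt"

// Container is a hypothetical thread-tree node; the real implementation
// may be structured differently.
type Container struct {
	ID       string
	Parent   *Container
	Children []*Container
}

// isAncestor reports whether a is c itself or one of c's ancestors.
func isAncestor(a, c *Container) bool {
	for n := c; n != nil; n = n.Parent {
		if n == a {
			return true
		}
	}
	return false
}

// link (step 1.B style) makes child a child of parent, unless child
// already has a parent or the link would create a loop.
// It reports whether the link was created.
func link(parent, child *Container) bool {
	if child.Parent != nil {
		return false // already has a parent: don't change the existing link
	}
	if isAncestor(child, parent) {
		return false // parent is child, or descends from child: would loop
	}
	child.Parent = parent
	parent.Children = append(parent.Children, child)
	return true
}

// reparent (step 1.C style) replaces child's parent, but skips the
// change entirely if it would create a loop.
func reparent(parent, child *Container) bool {
	if isAncestor(child, parent) {
		return false
	}
	if old := child.Parent; old != nil {
		// Detach child from its old parent's child list.
		for i, c := range old.Children {
			if c == child {
				old.Children = append(old.Children[:i], old.Children[i+1:]...)
				break
			}
		}
	}
	child.Parent = parent
	parent.Children = append(parent.Children, child)
	return true
}

func main() {
	a := &Container{ID: "a"}
	b := &Container{ID: "b"}
	c := &Container{ID: "c"}
	fmt.Println(link(a, b))     // prints true
	fmt.Println(link(b, c))     // prints true
	fmt.Println(link(c, a))     // prints false: a is already an ancestor of c
	fmt.Println(reparent(c, a)) // prints false: same loop, caught in the 1.C path
}
```

Walking up the `Parent` chain keeps the check proportional to the depth of the thread, which is one simple way to do it; other approaches may suit other container layouts.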