Re: mailer.py can produce subject header violates RFC 5321/5322 if truncate_subject is not set

From: Yasuhito FUTATSUKI <futatuki_at_poem.co.jp>
Date: Tue, 7 Jan 2020 09:41:21 +0900

On 2020/01/07 6:52, Yasuhito FUTATSUKI wrote:
> By the way, there seems to be another issue with truncate_subject: the
> current implementation may break a UTF-8 multi-byte character
> sequence. I haven't reproduced it, though (because I always use only
> ASCII characters in file names...).

Probably it needs something like this (but it doesn't handle combining
characters, and I haven't tested it...):
[[[
Index: tools/hook-scripts/mailer/mailer.py
===================================================================
--- tools/hook-scripts/mailer/mailer.py (revision 1872398)
+++ tools/hook-scripts/mailer/mailer.py (working copy)
@@ -159,7 +159,13 @@
       truncate_subject = 0

     if truncate_subject and len(subject) > truncate_subject:
-      subject = subject[:(truncate_subject - 3)] + "..."
+      # To avoid breaking a UTF-8 multi-byte character sequence, we should
+      # search back for the start of the sequence if the byte at the
+      # truncation point is the second or a later byte of the sequence.
+      idx = truncate_subject - 3
+      while 0x80 <= ord(subject[idx]) <= 0xbf:
+        idx -= 1
+      subject = subject[:idx] + "..."
     return subject

   def start(self, group, params):
]]]
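
For illustration only, the same idea can be sketched as a standalone
function that runs on Python 3. This is an assumption-laden sketch, not
the code in mailer.py: the name truncate_utf8 and its parameters are
hypothetical, the subject is assumed to be a UTF-8 encoded bytes object,
and the limit is counted in bytes. Like the patch above, it does not
handle combining characters.

```python
def truncate_utf8(subject, limit, suffix=b"..."):
    """Truncate UTF-8 bytes to at most `limit` bytes, ending with `suffix`.

    Hypothetical helper for illustration; not part of mailer.py.
    The input is returned unchanged if it already fits, or if `limit`
    is too small to hold the suffix.
    """
    if limit <= len(suffix) or len(subject) <= limit:
        return subject
    idx = limit - len(suffix)
    # UTF-8 continuation bytes are 0x80..0xBF. If the cut point lands on
    # one, step back to the lead byte so the whole character is dropped.
    while idx > 0 and 0x80 <= subject[idx] <= 0xBF:
        idx -= 1
    return subject[:idx] + suffix


if __name__ == "__main__":
    raw = "あいうえお".encode("utf-8")  # 5 characters, 3 bytes each
    cut = truncate_utf8(raw, 10)
    print(cut.decode("utf-8"))  # decodes cleanly, no split character
```

The "idx > 0" guard is an addition that the patch does not have; it
only keeps the loop from running past the start of the string.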

Cheers,

-- 
Yasuhito FUTATSUKI <futatuki_at_yf.bsdclub.org> / <futatuki_at_poem.co.jp>

Received on 2020-01-07 01:43:36 CET
