On 2020/01/08 2:03, Daniel Shahaf wrote:
> Yasuhito FUTATSUKI wrote on Tue, Jan 07, 2020 at 06:52:20 +0900:
>> I found tools/hook-scripts/mailer/mailer.py can produce very long
>> subject header line without folding. It can be easily over 1000
>> characters [1] if some large source tree is imported in a repository
>> and truncate_subject config value is not specified appropriately.
>> The mailer.py script sends it without regard to whether the server can
>> accept lines over 1000 octets [2], and has no way to recover when it
>> receives a response like "500 line too long" (of course, as this response
>> code doesn't show the reason, it is no wonder).
>>
>> I also found a similar suggestion for commit-email.pl in the users@ list
>> archive [3], but with mailer.py we can avoid it by setting an appropriate
>> truncate_value, such as 200 (which is shown as a comment in
>> mailer.conf.example). Is that sufficient?
>>
>
> Admins shouldn't need to edit the config file in order to have the tool comply
> with relevant RFC's. Therefore, we should not generate lines longer than
> 998 octets unless we're specifically aware that the remote SMTP server can
> accept such lines.
>
> Thus, I wouldn't consider editing the comments in mailer.conf.example to suffice:
> that wouldn't prevent mailer.py from generating overlong lines by default.
>
>> (1) It is sufficient because this is a code example and setting example,
>> and not meant to be used as is.
>> (2) We should change the default value not to violate them.
>> (3) We should change the default value and ignore larger values if they
>> are set.
>> (4) We should implement line folding
>
> I think there are two separate questions here: the physical lines [the raw
> rfc822 form] and the logical lines [after unfolding and MIME decoding].
>
> The physical lines shouldn't be longer than the 1000 octet limit stipulated by
> the relevant RFC's. Longer lines should be folded (or else truncated with a
> warning). I'm surprised that that isn't already the case; we should let the
> 'email' module handle this for us. (We already do it this way in
> tools/dist/security/mailer.py (sic).)
I agree that we should leave this issue to the 'email' module,
at least on Python 3. It would be better to rewrite the code with the
modern API rather than the legacy one. (I suspect the author of the current
code didn't trust the 'email' package, or didn't like its encoding policy...)
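A minimal sketch of letting the modern 'email' API fold a long Subject,
assuming Python 3. This is not the current mailer.py code; the
build_message helper, the addresses, and the generated subject are
hypothetical and only illustrate the idea:

```python
from email.message import EmailMessage
from email.policy import SMTP


def build_message(subject, body):
    """Build a message whose headers are folded by the 'email' package.

    Hypothetical helper, not part of mailer.py.
    """
    # policy=SMTP uses CRLF line endings and folds long header lines.
    msg = EmailMessage(policy=SMTP)
    msg["Subject"] = subject
    msg["From"] = "svn@example.invalid"      # placeholder address
    msg["To"] = "commits@example.invalid"    # placeholder address
    msg.set_content(body)
    return msg


# A subject well over 1000 characters, like one produced by a large import.
long_subject = "r1234 - in trunk: " + " ".join(
    "dir%d/file%d.c" % (i, i) for i in range(200))

raw = build_message(long_subject, "log message\n").as_bytes()

# Measure the longest physical header line in the serialized message.
header_block = raw.split(b"\r\n\r\n", 1)[0]
longest = max(len(line) for line in header_block.split(b"\r\n"))
```

Here `longest` is the length of the longest physical header line, which
should stay well under the 998 octet limit even though the logical
subject is much longer.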
> The logical lines shouldn't be longer than the limit given by TRUNCATE_VALUE in
> the config file. It's fair game to ask whether the default value of
> TRUNCATE_VALUE should be changed, but this is just a usability issue, whereas
> the length of physical lines is an interoperability issue.
Agreed.
> Is TRUNCATE_VALUE specified in characters or in bytes? The docstring doesn't
> make this clear. If it applies to the logical line it should be specified in
> characters, but it sounds like the code interprets it in bytes?
I think the current code doesn't take multi-byte characters into account,
as I wrote in another reply, so this is not clear at the moment.
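To illustrate the difference between the two interpretations, here is a
rough sketch of truncating a subject by characters versus by bytes. The
function names, the "..." marker, and the UTF-8 default are assumptions
for illustration, not the behaviour of the current mailer.py:

```python
def truncate_chars(subject, limit):
    """Truncate to at most `limit` characters (code points)."""
    if len(subject) <= limit:
        return subject
    # Reserve three characters for the "..." marker.
    return subject[:max(limit - 3, 0)] + "..."


def truncate_bytes(subject, limit, encoding="utf-8"):
    """Truncate so that the encoded form is at most `limit` bytes."""
    data = subject.encode(encoding)
    if len(data) <= limit:
        return subject
    # errors="ignore" drops a trailing partial multi-byte sequence,
    # so a character is never split in the middle.
    cut = data[:max(limit - 3, 0)].decode(encoding, errors="ignore")
    return cut + "..."


subject = "日本語のコミット"   # 8 characters, 24 bytes in UTF-8
by_chars = truncate_chars(subject, 6)
by_bytes = truncate_bytes(subject, 10)
```

With the same numeric limit, the two functions keep very different
amounts of text once the subject contains non-ASCII characters, which is
why the unit of the limit needs to be documented.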
A length limit in characters is mostly used in the context of display
areas with a fixed-width font. However, that often overlooks the
existence of CJK ideographs, which are twice as wide as Latin letters.
So there is a third possible unit for the length limit,
'per character width unit' :)
(The latter half of this paragraph is just an idle complaint, so please
ignore it.)
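For what it's worth, the 'per character width unit' idea can be sketched
with the standard unicodedata module. The display_width function is a
hypothetical helper, and treating only the "W" and "F" East Asian Width
classes as double width is a simplifying assumption:

```python
import unicodedata


def display_width(text):
    """Estimate the number of fixed-width columns needed to show `text`."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on the previous character.
            continue
        # Wide (W) and Fullwidth (F) characters take two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


ascii_width = display_width("abc")
cjk_width = display_width("日本語")
```

Both strings are three characters long, but the second one needs twice
the columns in a fixed-width display.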
On the other hand, a limit in bytes represents just the size of the data,
or the quantity of information (although it ignores entropy).
I think the limit on the subject was introduced mainly for display
purposes.
(I'm sorry, I can't spare much time for the rest of this month, so
it will be next month before I can work on this issue...)
Cheers,
--
Yasuhito FUTATSUKI <futatuki_at_poem.co.jp>
Received on 2020-01-09 02:08:32 CET