This might be public knowledge already, but it occurred to me while
documenting the fixed-length keyword syntax in svnbook last night. To make
it easy on myself, I'll just paste what I added to the book:
Be aware that because the width of a keyword field is measured in bytes,
the potential for corruption of multi-byte values exists. For example, a
username which contains some multi-byte UTF-8 characters might suffer
truncation in the middle of the string of bytes which make up one of
those characters. The result will be a mere truncation when viewed at the
byte level, but will likely appear as a string with an incorrect or
garbled final character when viewed as UTF-8 text. It is conceivable that
certain applications, when asked to load the file, would notice the
broken UTF-8 text and deem the entire file corrupt, refusing to operate
on the file altogether.
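
To make the failure mode concrete, here is a minimal Python sketch of the
byte-level effect. It is not Subversion's code: the helper names, the sample
username, and the 4-byte width are all made up for illustration, and the
second helper is just one possible way a consumer could trim a partial
trailing character, not something Subversion is claimed to do.

```python
def truncate_to_width(value: str, width: int) -> bytes:
    """Cut the UTF-8 encoding of value to at most width bytes.

    Hypothetical helper: the cut ignores character boundaries, which is
    the behaviour the paragraph above warns about.
    """
    return value.encode("utf-8")[:width]


def truncate_on_char_boundary(value: str, width: int) -> bytes:
    """Same cut, but drop a trailing partial character, if there is one.

    Hypothetical helper: because the input was valid UTF-8, the only
    invalid bytes after the cut are an incomplete final sequence, and
    errors="ignore" discards exactly those.
    """
    raw = value.encode("utf-8")[:width]
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


# Sample username (an assumption): "jos" plus U+00E9, which is 5 bytes
# in UTF-8 because U+00E9 encodes as the two bytes C3 A9.
name = "jos\u00e9"

# A 4-byte field keeps C3 but loses A9.
field = truncate_to_width(name, 4)
print(field)  # b'jos\xc3'

# A strict decoder rejects the field outright.
try:
    field.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print(strict_ok)

# A lenient decoder substitutes U+FFFD for the broken final character.
print(field.decode("utf-8", errors="replace"))

# Trimming back to a character boundary yields a shorter but valid value.
print(truncate_on_char_boundary(name, 4))
```

The strict decode is the "deem the entire file corrupt" case, and the
lenient decode is the "garbled final character" case.
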
If this was discussed earlier, sorry for the noise. I *don't* think it is
a bug. But I did feel it was something to call out in the text.
--
C. Michael Pilato <cmpilato@collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on Wed Mar 15 16:55:40 2006