Karl Fogel wrote:
> Neels Janosch Hofmeyr <neels_at_elego.de> writes:
>
>> Checking for LF is easy. For UTF-8, there is this function in
>> subversion/libsvn_subr/utf.c called check_utf8(..), which I gather I
>> cannot access from libsvn_repos unless it is made public in
>> subversion/include/svn_utf.h
>>
>> So, I want to rename check_utf8 to svn_utf_check_utf8, put a "@since New
>> in 1.6" tag on the public doc string, publish it in include/svn_utf.h,
>> adjust all callers and use it in libsvn_repos/fs-wrap.c, in function
>> validate_prop(..).
>>
>> Am I on the right track here? :)
>>
>
> You're on the right track, but you don't have to make the function
> public. Subversion has an intermediate level of inter-library privacy,
> to allow a symbol to be shared among Subversion's modules without
> publishing that symbol to the world. See here:
>
> http://svn.collab.net/repos/svn/trunk/subversion/include/private
>
> Does that help?
>

Well, it would have, but there isn't any UTF function published in
include/private/ either.

I could go on to publish check_utf8 in include/private/. But all the
other UTF-8 functions are declared in include/svn_utf.h. Wouldn't it be
silly to publish check_utf8 in a completely different place from the
rest of the UTF stuff? (I see check_utf8 in the category of "general
purpose tools that are nice to have around".)

If not, where in include/private/ would check_utf8 go? check_utf8 is
defined in libsvn_subr/utf.c, but there is no include/private/svn_subr.h
to add it to...

By the way, there is an alternative to check_utf8. Let's look at the
implementation of check_utf8() in libsvn_subr/utf.c:

/* Verify that the sequence DATA of length LEN is valid UTF-8 */
static svn_error_t *
check_utf8(const char *data, apr_size_t len, apr_pool_t *pool)
{
  if (! svn_utf__is_valid(data, len))
    return invalid_utf8(data, len, pool);
  return SVN_NO_ERROR;
}

check_utf8 calls svn_utf__is_valid(), defined in libsvn_subr/utf_validate.c:

svn_boolean_t
svn_utf__is_valid(const char *data, apr_size_t len)
{
  const char *end = data + len;
  int state = FSM_START;
  while (data < end)
    {
      unsigned char octet = *data++;
      int category = octet_category[octet];
      state = machine[state][category];
    }
  return state == FSM_START ? TRUE : FALSE;
}

The difference is that svn_utf__is_valid() returns a boolean, whereas
check_utf8() returns an svn_error_t *. Both of these functions are only
accessible within libsvn_subr, and are declared neither in include/ nor
in include/private/.
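
To make the difference between the two return styles concrete, here is a
minimal standalone sketch. It is not Subversion code: the names
sketch_utf8_is_valid() and sketch_check_utf8() are hypothetical, the
validator is a simplified hand-written check rather than the table-driven
state machine in utf_validate.c, and a plain message string stands in for
svn_error_t *.

```c
#include <stddef.h>

/* Hypothetical stand-in for svn_utf__is_valid(): return 1 if DATA of
   length LEN is well-formed UTF-8, else 0.  Simplified check, not the
   FSM that Subversion uses. */
int
sketch_utf8_is_valid(const char *data, size_t len)
{
  const unsigned char *p = (const unsigned char *)data;
  const unsigned char *end = p + len;

  while (p < end)
    {
      unsigned char c = *p++;
      int trailing;
      unsigned long min, cp;

      if (c < 0x80)
        continue;                       /* plain ASCII octet */
      else if ((c & 0xE0) == 0xC0)
        { trailing = 1; min = 0x80; cp = c & 0x1F; }
      else if ((c & 0xF0) == 0xE0)
        { trailing = 2; min = 0x800; cp = c & 0x0F; }
      else if ((c & 0xF8) == 0xF0)
        { trailing = 3; min = 0x10000; cp = c & 0x07; }
      else
        return 0;                       /* stray continuation or bad lead */

      if ((size_t)(end - p) < (size_t)trailing)
        return 0;                       /* sequence is truncated */

      while (trailing-- > 0)
        {
          unsigned char t = *p++;
          if ((t & 0xC0) != 0x80)
            return 0;                   /* not a continuation octet */
          cp = (cp << 6) | (t & 0x3F);
        }

      /* Reject overlong forms, surrogates and out-of-range values. */
      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    }
  return 1;
}

/* Hypothetical stand-in for check_utf8(): NULL plays the role of
   SVN_NO_ERROR, a message string plays the role of an error object. */
const char *
sketch_check_utf8(const char *data, size_t len)
{
  if (! sketch_utf8_is_valid(data, len))
    return "Valid UTF-8 data expected";
  return NULL;
}
```

A caller that only needs a yes/no answer can use the boolean form
directly, while a caller that wants to propagate a failure up the stack
would probably prefer the error-returning wrapper.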

Any opinions on which one of these functions should be published (and
where)?

Currently, a lot of functions that convert from/to UTF-8 are declared in
include/svn_utf.h, but none that just verify. Should I rather attempt a
conversion and discard the resulting copied data? No, right?

Thanks!

--
Neels Hofmeyr -- elego Software Solutions GmbH
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23458696 mobile: +49 177 2345869 fax: +49 30 23458695
http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelsreg: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
Received on 2008-05-26 23:23:38 CEST