[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Membuf "optimization" in r1532186

From: Branko Čibej <brane_at_wandisco.com>
Date: Mon, 09 Dec 2013 11:01:37 +0100

Hi Stefan,

could you please shed some light on your "minor opimization" commit:

> ------------------------------------------------------------------------
> r1532186 | stefan2 | 2013-10-15 06:59:00 +0200 (Tue, 15 Oct 2013) | 5
> lines Minor optimization in svn_membuf code. *
> subversion/libsvn_subr/string.c (membuf_create): always allocate, even
> 0 bytes (svn_membuf__resize): new allocated membuf->data can never be
> NULL Index: subversion/libsvn_subr/string.c
> ===================================================================
> --- subversion/libsvn_subr/string.c (revision 1532185) +++
> subversion/libsvn_subr/string.c (revision 1532186) @@ -53,9 +53,9 @@
> /* apr_palloc will allocate multiples of 8. * Thus, we would waste
> some of that memory if we stuck to the * smaller size. Note that this
> is safe even if apr_palloc would - * use some other aligment or none
> at all. */ + * use some other alignment or none at all. */
> minimum_size = APR_ALIGN_DEFAULT(minimum_size); - *data =
> (!minimum_size ? NULL : apr_palloc(pool, minimum_size)); + *data =
> apr_palloc(pool, minimum_size); *size = minimum_size; } @@ -121,7
> +121,10 @@ const apr_size_t old_size = membuf->size;
> membuf_ensure(&membuf->data, &membuf->size, size, membuf->pool); - if
> (membuf->data && old_data && old_data != membuf->data) + + /* If we
> re-allocated MEMBUF->DATA, it cannot be NULL. + * Statically
> initialized membuffers (OLD_DATA) may be NULL, though. */ + if
> (old_data && old_data != membuf->data) memcpy(membuf->data, old_data,
> old_size); }

The result, as I see it, pessimises the code in many scenarios where
membufs are used as a buffer whose size is unknown at creation time.
True, we save some time at every resize. The time saved is never more
than one clock cycle, and probably far less on average (with parallel
issue); checking (membuf->data != 0) will take that much time once its
value is loaded into a register, and your changed condition does not
avoid the load.

On the other hand, you pay for this "optimization" by always calling
apr_palloc on membuf creation, even if the buffer size is 0 and will
therefore probably be resized, and apr_palloc called again, almost
immediately. I fail to see how you can justify this extra call to
apr_palloc.

To clarify, the most often used pattern where the initial membuf size os
0 is when normalizing UTF-8 strings, where we let the utf8proc code
determine how large the allocation has to be, based on its analysis of
the string; the only alternative is to allocate a far larger buffer than
you can ever need, and incidentally making assumptions about how the
normalization is implemented. The extra allocation you introduced here
does not speed anything up; rather the opposite.

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane_at_wandisco.com
Received on 2013-12-09 11:02:15 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.