On 11.08.2014 01:38, Alan Modra wrote:
> On Sat, Aug 09, 2014 at 08:08:21PM +0200, Branko Čibej wrote:
>> The way to fix this is to make sure that the macro
>> SVN_UNALIGNED_ACCESS_IS_OK gives the correct answer; and it's OK if
>> that answer is compiler-specific, not just platform-specific. In other
>> words, it's fine if the macro gives a different answer depending on
>> GCC vectorizer options.
> If it were up to me I'd revert r956593 entirely. Thinking you can
> optimise memcpy is just plain wrong-headed,
I fully agree this is the case today. It probably wasn't when the code
was written. Still, it makes sense to look at all our
unaligned-access-specific code again.
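
To make "compiler-specific" concrete, here is a minimal sketch of how such
a macro could be gated. The name EXAMPLE_UNALIGNED_ACCESS_IS_OK, the helper
load_u32() and the conditions chosen are illustrative assumptions, not
Subversion's actual definition. It answers "no" for GCC-compatible
compilers, on the assumption that their vectorizer can rely on natural
alignment, and lets the build override the answer with -D.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical macro, for illustration only. The build may predefine it. */
#ifndef EXAMPLE_UNALIGNED_ACCESS_IS_OK
# if defined(__GNUC__)
   /* Assumption: be conservative for GCC-compatible compilers. */
#  define EXAMPLE_UNALIGNED_ACCESS_IS_OK 0
# elif defined(_M_IX86) || defined(_M_X64)
   /* MSVC targeting x86 or x64. */
#  define EXAMPLE_UNALIGNED_ACCESS_IS_OK 1
# else
#  define EXAMPLE_UNALIGNED_ACCESS_IS_OK 0
# endif
#endif

/* Read 32 bits from a possibly unaligned address. */
static uint32_t load_u32(const unsigned char *p)
{
  assert(p != NULL);
#if EXAMPLE_UNALIGNED_ACCESS_IS_OK
  /* Direct access: only valid where the macro's answer really is "yes". */
  return *(const uint32_t *)p;
#else
  /* Portable path: compilers typically turn this into a single load. */
  uint32_t v;
  memcpy(&v, p, sizeof v);
  return v;
#endif
}
```

With the portable path, the alignment question is left to memcpy and the
compiler, which is in line with Alan's point about not trying to outsmart
memcpy.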
> and this is the second bug
> found in patterning_copy(). Hmm, actually if you want to optimise
> patterning_copy() then something that might make sense is to detect
> the overlap case and copy the original repeating sequence multiple
> times, rather than copying what has just been written. That is likely
> to give better cache performance especially on processors that suffer
> from load-hit-store stalls.
You're probably right. Back when I wrote the original code (which, IIRC,
didn't use any magic unaligned accesses ...), I was more concerned about
being able to read what I'd written. :)
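
For reference, Alan's suggestion could be sketched roughly as follows. This
is not the code in Subversion's patterning_copy(); the function name
repeat_pattern_copy() and its signature are hypothetical, and it assumes
the repeating sequence is the `period` bytes immediately before `dst`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill dst[0..len) by repeating the `period` bytes
   that end right before dst. Every copy reads from that original
   sequence, never from bytes this call has just written. */
static void repeat_pattern_copy(unsigned char *dst, size_t period, size_t len)
{
  const unsigned char *pattern = dst - period;

  assert(period > 0);

  if (period >= len)
    {
      /* Source and destination do not overlap: one plain copy. */
      memcpy(dst, pattern, len);
      return;
    }

  /* Overlap case: copy whole repetitions of the original sequence. */
  while (len >= period)
    {
      memcpy(dst, pattern, period);
      dst += period;
      len -= period;
    }

  /* Trailing partial repetition, if any. */
  if (len > 0)
    memcpy(dst, pattern, len);
}
```

For example, with "abc" in the buffer and dst pointing just past it, asking
for 7 bytes with period 3 yields "abcabcabca" in total. Each memcpy has
non-overlapping source and destination, so the usual memcpy preconditions
hold.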
-- Brane
--
Branko Čibej | Director of Subversion
WANdisco | Realising the impossibilities of Big Data
e. brane_at_wandisco.com
Received on 2014-08-11 02:01:08 CEST