On 28.04.2015 18:03, Marc Strapetz wrote:
> Hi Brane,
>
> On 28.04.2015 07:36, Branko Čibej wrote:
>> On 24.04.2015 14:11, Branko Čibej wrote:
>>>
>>> Hi Marc,
>>>
>>> Just a quick note: your last msg jogged my memory and I think I know
>>> the root cause of the leak: improper JNI frame management within a
>>> loop. If I'm right, I can both fix the leak and remove the
>>> close-stream requirement I just added.
>>>
>>> On 24 Apr 2015 11:00 am, "Marc Strapetz" <marc.strapetz_at_syntevo.com
>>> <mailto:marc.strapetz_at_syntevo.com>> wrote:
>>>
>>> On 24.04.2015 06 <tel:24.04.2015%2006>:34, Branko Čibej wrote:
>>>
>>> On 22.03.2015 05 <tel:22.03.2015%2005>:06, Branko Čibej wrote:
>>>
>>> On 21.03.2015 16 <tel:21.03.2015%2016>:23, Branko Čibej
>>> wrote:
>>>
>>> On 19.03.2015 11:43, Marc Strapetz wrote:
>>>
>>> Attached example performs an endless series of
>>> remote status against
>>> the Subversion repository. When invoked with
>>> -Xmx24M, the VM will run
>>> out of memory soon. Monitoring with jvisualvm
>>> shows that the used heap
>>> size constantly grows. Monitoring with the Task
>>> Manager shows that the
>>> allocated memory grows even more (significantly).
>>> Looks like a memory
>>> leak, for which a large amount of native memory is
>>> involved, too.
>>>
>>> Tested on Windows 8.1 with almost latest
>>> Subversion 1.9 JavaHL builds.
>>>
>>> I can confirm that this happens on the Mac, too, and
>>> it's not a garbage
>>> collector artefact. I'm trying to trace where the leak
>>> is happening ...
>>> valgrind with APR pool debugging doesn't tell me much
>>> (no surprise there).
>>>
>>> Just to make sure we weren't doing something bad in our
>>> libraries, I
>>> wrote a small C program that does the same as your Java
>>> example (Ev2
>>> shims included), and memory usage is completely steady. So
>>> it is
>>> something in JavaHL, but I have no clue yet what the
>>> problem is.
>>>
>>>
>>> I have to say this was one of the more "interesting" bug-hunts
>>> in my not
>>> entirely boring career, and that's not really obvious from
>>> the fix
>>> itself. :)
>>>
>>> http://svn.apache.org/r1675771
>>>
>>> Marc: this will not be in RC1, but please give the patch a
>>> spin and let
>>> me know if it fixes your problem. I tested this with the Java
>>> program
>>> you attached to your original report, and heap size no longer
>>> grows
>>> without bounds.
>>>
>>>
>>> Great hunt, Brane! The native leak seems to be fixed. I've run my
>>> remote status loop with -Xmx24M and still get an OOME after ~170
>>> loop iterations. The memory leak is significantly smaller and this
>>> time it seems to be in the Java part. According to the profiler,
>>> most memory is allocated by HashMap and friends, referenced from
>>> JNI code. Only two org.apache.subversion classes show up, but I
>>> guess they indicate the source of the leak:
>>>
>>> org.apache.subversion.javahl.types.Checksum (~10K instances)
>>> org.apache.subversion.javahl.types.NativeInputStream (~10K
>>> instances)
>>>
>>> Let me know, if you more profiler statistics will be helpful.
>>>
>>
>> So I've been looking at this in depth. At first I thought that one of
>> the problems was that we didn't release JNI local references; I added
>> code to make sure this happens in the status callbacks (not committed
>> yet) and I verified that all the native wrapped objects do get
>> finalized. However, the Java objects still hang around.
>>
>> One of the problems is that all the callbacks happen within the scope of
>> the ISVNReporter.finishReport call, which means that the whole edit
>> drive is considered a single JNI call (despite the callbacks to Java)
>> and the garbage collector can't reclaim space for the objects created
>> within JNI during that time. But even a forced GC after the report is
>> done and the remote session disposed won't release all the native
>> references. I'm a bit stumped here ... JVM's built-in memory profiler
>> shows the live references and where they're allocated, but doesn't show
>> why they're not released even when I explicitly create and destroy JNI
>> frames.
>
> Can you please commit your current state somewhere or send me a patch?
> I can give this one a try in JProfiler and see whether I can gather
> some more useful information.
Here's the complete patch against 1.9.x. I do see some NativeInputStream
objects (but not all) being garbage-collected, but there are a number of
other objects, even those allocated in Java code in the callback, that
are just hanging around, even if I force a GC. Any help in tracking this
down will be greatly appreciated.
-- Brane
Received on 2015-04-28 18:13:17 CEST