On 29.04.2015 17:44, Branko Čibej wrote:
> On 29.04.2015 17:03, Branko Čibej wrote:
>> On 29.04.2015 16:02, Branko Čibej wrote:
>>> On 29.04.2015 11:57, Marc Strapetz wrote:
>>>> On 29.04.2015 05:31, Branko Čibej wrote:
>>>>> On 28.04.2015 21:22, Bert Huijben wrote:
>>>>>>> -----Original Message-----
>>>>>>> From: Marc Strapetz [mailto:marc.strapetz_at_syntevo.com]
>>>>>>> Sent: dinsdag 28 april 2015 20:26
>>>>>>> To: Branko Čibej
>>>>>>> Cc: Subversion Development
>>>>>>> Subject: Re: 1.9 JavaHL memory leak in ISVNRemote#status
>>>>>>> Also, I should add that according to the Profiler, the byte[]s are
>>>>>>> referenced from the Checksums. The char[]s are referenced from the
>>>>>>> Strings. And the Strings are referenced directly as JNI local
>>>>>>> references. Browsing through these Strings, they seem to be
>>>>>>> server-side
>>>>>>> paths ("subversion/branches/1.8.x/...")
>>>>>> Just guessing: Notifications?
>>>>> No, this is an RA status edit drive; there are no notifications, only
>>>>> editor callbacks, and the checksum objects are created in in the
>>>>> callbacks related to file content changes (file contents streams and
>>>>> checksums always come in pairs).
>>>>>
>>>>> I counted creations, finalizations and garbage collections again. I
>>>>> added forced finalization and GC calls to the test case. For every loop
>>>>> in the test, we create 57 Checksum instances, but only one of them is
>>>>> finalized, no matter how often the finalizer and GC are run. All the
>>>>> Checksum objects are created in the same way, and here are /no/
>>>>> references anywhere to the remaining 56 objects, yet they're neither
>>>>> finalized nor garbage-collected. The fields (byte array and kind) /are/
>>>>> collected; all the "live" (according to the heap profiler) Checksum
>>>>> objects have their fields set to null.
>>>> I've been testing on Windows. According to JProfiler and JVisualVM,
>>>> byte[]s are still referenced from the Checksums. Hence, I would expect
>>>> that they are not garbage collected.
>>>>
>>>>> clearly, the code is cleaning up the references correctly
>>>> I don't have detailed understanding of the "jniwrapper" package, but I
>>>> tend to agree with you. In the native code, CreateJ::Checksum and
>>>> CreateJ::PropertyMap are basically doing the same thing, so there is
>>>> no reason why Checksums would remain referenced while HashMaps
>>>> properly do not.
>>>>
>>>> I've also tried to comment out all env.CallVoidMethod()-callbacks in
>>>> EditorProxy.cpp, so created object references would not even be passed
>>>> into the Java code. Still the same, Checksums remain as "JNI local
>>>> reference".
>>>>
>>>> Finally, I've tried to explicitly call DeleteLocalRef(). This /solves/
>>>> the memory leak (at least for Checksums), but I don't understand why
>>>> this is necessary and whether this is correct.
>>>>
>>>> svn_error_t*
>>>> EditorProxy::cb_alter_file(void *baton,
>>>> const char *relpath,
>>>> ...
>>>> jstring jrelpath = JNIUtil::makeJString(relpath);
>>>> SVN_JAVAHL_OLDSTYLE_EXCEPTION_CHECK(env);
>>>> jobject jchecksum = CreateJ::Checksum(checksum);
>>>> SVN_JAVAHL_OLDSTYLE_EXCEPTION_CHECK(env);
>>>> jobject jprops = CreateJ::PropertyMap(props, scratch_pool);
>>>> SVN_JAVAHL_OLDSTYLE_EXCEPTION_CHECK(env);
>>>>
>>>> jobject jcontents = NULL;
>>>> if (contents != NULL)
>>>> jcontents = wrap_input_stream(contents);
>>>>
>>>> env.CallVoidMethod(ep->m_jeditor, mid,
>>>> jrelpath, jlong(revision),
>>>> jchecksum, jcontents, jprops);
>>>>
>>>> env.DeleteLocalRef(jrelpath);
>>>> env.DeleteLocalRef(jchecksum);
>>>> env.DeleteLocalRef(jprops);
>>>> if (contents != NULL)
>>>> env.DeleteLocalRef(jcontents);
>>>> ...
>>>>
>>>>> but for some unfathomable reason, the collector keeps them
>>>>> alive for a while.
>>>> I'm not entirely sure about the exact difference of the live data in
>>>> the VM and a heap dump, but IMO the Checksums are still considered as
>>>> referenced ("JNI local reference") and hence will never be garbage
>>>> collected. The profilers confirm this.
>>>>
>>>> Given that DeleteLocalRef solves the problem, I think this is either a
>>>> bug in the jniwrapper or a bug in JNI itself.
>>> The latest code wraps the callback implementations with
>>> PushLocalFrame/PopLocalFrame; any references created within a local
>>> frame should be automatically deleted by PopLocalFrame, according to all
>>> JNI docs I can find.
>>>
>>> I can add the explicit deletions, but it's a shame that frame management
>>> wouldn't work as expected. :( So, I'm going to double-check if we're
>>> actually getting the frame management right. I can't imagine why the
>>> HashMaps and NativeInputStreams would be released, but not the Checksums.
>>>
>>> All in all, I agree with you that this looks like a JNI bug ... the
>>> trick now will be to prove that with a minimal test case and report it
>>> upstream. :)
>>>
>>> (FWIW, I'm using Java 8u45 64-bit on OSX.)
>>
>> So, interesting data point ... I moved the creation of the Checksum
>> objects after the creation of the property maps ... and now they're
>> getting garbage-collected. This is becoming extremely weird.
>
> Hah. Fixed it. http://svn.apache.org/r1676771
>
> We were not properly popping off a JNI frame in CreateJ::PropertyMap, so
> the checksum objects were referenced in a JNI frame that was left
> hanging in limbo after the call to finishReport returned.
>
> I'm running your test case now (with -Xmx24M), currently at over 350
> iterations with total memory usage stable at around 50MB with no forced
> garbage collections.
I can confirm: two subsequent heap dumps with sufficient time between
them do not differ even in a single instance :) Great!
Thanks, Brane, for staying on that issue -- I expect that this will
solve a large amount of OOMEs we are getting reported.
-Marc
Received on 2015-04-29 19:15:54 CEST