Hi Andrew and others,
I spent a lot of time a few months ago working on $subj. After many dead ends, I decided it was more prudent to spend my time working on other issues. I have now returned to this issue and would like to pass it off to those who will have more luck than I did figuring out what is going wrong.
Background information:
~~~~~~~~~~~~~~~~~~~~~~
RH Bug #161483 [1]
CVS checkouts in Eclipse take an extremely long time to finish when run under gij. The issue occurs both when the relevant jars are natively-compiled and when they are not (although the problem is exacerbated when run with only bytecode). I have found a few modules that replicate the issue but the one I have used the most is GNU Classpath. The problem is independent of network issues as checkouts from a local CVS server exhibit the same behaviour.
Things to note:
~~~~~~~~~~~~~~
The problem appears to be related to some sort of thread contention. I wrote a headless Eclipse CVS client (using the same code as the GUI CVS client) that does not reproduce the problem. Following the actual checkout of the files from CVS, Eclipse creates synchronization information for the files it has just checked out. Since this is an initial checkout, all folders are considered dirty so it descends into each folder. One of my original theories was that we were incorrectly computing dirty resources (or getting into some sort of recursive loop) but this is not the case. Much println'ing has told me that both gcj and the Sun JVM come up with the same list of dirty folders.
Timing information that I have accumulated shows that we are not spending an inordinate amount of time generating or writing the synchronization information but are instead spending too much time holding locks on the resources. On average we hold the locks about 100 to 1000 times longer than the Sun JVM does. A sysprof [2] sample [3] supports this: it shows a lot of time spent in Semaphore.acquire(). OProfile results corroborate this evidence.
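For anyone who wants to collect comparable numbers, here is a minimal sketch of timing how long threads wait for a lock versus how long they hold it. It is not the Eclipse code: TimedLock and its methods are hypothetical names, and java.util.concurrent.Semaphore is only a stand-in for whatever Semaphore class shows up in the profile. It also assumes a runtime that provides java.util.concurrent, which may not be the case for the gij under discussion.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical helper, not Eclipse code: times waits on and holds of a binary semaphore. */
class TimedLock {
    private final Semaphore semaphore = new Semaphore(1);
    // The fields below are only read or written while the semaphore is held.
    private long acquiredAtNanos;
    private long totalWaitNanos;
    private long totalHeldNanos;

    void acquire() {
        long requestedAt = System.nanoTime();
        semaphore.acquireUninterruptibly();
        acquiredAtNanos = System.nanoTime();
        totalWaitNanos += acquiredAtNanos - requestedAt;
    }

    void release() {
        totalHeldNanos += System.nanoTime() - acquiredAtNanos;
        semaphore.release();
    }

    /** Call the getters only after all worker threads have finished. */
    long getTotalWaitNanos() { return totalWaitNanos; }

    long getTotalHeldNanos() { return totalHeldNanos; }

    public static void main(String[] args) throws InterruptedException {
        TimedLock lock = new TimedLock();
        Runnable worker = () -> {
            for (int i = 0; i < 5; i++) {
                lock.acquire();
                try {
                    Thread.sleep(10); // stand-in for writing sync info
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.release();
                }
            }
        };
        Thread a = new Thread(worker);
        Thread b = new Thread(worker);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("waited ms: " + lock.getTotalWaitNanos() / 1_000_000);
        System.out.println("held ms:   " + lock.getTotalHeldNanos() / 1_000_000);
    }
}
```

Running the same harness under both VMs and comparing the two totals would show whether the extra time goes into waiting for the lock or into the work done while holding it.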
gnuplot was used at one point to see if the time spent was proportional to the number of files in the directory or to the disk space of said files. No correlation was observed.
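The same question can be checked numerically rather than by eye. Below is a minimal sketch that computes the Pearson correlation coefficient between two samples, for example files per directory and seconds spent in that directory. The class name and the numbers in main are made up for illustration; they are not my measurements.

```java
/** Hypothetical helper: Pearson correlation of two equally long samples. */
class Correlation {
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double covariance = 0.0;
        double varianceX = 0.0;
        double varianceY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }
        // Yields NaN when either sample is constant (zero variance).
        return covariance / Math.sqrt(varianceX * varianceY);
    }

    public static void main(String[] args) {
        // Made-up illustration data, not measurements from the checkout.
        double[] filesPerDirectory = {3, 40, 12, 7, 25};
        double[] secondsPerDirectory = {9.1, 8.7, 9.4, 8.9, 9.0};
        System.out.println("r = " + pearson(filesPerDirectory, secondsPerDirectory));
    }
}
```

A coefficient near zero is consistent with "no correlation"; a value near 1 or -1 would point to a linear relationship.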
Note: at one point, I thought the problem was due to missing .sos for some jars because I could not duplicate the problem on my laptop. It turns out that if I changed the CVS compression level [4], I could somehow affect the timing such that it would indeed occur. Hyperthreaded machines seem impervious to this.
There appears to be no way to reduce this to a test case so we are unfortunately left with duplicating the symptoms using Eclipse itself.
How to duplicate:
~~~~~~~~~~~~~~~~
I have created a version of Anthony's excellent redhat-eclipse-demo RPM that contains a snapshot of GNU Classpath (from a while ago). This RPM will create a local pserver CVS server running on port 2402. It will also create the user anoncvs and put an exploded GNU Classpath checkout (owned by this user) in /var/lib/eclipse-demo-cvsroot. You can retrieve a copy of this RPM here:
http://overholt.ca/redhat-eclipse-demo-2.0-1.noarch.rpm
http://overholt.ca/redhat-eclipse-demo-2.0-1.src.rpm
1. Install Eclipse from either FC4 or rawhide. gcc versions do not seem to affect this problem and I've done most of my testing with 4.0.2-3 which was current in rawhide until recently.
2. Fire up Eclipse with a new workspace ... something like:
eclipse -data ~/testcvsissueworkspace
3. Checkout classpath from your local CVS server:
  File->Import->Checkout Projects from CVS
  Next
  localhost, /var/lib/eclipse-demo-cvsroot
  anoncvs, anoncvs
  pserver, 2402, check 'Save password'
  Next
  classpath
  Finish
4. Observe that checkout hangs at final synchronization
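Before step 3 it may be worth confirming that the demo RPM's pserver is actually accepting connections, so that a failed checkout is not mistaken for the hang. This is a minimal sketch: PserverCheck is a hypothetical name, and the host and port are simply the values from the steps above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether something accepts TCP connections on host:port. */
class PserverCheck {
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // localhost and 2402 come from the reproduction steps; adjust as needed.
        boolean up = isListening("localhost", 2402, 2000);
        System.out.println("pserver on localhost:2402 reachable: " + up);
    }
}
```

This only shows that the port is open. It does not speak the pserver protocol or verify the anoncvs login.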
I am at a bit of a loss as to how to proceed. Any and all help is greatly appreciated. I will provide whatever information is necessary. (I wrote this email while on a plane and lost a bit of revision I had done just before my battery died. If anything's amiss, I blame that ;)
Thank you,
Andrew
[1] https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=161483
[2] http://www.daimi.au.dk/~sandmann/sysprof/
[3] http://overholt.ca/eclipse/checkout.sysprof
[4] Window->Preferences->Team->CVS->Connection->Compression
Andrew Overholt writes:
> Hi Andrew and others,
> I spent a lot of time a few months ago working on $subj. After many dead ends, I decided it was more prudent to spend my time working on other issues. I have now returned to this issue and would like to pass it off to those who will have more luck than I did figuring out what is going wrong.
> Background information:
> RH Bug #161483 [1]
> CVS checkouts in Eclipse take an extremely long time to finish when run under gij.
FYI, I have been unable to dup this problem. It works fine for me. This suggests that it might well be some sort of race.
Andrew.
* Andrew Haley aph@redhat.com [2005-11-14 12:50]:
> Andrew Overholt writes:
> > RH Bug #161483 [1]
> > CVS checkouts in Eclipse take an extremely long time to finish when run under gij.
> FYI, I have been unable to dup this problem. It works fine for me. This suggests that it might well be some sort of race.
Gah. I guess I can't get your help, then :(
Andrew
Andrew Overholt writes:
> * Andrew Haley aph@redhat.com [2005-11-14 12:50]:
> > Andrew Overholt writes:
> > > RH Bug #161483 [1]
> > > CVS checkouts in Eclipse take an extremely long time to finish when run under gij.
> > FYI, I have been unable to dup this problem. It works fine for me. This suggests that it might well be some sort of race.
> Gah. I guess I can't get your help, then :(
I wouldn't say that. It will make it harder, though.
Andrew.
"Andrew" == Andrew Overholt overholt@redhat.com writes:
>> FYI, I have been unable to dup this problem. It works fine for me. This suggests that it might well be some sort of race.
Andrew> Gah. I guess I can't get your help, then :(
I saw this problem last night when checking out cacao. I was using 3.1.1-1jpp_2fc.
FWIW I looked and noticed that the cvs .so was not being used, so it was running interpreted. This may be due to running rawhide versions; I've noticed that the compiler bump from 4.0.1 to 4.0.2 invalidated the .so cache -- we probably should have stripped that final digit from the cache directory name.
Tom
Tom Tromey writes:
> "Andrew" == Andrew Overholt overholt@redhat.com writes:
> >> FYI, I have been unable to dup this problem. It works fine for me. This suggests that it might well be some sort of race.
> Andrew> Gah. I guess I can't get your help, then :(
> I saw this problem last night when checking out cacao. I was using 3.1.1-1jpp_2fc.
> FWIW I looked and noticed that the cvs .so was not being used, so it was running interpreted. This may be due to running rawhide versions; I've noticed that the compiler bump from 4.0.1 to 4.0.2 invalidated the .so cache -- we probably should have stripped that final digit from the cache directory name.
I think Gary already fixed this.
We now install all the target libs under /usr/lib/gcj. We used to install them under gcj-4.0.*, and this was wrong.
Andrew.
* Tom Tromey tromey@redhat.com [2005-11-15 14:02]:
> "Andrew" == Andrew Overholt overholt@redhat.com writes:
> >> FYI, I have been unable to dup this problem. It works fine for me. This suggests that it might well be some sort of race.
> Andrew> Gah. I guess I can't get your help, then :(
> I saw this problem last night when checking out cacao. I was using 3.1.1-1jpp_2fc.
> FWIW I looked and noticed that the cvs .so was not being used, so it was running interpreted. This may be due to running rawhide versions; I've noticed that the compiler bump from 4.0.1 to 4.0.2 invalidated the .so cache -- we probably should have stripped that final digit from the cache directory name.
I can't see how that happened. I've verified more than once that now that we've got -fjni in aot-compile-rpm in both FC4 and rawhide, all the jars that we have compiled (which is all of them in rawhide) should be loaded from .so and not bytecode.
Andrew
"Andrew" == Andrew Overholt overholt@redhat.com writes:
Andrew> I can't see how that happened. I've verified more than once
Andrew> that now that we've got -fjni in aot-compile-rpm in both FC4
Andrew> and rawhide, all the jars that we have compiled (which is all
Andrew> of them in rawhide) should be loaded from .so and not
Andrew> bytecode.
I just updated eclipse on that machine and I got a bunch of errors about '/usr/bin/rebuild-gcj-db: No such file or directory'. I suppose this is more to blame than the minor version increment.
Tom
* Tom Tromey tromey@redhat.com [2005-11-15 16:35]:
> "Andrew" == Andrew Overholt overholt@redhat.com writes:
> Andrew> I can't see how that happened. I've verified more than once
> Andrew> that now that we've got -fjni in aot-compile-rpm in both FC4
> Andrew> and rawhide, all the jars that we have compiled (which is all
> Andrew> of them in rawhide) should be loaded from .so and not
> Andrew> bytecode.
> I just updated eclipse on that machine and I got a bunch of errors about '/usr/bin/rebuild-gcj-db: No such file or directory'. I suppose this is more to blame than the minor version increment.
This is due to the changes in java-1.4.2-gcj-compat{,-devel}. fitzsim can probably shed some light. Perhaps in a new thread, fitzsim?
Andrew
On Tue, 2005-11-15 at 16:36 -0500, Andrew Overholt wrote:
> * Tom Tromey tromey@redhat.com [2005-11-15 16:35]:
> > "Andrew" == Andrew Overholt overholt@redhat.com writes:
> > Andrew> I can't see how that happened. I've verified more than once
> > Andrew> that now that we've got -fjni in aot-compile-rpm in both FC4
> > Andrew> and rawhide, all the jars that we have compiled (which is all
> > Andrew> of them in rawhide) should be loaded from .so and not
> > Andrew> bytecode.
> > I just updated eclipse on that machine and I got a bunch of errors about '/usr/bin/rebuild-gcj-db: No such file or directory'. I suppose this is more to blame than the minor version increment.
> This is due to the changes in java-1.4.2-gcj-compat{,-devel}. fitzsim can probably shed some light. Perhaps in a new thread, fitzsim?
Update to Rawhide java-1.4.2-gcj-compat-devel.
Tom
Tom Tromey wrote:
> I saw this problem last night when checking out cacao. I was using 3.1.1-1jpp_2fc.
Hey, let's get some jpackage rpms of cacao ;) I built both kaffe and jamvm and neither seems able to run eclipse on x86_64.
--
Sincerely,
David Walluck david@zarb.org
java-devel@lists.fedoraproject.org