[fedora-java] aot-compile-rpm

Gary Benson gbenson at redhat.com
Wed Jul 6 15:40:56 UTC 2005


Over the past couple of days I've been writing a replacement for
aot-compile and find-and-aot-compile.  They were both good when they
were first written but the demands of Eclipse and JOnAS have exposed
numerous shortcomings.

Attached is a copy of aot-compile-rpm which I'd like to commit into
java-1.4.2-gcj-compat if everyone's agreeable.  It's an order of
magnitude more complex than aot-compile and find-and-aot-compile, but
its advantages over them are manifold:

 IT'S MUCH MORE USABLE
 =====================
  Nativifying an rpm using aot-compile-rpm is a matter of
  copy-and-paste:

    1. Remove "BuildArch: noarch"
    2. Add "BuildRequires: java-1.4.2-gcj-compat >= 1.4.2.0-Xjpp"
       and "Requires(post,postun)" on same.
    3. Add "aot-compile-rpm" to the very end of %install.
    4. Add "/usr/bin/rebuild-gcj-db %{_libdir}" to %post and %postun.
    5. Add "%attr(-,root,root) %{_libdir}/gcj/%{name}" to %files.

  With aot-compile or find-and-aot-compile step 4 was much more
  complex.  I was always uneasy about nativifying things I didn't
  myself maintain, postgresql-jdbc for example, because I was wary of
  dropping a bunch of fragile code on someone else.  No longer.

 IT FINDS JARS BY SIGNATURE RATHER THAN BY EXTENSION
 ===================================================
  find-and-aot-compile identifies jarfiles by their extension, ".jar",
  so it misses ".war", ".ear", ".rar", and anything else the Java
  world happens to invent.  aot-compile-rpm identifies jarfiles by
  opening them, so it catches them no matter what they're called.
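
  A minimal sketch of the signature check, written for a current
  Python 3 rather than the interpreter the attached script targets.
  The names looks_like_jar and make_demo_files, and the demo file
  names, are made up for illustration; only the "PK" magic and the
  META-INF/MANIFEST.MF lookup come from the attached script.

```python
import os
import tempfile
import zipfile

def looks_like_jar(path):
    """Return True if path is a zip archive that contains a manifest."""
    # Cheap check first: zip archives begin with the bytes "PK".
    with open(path, "rb") as f:
        if f.read(2) != b"PK":
            return False
    # Then confirm it really is a zip and that it carries a manifest.
    try:
        with zipfile.ZipFile(path, "r") as zf:
            zf.getinfo("META-INF/MANIFEST.MF")
    except (KeyError, zipfile.BadZipFile):
        return False
    return True

def make_demo_files(directory):
    """Create a jar with an odd name and a text file named .jar (demo only)."""
    jar = os.path.join(directory, "servlet.renametojar")
    with zipfile.ZipFile(jar, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        zf.writestr("Hello.class", b"\xca\xfe\xba\xbe")
    text = os.path.join(directory, "notes.jar")
    with open(text, "w") as f:
        f.write("not an archive")
    return jar, text

if __name__ == "__main__":
    jar, text = make_demo_files(tempfile.mkdtemp())
    print(jar, looks_like_jar(jar))
    print(text, looks_like_jar(text))
```

  The file extension plays no part in the decision, which is why the
  oddly named archive is accepted and the text file called ".jar" is
  rejected.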

  It's already found some unexpected stuff.  Tomcat, for example, has
  a couple of servlets that are disabled by default because they are
  in jarfiles called ".renametojar".  aot-compile-rpm finds and
  compiles these, so if the user renames them to enable the servlets
  then they'll be running BC-compiled code.

 IT IGNORES SUBSETTED JARFILES
 =============================
  Several packages contain jarfiles which are a subset of others.
  MX4J, for example, has the APIs in mx4j-jmx.jar, the implementation
  in mx4j-impl.jar, and both together in mx4j.jar.  aot-compile-rpm
  recognises that compiling mx4j.jar will get every class in the other
  two jars too, so it'll only compile mx4j.jar.  So, aot-compile-rpm
  compiles MX4J in half the time (and generates half the bytes) that
  find-and-aot-compile does.
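
  The subset test can be sketched as follows, again for a current
  Python 3.  Like the attached script it compares every member by
  name and CRC; the helper names and the class names inside the demo
  jars are hypothetical and are not taken from MX4J.

```python
import io
import zipfile

def entries(zf):
    """Map each member name to the CRC-32 recorded in the archive."""
    return {info.filename: info.CRC for info in zf.infolist()}

def is_subset(small, big):
    """True if every member of small is in big with the same CRC."""
    big_entries = entries(big)
    return all(big_entries.get(name) == crc
               for name, crc in entries(small).items())

def make_jar(members):
    """Build an in-memory jar from a {name: bytes} dict (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    buf.seek(0)
    return zipfile.ZipFile(buf, "r")

# Hypothetical stand-ins for an API jar, an implementation jar and
# a combined jar.
api = make_jar({"demo/Api.class": b"api"})
impl = make_jar({"demo/Impl.class": b"impl"})
both = make_jar({"demo/Api.class": b"api", "demo/Impl.class": b"impl"})

if __name__ == "__main__":
    for name, jar in (("api", api), ("impl", impl)):
        print(name, "is a subset of both:", is_subset(jar, both))
```

  Comparing CRCs rather than file contents keeps the check cheap,
  since the CRC of each member is already stored in the archive's
  directory.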

 IT WORKS AROUND THE PPC GO2 LIMIT
 =================================
  PPC machines are limited on the size of jarfiles that can be
  compiled in one go, affecting Eclipse and JacORB, as described
  at https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=158308.
  aot-compile-rpm splits large jarfiles on ppc to avoid this.
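
  A rough sketch of the splitting idea for a current Python 3.  It
  follows the rule used in the attached script: a new piece is
  started when a class file arrives and the current piece already
  holds "limit" members.  The function names and the in-memory
  archives are assumptions made for the example.

```python
import io
import zipfile

def split_members(names, limit):
    """Group member names into pieces, splitting only at class files."""
    groups, current = [], []
    for name in names:
        # Only a class file may open a new piece, and only once the
        # current piece already holds at least `limit` members.
        if current and name.endswith(".class") and len(current) >= limit:
            groups.append(current)
            current = []
        current.append(name)
    if current:
        groups.append(current)
    return groups

def write_pieces(src, groups):
    """Copy each group of members from src into its own stored zip."""
    pieces = []
    for group in groups:
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as dst:
            for name in group:
                dst.writestr(name, src.read(name))
        pieces.append(buf)
    return pieces
```

  Each piece can then be compiled to an object file on its own and
  the objects linked into a single shared library, which is what the
  attached script does with the pieces it writes to disk.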

One limitation of aot-compile and find-and-aot-compile that I have not
addressed (yet) is SMP support.  Andrew Overholt suggested generating
a Makefile and letting make handle the complex stuff, which I think is
an excellent idea.  I'd like to implement something along those lines,
but I need to get ppc64 and s390* bootstrapped first.

Cheers,
Gary
-------------- next part --------------
#!/usr/bin/env python

import os
import sys
import zipfile

PATHS = {"libdir": "/usr/lib/gcj",
         "gcj":    "/usr/bin/gcj",
         "dbtool": "/usr/bin/gcj-dbtool"}

GCJFLAGS = os.environ.get("RPM_OPT_FLAGS", "").split() + [
    "-fPIC", "-findirect-dispatch"]
LDFLAGS = ["-Wl,-Bsymbolic"]

class Error(Exception):
    pass

def aot_compile_rpm(basedir, libdir):
    """Search basedir for jarfiles, then generate solibs and class
    mappings for them all in libdir."""
    dstdir = os.path.join(basedir, libdir.strip(os.sep))
    if not os.path.isdir(dstdir):
        os.makedirs(dstdir)
    for jar in weed_jars(find_jars(basedir)):
        aot_compile_jar(jar, dstdir, libdir)

def find_jars(dir):
    """Return a list of every jarfile under a directory.  Goes on
    magic rather than file extension so we hit wars, ears, rars and
    anything else they cooked up lately."""
    def visit(jars, dir, items):
        for item in items:
            path = os.path.join(dir, item)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            # could use zipfile.is_zipfile() but this is quicker
            if open(path, "r").read(2) != "PK":
                continue
            zf = zipfile.ZipFile(path, "r")
            try:
                zf.getinfo("META-INF/MANIFEST.MF")
            except KeyError:
                continue
            assert not jars.has_key(item)
            jars[item] = zf
    jars = {}
    os.path.walk(dir, visit, jars)
    jars = [(jar.filename, jar) for jar in jars.values()]
    jars.sort()
    return [jar for path, jar in jars]

def weed_jars(jars):
    """Remove any jarfiles that are completely contained within
    another.  This is more common than you'd think, and we only
    need one nativified copy of each class after all."""
    while True:
        for jar1 in jars:
            for jar2 in jars:
                if jar1 is jar2:
                    continue
                for item2 in jar2.infolist():
                    try:
                        item1 = jar1.getinfo(item2.filename)
                    except KeyError:
                        break
                    if item1.CRC != item2.CRC:
                        break
                else:
                    warn("subsetted %s" % jar2.filename)
                    jars.remove(jar2)
                    break
            else:
                continue
            break
        else:
            break
        continue
    return jars

def aot_compile_jar(jar, dir, libdir):
    """Generate the shared library and class mapping for one jarfile.
    If the shared library already exists then it will not be
    overwritten.  This is to allow optimizer failures and the like to
    be worked around."""
    basename = os.path.basename(jar.filename)
    soname = os.path.join(dir, basename + ".so")
    if os.path.exists(soname):
        warn("not recreating %s" % soname)
    else:
        sources = split_jarfile(jar, dir)
        if sources == [jar.filename]:
            # compile and link
            system([PATHS["gcj"], "-shared"] +
                   GCJFLAGS + LDFLAGS +
                   [jar.filename, "-o", soname])
        else:
            # compile
            objects = []
            for source in sources:
                object = os.path.join(dir, os.path.basename(source) + ".o")
                system([PATHS["gcj"], "-c"] +
                       GCJFLAGS +
                       [source, "-o", object])
                objects.append(object)
            # link
            system([PATHS["gcj"], "-shared"] +
                   GCJFLAGS + LDFLAGS +
                   objects + ["-o", soname])
            # clean up
            for item in sources + objects:
                os.unlink(item)
    # dbtool
    dbname = os.path.join(dir, basename + ".db")
    system([PATHS["dbtool"], "-n", dbname, "64"])
    system([PATHS["dbtool"], "-f", dbname, jar.filename,
            os.path.join(libdir, basename + ".so")])

def split_jarfile(src, dir, split = 1500):
    """In order to avoid #158308 we must split large jarfiles on PPC."""
    if os.environ.get("RPM_ARCH") != "ppc":
        return [src.filename]
    items = src.infolist()
    if len([i for i in items if i.filename.endswith(".class")]) < split:
        return [src.filename]
    warn("splitting %s" % src.filename)
    jarfiles, dst = [], None
    for item in items:
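        # start a new piece for the first item, or when a class file
        # arrives and the current piece is already full
        if dst is None or (
            item.filename.endswith(".class") and size >= split):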
            if dst is not None:
                dst.close()
            path = os.path.join(dir, "%s.%d.jar" % (
                os.path.basename(src.filename), len(jarfiles) + 1))
            jarfiles.append(path)
            dst = zipfile.ZipFile(path, "w", zipfile.ZIP_STORED)
            size = 0
        dst.writestr(item, src.read(item.filename))
        size += 1
    dst.close()
    return jarfiles

def system(command):
    """Execute a command."""
    prefix = os.environ.get("PS4", "+ ")
    prefix = prefix[0] + prefix
    print >>sys.stderr, prefix + " ".join(command)

    status = os.spawnv(os.P_WAIT, command[0], command)
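    # spawnv returns the exit code, or minus the signal number if the
    # process was killed
    if status > 0:
        raise Error, "%s exited with code %d" % (command[0], status)
    elif status < 0:
        raise Error, "%s killed by signal %d" % (command[0], -status)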

def warn(msg):
    """Print a warning message."""
    print >>sys.stderr, "%s: warning: %s" % (
        os.path.basename(sys.argv[0]), msg)

if __name__ == "__main__":
    try:
        name = os.environ.get("RPM_PACKAGE_NAME")
        if name is None:
            raise Error, "this script is designed for use in rpm specfiles"
        arch = os.environ.get("RPM_ARCH")
        if arch == "noarch":
            raise Error, "cannot be used on noarch packages"
        buildroot = os.environ.get("RPM_BUILD_ROOT")
        if buildroot in (None, "/"):
            raise Error, "bad $RPM_BUILD_ROOT"
        aot_compile_rpm(buildroot, os.path.join(PATHS["libdir"], name))
    except Error, e:
        print >>sys.stderr, "%s: error: %s" % (
            os.path.basename(sys.argv[0]), e)
        # exit non-zero so that rpmbuild notices the failure
        sys.exit(1)

