Monday, December 12, 2011

Estimating Java Object Sizes with Instrumentation

Most Java developers who come from a C/C++ background have probably wished at one time or another for a Java equivalent of sizeof(). Although Java lacks a true sizeof() equivalent, the Instrumentation interface introduced with J2SE 5 can be used to get an estimate of the size of a particular object via its getObjectSize(Object) method. This approach measures only the object itself and does not take into account the sizes of the objects it references, but code can be built to traverse those references and calculate an estimated total size.

The Instrumentation interface provides several methods, but the focus of this post is the getObjectSize(Object) method. Its Javadoc documentation describes it as follows:

Returns an implementation-specific approximation of the amount of storage consumed by the specified object. The result may include some or all of the object's overhead, and thus is useful for comparison within an implementation but not between implementations. The estimate may change during a single invocation of the JVM.

This description tells us what the method does (provides an "implementation-specific approximation" of the specified object's size), its potential inclusion of overhead in the approximated size, and its potentially different values during a single JVM invocation.

It's fairly obvious that one can call Instrumentation.getObjectSize(Object) on an object to get its approximate size, but how does one access an instance of Instrumentation in the first place? The package documentation for the java.lang.instrument package provides the answer (and is an example of an effective Javadoc package description).

The package-level documentation for the java.lang.instrument package describes two ways an implementation might allow use of JVM instrumentation. The first approach (and the one highlighted in this post) is to specify an instrumentation agent on the command line. The second approach is to use an instrumentation agent with an already running JVM. The package documentation goes on to give a high-level overview of each approach. In each approach, a specific entry is required in the agent JAR's manifest file to specify the agent class: Premain-Class for the command-line approach and Agent-Class for the post-JVM-startup approach. The agent class must implement a specific method in each case: premain for command-line startup or agentmain for post-JVM startup.

The next code listing features the Java code for the instrumentation agent. The class includes both a premain (command-line agent) method and an agentmain (post-JVM-startup agent) method, though only premain will be demonstrated in this post.

package dustin.examples;

import static java.lang.System.out;

import java.lang.instrument.Instrumentation;

/**
 * Simple example of an Instrumentation Agent adapted from blog post
 * "Instrumentation: querying the memory usage of a Java object"
 * (http://www.javamex.com/tutorials/memory/instrumentation.shtml).
 */
public class InstrumentationAgent
{
   /** Handle to instance of Instrumentation interface. */
   private static volatile Instrumentation globalInstrumentation;

   /**
    * Implementation of the overloaded premain method that is first invoked by
    * the JVM during use of instrumentation.
    * 
    * @param agentArgs Agent options provided as a single String.
    * @param inst Handle to the Instrumentation instance supplied by the JVM.
    */
   public static void premain(final String agentArgs, final Instrumentation inst)
   {
      out.println("premain...");
      globalInstrumentation = inst;
   }

   /**
    * Implementation of the overloaded agentmain method that is invoked for
    * accessing instrumentation of an already running JVM.
    * 
    * @param agentArgs Agent options provided as a single String.
    * @param inst Handle to the Instrumentation instance supplied by the JVM.
    */
   public static void agentmain(String agentArgs, Instrumentation inst)
   {
      out.println("agentmain...");
      globalInstrumentation = inst;
   }

   /**
    * Provide the memory size of the provided object (but not its components).
    * 
    * @param object Object whose memory size is desired.
    * @return The size of the provided object, not counting its components
    *    (described in Instrumentation.getObjectSize(Object)'s Javadoc as "an
    *    implementation-specific approximation of the amount of storage consumed
    *    by the specified object").
    * @throws IllegalStateException Thrown if the agent's Instrumentation handle
    *    is null (the agent was never initialized).
    */
   public static long getObjectSize(final Object object)
   {
      if (globalInstrumentation == null)
      {
         throw new IllegalStateException("Agent not initialized.");
      }
      return globalInstrumentation.getObjectSize(object);
   }
}

The agent class above exposes a statically available method for accessing Instrumentation.getObjectSize(Object). The next code listing demonstrates a simple 'application' that makes use of it.

package dustin.examples;

import static java.lang.System.out;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.List;

/**
 * Build up some sample objects and throw them at the Instrumentation example.
 * 
 * Might run this class as shown next:
 * java -javaagent:dist\agent.jar -cp dist\agent.jar dustin.examples.InstrumentSampleObjects 
 * 
 * @author Dustin
 */
public class InstrumentSampleObjects
{
   public enum Color
   {
      RED,
      WHITE,
      YELLOW
   }

   /** Empty class used to observe the size of an object with no fields. */
   private static class EmptyClass
   {
   }

   /**
    * Print basic details including size of provided object to standard output.
    * 
    * @param object Object whose value and size are to be printed to standard
    *   output.
    */
   public static void printInstrumentationSize(final Object object)
   {
      out.println(
           "Object of type '" + object.getClass() + "' has size of "
         + InstrumentationAgent.getObjectSize(object) + " bytes.");
   }

   /**
    * Main executable function.
    * 
    * @param arguments Command-line arguments; none expected.
    */
   public static void main(final String[] arguments)
   {
      final StringBuilder sb = new StringBuilder(1000);
      final boolean falseBoolean = false;
      final int zeroInt = 0;
      final double zeroDouble = 0.0;
      final Long zeroLong = 0L;
      final long zeroLongP = 0L;
      final Long maxLong = Long.MAX_VALUE;
      final Long minLong = Long.MIN_VALUE;
      final long maxLongP = Long.MAX_VALUE;
      final long minLongP = Long.MIN_VALUE;
      final String emptyString = "";
      final String string = "ToBeOrNotToBeThatIsTheQuestion";
      final String[] strings = {emptyString, string, "Dustin"};
      final String[] moreStrings = new String[1000];
      final List<String> someStrings = new ArrayList<String>();
      final EmptyClass empty = new EmptyClass();
      final BigDecimal bd = new BigDecimal("999999999999999999.99999999");
      final Calendar calendar = Calendar.getInstance();

      printInstrumentationSize(sb);
      printInstrumentationSize(falseBoolean);
      printInstrumentationSize(zeroInt);
      printInstrumentationSize(zeroDouble);
      printInstrumentationSize(zeroLong);
      printInstrumentationSize(zeroLongP);
      printInstrumentationSize(maxLong);
      printInstrumentationSize(maxLongP);
      printInstrumentationSize(minLong);
      printInstrumentationSize(minLongP);
      printInstrumentationSize(emptyString);
      printInstrumentationSize(string);
      printInstrumentationSize(strings);
      printInstrumentationSize(moreStrings);
      printInstrumentationSize(someStrings);
      printInstrumentationSize(empty);
      printInstrumentationSize(bd);
      printInstrumentationSize(calendar);
      printInstrumentationSize(Color.WHITE);
   }
}
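
One detail of the listing above is worth spelling out: printInstrumentationSize accepts an Object, so the primitive arguments (falseBoolean, zeroInt, zeroDouble, and the long primitives) are autoboxed before the call. The sizes reported for them are therefore those of the wrapper objects (Boolean, Integer, Double, Long) rather than of the primitives themselves. The following minimal sketch, which needs no agent, shows the runtime type that actually arrives at such a method. The class name BoxingDemo and the method runtimeTypeOf are made up for illustration and are not part of the example code above.

```java
/** Hypothetical helper (illustration only) showing autoboxing at an Object parameter. */
final class BoxingDemo
{
   /** Returns the runtime class name of whatever reaches an Object-typed parameter. */
   static String runtimeTypeOf(final Object object)
   {
      return object.getClass().getName();
   }

   public static void main(final String[] arguments)
   {
      final long zeroLongP = 0L;   // primitive: boxed to Long at the call site
      final Long zeroLong = 0L;    // already a wrapper object

      System.out.println(runtimeTypeOf(zeroLongP));   // java.lang.Long
      System.out.println(runtimeTypeOf(zeroLong));    // java.lang.Long
      System.out.println(runtimeTypeOf(false));       // java.lang.Boolean
      System.out.println(runtimeTypeOf(0.0));         // java.lang.Double
   }
}
```

This is why the primitive and wrapper variants of the same value can be expected to report the same size in the sample application.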

To use the instrumentation agent via command-line startup, I need to ensure that a simple manifest file is included in the agent JAR. For the agent class in this case (dustin.examples.InstrumentationAgent), it might look like what follows in the next code listing. Although I only need the Premain-Class entry for command-line startup of the agent, I have included Agent-Class as an example of what the post-JVM-startup agent requires. It doesn't hurt anything to have both present, just as it did not hurt anything to have both premain and agentmain methods defined in the agent class. There are prescribed rules for which of these is first attempted based on the type of agent being used.

Premain-Class: dustin.examples.InstrumentationAgent
Agent-Class: dustin.examples.InstrumentationAgent

To place this manifest file into the JAR, I could run jar with the cmf options, supplying the name of the manifest file, the name of the JAR, and the Java classes to be archived. However, it's arguably easier to do with Ant, and Ant is certainly preferable when this must be done repeatedly. A simple use of the Ant jar task with the manifest sub-element is shown next.

   <target name="jar"
           description="Package compiled classes into JAR file"
           depends="compile">
      <jar destfile="${dist.dir}/${jar.name}"
           basedir="${classes.dir}"
           filesonly="${jar.filesonly}">
         <manifest>
            <attribute name="Premain-Class"
                       value="dustin.examples.InstrumentationAgent"/>
            <attribute name="Agent-Class"
                       value="dustin.examples.InstrumentationAgent"/>
         </manifest>
      </jar>
   </target>
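
For readers who use neither Ant nor a hand-written manifest file, the same two entries can also be produced with the JDK's own java.util.jar classes. The sketch below is only an illustration of that idea, not how the JAR in this post was built: the class name AgentManifestSketch and its two methods are invented for the example. It builds a Manifest in memory and renders it as text; a Manifest built this way could be handed to the JarOutputStream(OutputStream, Manifest) constructor when assembling a JAR in code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical sketch (illustration only): build an agent manifest in code. */
final class AgentManifestSketch
{
   /** Builds a manifest naming the same class for both agent entry points. */
   static Manifest buildAgentManifest(final String agentClassName)
   {
      final Manifest manifest = new Manifest();
      final Attributes mainAttributes = manifest.getMainAttributes();
      // Manifest.write skips the main attributes unless a version attribute is set.
      mainAttributes.put(Attributes.Name.MANIFEST_VERSION, "1.0");
      mainAttributes.putValue("Premain-Class", agentClassName);
      mainAttributes.putValue("Agent-Class", agentClassName);
      return manifest;
   }

   /** Renders the manifest exactly as it would be written into a JAR. */
   static String asText(final Manifest manifest)
   {
      final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try
      {
         manifest.write(bytes);
      }
      catch (IOException ioEx)
      {
         throw new UncheckedIOException(ioEx);
      }
      return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
   }

   public static void main(final String[] arguments)
   {
      System.out.print(
         asText(buildAgentManifest("dustin.examples.InstrumentationAgent")));
   }
}
```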

With the JAR built, I can easily run the application with the Java launcher by specifying the Java agent (-javaagent):

java -javaagent:dist\Instrumentation.jar -cp dist\Instrumentation.jar dustin.examples.InstrumentSampleObjects

The next screen snapshot shows the output.

The above output shows some of the estimated sizes of various objects such as BigDecimal, Calendar, and others.
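
As noted at the start of this post, getObjectSize(Object) reports only the object itself, so an estimated total size requires walking the references. The following is a minimal sketch of such a traversal under stated assumptions, not a hardened implementation. The class DeepSizeSketch, its nested Node class, and the deepSizeOf method are all invented for illustration. The shallow-size function is passed in as a parameter so that the sketch runs without an agent; with the agent from this post loaded, one could pass InstrumentationAgent::getObjectSize instead. Fields that cannot be made accessible (for example, JDK internals on newer Java versions) are simply skipped, and shared objects such as enum constants are counted if reachable, so the result is only a rough estimate.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.ToLongFunction;

/** Hypothetical sketch (illustration only): sum shallow sizes over reachable objects. */
final class DeepSizeSketch
{
   /** Tiny linked node used only to exercise the traversal. */
   static final class Node
   {
      int value;
      Node next;
   }

   static long deepSizeOf(final Object root, final ToLongFunction<Object> shallowSizer)
   {
      // Identity-based set so each object is counted once, even in cycles.
      final Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());
      final Deque<Object> pending = new ArrayDeque<>();
      if (root != null)
      {
         pending.push(root);
      }
      long total = 0;
      while (!pending.isEmpty())
      {
         final Object current = pending.pop();
         if (!visited.add(current))
         {
            continue;   // already counted
         }
         total += shallowSizer.applyAsLong(current);
         final Class<?> type = current.getClass();
         if (type.isArray())
         {
            // Primitive arrays hold no references; object arrays are walked.
            if (!type.getComponentType().isPrimitive())
            {
               for (final Object element : (Object[]) current)
               {
                  if (element != null)
                  {
                     pending.push(element);
                  }
               }
            }
            continue;
         }
         // Visit non-static reference fields declared anywhere in the hierarchy.
         for (Class<?> c = type; c != null; c = c.getSuperclass())
         {
            for (final Field field : c.getDeclaredFields())
            {
               if (Modifier.isStatic(field.getModifiers()) || field.getType().isPrimitive())
               {
                  continue;
               }
               try
               {
                  field.setAccessible(true);
                  final Object value = field.get(current);
                  if (value != null)
                  {
                     pending.push(value);
                  }
               }
               catch (RuntimeException | IllegalAccessException inaccessible)
               {
                  // Field cannot be read reflectively; leave it out of the estimate.
               }
            }
         }
      }
      return total;
   }

   public static void main(final String[] arguments)
   {
      final Node first = new Node();
      final Node second = new Node();
      first.next = second;
      second.next = first;   // cycle: must not loop forever
      // Counting 1 per object shows how many objects were visited.
      System.out.println(deepSizeOf(first, object -> 1L));   // prints 2
   }
}
```

More complete treatments of this idea appear in the resources listed next.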

There are several useful resources related to the topic of this post. The java.sizeOf Project is "a little java agent what use the package java.lang.Instrument introduced in Java 5 and is released under GPL license." Dr. Heinz M. Kabutz's Instrumentation Memory Counter provides a significantly more sophisticated example than my post of using the Instrumentation interface to estimate object sizes. Instrumentation: querying the memory usage of a Java object provides a nice overview of this interface and provides a link to the Classmexer agent, "a simple Java instrumentation agent that provides some convenience calls for measuring the memory usage of Java objects from within an application." The posts How much memory the java objects consume? and Estimating the memory usage of a java object are also related.

4 comments:

Stephen Connolly said...

https://github.com/jbellis/jamm is a good implementation that is somewhat battle hardened as it is used by Apache Cassandra

@DustinMarx said...

Stephen,

Thanks for pointing out Java Agent for Memory Measurements (jamm). I was not aware of its existence or that it is used by Cassandra.

Dustin

Venki said...

I don't understand the manifest creation part. How do I create a manifest without using Ant?

PS: It's a brilliant tutorial by the way.

@DustinMarx said...

Mclovin,

Thanks for the kind words.

A manifest file is simply a text file with the name MANIFEST.MF and is present in all JAR files. Often, there is a simple default version applied, but you can create the MANIFEST.MF file in your favorite IDE or text editor and add it to your JAR manually.

You can use the jar command with the m option to add a manifest file to the JAR when it is assembled. For example, you could take the two-line manifest contents shown in my post above, place them in a text file called MANIFEST.MF or even something else, and then place that file in your JAR using the jar command as described here. Be sure that the last line of the text file used as the manifest has a carriage return at its end.

Dustin