A Basic Introduction to the Classfile API

Using the class file API to generate JVM bytecode that creates a new object, and branches based on a random number

Introduction

JEP 484 defines the class file API as a standard way of parsing, generating, and transforming Java class files. You’ll likely never need it unless you write, or plan to write, a JVM-based library, framework, or compiler. ASM will probably remain the go-to library for most developers because there’s already a lot of information about it online. As of the time of writing, the API is still in preview and will be finalized with the release of JDK 24.

This article will give a very shallow introduction to creating class files. To be clear, we will be converting the following code to JVM bytecode and running the generated bytecode with the java command:

// inside PersonRunner.java
class Person {
	private final String name;
	final int age;

	Person(String name, int age) {
		this.name = name;
		this.age = age;
	}

	public String getName() {
		return name;
	}
}

// inside PersonRunner.java
public class PersonRunner {
	public static void main(String[] args) {
		var person = new Person("Cassi", 23);
		int random = (int) (Math.random() * 2); // random number: either 0 or 1

		// prints the name if the random # is 0, or the age if it's 1
		if (random == 0) System.out.println(person.getName());
		else System.out.println(person.age);
	}
}

Big Disclaimer: If you’re looking for an in-depth understanding of the API, and JVM bytecode in general, this article is most definitely not for you. ALL the code you see here came from looking at the actual bytecode printed by javap and then trying to map it to the API, which involved lots of digging through different methods/classes/documentation.

Requirements

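  1. Java (duh) 23 or higher. The commands in this article pass --enable-preview --source 23; preview features are tied to the JDK release, so on a newer JDK the --source value has to match that JDK’s version.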

  2. An okay-ish understanding of Java

  3. An IDE that makes working with preview features easier

  4. I wouldn’t say you need to know JVM bytecode, but it will help to understand what’s going on under the hood.

The JVM

The JVM is a stack-based virtual machine - a glorified way of saying that any operation it wants to perform needs to be done with values on the stack. For example, if you want to add two numbers, you would have to do the following:

push 2 // 2 is now on the stack
push 3 // 3 is now on the stack
add 
// 2 and 3 have been popped from the stack, and the result (5) is pushed back
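
To make the push/add sequence above concrete, here is a toy stack machine in plain Java. It is only an illustration: the instruction names (push, add) and the StackMachineDemo class are made up for this sketch and are not real JVM opcodes or part of any API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackMachineDemo {
    // Runs a tiny "program" where each instruction is either "push <n>" or "add".
    static int run(String... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String instruction : program) {
            String[] parts = instruction.split(" ");
            switch (parts[0]) {
                case "push" -> stack.push(Integer.parseInt(parts[1]));
                case "add" -> {
                    int right = stack.pop();  // both operands are popped...
                    int left = stack.pop();
                    stack.push(left + right); // ...and the result is pushed back
                }
                default -> throw new IllegalArgumentException("Unknown instruction: " + instruction);
            }
        }
        return stack.pop(); // the result is whatever is left on top
    }

    public static void main(String[] args) {
        System.out.println(run("push 2", "push 3", "add")); // prints 5
    }
}
```

The real JVM works the same way in spirit, except that its instructions are typed (iadd for ints, dmul for doubles, and so on).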

Another type of machine is a register-based machine. Rather than operations being done on the stack, they’re done on registers. Most assembly languages are register-based.

There’s much more to stack machines and the JVM, but that’s enough for now. Another thing to note is that the JVM does not have a concept of different languages like Java, Scala, Groovy, etc. It only understands bytecode, so you usually have to compile with javac first. javac converts your Java code to bytecode, and the java command then starts a JVM to run it.

The code you will see here is a really dumbed-down version of what the Kotlin compiler, the Scala compiler, and the compiler of any other language that targets the JVM do.

Setting up the project

Since the class file API is still in preview, we’ll need to enable preview features when running the program. This is what our starting code will look like:

import static java.lang.classfile.ClassFile.*;
import static java.lang.constant.ConstantDescs.*;

/// Creates the Person class shown above
@SuppressWarnings("preview")
void createPerson() throws IOException {

}

/// Creates the class containing the main method shown above
@SuppressWarnings("preview")
void createPersonRunner() throws IOException {

}

@SuppressWarnings("preview")
void main() throws IOException {
	println("Hello, ClassFileAPI!");
	createPerson();
	createPersonRunner();
}
  1. Since we have to use preview features, I figured I might as well use the instance main method feature, which gets rid of the lovely public static void main(String[] args) and gives us nice methods like println and readln.

  2. The two methods will do precisely what their comments say.

  3. @SuppressWarnings("preview") is there so IntelliJ stops bugging us about preview features.

  4. Both methods that create the class files throw an IOException because we’re writing to a file. So, our main method also throws an IOException. Remember “What colour is your function?” by Bob Nystrom? This is a good example of it.

To run the above code:

# assuming the file is named Runner.java
java --enable-preview --source 23 Runner.java
# Since Java 11, a single source file can be run without compiling it first

Creating the Person class

As a reminder, the Person class is inside the PersonRunner.java file and looks like:

class Person {
	private final String name;
	final int age;

	Person(String name, int age) {
		this.name = name;
		this.age = age;
	}

	public String getName() {
		return name;
	}
}

public class PersonRunner {/*PersonRunnerStuffHere*/
}

To start, let’s save the class descriptors in top-level fields so they can easily be accessed from both methods:

// At the beginning of the file:
ClassDesc personDesc = ClassDesc.of("Person");
ClassDesc personRunnerDesc = ClassDesc.of("PersonRunner");

The first piece of API we come across is ClassDesc (from the java.lang.constant package), which (in simplified terms) is a way for the API to ensure that it always works with a valid class name. Inside the createPerson() method, we’ll add:

ClassFile.of().buildTo(Path.of(personDesc.displayName() + ".class"), personDesc,
   classBuilder -> classBuilder
   // there will be an error here because the consumer is empty
);
  1. We specify that the class file should be saved to a file with the name Person.class. The displayName() method returns the class name as a string.

  2. The third parameter is a consumer, and it will handle the logic behind creating the class.

  3. Note that at this point, the code will not compile because the classBuilder consumer is empty

Summary on javap

Before jumping into building the class, a nice tool called javap can help guide us. It comes with every Java installation, so installing anything new is unnecessary. Create a PersonRunner.java file, and add the Person class. Something like:

class Person {/*Copy/paste the content of Person class here*/
}

public class PersonRunner {}

Then run the following command to compile to class file: javac PersonRunner.java. You should see two new .class files created in the current directory, even though we compiled just one file. The reason is that every class, interface, record, etc., in the JVM gets its own (class)file, even if it’s declared inside another class.

The PersonRunner class has no members of its own yet, so we should test javap on Person.class with: javap -v -p Person.class. javap prints the contents of the class file, including its bytecode, to the terminal.

The output can be scary if it’s your first time seeing it, but the nice part is that we don’t have to do everything manually. For example, we don’t have to manually add new strings or integers to the constant pool; it’s all done automatically by the API.

You should save/redirect the output to a text file. For every instruction you write in Java code, try to see how it maps to the bytecode. Call javap on the class file that was generated by the API and compare it to the javap output for the classes that were generated by javac.


Looking at the raw Person class, we can see no extra modifiers such as private, static, etc. We could survive without explicitly setting flags, but the class would then default to public. To make it package-private, we update our class builder to:

ClassFile.of().buildTo(Path.of(personDesc.displayName() + ".class"), personDesc,
		classBuilder -> classBuilder
			.withFlags(0)
);

Nothing else has changed except the addition of withFlags(0), which explicitly says no access modifier should be used in the class. With that change, we can run our program with: java --enable-preview --source 23 Runner.java and see a Person.class file generated in the current directory.
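
Access flags are just bits in an int. As a rough illustration using plain reflection (java.lang.reflect.Modifier, not the class file API), the sketch below shows that a flag value of 0 carries no access modifier and how several flags combine with bitwise OR. The AccessFlagsDemo class and its describe helper are made up for this example.

```java
import java.lang.reflect.Modifier;

public class AccessFlagsDemo {
    // Renders a flag mask the way the modifiers would appear in Java source.
    static String describe(int flags) {
        String text = Modifier.toString(flags);
        return text.isEmpty() ? "(package-private)" : text;
    }

    public static void main(String[] args) {
        System.out.println(describe(0));                                  // (package-private)
        System.out.println(describe(Modifier.PUBLIC));                    // public
        System.out.println(describe(Modifier.PRIVATE | Modifier.FINAL));  // private final
        // Numerically, private is 0x0002 and final is 0x0010, so the mask is 0x0012.
        System.out.printf("0x%04x%n", Modifier.PRIVATE | Modifier.FINAL); // 0x0012
    }
}
```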

Note that the .class file is a binary. If you open it in IntelliJ, you’ll get a nice-to-view decompiled version of the class, which doesn’t tell us much, so we use javap to see the bytecode.


The Person class has two instance variables: private final String name and final int age. Let’s create them! Just below .withFlags(0) add:

.withField("name", CD_String, ACC_FINAL | ACC_PRIVATE)
.withField("age", CD_int, ACC_FINAL)
  1. The first parameter is the name of the field we’re storing.

  2. The second parameter is the ClassDesc that we saw earlier. It specifies the type of the field. CD_String and CD_int are just helper constants for primitives and common classes. They save us the hassle of manually writing how types are represented in bytecode, such as Ljava/lang/String; for String. If you click on one to view the source code, you’ll be taken to ConstantDescs.java from the java.lang.constant package. Many ready-made ClassDesc constants are available, such as the ones for int, void, boolean, etc. You can also see how those types are represented in bytecode. For example, int is represented as I. We’ll use CD_* constants a lot!

  3. The 3rd parameter is the access flags on the field, which looks self-explanatory. The bitwise OR operator is used to combine the flags.
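
To see the descriptor strings mentioned above without opening ConstantDescs.java, a small sketch like the following prints them. It only uses java.lang.constant, which is a regular (non-preview) package; the DescriptorDemo class and its descriptor helper are names invented for this example.

```java
import java.lang.constant.ClassDesc;
import java.lang.constant.ConstantDescs;

public class DescriptorDemo {
    // Returns the JVM descriptor string for a class description.
    static String descriptor(ClassDesc desc) {
        return desc.descriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(ConstantDescs.CD_String));  // Ljava/lang/String;
        System.out.println(descriptor(ConstantDescs.CD_int));     // I
        System.out.println(descriptor(ClassDesc.of("Person")));   // LPerson;
        System.out.println(ClassDesc.of("Person").displayName()); // Person
    }
}
```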


Now that we have the instance fields, the next thing to do according to the Person class we’re creating is to set up a constructor. Add the following directly below .withField("age",....):

.withMethodBody(INIT_NAME, MethodTypeDesc.of(CD_void, CD_String, CD_int), 0, 
   builder -> builder
      // there will be an error here because the consumer is empty
)

Make sure to use CD_void and not CD_Void. One is for the primitive type void, and the other is for the wrapper class Void. The same goes for CD_int versus CD_Integer.

  1. We’re using the .withMethodBody because constructors in Java are simply a special type of method.

  2. INIT_NAME is another helper constant in ConstantDescs.java that translates to <init>. If the name of a method is <init>, the JVM will treat it as a constructor.

  3. The second parameter: MethodTypeDesc.of(CD_void, CD_String, CD_int) describes the method signature of the constructor. The first parameter is the return type - void because all constructors in Java return void. The remaining parameters are the types of the argument the constructor takes.

  4. The third parameter is the access flag. There are no access flags, so it’s set to 0 (package-private)

  5. The fourth parameter is the builder consumer, that well … builds the constructor body
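
As a quick sanity check on the method signature described above, this sketch prints the descriptor that MethodTypeDesc produces for the constructor, and shows why CD_void and CD_Void are not interchangeable. MethodDescriptorDemo and constructorDescriptor are names made up for this example.

```java
import java.lang.constant.MethodTypeDesc;

import static java.lang.constant.ConstantDescs.CD_String;
import static java.lang.constant.ConstantDescs.CD_Void;
import static java.lang.constant.ConstantDescs.CD_int;
import static java.lang.constant.ConstantDescs.CD_void;

public class MethodDescriptorDemo {
    // Descriptor of Person(String name, int age): return type first, then the parameters.
    static String constructorDescriptor() {
        return MethodTypeDesc.of(CD_void, CD_String, CD_int).descriptorString();
    }

    public static void main(String[] args) {
        System.out.println(constructorDescriptor());      // (Ljava/lang/String;I)V
        System.out.println(CD_void.descriptorString());   // V
        System.out.println(CD_Void.descriptorString());   // Ljava/lang/Void;
    }
}
```

In the descriptor, the parameter types sit inside the parentheses and the return type comes last, which is the opposite order from MethodTypeDesc.of.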

To build the constructor body, we need to call the super constructor, save the first parameter received in the name field, save the second parameter received in the age field, and return nothing. In Java code, the compiler automatically inserts the call to the super constructor (Object) and the final return for us. Replace the error comment with the following:

.aload(0)
.invokespecial(CD_Object, INIT_NAME, MethodTypeDesc.of(CD_void))
.aload(0)
.aload(1)
.putfield(personDesc, "name", CD_String)
.aload(0)
.iload(2)
.putfield(personDesc, "age", CD_int)
.return_()

Remember when I said the JVM is a stack-based machine? It shows here!

  1. .aload(0):

    1. The constructor is a method, and every method stores its parameters in “slots” starting from slot 0.

    2. The value in slot 0 of the constructor is this. In fact, the value in the 0th slot of all non-static methods is this.

    3. aload(n) loads a reference onto the stack. In this case, we’re loading the this reference in slot 0

  2. .invokespecial(CD_Object,...):

    1. This calls the constructor for the Object superclass. Although we aren’t explicitly extending the Object class, all classes in Java implicitly extend the Object class, so we need to call the Object constructor.

    2. invokespecial is used mostly to call constructors. The first parameter is the owner of the constructor to call, the second parameter is the constructor’s name, and the third is the method signature.

    3. The Object constructor doesn’t take any parameters, so we use MethodTypeDesc.of(CD_void).

    4. Note that invokespecial will pop this that was loaded with aload(0) and use it for its operations.

  3. .aload(0): Load this onto the stack again because we want to modify this.name.

  4. aload(1): Load the constructor’s first parameter onto the stack. In our case, the String name parameter will be loaded onto the stack

  5. .putfield(personDesc, "name", CD_String):

    1. Set this.name to the value on the stack.

    2. putfield is used to set instance fields. In our case, it uses the last 2 values on the stack for its operations.

    3. The first parameter to putfield is the owner of the field, i.e. the Person class, the second is the name of the field, and the third is the type of the field.

    4. Note that this and the first parameter have been popped from the stack.

  6. Repeat the same for the age field. The only difference is that we use iload(2) instead of aload(1) because the age is an int, so we use the int-specific instruction to load it onto the stack.

  7. .return_(): Return nothing.

That wraps up the constructor! You can run the program with java --enable-preview --source 23 Runner.java and examine the Person.class file in the directory.


Next on the Person class is creating the getName method. This method looks similar to the constructor because they’re both methods. Add the following below the constructor:

.withMethodBody("getName", MethodTypeDesc.of(CD_String), ACC_PUBLIC,
   builder -> builder
	.aload(0)
	.getfield(personDesc, "name", CD_String)
	.areturn()
)

There is nothing new here except the content of the builder consumer:

  1. .aload(0): Recall that the first parameter of all non-static methods is this. We load this onto the stack to access this.name.

  2. .getfield(personDesc, "name", CD_String): Get the value of this.name and push it onto the stack. Note that this has now been popped from the stack, and the stack now contains the value of this.name.

  3. .areturn(): Return the value on the stack. areturn() is used to return a reference type. If the method were returning an int, we would use ireturn().


There’s one more thing that’s easy to forget. If you look at the javap output, there’s a SourceFile attribute at the end. We should also add that to our Person class to record that it was compiled from PersonRunner.java. Just after the getter method for name, add:

.with(SourceFileAttribute.of(personRunnerDesc.displayName() + ".java"))

That’s it for the Person class! If you run the Person class with java Person, you should see an error message saying no main method was found, which is expected.

Hopefully, at this point, any compiler-minded people can see how code generation can be done with the API.

Creating the PersonRunner class

With the brief knowledge we have from creating the Person class, the PersonRunner class will be fairly easy. Recall that it looks like this:

class Person {/*Person class stuff here*/
}

public class PersonRunner {
	public static void main(String[] args) {
		var person = new Person("Cassi", 23);
		int random = (int) (Math.random() * 2); // random number: either 0 or 1

		if (random == 0) System.out.println(person.getName());
		else System.out.println(person.age);
	}
}

We start the same way we started the Person class, with just a little extra info. In createPersonRunner(), add:

ClassFile.of().buildTo(Path.of(personRunnerDesc.displayName() + ".class"),
    personRunnerDesc,
    classBuilder -> classBuilder
        .with(SourceFileAttribute.of(personRunnerDesc.displayName() + ".java"))
        .withFlags(ACC_PUBLIC)
);

Again, there’s nothing new here. We’re simply setting the source file attribute, and making the PersonRunner class public.


We can’t create the main method just yet. If a class declares no constructor, javac automatically creates a no-arg constructor, so we should do the same. Directly below .withFlags(ACC_PUBLIC), add:

.withMethodBody(INIT_NAME, MTD_void, ACC_PUBLIC, builder -> builder
   .aload(0)
   .invokespecial(CD_Object, INIT_NAME, MTD_void)
   .return_()
)

The above is similar to what we’ve seen before. The only new thing is the use of MTD_void, a helper constant representing a method signature that returns void and takes no parameters. It is equivalent to MethodTypeDesc.of(CD_void). It also comes from the ConstantDescs.java file.


Now, we can create the main method. Again, it’s very similar to what we’ve seen earlier. After the constructor, add:

.withMethodBody("main", MethodTypeDesc.of(CD_void, CD_String.arrayType()), 
   ACC_PUBLIC | ACC_STATIC, builder -> builder
     .return_()
)
  1. Recall that Java requires the lovely public static void main (String[] args) to let the JVM know where to start executing code from.

  2. CD_String.arrayType() is a helper method that returns the ClassDesc for String[]. Every ClassDesc can be converted to an array type by calling the arrayType() method. An array type is represented in the JVM by appending [ to the type. For example, String is represented as Ljava/lang/String;, so String[] will be represented as [Ljava/lang/String;.

  3. The main method returns void, so we add a .return_() at the end. All the code we add from here on should go above the .return_().
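
To double-check the descriptor for main, the sketch below builds it twice: once from ClassDesc constants, as in the article, and once from real Class objects via java.lang.invoke.MethodType. Both routes should agree. MainDescriptorDemo and its two helper methods are names invented for this example.

```java
import java.lang.constant.MethodTypeDesc;
import java.lang.invoke.MethodType;

import static java.lang.constant.ConstantDescs.CD_String;
import static java.lang.constant.ConstantDescs.CD_void;

public class MainDescriptorDemo {
    // Built from nominal descriptors, the way the article does it.
    static String fromDescs() {
        return MethodTypeDesc.of(CD_void, CD_String.arrayType()).descriptorString();
    }

    // Built from loaded classes, as a cross-check.
    static String fromClasses() {
        return MethodType.methodType(void.class, String[].class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(fromDescs());   // ([Ljava/lang/String;)V
        System.out.println(fromClasses()); // ([Ljava/lang/String;)V
    }
}
```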


Now, we can start adding the logic to the main method. We first need to create the Person object with the name “Cassi” and age 23. Above the .return_(), add:

.new_(personDesc)
.dup()
.ldc("Cassi")
.ldc(23)
.invokespecial(personDesc, INIT_NAME,
   MethodTypeDesc.of(CD_void, CD_String, CD_int)
)
.astore(1)
  1. .new_(personDesc): Create a new object of the Person class. The first parameter is the ClassDesc of the class to create. Note that the object is uninitialized, and we need to call the constructor to set the appropriate values.

  2. .dup(): Duplicate the value on top of the stack. In our case, we’re duplicating the new object that was created. If we don’t duplicate the object, the constructor call will pop the object from the stack, and we’ll lose the reference forever

  3. .ldc("Cassi"), .ldc(23): Load the string “Cassi” and the integer 23 onto the stack. ldc is used to load an item from the constant pool onto the stack. If the item is already in the constant pool, a reference to it will be used. If not, it will be added to the constant pool.

  4. .invokespecial(personDesc, INIT_NAME,...:

    1. We’ve seen this earlier. We’re calling the constructor of the Person class and specifying the type of parameters it needs.

    2. Note that the operation pops the number of arguments it needs from the stack, AND the object being initialized, which is why we duplicated the reference and loaded the arguments onto the stack before calling the constructor.

  5. .astore(1):

    1. At this point, we have a Person object with the name “Cassi” and age 23 on top of the stack

    2. astore(1) is used to store a reference in slot 1.

    3. We’re not storing in slot 0 because slot 0 contains the command line argument String[] args. Recall that the main method is static, so this is not available.
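
The role of dup is easier to see when the operand stack is traced by hand. The following is a toy model only: it tracks strings on a Deque instead of real references, and NewDupDemo and simulate are names made up for this sketch. It walks through new, dup, ldc, ldc, invokespecial, and astore, with and without the dup.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class NewDupDemo {
    // Traces the operand stack and returns what ends up in local slot 1.
    static String simulate(boolean withDup) {
        Deque<String> stack = new ArrayDeque<>();
        String[] locals = new String[2];          // slot 0 = args, slot 1 = person

        stack.push("Person@ref");                 // new Person (uninitialized)
        if (withDup) stack.push(stack.peek());    // dup: second copy of the same reference
        stack.push("\"Cassi\"");                  // ldc "Cassi"
        stack.push("23");                         // ldc 23

        // invokespecial Person.<init>(String, int) pops both arguments and the receiver.
        stack.pop();
        stack.pop();
        stack.pop();

        // astore 1 pops the remaining reference into slot 1.
        // Without dup nothing is left; this toy returns null, whereas real
        // bytecode like that would be rejected when the class is verified.
        locals[1] = stack.isEmpty() ? null : stack.pop();
        return locals[1];
    }

    public static void main(String[] args) {
        System.out.println(simulate(true));  // Person@ref
        System.out.println(simulate(false)); // null
    }
}
```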


We have our Person object; let’s generate the random number! This is the corresponding Java code: int random = (int) (Math.random() * 2); // random number: either 0 or 1

Just after astore(1), add:

.invokestatic(ClassDesc.ofDescriptor("Ljava/lang/Math;"), "random", MethodTypeDesc.of(CD_double))
.ldc(2.0)
.dmul()
.d2i()
.istore(2)
.iload(2)
  1. .invokestatic(ClassDesc...:

    1. This is how we call a static method, which is very similar to invokespecial.

    2. The first parameter is the ClassDesc containing the method we want to call. Since the ClassDesc of the Math class was not defined for us, we had to create it.

    3. The second parameter is the method’s name to call, and the third is the method signature. Math.random() returns a double.

    4. Note that the top of the stack now contains the random number

  2. .ldc(2.0): Load the double 2.0 onto the stack.

  3. .dmul(): Multiply the two doubles on the stack. The result is pushed back onto the stack, and the two doubles are popped.

  4. .d2i(): Convert the double on the stack to an int. The double is popped, and the int is pushed.

  5. .istore(2): Store the int in slot 2. We use istore because the random number is now int. Slot 1 is taken by the Person object (and slot 0 by the command line arguments).

  6. .iload(2): Load the int from slot 2 onto the stack so we can later use it in the if condition.
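
For reference, here is what the dmul and d2i steps compute, written as ordinary Java. Casting a double to an int truncates toward zero, which is why the result can only be 0 or 1. RandomBitDemo and its helper methods are names made up for this sketch.

```java
public class RandomBitDemo {
    // Same expression as in PersonRunner: multiply by 2.0, then truncate to int.
    static int randomBit() {
        return (int) (Math.random() * 2);
    }

    // The cast compiles to d2i, which drops the fractional part.
    static int truncate(double value) {
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(truncate(0.5));  // 0
        System.out.println(truncate(1.99)); // 1
        System.out.println(randomBit());    // either 0 or 1
    }
}
```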

At this point, we have either 0 or 1 on the stack. Time for the if condition! Below iload(2), add:

.ifThenElse(
   Opcode.IFEQ,
   ifBlock -> {},
   elseBlock -> {}
)

Everything is pretty self-explanatory when you read it. Opcode.IFEQ pops the int on top of the stack and compares it with 0. If it is equal to 0, the ifBlock is executed; if not, the elseBlock is executed. The ifBlock and elseBlock are consumers.

There is a tiny optimization that can be done to remove 2 instructions. Can you find it?

Run javap on the PersonRunner class and compare the way the if statement is done in bytecode, and the way it’s done in the classfile API.


We’re almost done! We just need to print the name if the random number is 0, or the age if it’s 1.

Before printing though, we need the ClassDesc of PrintStream and the System class. At the top of the createPersonRunner() method, add:

ClassDesc CD_System = ClassDesc.ofDescriptor("Ljava/lang/System;");
ClassDesc CD_PrintStream = ClassDesc.ofDescriptor("Ljava/io/PrintStream;");

We also need to understand how System.out.println() works before attempting to emit bytecode for it.

  1. System is a class.

  2. out is a public static field in System class, and has type PrintStream

  3. println() is an overloaded non-static method on the PrintStream class that can take various types as a parameter.

  4. So, to call System.out.println(), we

    1. get out (no pun intended) - a static field
    2. call println() - a virtual/instance method i.e. instance method on out
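
The two steps listed above (read a static field, then call an instance method on it) can be checked with plain reflection, without touching bytecode. This is only a sketch to confirm the shape of System.out.println; PrintlnLookupDemo and its helper methods are names made up for this example.

```java
import java.io.PrintStream;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class PrintlnLookupDemo {
    // Step 1 (getstatic): "out" is a static field of System whose type is PrintStream.
    static boolean outIsStaticPrintStream() {
        try {
            Field out = System.class.getField("out");
            return Modifier.isStatic(out.getModifiers()) && out.getType() == PrintStream.class;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 2 (invokevirtual): println is an instance method with one overload per parameter type.
    static boolean printlnIsInstanceMethod(Class<?> parameterType) {
        try {
            Method println = PrintStream.class.getMethod("println", parameterType);
            return !Modifier.isStatic(println.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(outIsStaticPrintStream());              // true
        System.out.println(printlnIsInstanceMethod(String.class)); // true
        System.out.println(printlnIsInstanceMethod(int.class));    // true
    }
}
```

The String and int lookups find two different overloads, which is why the bytecode has to name the exact signature it wants.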

Confusing? Hopefully, the code will clear it up. Inside the ifBlock consumer, add the following:

// bytecode for System.out.println(person.getName())
ifBlock
   .getstatic(CD_System, "out", CD_PrintStream)
   .aload(1)
   .invokevirtual(personDesc, "getName", MethodTypeDesc.of(CD_String))
   .invokevirtual(CD_PrintStream, "println", MethodTypeDesc.of(CD_void, CD_String));
  1. .getstatic(CD_System, "out", CD_PrintStream): Get the static field out from the System class. The last parameter is the type of out. The top of the stack now contains the PrintStream object.

  2. .aload(1): Recall that we saved the Person object in slot 1. Here, we’re simply loading it back onto the stack so we can call getName on it.

  3. .invokevirtual(personDesc, "getName",...: Call getName on the Person object. The Person reference is popped, and the returned String is pushed. The top of the stack now contains the name, and the PrintStream object gotten in #1 is below it.

  4. .invokevirtual(CD_PrintStream, "println",...:

    1. Very similar to invokestatic and invokespecial, but for instance methods.

    2. Call the println method on the PrintStream object.

    3. The first parameter is the ClassDesc of the class the method is in.

    4. The second parameter is the method’s name, and the third is the method signature. Keep in mind that println is an overloaded method, so the signature is what selects the String version.

    5. The PrintStream object and the name are popped from the stack, and the name is printed to the console.

And finally, the else block:

// bytecode for System.out.println(person.age)
elseBlock
  .getstatic(CD_System, "out", CD_PrintStream)
  .aload(1)
  .getfield(personDesc, "age", CD_int)
  .invokevirtual(CD_PrintStream, "println", MethodTypeDesc.of(CD_void, CD_int));
  1. Very similar to the ifBlock, with the only difference being that the invokevirtual call to getName is replaced with a getfield that reads the age field.
  2. Once again, println is overloaded, so we can’t reuse the signature from the ifBlock. We have to specify the exact method signature we want to call, which here is the one that takes an int.

Run your program with java --enable-preview --source 23 Runner.java and if everything goes well, you should have a PersonRunner.class file. Run that file with java PersonRunner and you should see either “Cassi” or 23 printed to the console.

Closing words

This was a fun and interesting API to learn without any tutorials, especially since I worked on it while trying to escape working on my capstone.

Escaping a capstone project focused on programming by learning something else that is also focused on programming. Smart move?

Going off documentation and trial/error felt like how developers in the ’80s/’90s/early 2000s learned stuff. No StackOverflow, ChatGPT, or YouTube tutorials that hold your hand every step of the way. Just you, life, a bad chair, and documentation. Close enough?


Currently, the only use case I have for it is in a compiler I’m working on, which emits C code. Maybe I’ll create a bytecode target? (I won’t)
