A Basic Introduction to the Classfile API
Using the class file API to generate JVM bytecode that creates a new object, and branches based on a random number
Introduction
JEP 484 defines the class file API as a standard way to parse, generate, and transform Java class files. You’ll never need it unless you write (or plan to write) a JVM-based library, framework, or compiler. ASM will likely remain the go-to library for most developers because there’s already a lot of information about it online. As of the time of writing, the API is still in preview and will be finalized with the release of JDK 24.
This article gives a very shallow introduction to creating class files.
To be clear, we will convert the following code to JVM bytecode and run the generated bytecode with the java command:
// inside PersonRunner.java
class Person {
    private final String name;
    final int age;
    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
    public String getName() {
        return name;
    }
}
// inside PersonRunner.java
public class PersonRunner {
    public static void main(String[] args) {
        var person = new Person("Cassi", 23);
        int random = (int) (Math.random() * 2); // random number: either 0 or 1
        // prints the name if the random # is 0, or the age if it's 1
        if (random == 0) System.out.println(person.getName());
        else System.out.println(person.age);
    }
}
Big Disclaimer: If you’re looking for an in-depth understanding of the API and JVM bytecode in general, this article is most definitely not for you. ALL the code you see here came from looking at the actual bytecode printed by javap and then trying to map it to the API, which involved lots of digging through different methods, classes, and documentation.
Requirements
- Java 23 or higher (duh).
- An okay-ish understanding of Java.
- An IDE that makes working with preview features easier.
- I wouldn’t say you need to know JVM bytecode, but it will help to understand what’s going on under the hood.
The JVM
The JVM is a stack-based virtual machine - a glorified way of saying that any operation it wants to perform needs to be done with values on the stack. For example, if you want to add two numbers, you would have to do the following:
push 2 // 2 is now on the stack
push 3 // 3 is now on the stack
add
// 2 and 3 have been popped from the stack, and the result (5) is pushed back
Another type of machine is a register-based machine. Rather than operations being done on the stack, they’re done on registers. Most assembly languages are register-based.
There’s much more to stack machines and the JVM, but that’s enough for now.
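To make the push/push/add sequence above concrete, here is a minimal sketch of a toy stack machine in plain Java. The names TinyStackMachine, push, add, peek and depth are made up for illustration and are not part of the JVM or of any API; the real JVM uses typed instructions such as bipush and iadd.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy stack machine; it illustrates the idea only, not the real JVM. */
class TinyStackMachine {
    private final Deque<Integer> stack = new ArrayDeque<>();

    /** Pushes a value onto the operand stack. */
    TinyStackMachine push(int value) {
        stack.push(value);
        return this;
    }

    /** Pops the top two values and pushes their sum. */
    TinyStackMachine add() {
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left + right);
        return this;
    }

    /** Returns the value on top of the stack without removing it. */
    int peek() {
        return stack.peek();
    }

    /** Returns how many values are currently on the stack. */
    int depth() {
        return stack.size();
    }

    public static void main(String[] args) {
        TinyStackMachine vm = new TinyStackMachine().push(2).push(3).add();
        System.out.println(vm.peek());  // 5
        System.out.println(vm.depth()); // 1
    }
}
```

Notice that add takes no operands of its own: everything it needs is already on the stack, which is exactly the pattern the bytecode later in this article follows.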
Another thing to note is that the JVM does not have a concept of different languages like Java, Scala, Groovy, etc. It only understands bytecode, so you usually have to compile with javac first. javac converts your Java code to bytecode, and the java command then starts the JVM to run that bytecode.
The code you will see here is a really dumbed-down version of what the Kotlin compiler, the Scala compiler, and the compilers of other JVM languages do.
Setting up the project
Since the class file API is still in preview, we’ll need to enable preview features when running the program. This is what our starting code will look like:
import static java.lang.classfile.ClassFile.*;
import static java.lang.constant.ConstantDescs.*;

/// Creates the Person class shown above
@SuppressWarnings("preview")
void createPerson() throws IOException {
}

/// Creates the class containing the main method shown above
@SuppressWarnings("preview")
void createPersonRunner() throws IOException {
}

@SuppressWarnings("preview")
void main() throws IOException {
    println("Hello, ClassFileAPI!");
    createPerson();
    createPersonRunner();
}
- Since we have to use preview features, I figured I might as well use the instance main method feature, which gets rid of the lovely public static void main(String[] args) and gives us nice methods like println and readln.
- The two methods will do precisely what their comments say.
- @SuppressWarnings("preview") is there so IntelliJ stops bugging us about preview features.
- Both methods that create the class files throw an IOException because we’re writing to a file. So, our main method also throws an IOException.
Remember “What colour is your function?” by Bob Nystrom? This is a good example of it.
To run the above code:
# assuming the file is named Runner.java
java --enable-preview --source 23 Runner.java
# The source launcher (available since Java 11) compiles in memory, so there is no separate javac step
Creating the Person class
As a reminder, the Person class is inside the PersonRunner.java file and looks like:
class Person {
    private final String name;
    final int age;
    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
    public String getName() {
        return name;
    }
}
public class PersonRunner {/*PersonRunnerStuffHere*/
}
To start, let’s save the class names in top-level fields so they can easily be accessed from both methods:
// At the beginning of the file:
ClassDesc personDesc = ClassDesc.of("Person");
ClassDesc personRunnerDesc = ClassDesc.of("PersonRunner");
The first piece of the class file API we come across is ClassDesc, which (in simplified terms) is a way for the API to ensure that it always works with a valid class name. Inside the createPerson() method, we’ll add:
ClassFile.of().buildTo(Path.of(personDesc.displayName() + ".class"), personDesc,
    classBuilder -> classBuilder
        // there will be an error here because the consumer is empty
);
- We specify that the class file should be saved to a file with the name Person.class. The displayName() method returns the class name as a string.
- The third parameter is a consumer, and it will handle the logic behind creating the class.
- Note that at this point, the code will not compile because the classBuilder consumer is empty.
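As a rough illustration of what a ClassDesc holds, the sketch below uses only the stable java.lang.constant package (available since Java 12, no preview flag needed). The class name ClassDescSketch and the helper isValidName are made up for this example.

```java
import java.lang.constant.ClassDesc;

class ClassDescSketch {
    static final ClassDesc PERSON = ClassDesc.of("Person");
    static final ClassDesc MATH = ClassDesc.of("java.lang.Math");

    /** Hypothetical helper: reports whether ClassDesc.of accepts the given binary name. */
    static boolean isValidName(String binaryName) {
        try {
            ClassDesc.of(binaryName);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(PERSON.displayName());      // Person
        System.out.println(PERSON.descriptorString()); // LPerson;
        System.out.println(MATH.displayName());        // Math
        System.out.println(MATH.descriptorString());   // Ljava/lang/Math;
        System.out.println(isValidName("bad/name"));   // false
    }
}
```

ClassDesc.of expects a dotted binary name, so a name containing a slash is rejected before it can end up in a class file.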
Summary on javap
Before jumping into building the class, a nice tool called javap can help guide us. It comes with every JDK installation, so installing anything new is unnecessary. Create a PersonRunner.java file, and add the Person class. Something like:
class Person {/*Copy/paste the content of Person class here*/
}
public class PersonRunner {}
Then run the following command to compile to class files: javac PersonRunner.java. You should see two new .class files created in the current directory, even though we compiled just one file. The reason is that every class, interface, record, etc., gets its own class file, even if it’s declared inside another class.
The PersonRunner class is empty for now, so we should test javap on Person.class with: javap -v -p Person.class. javap prints the contents of the class file, including the bytecode that will be executed, to the terminal.
The output can be scary if it’s your first time seeing it, but the nice part is that we don’t have to do everything manually. For example, we don’t have to manually add new strings or integers to the constant pool; it’s all done automatically by the API.
You should save/redirect the output to a text file. For every instruction you write in Java code, try to see how it maps to the bytecode. Call javap on the class file that was generated by the API and compare it to the javap output for the classes that were generated by javac.
Looking at the raw Person class, we can see no extra modifiers such as private, static, etc. We could survive without explicitly setting flags, but the class would then default to public. To make it package-private, we update our class builder to:
ClassFile.of().buildTo(Path.of(personDesc.displayName() + ".class"), personDesc,
    classBuilder -> classBuilder
        .withFlags(0)
);
Nothing else has changed except the addition of withFlags(0), which explicitly says no access modifier should be used on the class. With that change, we can run our program with: java --enable-preview --source 23 Runner.java and see a Person.class file generated in the current directory.
Note that the .class file is a binary. If you open it in IntelliJ, you’ll get a nice-to-view decompiled version of the class, which doesn’t tell us much, so we use javap to see the bytecode.
The Person class has two instance variables: private final String name and final int age. Let’s create them! Just below .withFlags(0) add:
.withField("name", CD_String, ACC_FINAL | ACC_PRIVATE)
.withField("age", CD_int, ACC_FINAL)
- The first parameter is the name of the field we’re creating.
- The second parameter is the ClassDesc that we saw earlier. It specifies the type of the field. CD_String and CD_int are just helper constants for primitives and common classes. They save us the hassle of manually writing how Strings or ints are represented in bytecode: Ljava/lang/String;. If you click on one to view the source code, you’ll be taken to ConstantDescs.java from the java.lang.constant package. Many ready-made ClassDesc constants live there, such as int, void, boolean, etc. You can also see how those types are represented in bytecode. For example, int is represented as I. We’ll use CD_* constants a lot!
- The 3rd parameter is the access flags on the field, which looks self-explanatory. The bitwise OR operator is used to combine the flags.
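The descriptors and flag arithmetic above can be checked with stable JDK classes. In this sketch, java.lang.reflect.Modifier stands in for the ACC_* constants, on the assumption (which holds for PRIVATE and FINAL per the JVM specification) that both use the same numeric values; FieldDescriptorSketch is a made-up name.

```java
import java.lang.constant.ConstantDescs;
import java.lang.reflect.Modifier;

class FieldDescriptorSketch {
    // Stand-ins: Modifier.FINAL (0x0010) and Modifier.PRIVATE (0x0002)
    // carry the same values as ACC_FINAL and ACC_PRIVATE.
    static final int NAME_FLAGS = Modifier.FINAL | Modifier.PRIVATE;
    static final int AGE_FLAGS = Modifier.FINAL;

    public static void main(String[] args) {
        // How the two field types are spelled inside a class file
        System.out.println(ConstantDescs.CD_String.descriptorString()); // Ljava/lang/String;
        System.out.println(ConstantDescs.CD_int.descriptorString());    // I

        // Bitwise OR simply sets both bits in one int
        System.out.println(Integer.toHexString(NAME_FLAGS)); // 12
        System.out.println(Modifier.toString(NAME_FLAGS));   // private final
        System.out.println(Modifier.toString(AGE_FLAGS));    // final
    }
}
```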
Now that we have the instance fields, the next thing to do according to the Person class we’re creating is to set up a constructor. Add the following directly below .withField("age",....):
.withMethodBody(INIT_NAME, MethodTypeDesc.of(CD_void, CD_String, CD_int), 0,
builder -> builder
// there will be an error here because the consumer is empty
)
Make sure to use CD_void and not CD_Void. One is for the primitive type void, and the other is for the wrapper class Void. The same goes for CD_int and CD_Integer.
- We’re using .withMethodBody because constructors in Java are simply a special type of method.
- INIT_NAME is another helper constant in ConstantDescs.java that translates to <init>. If the name of a method is <init>, the JVM will treat it as a constructor.
- The second parameter, MethodTypeDesc.of(CD_void, CD_String, CD_int), describes the method signature of the constructor. The first parameter is the return type: void, because all constructors return void at the bytecode level. The remaining parameters are the types of the arguments the constructor takes.
- The third parameter is the access flags. There are none, so it’s set to 0 (package-private).
- The fourth parameter is the builder consumer that, well … builds the constructor body.
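To see what a MethodTypeDesc turns into, the sketch below prints the descriptor strings for the constructor and for a no-argument method returning String. It only uses java.lang.constant, so it runs without preview flags; MethodDescriptorSketch is a made-up name.

```java
import static java.lang.constant.ConstantDescs.CD_String;
import static java.lang.constant.ConstantDescs.CD_int;
import static java.lang.constant.ConstantDescs.CD_void;

import java.lang.constant.MethodTypeDesc;

class MethodDescriptorSketch {
    // Return type first, then the parameter types in order
    static final MethodTypeDesc PERSON_CTOR = MethodTypeDesc.of(CD_void, CD_String, CD_int);
    static final MethodTypeDesc NO_ARGS_STRING = MethodTypeDesc.of(CD_String);

    public static void main(String[] args) {
        System.out.println(PERSON_CTOR.descriptorString());    // (Ljava/lang/String;I)V
        System.out.println(PERSON_CTOR.parameterCount());      // 2
        System.out.println(NO_ARGS_STRING.descriptorString()); // ()Ljava/lang/String;
    }
}
```

In the descriptor string the parameters come first, inside the parentheses, and the return type comes last, which is the reverse of the argument order of MethodTypeDesc.of.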
To build the constructor body, we need to call the super constructor, save the first parameter received in the name field, save the second parameter received in the age field, and return nothing. In Java code, the compiler automatically inserts the call to the super constructor (Object) and the final return for us. Replace the error comment with the following:
.aload(0)
.invokespecial(CD_Object, INIT_NAME, MethodTypeDesc.of(CD_void))
.aload(0)
.aload(1)
.putfield(personDesc, "name", CD_String)
.aload(0)
.iload(2)
.putfield(personDesc, "age", CD_int)
.return_()
Remember when I said the JVM is a stack-based machine? It shows here!
- .aload(0):
  - The constructor is a method, and every method stores its parameters in “slots” starting from slot 0.
  - The value in slot 0 of the constructor is this. In fact, the value in slot 0 of all non-static methods is this.
  - aload(n) loads a reference onto the stack. In this case, we’re loading the this reference in slot 0.
- .invokespecial(CD_Object,...):
  - This calls the constructor for the Object superclass. Although we aren’t explicitly extending the Object class, all classes in Java implicitly extend the Object class, so we need to call the Object constructor.
  - invokespecial is used mostly to call constructors. The first parameter is the owner of the constructor to call, the second parameter is the constructor’s name, and the third is the method signature.
  - The Object constructor doesn’t take any parameters, so we use MethodTypeDesc.of(CD_void).
  - Note that invokespecial will pop the this that was loaded with aload(0) and use it for its operations.
- .aload(0): Load this onto the stack again because we want to modify this.name.
- .aload(1): Load the constructor’s first parameter onto the stack. In our case, the String name parameter will be loaded onto the stack.
- .putfield(personDesc, "name", CD_String):
  - Set this.name to the value on the stack.
  - putfield is used to set instance fields. In our case, it uses the top 2 values on the stack for its operations.
  - The first parameter to putfield is the owner of the field, i.e. the Person class; the second is the name of the field, and the third is the type of the field.
  - Note that this and the first parameter have been popped from the stack.
- Repeat the same for the age field. The only difference is that we use iload(2) instead of aload(1) because the age is an int, so we use a specialized instruction to load it onto the stack.
- .return_(): Return nothing.
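The slot numbers used above (this in slot 0, name in slot 1, age in slot 2) follow a simple rule that can be sketched in plain Java. SlotLayoutSketch and parameterSlots are hypothetical names. The rule that long and double occupy two slots comes from the JVM specification rather than from this article, and is included only so the sketch is not misleading for other signatures.

```java
import static java.lang.constant.ConstantDescs.CD_String;
import static java.lang.constant.ConstantDescs.CD_double;
import static java.lang.constant.ConstantDescs.CD_int;
import static java.lang.constant.ConstantDescs.CD_void;

import java.lang.constant.ClassDesc;
import java.lang.constant.MethodTypeDesc;
import java.util.ArrayList;
import java.util.List;

class SlotLayoutSketch {
    static final MethodTypeDesc PERSON_CTOR = MethodTypeDesc.of(CD_void, CD_String, CD_int);
    static final MethodTypeDesc MAIN = MethodTypeDesc.of(CD_void, CD_String.arrayType());
    static final MethodTypeDesc DOUBLE_THEN_INT = MethodTypeDesc.of(CD_void, CD_double, CD_int);

    /** Returns the local-variable slot of each parameter, in declaration order. */
    static List<Integer> parameterSlots(MethodTypeDesc type, boolean isStatic) {
        List<Integer> slots = new ArrayList<>();
        int next = isStatic ? 0 : 1; // slot 0 holds `this` in instance methods
        for (ClassDesc parameter : type.parameterList()) {
            slots.add(next);
            String descriptor = parameter.descriptorString();
            // long (J) and double (D) take two slots; everything else takes one
            next += (descriptor.equals("J") || descriptor.equals("D")) ? 2 : 1;
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(parameterSlots(PERSON_CTOR, false));    // [1, 2]
        System.out.println(parameterSlots(MAIN, true));            // [0]
        System.out.println(parameterSlots(DOUBLE_THEN_INT, true)); // [0, 2]
    }
}
```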
That wraps up the constructor! You can run the program with java --enable-preview --source 23 Runner.java and examine the Person.class file in the directory.
Next on the Person class is creating the getName method. This method looks similar to the constructor because they’re both methods. Add the following below the constructor:
.withMethodBody("getName", MethodTypeDesc.of(CD_String), ACC_PUBLIC,
    builder -> builder
        .aload(0)
        .getfield(personDesc, "name", CD_String)
        .areturn()
)
There is nothing new here except the content of the builder consumer:
- .aload(0): Recall that slot 0 of all non-static methods holds this. We load this onto the stack to access this.name.
- .getfield(personDesc, "name", CD_String): Get the value of this.name and push it onto the stack. Note that this has now been popped from the stack, and the stack now contains the value of this.name.
- .areturn(): Return the value on the stack. areturn() is used to return a reference type. If the method were returning an int, we would use ireturn().
There’s one more thing that’s easy to forget. If you look at the javap output for the class compiled by javac, there’s a SourceFile attribute at the end. We should also add that to our Person class to record that it was declared in PersonRunner.java. Just after the getter method for name, add:
.with(SourceFileAttribute.of(personRunnerDesc.displayName() + ".java"))
That’s it for the Person class! If you run the Person class with java Person, you should see an error message saying no main method was found, which is expected.
Hopefully, at this point, any compiler-minded people can see how code generation can be done with the API.
Creating the PersonRunner class
With the brief knowledge we have from creating the Person class, the PersonRunner class will be fairly easy. Recall that it looks like this:
class Person {/*Person class stuff here*/
}
public class PersonRunner {
    public static void main(String[] args) {
        var person = new Person("Cassi", 23);
        int random = (int) (Math.random() * 2); // random number: either 0 or 1
        if (random == 0) System.out.println(person.getName());
        else System.out.println(person.age);
    }
}
We start the same way we started the Person class, with just a little extra info. In createPersonRunner(), add:
ClassFile.of().buildTo(Path.of(personRunnerDesc.displayName() + ".class"),
    personRunnerDesc,
    classBuilder -> classBuilder
        .with(SourceFileAttribute.of(personRunnerDesc.displayName() + ".java"))
        .withFlags(ACC_PUBLIC)
);
Again, there’s nothing new here. We’re simply setting the source file attribute and making the PersonRunner class public.
We can’t create the main method just yet. If there is no constructor for a class, javac automatically creates a no-arg constructor, so we should do the same. Directly below .withFlags(ACC_PUBLIC), add:
.withMethodBody(INIT_NAME, MTD_void, ACC_PUBLIC, builder -> builder
.aload(0)
.invokespecial(CD_Object, INIT_NAME, MTD_void)
.return_()
)
The above is similar to what we’ve seen before. The only new thing is the use of MTD_void, a helper constant representing a method signature that returns void and takes no parameters. It is equivalent to MethodTypeDesc.of(CD_void). It also comes from the ConstantDescs.java file.
Now, we can create the main method. Again, it’s very similar to what we’ve seen earlier. After the constructor, add:
.withMethodBody("main", MethodTypeDesc.of(CD_void, CD_String.arrayType()),
    ACC_PUBLIC | ACC_STATIC, builder -> builder
        .return_()
)
- Recall that Java requires the lovely public static void main(String[] args) to let the JVM know where to start executing code from.
- CD_String.arrayType() is a helper method that returns the ClassDesc for String[]. Every ClassDesc can be converted to an array type by calling the arrayType() method. An array type is represented in the JVM by prepending [ to the element type’s descriptor. For example, String is represented as Ljava/lang/String;, so String[] will be represented as [Ljava/lang/String;.
- The main method returns void, so we add a .return_() at the end. All the code we add should go above the return_().
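The array descriptor rule is easy to confirm with the stable java.lang.constant API. The sketch below is illustrative only, and ArrayDescriptorSketch is a made-up name.

```java
import static java.lang.constant.ConstantDescs.CD_String;
import static java.lang.constant.ConstantDescs.CD_int;
import static java.lang.constant.ConstantDescs.CD_void;

import java.lang.constant.ClassDesc;
import java.lang.constant.MethodTypeDesc;

class ArrayDescriptorSketch {
    static final ClassDesc STRING_ARRAY = CD_String.arrayType();
    static final MethodTypeDesc MAIN_TYPE = MethodTypeDesc.of(CD_void, STRING_ARRAY);

    public static void main(String[] args) {
        // One leading [ per array dimension, followed by the element descriptor
        System.out.println(STRING_ARRAY.descriptorString());        // [Ljava/lang/String;
        System.out.println(CD_int.arrayType(2).descriptorString()); // [[I
        // The full descriptor of main(String[] args)
        System.out.println(MAIN_TYPE.descriptorString());           // ([Ljava/lang/String;)V
    }
}
```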
Now, we can start adding the logic to the main method. We first need to create the Person object with the name “Cassi” and age 23. Above the .return_(), add:
.new_(personDesc)
.dup()
.ldc("Cassi")
.ldc(23)
.invokespecial(personDesc, INIT_NAME,
    MethodTypeDesc.of(CD_void, CD_String, CD_int))
.astore(1)
- .new_(personDesc): Create a new object of the Person class. The first parameter is the ClassDesc of the class to create. Note that the object is uninitialized, and we need to call the constructor to set the appropriate values.
- .dup(): Duplicate the value on top of the stack. In our case, we’re duplicating the reference to the new object that was created. If we don’t duplicate it, the constructor call will pop the only reference from the stack, and we’ll lose the object forever.
- .ldc("Cassi"), .ldc(23): Load the string “Cassi” and the integer 23 onto the stack. ldc is used to load an item from the constant pool onto the stack. If the item is already in the constant pool, a reference to it will be used. If not, it will be added to the constant pool.
- .invokespecial(personDesc, INIT_NAME,...:
  - We’ve seen this earlier. We’re calling the constructor of the Person class and specifying the types of the parameters it needs.
  - Note that the operation pops the arguments it needs from the stack AND the reference to the object being initialized, which is why we duplicated the reference and loaded the arguments onto the stack before calling the constructor.
- .astore(1):
  - At this point, we have a Person object with the name “Cassi” and age 23 on top of the stack.
  - astore(1) is used to store a reference in slot 1.
  - We’re not storing in slot 0 because slot 0 contains the command line argument String[] args. Recall that the main method is static, so this is not available.
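Why dup is needed becomes clearer when the stack is traced by hand. The sketch below simulates the instruction sequence with a plain Deque of labels; it is a toy model with made-up names (NewDupSketch, depthAfterConstructor), not something the JVM or the class file API exposes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class NewDupSketch {
    /** Simulates new, (dup), ldc, ldc, invokespecial and returns the stack depth left for astore. */
    static int depthAfterConstructor(boolean withDup) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("Person ref");      // new Person (uninitialized object)
        if (withDup) {
            stack.push(stack.peek());  // dup: second copy of the same reference
        }
        stack.push("\"Cassi\"");       // ldc "Cassi"
        stack.push("23");              // ldc 23
        // invokespecial Person.<init>(String, int): pops both arguments plus one reference
        stack.pop();
        stack.pop();
        stack.pop();
        return stack.size();
    }

    public static void main(String[] args) {
        System.out.println(depthAfterConstructor(true));  // 1: one reference remains for astore(1)
        System.out.println(depthAfterConstructor(false)); // 0: nothing left to store
    }
}
```

Without the dup, the stack is empty after the constructor call, so there is no reference left for astore(1) to save.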
We have our Person object; let’s generate the random number! This is the corresponding Java code:
int random = (int) (Math.random() * 2); // random number: either 0 or 1
Just after astore(1), add:
.invokestatic(ClassDesc.ofDescriptor("Ljava/lang/Math;"), "random", MethodTypeDesc.of(CD_double))
.ldc(2.0)
.dmul()
.d2i()
.istore(2)
.iload(2)
- .invokestatic(ClassDesc...:
  - This is how we call a static method, which is very similar to invokespecial.
  - The first parameter is the ClassDesc of the class containing the method we want to call. Since the ClassDesc of the Math class was not defined for us, we had to create it.
  - The second parameter is the name of the method to call, and the third is the method signature. Math.random() returns a double.
  - Note that the top of the stack now contains the random number.
- .ldc(2.0): Load the double 2.0 onto the stack.
- .dmul(): Multiply the two doubles on the stack. The two doubles are popped, and the result is pushed back onto the stack.
- .d2i(): Convert the double on the stack to an int. The double is popped, and the int is pushed.
- .istore(2): Store the int in slot 2. We use istore because the random number is now an int. Slot 1 is taken by the Person object (and slot 0 by the command line arguments).
- .iload(2): Load the int from slot 2 onto the stack so we can later use it in the if condition.
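For a quick sanity check of the dmul and d2i steps, the same arithmetic can be written in plain Java. RandomBitSketch and scaleAndTruncate are made-up names; the cast to int truncates toward zero, which is what d2i does for values in this range.

```java
class RandomBitSketch {
    /** Mirrors dmul followed by d2i: multiply by 2.0, then truncate toward zero. */
    static int scaleAndTruncate(double random) {
        return (int) (random * 2.0);
    }

    public static void main(String[] args) {
        System.out.println(scaleAndTruncate(0.0));   // 0
        System.out.println(scaleAndTruncate(0.49));  // 0
        System.out.println(scaleAndTruncate(0.5));   // 1
        System.out.println(scaleAndTruncate(0.999)); // 1
        // Math.random() is always below 1.0, so the result is always 0 or 1
        System.out.println(scaleAndTruncate(Math.random()));
    }
}
```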
At this point, we have either 0 or 1 on the stack. Time for the if condition! Below iload(2), add:
.ifThenElse(
Opcode.IFEQ,
ifBlock -> {},
elseBlock -> {}
)
Everything is pretty self-explanatory when you read it. Opcode.IFEQ compares the value on top of the stack with 0. If the value is equal to 0, the ifBlock is executed; if not, the elseBlock is executed. The ifBlock and elseBlock are consumers.
There is a tiny optimization that can be done to remove 2 instructions. Can you find it?
Run javap on the PersonRunner class and compare the way the if statement is done in bytecode, and the way it’s done in the classfile API.
We’re almost done! We just need to print the name if the random number is 0, or the age if it’s 1.
Before printing though, we need the ClassDesc of PrintStream and the System class. At the top of the createPersonRunner() method, add:
ClassDesc CD_System = ClassDesc.ofDescriptor("Ljava/lang/System;");
ClassDesc CD_PrintStream = ClassDesc.ofDescriptor("Ljava/io/PrintStream;");
We also need to understand how System.out.println() works before attempting to emit bytecode for it.
- System is a class.
- out is a public static field in the System class, and has type PrintStream.
- println() is an overloaded non-static method on the PrintStream class that can take various types as a parameter.
- So, to call System.out.println(), we:
  - get out (no pun intended), a static field
  - call println(), an instance (virtual) method on out
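The same two steps can be acted out with method handles from java.lang.invoke, which look up members by name and type much like bytecode does. This is only an analogy, not how the class file API works internally. PrintlnLookupSketch and printlnVia are hypothetical names, and the helper prints into an in-memory stream so the result can be inspected.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class PrintlnLookupSketch {
    /** Picks one println overload by its parameter type and calls it on a capturing stream. */
    static String printlnVia(Class<?> parameterType, Object argument) {
        try {
            // Like invokevirtual: owner class, method name, and exact signature
            MethodHandle println = MethodHandles.publicLookup().findVirtual(
                    PrintStream.class, "println",
                    MethodType.methodType(void.class, parameterType));
            ByteArrayOutputStream captured = new ByteArrayOutputStream();
            PrintStream target = new PrintStream(captured, true);
            println.invoke(target, argument);
            return captured.toString().trim();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) throws Throwable {
        // Like getstatic: owner class, field name, and field type
        MethodHandle getOut = MethodHandles.publicLookup()
                .findStaticGetter(System.class, "out", PrintStream.class);
        PrintStream out = (PrintStream) getOut.invoke();
        out.println(printlnVia(int.class, 23));         // 23
        out.println(printlnVia(String.class, "Cassi")); // Cassi
    }
}
```

Because println is overloaded, the lookup only succeeds when the parameter type names one specific overload, which is the same reason the bytecode has to spell out the full method signature.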
Confusing? Hopefully, the code will clear it up. We’ll start with the branch that prints the age. IFEQ runs the ifBlock when the random number is 0 (the name case), so the age belongs in the elseBlock. Inside the elseBlock consumer, add the following:
// bytecode for System.out.println(person.age)
elseBlock
    .getstatic(CD_System, "out", CD_PrintStream)
    .aload(1)
    .getfield(personDesc, "age", CD_int)
    .invokevirtual(CD_PrintStream, "println", MethodTypeDesc.of(CD_void, CD_int));
- .getstatic(CD_System, "out", CD_PrintStream): Get the static field out from the System class. The last parameter is the type of out. The top of the stack now contains the PrintStream object.
- .aload(1): Recall that we saved the Person object in slot 1. Here, we’re simply loading it back onto the stack so we can read the age field.
- .getfield(personDesc, "age", CD_int): Get the value of age from the Person object. The top of the stack now contains the value of age, and the PrintStream object from the first step is below it.
- .invokevirtual(CD_PrintStream, "println",...:
  - Very similar to invokestatic and invokespecial, but for instance methods.
  - Call the println method on the PrintStream object.
  - The first parameter is the ClassDesc of the class the method is in.
  - The second parameter is the method’s name, and the third is the method signature. Keep in mind that println is an overloaded method.
  - The PrintStream object and the value of age are popped from the stack, and the age is printed to the console.
And finally, the if block:
// bytecode for System.out.println(person.getName())
ifBlock
    .getstatic(CD_System, "out", CD_PrintStream)
    .aload(1)
    .invokevirtual(personDesc, "getName", MethodTypeDesc.of(CD_String))
    .invokevirtual(CD_PrintStream, "println", MethodTypeDesc.of(CD_void, CD_String));
- Very similar to the elseBlock, with the only difference being that the getfield() call is replaced with an invokevirtual to call the getName method.
- Once again, println is overloaded, so we can’t use the same signature as the previous call in the elseBlock. We have to specify the exact method signature we want to call.
Run your program with java --enable-preview --source 23 Runner.java and if everything goes well, you should have a PersonRunner.class file. Run that file with java PersonRunner and you should see either “Cassi” or 23 printed to the console.
Closing words
This was a fun and interesting API to learn without any tutorials, especially since I worked on it while trying to escape working on my capstone.
Escaping a capstone project focused on programming by learning something else that is also focused on programming. Smart move?
Going off documentation and trial/error felt like how developers in the ’80s/’90s/early 2000s learned stuff. No StackOverflow, ChatGPT, or YouTube tutorials that hold your hand every step of the way. Just you, life, a bad chair, and documentation. Close enough?
Currently, the only use case I have for it is in a compiler I’m working on, which emits C code. Maybe I’ll create a bytecode target? (I won’t)
References/Learn More
- The Java Virtual Machine Specification - Java SE 23 Edition, Tim Lindholm, Frank Yellin, Gilad Bracha, Alex Buckley, Daniel Smith
- (YouTube) Java Bytecode Crash Course - David Buck
- (YouTube) VM Bytecode for Dummies (and the Rest of Us Too) - Charles Oliver Nutter
- (YouTube) A Classfile API for the JDK #JVMLS - Brian Goetz
- What Color is Your Function? - Bob Nystrom
- JEP 484: Class-File API
Source Code
Run with java 23+ : java --enable-preview --source 23 Runner.java