In-Depth Understanding of the Java Virtual Machine: Notes on Chapter 8

Runtime stack frame structure

A stack frame is the data structure that supports method invocation and method execution in the virtual machine. It is an element of the virtual machine stack (Virtual Machine Stack) in the runtime data area. Each stack frame holds a method's local variable table, operand stack, dynamic link, method return address, and some additional information. The lifetime of a method call, from invocation to completion, corresponds to one stack frame being pushed onto and then popped off the virtual machine stack. The size of the local variable table and the maximum depth of the operand stack are fully determined when the code is compiled and are written into the Code attribute of the method table. The amount of memory a stack frame needs is therefore not affected by data that varies at run time; it depends only on the specific virtual machine implementation.

Conceptual structure of stack frame

Local variable table

The local variable table (Local Variable Table) is a set of variable slots that stores method parameters and the local variables defined inside the method. When a Java program is compiled into a class file, the maximum capacity of the local variable table that the method needs is recorded in the max_locals item of the method's Code attribute.

  • Stores method parameters and the local variables defined inside the method;
    • When the Java program is compiled into a class file, the maximum capacity of the local variable table that each method needs is already determined.
  • Smallest unit: the variable slot (Slot);
    • One Slot can hold a boolean, byte, char, short, int, float, reference, or returnAddress (rare);
      • A reference refers to an object instance;
      • Through a reference in the local variable table, the virtual machine can:
        • find the starting address of the object instance in the Java heap;
        • find the type information for the object's class in the method area.
    • For 64-bit data types (long and double), the virtual machine allocates two consecutive Slots, with high-order alignment;
      • Splitting a long or double across two Slots resembles the "non-atomic treatment of long and double", where one 64-bit read or write may be split into two 32-bit operations. However, because the local variable table lives on the thread's stack and is thread-private, it causes no data safety problems whether or not the access to the two consecutive Slots is atomic.
      • For a variable of a 32-bit data type, index n means the nth Slot is used;
      • For a variable of a 64-bit data type, index n means Slots n and n+1 are used together;
        • Two adjacent Slots that jointly hold one 64-bit value must not be accessed individually in any way; bytecode that does so should be rejected with an exception during the verification phase of class loading.
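The slot rules above can be made concrete with a small sketch. SlotLayout and weightedSum are hypothetical names chosen for illustration, and the slot numbers in the comments are the layout one would expect to see with javap -v, not output copied from a real run.

```java
// Hypothetical example: SlotLayout and weightedSum are illustrative names.
class SlotLayout {
    // A static method has no "this", so parameters start at slot 0.
    // Expected layout:
    //   a      -> slot 0        (int, one Slot)
    //   b      -> slots 1 and 2 (long, two Slots)
    //   c      -> slots 3 and 4 (double, two Slots)
    //   result -> slots 5 and 6 (double, two Slots)
    // so max_locals would be expected to be 7.
    static double weightedSum(int a, long b, double c) {
        double result = a + b * c;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(weightedSum(1, 2L, 3.0));
    }
}
```

For an instance method the same parameters would each shift up by one, because slot 0 would hold this.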

Space allocation of local variable table

Slot in the local variable table can be reused

Definition: once execution has moved past the scope of a variable, for example after leaving the code block that declares it, the Slot that variable occupied can be given to another variable. The flip side is that until some other variable actually reuses the Slot, the old value stays in it, which affects GC behavior.

1.

    public static void main(String[] args) {
        byte[] placeholder = new byte[64 * 1024 * 1024];
        System.gc();
    }

Output (add -verbose:gc to the virtual machine startup parameters to watch the garbage collection process):

    [GC (System.gc()) 68813K->66304K(123904K), 0.0034797 secs]
    [Full GC (System.gc()) 66304K->66204K(123904K), 0.0086225 secs]   // not reclaimed

Reason: when System.gc() runs, the variable placeholder is still in scope, so the virtual machine does not reclaim the array.

2.

    public static void main(String[] args) {
        {
            byte[] placeholder = new byte[64 * 1024 * 1024];
        }
        System.gc();
    }

Output with -verbose:gc:

    [GC (System.gc()) 68813K->66304K(123904K), 0.0034797 secs]
    [Full GC (System.gc()) 66304K->66204K(123904K), 0.0086225 secs]   // not reclaimed

3.

    public static void main(String[] args) {
        {
            byte[] placeholder = new byte[64 * 1024 * 1024];
        }
        int a = 1; // add a new assignment operation
        System.gc();
    }

Output with -verbose:gc:

    [GC (System.gc()) 68813K->66320K(123904K), 0.0017394 secs]
    [Full GC (System.gc()) 66320K->668K(123904K), 0.0084337 secs]   // reclaimed

Why can placeholder be reclaimed only after the second modification (the third version)?

  • What decides whether placeholder can be reclaimed is whether a Slot in the local variable table still holds a reference to the array;
  • In the second version, nothing is done after the code block that declares placeholder, so its Slot has not been reused by another variable and the reference to the array is still present in the local variable table;
  • In the third version, int a takes over the Slot that placeholder occupied, so the reference is gone and the array can be reclaimed by the GC.

Operand stack

  • An element can be of any Java data type; a 32-bit value takes one unit of stack capacity and a 64-bit value takes two;
  • The interpreter of the Java virtual machine is called a stack-based execution engine, and the "stack" in that name is the operand stack.
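To make the stack-based execution model concrete, here is a minimal sketch. StackDemo and add are hypothetical names, and the bytecode in the comment is what javac typically produces for such a method, written from memory rather than copied from a javap run.

```java
// Hypothetical example: StackDemo and add are illustrative names.
class StackDemo {
    // Typical bytecode for this static method:
    //   iload_0   // push a (local variable slot 0) onto the operand stack
    //   iload_1   // push b (local variable slot 1)
    //   iadd      // pop both ints, push their sum
    //   ireturn   // pop the sum and hand it to the caller
    // At most two ints are on the stack at once, so max_stack would be 2.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```

Note that iadd names no operands: it always works on whatever is on top of the operand stack, which is what distinguishes a stack-based instruction set from a register-based one.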

Dynamic link

  • Each stack frame holds a reference, into the runtime constant pool, to the method that the frame belongs to;
  • This reference exists to support dynamic linking during method invocation.

Method return address

  • Two ways to exit a method:
    • Normal completion: a return bytecode instruction is executed;
    • Abrupt completion: an exception is thrown and is not handled inside the method.
  • Possible actions when exiting the method:
    • Restore the local variable table and operand stack of the calling method;
    • Push the return value (if there is one) onto the operand stack of the caller's stack frame;
    • Adjust the PC counter to point to the instruction following the method call.

Method call

The Java virtual machine provides five method call bytecode instructions with different responsibilities:

  • invokestatic: calls a static method;
  • invokespecial: calls instance constructors (<init>), private methods, and parent-class methods;
  • invokevirtual: calls all virtual methods. Every method other than static methods, constructors, private methods, parent-class methods, and final methods is a virtual method;
  • invokeinterface: calls an interface method; the object that implements the interface is determined at run time;
  • invokedynamic: dynamically resolves the method referenced by the call site specifier at run time and then executes it.

Apart from invokedynamic, the first operand of each of the other four instructions is a symbolic reference to the called method, which is fixed at compile time. That is why these four instructions cannot support dynamically typed languages on their own: a dynamically typed language can determine the receiver type only at run time, because its type checking happens mainly at run time rather than at compile time.

Although a final method is called through invokevirtual, it cannot be overridden. There is no other version, so no polymorphic selection on the receiver is needed (or, put differently, the selection has exactly one possible result). A final method is therefore a non-virtual method.
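The following sketch marks which instruction one would expect javac to emit for each kind of call. All class and method names are hypothetical. The annotations are expectations rather than verified javap output, and the private-method case in particular depends on the compiler version.

```java
import java.util.function.Supplier;

// Hypothetical example: every name here is illustrative.
class InvokeKinds {
    interface Greeter { String greet(); }

    static class Base {
        String name() { return "base"; }
    }

    static class Derived extends Base implements Greeter {
        static String version() { return "v1"; }

        private String secret() { return "secret"; }

        @Override String name() { return "derived"; }

        @Override public String greet() {
            String a = version();      // invokestatic
            String b = super.name();   // invokespecial (parent-class method)
            String c = secret();       // invokespecial with older javac; JDK 11+ may emit invokevirtual
            String d = this.name();    // invokevirtual
            Supplier<String> s = () -> "lambda"; // invokedynamic creates the Supplier
            String e = s.get();        // invokeinterface
            return String.join(",", a, b, c, d, e);
        }
    }

    public static void main(String[] args) {
        Greeter g = new Derived();     // invokespecial on Derived.<init>
        System.out.println(g.greet()); // invokeinterface
    }
}
```

Running javap -c on the compiled class is the way to confirm which instructions a particular compiler actually chose.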

Resolution

A resolution call, as the name suggests, fixes the version of the method to call during the resolution phase of class loading. The resolution phase converts some symbolic references into direct references, and calls bound in this way are called resolution calls. Because the target is chosen before the program actually runs, resolution is possible only under two conditions: the method has one determinable version before the program runs, and that version cannot change at run time.

Only the following two types of methods meet these two requirements:

  • Methods called through invokestatic: static methods;
  • Methods called through invokespecial: private methods, constructors, and parent-class methods.

Neither kind of method can be overridden through inheritance or any other means, so the version to call is known before the program runs, which makes them suitable for resolution during class loading. Their symbolic references are converted to direct references in the resolution phase, which fixes the version to call.

Dispatch

Before introducing the dispatch call, let's first introduce the three basic object-oriented features of Java: encapsulation, inheritance, and polymorphism.

The most basic manifestations of polymorphism are overloading and overriding. In both cases the method name is the same, but the other rules differ:

  • Overloading: occurs within the same class; the parameter lists must differ, while the return type, access modifier, and declared exceptions may all differ;
  • Overriding: occurs between a subclass and its parent class; the parameter list must be the same and the return type must be the same (or, for reference types, a subtype); the access modifier must be at least as permissive as that of the overridden method; and no new or broader checked exceptions may be declared.

Having the same method name creates a choice for the virtual machine, which must determine which method to call. This decision is made during dispatch. Specifically:

  • Method overloading: static dispatch
  • Method overriding: dynamic dispatch

Static dispatch (method overloading)

Before introducing static dispatch, let's first introduce what is the static type and actual type of a variable.

Static type and actual type of variable

    public class StaticDispatch {
        static abstract class Human { }
        static class Man extends Human { }
        static class Woman extends Human { }

        public void sayHello(Human guy) { System.out.println("Hello guy!"); }
        public void sayHello(Man man) { System.out.println("Hello man!"); }
        public void sayHello(Woman woman) { System.out.println("Hello woman!"); }

        public static void main(String[] args) {
            Human man = new Man();
            Human woman = new Woman();
            StaticDispatch sr = new StaticDispatch();
            sr.sayHello(man);
            sr.sayHello(woman);
            /* Output:
               Hello guy!
               Hello guy!
               The method is chosen by the static type of the variable, that is,
               the type on the left (Human), so both calls go to
               public void sayHello(Human guy). */
        }
    }

    /* Brief explanation (independent fragments, not one runnable sequence) */
    // Starting point:
    Human man = new Man();

    // The actual type changes:
    Human man = new Man();
    man = new Woman();

    // The static type changes (each cast is valid only if man really refers to that type):
    sr.sayHello((Man) man);   // Output: Hello man!
    sr.sayHello((Woman) man); // Output: Hello woman!

In Human man = new Man(), Human is called the static type of the variable and Man is called its actual type.

When resolving a call to an overloaded method, the compiler decides which method to call from the static types of the arguments, not their actual types.

In layman's terms, static dispatch is the process of deciding which method to call from static information such as the method parameters (type, number, and order).

  • Overload matching priority, taking the character 'a' as the argument:
    • Primitive types, tried in widening order:
      • char
      • int
      • long
      • float
      • double
    • Character (boxing)
    • Serializable (an interface implemented by Character)
      • If overloads for two interfaces of the same priority are both present, such as Serializable and Comparable, the compiler reports the call as ambiguous and refuses to compile it.
    • Object
    • char... (variable-length parameters have the lowest priority)
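The priority order above can be observed by removing overloads one group at a time. The sketch below does this with separate nested classes, each offering a smaller set of overloads of a hypothetical method f; the class names are made up for this illustration.

```java
import java.io.Serializable;

// Hypothetical example: each nested class offers a different set of overloads of f.
class OverloadPriority {
    static class All {
        static String f(char c)         { return "char"; }
        static String f(int i)          { return "int"; }
        static String f(long l)         { return "long"; }
        static String f(Character c)    { return "Character"; }
        static String f(Serializable s) { return "Serializable"; }
        static String f(Object o)       { return "Object"; }
        static String f(char... cs)     { return "char..."; }
    }

    static class NoCharOrInt {
        static String f(long l)         { return "long"; }
        static String f(Character c)    { return "Character"; }
        static String f(char... cs)     { return "char..."; }
    }

    static class NoPrimitive {
        static String f(Character c)    { return "Character"; }
        static String f(Serializable s) { return "Serializable"; }
        static String f(Object o)       { return "Object"; }
        static String f(char... cs)     { return "char..."; }
    }

    static class NoCharacter {
        static String f(Serializable s) { return "Serializable"; }
        static String f(Object o)       { return "Object"; }
        static String f(char... cs)     { return "char..."; }
    }

    static class VarargsOnly {
        static String f(char... cs)     { return "char..."; }
    }

    // The same argument 'a' selects a different overload in each class.
    static String[] results() {
        return new String[] {
            All.f('a'),          // exact primitive match
            NoCharOrInt.f('a'),  // primitive widening beats boxing
            NoPrimitive.f('a'),  // boxing to Character
            NoCharacter.f('a'),  // boxed value seen as Serializable
            VarargsOnly.f('a')   // varargs as the last resort
        };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" > ", results()));
    }
}
```

All of these choices are made by the compiler; nothing about the selection happens at run time.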

Dynamic dispatch (method overriding)

Dynamic dispatch is the dispatch process of determining the method execution version according to the actual type at runtime.

The process of dynamic dispatch

    public class DynamicDispatch {
        static abstract class Human {
            protected abstract void sayHello();
        }

        static class Man extends Human {
            protected void sayHello() { System.out.println("Hello man"); }
        }

        static class Woman extends Human {
            protected void sayHello() { System.out.println("Hello woman"); }
        }

        public static void main(String[] args) {
            Human man = new Man();
            Human woman = new Woman();
            man.sayHello();
            woman.sayHello();
            man = woman;
            man.sayHello();
            /* Output:
               Hello man
               Hello woman
               Hello woman */
        }
    }

Bytecode analysis (output of the javap command):

    public static void main(java.lang.String[]);
      descriptor: ([Ljava/lang/String;)V
      flags: ACC_PUBLIC, ACC_STATIC
      Code:
        stack=2, locals=3, args_size=1
           0: new           #2   // class com/jvm/ch8/DynamicDispatch$Man
           3: dup
           4: invokespecial #3   // Method com/jvm/ch8/DynamicDispatch$Man."<init>":()V
           7: astore_1
           8: new           #4   // class com/jvm/ch8/DynamicDispatch$Woman
          11: dup
          12: invokespecial #5   // Method com/jvm/ch8/DynamicDispatch$Woman."<init>":()V
          15: astore_2
          16: aload_1            // push the reference to the created object onto the operand stack;
                                 // it decides whose sayHello the next instruction executes
          17: invokevirtual #6   // method call
          20: aload_2            // push the reference to the created object onto the operand stack;
                                 // it decides whose sayHello the next instruction executes
          21: invokevirtual #6   // method call
          24: aload_2
          25: astore_1
          26: aload_1
          27: invokevirtual #6   // Method com/jvm/ch8/DynamicDispatch$Human.sayHello:()V
          30: return

Through bytecode analysis, it can be seen that the operation process of the invokevirtual instruction is roughly as follows:

  • Take the object referenced at the top of the operand stack, which is the receiver of the method, and find its actual type, denoted C;
  • Look for the method:
    • first in C;
    • then in each parent class of C, from the bottom up;
    • at each step of the search:
      • look for a method whose descriptor and simple name match those in the constant;
      • if one is found, verify access permission; if verification fails, throw IllegalAccessError;
      • if verification passes, return the direct reference to that method and end the search;
  • If no suitable method is found, throw AbstractMethodError, meaning the method has not been implemented.
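The search described above can be imitated in plain Java with reflection. This is only a simplified simulation under stated assumptions: VirtualLookup and select are hypothetical names, the access permission check is left out, and a real virtual machine does not perform the lookup through the reflection API.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Simplified simulation of the invokevirtual search; not how a real JVM implements it.
class VirtualLookup {
    static abstract class Human {
        abstract String sayHello();
        String walk() { return "Human walks"; }
    }

    static class Man extends Human {
        @Override String sayHello() { return "Hello man"; }
    }

    // Start at the receiver's actual class C and climb through its parent classes.
    static Method select(Object receiver, String name, Class<?>... paramTypes) {
        for (Class<?> c = receiver.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Method m = c.getDeclaredMethod(name, paramTypes);
                if (!Modifier.isAbstract(m.getModifiers())) {
                    return m; // first concrete match wins
                }
            } catch (NoSuchMethodException e) {
                // not declared in c, keep climbing
            }
        }
        throw new AbstractMethodError(name); // nothing implemented anywhere in the chain
    }

    public static void main(String[] args) {
        Human h = new Man(); // static type Human, actual type Man
        System.out.println(select(h, "sayHello").getDeclaringClass().getSimpleName());
        System.out.println(select(h, "walk").getDeclaringClass().getSimpleName());
    }
}
```

The first lookup stops in Man because Man overrides sayHello; the second falls through to Human because Man does not declare walk.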

Implementation of dynamic dispatch

Dynamic dispatch is performed very frequently in a virtual machine, and the search described above has to look through the method metadata of the class for a suitable target. For performance reasons, real virtual machines are unlikely to run such a search on every call, so an optimization is needed.

A common optimization: build a virtual method table for the class in the method area.

  • The virtual method table stores the actual entry address of each method. If a method is not overridden by the subclass, its entry address in the subclass's method table is the same as its entry address in the parent's method table;
  • An index into this method table is used instead of a search through the metadata;
  • The method table is initialized during the linking phase of class loading.
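A toy model may help picture the table. In the sketch below a method table is just an indexed list of entry points; VTableSketch, the index constants, and the two table-building methods are all hypothetical and say nothing about how any particular virtual machine lays out its tables.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy model of a virtual method table: an indexed list of method entry points.
class VTableSketch {
    // The same method uses the same index in the parent and the child table.
    static final int SAY_HELLO = 0;
    static final int WALK = 1;

    static List<Supplier<String>> parentTable() {
        List<Supplier<String>> table = new ArrayList<>();
        table.add(() -> "Hello from parent"); // SAY_HELLO
        table.add(() -> "walking");           // WALK
        return table;
    }

    // The child table starts as a copy of the parent's entries;
    // only overridden methods get a new entry.
    static List<Supplier<String>> childTable(List<Supplier<String>> parent) {
        List<Supplier<String>> table = new ArrayList<>(parent);
        table.set(SAY_HELLO, () -> "Hello from child");
        return table;
    }

    public static void main(String[] args) {
        List<Supplier<String>> parent = parentTable();
        List<Supplier<String>> child = childTable(parent);
        // Dispatch is a single indexed read, with no search through metadata.
        System.out.println(child.get(SAY_HELLO).get());
        // The method that was not overridden shares the parent's entry.
        System.out.println(child.get(WALK) == parent.get(WALK));
    }
}
```

Keeping the indexes identical across the hierarchy is what lets a call site use one fixed index regardless of the receiver's actual type.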

In layman's terms, dynamic dispatch is the process of deciding which method to call from dynamic information, namely the actual type of the method's receiver.

To sum up: static dispatch looks at the type on the left (the declared type), and dynamic dispatch looks at the type on the right (the actual type).

Single dispatch and multiple dispatch

Besides the distinction between static and dynamic dispatch, dispatch can also be classified by how many quantities the choice is based on, into single dispatch and multiple dispatch.

Dispatch quantities: the receiver of the method and the parameters of the method.

Static dispatch in the Java language is multiple dispatch: the choice is based on two quantities, the static type of the method receiver and the types of the method parameters.

Dynamic dispatch in the Java language is single dispatch: the choice is based on only one quantity, the actual type of the method receiver.
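The two statements above can be seen side by side in one small program. Tea, Coffee, Parent, and Child are hypothetical names used only for this sketch.

```java
// Hypothetical example: Tea, Coffee, Parent and Child are illustrative names.
class DispatchKinds {
    static class Tea { }
    static class Coffee { }

    static class Parent {
        String choose(Tea t)    { return "parent chooses tea"; }
        String choose(Coffee c) { return "parent chooses coffee"; }
    }

    static class Child extends Parent {
        @Override String choose(Tea t)    { return "child chooses tea"; }
        @Override String choose(Coffee c) { return "child chooses coffee"; }
    }

    public static void main(String[] args) {
        Parent parent = new Parent();
        Parent child = new Child();
        // Compile time (static, multiple dispatch): the static receiver type (Parent)
        // and the argument type together pick choose(Coffee) or choose(Tea).
        // Run time (dynamic, single dispatch): only the receiver's actual type
        // picks between Parent's and Child's implementation of that signature.
        System.out.println(parent.choose(new Coffee())); // parent chooses coffee
        System.out.println(child.choose(new Tea()));     // child chooses tea
    }
}
```

By the time the program runs, the signature is already fixed in the bytecode, so the argument's type plays no further part in the choice.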

Dynamically typed language support

What is a dynamically typed language?

It is the programming language where the main process of type checking is at runtime, not at compile time.

What are the advantages of dynamically/statically typed languages?

  • Dynamically typed language: high flexibility and high development efficiency.
  • Statically typed language: The compiler provides rigorous type checking, and type-related problems can be found during coding.

Dynamic type support provided at the Java virtual machine level:

  • invokedynamic instruction
  • java.lang.invoke package

java.lang.invoke package

Purpose: besides the existing approach of determining the target method purely through symbolic references, it provides MethodHandle, a mechanism for determining the target method dynamically.

Use of MethodHandle

Describe the type of the method. The first argument is the return type of the method, and the following arguments are its parameter types:

    MethodType mt = MethodType.methodType(void.class, String.class);

Get a handle to an ordinary (virtual) method:

    /*
     * Required arguments:
     * 1. The Class object of the class to which the called method belongs
     * 2. The method name
     * 3. The MethodType object mt
     * 4. The object on which the method is called (bound with bindTo)
     */
    MethodHandles.lookup().findVirtual(receiver.getClass(), "Method Name", mt).bindTo(receiver);

Get a handle to a parent-class method:

    /*
     * Required arguments:
     * 1. The Class object of the class to which the called method belongs
     * 2. The method name
     * 3. The MethodType object mt
     * 4. The Class object of the class that makes the call
     */
    MethodHandles.lookup().findSpecial(GrandFather.class, "Method Name", mt, getClass());
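As a runnable illustration of findSpecial, the sketch below calls a parent-class implementation from a subclass that overrides it. SpecialLookup, Father, Son, and thinking are hypothetical names, and the sketch assumes the lookup is performed inside the subclass itself, because findSpecial checks the caller class against the lookup class.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical example: Father, Son and thinking are illustrative names.
class SpecialLookup {
    static class Father {
        String thinking() { return "father"; }
    }

    static class Son extends Father {
        @Override String thinking() { return "son"; }

        // Calls Father.thinking() on this object, bypassing the override.
        String parentThinking() {
            try {
                MethodType mt = MethodType.methodType(String.class);
                // lookup() is called inside Son, so Son is both the lookup class
                // and the caller class passed as the fourth argument.
                MethodHandle mh = MethodHandles.lookup()
                        .findSpecial(Father.class, "thinking", mt, Son.class);
                return (String) mh.invoke(this);
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }
    }

    public static void main(String[] args) {
        Son son = new Son();
        System.out.println(son.thinking());       // son
        System.out.println(son.parentThinking()); // father
    }
}
```

The handle behaves like a super.thinking() call written inside Son, which is the same thing invokespecial does in bytecode.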

Execute the method through a MethodHandle mh:

    /*
     * The difference between invoke() and invokeExact():
     * - invokeExact() is stricter: the types at the call must match the handle's type exactly, and the return type counts too
     * - invoke() is looser and converts types where it can
     */
    mh.invoke("Hello world");
    mh.invokeExact("Hello world");
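The difference is easiest to see when the return type at the call does not match the handle. The sketch below uses a handle for String.length(); InvokeVsExact and its methods are hypothetical names chosen for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical example: InvokeVsExact, lengthHandle and compare are illustrative names.
class InvokeVsExact {
    // Handle for String.length(); its type is (String)int.
    static MethodHandle lengthHandle() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static String compare() {
        MethodHandle mh = lengthHandle();
        try {
            int exact = (int) mh.invokeExact("hello"); // call is (String)int: exact match
            Object loose = mh.invoke("hello");         // invoke() adapts: int is boxed to Integer
            try {
                Object bad = mh.invokeExact("hello");  // call is (String)Object: no match
                return "unexpected: " + bad;
            } catch (WrongMethodTypeException e) {
                return exact + "," + loose + ",WrongMethodTypeException";
            }
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(compare());
    }
}
```

The cast on the result is part of the call's type as the compiler records it, which is why dropping the (int) cast is enough to make invokeExact fail.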

Example of use:

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodType;

    public class MethodHandleTest {
        static class ClassA {
            public void println(String s) {
                System.out.println(s);
            }
        }

        public static void main(String[] args) throws Throwable {
            /* The static type of obj is Object, which has no println method, so println
               cannot be called on obj directly even though its actual type has one. */
            Object obj = System.currentTimeMillis() % 2 == 0 ? System.out : new ClassA();
            /* invokeExact() requires an exact type match, return type included;
               invoke() allows a looser call. */
            getPrintlnMH(obj).invoke("Hello world");
            getPrintlnMH(obj).invokeExact("Hello world");
        }

        private static MethodHandle getPrintlnMH(Object receiver)
                throws NoSuchMethodException, IllegalAccessException {
            /* MethodType represents the method type: the first argument is the return
               type of the method, and the following arguments are its parameter types. */
            MethodType mt = MethodType.methodType(void.class, String.class);
            /* lookup() comes from MethodHandles. The statement below finds, in the given
               class, a method handle that matches the method name, the method type, and
               the caller's access permission. */
            /* A virtual method is being called here, so by the rules of the Java language
               its first parameter is implicit and represents the receiver of the method,
               that is, the object this points to. Instead of passing it in the argument
               list, the bindTo() method supplies it. */
            return MethodHandles.lookup().findVirtual(receiver.getClass(), "println", mt).bindTo(receiver);
        }
    }

The bytecode instructions that correspond to the three lookup methods of the object returned by MethodHandles.lookup():

  • findStatic(): corresponds to invokestatic
  • findVirtual(): corresponds to invokevirtual and invokeinterface
  • findSpecial(): corresponds to invokespecial

The difference between MethodHandle and Reflection

  • The essential difference: both simulate method calls, but
    • Reflection simulates calls at the Java code level;
    • MethodHandle simulates calls at the bytecode level.
  • Difference in the information they carry:
    • Reflection's Method object carries a lot of information, including the method signature, the method descriptor, the Java-side representation of the method's various attributes, the method's access permissions, and so on;
    • The MethodHandle object carries less information: only what is relevant to executing the method.
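For comparison, here is the same method called both ways. ReflectionVsHandle and its two helper methods are hypothetical names; the target, String.toUpperCase(), is chosen only because it is available everywhere.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical example: the same call made through reflection and through a method handle.
class ReflectionVsHandle {
    static String viaReflection(String s) {
        try {
            // Method is a full Java-level description: name, modifiers,
            // parameter types, annotations, declared exceptions, and so on.
            Method m = String.class.getMethod("toUpperCase");
            return (String) m.invoke(s); // arguments and result travel as Object
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static String viaMethodHandle(String s) {
        try {
            // MethodHandle is a typed reference to the method, here (String)String.
            MethodHandle mh = MethodHandles.lookup()
                    .findVirtual(String.class, "toUpperCase", MethodType.methodType(String.class));
            return (String) mh.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaReflection("hello"));
        System.out.println(viaMethodHandle("hello"));
    }
}
```

Both calls return the same result; what differs is how much the intermediate object knows about the method and how strictly the call is typed.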

invokedynamic instruction

Lambda expressions are implemented through the invokedynamic instruction.
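Java source code has no direct way to write an invokedynamic instruction, but its linking step can be imitated by hand. The sketch below is an assumption-laden illustration: IndySketch, shout, and bootstrap are hypothetical names, and the bootstrap method is called manually here, whereas for a real invokedynamic call site the virtual machine would call it the first time the site is executed. For lambdas, javac points the call site at a bootstrap method in java.lang.invoke.LambdaMetafactory instead of a hand-written one.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical example: a hand-driven imitation of invokedynamic linkage.
class IndySketch {
    static String shout(String s) {
        return s.toUpperCase() + "!";
    }

    // Shaped like a bootstrap method: given the lookup context, the name and the
    // type recorded at the call site, it returns a CallSite bound to the target.
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = lookup.findStatic(IndySketch.class, name, type);
        return new ConstantCallSite(target);
    }

    // Does by hand what the JVM would do for an invokedynamic call site.
    static String callThroughSite(String arg) {
        try {
            MethodType type = MethodType.methodType(String.class, String.class);
            CallSite site = bootstrap(MethodHandles.lookup(), "shout", type);
            return (String) site.dynamicInvoker().invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughSite("hello"));
    }
}
```

The point of the design is that the target is chosen by user code in the bootstrap method rather than by a symbolic reference fixed at compile time.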