Update documentation for class file parsing, class loading, frame interpreter, JNI, object management, and FFI. Increment crate version to 0.2.0.
parent
7fcf00b77f
commit
24939df1b7
175
README.md
@ -1,146 +1,99 @@
|
|||||||
# RoastVM
|
# RoastVM
|
||||||
|
|
||||||
A Java Virtual Machine (JVM) implementation written in Rust, capable of parsing and executing Java class files and bytecode.
|
A Java Virtual Machine (JVM) implementation written in Rust.
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
RoastVM is an educational/experimental JVM implementation that demonstrates the core components and execution model of the Java Virtual Machine. The project uses Rust's type safety and modern tooling to build a simplified but functional JVM interpreter.
|
RoastVM is an educational/experimental JVM implementation that executes Java bytecode. It supports class file parsing,
|
||||||
|
bytecode interpretation, JNI native methods, and includes a boot image system for loading the Java standard library.
|
||||||
|
|
||||||
## Features
|
## Features
|
||||||
|
|
||||||
### Currently Implemented
|
- **Class File Parsing** - Full `.class` file support using deku for binary parsing
|
||||||
|
- **Bytecode Interpreter** - 200+ JVM instructions implemented
|
||||||
- **Class File Parsing**: Full support for reading and deserializing binary Java class files (`.class`) with magic number `0xCAFEBABE`
|
- **Object Model** - Objects, arrays, monitors, and string interning
|
||||||
- **Constant Pool Management**: Handles 20+ constant pool entry types (UTF8, Integer, Float, Long, Double, Class, String, MethodRef, FieldRef, InterfaceMethodRef, NameAndType, MethodHandle, MethodType, InvokeDynamic, etc.)
|
- **JNI Support** - 250+ JNI functions for native method integration
|
||||||
- **Dynamic Class Loading**: On-demand class loading with superclass and interface resolution, caching via DashMap
|
- **Boot Image** - Load JDK classes from 7z module archives
|
||||||
- **Class Initialization**: Automatic `<clinit>` method execution following JVM Spec 5.5, with recursive initialization tracking
|
- **Native FFI** - Dynamic library loading via libffi
|
||||||
- **Bytecode Execution**: Interpreter for 50+ JVM bytecode instructions including:
|
|
||||||
- Constants: `aconst_null`, `iconst_*`, `lconst_*`, `fconst_*`, `dconst_*`, `bipush`, `sipush`, `ldc`, `ldc_w`, `ldc2_w`
|
|
||||||
- Load/Store: `iload`, `lload`, `fload`, `dload`, `aload`, `istore`, `lstore`, `fstore`, `dstore`, `astore` (including `_0-3` variants)
|
|
||||||
- Array operations: `iaload`, `laload`, `faload`, `daload`, `aaload`, `baload`, `caload`, `saload`, `iastore`, `lastore`, `fastore`, `dastore`, `aastore`, `bastore`, `castore`, `sastore`, `arraylength`
|
|
||||||
- Stack manipulation: `pop`, `pop2`, `dup`, `dup_x1`, `dup_x2`, `dup2`, `dup2_x1`, `dup2_x2`, `swap`
|
|
||||||
- Arithmetic: `iadd`, `ladd`, `fadd`, `dadd`, `isub`, `lsub`, `fsub`, `dsub`, `imul`, `lmul`, `fmul`, `dmul`, `idiv`, `ldiv`, `fdiv`, `ddiv`, `irem`, `lrem`, `frem`, `drem`, `ineg`, `lneg`, `fneg`, `dneg`
|
|
||||||
- Bitwise: `ishl`, `lshl`, `ishr`, `lshr`, `iushr`, `lushr`, `iand`, `land`, `ior`, `lor`, `ixor`, `lxor`
|
|
||||||
- Type conversions: `i2l`, `i2f`, `i2d`, `l2i`, `l2f`, `l2d`, `f2i`, `f2l`, `f2d`, `d2i`, `d2l`, `d2f`, `i2b`, `i2c`, `i2s`
|
|
||||||
- Comparisons: `lcmp`, `fcmpl`, `fcmpg`, `dcmpl`, `dcmpg`
|
|
||||||
- Control flow: `ifeq`, `ifne`, `iflt`, `ifge`, `ifgt`, `ifle`, `if_icmp*`, `if_acmp*`, `goto`, `ifnull`, `ifnonnull`
|
|
||||||
- Object operations: `new`, `newarray`, `anewarray`, `multianewarray`, `checkcast`, `instanceof`
|
|
||||||
- Field access: `getstatic`, `putstatic`, `getfield`, `putfield`
|
|
||||||
- Method invocation: `invokevirtual`, `invokespecial`, `invokestatic`, `invokeinterface`
|
|
||||||
- Returns: `ireturn`, `lreturn`, `freturn`, `dreturn`, `areturn`, `return`
|
|
||||||
- **Object Model**: Full object creation, field storage, and array support (primitive and reference arrays)
|
|
||||||
- **JNI Support**: Implementation of 80+ JNI functions for native method integration
|
|
||||||
- **Native Library Loading**: Dynamic loading of native libraries (DLLs on Windows)
|
|
||||||
- **Stack Traces**: Detailed stack trace generation with line number mapping from class file attributes
|
|
||||||
- **Module System**: Support for loading classes from 7z binary image archives (JDK modules)
|
|
||||||
- **Frame-based Execution**: Proper execution context with program counter, operand stack, and local variables
|
|
||||||
|
|
||||||
### In Development
|
|
||||||
|
|
||||||
- Additional bytecode instructions (`tableswitch`, `lookupswitch`, `monitorenter`, `monitorexit`, etc.)
|
|
||||||
- Exception handling (`athrow`, try/catch blocks)
|
|
||||||
- Garbage collection (basic object manager exists)
|
|
||||||
- Reflection API
|
|
||||||
- Multi-threading support
|
|
||||||
- Method handles and `invokedynamic`
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
|
|
||||||
### Core Components
|
|
||||||
|
|
||||||
- **`Vm`** (`vm.rs`): Main virtual machine controller managing threads, class loader, and native library loading
|
|
||||||
- **`VmThread`** (`thread.rs`): Thread of execution managing the frame stack and method invocation
|
|
||||||
- **`Frame`** (`lib.rs`): Execution context for a method with PC, operand stack, and local variables
|
|
||||||
- **`ClassLoader`** (`class_loader.rs`): Handles dynamic class loading, linking, and initialization
|
|
||||||
- **`RuntimeClass`** (`class.rs`): Runtime representation of a loaded class with initialization state tracking
|
|
||||||
- **`ClassFile`** (`class_file/`): Binary parser for Java class files using the `deku` library
|
|
||||||
- **`ConstantPool`** (`class_file/constant_pool.rs`): Constant pool resolution and management
|
|
||||||
- **`ObjectManager`** (`objects/object_manager.rs`): Object allocation and garbage collection management
|
|
||||||
- **`JNI`** (`jni.rs`): Java Native Interface implementation
|
|
||||||
|
|
||||||
### Execution Flow
|
|
||||||
|
|
||||||
1. **Loading**: `ClassFile::from_bytes()` parses binary class file data
|
|
||||||
2. **Resolution**: `ClassLoader` converts `ClassFile` to `RuntimeClass`, resolving dependencies
|
|
||||||
3. **Initialization**: Class initializers (`<clinit>`) execute per JVM Spec 5.5
|
|
||||||
4. **Execution**: `VmThread` invokes the main method, creating a `Frame`
|
|
||||||
5. **Interpretation**: `Frame` iterates through bytecode operations, executing each instruction
|
|
||||||
6. **Stack Operations**: Instructions manipulate the operand stack and local variables
|
|
||||||
|
|
||||||
## Project Structure
|
## Project Structure
|
||||||
|
|
||||||
```
|
```
|
||||||
roast-vm/
|
roast-vm/
|
||||||
├── Cargo.toml # Workspace configuration
|
├── crates/
|
||||||
└── crates/
|
│ ├── core/ # Main VM implementation
|
||||||
├── core/ # Main JVM implementation (roast-vm-core)
|
│ │ └── src/
|
||||||
│ ├── Cargo.toml
|
│ │ ├── main.rs # Entry point
|
||||||
│ └── src/
|
│ │ ├── vm.rs # VM controller
|
||||||
│ ├── main.rs # Entry point (binary: roast)
|
│ │ ├── thread.rs # Thread execution
|
||||||
│ ├── lib.rs # Frame and bytecode execution
|
│ │ ├── class_loader.rs # Class loading
|
||||||
│ ├── vm.rs # Virtual Machine controller
|
│ │ ├── bimage.rs # Boot image reader
|
||||||
│ ├── thread.rs # Thread execution management
|
│ │ ├── frame/ # Stack frames & interpreter
|
||||||
│ ├── class.rs # RuntimeClass definition
|
│ │ ├── class_file/ # Class file parser
|
||||||
│ ├── class_loader.rs # ClassLoader implementation
|
│ │ ├── objects/ # Object/array model
|
||||||
│ ├── class_file/ # Binary class file parser
|
│ │ └── native/ # JNI infrastructure
|
||||||
│ │ ├── class_file.rs # ClassFile parser (magic 0xCAFEBABE)
|
│ │
|
||||||
│ │ └── constant_pool.rs
|
│ └── roast-vm-sys/ # Native methods (cdylib)
|
||||||
│ ├── objects/ # Object model
|
│ └── src/ # JNI implementations
|
||||||
│ │ ├── object.rs # Object representation
|
|
||||||
│ │ ├── array.rs # Array support
|
|
||||||
│ │ └── object_manager.rs
|
|
||||||
│ ├── jni.rs # JNI implementation
|
|
||||||
│ ├── instructions.rs # Bytecode opcode definitions
|
|
||||||
│ ├── attributes.rs # Class file attributes
|
|
||||||
│ ├── value.rs # Value and stack types
|
|
||||||
│ ├── error.rs # Error handling and stack traces
|
|
||||||
│ ├── native_libraries.rs # Native library management
|
|
||||||
│ └── bimage.rs # Binary image (7z) reader
|
|
||||||
│
|
│
|
||||||
└── roast-vm-sys/ # Native methods bridge (cdylib)
|
├── lib/ # Boot image location
|
||||||
├── Cargo.toml
|
├── data/ # Default classpath
|
||||||
└── src/
|
└── docs/ # Detailed documentation
|
||||||
├── lib.rs # Native method implementations
|
|
||||||
├── system.rs # System native calls
|
|
||||||
├── class.rs # Class native operations
|
|
||||||
└── object.rs # Object native operations
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Dependencies
|
## Documentation
|
||||||
|
|
||||||
- **`deku`**: Binary parsing and serialization for class files
|
Detailed implementation docs are in the `docs/` folder:
|
||||||
- **`dashmap`**: Concurrent HashMap for class and object storage
|
|
||||||
- **`jni`**: Java Native Interface bindings
|
- [Class File Parsing](docs/class-file-parsing.md) - Binary format, constant pool, attributes
|
||||||
- **`libloading`**: Dynamic library loading
|
- [Frame & Interpreter](docs/frame-interpreter.md) - Stack frames, opcode dispatch
|
||||||
- **`libffi`**: Foreign function interface for native calls
|
- [Object Management](docs/object-management.md) - Objects, arrays, monitors
|
||||||
- **`sevenz-rust2`**: 7z archive reading for module system support
|
- [JNI](docs/jni.md) - JNIEnv structure, native invocation
|
||||||
- **`log`** / **`env_logger`**: Logging infrastructure
|
- [Native/FFI](docs/native-ffi.md) - Library loading, libffi integration
|
||||||
- **`itertools`**: Iterator utilities
|
- [roast-vm-sys](docs/roast-vm-sys.md) - Native method implementations
|
||||||
- **`colored`**: Colored console output
|
- [Class Loading](docs/class-loading.md) - Boot image, classpath, RuntimeClass
|
||||||
|
|
||||||
## Building
|
## Building
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Build the project
|
|
||||||
cargo build
|
cargo build
|
||||||
|
|
||||||
# Build with optimizations
|
|
||||||
cargo build --release
|
cargo build --release
|
||||||
|
|
||||||
# Run tests
|
|
||||||
cargo test
|
cargo test
|
||||||
|
```
|
||||||
|
|
||||||
# Run with logging
|
## Running
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Run with default classpath (./data)
|
||||||
|
cargo run
|
||||||
|
|
||||||
|
# Run with custom classpath
|
||||||
|
cargo run -- /path/to/classes
|
||||||
|
|
||||||
|
# With debug logging
|
||||||
RUST_LOG=debug cargo run
|
RUST_LOG=debug cargo run
|
||||||
```
|
```
|
||||||
|
|
||||||
## Current Status
|
## Dependencies
|
||||||
|
|
||||||
This project is in early development (v0.1.0). The core infrastructure for class loading, bytecode execution, object creation, JNI support, and stack traces is functional. Many JVM features remain in development.
|
| Crate | Purpose |
|
||||||
|
|-------------------------------------------------------|---------------------------|
|
||||||
|
| [deku](https://crates.io/crates/deku) | Binary class file parsing |
|
||||||
|
| [dashmap](https://crates.io/crates/dashmap) | Concurrent maps |
|
||||||
|
| [jni](https://crates.io/crates/jni) | JNI type definitions |
|
||||||
|
| [libloading](https://crates.io/crates/libloading) | Dynamic library loading |
|
||||||
|
| [libffi](https://crates.io/crates/libffi) | Native function calls |
|
||||||
|
| [sevenz-rust2](https://crates.io/crates/sevenz-rust2) | Boot image archives |
|
||||||
|
| [parking_lot](https://crates.io/crates/parking_lot) | Synchronization |
|
||||||
|
|
||||||
|
## Status
|
||||||
|
|
||||||
|
Early development (v0.2.0). Core class loading, bytecode execution, and JNI are functional. Exception handling and GC
|
||||||
|
are in progress.
|
||||||
|
|
||||||
|
**Vendor**: infernap12
|
||||||
|
|
||||||
## References
|
## References
|
||||||
|
|
||||||
- [JVM Specification](https://docs.oracle.com/javase/specs/jvms/se25/html/index.html)
|
- [JVM Specification](https://docs.oracle.com/javase/specs/jvms/se25/html/index.html)
|
||||||
- [Java Class File Format](https://docs.oracle.com/javase/specs/jvms/se25/html/jvms-4.html)
|
- [JNI Specification](https://docs.oracle.com/en/java/javase/25/docs/specs/jni/index.html)
|
||||||
@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "roast-vm-core"
|
name = "roast-vm-core"
|
||||||
version = "0.1.5"
|
version = "0.2.0"
|
||||||
edition = "2024"
|
edition = "2024"
|
||||||
publish = ["nexus"]
|
publish = ["nexus"]
|
||||||
|
|
||||||
|
|||||||
@ -1,6 +1,6 @@
|
|||||||
[package]
|
[package]
|
||||||
name = "roast-vm-sys"
|
name = "roast-vm-sys"
|
||||||
version = "0.1.5"
|
version = "0.2.0"
|
||||||
edition = "2024"
|
edition = "2024"
|
||||||
publish = ["nexus"]
|
publish = ["nexus"]
|
||||||
|
|
||||||
|
|||||||
156
docs/class-file-parsing.md
Normal file
@ -0,0 +1,156 @@
# Class File Parsing

**Location**: `crates/core/src/class_file/`

The class file parser uses the **deku** library for declarative binary deserialization with automatic validation.

## Components

| File | Purpose |
|------|---------|
| `class_file.rs` | Main ClassFile struct with version, constant pool, fields, methods, attributes |
| `constant_pool.rs` | ConstantPoolGet/ConstantPoolExt traits for pool access and resolution |
| `attributes.rs` | Attribute parsing (Code, LineNumberTable, LocalVariableTable, BootstrapMethods) |
| `mod.rs` | Access flag definitions (ClassFlags, MethodFlags, FieldFlags) |

## Key Types

```rust
pub struct ClassFile {
    pub minor_version: u16,
    pub major_version: u16,
    pub constant_pool: Arc<ConstantPoolOwned>,
    pub access_flags: u16,
    pub this_class: u16,
    pub super_class: u16,
    pub interfaces: Vec<u16>,
    pub fields: Vec<FieldInfo>,
    pub methods: Vec<MethodInfo>,
    pub attributes: Vec<AttributeInfo>,
}

pub struct FieldInfo {
    pub access_flags: u16,
    pub name_index: u16,
    pub descriptor_index: u16,
    pub attributes: Vec<AttributeInfo>,
}

pub struct MethodInfo {
    pub access_flags: u16,
    pub name_index: u16,
    pub descriptor_index: u16,
    pub attributes: Vec<AttributeInfo>,
}
```

## Constant Pool

Trait-based architecture with two layers:

### ConstantPoolGet Trait

Low-level accessors:

- `get_constant()`: Resolve by index (accounts for 64-bit entries)
- Type-specific getters: `get_i32()`, `get_utf8_info()`, `get_class_info()`, `get_method_ref()`, etc.
- Implemented via `pool_get_impl!` macro
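
The note about 64-bit entries refers to a rule in the class file format: constant pool indices start at 1, and each `Long` or `Double` entry occupies two index slots, the second of which is unusable. The following is a minimal sketch of such a lookup, not the project's actual `get_constant()`. The `Entry` enum and the dense `&[Entry]` storage are assumptions made for illustration.

```rust
/// Simplified constant pool entry (hypothetical; the real enum has more variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Entry {
    Utf8(String),
    Integer(i32),
    Long(i64),
    Double(f64),
}

impl Entry {
    /// Number of constant pool index slots this entry occupies.
    fn slots(&self) -> usize {
        match self {
            Entry::Long(_) | Entry::Double(_) => 2,
            _ => 1,
        }
    }
}

/// Look up a 1-based constant pool index in a densely stored entry list.
/// Returns `None` for index 0, out-of-range indices, and the unusable
/// second slot of a `Long`/`Double`.
pub fn get_constant(entries: &[Entry], index: u16) -> Option<&Entry> {
    let mut slot = 1usize; // constant pool indices start at 1
    for entry in entries {
        if slot == index as usize {
            return Some(entry);
        }
        slot += entry.slots();
        if slot > index as usize {
            return None; // index falls inside a wide entry (or is 0)
        }
    }
    None
}
```

With the entries `[Utf8, Long, Integer]`, the `Long` sits at index 2, index 3 is unusable, and the `Integer` is found at index 4.
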

### ConstantPoolExt Trait

High-level operations:

- `get_string()`: Fetch UTF-8 strings with CESU-8 decoding
- `resolve_class_name()`: Trace class references through constant pool
- `resolve_method_ref()` / `resolve_interface_method_ref()`: Resolve method references
- `resolve_field()`: Resolve field references with type descriptors
- `parse_attribute()`: Convert raw attribute bytes to typed Attribute enum

### Constant Pool Entry Types (17 types)

```rust
pub enum ConstantPoolEntry {
    Utf8(ConstantUtf8Info),
    Integer(i32), Float(f32), Long(i64), Double(f64),
    Class(ConstantClassInfo),
    String(ConstantStringInfo),
    FieldRef(ConstantFieldrefInfo),
    MethodRef(ConstantMethodrefInfo),
    InterfaceMethodRef(ConstantInterfaceMethodrefInfo),
    NameAndType(ConstantNameAndTypeInfo),
    MethodHandle(ConstantMethodHandleInfo),
    MethodType(ConstantMethodTypeInfo),
    Dynamic(ConstantDynamicInfo),
    InvokeDynamic(ConstantInvokeDynamicInfo),
    Module(ConstantModuleInfo),
    Package(ConstantPackageInfo),
}
```

## Attributes

Recursive attribute parsing with support for:

- **Code**: Method bytecode with max_stack, max_locals, exception tables, nested attributes
- **LineNumberTable**: Maps bytecode offsets to source line numbers
- **LocalVariableTable**: Local variable debugging info (name, descriptor, PC range)
- **BootstrapMethods**: Dynamic invocation bootstrap method references
- **StackMapTable**, **Exceptions**, **InnerClasses**: Parsed as raw byte vectors
- **SourceFile**, **Signature**: Index-based attribute data
- **Unknown**: Fallback for unrecognized attributes

### Code Attribute Structure

```rust
pub struct CodeAttribute {
    pub max_stack: u16,
    pub max_locals: u16,
    pub code_length: u32,
    pub code: Vec<u8>,
    pub exception_table: Vec<ExceptionTableEntry>,
    pub attributes: Vec<AttributeInfo>, // Recursive
}
```

## Access Flags

Bitfield structures for parsing JVM access flags:

- **ClassFlags**: PUBLIC, FINAL, INTERFACE, ABSTRACT, SYNTHETIC, ANNOTATION, ENUM, MODULE
- **FieldFlags**: PUBLIC, PRIVATE, PROTECTED, STATIC, FINAL, VOLATILE, TRANSIENT, SYNTHETIC, ENUM
- **MethodFlags**: PUBLIC, PRIVATE, PROTECTED, STATIC, FINAL, SYNCHRONIZED, BRIDGE, VARARGS, NATIVE, ABSTRACT, STRICT, SYNTHETIC
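
Access flags are stored as a single `u16` bitmask, so each flag check is one bitwise AND. The sketch below shows that idea for a few method flags. The mask values come from the JVM specification (§4.6). The `MethodFlagBits` wrapper and its helper methods are hypothetical and are not the project's `MethodFlags` type.

```rust
/// Method access flag masks as defined by the JVM specification (section 4.6).
pub const ACC_PUBLIC: u16 = 0x0001;
pub const ACC_STATIC: u16 = 0x0008;
pub const ACC_NATIVE: u16 = 0x0100;
pub const ACC_ABSTRACT: u16 = 0x0400;

/// Hypothetical wrapper around the raw `access_flags` field of a method.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MethodFlagBits(pub u16);

impl MethodFlagBits {
    /// True when every bit in `mask` is set.
    pub fn contains(self, mask: u16) -> bool {
        self.0 & mask == mask
    }

    /// Native methods are dispatched through JNI instead of the interpreter.
    pub fn is_native(self) -> bool {
        self.contains(ACC_NATIVE)
    }

    /// Only methods that are neither native nor abstract carry a Code attribute.
    pub fn has_code(self) -> bool {
        !self.contains(ACC_NATIVE) && !self.contains(ACC_ABSTRACT)
    }
}
```

For example, `MethodFlagBits(0x0109)` describes a `public static native` method: `is_native()` is true and `has_code()` is false.
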

## Validation

Validation occurs at multiple levels:

1. **Binary Format** (Automatic via Deku):
   - Magic number (0xCAFEBABE)
   - Big-endian byte order
   - Type-safe parsing with error propagation

2. **Constant Pool**:
   - Index bounds checking in `get_constant()`
   - Type validation: each accessor checks that the entry type matches the expected type
   - CESU-8 decoding errors caught from Java-style UTF-8 strings

3. **Class Structure** (ClassLoader):
   - Debug assertions for Object class having super_class = 0
   - Non-Object classes must have valid super class reference
   - Interfaces must inherit from Object

## Descriptor Parsing

```rust
// Method descriptor: (II)I -> two ints, return int
let method = MethodDescriptor::parse("(II)I")?;

// Field descriptor: Ljava/lang/String; -> String class type
let string_type = FieldType::parse("Ljava/lang/String;")?;

// Array descriptor: [[I -> 2D int array
let int_matrix = FieldType::parse("[[I")?;
```
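
To make the descriptor grammar concrete, here is a minimal recursive-descent sketch covering base types, object types, arrays, and method descriptors. It is an illustration under stated assumptions, not the project's `MethodDescriptor`/`FieldType` implementation: `FieldTy`, `parse_field`, and `parse_method` are hypothetical names, and errors are collapsed into `None` rather than a `DescParseError`.

```rust
/// Hypothetical, simplified descriptor type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FieldTy {
    Base(char),          // one of B C D F I J S Z
    Object(String),      // internal class name, e.g. java/lang/String
    Array(Box<FieldTy>), // one array dimension around a component type
}

/// Parse one field descriptor from the front of `s`; return it plus the unparsed rest.
fn parse_one(s: &str) -> Option<(FieldTy, &str)> {
    let mut chars = s.chars();
    match chars.next()? {
        c @ ('B' | 'C' | 'D' | 'F' | 'I' | 'J' | 'S' | 'Z') => {
            Some((FieldTy::Base(c), chars.as_str()))
        }
        'L' => {
            let rest = chars.as_str();
            let end = rest.find(';')?;
            if end == 0 {
                return None; // empty class name
            }
            Some((FieldTy::Object(rest[..end].to_string()), &rest[end + 1..]))
        }
        '[' => {
            let (inner, rest) = parse_one(chars.as_str())?;
            Some((FieldTy::Array(Box::new(inner)), rest))
        }
        _ => None,
    }
}

/// Parse a complete field descriptor; trailing input is rejected.
pub fn parse_field(s: &str) -> Option<FieldTy> {
    match parse_one(s)? {
        (ty, "") => Some(ty),
        _ => None,
    }
}

/// Parse a method descriptor into (parameters, return type); `None` return means void.
pub fn parse_method(s: &str) -> Option<(Vec<FieldTy>, Option<FieldTy>)> {
    let mut rest = s.strip_prefix('(')?;
    let mut params = Vec::new();
    loop {
        if let Some(after) = rest.strip_prefix(')') {
            rest = after;
            break;
        }
        let (ty, tail) = parse_one(rest)?;
        params.push(ty);
        rest = tail;
    }
    let ret = if rest == "V" { None } else { Some(parse_field(rest)?) };
    Some((params, ret))
}
```

Under this sketch, `parse_method("(II)I")` yields two `Base('I')` parameters and a `Base('I')` return type, and `parse_field("[[I")` yields two nested `Array` layers around `Base('I')`.
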

## Error Handling

- `DekuError`: Binary parsing failures
- `ConstantPoolError`: Pool access with Generic, DescriptorParseError, Cesu8DecodingError variants
- `VmError`: Higher-level VM-specific errors
- `DescParseError`: Invalid method/field descriptor syntax
134
docs/class-loading.md
Normal file
@ -0,0 +1,134 @@
# Class Loading and Boot Image

**Location**: `crates/core/src/class_loader.rs`, `crates/core/src/bimage.rs`

## Boot Image (Bimage)

A 7z archive containing precompiled Java standard library classes.

### Structure

```rust
pub struct Bimage {
    image: ArchiveReader<File>,        // 7z archive reader
    modules: Vec<String>,              // Available modules
    packages: HashMap<String, String>, // Package -> Module mapping
    pub total_access_time: Duration,   // Performance tracking
}
```

- **Default Location**: `./lib/modules`
- **Format**: 7z compressed archive
- **Structure**: `<module>/classes/<class>.class`
- **Default Module**: `java.base` (used when no module is specified)

## Class Loading Flow

The `ClassLoader` manages class resolution with a two-tier fallback mechanism:

### Process

1. **Check Cache**: Look in `DashMap<(String, LoaderId), Arc<RuntimeClass>>` for already-loaded classes
2. **Try Bimage**: Attempt to load from boot image via `bimage.get_class(module, class_fqn)`
3. **Fallback to Disk**: If not in bimage, load from filesystem at `{CLASSPATH}/{class_name}.class`
4. **Parse & Cache**: Parse ClassFile using deku, create RuntimeClass, store in cache
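
The four steps above can be sketched as a self-contained toy loader. Everything here is an assumption made for illustration: a `std::collections::HashMap` stands in for the `DashMap` cache, two in-memory maps stand in for the 7z boot image and the classpath directory, and `LoadedClass` replaces `RuntimeClass` with no parsing step at all.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical loader identifier; the real `LoaderId` type is not shown in the docs.
pub type LoaderId = u32;

/// Stand-in for `RuntimeClass`: just the name and the raw class bytes.
#[derive(Debug, PartialEq)]
pub struct LoadedClass {
    pub name: String,
    pub bytes: Vec<u8>,
}

pub struct SketchLoader {
    cache: HashMap<(String, LoaderId), Arc<LoadedClass>>,
    boot_image: HashMap<String, Vec<u8>>, // stand-in for the boot image archive
    disk: HashMap<String, Vec<u8>>,       // stand-in for the classpath directory
}

impl SketchLoader {
    pub fn new(boot_image: HashMap<String, Vec<u8>>, disk: HashMap<String, Vec<u8>>) -> Self {
        Self { cache: HashMap::new(), boot_image, disk }
    }

    pub fn load_class(&mut self, name: &str, loader: LoaderId) -> Result<Arc<LoadedClass>, String> {
        // 1. Check the cache, keyed by (class name, loader).
        let key = (name.to_string(), loader);
        if let Some(hit) = self.cache.get(&key) {
            return Ok(Arc::clone(hit));
        }
        // 2. Try the boot image first, 3. then fall back to the classpath.
        let bytes = self
            .boot_image
            .get(name)
            .or_else(|| self.disk.get(name))
            .cloned()
            .ok_or_else(|| format!("class not found: {name}"))?;
        // 4. "Parse" (elided in this sketch) and cache the result.
        let class = Arc::new(LoadedClass { name: name.to_string(), bytes });
        self.cache.insert(key, Arc::clone(&class));
        Ok(class)
    }
}
```

Because the cache key includes the loader ID, loading the same name through two different loaders produces two distinct `Arc` values, while repeated loads through one loader return the cached instance.
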

### Key Method

```rust
// Abbreviated excerpt: error details and the return value are elided.
pub fn load_class(&mut self, what: &str, loader: LoaderId) -> Result<Arc<RuntimeClass>, VmError> {
    let bytes = self.bimage
        .and_then(|b| b.get_class("", what).ok())
        .or_else(|_| Self::load_class_from_disk(what))
        .map_err(|_| VmError::LoaderError(...))?;

    let (_, cf) = ClassFile::from_bytes(bytes.as_ref())?;
    let runtime = self.runtime_class(cf);

    // Store with loader ID for multi-loader support
    self.classes.insert((class_fqn, loader), Arc::new(runtime));
}
```

## Classpath Handling

### Resolution Priority

1. **Bimage (boot image)** - Primary source for standard library
2. **Command-line argument (arg[1])** - User-provided classpath
3. **Default `./data` directory** - Fallback location

### Implementation

```rust
fn load_class_from_disk(what: &str) -> Result<Vec<u8>, String> {
    // Classpath is the first CLI argument, or ./data when none is given
    let class_path = std::env::args()
        .nth(1)
        .unwrap_or("./data".to_string())
        .replace("\\", "/");

    let path = format!("{class_path}/{what}.class");
    // Load file from disk (this read is a sketch; the original excerpt elides it)
    std::fs::read(&path).map_err(|e| e.to_string())
}
```

## Bootstrap Process

**Location**: `crates/core/src/vm.rs` - `boot_strap()` method

### Steps

1. **Create VM**: `ClassLoader::with_bimage()` - initializes with boot image
2. **Load Core Classes**: Preloads essential VM classes
3. **Create Primitive Classes**: Synthetic class objects for primitive types
4. **Initialize Classes**: Run static initializers (`<clinit>`)

### Core Classes Loaded

```rust
let classes = vec![
    "java/lang/String",
    "java/lang/System",
    "java/lang/Class",
    "java/lang/Object",
    "java/lang/Thread",
    "java/lang/ThreadGroup",
    "java/lang/Module",
    "java/lang/reflect/Method",
    // ...
];
```

## RuntimeClass

Runtime representation of a loaded class:

### Cached Data

- **Superclass Chain**: `super_classes: Vec<Arc<RuntimeClass>>`
- **Interface Hierarchy**: `super_interfaces: Vec<Arc<RuntimeClass>>`
- **Component Type**: For array classes, reference to element type
- **Initialization State**: Thread-safe `InitState` enum

### Initialization States

```rust
pub enum InitState {
    NotInitialized,
    Initializing(ThreadId), // Track which thread is initializing
    Initialized,
}
```
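
Tracking the initializing thread is what makes recursive initialization safe: under JVM Spec 5.5, a class that is already being initialized by the current thread is treated as a recursive request and the caller simply proceeds. The sketch below models those transitions. `InitAction`, `begin_init`, `finish_init`, and the `u64` thread ID are hypothetical, and real code would guard the state with a lock rather than take `&mut`.

```rust
/// Hypothetical thread identifier for this sketch.
pub type ThreadId = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InitState {
    NotInitialized,
    Initializing(ThreadId),
    Initialized,
}

/// What the caller should do after asking to initialize a class.
#[derive(Debug, PartialEq, Eq)]
pub enum InitAction {
    RunClinit, // caller now owns initialization and must run <clinit>
    Proceed,   // already initialized, or a recursive request by the owner
    Wait,      // another thread is initializing; caller must wait
}

pub fn begin_init(state: &mut InitState, current: ThreadId) -> InitAction {
    match *state {
        InitState::NotInitialized => {
            *state = InitState::Initializing(current);
            InitAction::RunClinit
        }
        InitState::Initializing(owner) if owner == current => InitAction::Proceed,
        InitState::Initializing(_) => InitAction::Wait,
        InitState::Initialized => InitAction::Proceed,
    }
}

/// Called by the owning thread once <clinit> has completed.
pub fn finish_init(state: &mut InitState) {
    *state = InitState::Initialized;
}
```

A thread that re-enters `begin_init` while running `<clinit>` for the same class gets `Proceed` instead of deadlocking, while a second thread gets `Wait` until `finish_init` runs.
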

### Method/Field Resolution

- `find_method()` - Searches class then walks up superclass chain
- `find_field()` - Same recursive behavior for fields
- `is_assignable_into()` - Checks type compatibility with array covariance
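
The superclass walk used by `find_method()` can be illustrated with a small stand-alone sketch. `ClassSketch` is a hypothetical stand-in for `RuntimeClass`: methods are reduced to `(name, descriptor)` pairs, and the lookup returns the name of the declaring class rather than method data.

```rust
use std::sync::Arc;

/// Hypothetical, heavily reduced stand-in for `RuntimeClass`.
pub struct ClassSketch {
    pub name: String,
    pub methods: Vec<(String, String)>, // (method name, descriptor)
    pub super_class: Option<Arc<ClassSketch>>,
}

impl ClassSketch {
    /// Search this class first, then each superclass in turn.
    /// Returns the name of the class that declares the method.
    pub fn find_method(&self, name: &str, desc: &str) -> Option<&str> {
        let mut current = Some(self);
        while let Some(class) = current {
            if class.methods.iter().any(|(n, d)| n == name && d == desc) {
                return Some(class.name.as_str());
            }
            // Move one step up the superclass chain; `None` ends the loop.
            current = class.super_class.as_deref();
        }
        None
    }
}
```

Because the search starts at the receiver's own class, a method declared in a subclass shadows one with the same name and descriptor further up the chain.
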

## Multi-Loader Support

Classes are keyed by `(class_name, LoaderId)` tuple to support:

- Different class loaders loading same-named classes
- Isolation between class loader namespaces
- Proper class identity checks
141
docs/frame-interpreter.md
Normal file
@ -0,0 +1,141 @@
# Frame and Bytecode Interpreter

**Location**: `crates/core/src/frame/`

## Frame Structure

Each method invocation creates a `Frame` containing:

| Component | Description |
|-----------|-------------|
| Program Counter (PC) | i64 tracking current bytecode instruction |
| Operand Stack | Generic Vec-backed stack for intermediate values |
| Local Variables | Indexed slots, handles wide values (long/double occupy 2 slots) |
| Constant Pool | Arc reference to the class constant pool |
| Bytecode | Instructions for the method |

### OperandStack (`operand_stack.rs`)

- Generic Vec-backed stack with push/pop/peek operations
- `pop_n(n)` returns values in push order (not pop order) for method arguments
- Supports underflow detection
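
Returning arguments in push order matters because callers push arguments left to right, so the first argument sits deepest on the stack. A minimal sketch of that behavior follows. `StackSketch` is a hypothetical type, and underflow is reported as `None` here, which may differ from how the real `OperandStack` signals it.

```rust
/// Hypothetical Vec-backed operand stack.
#[derive(Debug)]
pub struct StackSketch<T> {
    items: Vec<T>,
}

impl<T> StackSketch<T> {
    pub fn new() -> Self {
        Self { items: Vec::new() }
    }

    pub fn push(&mut self, value: T) {
        self.items.push(value);
    }

    /// Pop the top value, or `None` on underflow.
    pub fn pop(&mut self) -> Option<T> {
        self.items.pop()
    }

    /// Remove the top `n` values and return them oldest-first (push order).
    /// On underflow the stack is left untouched and `None` is returned.
    pub fn pop_n(&mut self, n: usize) -> Option<Vec<T>> {
        let start = self.items.len().checked_sub(n)?;
        Some(self.items.split_off(start))
    }
}
```

After pushing 10, 20, 30, a call to `pop_n(2)` returns `[20, 30]`, which is already in parameter order for a two-argument call.
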

### LocalVariables (`local_vars.rs`)

- Vec-backed, indexed by slot
- Handles wide values (long, double) that occupy 2 slots with padding
- `from_args()` automatically spaces wide values correctly
- Panics at runtime if a padding slot is accessed
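
The slot layout can be sketched as follows: every `long` or `double` argument is followed by a padding slot, so later arguments land at the indices the bytecode expects. `Val`, `Slot`, and `LocalsSketch` are hypothetical names, and this sketch returns `None` for padding or empty slots where the documented implementation panics.

```rust
/// Hypothetical value type with one narrow and two wide variants.
#[derive(Debug, Clone, PartialEq)]
pub enum Val {
    Int(i32),
    Long(i64),
    Double(f64),
}

impl Val {
    /// `long` and `double` occupy two local variable slots.
    fn is_wide(&self) -> bool {
        matches!(self, Val::Long(_) | Val::Double(_))
    }
}

#[derive(Debug, Clone, PartialEq)]
enum Slot {
    Empty,
    Value(Val),
    Padding, // second half of a wide value; never read directly
}

pub struct LocalsSketch {
    slots: Vec<Slot>,
}

impl LocalsSketch {
    /// Lay out arguments starting at slot 0, inserting padding after wide values.
    pub fn from_args(args: Vec<Val>, max_locals: usize) -> Self {
        let mut slots = Vec::with_capacity(max_locals);
        for arg in args {
            let wide = arg.is_wide();
            slots.push(Slot::Value(arg));
            if wide {
                slots.push(Slot::Padding);
            }
        }
        if slots.len() < max_locals {
            slots.resize(max_locals, Slot::Empty);
        }
        Self { slots }
    }

    /// Read a slot; padding, empty, and out-of-range slots yield `None`.
    pub fn get(&self, index: usize) -> Option<&Val> {
        match self.slots.get(index)? {
            Slot::Value(v) => Some(v),
            Slot::Empty | Slot::Padding => None,
        }
    }
}
```

For arguments `(int, long, int)`, the values land in slots 0, 1, and 3, with slot 2 reserved as padding for the `long`.
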

## Execution Loop

```rust
loop {
    let (offset, op) = self.next().unwrap();
    self.pc = offset as i64;
    let result = self.execute_instruction(op.clone());
    match result {
        Ok(ExecutionResult::Advance(offset)) => self.pc += offset as i64,
        Ok(_) => self.pc += 1,
        // On failure the error is returned together with a stack trace
        Err(x) => return Err(x),
    }
}
```

## Opcode Dispatch

**Location**: `frame.rs` `execute_instruction()` (lines 199-1516)

- Single match statement over `Ops` enum variants (defined in `instructions.rs`)
- 200+ opcodes: constants, loads/stores, math, stack ops, branches, references, method invocation
- Uses helper macros (`load!`, `store!`, `binary_op!`, `shift_op!`, etc.) for common patterns
- Each opcode returns one of:
  - `ExecutionResult::Continue` (auto-increment PC by 1)
  - `ExecutionResult::Advance(offset)` (jump)
  - `ExecutionResult::Return(())` or `ExecutionResult::ReturnValue(Value)` (exit frame)
  - `VmError` on failure

## Opcode Encoding (`instructions.rs`)

- Uses `deku` derive for binary deserialization
- Each opcode has a u8 ID (0x00-0xFF)
- Some opcodes carry operands (e.g., `iload(u8)`, `goto(i16)`)
- Wide instruction prefix (0xC4) for accessing local slots > 255

## Method Invocation

**Location**: `thread.rs` (lines 308-338)

### Invocation Types

1. **`invoke()`**: Resolve method by class and descriptor, execute it
2. **`invoke_virtual()`**: Virtual dispatch - find method on actual runtime class
3. **`invoke_native()`**: Call native JNI method via FFI

### Bytecode Invocation Instructions

- **`invokevirtual`**: Virtual method dispatch - pop receiver + arguments, get actual class, call `thread.invoke_virtual()`
- **`invokespecial`**: Non-virtual (constructors, private, super) - pop receiver + arguments, call `thread.invoke()` with static resolution
- **`invokestatic`**: Static methods - pop arguments only (no receiver), call `thread.invoke()`
- **`invokeinterface`**: Interface method dispatch - similar to `invokevirtual` with interface resolution

### Frame Creation & Execution

```rust
fn execute_method(&self, class: &Arc<RuntimeClass>, method: &MethodData, args: Vec<Value>) {
    let mut frame = Frame::new(
        class.clone(),
        method_ref,
        code_attr, // Contains max_stack, max_locals, bytecode
        args,      // Initialize local vars with parameters
        // ...
    );
    self.frame_stack.lock().push(frame.clone());
    frame.execute(); // Bytecode interpretation loop
    self.frame_stack.lock().pop();
}
```

## Supported Instructions

### Constants

`aconst_null`, `iconst_*`, `lconst_*`, `fconst_*`, `dconst_*`, `bipush`, `sipush`, `ldc`, `ldc_w`, `ldc2_w`

### Load/Store

`iload`, `lload`, `fload`, `dload`, `aload`, `istore`, `lstore`, `fstore`, `dstore`, `astore` (including `_0-3` variants)

### Array Operations

`iaload`, `laload`, `faload`, `daload`, `aaload`, `baload`, `caload`, `saload`, `iastore`, `lastore`, `fastore`, `dastore`, `aastore`, `bastore`, `castore`, `sastore`, `arraylength`

### Stack Manipulation

`pop`, `pop2`, `dup`, `dup_x1`, `dup_x2`, `dup2`, `dup2_x1`, `dup2_x2`, `swap`

### Arithmetic

All int/long/float/double add, sub, mul, div, rem, neg operations

### Bitwise

`ishl`, `lshl`, `ishr`, `lshr`, `iushr`, `lushr`, `iand`, `land`, `ior`, `lor`, `ixor`, `lxor`

### Type Conversions

All primitive type conversions (`i2l`, `l2i`, `f2d`, etc.)

### Comparisons

`lcmp`, `fcmpl`, `fcmpg`, `dcmpl`, `dcmpg`

### Control Flow

`ifeq`, `ifne`, `iflt`, `ifge`, `ifgt`, `ifle`, `if_icmp*`, `if_acmp*`, `goto`, `ifnull`, `ifnonnull`, `tableswitch`, `lookupswitch`

### Object Operations

`new`, `newarray`, `anewarray`, `multianewarray`, `checkcast`, `instanceof`

### Field Access

`getstatic`, `putstatic`, `getfield`, `putfield`

### Method Invocation

`invokevirtual`, `invokespecial`, `invokestatic`, `invokeinterface`

### Returns

`ireturn`, `lreturn`, `freturn`, `dreturn`, `areturn`, `return`

### Synchronization

`monitorenter`, `monitorexit`
155
docs/jni.md
Normal file
@ -0,0 +1,155 @@
# JNI Implementation

**Location**: `crates/core/src/native/jni.rs`

## JNIEnv Structure

The JNIEnv is created as a function table (`JNINativeInterface_` from the `jni` crate) with 250+ function pointers:

```rust
pub fn create_jni_function_table(thread: *const VmThread) -> JNIEnv {
    Box::into_raw(Box::new(JNINativeInterface_ {
        reserved0: thread as *mut _, // Stores pointer to VmThread for context
        reserved1: std::ptr::null_mut(),
        reserved2: std::ptr::null_mut(),
        reserved3: std::ptr::null_mut(),
        GetVersion: Some(jni_get_version),
        FindClass: Some(find_class),
        RegisterNatives: Some(register_natives),
        GetMethodID: Some(get_method_id),
        // ... 240+ more function pointers
    }))
}
```

**Key Feature:** The `reserved0` field stores a pointer to the `VmThread`, allowing each JNI function to access thread
context via `get_thread(env)`.
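
To make the `reserved0` round trip concrete, here is a minimal, self-contained sketch. It assumes a trimmed-down table with a single reserved slot; `FunctionTable`, `JniEnv`, `make_env`, and this `VmThread` are hypothetical stand-ins, and the real `get_thread` in RoastVM may differ.

```rust
use std::ffi::c_void;

// Hypothetical stand-in for the VM's thread type.
struct VmThread {
    id: u64,
}

// Hypothetical, trimmed-down function table: only the first reserved slot.
#[repr(C)]
struct FunctionTable {
    reserved0: *mut c_void,
}

// In JNI, `JNIEnv` is itself a pointer to the function table.
type JniEnv = *const FunctionTable;

/// Builds a table whose `reserved0` slot remembers the owning thread.
/// The box is intentionally leaked here; a real VM would free it on thread exit.
fn make_env(thread: *const VmThread) -> JniEnv {
    Box::into_raw(Box::new(FunctionTable {
        reserved0: thread as *mut c_void,
    }))
}

/// Recovers the thread pointer: env -> table -> reserved0.
///
/// Safety: `env` must point to a live `JniEnv` produced by `make_env`,
/// and the thread it refers to must still be alive.
unsafe fn get_thread(env: *mut JniEnv) -> *const VmThread {
    unsafe { (**env).reserved0 as *const VmThread }
}
```

Native functions receive a `*mut JNIEnv`, so recovering the thread takes two dereferences: one to reach the table pointer and one to reach the table itself.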

## VmThread's JNIEnv Storage

**Location**: `crates/core/src/thread.rs`

Each thread owns its JNIEnv:

```rust
pub struct VmThread {
    pub id: ThreadId,
    pub vm: Arc<Vm>,
    pub loader: Arc<Mutex<ClassLoader>>,
    pub jni_env: JNIEnv, // Stored per-thread
    // ... other fields
}

// Created during VmThread initialization:
let jni_env = create_jni_function_table(weak_self.as_ptr() as *mut VmThread);
```

## Native Method Invocation

**Location**: `crates/core/src/thread.rs` (lines 340-428)

The flow is:

1. **Method Detection:** When a method has the `ACC_NATIVE` flag, `invoke_native()` is called
2. **Symbol Resolution:** Generates the JNI symbol name (e.g., `Java_java_lang_String_intern`)
3. **Lookup:** Searches registered native methods or loaded native libraries via `find_native_method()`
4. **FFI Call:** Uses `libffi` to call the native function with the constructed arguments

```rust
pub fn invoke_native(&self, method: &MethodRef, args: Vec<Value>) -> MethodCallResult {
    let symbol_name = generate_jni_method_name(method, false);

    // Find the function pointer
    let p = self.vm.find_native_method(&symbol_name)
        .ok_or(VmError::NativeError(...))?;

    // Build Call Interface (CIF) for FFI
    let cp = CodePtr::from_ptr(p);
    let built_args = build_args(args, &mut storage, &self.jni_env as *mut JNIEnv);
    let cif = method.build_cif();

    // Invoke with type-specific call
    match &method.desc.return_type {
        None => {
            cif.call::<()>(cp, built_args.as_ref());
            Ok(None)
        }
        Some(FieldType::Base(BaseType::Int)) => {
            let v = cif.call::<jint>(cp, built_args.as_ref());
            Ok(Some(v.into()))
        }
        // ... handle other return types
    }
}
```
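
Step 2 of the flow (symbol resolution) depends on JNI name mangling. The sketch below follows the short-name escaping rules from the JNI specification: `/` becomes `_`, `_` becomes `_1`, `;` becomes `_2`, `[` becomes `_3`, and any other non-alphanumeric character becomes `_0xxxx` in lowercase hex. `mangle` and `jni_short_name` are hypothetical helpers written for illustration; they are not RoastVM's `generate_jni_method_name`.

```rust
/// Escapes one name component using the JNI short-name rules.
fn mangle(component: &str) -> String {
    let mut out = String::new();
    for c in component.chars() {
        match c {
            '/' => out.push('_'),
            '_' => out.push_str("_1"),
            ';' => out.push_str("_2"),
            '[' => out.push_str("_3"),
            c if c.is_ascii_alphanumeric() => out.push(c),
            c => {
                // Everything else is written per UTF-16 code unit as `_0xxxx`.
                let mut buf = [0u16; 2];
                for unit in c.encode_utf16(&mut buf).iter() {
                    out.push_str(&format!("_0{:04x}", *unit));
                }
            }
        }
    }
    out
}

/// Builds the short JNI symbol name from a class in internal form
/// (e.g. `java/lang/String`) and a method name.
fn jni_short_name(class: &str, method: &str) -> String {
    format!("Java_{}_{}", mangle(class), mangle(method))
}
```

For example, `jni_short_name("java/lang/String", "intern")` yields `Java_java_lang_String_intern`, the symbol used as the example in step 2.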

## Argument Marshalling

**Location**: `crates/core/src/thread.rs` (lines 509-548)

Native functions receive:

1. `JNIEnv*` - pointer to the function table
2. `jclass` or `jobject` (receiver) - always an ID (u32)
3. Parameters - converted from VM `Value` types to JNI types

```rust
fn build_args(mut params: VecDeque<Value>, storage: &mut Vec<Box<dyn Any>>,
              jnienv: *mut JNIEnv) -> Vec<Arg> {
    storage.push(Box::new(jnienv)); // Slot 0: JNIEnv*
    let receiver_id = params.pop_front().map(...);
    storage.push(Box::new(receiver_id as jobject)); // Slot 1: this/class

    for value in params {
        match value {
            Value::Primitive(Primitive::Int(x)) => storage.push(Box::new(x)),
            Value::Reference(Some(ref_kind)) => {
                storage.push(Box::new(ref_kind.id() as jobject)) // References as IDs
            }
            // ... other types
        }
    }
    storage.iter().map(|boxed| arg(&**boxed)).collect()
}
```

## Native Method Registration

**Location**: `crates/core/src/native/jni.rs` (lines 381-442)

Java code calls the `RegisterNatives()` JNI function, which stores the function pointers:

```rust
unsafe extern "system" fn register_natives(
    env: *mut JNIEnv,
    clazz: jclass,
    methods: *const JNINativeMethod,
    n_methods: jint,
) -> jint {
    let thread = &*get_thread(env);

    for i in 0..n_methods as usize {
        let native_method = &*methods.add(i);
        let full_name = generate_jni_short_name(&class_name, name);

        thread.vm.native_methods.insert(full_name, native_method.fnPtr);
    }
    JNI_OK
}
```

## Implemented JNI Functions

| Category          | Functions                                                    |
|-------------------|--------------------------------------------------------------|
| Version           | `GetVersion`                                                 |
| Class Operations  | `FindClass`, `GetSuperclass`, `IsAssignableFrom`             |
| Exceptions        | `Throw`, `ThrowNew`, `ExceptionOccurred`, `ExceptionClear`   |
| References        | `NewGlobalRef`, `DeleteGlobalRef`, `NewLocalRef`             |
| Object Operations | `AllocObject`, `NewObject`, `GetObjectClass`, `IsInstanceOf` |
| Field Access      | `GetFieldID`, `Get/Set<Type>Field`, `GetStaticFieldID`       |
| Method Invocation | `GetMethodID`, `Call<Type>Method`, `CallStatic<Type>Method`  |
| String Operations | `NewString`, `GetStringLength`, `GetStringChars`             |
| Array Operations  | `NewArray`, `GetArrayLength`, `Get/Set<Type>ArrayRegion`     |
| Registration      | `RegisterNatives`, `UnregisterNatives`                       |
| Monitors          | `MonitorEnter`, `MonitorExit`                                |
125
docs/native-ffi.md
Normal file
@ -0,0 +1,125 @@
# Native Methods and FFI System

**Location**: `crates/core/src/thread.rs`, `crates/core/src/native/`

## Library Loading

Native libraries are loaded dynamically using `libloading`:

- **Location**: `crates/core/src/main.rs`
- **Supported platforms**: Windows (.dll), Linux (.so)
- **Libraries loaded**:
  - `roast_vm` - VM-specific native methods
  - `jvm` - Java virtual machine standard library
  - `java` - Java standard library

Libraries are registered with the VM via `Vm::load_native_library()` and stored in `native_libraries: Arc<RwLock<Vec<(String, Library)>>>`.

## Native Method Registration

Native methods are registered through JNI's `RegisterNatives` function:

- **Location**: `crates/core/src/native/jni.rs`
- **Process**:
  1. Java code calls `RegisterNatives()` with method names, signatures, and function pointers
  2. VM generates JNI-formatted symbol names using `generate_jni_method_name()`
  3. Function pointers are stored in `Vm::native_methods: DashMap<String, *const c_void>`
  4. Filters prevent registration of certain methods (e.g., Thread operations)

## Native Method Dispatch

When a native method is invoked:

- **Detection**: `MethodData::ACC_NATIVE` flag identifies native methods
- **Location**: `crates/core/src/thread.rs`
- **Flow**:
  1. `execute_method()` checks if method has `ACC_NATIVE` flag
  2. If static, adds class reference to args; otherwise adds instance reference
  3. Calls `invoke_native()` with method reference and arguments

## FFI System - libffi Integration

libffi is used to call native functions with correct calling conventions:

### Call Interface Building

```rust
fn build_cif(&self) -> Cif {
    let mut args = vec![
        Type::pointer(), // JNIEnv*
        Type::pointer(), // jclass/jobject
    ];
    for v in self.desc.parameters {
        args.push(v.into())
    }
    let return_type = ...;
    Builder::new().args(args).res(return_type).into_cif()
}
```

- Constructs a Call Interface (Cif) from method signature
- Maps Java types to FFI types
- Always adds JNIEnv* and jclass/jobject as first two parameters

### Argument Marshalling

```rust
fn build_args(params, storage, jnienv) -> Vec<Arg>
```

- Marshals Java values to native format
- Converts references to object IDs (u32)
- Boxes primitives for FFI passing
- Stores all values in a temporary vector

### Function Invocation

```rust
let cp = CodePtr::from_ptr(p);
cif.call::<ReturnType>(cp, built_args.as_ref());
```

- Converts function pointer to CodePtr
- Calls through libffi with correct return type handling

## Type Mapping

| Java Type | FFI Type |
|-----------|----------|
| byte | i8 |
| char | u16 |
| short | i16 |
| int | i32 |
| long | i64 |
| float | f32 |
| double | f64 |
| boolean | i8 |
| Object/Array | pointer |
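
As an illustration of how the table could be applied, the sketch below walks a JVM method descriptor such as `(ILjava/lang/String;[B)V` and maps each parameter to an FFI type following the table above. `FfiType`, `parse_descriptor`, and `field_type` are hypothetical names; the actual code converts its own descriptor types into libffi `Type` values.

```rust
/// Hypothetical mirror of the FFI types in the table above.
#[derive(Debug, PartialEq, Clone, Copy)]
enum FfiType {
    I8,
    U16,
    I16,
    I32,
    I64,
    F32,
    F64,
    Pointer,
    Void,
}

/// Maps one field descriptor, consuming any trailing characters it owns
/// (the class name of an `L...;` type or the element type of an array).
fn field_type(first: char, rest: &mut std::str::Chars<'_>) -> Option<FfiType> {
    Some(match first {
        'B' | 'Z' => FfiType::I8, // byte and boolean, as in the table
        'C' => FfiType::U16,
        'S' => FfiType::I16,
        'I' => FfiType::I32,
        'J' => FfiType::I64,
        'F' => FfiType::F32,
        'D' => FfiType::F64,
        'L' => {
            // Skip the class name up to and including ';'.
            rest.find(|&c| c == ';')?;
            FfiType::Pointer
        }
        '[' => {
            // Consume the element type; the array itself is a pointer.
            let element = rest.next()?;
            field_type(element, rest)?;
            FfiType::Pointer
        }
        _ => return None,
    })
}

/// Parses `(params)ret` into parameter types and a return type.
fn parse_descriptor(desc: &str) -> Option<(Vec<FfiType>, FfiType)> {
    let inner = desc.strip_prefix('(')?;
    let (params, ret) = inner.split_once(')')?;

    let mut out = Vec::new();
    let mut chars = params.chars();
    while let Some(c) = chars.next() {
        out.push(field_type(c, &mut chars)?);
    }

    let mut ret_chars = ret.chars();
    let first = ret_chars.next()?;
    let ret_type = if first == 'V' {
        FfiType::Void
    } else {
        field_type(first, &mut ret_chars)?
    };
    if ret_chars.next().is_some() {
        return None; // trailing garbage after the return type
    }
    Some((out, ret_type))
}
```

A call interface for a native method would then prepend two pointer arguments (`JNIEnv*` and the receiver) to the parsed parameter list.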

## JNI Function Table

A complete JNI function table is created and passed to native code:

- **Location**: `crates/core/src/native/jni.rs`
- **Implementation**:
  - 250+ JNI functions defined as `unsafe extern "system"` functions
  - Covers class operations, method invocation, field access, array operations, string handling
  - Many functions are stubs that call `todo!()`, which panics if an unimplemented feature is invoked

## Unsafe Support (sun.misc.Unsafe)

Low-level unsafe operations are tracked:

- **Location**: `crates/core/src/native/unsafe.rs`
- **Features**:
  - Field offset registry mapping to class/field pairs
  - Off-heap memory allocation tracking
  - Base offset constant: `0x1_0000_0000`

## Key Implementation Characteristics

- **Thread-safe**: All structures use `DashMap`, `RwLock`, or `Mutex`
- **JNI environment**: Created per-thread, stored in `VmThread::jni_env`
- **Symbol lookup**: Two-pass search (without params, then with params)
- **Error handling**: Returns `VmError::NativeError` if symbols not found
- **Tracking**: Maintains statistics on library resolution counts
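
The two-pass symbol lookup can be sketched as follows. In JNI, the long name of an overloaded native method is the short name followed by `__` and the mangled parameter signature. `NativeRegistry` is a hypothetical stand-in that stores addresses as `usize`; the real VM also searches loaded libraries and reports a miss as `VmError::NativeError`.

```rust
use std::collections::HashMap;

/// Hypothetical registry: symbol name -> function address.
#[derive(Default)]
struct NativeRegistry {
    symbols: HashMap<String, usize>,
}

impl NativeRegistry {
    fn register(&mut self, name: &str, addr: usize) {
        self.symbols.insert(name.to_string(), addr);
    }

    /// Pass 1 tries the short name (no parameters); pass 2 tries the
    /// long name, i.e. `short__mangledparams`.
    fn find(&self, short_name: &str, mangled_params: &str) -> Option<usize> {
        if let Some(&addr) = self.symbols.get(short_name) {
            return Some(addr);
        }
        let long_name = format!("{short_name}__{mangled_params}");
        self.symbols.get(&long_name).copied()
    }
}
```

Trying the short name first keeps the common, non-overloaded case to a single map lookup.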
95
docs/object-management.md
Normal file
@ -0,0 +1,95 @@
# Object Management

**Location**: `crates/core/src/objects/`

## Object Representation

Java objects are represented through a multi-layered abstraction:

```rust
pub struct Object {
    pub id: u32,                        // Unique identifier
    pub class: Arc<RuntimeClass>,       // Runtime class reference
    pub fields: DashMap<String, Value>, // Concurrent field storage
}

pub type ObjectReference = Arc<Mutex<Object>>;
```

- **Objects**: Contain a unique ID (u32), runtime class reference, and field storage (DashMap)
- **ObjectReference**: `Arc<Mutex<Object>>` - reference-counted, thread-safe smart pointer
- **Value wrapper**: The `Value` enum encapsulates both primitives and references for operand stack/local variable storage

## Array Management

Arrays are type-safe with separate variants for primitives and objects:

```rust
pub enum ArrayReference {
    Int(Arc<Mutex<Array<jint>>>),
    Byte(Arc<Mutex<Array<jbyte>>>),
    Short(Arc<Mutex<Array<jshort>>>),
    Long(Arc<Mutex<Array<jlong>>>),
    Float(Arc<Mutex<Array<jfloat>>>),
    Double(Arc<Mutex<Array<jdouble>>>),
    Char(Arc<Mutex<Array<jchar>>>),
    Boolean(Arc<Mutex<Array<jboolean>>>),
    Object(Arc<Mutex<Array<Option<ReferenceKind>>>>),
}
```

- **Primitive arrays**: Int, Byte, Short, Long, Float, Double, Char, Boolean
- **Object arrays**: Can hold references to other objects
- **Array structure**: Each array wraps a boxed slice `Box<[T]>` with id, class, and backing storage
- **Thread-safe**: All arrays use `Arc<Mutex<Array<T>>>` for concurrent access

## Allocation Strategy

Allocation is centralized in **ObjectManager**:

- **Object allocation**: `new_object()` generates unique IDs via an atomic counter and stores references in a `HashMap`
- **Array allocation**: Separate methods for primitive arrays (`new_primitive_array()`) and object arrays (`new_object_array()`)
- **String interning**: `new_string()` creates UTF-16 encoded strings with automatic interning via a string pool
- **Memory tracking**: `bytes_in_use()` calculates total heap usage across all objects/arrays

All allocated objects are registered in `objects: HashMap<u32, ReferenceKind>` for global access.
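
A minimal sketch of this allocation pattern, assuming a much simpler `Object` (an ID and a class name only) and a string pool keyed by Rust `String`. The field and method names other than `new_object` are hypothetical, and the real `ObjectManager` stores `ReferenceKind` values rather than bare object references.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, Mutex};

// Simplified stand-in for the VM's object type.
struct Object {
    id: u32,
    class_name: String,
}

type ObjectReference = Arc<Mutex<Object>>;

struct ObjectManager {
    next_id: AtomicU32,
    objects: HashMap<u32, ObjectReference>,
    string_pool: HashMap<String, ObjectReference>,
}

impl ObjectManager {
    fn new() -> Self {
        ObjectManager {
            next_id: AtomicU32::new(1), // 0 is left free, e.g. for null
            objects: HashMap::new(),
            string_pool: HashMap::new(),
        }
    }

    /// Allocates an object with a fresh ID and registers it globally.
    fn new_object(&mut self, class_name: &str) -> ObjectReference {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let obj = Arc::new(Mutex::new(Object {
            id,
            class_name: class_name.to_string(),
        }));
        self.objects.insert(id, Arc::clone(&obj));
        obj
    }

    /// Returns the pooled string object for `s`, allocating it on first use.
    fn intern(&mut self, s: &str) -> ObjectReference {
        if let Some(existing) = self.string_pool.get(s) {
            return Arc::clone(existing);
        }
        let obj = self.new_object("java/lang/String");
        self.string_pool.insert(s.to_string(), Arc::clone(&obj));
        obj
    }
}
```

In this sketch the registry holds a strong `Arc` to every object, so an entry stays alive until it is removed from the map.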

## Reference Handling

Two-level reference system, plus conversion helpers:

1. **ReferenceKind enum**: Distinguishes between `ObjectReference` and `ArrayReference`
2. **Reference type alias**: `Option<ReferenceKind>` (None = null)
3. **Conversion methods**: Safe conversions with `try_into_object_reference()` and `try_into_array_reference()`
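
The sketch below shows one way these pieces could fit together. The `Object` and `Array` structs are simplified stand-ins, and the `Result<_, String>` error type is an assumption; only the names `ReferenceKind`, `try_into_object_reference`, and `try_into_array_reference` come from the description above.

```rust
use std::sync::{Arc, Mutex};

// Simplified stand-ins for the VM's object and array types.
struct Object {
    id: u32,
}

struct Array {
    id: u32,
    elements: Vec<i32>,
}

type ObjectReference = Arc<Mutex<Object>>;
type ArrayReference = Arc<Mutex<Array>>;

/// Level 1: a non-null reference is either an object or an array.
#[derive(Clone)]
enum ReferenceKind {
    Object(ObjectReference),
    Array(ArrayReference),
}

/// Level 2: `None` models Java's `null`.
type Reference = Option<ReferenceKind>;

impl ReferenceKind {
    fn id(&self) -> u32 {
        match self {
            ReferenceKind::Object(o) => o.lock().unwrap().id,
            ReferenceKind::Array(a) => a.lock().unwrap().id,
        }
    }

    /// Succeeds only for object references.
    fn try_into_object_reference(&self) -> Result<ObjectReference, String> {
        match self {
            ReferenceKind::Object(o) => Ok(Arc::clone(o)),
            ReferenceKind::Array(_) => Err("expected object, found array".to_string()),
        }
    }

    /// Succeeds only for array references.
    fn try_into_array_reference(&self) -> Result<ArrayReference, String> {
        match self {
            ReferenceKind::Array(a) => Ok(Arc::clone(a)),
            ReferenceKind::Object(_) => Err("expected array, found object".to_string()),
        }
    }
}
```

Returning a `Result` lets instructions such as `arraylength` fail cleanly when they are handed the wrong kind of reference, instead of panicking.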

## Memory Management

**No explicit garbage collection** - relies on Rust's reference counting:

- Arc ensures objects live as long as references exist
- Mutex provides thread-safe field/element access
- Shallow cloning for `clone()` operations (copies references, not objects)
- Array copy operations handle both primitive and object types with bounds checking

## Object Synchronization

Monitor-based concurrency for synchronized operations:

```rust
pub struct Monitor {
    owner: Option<ThreadId>,
    entry_count: u32,
    condition: Condvar,
    mutex: Mutex<()>,
}
```

- **Operations**: `monitor_enter()`, `monitor_exit()`, `wait()`, `notify_one()`, `notify_all()`
- **Wait semantics**: Full support for Java-style wait/notify with timeout
- **Reentrant**: Same thread can enter multiple times (tracked by entry_count)
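
To illustrate the reentrancy rule, here is a minimal sketch of enter/exit built on `Mutex` and `Condvar`. It is a simplification under stated assumptions: the owner and entry count are folded into the mutex-protected state (unlike the struct above), `wait`/`notify` are omitted, and the method names `enter` and `exit` are hypothetical.

```rust
use std::sync::{Condvar, Mutex};
use std::thread::{self, ThreadId};

#[derive(Default)]
struct State {
    owner: Option<ThreadId>,
    entry_count: u32,
}

#[derive(Default)]
struct Monitor {
    state: Mutex<State>,
    available: Condvar,
}

impl Monitor {
    /// Blocks until the monitor is free or already owned by the caller,
    /// then records one more entry for the calling thread.
    fn enter(&self) {
        let me = thread::current().id();
        let mut st = self.state.lock().unwrap();
        while st.owner.is_some() && st.owner != Some(me) {
            st = self.available.wait(st).unwrap();
        }
        st.owner = Some(me);
        st.entry_count += 1;
    }

    /// Releases one entry. Returns false if the caller is not the owner,
    /// which Java would report as `IllegalMonitorStateException`.
    fn exit(&self) -> bool {
        let me = thread::current().id();
        let mut st = self.state.lock().unwrap();
        if st.owner != Some(me) {
            return false;
        }
        st.entry_count -= 1;
        if st.entry_count == 0 {
            st.owner = None;
            self.available.notify_one(); // wake one thread blocked in `enter`
        }
        true
    }
}
```

The monitor is only released when `entry_count` drops back to zero, so nested `synchronized` blocks on the same object do not deadlock the owning thread.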

## Special Features

- **String handling**: UTF-16 LE encoding with automatic String object creation
- **Reflection support**: Methods to create Constructor, Method, MethodHandle objects
- **Class mirrors**: Every class has an associated mirror object (java/lang/Class)
87
docs/roast-vm-sys.md
Normal file
@ -0,0 +1,87 @@
# roast-vm-sys Crate

**Location**: `crates/roast-vm-sys/`

A cdylib crate that exports native method implementations callable from Java via JNI.

## Overview

**roast-vm-sys** is a JNI wrapper crate that exposes the roast-vm-core runtime to Java. It's compiled as a C dynamic library (cdylib) named `roast_vm`.

## Exported Native Methods

The crate exports 40+ JNI native functions via `#[no_mangle] extern "system"` declarations.

### By Module

| Module | Functions |
|--------|-----------|
| `thread.rs` | `Thread.currentThread()`, `Thread.start0()`, `Thread.setPriority0()` |
| `object.rs` | `Object.hashCode()`, `Object.clone()`, `Object.notify()`, `Object.notifyAll()`, `Object.wait()` |
| `class.rs` | `Class.forName0()`, `Class.getPrimitiveClass()`, `Class.getDeclaredConstructors0()` |
| `reflection.rs` | `Reflection.getCallerClass()` |
| `reflect/array.rs` | `Array.newArray()` |
| `string.rs` | `String.intern()` |
| `system.rs` | `System.arraycopy()`, `System.nanoTime()` |
| `runtime.rs` | `Runtime.maxMemory()`, `Runtime.availableProcessors()` |
| `misc_unsafe.rs` | `Unsafe` field offsets, volatile read/write, memory allocation |
| `file_output_stream.rs` | `FileOutputStream.writeBytes()` |
| `system_props.rs` | `vmProperties()` - VM identity (version 0.1.0, vendor "infernap12") |
| `CDS.rs` | Class Data Sharing stubs |
| `signal.rs` | `Signal.handle0()` stub |
| `scoped_memory_access.rs` | `ScopedMemoryAccess` registration |

## Bridge Pattern

Each native function follows this pattern:

1. **Extract VmThread** from `JNIEnv.reserved0` using `get_thread()` helper
2. **Resolve References** - Convert JNI handles (jobject) to internal references:
   - `resolve_object()` - gets ObjectReference
   - `resolve_array()` - gets ArrayReference
   - `resolve_reference()` - gets generic ReferenceKind
3. **Perform Operation** via core VM APIs
4. **Return Result** in JNI-compatible format

## Example Native Implementation

```rust
#[unsafe(no_mangle)]
pub extern "system" fn Java_org_example_MockIO_print(
    env: JNIEnv,
    _jclass: JClass,
    input: JString,
) {
    unsafe {
        let input: String = env.get_string_unchecked(&input)
            .expect("Couldn't get java string!")
            .into();
        std::io::stdout().write_all(input.as_bytes()).ok();
    }
}
```

## File Structure

17 modules organized by Java class:

- `lib.rs` - Core helpers, test functions (`MockIO.print()`, `Main.getTime()`)
- `runtime.rs`, `thread.rs`, `class.rs` - Core VM operations
- `object.rs`, `string.rs`, `reflection.rs`, `reflect/` - Object/class introspection
- `system.rs`, `file_output_stream.rs` - System I/O
- `misc_unsafe.rs` - Unsafe memory operations (largest implementation, ~626 lines)
- `CDS.rs`, `signal.rs`, `system_props.rs`, `scoped_memory_access.rs` - Stubs/properties

## GC Interaction

Native methods access the garbage collector via:
- `thread.gc.read()` / `thread.gc.write()` for object access
- Object creation, cloning, and array operations go through GC
- Field access via `thread.gc` or direct field references

## Error Handling

Mixed approach:
- Some methods panic on errors
- Some return null/default values
- TODO comments indicate incomplete exception throwing