Class File Parsing
Location: crates/core/src/class_file/
The class file parser uses the deku library for declarative binary deserialization with automatic validation.
Components
| File | Purpose |
|---|---|
| class_file.rs | Main ClassFile struct with version, constant pool, fields, methods, attributes |
| constant_pool.rs | ConstantPoolGet/ConstantPoolExt traits for pool access and resolution |
| attributes.rs | Attribute parsing (Code, LineNumberTable, LocalVariableTable, BootstrapMethods) |
| mod.rs | Access flag definitions (ClassFlags, MethodFlags, FieldFlags) |
Key Types
pub struct ClassFile {
pub minor_version: u16,
pub major_version: u16,
pub constant_pool: Arc<ConstantPoolOwned>,
pub access_flags: u16,
pub this_class: u16,
pub super_class: u16,
pub interfaces: Vec<u16>,
pub fields: Vec<FieldInfo>,
pub methods: Vec<MethodInfo>,
pub attributes: Vec<AttributeInfo>,
}
pub struct FieldInfo {
pub access_flags: u16,
pub name_index: u16,
pub descriptor_index: u16,
pub attributes: Vec<AttributeInfo>,
}
pub struct MethodInfo {
pub access_flags: u16,
pub name_index: u16,
pub descriptor_index: u16,
pub attributes: Vec<AttributeInfo>,
}
Constant Pool
Trait-based architecture with two layers:
ConstantPoolGet Trait
Low-level accessors:
- get_constant(): Resolve by index (accounts for 64-bit entries)
- Type-specific getters: get_i32(), get_utf8_info(), get_class_info(), get_method_ref(), etc.
- Implemented via pool_get_impl! macro
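As a rough illustration of index resolution with 64-bit entries, the sketch below walks a compactly stored pool and counts a Long as two slots, then layers a type-checked getter on top. The `Entry`, `PoolError`, and `Pool` types here are hypothetical stand-ins, not the crate's actual definitions, and the real implementation may store entries differently (for example with a placeholder after each 8-byte constant).

```rust
// Hypothetical, simplified types for illustration only;
// the crate's real entry and error types are richer.
#[derive(Debug, PartialEq)]
enum Entry {
    Utf8(String),
    Integer(i32),
    Long(i64),
}

#[derive(Debug, PartialEq)]
enum PoolError {
    IndexOutOfBounds(u16),
    WrongType { index: u16, expected: &'static str },
}

struct Pool {
    // Entries stored compactly, in class file order.
    entries: Vec<Entry>,
}

impl Pool {
    /// Resolve a 1-based constant pool index, counting Long as two slots.
    fn get_constant(&self, index: u16) -> Result<&Entry, PoolError> {
        let target = index as usize;
        let mut slot = 1usize; // constant pool indices start at 1
        for entry in &self.entries {
            if slot == target {
                return Ok(entry);
            }
            // 8-byte constants occupy two consecutive slots.
            slot += match entry {
                Entry::Long(_) => 2,
                _ => 1,
            };
            if slot > target {
                break; // index 0, or the unusable second half of a Long
            }
        }
        Err(PoolError::IndexOutOfBounds(index))
    }

    /// Type-specific getter: fails if the entry is not an Integer.
    fn get_i32(&self, index: u16) -> Result<i32, PoolError> {
        match self.get_constant(index)? {
            Entry::Integer(v) => Ok(*v),
            _ => Err(PoolError::WrongType { index, expected: "Integer" }),
        }
    }
}
```

With entries `[Utf8, Long, Integer]`, the Integer is reachable at index 4, while index 3 (the second half of the Long) is rejected.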
ConstantPoolExt Trait
High-level operations:
- get_string(): Fetch UTF-8 strings with CESU-8 decoding
- resolve_class_name(): Trace class references through constant pool
- resolve_method_ref() / resolve_interface_method_ref(): Resolve method references
- resolve_field(): Resolve field references with type descriptors
- parse_attribute(): Convert raw attribute bytes to typed Attribute enum
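To make "tracing a class reference" concrete, here is a minimal sketch of the two-hop lookup: a Class entry holds a name index, which must point at a Utf8 entry. The types and the free function `resolve_class_name` below are hypothetical simplifications (no 64-bit entries, no CESU-8 decoding) and are not the trait method's real signature.

```rust
// Hypothetical, simplified pool: slot `i` of the class file lives at
// vector position `i - 1`, and no two-slot entries are present.
#[derive(Debug, PartialEq)]
enum Entry {
    Utf8(String),
    Class { name_index: u16 },
}

#[derive(Debug, PartialEq)]
enum ResolveError {
    BadIndex(u16),
    NotAClass(u16),
    NotUtf8(u16),
}

/// Bounds-checked, 1-based lookup (index 0 is reserved).
fn entry_at(pool: &[Entry], index: u16) -> Result<&Entry, ResolveError> {
    (index as usize)
        .checked_sub(1)
        .and_then(|i| pool.get(i))
        .ok_or(ResolveError::BadIndex(index))
}

/// Follow Class -> Utf8 and return the internal class name.
fn resolve_class_name(pool: &[Entry], class_index: u16) -> Result<&str, ResolveError> {
    let name_index = match entry_at(pool, class_index)? {
        Entry::Class { name_index } => *name_index,
        _ => return Err(ResolveError::NotAClass(class_index)),
    };
    match entry_at(pool, name_index)? {
        Entry::Utf8(s) => Ok(s.as_str()),
        _ => Err(ResolveError::NotUtf8(name_index)),
    }
}
```

Method and field references follow the same pattern with more hops (class index plus a NameAndType entry), so each hop can reuse the same bounds and type checks.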
Constant Pool Entry Types (17 types)
pub enum ConstantPoolEntry {
Utf8(ConstantUtf8Info),
Integer(i32), Float(f32), Long(i64), Double(f64),
Class(ConstantClassInfo),
String(ConstantStringInfo),
FieldRef(ConstantFieldrefInfo),
MethodRef(ConstantMethodrefInfo),
InterfaceMethodRef(ConstantInterfaceMethodrefInfo),
NameAndType(ConstantNameAndTypeInfo),
MethodHandle(ConstantMethodHandleInfo),
MethodType(ConstantMethodTypeInfo),
Dynamic(ConstantDynamicInfo),
InvokeDynamic(ConstantInvokeDynamicInfo),
Module(ConstantModuleInfo),
Package(ConstantPackageInfo),
}
Attributes
Recursive attribute parsing with support for:
- Code: Method bytecode with max_stack, max_locals, exception tables, nested attributes
- LineNumberTable: Maps bytecode offsets to source line numbers
- LocalVariableTable: Local variable debugging info (name, descriptor, PC range)
- BootstrapMethods: Dynamic invocation bootstrap method references
- StackMapTable, Exceptions, InnerClasses: Parsed as raw byte vectors
- SourceFile, Signature: Index-based attribute data
- Unknown: Fallback for unrecognized attributes
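As an example of how a parsed LineNumberTable is typically consumed, the sketch below maps a bytecode offset to a source line by picking the entry with the greatest start_pc that does not exceed the offset. `LineNumberEntry` and `line_for_pc` are hypothetical names chosen for this illustration; the crate's own types may differ.

```rust
// Hypothetical entry type mirroring the JVM's line_number_table layout.
struct LineNumberEntry {
    start_pc: u16,
    line_number: u16,
}

/// Return the source line covering `pc`, if any entry starts at or before it.
fn line_for_pc(table: &[LineNumberEntry], pc: u16) -> Option<u16> {
    table
        .iter()
        .filter(|e| e.start_pc <= pc)
        .max_by_key(|e| e.start_pc)
        .map(|e| e.line_number)
}
```

The lookup does not assume the table is sorted, since the class file format does not require entries to appear in any particular order.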
Code Attribute Structure
pub struct CodeAttribute {
pub max_stack: u16,
pub max_locals: u16,
pub code_length: u32,
pub code: Vec<u8>,
pub exception_table: Vec<ExceptionTableEntry>,
pub attributes: Vec<AttributeInfo>, // Recursive
}
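The document states that parsing is done declaratively with deku; purely to show the byte layout behind the struct above, the following hand-rolled sketch reads a Code attribute body using only the standard library. `Reader`, `parse_code`, and the simplified `AttributeInfo` are hypothetical and exist only for this illustration.

```rust
#[derive(Debug, PartialEq)]
struct ExceptionTableEntry {
    start_pc: u16,
    end_pc: u16,
    handler_pc: u16,
    catch_type: u16,
}

// Simplified: nested attributes are kept as raw bytes.
#[derive(Debug, PartialEq)]
struct AttributeInfo {
    name_index: u16,
    info: Vec<u8>,
}

#[derive(Debug, PartialEq)]
struct CodeAttribute {
    max_stack: u16,
    max_locals: u16,
    code_length: u32,
    code: Vec<u8>,
    exception_table: Vec<ExceptionTableEntry>,
    attributes: Vec<AttributeInfo>,
}

/// Minimal big-endian cursor over a byte slice.
struct Reader<'a> {
    buf: &'a [u8],
}

impl<'a> Reader<'a> {
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let buf: &'a [u8] = self.buf;
        if buf.len() < n {
            return None; // truncated input
        }
        let (head, tail) = buf.split_at(n);
        self.buf = tail;
        Some(head)
    }
    fn u16(&mut self) -> Option<u16> {
        let b = self.take(2)?;
        Some(u16::from_be_bytes([b[0], b[1]]))
    }
    fn u32(&mut self) -> Option<u32> {
        let b = self.take(4)?;
        Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
    }
}

/// Parse the body of a Code attribute (everything after name and length).
fn parse_code(body: &[u8]) -> Option<CodeAttribute> {
    let mut r = Reader { buf: body };
    let max_stack = r.u16()?;
    let max_locals = r.u16()?;
    let code_length = r.u32()?;
    let code = r.take(code_length as usize)?.to_vec();
    let mut exception_table = Vec::new();
    for _ in 0..r.u16()? {
        exception_table.push(ExceptionTableEntry {
            start_pc: r.u16()?,
            end_pc: r.u16()?,
            handler_pc: r.u16()?,
            catch_type: r.u16()?,
        });
    }
    let mut attributes = Vec::new();
    for _ in 0..r.u16()? {
        let name_index = r.u16()?;
        let len = r.u32()? as usize;
        attributes.push(AttributeInfo { name_index, info: r.take(len)?.to_vec() });
    }
    Some(CodeAttribute { max_stack, max_locals, code_length, code, exception_table, attributes })
}
```

The nested attributes are where the recursion comes in: each raw `info` payload can itself be handed to an attribute parser (for example for LineNumberTable) once its name has been resolved through the constant pool.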
Access Flags
Bitfield structures for parsing JVM access flags:
- ClassFlags: PUBLIC, FINAL, INTERFACE, ABSTRACT, SYNTHETIC, ANNOTATION, ENUM, MODULE
- FieldFlags: PUBLIC, PRIVATE, PROTECTED, STATIC, FINAL, VOLATILE, TRANSIENT, SYNTHETIC, ENUM
- MethodFlags: PUBLIC, PRIVATE, PROTECTED, STATIC, FINAL, SYNCHRONIZED, BRIDGE, VARARGS, NATIVE, ABSTRACT, STRICT, SYNTHETIC
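For reference, a class-level bitfield can be sketched as a thin wrapper over the raw u16, using the mask values from the JVM specification. The wrapper below is a hypothetical stand-in for the crate's ClassFlags; only the numeric masks are taken from the specification.

```rust
// Hypothetical wrapper; the crate's ClassFlags may be generated differently.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ClassFlags(u16);

impl ClassFlags {
    // Mask values as defined for ClassFile.access_flags in the JVM spec.
    const PUBLIC: u16 = 0x0001;
    const FINAL: u16 = 0x0010;
    const INTERFACE: u16 = 0x0200;
    const ABSTRACT: u16 = 0x0400;
    const SYNTHETIC: u16 = 0x1000;
    const ANNOTATION: u16 = 0x2000;
    const ENUM: u16 = 0x4000;
    const MODULE: u16 = 0x8000;

    /// True if every bit in `mask` is set.
    fn contains(self, mask: u16) -> bool {
        (self.0 & mask) == mask
    }

    /// Names of all set flags, in ascending bit order.
    fn names(self) -> Vec<&'static str> {
        let table = [
            (Self::PUBLIC, "PUBLIC"),
            (Self::FINAL, "FINAL"),
            (Self::INTERFACE, "INTERFACE"),
            (Self::ABSTRACT, "ABSTRACT"),
            (Self::SYNTHETIC, "SYNTHETIC"),
            (Self::ANNOTATION, "ANNOTATION"),
            (Self::ENUM, "ENUM"),
            (Self::MODULE, "MODULE"),
        ];
        table
            .iter()
            .filter(|(mask, _)| self.contains(*mask))
            .map(|(_, name)| *name)
            .collect()
    }
}
```

A public interface is typically encoded as `0x0601` (PUBLIC, INTERFACE, and ABSTRACT together), which `names()` would report in that order.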
Validation
Validation occurs at multiple levels:
- Binary Format (Automatic via Deku):
  - Magic number (0xCAFEBABE)
  - Big-endian byte order
  - Type-safe parsing with error propagation
- Constant Pool:
  - Index bounds checking in get_constant()
  - Type validation: Each accessor checks the entry type matches expected type
  - CESU-8 decoding errors caught from Java-style UTF-8 strings
- Class Structure (ClassLoader):
  - Debug assertions for Object class having super_class = 0
  - Non-Object classes must have valid super class reference
  - Interfaces must inherit from Object
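The binary-format checks above are handled by deku in the actual parser. As a standard-library-only sketch of what they amount to, the function below verifies the magic number and reads the two version fields in big-endian order; `parse_header` and `HeaderError` are hypothetical names used only here.

```rust
#[derive(Debug, PartialEq)]
enum HeaderError {
    Truncated,
    BadMagic(u32),
}

/// Validate the class file magic and return (minor_version, major_version).
fn parse_header(bytes: &[u8]) -> Result<(u16, u16), HeaderError> {
    if bytes.len() < 8 {
        return Err(HeaderError::Truncated);
    }
    // All multi-byte fields in a class file are big-endian.
    let magic = u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    if magic != 0xCAFEBABE {
        return Err(HeaderError::BadMagic(magic));
    }
    let minor = u16::from_be_bytes([bytes[4], bytes[5]]);
    let major = u16::from_be_bytes([bytes[6], bytes[7]]);
    Ok((minor, major))
}
```

Returning a typed error rather than panicking mirrors the "error propagation" point: callers can forward the failure with `?` instead of handling it at the parse site.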
Descriptor Parsing
// Method descriptor: (II)I -> two ints, return int
MethodDescriptor::parse("(II)I")?
// Field descriptor: Ljava/lang/String; -> String class type
FieldType::parse("Ljava/lang/String;")?
// Array descriptor: [[I -> 2D int array
FieldType::parse("[[I")?
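To show what parsing these descriptors involves, here is a small recursive-descent sketch covering the three cases above. The `FieldType` enum and the free functions are hypothetical and return `Option` for brevity; they are not the crate's `MethodDescriptor::parse` or `FieldType::parse`, which report failures through their own error types.

```rust
#[derive(Debug, Clone, PartialEq)]
enum FieldType {
    Byte, Char, Double, Float, Int, Long, Short, Boolean,
    Object(String),
    Array(Box<FieldType>),
}

/// Parse one field type from the front of `s`, returning it and the remainder.
fn parse_field_type(s: &str) -> Option<(FieldType, &str)> {
    let mut chars = s.chars();
    let first = chars.next()?;
    let rest = chars.as_str();
    match first {
        'B' => Some((FieldType::Byte, rest)),
        'C' => Some((FieldType::Char, rest)),
        'D' => Some((FieldType::Double, rest)),
        'F' => Some((FieldType::Float, rest)),
        'I' => Some((FieldType::Int, rest)),
        'J' => Some((FieldType::Long, rest)),
        'S' => Some((FieldType::Short, rest)),
        'Z' => Some((FieldType::Boolean, rest)),
        'L' => {
            // Class name runs up to the terminating ';'.
            let end = rest.find(';')?;
            if end == 0 {
                return None;
            }
            Some((FieldType::Object(rest[..end].to_string()), &rest[end + 1..]))
        }
        '[' => {
            // Each '[' adds one array dimension around the element type.
            let (elem, rest) = parse_field_type(rest)?;
            Some((FieldType::Array(Box::new(elem)), rest))
        }
        _ => None,
    }
}

/// A field descriptor is exactly one field type with nothing left over.
fn parse_field_descriptor(s: &str) -> Option<FieldType> {
    match parse_field_type(s)? {
        (t, "") => Some(t),
        _ => None,
    }
}

/// Returns (parameter types, return type); `None` return type means void.
fn parse_method_descriptor(s: &str) -> Option<(Vec<FieldType>, Option<FieldType>)> {
    let mut rest = s.strip_prefix('(')?;
    let mut params = Vec::new();
    while !rest.starts_with(')') {
        let (t, r) = parse_field_type(rest)?;
        params.push(t);
        rest = r;
    }
    let ret = &rest[1..];
    if ret == "V" {
        return Some((params, None));
    }
    Some((params, Some(parse_field_descriptor(ret)?)))
}
```

Under this sketch, `"(II)I"` yields two `Int` parameters and an `Int` return, and `"[[I"` yields an `Array` of `Array` of `Int`.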
Error Handling
- DekuError: Binary parsing failures
- ConstantPoolError: Pool access failures, with Generic, DescriptorParseError, and Cesu8DecodingError variants
- VmError: Higher-level VM-specific errors
- DescParseError: Invalid method/field descriptor syntax
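One common way to layer such error types in Rust is a `From` conversion, so that a lower-level error can be propagated with `?` from code that returns the higher-level one. The sketch below assumes this pattern; the variant payloads, the `VmError::ConstantPool` wrapper, and both helper functions are hypothetical and not taken from the crate.

```rust
// Hypothetical shapes: only the type names and the Generic variant name
// come from the list above; payloads and the wrapper variant are assumed.
#[derive(Debug, PartialEq)]
enum ConstantPoolError {
    Generic(String),
}

#[derive(Debug, PartialEq)]
enum VmError {
    ConstantPool(ConstantPoolError),
}

impl From<ConstantPoolError> for VmError {
    fn from(e: ConstantPoolError) -> Self {
        VmError::ConstantPool(e)
    }
}

/// Low-level lookup that fails with a pool error.
fn utf8_at(pool: &[&str], index: usize) -> Result<String, ConstantPoolError> {
    pool.get(index)
        .map(|s| s.to_string())
        .ok_or_else(|| ConstantPoolError::Generic(format!("no entry at index {}", index)))
}

/// Higher-level operation: `?` converts the pool error into a VmError.
fn class_name(pool: &[&str], index: usize) -> Result<String, VmError> {
    let name = utf8_at(pool, index)?;
    Ok(name.replace('/', "."))
}
```

With this arrangement the VM layer never has to match on pool-level errors unless it wants to; they simply travel upward wrapped in the broader type.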