# Class File Parsing

**Location**: `crates/core/src/class_file/`

The class file parser uses the **deku** library for declarative binary deserialization with automatic validation.

## Components

| File | Purpose |
|------|---------|
| `class_file.rs` | Main `ClassFile` struct with version, constant pool, fields, methods, attributes |
| `constant_pool.rs` | `ConstantPoolGet`/`ConstantPoolExt` traits for pool access and resolution |
| `attributes.rs` | Attribute parsing (Code, LineNumberTable, LocalVariableTable, BootstrapMethods) |
| `mod.rs` | Access flag definitions (`ClassFlags`, `MethodFlags`, `FieldFlags`) |

## Key Types

```rust
pub struct ClassFile {
    pub minor_version: u16,
    pub major_version: u16,
    pub constant_pool: Arc,
    pub access_flags: u16,
    pub this_class: u16,
    pub super_class: u16,
    pub interfaces: Vec,
    pub fields: Vec,
    pub methods: Vec,
    pub attributes: Vec,
}

pub struct FieldInfo {
    pub access_flags: u16,
    pub name_index: u16,
    pub descriptor_index: u16,
    pub attributes: Vec,
}

pub struct MethodInfo {
    pub access_flags: u16,
    pub name_index: u16,
    pub descriptor_index: u16,
    pub attributes: Vec,
}
```

## Constant Pool

The constant pool API is trait-based, with two layers:

### ConstantPoolGet Trait

Low-level accessors:

- `get_constant()`: Resolve an entry by index (accounts for 64-bit entries, which occupy two slots)
- Type-specific getters: `get_i32()`, `get_utf8_info()`, `get_class_info()`, `get_method_ref()`, etc.
- Implemented via the `pool_get_impl!` macro

### ConstantPoolExt Trait

High-level operations:

- `get_string()`: Fetch UTF-8 strings with CESU-8 decoding
- `resolve_class_name()`: Trace class references through the constant pool
- `resolve_method_ref()` / `resolve_interface_method_ref()`: Resolve method references
- `resolve_field()`: Resolve field references with type descriptors
- `parse_attribute()`: Convert raw attribute bytes to the typed `Attribute` enum

### Constant Pool Entry Types (17 types)

```rust
pub enum ConstantPoolEntry {
    Utf8(ConstantUtf8Info),
    Integer(i32),
    Float(f32),
    Long(i64),
    Double(f64),
    Class(ConstantClassInfo),
    String(ConstantStringInfo),
    FieldRef(ConstantFieldrefInfo),
    MethodRef(ConstantMethodrefInfo),
    InterfaceMethodRef(ConstantInterfaceMethodrefInfo),
    NameAndType(ConstantNameAndTypeInfo),
    MethodHandle(ConstantMethodHandleInfo),
    MethodType(ConstantMethodTypeInfo),
    Dynamic(ConstantDynamicInfo),
    InvokeDynamic(ConstantInvokeDynamicInfo),
    Module(ConstantModuleInfo),
    Package(ConstantPackageInfo),
}
```

## Attributes

Recursive attribute parsing with support for:

- **Code**: Method bytecode with max_stack, max_locals, exception tables, nested attributes
- **LineNumberTable**: Maps bytecode offsets to source line numbers
- **LocalVariableTable**: Local variable debugging info (name, descriptor, PC range)
- **BootstrapMethods**: Dynamic invocation bootstrap method references
- **StackMapTable**, **Exceptions**, **InnerClasses**: Parsed as raw byte vectors
- **SourceFile**, **Signature**: Index-based attribute data
- **Unknown**: Fallback for unrecognized attributes

### Code Attribute Structure

```rust
pub struct CodeAttribute {
    pub max_stack: u16,
    pub max_locals: u16,
    pub code_length: u32,
    pub code: Vec,
    pub exception_table: Vec,
    pub attributes: Vec, // Recursive
}
```

## Access Flags

Bitfield structures for parsing JVM access flags:

- **ClassFlags**: PUBLIC, FINAL, INTERFACE, ABSTRACT, SYNTHETIC, ANNOTATION, ENUM, MODULE
- **FieldFlags**: PUBLIC, PRIVATE, PROTECTED, STATIC, FINAL, VOLATILE, TRANSIENT, SYNTHETIC, ENUM
- **MethodFlags**: PUBLIC, PRIVATE, PROTECTED, STATIC, FINAL, SYNCHRONIZED, BRIDGE, VARARGS, NATIVE, ABSTRACT, STRICT, SYNTHETIC

## Validation

Validation occurs at multiple levels:

1. **Binary Format** (automatic via Deku):
   - Magic number (0xCAFEBABE)
   - Big-endian byte order
   - Type-safe parsing with error propagation

2. **Constant Pool**:
   - Index bounds checking in `get_constant()`
   - Type validation: each accessor checks that the entry type matches the expected type
   - CESU-8 decoding errors from Java-style UTF-8 strings are caught and reported

3. **Class Structure** (ClassLoader):
   - Debug assertions that the Object class has `super_class = 0`
   - Non-Object classes must have a valid super class reference
   - Interfaces must inherit from Object

## Descriptor Parsing

```rust
// Method descriptor: (II)I -> two ints, return int
MethodDescriptor::parse("(II)I")?;

// Field descriptor: Ljava/lang/String; -> String class type
FieldType::parse("Ljava/lang/String;")?;

// Array descriptor: [[I -> 2D int array
FieldType::parse("[[I")?;
```

## Error Handling

- `DekuError`: Binary parsing failures
- `ConstantPoolError`: Pool access failures, with `Generic`, `DescriptorParseError`, and `Cesu8DecodingError` variants
- `VmError`: Higher-level VM-specific errors
- `DescParseError`: Invalid method/field descriptor syntax
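
To make the descriptor grammar and the "invalid descriptor syntax" failure mode concrete, here is a minimal standalone sketch of a field descriptor parser. It is not the crate's implementation: `SketchFieldType`, `SketchDescError`, and `parse_field_descriptor` are hypothetical names chosen for illustration, and the real `FieldType` and `DescParseError` may be shaped differently. The sketch follows the JVM field descriptor grammar only loosely; for example, it does not enforce the 255-dimension array limit or validate class name characters.

```rust
use std::str::Chars;

/// Hypothetical stand-in for the crate's `FieldType` (illustration only).
#[derive(Debug, PartialEq)]
pub enum SketchFieldType {
    Byte,
    Char,
    Double,
    Float,
    Int,
    Long,
    Short,
    Boolean,
    /// Class type such as `Ljava/lang/String;`, stored without `L` and `;`.
    Object(String),
    /// Array type; each leading `[` adds one dimension.
    Array(Box<SketchFieldType>),
}

/// Hypothetical stand-in for `DescParseError` (illustration only).
#[derive(Debug, PartialEq)]
pub enum SketchDescError {
    /// Input ended before a complete type was read.
    UnexpectedEnd,
    /// The leading character is not a known descriptor tag.
    InvalidTag(char),
    /// A complete type was read but characters remain.
    TrailingInput,
}

/// Parses exactly one field type from the front of `chars`.
fn parse_one(chars: &mut Chars<'_>) -> Result<SketchFieldType, SketchDescError> {
    use SketchFieldType::*;
    match chars.next().ok_or(SketchDescError::UnexpectedEnd)? {
        'B' => Ok(Byte),
        'C' => Ok(Char),
        'D' => Ok(Double),
        'F' => Ok(Float),
        'I' => Ok(Int),
        'J' => Ok(Long),
        'S' => Ok(Short),
        'Z' => Ok(Boolean),
        'L' => {
            // Collect the class name up to the terminating ';'.
            let mut name = String::new();
            loop {
                match chars.next() {
                    Some(';') => return Ok(Object(name)),
                    Some(c) => name.push(c),
                    None => return Err(SketchDescError::UnexpectedEnd),
                }
            }
        }
        // An array wraps whatever component type follows.
        '[' => Ok(Array(Box::new(parse_one(chars)?))),
        other => Err(SketchDescError::InvalidTag(other)),
    }
}

/// Parses a whole field descriptor, rejecting any trailing characters.
pub fn parse_field_descriptor(descriptor: &str) -> Result<SketchFieldType, SketchDescError> {
    let mut chars = descriptor.chars();
    let ty = parse_one(&mut chars)?;
    if chars.next().is_some() {
        return Err(SketchDescError::TrailingInput);
    }
    Ok(ty)
}
```

Under these assumptions, `parse_field_descriptor("[[I")` yields an `Array` wrapping an `Array` wrapping `Int`, while an unterminated class descriptor such as `"Ljava/lang/String"` is reported as `UnexpectedEnd`. Returning a dedicated error enum rather than a string keeps descriptor failures easy to wrap in a higher-level error type, in the way the `DescriptorParseError` variant of `ConstantPoolError` suggests.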