254 Commits

Author SHA1 Message Date
Ehsan Behrangi
00cc9be854 8381560: AArch64: Optimize String.equals intrinsic
This change improves the AArch64 implementation of String.equals by
introducing SIMD-based fast paths using SVE and NEON.

SVE implementation:
- Uses predicated loads and comparisons for short lengths (len < VL)
- Uses a full predicated loop for longer inputs
- Handles the tail via an overlapped compare at (base + len - VL)

NEON implementation:
- Uses an 8-byte pre-read to simplify tail handling and eliminate
  4/2/1-byte scalar branches
- Processes 16-byte chunks using LDP pair loads
- Uses CMP/CCMP to collapse comparisons into a single branch on mismatch

These changes reduce branch pressure and improve throughput for both
short and long strings.
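
The overlapped tail compare described above can be sketched in plain Java.
This is only an illustration of the idea, not the intrinsic itself: the
class and method names are hypothetical, an 8-byte long read stands in for
a vector of length VL, and the short-input path uses a simple byte loop
where the actual change uses predicated SVE loads or the NEON 8-byte
pre-read.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of the JDK change.
final class OverlappedTailEquals {
    // Reads 8 bytes of a byte[] as one long. Plain get() is permitted at
    // unaligned indices for byte-array view VarHandles.
    private static final VarHandle LONG_VIEW =
        MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    // Assumption: 8 bytes stands in for the vector length VL.
    static final int CHUNK = Long.BYTES;

    static boolean equalsOverlapped(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false;
        }
        int len = a.length;
        if (len < CHUNK) {
            // Short input: simple byte loop (a simplification of the
            // predicated-load / pre-read paths in the commit message).
            for (int i = 0; i < len; i++) {
                if (a[i] != b[i]) {
                    return false;
                }
            }
            return true;
        }
        // Main loop: compare whole chunks that end strictly before len.
        for (int i = 0; i < len - CHUNK; i += CHUNK) {
            if ((long) LONG_VIEW.get(a, i) != (long) LONG_VIEW.get(b, i)) {
                return false;
            }
        }
        // Tail: one compare at (len - CHUNK), overlapping bytes that the
        // loop may already have checked instead of branching on 4/2/1 bytes.
        int tail = len - CHUNK;
        return (long) LONG_VIEW.get(a, tail) == (long) LONG_VIEW.get(b, tail);
    }
}
```

Because the final read starts at len - CHUNK, it may recheck bytes the loop
already covered. That is harmless for an equality test and removes the need
for separate 4/2/1-byte tail branches.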

Correctness:
- The implementation preserves existing semantics and matches behavior
  for all lengths

Testing:
- Updated and extended intrinsic tests to cover boundary conditions
  and mismatch positions

Benchmark:
Across evaluated macrobenchmarks (DaCapo and Renaissance), most workloads
spend <0.5% of CPU time in String.equals. DaCapo biojava is a notable
exception (~8–9%). In biojava, most String.equals calls are on very short
strings (1–2 bytes), where SVE shows ~1% end-to-end improvement, while
NEON is largely neutral or shows a small regression (~1%).

Measured using JMH on AArch64 (Arm Neoverse V2 CPU).
Values are relative (%) vs baseline. Negative values indicate regressions.
Mismatch results are reported across first (DF), middle (DM),
and last (DL) difference positions.
L1 and U16 denote Latin-1 and UTF-16 strings; EQ is the equal-strings case.

SVE results:
Length | L1_EQ  L1_DF  L1_DM  L1_DL | U16_EQ U16_DF U16_DM U16_DL | Avg
-------+----------------------------+-----------------------------+------
0      | 19.63                      | 20.05                      | 19.84
1      | 16.59  17.81  16.57  18.34 | 16.02   0.71   0.42   1.39 | 10.98
2      | 16.44   1.32   0.30  -0.16 | 15.90  -5.17  -4.55  -1.09 |  2.87
3      | 26.58   1.60   1.43  27.07 | 30.34  -8.86  -7.06  14.08 | 10.65
7      | 41.47  -2.94  -3.37  39.82 | 24.02  -8.82  -6.27  20.48 | 13.05
8      | 19.08  -1.16  -3.50  -0.90 | 22.49  -9.75  17.50  13.13 |  7.11
9      | 20.17  -4.12  -5.17  19.03 |  9.25  -2.24  21.35   3.39 |  7.71
15     | 19.48  -3.83  -4.50  19.01 | 29.26 -10.06  11.76  17.07 |  9.77
16     | 19.04  -3.15  16.41  16.85 | 38.37 -11.12  13.18  27.70 | 14.66
17     |  8.95  -2.40   5.68   6.38 | 16.32  -1.61   7.49  11.44 |  6.53
31     | 28.87  -0.01  19.79  23.37 | 41.43  -7.57  23.85  35.89 | 20.70
32     | 32.58   3.38  12.39  26.90 | 46.01 -10.99  20.53  44.15 | 21.87
33     | 11.62 -15.20   6.04  13.27 | 32.27  -9.38  20.33  32.28 | 11.40
63     | 44.66 -11.59  37.20  42.56 | 55.41 -10.57  43.19  55.90 | 32.10
64     | 53.99  -2.19  27.04  51.79 | 59.36  -8.72  35.41  60.32 | 34.63
65     | 33.79 -14.01  23.95  29.15 | 48.91 -11.58  36.54  50.03 | 24.60
127    | 62.10  -3.79  47.51  62.79 | 58.13  -8.89  60.68  60.90 | 42.43
128    | 67.38  -2.47  38.62  67.09 | 62.83  -0.38  51.72  61.87 | 43.33
129    | 52.02  -1.42  39.17  49.20 | 55.04  -9.52  53.23  52.81 | 36.32
256    | 66.11  -1.38  56.12  64.93 | 70.67  -3.68  53.67  74.54 | 47.62

Average:
         33.03  -2.40  17.46  30.34 | 37.60  -7.27  23.84  33.49 | 20.91

NEON results:
Length | L1_EQ  L1_DF  L1_DM  L1_DL | U16_EQ U16_DF U16_DM U16_DL | Avg
-------+----------------------------+-----------------------------+------
0      |  9.22                      |  8.69                      |  8.95
1      |  3.07   3.59   1.34   5.42 |  6.36  -6.20  -6.71 -10.59 | -0.47
2      |  3.23  -4.79  -5.67  -4.09 |  8.06  -8.43  -9.89  -9.20 | -3.85
3      | 12.80  -4.16  -3.95  11.28 | 11.94 -14.50 -14.41  11.83 |  1.36
7      | 31.00  -7.21 -12.76  33.59 |  4.73 -17.67 -17.38   1.65 |  1.99
8      |  4.43  -7.20  -4.70  -6.73 |  2.71 -18.05  -3.17  -4.05 | -4.59
9      | -9.33 -19.90 -16.27  -1.80 | 16.65 -23.72   4.26   8.78 | -5.17
15     | -6.96 -16.17 -15.60  -4.01 |  7.46 -24.60  -3.19  77.82 |  1.84
16     |  2.48 -16.38  -2.56  -3.62 |  9.08 -19.29  -5.45  77.93 |  5.27
17     |  4.88 -18.85  -0.18  19.35 | 18.43 -19.80  -8.37  84.96 | 10.05
31     |  6.92 -21.13  -4.62  60.71 | 24.42 -21.81   9.48 188.59 | 30.32
32     |  7.75 -24.20  -5.29  68.23 | 25.33 -20.57   4.17 183.65 | 29.88
33     | 20.23 -20.42 -11.33  98.60 | 23.76 -24.76   5.97 188.57 | 35.08
63     | 30.25 -22.30  14.29 152.37 | 25.02 -28.37  21.43 419.68 | 76.55
64     | 28.99 -22.91   9.03 185.51 | 38.20 -22.82  19.76 446.60 | 85.29
65     | 16.13 -21.77   1.45 211.38 | 27.94 -24.79  17.50 446.80 | 84.33
127    | 33.69 -28.94  28.75 429.23 | 41.75 -24.86  37.35 832.68 |168.71
128    | 26.28 -29.03  24.13 432.87 | 43.48 -18.53  26.44 810.20 |164.48
129    | 27.73 -20.30  20.84 439.01 | 44.09 -22.35  30.09 827.38 |168.31
256    | 53.30 -20.27  26.09 841.37 | 56.66 -21.07  47.41 1604.98|323.56

Average:
         15.30 -16.97   2.26 156.24 | 22.24 -20.12   8.17 325.70 | 59.10

Observations:
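- SVE improves the per-length average at every tested length, with gains
  increasing as input size grows; first-position mismatches (DF) mostly
  regress (up to ~15% for L1, ~12% for U16)
- NEON improves equal-string performance at most lengths (L1_EQ regresses
  at lengths 9 and 15)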
- NEON shows regressions for short mismatched inputs due to the loss
  of the scalar tbz-based early-exit sequence, which efficiently
  detects mismatches at small sizes and at early positions
- The scalar implementation relies on a branchy 4/2/1 tbz ladder,
  which is efficient for early mismatches but suboptimal for equal
  strings
- The NEON implementation replaces this with a branchless SIMD
  approach and performs upfront comparisons of the first and last
  8 bytes, improving throughput and late-mismatch detection
2026-06-05 12:22:15 +01:00
Galder Zamarreño
af9ed6c022 8382881: Swap min/max values and avoid equals min/max values in MinMaxVector
Reviewed-by: roland
2026-05-07 09:20:45 +00:00
Paul Hübner
8de6298ed5 8379630: Add JMH benchmark to measure the overhead of using captured call state
Reviewed-by: pminborg, jvernee, liach
2026-05-07 07:57:44 +00:00
Quan Anh Mai
41a5c032f5 8382700: C2: Delay inlining instead of giving up when hit NodeCountInliningCutoff
Co-authored-by: Vladimir Ivanov <vlivanov@openjdk.org>
Co-authored-by: Maurizio Cimadamore <mcimadamore@openjdk.org>
Co-authored-by: Ioannis Tsakpinis <iotsakp@gmail.com>
Reviewed-by: kvn, vlivanov
2026-04-30 18:17:38 +00:00
Liam Miller-Cushon
0fbf58d8ff 8372353: API to compute the byte length of a String encoded in a given Charset
Reviewed-by: rriggs, naoto, vyazici
2026-03-04 17:33:32 +00:00
Mohamed Issa
161aa5d528 8371955: Support AVX10 floating point comparison instructions
Reviewed-by: epeter, sviswanathan, sparasa
2026-02-09 19:14:46 +00:00
Alan Bateman
ac6e8d481a 8376568: Change Thread::getStackTrace to use handshake op for all cases
Reviewed-by: pchilanomate, sspitsyn
2026-02-05 13:46:23 +00:00
Liam Miller-Cushon
d433ce5236 8369564: Provide a MemorySegment API to read strings with known lengths
Co-authored-by: Per Minborg <pminborg@openjdk.org>
Reviewed-by: jvernee, mcimadamore
2026-01-12 15:22:42 +00:00
Sergey Bylokhov
c6246d58f7 8374383: Update the copyright year to 2025 in the remaining files under test/ where it was missed
Reviewed-by: jpai
2025-12-31 10:04:45 +00:00
Sergey Bylokhov
5c694eab0f 8374363: Update copyright year to 2025 for test/micro in files where it was missed
Reviewed-by: phh
2025-12-27 04:45:56 +00:00
Hamlin Li
6700baa505 8357551: RISC-V: support CMoveF/D vectorization
Reviewed-by: fyang, luhenry
2025-12-08 13:38:22 +00:00
Xueming Shen
b97ed667db 8365675: Add String Unicode Case-Folding Support
Reviewed-by: rriggs, naoto, ihse
2025-12-02 19:47:18 +00:00
Per Minborg
1ce2a44e9f 8371571: Consolidate and enhance bulk memory segment ops benchmarks
Reviewed-by: jvernee
2025-11-26 15:11:10 +00:00
Galder Zamarreño
a7bb99ed00 8372119: Missing copyright header in MinMaxVector
Reviewed-by: chagedorn, thartmann
2025-11-24 09:24:19 +00:00
Per Minborg
f946449997 8366178: Implement JEP 526: Lazy Constants (Second Preview)
8371882: Improve documentation for JEP 526: Lazy Constants

Reviewed-by: jvernee, mcimadamore
2025-11-18 12:20:23 +00:00
Alan Bateman
26460b6f12 8353835: Implement JEP 500: Prepare to Make Final Mean Final
Reviewed-by: liach, vlivanov, dholmes, vyazici
2025-11-18 08:06:18 +00:00
Chen Liang
7aff8e15ba 8371319: java.lang.reflect.Method#equals doesn't short-circuit with same instances
Reviewed-by: jvernee
2025-11-14 22:55:28 +00:00
Jorn Vernee
a51a0bf57f 8370344: Arbitrary Java frames on stack during scoped access
Reviewed-by: pchilanomate, dholmes, liach
2025-11-04 15:40:40 +00:00
Raffaello Giulietti
deb7edb151 8366017: Extend the set of inputs handled by fast paths in FloatingDecimal
Reviewed-by: darcy
2025-11-03 09:48:55 +00:00
Sergey Kuksenko
2158719aab 8370150: Add StrictMath microbenchmarks to cover FDLIBM algorithms
Reviewed-by: rgiulietti
2025-10-31 14:00:55 +00:00
Shaojin Wen
5862358965 8370013: Refactor Double.toHexString to eliminate regex and StringBuilder
Reviewed-by: rgiulietti, darcy
2025-10-24 00:40:13 +00:00
Chen Liang
43e036ba89 8366424: Missing type profiling in generated Record Object methods
Reviewed-by: jvernee
2025-10-21 19:00:51 +00:00
Jatin Bhateja
449641813a 8365205: C2: Optimize popcount value computation using knownbits
Reviewed-by: epeter, hgreule, qamai
2025-10-14 03:35:11 +00:00
Xueming Shen
4ca4485e9a 8365588: defineClass that accepts a ByteBuffer does not work as expected
Reviewed-by: alanb
2025-10-13 20:29:06 +00:00
Mohamed Issa
05f8a6fca8 8360559: Optimize Math.sinh for x86 64 bit platforms
Reviewed-by: sviswanathan, sparasa
2025-08-04 18:47:57 +00:00
Shaojin Wen
e2feff8599 8355177: Speed up StringBuilder::append(char[]) via Unsafe::copyMemory
Reviewed-by: rriggs, rgiulietti
2025-07-30 13:16:27 +00:00
Roland Westrelin
f155661151 8342692: C2: long counted loop/long range checks: don't create loop-nest for short running loops
Co-authored-by: Maurizio Cimadamore <mcimadamore@openjdk.org>
Co-authored-by: Christian Hagedorn <chagedorn@openjdk.org>
Reviewed-by: chagedorn, thartmann
2025-07-22 08:35:36 +00:00
Andrew Haley
9dd93c6a2c 8361497: Scoped Values: orElse and orElseThrow do not access the cache
Reviewed-by: alanb
2025-07-21 17:05:50 +00:00
Andrew Haley
4df9c87345 8360884: Better scoped values
Reviewed-by: liach, alanb
2025-07-07 09:16:39 +00:00
Mohamed Issa
0df8c9684b 8353686: Optimize Math.cbrt for x86 64 bit platforms
Reviewed-by: sviswanathan, sparasa, jbhateja
2025-05-30 21:47:20 +00:00
Andrew Haley
a6ebcf61eb 8354674: AArch64: Intrinsify Unsafe::setMemory
Reviewed-by: adinn
2025-05-16 09:28:35 +00:00
Per Minborg
066477de80 8356080: Address post-integration comments for Stable Values
Reviewed-by: liach
2025-05-13 13:40:48 +00:00
Per Minborg
45cf32bd2c 8347408: Create an internal method handle adapter for system calls with errno
Reviewed-by: mcimadamore
2025-05-12 06:59:41 +00:00
Mohamed Issa
c8bbcaf5de 8348638: Performance regression in Math.tanh
Reviewed-by: jbhateja, epeter, sviswanathan
2025-05-02 17:21:50 +00:00
Per Minborg
9f9e73d5f9 8349146: [REDO] Implement a better allocator for downcalls
Reviewed-by: mcimadamore, jvernee, liach
2025-05-02 14:14:59 +00:00
Per Minborg
fbc4691bfa 8351565: Implement JEP 502: Stable Values (Preview)
Co-authored-by: Maurizio Cimadamore <mcimadamore@openjdk.org>
Reviewed-by: vklang, jvernee, alanb, liach
2025-04-30 16:03:25 +00:00
Per Minborg
072b8273a4 8354300: Mark String.hash field @Stable
Reviewed-by: liach, shade, vlivanov
2025-04-22 15:10:26 +00:00
Hamlin Li
bcc33d5ef3 8352504: RISC-V: implement and enable CMoveI/L
8346786: RISC-V: Reconsider ConditionalMoveLimit when adding conditional move

Reviewed-by: fyang, fjiang
2025-04-22 08:32:03 +00:00
Alan Bateman
6c93ad42f3 8351927: Change VirtualThread implementation to use FJP delayed task handling
Reviewed-by: vklang
2025-04-09 12:36:35 +00:00
Quan Anh Mai
e1bcff3ada 8345687: Improve the implementation of SegmentFactories::allocateSegment
Reviewed-by: jvernee, mcimadamore
2025-03-18 08:59:48 +00:00
Galder Zamarreño
4e51a8c9ad 8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long)
Reviewed-by: roland, epeter, chagedorn, darcy
2025-03-13 13:53:54 +00:00
Vladimir Ivanov
4e67ac4136 8350909: [JMH] test ThreadOnSpinWaitShared failed for 2 threads config
Reviewed-by: jbhateja, drwhite
2025-03-07 20:38:25 +00:00
Vladimir Ivanov
7c22b814d6 8350811: [JMH] test foreign.StrLenTest failed with StringIndexOutOfBoundsException for size=451
Reviewed-by: jbhateja, vpaprotski, mcimadamore
2025-03-07 16:12:55 +00:00
Vladimir Ivanov
f1398ecbe4 8350701: [JMH] test foreign.AllocFromSliceTest failed with Exception for size>1024
Reviewed-by: pminborg
2025-02-27 20:35:58 +00:00
Nicole Xu
3ebed78328 8349943: [JMH] Use jvmArgs consistently
Reviewed-by: syan, redestad, haosun
2025-02-20 01:33:58 +00:00
Coleen Phillimore
c9cadbd23f 8346567: Make Class.getModifiers() non-native
Reviewed-by: alanb, vlivanov, yzheng, dlong
2025-02-10 12:44:30 +00:00
Per Minborg
beb43e2633 8349343: Add missing copyright messages in FFM benchmarks
Reviewed-by: jvernee
2025-02-04 14:10:42 +00:00
Per Minborg
81126c20cb 8349238: Some more FFM benchmarks are broken
Reviewed-by: mcimadamore
2025-02-04 11:00:54 +00:00
Jaikiran Pai
618c5eb27b 8349183: [BACKOUT] Optimization for StringBuilder append boolean & null
8349239: [BACKOUT] Reuse StringLatin1::putCharsAt and StringUTF16::putCharsAt

Reviewed-by: redestad, liach
2025-02-03 18:21:33 +00:00
Jorn Vernee
77647421c5 8348909: [BACKOUT] Implement a better allocator for downcalls
Reviewed-by: shade, liach
2025-01-31 16:49:03 +00:00