mirror of
https://github.com/openjdk/jdk.git
synced 2026-01-28 12:09:14 +00:00
`VectorMaskCastNode` is used to cast a vector mask from one type to
another type. The cast may be generated by calling the vector API `cast`
or generated by the compiler. For example, some vector mask operations
like `trueCount` require the input mask to have an integer element type,
so for floating-point masks the compiler automatically casts the mask to
the corresponding integer-type mask before doing the mask operation.
This kind of cast is very common.

If the vector element size does not change, the `VectorMaskCastNode`
generates no code; otherwise code is generated to extend or narrow the
mask. Either way this IR node is not free, because it may block some
optimizations. For example:
1. `(VectorStoreMask (VectorMaskCast (VectorLoadMask x)))`
   The middle `VectorMaskCast` prevents the following optimization:
   `(VectorStoreMask (VectorLoadMask x)) => (x)`
2. `(VectorMaskToLong (VectorMaskCast (VectorLongToMask x)))`, which
   blocks the optimization `(VectorMaskToLong (VectorLongToMask x)) => (x)`.
In these IR patterns, the value of the input `x` is not changed, so we
can safely do the optimization. But if the input value is changed, we
can't eliminate the cast.

The general idea of this PR is to introduce an `uncast_mask` helper
function that skips over a chain of `VectorMaskCastNode`s, like the
existing `Node::uncast(bool)` function. The function returns the first
node in the chain that is not a `VectorMaskCastNode`.
The intended use case is when the IR pattern to be optimized may
contain one or more consecutive `VectorMaskCastNode` and this does not
affect the correctness of the optimization. Then this function can be
called to eliminate the `VectorMaskCastNode` chain.
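
As a rough illustration of the chain-skipping idea, here is a small,
self-contained Java model. It is not the HotSpot implementation (the real
helper lives in C2's C++ code); the `Node` record, the kind strings, and
`uncastMask` are hypothetical names used only for this sketch.

```java
public class UncastMaskSketch {
    // Hypothetical toy IR node: a kind name plus an optional single input.
    record Node(String kind, Node in) {
        boolean isMaskCast() {
            return kind.equals("VectorMaskCast");
        }
    }

    // Skip consecutive VectorMaskCast nodes and return the first node that
    // is not a cast (the argument itself if it is not a cast).
    static Node uncastMask(Node n) {
        while (n != null && n.isMaskCast()) {
            n = n.in();
        }
        return n;
    }

    public static void main(String[] args) {
        Node x = new Node("x", null);
        Node load = new Node("VectorLoadMask", x);
        Node chain = new Node("VectorMaskCast", new Node("VectorMaskCast", load));
        Node store = new Node("VectorStoreMask", chain);

        // Looking through the cast chain exposes the VectorLoadMask input,
        // so the store/load pair can be matched by a pattern rule.
        Node inner = uncastMask(store.in());
        System.out.println(inner.kind());    // prints "VectorLoadMask"
        System.out.println(inner.in() == x); // prints "true"
    }
}
```

A caller would only use such a helper where skipping the casts cannot
change the result of the pattern being matched.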
Current optimizations related to `VectorMaskCastNode` include:
1. `(VectorMaskCast (VectorMaskCast x)) => (x)`, see JDK-8356760.
2. `(XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1))
=> (VectorMaskCast (VectorMaskCmp src1 src2 ncond))`, see JDK-8354242.
This PR does the following optimizations:
1. Extends the optimization pattern `(VectorMaskCast (VectorMaskCast x)) => (x)`
   to `(VectorMaskCast (VectorMaskCast ... (VectorMaskCast x))) => (x)`.
   The optimization is correct as long as the types of the head and tail
   `VectorMaskCastNode` are consistent.
2. Supports a new optimization pattern
   `(VectorStoreMask (VectorMaskCast ... (VectorLoadMask x))) => (x)`.
   Since the value before and after the pattern is a boolean vector, it
   remains unchanged as long as the vector length remains the same, and
   this is guaranteed at the API level.
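
The argument for pattern 2 can be checked with a simplified model in plain
Java. This is only a sketch under stated assumptions: lanes are modelled as
`long` values holding 0 or -1, which is not how every target represents
masks, and `loadMask`, `maskCast`, and `storeMask` are hypothetical
stand-ins for the corresponding IR nodes.

```java
import java.util.Arrays;

public class MaskRoundTripSketch {
    // Model of VectorLoadMask: each boolean becomes an all-zeros or
    // all-ones lane.
    static long[] loadMask(boolean[] bits) {
        long[] lanes = new long[bits.length];
        for (int i = 0; i < bits.length; i++) {
            lanes[i] = bits[i] ? -1L : 0L;
        }
        return lanes;
    }

    // Model of VectorMaskCast: the lane count is kept; each lane is
    // truncated to elemBits and sign-extended back.
    static long[] maskCast(long[] lanes, int elemBits) {
        long[] out = new long[lanes.length];
        int shift = 64 - elemBits;
        for (int i = 0; i < lanes.length; i++) {
            out[i] = (lanes[i] << shift) >> shift;
        }
        return out;
    }

    // Model of VectorStoreMask: each lane is turned back into a boolean.
    static boolean[] storeMask(long[] lanes) {
        boolean[] bits = new boolean[lanes.length];
        for (int i = 0; i < lanes.length; i++) {
            bits[i] = lanes[i] != 0;
        }
        return bits;
    }

    public static void main(String[] args) {
        boolean[] x = { true, false, false, true };
        // Load, cast through two element widths, then store again.
        long[] lanes = maskCast(maskCast(loadMask(x), 32), 8);
        System.out.println(Arrays.equals(x, storeMask(lanes))); // prints "true"
    }
}
```

In this model the casts never change the lane count or whether a lane is
set, which is why the whole chain reduces to the original boolean vector.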
I conducted some simple research on different mask generation methods
and mask operations, and obtained the following table, which includes
some potential optimization opportunities that may use this `uncast_mask`
function.
```
mask_gen\op  toLong  anyTrue  allTrue  trueCount  firstTrue  lastTrue
compare      N/A     N/A      N/A      N/A        N/A        N/A
maskAll      TBI     TBI      TBI      TBI        TBI        TBI
fromLong     TBI     TBI      N/A      TBI        TBI        TBI

mask_gen\op  and  or   xor  andNot  not  laneIsSet
compare      N/A  N/A  N/A  N/A     TBI  N/A
maskAll      TBI  TBI  TBI  TBI     TBI  TBI
fromLong     N/A  N/A  N/A  N/A     TBI  TBI
```
`TBI` indicates that there may be potential optimizations here that
require further investigation.
Benchmarks:
On an NVIDIA Grace machine with 128-bit SVE2:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  59.23   0.21   148.12  0.07   2.50
microMaskLoadCastStoreDouble128  ops/us  2.43    0.00   38.31   0.01   15.73
microMaskLoadCastStoreFloat128   ops/us  6.19    0.00   75.67   0.11   12.22
microMaskLoadCastStoreInt128     ops/us  6.19    0.00   75.67   0.03   12.22
microMaskLoadCastStoreLong128    ops/us  2.43    0.00   38.32   0.01   15.74
microMaskLoadCastStoreShort64    ops/us  28.89   0.02   75.60   0.09   2.62
```
On an NVIDIA Grace machine with 128-bit NEON:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  75.75   0.19   149.74  0.08   1.98
microMaskLoadCastStoreDouble128  ops/us  8.71    0.03   38.71   0.05   4.44
microMaskLoadCastStoreFloat128   ops/us  24.05   0.03   76.49   0.05   3.18
microMaskLoadCastStoreInt128     ops/us  24.06   0.02   76.51   0.05   3.18
microMaskLoadCastStoreLong128    ops/us  8.72    0.01   38.71   0.02   4.44
microMaskLoadCastStoreShort64    ops/us  24.64   0.01   76.43   0.06   3.10
```
On an AMD EPYC 9124 16-Core Processor with AVX3:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  82.13   0.31   115.14  0.08   1.40
microMaskLoadCastStoreDouble128  ops/us  0.32    0.00   0.32    0.00   1.01
microMaskLoadCastStoreFloat128   ops/us  42.18   0.05   57.56   0.07   1.36
microMaskLoadCastStoreInt128     ops/us  42.19   0.01   57.53   0.08   1.36
microMaskLoadCastStoreLong128    ops/us  0.30    0.01   0.32    0.00   1.05
microMaskLoadCastStoreShort64    ops/us  42.18   0.05   57.59   0.01   1.37
```
On an AMD EPYC 9124 16-Core Processor with AVX2:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  73.53   0.20   114.98  0.03   1.56
microMaskLoadCastStoreDouble128  ops/us  0.29    0.01   0.30    0.00   1.00
microMaskLoadCastStoreFloat128   ops/us  30.78   0.14   57.50   0.01   1.87
microMaskLoadCastStoreInt128     ops/us  30.65   0.26   57.50   0.01   1.88
microMaskLoadCastStoreLong128    ops/us  0.30    0.00   0.30    0.00   0.99
microMaskLoadCastStoreShort64    ops/us  24.92   0.00   57.49   0.01   2.31
```
On an AMD EPYC 9124 16-Core Processor with AVX1:
```
Benchmark                        Unit    Before  Error  After   Error  Uplift
microMaskLoadCastStoreByte64     ops/us  79.68   0.01   248.49  0.91   3.12
microMaskLoadCastStoreDouble128  ops/us  0.28    0.00   0.28    0.00   1.00
microMaskLoadCastStoreFloat128   ops/us  31.11   0.04   95.48   2.27   3.07
microMaskLoadCastStoreInt128     ops/us  31.10   0.03   99.94   1.87   3.21
microMaskLoadCastStoreLong128    ops/us  0.28    0.00   0.28    0.00   0.99
microMaskLoadCastStoreShort64    ops/us  31.11   0.02   94.97   2.30   3.05
```
This PR was tested on 128-bit, 256-bit, and 512-bit (QEMU) aarch64
environments and on two 512-bit x64 machines with various configurations,
including sve2, sve1, neon, avx3, avx2, avx1, sse4 and sse3. All tests
passed.
379 lines
16 KiB
Java
/*
 * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

/*
 * @test
 * @bug 8356760 8367292
 * @library /test/lib /
 * @summary IR test for VectorMask.toLong()
 * @modules jdk.incubator.vector
 *
 * @run driver compiler.vectorapi.VectorMaskToLongTest
 */

package compiler.vectorapi;

import compiler.lib.ir_framework.*;
import java.util.Arrays;
import jdk.incubator.vector.*;
import jdk.test.lib.Asserts;

public class VectorMaskToLongTest {
    static final VectorSpecies<Byte> B_SPECIES = ByteVector.SPECIES_MAX;
    static final VectorSpecies<Short> S_SPECIES = ShortVector.SPECIES_MAX;
    static final VectorSpecies<Integer> I_SPECIES = IntVector.SPECIES_MAX;
    static final VectorSpecies<Float> F_SPECIES = FloatVector.SPECIES_MAX;
    static final VectorSpecies<Long> L_SPECIES = LongVector.SPECIES_MAX;
    static final VectorSpecies<Double> D_SPECIES = DoubleVector.SPECIES_MAX;

    private static boolean[] m;

    static {
        m = new boolean[B_SPECIES.length()];
        Arrays.fill(m, true);
    }

    @DontInline
    public static void verifyMaskToLong(VectorSpecies<?> species, long inputLong, long got) {
        long expected = inputLong & (-1L >>> (64 - species.length()));
        Asserts.assertEquals(expected, got, "for input long " + inputLong);
    }

    // Tests for "VectorMaskToLong(MaskAll(0/-1)) => ((0/-1) & (-1ULL >> (64 - vlen)))"

    @ForceInline
    public static void testMaskAllToLong(VectorSpecies<?> species) {
        int vlen = species.length();
        long inputLong = 0L;
        // fromLong is expected to be converted to maskAll.
        long got = VectorMask.fromLong(species, inputLong).toLong();
        verifyMaskToLong(species, inputLong, got);

        inputLong = vlen >= 64 ? 0 : (0x1L << vlen);
        got = VectorMask.fromLong(species, inputLong).toLong();
        verifyMaskToLong(species, inputLong, got);

        inputLong = -1L;
        got = VectorMask.fromLong(species, inputLong).toLong();
        verifyMaskToLong(species, inputLong, got);

        inputLong = (-1L >>> (64 - vlen));
        got = VectorMask.fromLong(species, inputLong).toLong();
        verifyMaskToLong(species, inputLong, got);
    }

    @Test
    @IR(counts = { IRNode.MASK_ALL, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "sve", "true", "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.REPLICATE_B, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "asimd", "true", "sve", "false" })
    @IR(counts = { IRNode.REPLICATE_B, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    public static void testMaskAllToLongByte() {
        testMaskAllToLong(B_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.MASK_ALL, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "sve", "true", "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.REPLICATE_S, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "asimd", "true", "sve", "false" })
    @IR(counts = { IRNode.REPLICATE_S, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    public static void testMaskAllToLongShort() {
        testMaskAllToLong(S_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.MASK_ALL, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "sve", "true", "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.REPLICATE_I, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "asimd", "true", "sve", "false" })
    @IR(counts = { IRNode.REPLICATE_I, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    public static void testMaskAllToLongInt() {
        testMaskAllToLong(I_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.MASK_ALL, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "sve", "true", "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.REPLICATE_L, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "asimd", "true", "sve", "false" })
    @IR(counts = { IRNode.REPLICATE_L, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    public static void testMaskAllToLongLong() {
        testMaskAllToLong(L_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.MASK_ALL, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "sve", "true", "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.REPLICATE_I, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "asimd", "true", "sve", "false" })
    @IR(counts = { IRNode.REPLICATE_I, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    public static void testMaskAllToLongFloat() {
        testMaskAllToLong(F_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.MASK_ALL, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "sve", "true", "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.REPLICATE_L, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "asimd", "true", "sve", "false" })
    @IR(counts = { IRNode.REPLICATE_L, "= 0",
                   IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    public static void testMaskAllToLongDouble() {
        testMaskAllToLong(D_SPECIES);
    }

    // General cases for (VectorMaskToLong (VectorLongToMask (x))) => x.

    @Test
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "svebitperm", "true", "avx2", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "asimd", "true", "svebitperm", "false" })
    public static void testFromLongToLongByte() {
        // Test the case where some but not all bits are set.
        long inputLong = (-1L >>> (64 - B_SPECIES.length()))-1;
        long got = VectorMask.fromLong(B_SPECIES, inputLong).toLong();
        verifyMaskToLong(B_SPECIES, inputLong, got);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "svebitperm", "true", "avx2", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "asimd", "true", "svebitperm", "false" })
    public static void testFromLongToLongShort() {
        // Test the case where some but not all bits are set.
        long inputLong = (-1L >>> (64 - S_SPECIES.length()))-1;
        long got = VectorMask.fromLong(S_SPECIES, inputLong).toLong();
        verifyMaskToLong(S_SPECIES, inputLong, got);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "svebitperm", "true", "avx2", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "asimd", "true", "svebitperm", "false" })
    public static void testFromLongToLongInt() {
        // Test the case where some but not all bits are set.
        long inputLong = (-1L >>> (64 - I_SPECIES.length()))-1;
        long got = VectorMask.fromLong(I_SPECIES, inputLong).toLong();
        verifyMaskToLong(I_SPECIES, inputLong, got);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureOr = { "svebitperm", "true", "avx2", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "asimd", "true", "svebitperm", "false" })
    public static void testFromLongToLongLong() {
        // Test the case where some but not all bits are set.
        long inputLong = (-1L >>> (64 - L_SPECIES.length()))-1;
        long got = VectorMask.fromLong(L_SPECIES, inputLong).toLong();
        verifyMaskToLong(L_SPECIES, inputLong, got);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeature = { "svebitperm", "true" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "asimd", "true", "svebitperm", "false" })
    public static void testFromLongToLongFloat() {
        // Test the case where some but not all bits are set.
        long inputLong = (-1L >>> (64 - F_SPECIES.length()))-1;
        long got = VectorMask.fromLong(F_SPECIES, inputLong).toLong();
        verifyMaskToLong(F_SPECIES, inputLong, got);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeature = { "svebitperm", "true" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 0" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    @IR(counts = { IRNode.VECTOR_LONG_TO_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "asimd", "true", "svebitperm", "false" })
    public static void testFromLongToLongDouble() {
        // Test the case where some but not all bits are set.
        long inputLong = (-1L >>> (64 - D_SPECIES.length()))-1;
        long got = VectorMask.fromLong(D_SPECIES, inputLong).toLong();
        verifyMaskToLong(D_SPECIES, inputLong, got);
    }

    // General cases for VectorMask.toLong(). The main purpose is to test the IRs
    // for API VectorMask.toLong(). To avoid the IRs being optimized out by compiler,
    // we insert a VectorMask.not() before toLong().

    @ForceInline
    public static void testToLongGeneral(VectorSpecies<?> species) {
        long got = VectorMask.fromArray(species, m, 0).not().toLong();
        verifyMaskToLong(species, 0, got);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeature = { "asimd", "true" })
    public static void testToLongByte() {
        testToLongGeneral(B_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeature = { "asimd", "true" })
    public static void testToLongShort() {
        testToLongGeneral(S_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeature = { "asimd", "true" })
    public static void testToLongInt() {
        testToLongGeneral(I_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeature = { "asimd", "true" })
    public static void testToLongLong() {
        testToLongGeneral(L_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeature = { "asimd", "true" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    public static void testToLongFloat() {
        testToLongGeneral(F_SPECIES);
    }

    @Test
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 0",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureOr = { "avx512", "true", "rvv", "true" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeatureAnd = { "avx2", "true", "avx512", "false" })
    @IR(counts = { IRNode.VECTOR_STORE_MASK, "= 1",
                   IRNode.VECTOR_MASK_TO_LONG, "= 1" },
        applyIfCPUFeature = { "asimd", "true" })
    public static void testToLongDouble() {
        testToLongGeneral(D_SPECIES);
    }

    public static void main(String[] args) {
        TestFramework testFramework = new TestFramework();
        testFramework.setDefaultWarmup(10000)
                     .addFlags("--add-modules=jdk.incubator.vector")
                     .start();
    }
}