feat: Add support for high resolution fractional #1740
chrfwow wants to merge 6 commits into open-feature:main
Signed-off-by: christian.lutnik <christian.lutnik@dynatrace.com>
      ]
    ],
-   "result": "blue"
+   "result": "red"
In some cases, we will compute a different bucket now
Do we want more tests for this? I stuck to the current test suite.
Code Review
The pull request refactors the fractional targeting logic to support various input types (numbers, booleans) for bucketing by converting them to byte arrays and introduces a new numberToByteArray helper. The core bucketing algorithm was also revised to use an integer-based bit shift operation. Review comments highlight two key areas for improvement: the numberToByteArray method has a potential precision loss issue for BigDecimal and could be made more concise using ByteBuffer, and the bucketing logic has a subtle bug related to handling signed vs. unsigned integers from the MurmurHash3 function, which requires a bitmask for correct range mapping.
private byte[] numberToByteArray(Number number) {
    if (number instanceof Integer) {
        return new byte[] {
            (byte) ((int) number >> 24),
            (byte) ((int) number >> 16),
            (byte) ((int) number >> 8),
            (byte) ((int) number)
        };
    } else if (number instanceof Double) {
        return numberToByteArray(Double.doubleToLongBits((Double) number));
    } else if (number instanceof Long) {
        return new byte[] {
            (byte) ((long) number >> 56),
            (byte) ((long) number >> 48),
            (byte) ((long) number >> 40),
            (byte) ((long) number >> 32),
            (byte) ((long) number >> 24),
            (byte) ((long) number >> 16),
            (byte) ((long) number >> 8),
            (byte) ((long) number)
        };
    } else if (number instanceof BigInteger) {
        return ((BigInteger) number).toByteArray();
    } else if (number instanceof Byte) {
        return new byte[] {(byte) number};
    } else if (number instanceof Short) {
        return new byte[] {
            (byte) ((short) number >> 8),
            (byte) ((short) number)
        };
    } else if (number instanceof Float) {
        return numberToByteArray(Float.floatToIntBits((Float) number));
    } else if (number instanceof BigDecimal) {
        return numberToByteArray(Double.doubleToLongBits(number.doubleValue()));
    } else {
        throw new IllegalArgumentException("Unsupported number type: " + number.getClass());
    }
}
The numberToByteArray method can be improved in two ways:
- Precision Loss with BigDecimal: For BigDecimal, converting to double via doubleValue() can cause a loss of precision, which undermines the goal of supporting high-resolution fractional targeting. A better approach is to convert the BigDecimal to its string representation to preserve its full precision.
- Readability and Conciseness: The manual byte array creation for primitive wrapper types like Integer, Long, Double, Float, and Short is verbose. Using java.nio.ByteBuffer would make the code more concise and idiomatic.
private byte[] numberToByteArray(Number number) {
    if (number instanceof Integer) {
        return ByteBuffer.allocate(4).putInt(number.intValue()).array();
    } else if (number instanceof Double) {
        return ByteBuffer.allocate(8).putDouble(number.doubleValue()).array();
    } else if (number instanceof Long) {
        return ByteBuffer.allocate(8).putLong(number.longValue()).array();
    } else if (number instanceof BigInteger) {
        return ((BigInteger) number).toByteArray();
    } else if (number instanceof Byte) {
        return new byte[] {number.byteValue()};
    } else if (number instanceof Short) {
        return ByteBuffer.allocate(2).putShort(number.shortValue()).array();
    } else if (number instanceof Float) {
        return ByteBuffer.allocate(4).putFloat(number.floatValue()).array();
    } else if (number instanceof BigDecimal) {
        // Using doubleValue() can lead to precision loss for high-resolution numbers.
        // A string representation preserves the precision.
        return number.toString().getBytes(StandardCharsets.UTF_8);
    } else {
        throw new IllegalArgumentException("Unsupported number type: " + number.getClass());
    }
}
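As a sanity check on the second point, the sketch below compares the manual big-endian shifts from the diff with the ByteBuffer calls from the suggestion, for int and long values. The class and method names (ByteBufferEquivalence, manualInt, bufferInt, and so on) are hypothetical and exist only for this illustration; the sketch relies on ByteBuffer defaulting to big-endian byte order.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper class, not part of the PR.
final class ByteBufferEquivalence {

    // Manual big-endian encoding, mirroring the shifts in the original diff.
    static byte[] manualInt(int v) {
        return new byte[] {(byte) (v >> 24), (byte) (v >> 16), (byte) (v >> 8), (byte) v};
    }

    // ByteBuffer writes big-endian unless the byte order is changed explicitly.
    static byte[] bufferInt(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    static byte[] manualLong(long v) {
        return new byte[] {
            (byte) (v >> 56), (byte) (v >> 48), (byte) (v >> 40), (byte) (v >> 32),
            (byte) (v >> 24), (byte) (v >> 16), (byte) (v >> 8), (byte) v
        };
    }

    static byte[] bufferLong(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    public static void main(String[] args) {
        // Both encodings are expected to produce identical byte arrays.
        System.out.println(Arrays.equals(manualInt(0x12345678), bufferInt(0x12345678)));
        System.out.println(Arrays.equals(manualInt(-2), bufferInt(-2)));
        System.out.println(Arrays.equals(manualLong(Long.MIN_VALUE), bufferLong(Long.MIN_VALUE)));
    }
}
```

If the two encodings agree for the integer types, switching to ByteBuffer should not change any bucket assignment for those inputs; floating-point inputs would need their own check.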
Disagree with point 1. Other implementations will also convert large numbers to a Double-equivalent type, so we will be consistent.
As for point 2, do we want the java.nio.ByteBuffer?
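To make the trade-off in point 1 concrete, here is a minimal sketch of what each encoding does to BigDecimal inputs. The class and method names (BigDecimalEncodingDemo, viaDouble, viaString) are hypothetical; the two methods only imitate the BigDecimal branches of the two versions above.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of the PR.
final class BigDecimalEncodingDemo {

    // Imitates the PR's approach: go through the nearest double.
    static long viaDouble(BigDecimal v) {
        return Double.doubleToLongBits(v.doubleValue());
    }

    // Imitates the reviewer's suggestion: hash the decimal string.
    static byte[] viaString(BigDecimal v) {
        return v.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("0.1000000000000000000001");
        BigDecimal b = new BigDecimal("0.1000000000000000000002");

        // Distinct values that round to the same double share one hash input.
        System.out.println(viaDouble(a) == viaDouble(b));

        // The string encoding keeps them apart ...
        System.out.println(Arrays.equals(viaString(a), viaString(b)));

        // ... but it also separates numerically equal values with different scale.
        BigDecimal one = new BigDecimal("1.0");
        BigDecimal oneMoreScale = new BigDecimal("1.00");
        System.out.println(one.compareTo(oneMoreScale) == 0);
        System.out.println(Arrays.equals(viaString(one), viaString(oneMoreScale)));
    }
}
```

Neither encoding is free of surprises: the double route merges nearby values, while the string route is sensitive to scale. Which behavior is preferable depends on what the other implementations do, as noted in the reply above.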
long mmrHash = MurmurHash3.hash32x86(hashKey, 0, hashKey.length, 0);
int bucket = (int) (((mmrHash * totalWeight) >> 32) & 0xFFFFFFFFL);
The bucketing logic has a subtle issue with signed vs. unsigned integers. MurmurHash3.hash32x86 returns a signed int. To correctly use it for range mapping as an unsigned 32-bit value, it should be converted to a long using a bitmask to avoid sign extension. This ensures the bucketing is fair and consistent across different hash values.
long mmrHash = MurmurHash3.hash32x86(hashKey, 0, hashKey.length, 0) & 0xFFFFFFFFL;
int bucket = (int) ((mmrHash * totalWeight) >> 32);
Signed-off-by: christian.lutnik <christian.lutnik@dynatrace.com>
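The effect of the missing mask can be reproduced without the hashing library by feeding raw int values into both formulas. In this sketch the class and method names (BucketDemo, unmaskedBucket, maskedBucket) are hypothetical, and totalWeight is assumed to be an int, which the snippet above does not confirm.

```java
// Hypothetical helper class, not part of the PR.
final class BucketDemo {

    // Formula from the diff: the int hash is sign-extended when widened to long.
    static int unmaskedBucket(int hash, int totalWeight) {
        long mmrHash = hash;
        return (int) (((mmrHash * totalWeight) >> 32) & 0xFFFFFFFFL);
    }

    // Formula from the review: treat the hash as an unsigned 32-bit value first.
    static int maskedBucket(int hash, int totalWeight) {
        long mmrHash = hash & 0xFFFFFFFFL;
        return (int) ((mmrHash * totalWeight) >> 32);
    }

    public static void main(String[] args) {
        int totalWeight = 100;
        int[] hashes = {0x40000000, Integer.MIN_VALUE, -1};
        for (int hash : hashes) {
            // For negative hashes the unmasked bucket falls outside [0, totalWeight).
            System.out.println(hash + ": unmasked=" + unmaskedBucket(hash, totalWeight)
                    + " masked=" + maskedBucket(hash, totalWeight));
        }
    }
}
```

With totalWeight = 100, a hash of -1 yields -1 from the unmasked formula and 99 from the masked one, and Integer.MIN_VALUE yields -50 versus 50. Non-negative hashes give the same bucket in both versions.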
Idk why there are PMD errors, but I think they are unrelated and false positives.
This PR
Adds high resolution fractional support
Related Issues
Fixes #1738