Contributor

@konstantinb konstantinb commented Dec 16, 2025

What changes were proposed in this pull request?

HIVE-29367: fixing overflows in ConvertJoinMapJoin calculations

Why are the changes needed?

ConvertJoinMapJoin does not use StatsUtils.safeAdd()/safeMult() for all of its calculations. In some real-life scenarios it can make a catastrophic decision to convert a join to a mapjoin after calculating a negative size for the "small" table, resulting in an OOM during query processing.
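
To illustrate the failure mode, here is a minimal standalone sketch. It is not Hive code: the class, the helper names (saturatingAdd, saturatingMult, fitsInMemory), the row counts, and the threshold are all hypothetical, and the helpers only approximate the clamp-on-overflow behavior that safe arithmetic helpers are expected to provide. It shows how a plain long multiplication can wrap to a negative size that passes a "small enough" check, while a saturating multiplication does not.

```java
public class OverflowSketch {
    // Hypothetical helper: add, clamping to Long.MAX_VALUE on overflow.
    // Assumes non-negative operands, as is the case for size estimates.
    public static long saturatingAdd(long a, long b) {
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException e) {
            return Long.MAX_VALUE;
        }
    }

    // Hypothetical helper: multiply, clamping to Long.MAX_VALUE on overflow.
    // Assumes non-negative operands.
    public static long saturatingMult(long a, long b) {
        try {
            return Math.multiplyExact(a, b);
        } catch (ArithmeticException e) {
            return Long.MAX_VALUE;
        }
    }

    // Hypothetical stand-in for a "does the small table fit?" check.
    public static boolean fitsInMemory(long estimatedSize, long threshold) {
        return estimatedSize <= threshold;
    }

    public static void main(String[] args) {
        long rows = 4_000_000_000_000L;  // made-up row count
        long avgRowSize = 4_000_000L;    // made-up average row size in bytes
        long threshold = 1L << 30;       // made-up 1 GiB mapjoin threshold

        // Plain multiplication silently wraps around to a negative value.
        long wrapped = rows * avgRowSize;
        // Saturating multiplication clamps to Long.MAX_VALUE instead.
        long clamped = saturatingMult(rows, avgRowSize);

        System.out.println("wrapped size is negative: " + (wrapped < 0));
        System.out.println("wrapped passes check:     " + fitsInMemory(wrapped, threshold));
        System.out.println("clamped passes check:     " + fitsInMemory(clamped, threshold));
    }
}
```

Because a negative estimate is always below any positive threshold, a single unchecked operation in the size calculation is enough to make an arbitrarily large table look "small".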

Does this PR introduce any user-facing change?

No

How was this patch tested?

Via unit testing and load testing on a custom Hive installation based on a 4.0x version.

You can see the test output generated by the pre-fix code here: HIVE-29637-mapjoin_stats_overflow q out original
It clearly confirms the decision to perform a mapjoin despite a very large volume of data.

@konstantinb konstantinb changed the title HIVE-29367: preventing Long overflows in ConvertJoinMapJoin Dec 18, 2025
@konstantinb konstantinb marked this pull request as ready for review December 18, 2025 19:18
