Skip to content

New cop Style/ComparableBetween #13936

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 10, 2025

Conversation

lovro-bikic
Copy link
Contributor

@lovro-bikic lovro-bikic commented Mar 2, 2025

Implements cop for style guide rule Ranges or between (picking Comparable#between? since it's the next fastest option after logical comparison and Range#include? isn't equivalent to logical comparison in all cases).

The cop registers cases such as x >= min && x <= max as offenses and autocorrects to x.between?(min, max).

Benchmark

Benchmark script
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report('logical comparison') do
    value = rand(800..2200)
    value >= 1000 && value <= 2000
  end

  x.report('Comparable#between?') do
    value = rand(800..2200)
    value.between?(1000, 2000)
  end

  x.report('Range#include?') do
    value = rand(800..2200)
    (1000..2000).include?(value)
  end

  x.compare!
end
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-darwin23]
Warming up --------------------------------------
  logical comparison   661.161k i/100ms
 Comparable#between?   527.743k i/100ms
      Range#include?   440.940k i/100ms
Calculating -------------------------------------
  logical comparison      6.762M (± 1.0%) i/s  (147.89 ns/i) -     34.380M in   5.084957s
 Comparable#between?      5.251M (± 1.0%) i/s  (190.44 ns/i) -     26.387M in   5.025596s
      Range#include?      4.273M (± 4.6%) i/s  (234.03 ns/i) -     21.606M in   5.068613s

Comparison:
  logical comparison:  6761845.8 i/s
 Comparable#between?:  5251112.6 i/s - 1.29x  slower
      Range#include?:  4272925.5 i/s - 1.58x  slower

Before submitting the PR make sure the following are checked:

  • The PR relates to only one subject with a clear title and description in grammatically correct, complete sentences.
  • Wrote good commit messages.
  • Commit message starts with [Fix #issue-number] (if the related issue exists).
  • Feature branch is up-to-date with master (if not - rebase it).
  • Squashed related commits together.
  • Added tests.
  • Ran bundle exec rake default. It executes all tests and runs RuboCop on its own code.
  • Added an entry (file) to the changelog folder named {change_type}_{change_description}.md if the new code introduces user-observable changes. See changelog entry format for details.
Comment on lines +58 to +65
def register_offense(node, min_and_value, max_and_value)
value = (min_and_value & max_and_value).first
min = min_and_value.find { _1 != value }
max = max_and_value.find { _1 != value }
Copy link
Contributor Author

@lovro-bikic lovro-bikic Mar 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very happy with this solution, but I couldn't find something more elegant because node matchers apparently don't support named captures that are returned as a hash instead of an array in the order of captures.

For example, in the matcher {$_value :>= $_min | $_min :<= $_value}, the return value will either be [value, min] or [min, value], but I can't know which is which without the extra logic.

If I'm missing something here to make this logic simpler, let me know. 🙏

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that will do for now.

@lovro-bikic
Copy link
Contributor Author

lovro-bikic commented Mar 3, 2025

I was not aware of #12333 before opening the PR. I'd like to evaluate the comments which led to closing the issue.

The benchmark which showed between? to be >2x slower than logical comparison is skewed because it hardcodes the value (42) below the minimum (1000), so in every logical comparison run, value <= 2000 part never evaluates because value >= 1000 is always false. This doesn't represent real-world situations where we don't know what the actual value is.

The benchmark I attached in PR description uses a random value which can be either below the minimum, inside the boundary, or above the maximum. With it, between? is on average ~1.3x slower than logical comparison, which I believe is not a big enough difference to write it off on performance grounds (and I don't think that Style cops should be written off on such grounds anyway, unless performance diff is sizeable).

We should also consider the case when value is a method call:

Benchmark script
require 'benchmark/ips'

class Run
  def initialize
    @value = rand(800..2200)
  end

  attr_reader :value

  def comparison
    value >= 1000 && value <= 2000
  end

  def between
    value.between?(1000, 2000)
  end
end

Benchmark.ips do |x|
  x.report('logical comparison') do
    Run.new.comparison
  end

  x.report('Comparable#between?') do
    Run.new.between
  end

  x.compare!
end
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-darwin23]
Warming up --------------------------------------
  logical comparison   340.541k i/100ms
 Comparable#between?   296.352k i/100ms
Calculating -------------------------------------
  logical comparison      3.418M (± 1.0%) i/s  (292.53 ns/i) -     17.368M in   5.081004s
 Comparable#between?      2.941M (± 1.1%) i/s  (340.04 ns/i) -     14.818M in   5.039218s

Comparison:
  logical comparison:  3418459.7 i/s
 Comparable#between?:  2940790.5 i/s - 1.16x  slower

which brings the two even closer in terms of performance. In this codebase, there's one such example:

source_length >= max_line_length &&
source_length - comment.source_range.length <= max_line_length

I think that this example also shows the stylistic benefit of having such a cop because it got autocorrected to:

max_line_length.between?(source_length - comment.source_range.length, source_length)

which makes it easier to understand the intention (checking that a value is within a boundary) and what are the boundaries (in original code, condition for max comes before condition for min, which does not read naturally). It's also shorter since max_line_length is not repeated, which causes the condition to be broken into two lines in original code.

While someone might find logical comparison preferable, I think between? is definitely a style which should be made available as a cop to those who wish to enforce it.

With this in mind, I think we should reevaluate the decision to not implement this cop (or update the style guide).

@bbatsov
Copy link
Collaborator

bbatsov commented Mar 3, 2025

I agree with your sentiment and I'm fine with including this cop. I guess it might not be a bad idea to mention the small performance hit, so that people are aware of it. Few people will ever be in a situation where such performance difference will matter, though.

@lovro-bikic lovro-bikic force-pushed the style-comparable-between branch from 455a0dc to 152db46 Compare March 3, 2025 23:56
@lovro-bikic
Copy link
Contributor Author

lovro-bikic commented Mar 3, 2025

Updated cop docs with a note on performance.

@lovro-bikic lovro-bikic force-pushed the style-comparable-between branch 2 times, most recently from aca62af to 94006cf Compare March 4, 2025 00:03
@lovro-bikic lovro-bikic force-pushed the style-comparable-between branch from 94006cf to ca0ae0b Compare March 4, 2025 08:29
@bbatsov bbatsov merged commit 0ee5eb3 into rubocop:master Mar 10, 2025
24 checks passed
@lovro-bikic lovro-bikic deleted the style-comparable-between branch March 10, 2025 07:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants