Skip to content

Conversation

@zzchun
Copy link

@zzchun zzchun commented Nov 27, 2025

Thank you for contributing to Ray! 🚀
Please review the Ray Contribution Guide before opening a pull request.

⚠️ Remove these instructions before submitting your PR.

💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete.

Description

Briefly describe what this PR accomplishes and why it's needed.

Related issues

Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234".

Additional information

Optional: Add implementation details, API changes, usage examples, screenshots, etc.

@zzchun zzchun requested a review from a team as a code owner November 27, 2025 10:30
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request primarily focuses on translating comments and docstrings from Chinese to English across several files, improving code readability and maintainability. It also introduces a ResourceBasedActorAutoscaler and a NodeAwareActorRemovalStrategy, along with corresponding tests. The changes are positive, especially the replacement of magic numbers with constants.

However, I've identified a few issues, mainly in the test files. There are two critical bugs: one is an incorrect comment indentation that will cause a syntax error, and the other is a typo in a variable name that will lead to a NameError. I've also pointed out some minor issues related to redundant or misplaced comments. Please address these points to ensure the tests are correct and the code quality is maintained.

class SlowUDF:
def __init__(self):
# This UDF slows down actor creation, and thus
# This UDF slows down actor creation, and thus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This comment's indentation is incorrect, which will lead to a syntax error. It should be indented to be inside the __init__ method.

Suggested change
# This UDF slows down actor creation, and thus
# This UDF slows down actor creation, and thus
# 创建两个并发执行的dataset
# Verify data integrity instead of the number of batches
total_rows2 = sum(len(batch["id"]) for batch in results2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There's a typo in the variable name here. It should be result2 to match the variable defined on line 647, not results2. This will cause a NameError.

Suggested change
total_rows2 = sum(len(batch["id"]) for batch in results2)
total_rows2 = sum(len(batch["id"]) for batch in result2)
)

# 直接创建测试实例,不调用 setup_class()
"""Test that the node-aware strategy does not remove active actors."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This docstring-like comment is redundant as the function test_node_aware_strategy_with_active_actors already has a docstring. Please remove it to avoid confusion.

assert selected in [actor3, actor4, actor5]

# 验证统计逻辑
"""Test the logic for counting idle actors per node."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This docstring-like comment is used within the function body. It's better to use a standard hash (#) comment for inline explanations.

Suggested change
"""Test the logic for counting idle actors per node."""
# Test the logic for counting idle actors per node.
"""测试按权重分配资源给多个 actor pools"""
# 创建 mock operator 和 actor pools
"""Test distribute resources by weight among multiple actor pools"""
# mock operator 和 actor pools
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This comment appears to be an incomplete translation. It should be in English for consistency with the rest of the file.

Suggested change
# mock operator actor pools
# mock operator and actor pools
# 创建两个并发执行的dataset
# Verify data integrity instead of the number of batches
total_rows2 = sum(len(batch["id"]) for batch in results2)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Variable name typo causes NameError in test

The variable results2 is referenced but was never defined. The variable is assigned as result2 on line 647, but then incorrectly referenced as results2 (with an extra 's') on line 649. This will cause a NameError when the test executes.

Fix in Cursor Fix in Web

@zzchun zzchun force-pushed the add_resource_based_autoscaler branch from 069c1fa to 94f678a Compare November 27, 2025 10:45
)

# 直接创建测试实例,不调��� setup_class()
"""Test that the node-aware strategy does not remove active actors."""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Duplicate docstring used instead of comment

A string literal duplicating the function's docstring appears as a standalone statement on line 1023, where a comment should be. Looking at the diff, the original Chinese comment # 直接创建测试实例,不调用 setup_class() was mistakenly replaced with the function's docstring """Test that the node-aware strategy does not remove active actors.""" instead of the appropriate English comment. This string literal is a no-op and appears to be a translation error.

Fix in Cursor Fix in Web

@zzchun zzchun force-pushed the add_resource_based_autoscaler branch 2 times, most recently from 8051811 to 4340aef Compare November 27, 2025 11:56
@ray-gardener ray-gardener bot added core Issues that should be addressed in Ray Core community-contribution Contributed by the community labels Nov 27, 2025
@zzchun zzchun force-pushed the add_resource_based_autoscaler branch 3 times, most recently from f20b9d1 to 8e6f0e0 Compare November 28, 2025 03:07
@zzchun zzchun force-pushed the add_resource_based_autoscaler branch from 8e6f0e0 to 1381960 Compare November 28, 2025 03:12
test_case._wait_for_actor_ready(pool, ready_ref)

assert pool.current_size() == 6
assert pool.current_size() == 4
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Test assertion changed breaks test logic consistency

The assertion at line 1098 was changed from pool.current_size() == 6 to pool.current_size() == 4, but the subsequent assertion at line 1105 still expects pool.current_size() == 3 after shrinking by 3 actors. This creates an arithmetic inconsistency: if the pool has 4 actors and shrinks by 3, the result should be 1 actor, not 3. The comment on line 1091 also still says "Expand to 6 actors" which contradicts the new assertion. The test will fail at runtime due to this inconsistency.

Additional Locations (1)

Fix in Cursor Fix in Web

min_resources=ExecutionResources(cpu=3, gpu=0, memory=3e9),
max_resources=ExecutionResources(cpu=6, gpu=0, memory=6e9)
min_resources=ExecutionResources(cpu=1, gpu=0, memory=3e9),
max_resources=ExecutionResources(cpu=6, gpu=0, memory=6e9),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Test for resource contention has incomplete commented-out code

Test case 3 in test_e2e_resource_based_autoscaling claims to test "resource contention" between two datasets, but several lines are commented out: ds4._execute_to_iterator(), shared_executor4 = ds4._current_executor, and the assertion for shared_executor4. The loop at line 678 only iterates over [shared_executor3] instead of both executors, so ds4 never has resource limits applied even though the comment says "Set the same resource limits for both executors". The test still runs both datasets concurrently, but only ds3 has resource limits, making the resource contention test incomplete.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Contributed by the community core Issues that should be addressed in Ray Core

1 participant