-
Notifications
You must be signed in to change notification settings - Fork 892
refactor: cleanup app-router #7628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the AppRouter class to improve code organization and testability by extracting functionality into separate, focused modules. The main changes involve creating dedicated classes for path validation and directory scanning, introducing an AppDefaults dataclass to simplify parameter passing, and adding comprehensive test coverage.
Key changes:
- Extracted path validation logic into
PathValidatorclass for better separation of security concerns - Extracted directory scanning logic into
DirectoryScannerclass for improved testability - Introduced
AppDefaultsdataclass to reduce parameter repetition across the codebase
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
marimo/_server/files/path_validator.py |
New module containing PathValidator class for security validation of file paths |
marimo/_server/files/directory_scanner.py |
New module containing DirectoryScanner class and is_marimo_app function |
marimo/_server/app_defaults.py |
New dataclass for consolidating app default configuration parameters |
marimo/_server/file_router.py |
Refactored to delegate to PathValidator and DirectoryScanner, significantly reducing code size |
marimo/_server/session_manager.py |
Updated to use AppDefaults instead of passing individual parameters |
marimo/_session/notebook/file_manager.py |
Updated to accept AppDefaults instead of individual config parameters |
marimo/_server/api/endpoints/home.py |
Updated imports to reference DirectoryScanner.MAX_FILES |
marimo/_server/api/endpoints/assets.py |
Updated to use PathValidator for security checks |
marimo/_pyodide/bootstrap.py |
Updated to use AppDefaults when creating AppFileManager |
tests/_server/test_path_validator.py |
New comprehensive tests for PathValidator functionality |
tests/_server/test_directory_scanner.py |
New comprehensive tests for DirectoryScanner functionality |
tests/_server/test_file_router.py |
Updated to use PathValidator instead of validate_inside_directory function |
tests/_server/rtc/test_rtc_doc.py |
Minor optimization moving AsyncGenerator import into TYPE_CHECKING block |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try: | ||
| validate_inside_directory(public_dir, file_path) | ||
| validator = PathValidator(public_dir) | ||
| validator.validate_inside_directory(public_dir, file_path) |
Copilot
AI
Dec 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PathValidator is instantiated with public_dir as the base directory on line 316, but then validate_inside_directory is called with public_dir and file_path again on line 317. This is redundant because you could use the instance's validate_file_access method instead, or simply call validate_inside_directory without passing public_dir to the constructor.
The current code creates a PathValidator with a base_directory but then doesn't leverage that instance state. Consider either:
- Using
validator.validate_file_access(file_path)instead, which would internally use the base_directory, OR - Creating the validator without a base directory:
validator = PathValidator()and keeping the currentvalidate_inside_directorycall
| validator.validate_inside_directory(public_dir, file_path) | |
| validator.validate_file_access(file_path) |
| from marimo._server.models.models import ( | ||
| CopyNotebookRequest, | ||
| SaveNotebookRequest, | ||
| ) |
Copilot
AI
Dec 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The imports CopyNotebookRequest and SaveNotebookRequest are imported twice: once at lines 20-23 as regular imports, and again at lines 35-39 inside the TYPE_CHECKING block. These are redundant - since they're already imported at the top, the TYPE_CHECKING imports should be removed.
| raise HTTPException( | ||
| status_code=HTTPStatus.REQUEST_TIMEOUT, | ||
| detail=f"Request timed out: Loading workspace files took too long. Showing first {file_count[0]} files.", # noqa: E501 | ||
| ) |
Copilot
AI
Dec 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The accumulated_results list is populated during directory scanning but is never returned when a timeout occurs. In the previous implementation (in file_router.py), partial results were stored in self._lazy_files before raising the timeout exception, allowing the caller to return partial results.
In the new DirectoryScanner, when a timeout occurs at line 161, the accumulated_results are lost. The caller in file_router.py (lines 268-275) expects to handle partial results on timeout, but self._lazy_files will still be None, so it falls back to an empty list instead of returning the files that were successfully scanned.
To fix this, DirectoryScanner should either:
- Include accumulated_results in the HTTPException's detail or as a custom attribute
- Return accumulated_results before raising the timeout exception
- Use a callback mechanism to provide partial results
| raise HTTPException( | |
| status_code=HTTPStatus.REQUEST_TIMEOUT, | |
| detail=f"Request timed out: Loading workspace files took too long. Showing first {file_count[0]} files.", # noqa: E501 | |
| ) | |
| exc = HTTPException( | |
| status_code=HTTPStatus.REQUEST_TIMEOUT, | |
| detail=f"Request timed out: Loading workspace files took too long. Showing first {file_count[0]} files.", # noqa: E501 | |
| ) | |
| # Attach partial results so callers can access scanned files | |
| setattr(exc, "partial_results", accumulated_results) | |
| raise exc |
| contain `marimo-version:`. | ||
| - Python (`.py`) files are marimo apps if the header (first 512 bytes) | ||
| contains both `marimo.App` and `import marimo`. | ||
| - If the header contains `# /// script`, read the full file and check for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the 512 trick is a DDOS prevention isn't this just an easy trick around that?
Could just bump this up a fair amount in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see for perf. May as well only check an MB or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only a refactor, not changing any logic
|
|
||
|
|
||
| @dataclass | ||
| class AppDefaults: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chance to couple with marimo/_ast/app_config.py? Agree we shouldn't move around a full AppConfig object though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is only for the server though, i don't think "defaults" belong in _ast
| contain `marimo-version:`. | ||
| - Python (`.py`) files are marimo apps if the header (first 512 bytes) | ||
| contains both `marimo.App` and `import marimo`. | ||
| - If the header contains `# /// script`, read the full file and check for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see for perf. May as well only check an MB or something
Reduce the size of the
AppRouterclass, make it more testable, add more tests.