⚡️ Speed up function _sanitize_list by 44% in PR #10820 (cz/add-logs-feature)
#11171
⚡️ This pull request contains optimizations for PR #10820

If you approve this dependent PR, these changes will be merged into the original PR branch `cz/add-logs-feature`.

📄 44% (0.44x) speedup for `_sanitize_list` in `src/backend/base/langflow/services/database/models/transactions/model.py`

⏱️ Runtime: 3.65 milliseconds → 2.53 milliseconds (best of 77 runs)

📝 Explanation and details
The optimized code achieves a 44% speedup by introducing a cached version of the `_is_sensitive_key()` function using `@lru_cache(maxsize=512)`.

Key optimization:

The critical change is wrapping `_is_sensitive_key()` with an LRU cache through the new `_is_sensitive_key_cached()` function. The line profiler data reveals the impact:

- `_is_sensitive_key(key)` took 11.4ms (33.8% of `_sanitize_dict` time)
- `_is_sensitive_key_cached(key)` took 2.9ms (11.7% of `_sanitize_dict` time)

This represents a ~74% reduction in time spent on sensitivity checking within `_sanitize_dict`.

Why this works:
The `_is_sensitive_key()` function performs:

- a lowercase conversion (`key.lower()`)
- a membership check against `SENSITIVE_KEY_NAMES`
- a regex match via `SENSITIVE_KEYS_PATTERN.match()`

These operations, especially regex matching, are computationally expensive. In typical usage, dictionaries often have repeated keys across multiple records (e.g., "api_key", "password", "username"). The cache with `maxsize=512` stores previously computed results, converting O(n) regex operations into O(1) dictionary lookups for repeated keys.

Performance characteristics:
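The cache-hit behavior on repeated keys can be demonstrated directly with `functools.lru_cache` statistics (a toy regex stands in for the real pattern):

```python
import re
from functools import lru_cache

PATTERN = re.compile(r"key|password|token|secret")  # toy stand-in

@lru_cache(maxsize=512)
def is_sensitive(key: str) -> bool:
    # The regex runs only on a cache miss; hits are O(1) dict lookups.
    return PATTERN.search(key.lower()) is not None

# 500 records sharing the same two keys: 1000 checks, only 2 regex runs.
for _ in range(500):
    is_sensitive("api_key")
    is_sensitive("username")

print(is_sensitive.cache_info())  # hits=998, misses=2
```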
The test results show consistent speedups across all test cases, particularly:

- large homogeneous workloads (e.g., `test_large_list_of_dicts` with 500 identical structures)
- keys that appear in `SENSITIVE_KEY_NAMES`, where the cache hit happens immediately on the second occurrence

The `maxsize=512` is well-sized for typical workloads: most applications have far fewer than 512 unique key names, ensuring high cache hit rates without excessive memory overhead.

Workload impact:
Without `function_references`, the specific call context is unclear. However, given that this code sits in a database transaction model for sanitizing logs/data, this optimization is particularly valuable for workloads that sanitize many records sharing the same key names.

✅ Correctness verification report:
🌀 Click to see Generated Regression Tests
To edit these changes, run `git checkout codeflash/optimize-pr10820-2025-12-30T19.01.21` and push.