Skip to content

Conversation

@hsfzxjy
Copy link
Contributor

@hsfzxjy hsfzxjy commented Aug 21, 2023

Fix #190255

This PR presents two faster escaping functions as the replacement of the original slower one.

The first one escapes each byte 0xab into '\xab', which is most scalable and has promising runtime
efficiency, except that it relies on external commands od and tr.

The second one utilizes bash builtin string substitution, which is much faster and has zero dependency, except that it escapes only '\\' -> '\\\\' and ';' -> '\x3b', which scales up badly for more patterns.

Currently it defaults to use the second function, but if env VSCODE_SHELL_FULL_ESCAPE is set and od, tr are
available, it will switch to the first one.

We may discuss more on the choice of the implementation.

@hsfzxjy hsfzxjy force-pushed the fast-bash-escape-190255 branch from 84a3540 to d0b8386 Compare August 21, 2023 16:29
Comment on lines 106 to 108
# We default to use the second function, but if env VSCODE_SHELL_FULL_ESCAPE is set and od, tr are
# available, we will switch to the first one.
if [ -n "$VSCODE_SHELL_FULL_ESCAPE" ] && command -v od &>/dev/null && command -v tr &>/dev/null; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is VSCODE_SHELL_FULL_ESCAPE required because you're concerned about breakages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both solutions have their pros and cons, so I temporarily make one of them as default and the other opt-in controlled by a newly introduced flag VSCODE_SHELL_FULL_ESCAPE, but this is apparently not the most optimal solution. I am open to consider any further advice as you see fit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we perhaps select a magic number for $1 length and opt for the new approach when it's above that threshold?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable. What about make it 2000 bytes? This is a point where I can observe around 0.1s delay with the old method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On your machine how long does the new method take? 100ms is quite large still

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some benchmarks

Input size (bytes) string sub(new default) od+tr(new opt-in) old
2k 1ms 5ms 107ms
100k 33ms 48ms >1 min

I think with the new method the overhead is tolerable even with quite large input.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They both seem very fast. If I understand right, the od+tr method is superior as it escapes more correctly. Since perf is comparable, I think we should default to the od+tr method if they're available without an opt-in variable, then fallback to the more naive method when they're not available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I will change accordingly. Shall we still set up the magic number?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, let's make it 2k

Copy link
Contributor Author

@hsfzxjy hsfzxjy Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, is /dev/null always available for the execution context of shellIntegration-bash.sh? The script is using command -v od &>/dev/null to detect the availability of od.

Edit: never mind. I found solution to get around it.

@Tyriar Tyriar added this to the August 2023 milestone Aug 22, 2023
@hsfzxjy hsfzxjy force-pushed the fast-bash-escape-190255 branch 2 times, most recently from 961cfa0 to f8e932f Compare August 23, 2023 15:14
@hsfzxjy
Copy link
Contributor Author

hsfzxjy commented Aug 23, 2023

The latest push contains these further modification:

  1. od+tr becomes the default solution for faster escaping, and if od or tr is not available, we fallback to use the string substitution solution.
  2. The old __vsc_escape_value function is still used when the length of $1 is below 2000. Otherwise we switch to the new faster function.
@hsfzxjy hsfzxjy force-pushed the fast-bash-escape-190255 branch 2 times, most recently from bba6901 to d273696 Compare August 24, 2023 07:28
@hsfzxjy hsfzxjy force-pushed the fast-bash-escape-190255 branch from d273696 to 440896c Compare August 26, 2023 05:20
Tyriar
Tyriar previously approved these changes Aug 28, 2023
Copy link
Member

@Tyriar Tyriar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hsfzxjy, this looks good. I'll wait until next month to merge so we get some testing in Insiders before it goes out.

@Tyriar Tyriar modified the milestones: August 2023, September 2023 Aug 28, 2023
DonJayamanne
DonJayamanne previously approved these changes Aug 28, 2023
@hsfzxjy hsfzxjy dismissed stale reviews from DonJayamanne and Tyriar via 440896c August 29, 2023 04:21
@hsfzxjy hsfzxjy force-pushed the fast-bash-escape-190255 branch 3 times, most recently from 5c76f39 to 2067753 Compare August 29, 2023 04:34
@hsfzxjy
Copy link
Contributor Author

hsfzxjy commented Aug 29, 2023

Accidental force push dimissed the reviews. Sorry! 🙏

@Tyriar Tyriar merged commit fa0b183 into microsoft:main Sep 11, 2023
@Tyriar
Copy link
Member

Tyriar commented Sep 28, 2023

This ended up getting reverted due to a hang #190255 (comment)

@github-actions github-actions bot locked and limited conversation to collaborators Oct 26, 2023
@hsfzxjy hsfzxjy deleted the fast-bash-escape-190255 branch December 5, 2023 15:28
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

5 participants