feat: gopls allow fuzzy matching with spelling errors #612

SimoneDutto · 2025-12-20T16:24:28Z

Description

Previously, fuzzy matching was skipped whenever not every pattern character could be matched. Now the score is computed for the number of pattern characters that do match.

From debugging the current scoring mechanism, my understanding is that m.scores is a matrix indexed as [length-candidate][length-pattern][another-dim].

Previously the score was always read at len(candidate) and len(pattern), but that is wrong when we know that only x characters of the pattern actually match.

So:

- match now requires every pattern character to match when the candidate is short
- the score is now read at len(candidate) and the number of matching characters in the pattern

Unit tests

I've added a few unit tests that I think show the improvement.
Without my patch, the new test cases fail with:

/tools/gopls/internal/fuzzy/matcher_test.go:270: Score(tstincrementatlnope, TestIncrementalNope) = 0, want: 0.61842
/tools/gopls/internal/fuzzy/matcher_test.go:270: Score(testssssss, TestIncrementalNope) = 0, want: 0.4

Manual QA

The examples from the issue now match, and in general matching seems broader. I'm not entirely sure how to test for false positives.

Fix: golang/go#74793


google-cla · 2025-12-20T16:24:34Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

muirdm · 2025-12-24T19:24:06Z

gopls/internal/fuzzy/matcher.go

 	m.roles = RuneRoles(candidate, m.rolesBuf[:])

-	return true
+	return true, j


I don't think this is quite right. j is checking exact consecutive matches from the start of the pattern. If the typo appears near the beginning of the pattern (e.g. "tetsincrementalnope") the scoring won't be right since we will only consider "te" (I think?).

What happens if we just let the entire pattern score instead of trimming to the prefix?

SimoneDutto · 2025-12-26

> I don't think this is quite right. j is checking exact consecutive matches from the start of the pattern. If the typo appears near the beginning of the pattern (e.g. "tetsincrementalnope") the scoring won't be right since we will only consider "te" (I think?).

Thanks for the comment; it made me realize that match is more nuanced than I had anticipated.
j is the number of pattern characters matched while going through the candidate a single time, not necessarily consecutively.
So for tetsincrementalnope, j is 3: t and e match consecutively, and the third character, t, is found later in the candidate, so it counts as well.

However, this 3 seems important for the score func: reading the resulting score matrix, it's evident that the score drops significantly after that 3.

So I would say that the score relies on the number of matching characters we can find by reading through the pattern and the candidate a single time.
As a result, a spelling error that uses a letter found later in the candidate is penalized less than one that uses a letter not in the candidate at all.
Ex. tetsincrementalnope (j = 3) is better than tewtincrementalNope (j = 2)
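
The single-pass count described above can be sketched as follows. This is a minimal illustration, not the gopls implementation: the function name matchedInOnePass is hypothetical, and the sketch assumes ASCII input compared case-insensitively.

```go
package main

import "strings"

// matchedInOnePass is a hypothetical helper, not part of gopls. It walks the
// candidate once and returns how many leading pattern characters were found
// in order (the value called j in the discussion). Assumes ASCII input.
func matchedInOnePass(pattern, candidate string) int {
	p := strings.ToLower(pattern)
	c := strings.ToLower(candidate)
	j := 0
	for i := 0; i < len(c) && j < len(p); i++ {
		if c[i] == p[j] {
			j++ // p[j] was found; look for the next pattern character
		}
	}
	return j
}
```

Under these assumptions, matchedInOnePass("tetsincrementalnope", "TestIncrementalNope") returns 3 and matchedInOnePass("tewtincrementalNope", "TestIncrementalNope") returns 2, which agrees with the two values quoted above.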

Looking at the score matrix, I don't think we can change this behavior without changing the score func as well.
I don't know if you have suggestions on how to proceed, because the change I've made is:

- making spelling mistakes later in the string less punishing, because we are basically cutting the pattern to be matched at the first mistake (bar the exceptions discussed above)

But it is not:

- fixing spelling mistakes in general, as we would if we used Levenshtein distance
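
For contrast, here is a minimal sketch of plain Levenshtein edit distance, which penalizes a mistake the same way wherever it occurs in the string. It is the textbook two-row dynamic program, not anything taken from gopls; the names levenshtein and min3 are illustrative, and bytes are compared directly (ASCII assumed).

```go
package main

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the number of single-byte insertions, deletions and
// substitutions needed to turn a into b. Illustrative only; assumes ASCII.
func levenshtein(a, b string) int {
	prev := make([]int, len(b)+1) // distances for the previous row of a
	curr := make([]int, len(b)+1) // distances for the current row of a
	for j := range prev {
		prev[j] = j // turning "" into b[:j] takes j insertions
	}
	for i := 1; i <= len(a); i++ {
		curr[0] = i // turning a[:i] into "" takes i deletions
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}
```

With this sketch, levenshtein("tetsincrementalnope", "testincrementalnope") is 2 (a swap of two adjacent letters counts as two substitutions) and levenshtein("tewtincrementalnope", "testincrementalnope") is 1, so the two typos would rank the other way round than they do by j.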

It might be worth getting rid of this scoring mechanism, since it behaves oddly with spelling mistakes, but that would be a bigger job, and it would change some people's UX because the new scoring would differ from the current one.

muirdm · 2025-12-24T19:38:51Z

gopls/internal/fuzzy/matcher.go

+	// if the candidate is short the characters have to match completely.
+	if len(candidate) <= shortPatternSize && j != len(m.patternLower) {
+		return false, 0
 	}


Why do we limit sloppy matches to longer patterns? I feel like we should instead filter based on a threshold of sloppy characters (i.e., allow 1 or 2 non-matching characters, maybe depending on pattern length).

I'm assuming this early return here is for performance (i.e., we want to skip the expensive scoring for candidates that clearly don't match). We don't want an O(n^2) check here, but maybe we can handle an O(3n) check where we backtrack a couple of times to allow up to 2 non-matching characters.
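
One way to read the suggestion above is a linear prefilter that tolerates a bounded number of unmatched pattern characters. The following is only a sketch under that reading: sloppyMatch and maxSkips are hypothetical names, the greedy strategy can miss some alignments, and ASCII input is assumed. Successful lookups only move forward through the candidate, and at most maxSkips+1 lookups can fail, each scanning the rest of the candidate once, so the work stays within a small constant multiple of the candidate length.

```go
package main

import "strings"

// sloppyMatch is a hypothetical prefilter, not gopls code. It matches pattern
// characters in order against the candidate and allows up to maxSkips pattern
// characters to go unmatched. It returns the number of matched characters and
// whether the candidate passes; when ok is false, matched is a partial count.
func sloppyMatch(pattern, candidate string, maxSkips int) (matched int, ok bool) {
	p := strings.ToLower(pattern)
	c := strings.ToLower(candidate)
	start := 0 // candidate index from which the next lookup resumes
	skips := 0
	for j := 0; j < len(p); j++ {
		idx := strings.IndexByte(c[start:], p[j])
		if idx < 0 {
			// p[j] does not occur in the rest of the candidate: treat it
			// as a typo and retry the next pattern character from start.
			skips++
			if skips > maxSkips {
				return matched, false
			}
			continue
		}
		start += idx + 1
		matched++
	}
	return matched, true
}
```

Under these assumptions, sloppyMatch("tetsincrementalnope", "TestIncrementalNope", 2) accepts the candidate with 18 matched characters, while sloppyMatch("testssssss", "TestIncrementalNope", 2) rejects it after the third unmatched s.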

SimoneDutto · 2025-12-26

Yeah, this early return is there because my change lets the score func run more frequently, and this is just a heuristic cutoff like "if the candidate to be matched is really short, it must have all chars matching".

However, this is entirely up for debate; I just wanted to float the idea of having a shortcut for short candidates.

I experimented with the idea of "at least 30% of matching characters" or something like that, but I would honestly prefer a simple, clear cutoff over hitting edge cases with percentages.
As I said, I'm entirely open to changing this approach once we solve the problem in the other comment, which is more important!

feat: gopls allow fuzzy matching with spelling errors

2b9b5d2


SimoneDutto mentioned this pull request Dec 20, 2025

x/tools/gopls: Symbols: use Levenshtein distance for ranking golang/go#74793

Open

muirdm reviewed Dec 24, 2025

SimoneDutto requested a review from muirdm December 29, 2025 21:23
