8358066: Non-ascii package names gives compilation error "import requires canonical name" #25567

archiecobbs · 2025-05-31T21:05:35Z

A simple counting bug in Convert.utfNumChars() causes bogus compiler errors for import statements of non-ASCII class names when the compiler is configured to use one of the older UTF-8 based Name table implementations (e.g., by specifying the -XDuseUnsharedTable=true flag).
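
For readers unfamiliar with the counting involved, here is a minimal sketch of the idea. It is not the actual Convert.utfNumChars() code. It assumes the modified UTF-8 form produced by DataOutputStream.writeUTF, in which every Java char (including each half of a surrogate pair) is encoded as its own 1- to 3-byte sequence, so the char count of a valid byte range equals the number of non-continuation bytes in it. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class Utf8CharCountSketch {

    // Encode a string as modified UTF-8, dropping the two-byte length
    // prefix that DataOutputStream.writeUTF emits.
    static byte[] toModifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Count the chars encoded in bytes[off, off + len), assuming the range
    // is valid and starts on a character boundary. Continuation bytes have
    // the bit pattern 10xxxxxx and are skipped.
    static int countChars(byte[] bytes, int off, int len) {
        int count = 0;
        for (int i = off; i < off + len; i++) {
            if ((bytes[i] & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "ab\u00ABcd\u2264ef\ud83d\udd34gh";
        byte[] utf8 = toModifiedUtf8(s);
        System.out.println(utf8.length);                       // 19
        System.out.println(countChars(utf8, 0, utf8.length));  // 12
        System.out.println(s.length());                        // 12
    }
}
```

The point of the sketch is only that byte length and char count diverge as soon as a name contains non-ASCII characters, which is where a miscount would surface.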

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8358066: Non-ascii package names gives compilation error "import requires canonical name" (Bug - P3)

Reviewers

Jan Lahoda (@lahodaj - Reviewer)
Naoto Sato (@naotoj - Reviewer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25567/head:pull/25567
$ git checkout pull/25567

Update a local copy of the PR:
$ git checkout pull/25567
$ git pull https://git.openjdk.org/jdk.git pull/25567/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25567

View PR using the GUI difftool:
$ git pr show -t 25567

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25567.diff

Using Webrev

Link to Webrev Comment

bridgekeeper · 2025-05-31T21:06:16Z

👋 Welcome back acobbs! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-05-31T21:06:19Z

@archiecobbs This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8358066: Non-ascii package names gives compilation error "import requires canonical name"

Reviewed-by: jlahoda, naoto

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 73 new commits pushed to the master branch:

da49fa5: 8354460: Streaming output for attach API should be turned on by default
704b599: 8358534: Bailout in Conv2B::Ideal when type of cmp input is not supported
e235b61: 8357987: [JVMCI] Add support for retrieving all methods of a ResolvedJavaType
... and 70 more: https://git.openjdk.org/jdk/compare/84002d12ed83c8254422fdda349aa647422d0768...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk · 2025-05-31T21:06:46Z

@archiecobbs The following label will be automatically applied to this pull request:

compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2025-06-02T14:27:07Z

Webrevs

01: Full - Incremental (4967cac9)
00: Full (18f9c0c9)

lahodaj

Overall, looks sensible. Comments for consideration inline.

lahodaj · 2025-06-02T17:59:19Z

test/langtools/tools/javac/nametable/TestUtfNumChars.java

+    public static void main(String[] args) {
+
+        // This is the string "ab«cd≤ef🔴gh"
+        String s = "ab\u00ABcd\u2264ef\ud83d\udd34gh";


Nit: not sure if there's a strong reason to use escapes in the string literal, esp. given the Unicode characters are used in the comment above. Given #24574 is integrated, I would say, use UTF-8 in the string literal, and drop the comment?

archiecobbs

I was aware of that recent change, but I don't understand the testing mechanics well enough to verify that -encoding utf-8 is being added to the regression test compilation step on every possible platform (and wouldn't that be a jtreg thing, not an openjdk thing?)

So I was playing it safe, but if you say it's OK to assume compilation is always being done with -encoding utf-8 then I'll take your word for it :)

src/jdk.compiler/share/classes/com/sun/tools/javac/util/Convert.java

lahodaj

Looks good to me. (I didn't run tests, please ask if you would want me to run them.)

archiecobbs · 2025-06-03T14:46:02Z

> Looks good to me. (I didn't run tests, please ask if you would want me to run them.)

Thanks for the review!

Re: tests, to be honest I'm not sure what criteria to use to determine that. This change seems pretty innocuous but "seems" is a dangerous word. I'm happy to follow your advice on this... ? Thanks.

jolarsen · 2025-06-03T16:08:20Z

For what it's worth: I reported the issue, and the test I wrote - splitting a UTF-8 import statement into a list of package-parts and classname using lastIndexOfAscii('.') and utfNumChars - now works fine.
Building javac and verifying the whole build is too big of a task for me.
This was a trip down memory lane as I did a 5 year stint at Sun Micro some 20 years ago :-) The times of AppServer 8 and start of GlassFish.
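
The byte-offset versus character-index distinction behind the splitting test described above can be sketched as follows. This is an illustration only: lastIndexOfAsciiByte and countChars are hypothetical stand-ins, not the javac methods, and the sketch uses standard UTF-8 from String.getBytes, which matches modified UTF-8 for the BMP characters (other than U+0000) used here. Searching for '.' at the byte level is safe because ASCII byte values never occur inside a multi-byte UTF-8 sequence.

```java
import java.nio.charset.StandardCharsets;

final class ImportSplitSketch {

    // Hypothetical helper: index of the last byte equal to the given
    // ASCII character, or -1 if it does not occur.
    static int lastIndexOfAsciiByte(byte[] utf8, char ascii) {
        for (int i = utf8.length - 1; i >= 0; i--) {
            if (utf8[i] == (byte) ascii) {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical helper: number of chars encoded in utf8[off, off + len),
    // counted as the number of non-continuation bytes.
    static int countChars(byte[] utf8, int off, int len) {
        int count = 0;
        for (int i = off; i < off + len; i++) {
            if ((utf8[i] & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String name = "r\u00E9sum\u00E9.util.Caf\u00E9";
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        int dotByte = lastIndexOfAsciiByte(utf8, '.');  // position in bytes
        int dotChar = countChars(utf8, 0, dotByte);     // position in chars

        System.out.println(dotByte);                    // 13
        System.out.println(dotChar);                    // 11
        System.out.println(name.lastIndexOf('.'));      // 11
        System.out.println(name.substring(0, dotChar)); // package part
        System.out.println(name.substring(dotChar + 1)); // class name
    }
}
```

With two non-ASCII characters before the last dot, the byte offset (13) is two larger than the char index (11), so any off-by-N in the char count splits the name in the wrong place.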

lahodaj · 2025-06-03T18:36:46Z

> > Looks good to me. (I didn't run tests, please ask if you would want me to run them.)
>
> Thanks for the review!
>
> Re: tests, to be honest I'm not sure what criteria to use to determine that. This change seems pretty innocuous but "seems" is a dangerous word. I'm happy to follow your advice on this... ? Thanks.

I've started a test run, the results will hopefully be tomorrow (my time, CEST). I think we should wait with the integration before they run.

Alternatively you could issue /integrate delegate, and I'd finish the integration if the tests are OK. But I think there's still time before RDP1, so there's not (yet) a need to do this.

archiecobbs · 2025-06-03T18:41:43Z

> I've started a test run, the results will hopefully be tomorrow (my time, CEST). I think we should wait with the integration before they run.

Sounds great - thanks.

naotoj · 2025-06-03T21:21:46Z

Just a drive-by comment, but should we check the validity of off? What if off points to the 2nd or 3rd byte in a character in the buffer?

archiecobbs · 2025-06-03T21:34:43Z

> Just a drive-by comment, but should we check the validity of off? What if off points to the 2nd or 3rd byte in a character in the buffer?

Good question. This method is explicitly documented as assuming that the data is valid UTF-8. It's not trying to handle invalid data.

naotoj · 2025-06-03T21:51:55Z

> > Just a drive-by comment, but should we check the validity of off? What if off points to the 2nd or 3rd byte in a character in the buffer?
>
> Good question. This method is explicitly documented as assuming that the data is valid UTF-8. It's not trying to handle invalid data.

I meant the validity of off, not the UTF-8 data. For example in your test case, if the off is 16, it will return 3 chars although the first one is only the trailing byte. So I guess some comment may help here
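
The concern about off can be made concrete with a small sketch. Assuming the test string is held as modified UTF-8 (as DataOutputStream.writeUTF produces it, with each surrogate half taking three bytes), byte offset 16 lands on the last byte of the low surrogate. The hypothetical helper isCharBoundary below is not part of Convert; it only shows what a check on off could look like.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class OffsetBoundarySketch {

    // Encode a string as modified UTF-8, dropping the two-byte length
    // prefix that DataOutputStream.writeUTF emits.
    static byte[] toModifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Hypothetical check: an offset is a valid starting point if it is the
    // end of the buffer or does not point at a continuation byte (10xxxxxx).
    static boolean isCharBoundary(byte[] bytes, int off) {
        return off == bytes.length || (bytes[off] & 0xC0) != 0x80;
    }

    public static void main(String[] args) {
        byte[] utf8 = toModifiedUtf8("ab\u00ABcd\u2264ef\ud83d\udd34gh");
        System.out.println(isCharBoundary(utf8, 14)); // true  (lead byte of low surrogate)
        System.out.println(isCharBoundary(utf8, 15)); // false
        System.out.println(isCharBoundary(utf8, 16)); // false (trailing byte)
        System.out.println(isCharBoundary(utf8, 17)); // true  ('g')
    }
}
```

Whether such a check belongs in the method or only in its documentation is a design choice; the sketch just shows that the precondition is cheap to state.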

archiecobbs · 2025-06-03T23:13:04Z

> > Good question. This method is explicitly documented as assuming that the data is valid UTF-8. It's not trying to handle invalid data.
>
> I meant the validity of off, not the UTF-8 data. For example in your test case, if the off is 16, it will return 3 chars although the first one is only the trailing byte. So I guess some comment may help here

Well I guess to be more specific the method assumes that the given range of bytes is valid UTF-8. But yes you are right, this could all be better (more precisely) documented. That's for another PR though, I am loath to further delay this one at this point since it's already approved and the JDK 25 lop off happens tomorrow.

naotoj

> That's for another PR though, I am loath to further delay this one at this point since it's already approved and the JDK 25 lop off happens tomorrow

Totally fine by me

lahodaj · 2025-06-04T04:58:14Z

Tests (tier1-3) passed, so OK to integrate, I think. Thanks!

openjdk bot added the compiler compiler-dev@openjdk.org label May 31, 2025

Fix bug in Convert.utfNumChars().

18f9c0c

archiecobbs force-pushed the JDK-8358066 branch from b18a853 to 18f9c0c Compare June 2, 2025 14:16

archiecobbs marked this pull request as ready for review June 2, 2025 14:23

openjdk bot added the rfr Pull request is ready for review label Jun 2, 2025

lahodaj reviewed Jun 2, 2025

archiecobbs added 2 commits June 2, 2025 14:59

Simplify code using review suggestion.

91a9037

Fix glitch in exception message.

4967cac

lahodaj approved these changes Jun 3, 2025

openjdk bot added the ready Pull request is ready to be integrated label Jun 3, 2025

naotoj approved these changes Jun 3, 2025
