-
Notifications
You must be signed in to change notification settings - Fork 165
Add ObjectCode
ptx constructor
#470
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the title of the review should be changed to something like : "Add ObjectCode constructors" or something like that. Linking multiple PTX object codes is already supported.
I personally don't think we should allow people to be constructing ptx object codes, but if we are going to do so, I agree with Leo that we need to be very explicit in the documentation of the constructor, that it is not the main path and should only be used when necessary.
Thanks for contributing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drive-by comments; very minor; I was mostly curious.
On CUDA 12, in a configuration where the driver is behind the runtime, I'm encountering an MVC error at
I'm guessing this has something to do with not invoking nvjitlink where we otherwise should be, I'm investigating. |
As far as we know, this is expected behaviour. This is an error raised by the driver. We raise a warning from Program.Compile() if you compile ptx in an enironment where the driver is behind the runtime. Did you get this warning? |
/ok to test |
|
Keenan is right and this error is reproduced in the CI (with CTK 12.8 & driver 12.4). @brandon-b-miller we should catch the warning
as noted by Keenan and skip the |
What do you think of the changes in 9276d11? It seems we dodge the warning when constructing an ObjectCode directly since we avoid the |
/ok to test |
Should we explore integrating and providing an interface to |
Right, the warning is raised in the fixture which is passed as a test function argument. On top of my head I don't know if there's a way to capture such warnings and do a conditional skip. This change looks fine to me. |
@kkraus14 we discussed that before and decided to avoid that because |
IIRC the motivating use case for
I would argue this is becoming less and less of a niche use case as we package CTK better and better for Python users. Regarding bloat, in the fullness of time we should probably create a Pythonic interface / bindings to it and package it as a separate library that could be optionally used by |
I think that example is really a corner case that we want to discourage. If one has multiple PTX and/or LTO-IR to link together, and we want the final output to be loadable by the driver (not just for humans to inspect/debug the output), then the output should be a CUBIN not PTX. That example is not following the recommended practice as per the CUDA Compatibility doc. nvJitLink is designed for exactly this use case. An old driver cannot load a PTX generated by a newer toolchain, but it has no problem loading CUBIN generated by a newer toolchain, and nvJitLink can generate a CUBIN as the final output. In fact, cuda.core uses nvJitLink for multiple purposes due to this exceptional capability:
This allows us to have a full solution for CUDA minor version compatibility.
At one point I would like to cover bindings for all CTK components (except for math libs, which now lives in |
Based on the documentation it looks like nvJitLink uses nvptxcompiler based on needing to include it if static linking, so this makes sense. I wasn't aware of this capability. Thanks! |
Yeah I feel we should document some of these discussions. Perhaps as a Q&A or tips & tricks emphasizing CUDA minor version compat support. |
Thanks @brandon-b-miller @ksimpson-work for driving this PR across finish line! |
See #480. |
Adds the ability to link multiple PTX files with one linker instance.