-
Notifications
You must be signed in to change notification settings - Fork 581
RFC: StreamExecutor C API #257
Conversation
Defined a portable bool type TF_BOOL. Added structure size definitions and updated usage examples. Fixed typos where member functions are supposed to be member function pointers. Fixed missing extern "C".
| TF_Status* status); | ||
|
|
||
| // Causes the host code to synchronously wait for the event to complete. | ||
| void (*block_host_for_event)( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @annarev It seems that block_host_until_done is removed, can we keep this API? block_host_until_done is a more large-grained API, it provides more flexibility for the backend to implement, and lots of device runtime already provide the capability to match this API(CUDA: cuStreamSynchronize(stream),SYCL queue.wait() ). I saw current implementation is composed of create_event+ record_event + block_host_for_event + delete event, that means each back-end needs to implements all of these API, which limits the flexibility, and I am not sure whether it will bring some performance penalty (for DPC++, we need to enqueue a barrier to retrieve the event)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
I am planning to add the following typedef struct SP_DeviceDescription {
size_t struct_size;
// Device hardware name. Used for printing.
// Must be null-terminated.
char* name;
// Device vendor name. Used for printing.
// Must be null-terminated.
char* device_vendor;
// Returns the PCI bus identifier for this device, of the form
// [domain]:[bus]:[device].[function]
// Used for printing.
// Must be null-terminated.
char* pci_bus_id;
TF_Bool has_numa_node; // True if `numa_node` is set.
// Returns the NUMA node associated with this device, for use in
// determining socket locality.
int numa_node;
TF_Bool has_memory_bandwidth; // True if `memory_bandwidth` is set.
// Device's memory bandwidth in bytes/sec. (This is for reads/writes to/from
// the device's own memory, not for transfers between the host and device.)
int64_t memory_bandwidth;
TF_Bool has_gflops; // True if `gflops` field is set.
// Estimate of floating point operations per second for this device * 10e-9.
double gflops;
} SP_DeviceDescription; |
* Added the Implementation Conventions and Usage Overview section. * Replaced the Stability / User Impact with the Versioning Strategy and Stability section. Also moved the section higher. * Updated the API design according to the real implementation. * Updated code examples.
…gableDevice RFC (PR tensorflow#262). * Referenced semver and TensorFlow's Version Compatibility page. * Added the Updating Guidelines, Convention, and Detecting Incompatibility sections. * Merged the current code examples with @yisitu's newer ones on PR tensorflow#262 and simplified them a bit.
… stream_executor_rfc
…to stream_executor_rfc
Also some more minor edits.
Co-authored-by: Situ Yi <60493024+yisitu@users.noreply.github.com>
Co-authored-by: Situ Yi <60493024+yisitu@users.noreply.github.com>
Changed "shall" to "must" to make sure people would not misinterpret "shall".
Updates based on recent changes and discussion on RFC PR tensorflow#262
Thanks!. This looks good to me. Just to confirm, all the versioning information is subsumed in struct_size, and we don't need any extra fields in |
Yes, we rely on |
|
This RFC has been up for a while. The implementation has even been checked in over a month ago. I think we should wrap this up (i.e., merge this PR). Changes to StreamExecutor C API since the last combined design review meeting with PluggableDevice on 8/13/20:
If anyone has anything else that needs to be discussed/resolved before this RFC can be merged (i.e., fundamental issues), please explicitly say so here by this Thursday, 9/24 at 11:59pm PT. |
Thanks Penporn, for reaching out. Nothing from our side. |
|
@kulinseth Thank you for the quick confirmation! :) @ematejska @alextp I believe there is no outstanding issue now. Could we move forward? |
|
Yes, let's merge the PR.
…On Fri, Sep 25, 2020 at 10:13 AM Penporn Koanantakool < ***@***.***> wrote:
@kulinseth <https://github.com/kulinseth> Thank you for the quick
confirmation! :)
@ematejska <https://github.com/ematejska> @alextp
<https://github.com/alextp> I believe there is no outstanding issue now.
Could we move forward?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#257 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAABHRL4JHCQHDI5DPSTMLLSHTFTRANCNFSM4N4SSC6Q>
.
--
- Alex
|
theadactyl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
@penpornk Hi, is there any PluggableDevice template for beginner? TKS |
The feedback phase will be open for two weeks until Tuesday June 30, 2020.
Objective
Provide basic device management C API to allow new devices to modularly connect
to the current TensorFlow runtime.