Commit
Make CUDA graph benchmarking overridable on a per-op basis
Summary: Some operators need to perform GPU-CPU synchronizations, which are not supported under CUDA graph capture, so CUDA graph benchmarking can now be overridden per operator.

Reviewed By: davidberard98

Differential Revision: D58680076

fbshipit-source-id: 7c86c484990445512723ebdda25ef4af8cfffde5
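A minimal sketch of what a per-op override could look like, assuming a class-level flag that the benchmark harness consults before choosing a timing path. The names `BenchmarkOperator`, `use_cuda_graphs`, `DenseOp`, and `SyncingOp` are hypothetical and are not taken from this commit's diff. The two timing methods are stubs that only report which path was chosen; in a real harness the graph path might wrap something like `torch.cuda.graph` and the eager path a plain timed loop.

```python
from typing import Callable


class BenchmarkOperator:
    # Hypothetical per-op flag: subclasses set it to False to opt out
    # of CUDA graph capture during benchmarking.
    use_cuda_graphs: bool = True

    def _bench_graph(self, fn: Callable[[], None]) -> str:
        # Stub: a real harness would capture fn into a CUDA graph and replay it.
        fn()
        return "cuda_graph"

    def _bench_eager(self, fn: Callable[[], None]) -> str:
        # Stub: a real harness would time fn with ordinary kernel launches.
        fn()
        return "eager"

    def benchmark(self, fn: Callable[[], None]) -> str:
        # Consult the per-op flag instead of a single global setting.
        if self.use_cuda_graphs:
            return self._bench_graph(fn)
        return self._bench_eager(fn)


class DenseOp(BenchmarkOperator):
    # Inherits the default and is benchmarked under graph capture.
    pass


class SyncingOp(BenchmarkOperator):
    # Needs a GPU-CPU sync (for example, reading a value back to the host),
    # so it opts out of graph capture.
    use_cuda_graphs = False


if __name__ == "__main__":
    print(DenseOp().benchmark(lambda: None))
    print(SyncingOp().benchmark(lambda: None))
```

Keeping the flag on the class rather than in a global option means an operator that synchronizes with the host can opt out without changing how every other operator is measured.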