
[question] Seeking information on low-level TPU interaction and libtpu.so API #7803

Open
notlober opened this issue Aug 2, 2024 · 2 comments

@notlober

notlober commented Aug 2, 2024

I'm looking to build an automatic differentiation library for TPUs without using high-level front-ends like TensorFlow/JAX/PyTorch-XLA, but I'm finding that information about lower-level TPU usage is practically non-existent.

Specifically, I'm interested in:

  1. How to interact with TPUs at a lower level than what's typically exposed in TensorFlow
  2. Information about the libtpu.so library and its API
  3. Any resources or documentation on implementing custom TPU operations

Are there any insights or suggestions on how to approach this, particularly regarding TPU support? Any ideas or help would be greatly appreciated.

I understand that some of this information might be proprietary, but any guidance on what is possible or available would be very helpful.

@JackCaoG
Collaborator

JackCaoG commented Aug 2, 2024

@will-cromar should be able to share some information.

@will-cromar
Collaborator

All three frameworks interact with libtpu through the PJRT plugin API. Most of the core API for PJRT is documented in comments here: https://github.com/openxla/xla/blob/main/xla/pjrt/pjrt_client.h
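
As a rough orientation, here is a minimal sketch (my own illustration, not code from any of the frameworks) of what that client abstraction exposes, using only a few calls I'm confident are declared in pjrt_client.h:

#include <iostream>

#include "xla/pjrt/pjrt_client.h"

// Hypothetical helper: print basic facts about an already-created PjRtClient.
void DescribeClient(const xla::PjRtClient& client) {
  // Platform the client was created for, e.g. "tpu".
  std::cout << "platform: " << client.platform_name() << "\n";

  // Devices this process can actually launch work on.
  for (xla::PjRtDevice* device : client.addressable_devices()) {
    std::cout << "  device: " << device->DebugString() << "\n";
  }
}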

Almost all of our interactions with PJRT are in this folder, and it's largely independent from PyTorch itself: https://github.com/pytorch/xla/tree/master/torch_xla/csrc/runtime

Specifically, to create a PJRT TPU client, you would need to go through the PjRtCApiClient, similar to this (with device_type = "tpu" and library_path = "/path/to/libtpu.so"):

// Load the PJRT plugin shared library (e.g. libtpu.so) and grab its C API.
const PJRT_Api* c_api = *pjrt::LoadPjrtPlugin(
    absl::AsciiStrToLower(device_type), plugin->library_path());
// Run the plugin's one-time initialization.
XLA_CHECK_OK(pjrt::InitializePjrtPlugin(device_type));
// Create a PjRtClient backed by the plugin's C API.
auto create_options = plugin->client_create_options();
client = xla::GetCApiClient(
             absl::AsciiStrToUpper(device_type),
             {create_options.begin(), create_options.end()}, kv_store)
             .value();
// Register the plugin with the profiler.
profiler::RegisterProfilerForPlugin(c_api);

Once you have a client instantiated, then your interactions are going to look a lot like this example from JAX: https://github.com/google/jax/blob/main/examples/jax_cpp/main.cc
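
To make that concrete, here is a hedged sketch of the same flow (assuming an existing xla::PjRtClient named client and an HLO module given as text; the helper name, the example input, and the exact include paths are mine, and signatures drift a bit between XLA revisions):

#include <memory>
#include <string>

// Include paths are approximate; several of these headers have moved
// around between XLA versions.
#include "xla/client/xla_computation.h"
#include "xla/hlo/ir/hlo_module.h"
#include "xla/literal_util.h"
#include "xla/pjrt/pjrt_client.h"
#include "xla/service/hlo_parser.h"

// Parse HLO text, compile it with PJRT, run it on the first addressable
// device, and copy the first output back to the host.
std::shared_ptr<xla::Literal> CompileAndRun(xla::PjRtClient& client,
                                            const std::string& hlo_text) {
  // Parse the HLO text into an HloModule and wrap it as an XlaComputation.
  std::unique_ptr<xla::HloModule> module =
      xla::ParseAndReturnUnverifiedModule(hlo_text).value();
  xla::XlaComputation computation(module->ToProto());

  // Compile into an executable loaded onto this client's devices.
  std::unique_ptr<xla::PjRtLoadedExecutable> executable =
      client.Compile(computation, xla::CompileOptions{}).value();

  // Upload one host literal as the argument; in a real program this has to
  // match the parameter shape of the HLO module.
  xla::Literal arg = xla::LiteralUtil::CreateR1<float>({1.0f, 2.0f, 3.0f});
  std::unique_ptr<xla::PjRtBuffer> input =
      client.BufferFromHostLiteral(arg, client.addressable_devices()[0])
          .value();

  // Execute and block until the first output is available on the host.
  auto results =
      executable->Execute({{input.get()}}, xla::ExecuteOptions{}).value();
  return results[0][0]->ToLiteralSync().value();
}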

We use the PJRT C++ API directly, but it's worth noting that (other than the example above) JAX actually mainly interacts with PJRT through Python bindings. I'm not nearly as familiar with those, so you'll have better luck asking in their repository if you want to use the same bindings.

The framework code outside of libtpu.so is all open source. I'm happy to help if you have any questions about the PJRT C++ API.
