Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LightMetal - Add LoadTrace() API and move TraceDescriptor out of detail namespace (#17039) #17313

Merged
merged 1 commit into from
Jan 29, 2025

Conversation

kmabeeTT
Copy link
Contributor

@kmabeeTT kmabeeTT commented Jan 29, 2025

Ticket

I am breaking up original PR I put up recently into smaller bite-sized chunks as @omilyutin-tt suggested. Here is round 3.

[Feature Request] Add Light Metal capture/replay initial changes to tt-metal for some workloads #17039

Problem description

This is initial/bootstrapping changes for "Light Metal" capture/replay feature that uses Flatbufers as serialization/deserialization library. See my previous bigger PR (#16573) if you want to see how this will be used by followup merge, or context.

What's changed

  • Add new API LoadTrace(IDevice* device, const uint8_t cq_id, const uint32_t trace_id, TraceDescriptor& trace_desc) that is unused now, but will be used by Light Metal replay after upcoming PR, when executing a metal-trace traced program from binary, TraceDescriptor is extracted from flatbuffer binary and loaded to device through this API.
  • Remove TraceDescriptor out of detail namespace after discussion in parent PR. Don't want to expose detail objects on public APIs. The trace_buffer.hpp that contains TraceDescriptor was already in public API folder (nice).
  • Unrelated - Change trace_buffer.hpp to use fwd decl Buffer instead of buffer.hpp incl to reduce dependencies on users of trace_buffer.hpp

I incorporated some feedback in parent PR to use "trace_id" as variable name instead of "tid" (which is departure from other trace-related APIs, but in order to keep this PR contained, not updating existing APIs).

Checklist

  • Local build pass, and docs build pass.
  • Post commit CI passes (running)
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • (For models and ops writers) Full new models tests passes
  • New/Existing tests provide coverage for changes

tt_metal/api/tt-metalium/device.hpp Outdated Show resolved Hide resolved
@@ -312,7 +312,7 @@ EnqueueTraceCommand::EnqueueTraceCommand(
uint32_t command_queue_id,
IDevice* device,
SystemMemoryManager& manager,
std::shared_ptr<detail::TraceDescriptor>& descriptor,
std::shared_ptr<TraceDescriptor>& descriptor,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes, I think all of these can and should be const &. (outside of the PR, just a drive by comment:) )

tt_metal/api/tt-metalium/host_api.hpp Outdated Show resolved Hide resolved
tt_metal/api/tt-metalium/device_impl.hpp Outdated Show resolved Hide resolved
this->id_,
active_sub_device_manager->id());

auto& trace_buffer = active_sub_device_manager->create_trace(trace_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(outside of this PR)

These are weird APIs that return references to smart pointers. @dmakoviichuk-tt wdyt? I think these:

    const std::unique_ptr<Allocator>& get_initialized_allocator(SubDeviceId sub_device_id) const;
    std::unique_ptr<Allocator>& sub_device_allocator(SubDeviceId sub_device_id);
    std::shared_ptr<TraceBuffer>& create_trace(uint32_t tid);

Should be re-written as:

    const Allocator* get_initialized_allocator(SubDeviceId sub_device_id) const;
    Allocator* sub_device_allocator(SubDeviceId sub_device_id);
    std::shared_ptr<TraceBuffer> create_trace(uint32_t tid);

tt_metal/api/tt-metalium/host_api.hpp Outdated Show resolved Hide resolved
…il namespace (#17039)

 - Will be used by Light Metal replay after upcoming PR, when executing
   a metal-trace traced program from binary, TraceDescriptor is
   extracted from flatbuffer binary and loaded to device through this API.
 - Unrelated - Change trace_buffer.hpp to use fwd decl Buffer instead of
   buffer.hpp incl to reduce dependencies on users of trace_buffer.hpp
@kmabeeTT kmabeeTT force-pushed the kmabee/light_metal_add_apis2 branch from 6294297 to d1eb7be Compare January 29, 2025 21:31
Copy link
Contributor

@omilyutin-tt omilyutin-tt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for making the changes.

@kmabeeTT kmabeeTT merged commit 0971366 into main Jan 29, 2025
10 checks passed
@kmabeeTT kmabeeTT deleted the kmabee/light_metal_add_apis2 branch January 29, 2025 22:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants