
Integrating Python Profiler with Flyte Task

Summary

The user is integrating a sampling Python call-stack profiler with a Flyte task by defining a new PodSpec in flytekit with a shared process namespace. The profiler is configured to access the task's memory maps, but the user has trouble identifying the process ID (PID) of the user task: pgrep -f pyflyte-execute matches multiple processes, so it does not single out the right one. They ask which process name to use to obtain the PID, and are also considering a custom PythonFunctionTask that records the task's PID to a shared volume. The goal is to identify hot paths in individual tasks with a sampling profiler and eventually render them as flamegraphs. They currently use Austin to sample the task process's frame stack and are exploring attaching the profiler programmatically at the entry point, although they see limited advantages to that approach.
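The "record the PID to a shared volume" idea mentioned above could look roughly like the sketch below. It is plain Python rather than a real PythonFunctionTask subclass, and the names record_pid, read_pid, and the PID file path are hypothetical, chosen only for illustration. It assumes the task container and the profiler sidecar mount the same volume and share a process namespace, so the PID written by the task is the same PID the sidecar sees.

```python
import functools
import os
from pathlib import Path
from typing import Optional


def record_pid(pid_file: Path):
    """Decorator (hypothetical helper): publish this process's PID while fn runs."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            pid_file.parent.mkdir(parents=True, exist_ok=True)
            # Write to a temp file, then rename, so a reader never sees a partial PID.
            tmp = pid_file.with_suffix(".tmp")
            tmp.write_text(str(os.getpid()))
            os.replace(tmp, pid_file)
            try:
                return fn(*args, **kwargs)
            finally:
                # Remove the file so the sidecar does not attach to a finished task.
                pid_file.unlink(missing_ok=True)

        return wrapper

    return decorator


def read_pid(pid_file: Path) -> Optional[int]:
    """Sidecar side (hypothetical helper): return the published PID, or None."""
    try:
        return int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None
```

On the sidecar side, a small loop could poll read_pid until it returns a value and then attach the profiler to that PID (Austin accepts a PID via its -p option). Whether this decorator composes cleanly with flytekit's @task has not been verified here.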

Status: resolved
Source: #ask-the-community

      mhagel

      11/7/2024

      Awesome! We've already added memray to our task wrappers internally, in an almost equivalent fashion. The only difference is that we expose profiling as a task arg/flag, and we materialize the flame graph by reaching into the memray internals.

      The sampling profiler within a sidecar "worked," but we've decided not to worry about looking up PIDs or the like and are sticking with memray for now.

      Pylon

      11/7/2024

      Michael, FYI we're about to land https://github.com/flyteorg/flytekit/pull/2875, which brings in memray as an option for profiling. This works at the task level (so we don't need to worry about figuring out PIDs). Enabling this will be as easy as adding the @memray_profiling decorator.

      Let me know if this fits your use case.