Accessing Full Error Logs in Flyte

Summary

The user is using a FlyteRemote connection to analyze failed executions and relaunch them in bulk. They need programmatic access to detailed error logs for these failures, because the FlyteWorkflowExecution objects returned by remote.recent_executions() expose only an error message that appears to be truncated to 100 characters. They are asking how to access the full error log, or for alternative ways to get or set a failure reason, and note that the complete log is visible in the Flyte UI. Suggestions in the thread are to pull logs directly from the worker pod with the Python kube API and to iterate through the node_executions dict.

Status

open

Source

#ask-the-community
      charlie

      10/18/2024

      Thanks both! <@U07655DJTDM> this might be a bit tricky, as we'd need to do it for each workflow (so it could trigger a lot of additional API calls), but it's definitely a good idea if we can't get the info from the Flyte details directly. <@U04H6UUE78B> this could be worth a shot. I'll have a look at what the dict looks like! Thanks!

      david.espejo

      10/17/2024

      <@U06RTQ8FEP4> what about iterating through the node_executions dict?
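
      One way to read this suggestion, as a minimal sketch only: once an execution has been synced together with its nodes, walk the node_executions dict and collect whatever error each node carries. The helper name collect_node_errors is hypothetical, and the attribute paths it reads (node_executions, closure.error, error.code, error.message) are assumptions about what the synced objects expose. The function is duck-typed so it can be tried on stubs first. The thread does not confirm whether the node-level message is any longer than the workflow-level one.

```python
from types import SimpleNamespace


def collect_node_errors(execution):
    """Return {node_id: (code, message)} for every node that carries an error.

    Assumes `execution.node_executions` is a dict of node id -> object whose
    `closure.error` is None (or absent) when the node did not fail.
    """
    errors = {}
    node_executions = getattr(execution, "node_executions", None) or {}
    for node_id, node_exec in node_executions.items():
        closure = getattr(node_exec, "closure", None)
        error = getattr(closure, "error", None)
        if error is None:
            continue  # node succeeded or has no error recorded
        errors[node_id] = (
            getattr(error, "code", None),
            getattr(error, "message", ""),
        )
    return errors


# Stub standing in for a synced execution; the values are illustrative only.
example_execution = SimpleNamespace(
    node_executions={
        "n0": SimpleNamespace(closure=SimpleNamespace(error=None)),
        "n1": SimpleNamespace(
            closure=SimpleNamespace(
                error=SimpleNamespace(code="EXAMPLE", message="example failure")
            )
        ),
    }
)
example_errors = collect_node_errors(example_execution)
```

      Against a real connection, this would presumably be called after something like remote.sync_execution(execution, sync_nodes=True), which is where the extra per-workflow API calls come from.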

      josh210

      10/8/2024

      You can pull from the worker pod directly with the python kube API
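
      A rough sketch of what that could look like with the official kubernetes Python client. The helper names fetch_pod_log and make_core_v1_api are hypothetical, and the pod name and namespace are left as parameters because how they map to a given Flyte execution depends on the deployment. The client object is passed in so the helper can be exercised with a fake before pointing it at a cluster.

```python
def fetch_pod_log(core_v1_api, pod_name, namespace, container=None, tail_lines=None):
    """Return the log text of one pod through a CoreV1Api-like client.

    `pod_name` and `namespace` must be supplied by the caller; this sketch
    does not try to derive them from a Flyte execution.
    """
    kwargs = {"name": pod_name, "namespace": namespace}
    if container is not None:
        kwargs["container"] = container
    if tail_lines is not None:
        kwargs["tail_lines"] = tail_lines  # only the last N lines of the log
    return core_v1_api.read_namespaced_pod_log(**kwargs)


def make_core_v1_api():
    """Build a real client; needs the `kubernetes` package and cluster credentials."""
    # Imported lazily so fetch_pod_log stays usable without the package installed.
    from kubernetes import client, config

    config.load_kube_config()  # use config.load_incluster_config() inside the cluster
    return client.CoreV1Api()
```

      This only works while the worker pod still exists; once the pod has been deleted, its logs can no longer be read this way.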

      charlie

      10/8/2024

      Hi :wave: I'm currently using a FlyteRemote connection to retrieve information about failed executions, so that we can assess failure reasons and relaunch executions in bulk. For this project, we would like to be able to programmatically access some information (e.g. logs) on why the execution failed.

      The FlyteWorkflowExecution objects returned by remote.recent_executions() have a closure.error.message property that allows us to get some information on the error logs - but this appears to be truncated to 100 characters:

      > Traceback (most recent call last):
      >
      > File "/opt/conda/envs/orchestrator/lib/python3.11/site-pac

      Is there any way to access the full error log from a failed execution using FlyteRemote? Or any alternative ways to get or set a failure reason? We can see the full error log in the Flyte UI, so I'm assuming this means it's accessible somewhere....