Deadlock issue in Flyte on GCP

Summary

The user is facing a deadlock in Flyte on GCP: the GCSFuse CSI driver sidecar and Flyte Copilot each wait for the other to terminate. A previous workaround, sending SIGTERM to the GCSFuse sidecar, stopped working after the upgrade to GKE 1.29, because the sidecar now runs as an init container with restartPolicy: Always. Pipelines are stuck and downgrading the GKE cluster is not an option, so the user is looking for an easy fix. They point out that FlytePropeller controls pod termination and suggest that Flyte itself should support all types of sidecar containers. The user also built a custom Copilot image that ignores the GCSFuse sidecar; this lets the pod terminate but produces an error about container task outputs, which may be unrelated to the main issue.
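
The distinction the summary relies on can be sketched in a few lines. This is a minimal illustration, not Flyte or Copilot code: it assumes the pod is available as a plain dict shaped like a Kubernetes Pod manifest, and the function names and the container names in the example (including the GCSFuse and Copilot sidecar names) are hypothetical placeholders. It treats any init container declared with restartPolicy: Always as a sidecar that a watcher should not wait on.

```python
from typing import Any, Dict, List


def native_sidecar_names(pod: Dict[str, Any]) -> List[str]:
    """Return names of init containers declared with restartPolicy: Always."""
    init_containers = pod.get("spec", {}).get("initContainers") or []
    return [c["name"] for c in init_containers if c.get("restartPolicy") == "Always"]


def containers_to_wait_for(pod: Dict[str, Any], own_name: str) -> List[str]:
    """Return the regular containers a watcher should wait on.

    Skips the watcher itself (own_name) and anything identified as a
    restartable init container, so the watcher never blocks on a sidecar
    that only exits after the regular containers are done.
    """
    sidecars = set(native_sidecar_names(pod))
    regular = pod.get("spec", {}).get("containers") or []
    return [
        c["name"]
        for c in regular
        if c["name"] != own_name and c["name"] not in sidecars
    ]


# Hypothetical pod layout; all names below are illustrative assumptions.
example_pod = {
    "spec": {
        "initContainers": [
            {"name": "gke-gcsfuse-sidecar", "restartPolicy": "Always"},
            {"name": "downloader"},  # ordinary init container, runs to completion
        ],
        "containers": [
            {"name": "primary"},
            {"name": "flyte-copilot-sidecar"},
        ],
    }
}

if __name__ == "__main__":
    print(native_sidecar_names(example_pod))
    print(containers_to_wait_for(example_pod, "flyte-copilot-sidecar"))
```

Filtering on the restartPolicy field rather than on a hard-coded container name is one possible design choice here, since it would cover any sidecar injected this way and not only GCSFuse. Whether that matches what the custom Copilot image in the thread actually did is not stated in the summary.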

Status
resolved
Tags
    Source
    #ask-the-community