Summary
The user is facing an issue with map_task in Flyte: containers take two hours to start, and the run ends in a RuntimeExecutionError once retries are exhausted. They have updated flyte-core and flytekit to version 1.13.0, but the issue persists. The user has restarted all Flyte pods but is reluctant to redeploy the cluster, which they consider too drastic. They note that the flytepropeller pod runs the image ghcr.io/flyteorg/flytepropeller:v1.1.67; they had assumed that updating the flyte-core Helm chart version would update every component, but it did not. The user plans to update the Helm chart and the other services to see whether that resolves the issue.
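
The version mismatch described above can be checked mechanically. The following is a minimal sketch, not an official Flyte tool: the helper names image_tag and matches_release are hypothetical, and it assumes that a component image is expected to carry the same version as the chart release (1.13.0 here), which the thread suggests but does not confirm.

```python
def image_tag(image_ref: str) -> str:
    """Return the tag of a container image reference, or '' if it has none."""
    # Look only at the last path segment so that a registry port
    # (e.g. 'localhost:5000/img') is not mistaken for a tag.
    name = image_ref.rsplit("/", 1)[-1]
    # Drop a digest suffix such as '@sha256:...' if present.
    name = name.split("@", 1)[0]
    return name.split(":", 1)[1] if ":" in name else ""


def matches_release(image_ref: str, release: str) -> bool:
    """True if the image tag equals the release, ignoring a leading 'v'."""
    # Assumption: component images are tagged with the chart release version.
    return image_tag(image_ref).lstrip("v") == release.lstrip("v")


# Image reported in the thread, compared against the upgraded chart version.
propeller_image = "ghcr.io/flyteorg/flytepropeller:v1.1.67"
print(matches_release(propeller_image, "1.13.0"))
```

A False result here would be consistent with the user's observation that bumping the chart version alone did not update the running propeller image.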