Summary
The user asks about the durability of the queue implementation in Flyte, in particular how it behaves when flyte-binary restarts unexpectedly and which settings control queue durability. They mention a specific setting that has benefited their workflows during restarts and ask whether it should be enabled for their use and whether it adds any overhead time to workflows. The user notes that Flyte queues seem to be based on Kubernetes' workqueue, which they believe is a non-durable in-memory queue. They also ask whether queued workflows would survive FlytePropeller restarts when the max_parallelism of a workflow is set to 10 but 100 execution requests arrive, and whether CRs for workflows are created immediately even if they are waiting due to max_parallelism constraints.
kumare
Yes, they will survive. You will not lose any workflows.
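A minimal sketch of why an in-memory work queue does not have to mean lost work after a restart. This is not Flyte's code. It assumes the common Kubernetes controller pattern, where each accepted item is first written to a durable store and the in-memory queue is rebuilt from that store at startup. `Store`, `Controller`, `submit`, and `process_one` are hypothetical names used only for illustration, and the thread does not confirm exactly when Flyte creates CRs.

```python
from collections import deque


class Store:
    """Hypothetical stand-in for a durable store (for example, CRs in etcd)."""

    def __init__(self):
        self.objects = {}  # workflow name -> phase ("pending" or "done")


class Controller:
    """Hypothetical controller with a purely in-memory work queue."""

    def __init__(self, store):
        self.store = store
        self.queue = deque()  # lost whenever the process dies
        # On startup, re-list the durable store and re-enqueue unfinished work.
        for name, phase in store.objects.items():
            if phase != "done":
                self.queue.append(name)

    def submit(self, name):
        # Assumption: the item is persisted before it is queued.
        self.store.objects[name] = "pending"
        self.queue.append(name)

    def process_one(self):
        name = self.queue.popleft()
        self.store.objects[name] = "done"
        return name


store = Store()
first = Controller(store)
for i in range(100):
    first.submit(f"wf-{i}")
for _ in range(10):
    first.process_one()

# Simulate a crash: the controller and its in-memory queue are gone.
del first

# A fresh controller rebuilds its queue from the durable store.
second = Controller(store)
remaining = len(second.queue)
print(remaining)
```

In this sketch the queue is only a cache of pending work. The durable store is the source of truth, so a restart costs a re-list rather than the pending items themselves.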
dubovikov.kirill
<@U029U35LRDJ> thanks for the explanation. My use case was a bit different: suppose that max_parallelism of a workflow is set to 10, but 100 execution requests arrive. 10 executions will start immediately, while the remaining 90 workflows will be queued up. So, would these queued workflows survive FlytePropeller restarts? Do we create CRs for workflows immediately even if they are waiting due to constraints like max_parallelism?
dubovikov.kirill
Could anyone comment on the queue implementation in Flyte? Are those queues durable? What happens to the queue state if flyte-binary restarts unexpectedly? Are there any settings we can change with respect to queue durability?