Summary
The user is experiencing scale issues with the Flyte Binary deployment on Oracle Cloud Infrastructure. They are considering switching to Flyte Core but are currently facing several problems:
1. The webhook service fails under load, and the current retry limit is insufficient.
2. Flyte containers occasionally restart due to health check failures, with one instance consuming 25GB of memory against a 32GB limit.
3. Memory consumption is steadily increasing, possibly indicating a memory leak.
4. The user asks whether the number of replicas for the Flyte Binary can be increased. They read that this may not be feasible and are seeking clarification.
5. They ask whether switching to Flyte Core would allow for more replicas to improve availability during redeployment.
The user also expresses confusion about how leader election relates to database usage.
kumare
It is possible, with leader election enabled, but we will have to test it and ensure it does indeed work
david.espejo
Hey <@U062Y21KSQG>, what Flyte version are you running? There's a potential memory leak in flyte-binary that was fixed in 1.13.2
guyarad
noticed <@U04H6UUE78B> answered 4 and 5 here: https://flyte-org.slack.com/archives/C01P3B761A6/p1729162749816229?thread_ts=1728465510.483769&cid=C01P3B761A6
guyarad
Scale issues with Flyte Binary: Hi all, we are deploying Flyte Binary in Oracle Cloud Infra. We should probably switch to Flyte Core deployment but that's what it is for now... We noticed a few things: