Issues with pyflyte launchplan behavior

Summary

The user is facing two main issues with pyflyte: first, they can run a launchplan even when it is disabled and want to know if this is expected behavior and how to pause the pipeline; second, when reverting to an older launchplan version, the workflow still runs the latest version instead of the intended one, leading to confusion about version selection in the UI. They suggest that the CLI should be clearer and request a feature to pause the pipeline similar to Airflow, highlighting the need for better version control and filtering options. The user plans to file an issue on GitHub regarding these concerns and appreciates the Flyte UI/UX. They also provide links to a pull request and issue tickets related to their concerns.

Status
open
Tags
    Source
    #ask-the-community

      kumare

      10/10/2024

      I don't think we will add triggers to Flyte, since you have an API and anyone can build triggers. In Union, though, that is part of the platform.

      vasani.ashwin

      10/10/2024

      <@U07E8MHFW83>: Instead of adding N external triggers in Flyte, I think it is better to add a feature that pauses the LP scheduler when the LP is deactivated. Example:

      1. Let's say I have an LP that is currently activated.
      2. Executions A, B, and C are submitted to the LP and executed.
      3. For some reason, we need to deactivate the LP.
      4. A user submits requests P, Q, and R, but they are not executed because the LP is deactivated.
      5. A fix lands, and the LP version is updated and activated.
      6. Now P, Q, and R run with the activated version, unless specified otherwise.
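
The six steps in this proposal can be sketched as a small in-memory model. This is only an illustration of the requested behavior, not an existing Flyte feature: the class `PausableLaunchPlan` and its methods are hypothetical names, and the thread itself confirms that deactivation currently affects schedules only.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PausableLaunchPlan:
    """Hypothetical model of the proposal; not a Flyte API."""

    active_version: str
    active: bool = True
    pending: deque = field(default_factory=deque)
    executed: list = field(default_factory=list)

    def submit(self, request: str) -> None:
        # Run immediately while active; otherwise hold the request in a queue.
        if self.active:
            self.executed.append((request, self.active_version))
        else:
            self.pending.append(request)

    def deactivate(self) -> None:
        self.active = False

    def activate(self, version: str) -> None:
        # Activating a version also drains the queue, in submission order,
        # against the newly activated version.
        self.active_version = version
        self.active = True
        while self.pending:
            self.executed.append((self.pending.popleft(), version))


lp = PausableLaunchPlan(active_version="v1")
for request in ("A", "B", "C"):   # steps 1-2: executed right away on v1
    lp.submit(request)
lp.deactivate()                   # step 3
for request in ("P", "Q", "R"):   # step 4: queued, not executed
    lp.submit(request)
lp.activate("v2")                 # steps 5-6: fix lands, queue drains on v2
print(lp.executed)
```

In this sketch a queued request carries no version of its own, which is one way to read "unless specified otherwise" in step 6; a request that pins a version would need an extra field.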

      Pylon

      10/10/2024

      Hi Ashwin! Yes, external triggers (Kafka, API, webhooks, Lambda, kubectl, etc.) are things we are actively exploring.

      vasani.ashwin

      10/10/2024

      Maybe as a workaround we can add an "is activated" check in our service, but it would be nice if Flyte could just add the request to a job queue without executing it.
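
A minimal sketch of the service-side workaround mentioned here, assuming the calling service can look up a launch plan's state before it submits an execution. The names `execute_if_active`, `is_active`, and `execute` are hypothetical; the two callables stand in for whatever client the service really uses (the thread mentions a Go gRPC client), so no Flyte call is shown.

```python
from typing import Callable


class LaunchPlanInactiveError(RuntimeError):
    """Raised when a caller tries to run a deactivated launch plan."""


def execute_if_active(
    lp_name: str,
    is_active: Callable[[str], bool],
    execute: Callable[[str], str],
) -> str:
    # Refuse to submit when the launch plan is not active. This check lives
    # in the calling service, since the platform itself would accept the run.
    if not is_active(lp_name):
        raise LaunchPlanInactiveError(f"launch plan {lp_name!r} is deactivated")
    return execute(lp_name)


# Stand-ins for the real state lookup and execution call.
states = {"ingest": True, "daily_report": False}


def fake_execute(name: str) -> str:
    return f"execution-of-{name}"


result = execute_if_active("ingest", states.__getitem__, fake_execute)
print(result)
```

The check only rejects requests; it does not queue them, which is the gap the message points out.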

      vasani.ashwin

      10/10/2024

      Currently, activate and deactivate work in the context of cron jobs or schedules. I have seen use cases (including ours) where there is an external trigger such as Kafka, a service, Lambda, or manual triggering. I know these external triggers are not supported in Flyte today, except SQS. We use the Flyte Go gRPC client to execute the remote launchplans.

      kumare

      10/10/2024

      You can in fact pause: it's like deactivate and then activate. Do you mean you want to pause and restart, where it should backfill?

      grantham

      10/10/2024

      Thank you for sharing <@U07R0ED4PK8>! The idea of "pausing" an execution makes a ton of sense.

      I haven't seen any other requests for this functionality so far, and it would only be applicable for problems where you have some "external state" (IE: a database),

      vasani.ashwin

      10/10/2024

      Here is the feature request ticket for the pausing feature: https://github.com/flyteorg/flyte/issues/5835

      vasani.ashwin

      10/10/2024

      Thank you for the fast fix. Here is the issue ticket: https://github.com/flyteorg/flyte/issues/5834

      kumare

      10/10/2024

      <@U07R0ED4PK8> PR please: https://github.com/flyteorg/flytekit/pull/2796. Please create an issue too so that I can attach it. cc <@U07E8MHFW83>

      kumare

      10/10/2024

      Ohh, highly appreciate the feedback. I would highly recommend trying out Union. It's even faster :smile:

      vasani.ashwin

      10/10/2024

      Yes. Let me do it tomorrow morning.

      Regarding "is this a breaking behavior?": I don't think so. When we activate a launchplan, we have to specify a version. That is my contract with the system to activate a specific version. If there is a new version, I can test it with the workflow and decide whether I want to activate it or not. Maybe a flag like --activate-latest or something similar might work.

      pyflyte launchplan --activate/--deactivate <lp_name> <version>

      Btw, I really like your Flyte UI/UX (apart from many other things). It is lightning fast. :zap:

      kumare

      10/10/2024

      <@U07R0ED4PK8> can you file an issue on github ?

      kumare

      10/10/2024

      maybe i will add a filter too - to say --sort=by-time --sort=active etc wdyt?

      kumare

      10/10/2024

      now my challenge is - is this a breaking behavior?

      kumare

      10/10/2024

      yes i realize everywhere folks just use latest by time

      kumare

      10/10/2024

      <@U07R0ED4PK8> I thought some more, and you are actually right: we should use active. I think it makes more sense. Let me create that PR and then we will add a version PR too? What say?

      vasani.ashwin

      10/10/2024

      workflow -> always pick the latest
      launchplans -> always pick the active if enabled/activated.

      vasani.ashwin

      10/10/2024

      Thanks <@UNZB4NW3S>! I see a similar issue in flytectl as well: execution_spec picks the latest version and not the active version.

      flytectl get launchplan --project flytesnacks --domain development <lp_name> --execFile execution_spec.yaml

      The only way to fix this is to look at all LPs, find the ACTIVE version, then get the execution_spec and execute the LP. I am not sure if there is a better way.

      ./bin/flytectl get launchplan --project flytesnacks --domain development <lp_name> -o yaml | grep ACTIVE

      It would be great if you could consider the feature request to pause the pipeline. Here's my use case: someone pushed a bad change to production. For some reason, I can't roll back to the old version due to some internal data issue. In Airflow, I can just pause the DAG, put out the fix, and enable it again. Flyte registers the default launch plan, so I'm comparing the default launch plan with an Airflow DAG. Flyte has added a version control feature, which is excellent and missing in Airflow, but we still need a way to pause a pipeline. This could be done by either rejecting incoming requests or adding the executions to a queued state.
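
The lookup described in this message (scan every version of a launch plan and find the ACTIVE one) can be sketched in a few lines. This is an illustration under stated assumptions: the record shape (`version`, `state`, `created_at`) is hypothetical and would have to be mapped from whatever the real listing returns, and falling back to the newest version when nothing is active is an assumed policy, not documented Flyte behavior.

```python
from datetime import datetime


def resolve_version(records: list[dict]) -> str:
    """Return the ACTIVE version if one exists, else the newest by creation time."""
    if not records:
        raise ValueError("no launch plan versions found")
    active = [r for r in records if r["state"] == "ACTIVE"]
    if active:
        return active[0]["version"]
    # Assumed fallback: nothing is active, so use the most recent version.
    return max(records, key=lambda r: r["created_at"])["version"]


# Hypothetical listing: v2 was activated as a rollback after v3 was registered.
records = [
    {"version": "v1", "state": "INACTIVE", "created_at": datetime(2024, 10, 1)},
    {"version": "v2", "state": "ACTIVE", "created_at": datetime(2024, 10, 5)},
    {"version": "v3", "state": "INACTIVE", "created_at": datetime(2024, 10, 9)},
]
print(resolve_version(records))
```

With this ordering of rules, a rollback that activates an older version wins over a newer but inactive one, which matches the behavior the poster expects.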

      kumare

      10/10/2024

      i think the cli should be more explicit

      kumare

      10/10/2024

      fair

      kumare

      10/10/2024

      for now you will have to use flytectl or the UI

      kumare

      10/10/2024

      let us add that soon

      kumare

      10/10/2024

      ohh yes we should add a version

      vasani.ashwin

      10/9/2024

      $ pyflyte run remote-launchplan <lp_name>

      I don't see the option to pick a version in pyflyte. Also, if I set the (X-1) version active from the UI, my mental model expects the (X-1) version to be triggered, not X.

      kumare

      10/9/2024

      disabling is only for schedules

      kumare

      10/9/2024

      you can always trigger any version

      kumare

      10/9/2024

      what do you mean execute?

      vasani.ashwin

      10/9/2024

      Hello Everyone!

      I am trying to execute a launchplan with pyflyte and I have encountered two issues:

      1. I am able to execute a launchplan even if it is disabled. I am not sure if this is the expected behavior. If this is expected, then how do I pause the pipeline?
      2. When I set a launchplan to an older version (like a rollback), the workflow executes with the latest version and not the active version.

      Both of these issues seem like bugs, either in the Flyte server or in pyflyte. Can someone clarify the expected behaviors?