Flyte enables you to build & deploy data & ML pipelines, hassle-free. The infinitely scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. Explore and Join the Flyte Community!

User Seeks Advice on Flyte Workloads

Summary

The user is new to Flyte and has workloads that include both slow-running and fast-running tasks. They are concerned that using Flyte for fast-running tasks, which require setup time like loading a neural network to GPU, may introduce unnecessary overhead compared to their previous method of using an HTTP server with FastAPI. The user is considering running an HTTP server alongside Flyte tasks to make HTTP requests but feels this approach may be redundant. They are seeking advice on the best way to handle this situation.

Status
resolved
Tags
  • flyte
  • GPU
  • User
  • Question
  • FastAPI
  • Developer Help
Source
#ask-the-community

    pim

    11/9/2024

    Thanks for the pointer! That indeed looks like what we need. I think we'll evaluate Flyte first, and consider Union later

    kumare

    11/9/2024

    Actors are designed for exactly this use case.

    kumare

    11/9/2024

    <@U0805QZCYS0> you should definitely keep it within Flyte; we think keeping it simple makes it much, much better.

    kumare

    11/9/2024

    Would you be open to talking more about this?

    kumare

    11/9/2024

    <@U07VB6BDE1L> if you are open to it, we at Union have built a new feature called Actors (https://docs.union.ai/byoc/user-guide/core-concepts/actors#actors), which reuses containers, lets you pin models in memory, and can run tasks in milliseconds.

    pingsutw

    11/8/2024

    What kind of long-running task? If it's a long-running computation, you could just use a regular Python @task to run it in a pod.

    g.m.verkes

    11/8/2024

    Would you do the same for long-running computations? Would you implement a separate queue for long-running computations with something like RabbitMQ, or is there a better way that keeps it within Flyte?

    pim

    11/8/2024

    Yeah, it's CPU (or GPU) bound

    pingsutw

    11/8/2024

    Is your task CPU-bound? If so, I think it's better to run multiple HTTP servers and use an agent to dispatch requests to them.
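
The dispatching part of this suggestion can be sketched in a few lines. The thread does not say how requests should be spread across servers, so round-robin is an assumption here, and `RoundRobinDispatcher` is a hypothetical helper rather than part of Flyte's agent API.

```python
import itertools
from typing import Iterator, List


class RoundRobinDispatcher:
    """Cycles through a fixed list of server URLs (hypothetical helper)."""

    def __init__(self, server_urls: List[str]) -> None:
        if not server_urls:
            raise ValueError("at least one server URL is required")
        self._urls: Iterator[str] = itertools.cycle(server_urls)

    def next_url(self) -> str:
        """Return the server that should receive the next request."""
        return next(self._urls)


# Placeholder hostnames; replace with the real server addresses.
dispatcher = RoundRobinDispatcher(
    ["http://inference-0:8000", "http://inference-1:8000"]
)
```

Whatever component dispatches the requests would call `dispatcher.next_url()` before each one, so that CPU-bound or GPU-bound work is spread evenly over the running servers.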

    pim

    11/8/2024

    Or would you have the agent itself execute the requests?

    pim

    11/8/2024

    Thanks! So you'd then make a task for each HTTP endpoint, and have a single agent to dispatch the HTTP requests?

    josh.wills

    11/8/2024

    pim

    11/8/2024

    Hi all! I'm new to Flyte. My workloads consist of slow-running tasks, for which Flyte is perfectly suited, as well as fast-running tasks. The latter might need some set-up time, however, such as loading a neural network to GPU (for model inference). Before moving to Flyte, we were using an HTTP server to execute those tasks, e.g. with FastAPI. Running these as Flyte tasks seems to add unnecessary overhead. What would be the best/idiomatic way of handling this? I'm currently thinking of running an HTTP server, and then running a Flyte task to make an HTTP request. This seems a little duplicative though. Thanks!
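
The "task that makes an HTTP request" idea from this question can be sketched with the standard library alone. The URL, the payload shape, and both function names are assumptions made for illustration; in Flyte, `call_inference_server` could serve as the body of a task, with the model staying loaded in the separately running server.

```python
import json
import urllib.request


def build_inference_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for an already-running inference server."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_inference_server(url: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON reply.

    The server keeps the model in memory, so the caller only pays for the
    round trip, not for loading the network.
    """
    request = build_inference_request(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The cost of this design is the one the question already points out: the server has to be deployed and scaled separately from the workflow that calls it.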