Issue with map_tasks in dynamic workflow

Summary

The user is facing an issue with map_tasks in a dynamic workflow: two map tasks that are intended to run simultaneously are executing one after the other. They find this unexpected because there are no dependencies between the tasks, and they have shared a code snippet. Although the execution graph shows the two map tasks as parallel, the tasks fire serially. The user is also checking whether unavailability of pods could be a factor.

Status
open
Tags
  • Workflow Issue
  • flyte
  • User
  • Support Need
  • Question
  • Support Request
  • map_tasks
  • Bug Report
Source
#ask-the-community

    rupsha

    11/13/2024

    seeing the same behavior in the next run as well…

    rupsha

    11/12/2024

    So I'm surprised why 50 from 1 map task and 50 from the 2nd didn't run in parallel

    rupsha

    11/12/2024

    Then I increased processing concurrency to 100.. and it ran 100 1 after the other

    rupsha

    11/12/2024

    Instead those 50 tasks ran 1 after the other...

    rupsha

    11/12/2024

    4... 1 -> 2 map tasks (wanted to run 50 in parallel for each) -> 1

    kumare

    11/12/2024

    How many nodes in the graph

    rupsha

    11/12/2024

    ?

    kumare

    11/12/2024

    Max parallelism

    rupsha

    11/12/2024

    let me check if unavailability of pods was an issue

    rupsha

    11/12/2024

    graph shows parallel.. but for some bizarre reason they fired 1 after the other

    kumare

    11/12/2024

    They should be parallel, do you see the graph as serial

    rupsha

    11/12/2024

    Hi folks.. Question about map_tasks. I have a dynamic which triggers 2 map_tasks (same task really but to handle the fan out which is more than 5k). Earlier I would see both trigger simultaneously but in my most recent run I saw them getting serialized despite no dependency between them. Code looks like this:

    def my_dynamic_wf():
        results = []
        ...
        # create some chunks of task inputs
        ...
        for c in chunks:
            ...
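
    The snippet above is cut off in the thread, so the following is a minimal sketch of the pattern being described, assuming flytekit's @dynamic and map_task APIs. The task process_chunk, the items input, and the two-way chunking are hypothetical placeholders, not the user's actual code; the concurrency=50 argument reflects the "50 in parallel for each" mentioned in the thread, assuming that maps to map_task's concurrency parameter.

    ```python
    from typing import List

    from flytekit import dynamic, map_task, task, workflow


    @task
    def process_chunk(item: int) -> int:
        # stand-in for the real per-item work
        return item * 2


    @dynamic
    def my_dynamic_wf(items: List[int]) -> List[List[int]]:
        results = []
        # split the full fan-out (>5k items in the thread) into two chunks
        mid = len(items) // 2
        chunks = [items[:mid], items[mid:]]
        for c in chunks:
            # each iteration creates an independent map-task node; with no data
            # dependency between the two nodes, Flyte should be able to schedule
            # them in parallel
            results.append(map_task(process_chunk, concurrency=50)(item=c))
        return results


    @workflow
    def wf(items: List[int]) -> List[List[int]]:
        return my_dynamic_wf(items=items)
    ```

    As kumare's question about max parallelism suggests, even when the graph shows the two map-task nodes as siblings, an execution-level max_parallelism setting (configurable on the launch plan or per execution) caps how many nodes run at once and can make independent nodes fire one after the other; pod availability in the cluster is the other factor raised in the thread.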