Summary
The user is facing an issue with map_tasks in a dynamic workflow: two map tasks intended to run simultaneously are executing one after the other. They find this unexpected since there are no dependencies between the tasks, and they have shared a code snippet. Although the execution graph shows the tasks as parallel, in practice they fired serially; the user is also considering whether pod unavailability could be a factor.
rupsha
seeing the same behavior in the next run as well…
rupsha
So I'm surprised that the 50 from the 1st map task and the 50 from the 2nd didn't run in parallel
rupsha
Then I increased processing concurrency to 100… and it ran the 100 one after the other
rupsha
Instead those 50 tasks ran one after the other…
rupsha
4... 1 -> 2 map tasks (wanted to run 50 in parallel for each) -> 1
kumare
How many nodes in the graph?
rupsha
?
kumare
Max parallelism
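For context: Flyte caps how many nodes of a single execution run concurrently via a max_parallelism setting (commonly 25 by default), and a low cap can serialize nodes even when the graph shows them as parallel. Below is a minimal sketch of raising it on a launch plan, assuming flytekit's LaunchPlan.get_or_create and its max_parallelism argument; the launch plan name and the value 200 are illustrative:
```python
from flytekit import LaunchPlan

# Hypothetical launch plan for the workflow discussed in this thread.
# max_parallelism bounds concurrent nodes per execution; if the two map
# tasks (or, on some Flyte versions, their subtasks) hit this cap, they
# fire one after the other despite having no data dependency.
lp = LaunchPlan.get_or_create(
    workflow=my_dynamic_wf,
    name="my_dynamic_wf_wide",
    max_parallelism=200,
)
```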
rupsha
let me check if unavailability of pods was an issue
rupsha
graph shows parallel… but for some bizarre reason they fired one after the other
kumare
They should be parallel. Do you see the graph as serial?
rupsha
Hi folks… Question about map_tasks. I have a dynamic workflow which triggers 2 map_tasks (the same task really, split in two to handle a fan-out of more than 5k). Earlier I would see both trigger simultaneously, but in my most recent run I saw them get serialized despite there being no dependency between them. Code looks like this:
```python
@dynamic
def my_dynamic_wf():
    results = []
    # ... create some chunks of task inputs ...
    for c in chunks:
        ...  # message cut off here; see the sketch below
```
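For reference, here is a minimal runnable sketch of the pattern described above, assuming flytekit's @task, @dynamic, and map_task (including its concurrency parameter); process, inputs, and the chunk size of 50 are illustrative stand-ins, not the user's actual code:
```python
from flytekit import dynamic, map_task, task


@task
def process(x: int) -> int:
    # Illustrative stand-in for the real task body, which wasn't shown.
    return x * 2


@dynamic
def my_dynamic_wf(inputs: list[int]) -> list[list[int]]:
    results = []
    # Chunk the >5k fan-out so each map task handles one slice.
    chunks = [inputs[i : i + 50] for i in range(0, len(inputs), 50)]
    for c in chunks:
        # Each map_task call becomes its own node with no dependency on
        # the others, so the nodes are eligible to run in parallel.
        # concurrency= only bounds subtasks *within* one map task; it
        # does not order the map-task nodes themselves.
        results.append(map_task(process, concurrency=50)(x=c))
    return results
```
If two map-task nodes built this way still run one after the other, the execution-level max_parallelism cap (see the launch-plan sketch above) is the usual thing to check before suspecting pod availability.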