While running multiple independent tasks in parallel which in fact is the primary function of map_task
map_task
in Flyte allows for the execution of the task across a series of inputs within a single workflow node. As a result, there is no need to create individual nodes for each instance which increases the performance and stability in the system.
Here is what Flyte says about the use cases of map_task
:
Executing the same code logic on multiple inputs
Concurrent processing of multiple data batches
Hyperparameter optimization
To know more, visit the following links:
The code that caused the error is as follows:
from flytekit import map_task, task, workflow
@task
def do_something(value: str) -> str:
print(f"launched: {value}", flush=True)
time.sleep(60) # fakes long process time
return f"{value}-processed"
@workflow
def do_multiple_things() -> list[str]:
values = ["foo", "bar", "baz"]
return map_task(do_something)(value=values)
So here is the solution:
Local runs will not do parallel just yet. (Making flyte-kit execute local runs in parallel is part of a broader project that flyte has plans for someday, but no definite timeline).
Top comments (2)
Hi
Could you add an intro to this post? I'm not sure I follow what a map task is and what this problem is all about?
Hi
Thank you so much for showing your interest in my blog. I have added more information in my blog and even links to the official documentation. I hope you already know what flyte is and how to use flyte.
If you still have any doubts, please let me know, I will research more on this topic and try to answer your queries