Flyte enables you to build & deploy data & ML pipelines, hassle-free. The infinitely scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. Explore and Join the Flyte Community!

Machine Learning Workflow Error

Summary

The user is working on a machine learning project that involves dividing tasks across multiple workers using map_task. They have implemented a workflow that ingests data, preprocesses it, and creates batches for processing. However, they are encountering an error related to the generate_ai_response function, which is supposed to read a FlyteFile and index using the batches. The error message indicates a problem with loading a module, specifically a "ValueError: Empty module name." The user is seeking help to identify the cause of this issue.

Status
resolved
Tags
  • Error
  • Workflow Issue
  • Machine Learning
  • Workflow
  • flyte
  • generate_ai_response
  • User
  • Support Need
  • Support Request
  • Bug Report
Source
#ask-the-community
    ytong

    11/11/2024

    what does -v show?

    maartenvanmeeuwen

    11/11/2024

    I am running pyflyte run --remote -p project_name --image pointer_to_artefact_registry with flytekit==1.13.13

    ytong

    11/11/2024

    and in what file is generate_ai_response defined?

    ytong

    11/11/2024

    what’s your register/run command? can you add -v (so like pyflyte -v) to your command to show the files being copied? what version of flytekit is this?

    maartenvanmeeuwen

    11/11/2024

    Hello, for an ML project I am trying to divide work across multiple workers using `map_task`.

    ```
    def ntrk_mutation_analysis_workflow() -> None:
        load_dotenv()
        mlflow_run_id = create_run()
        raw_data: pd.DataFrame = ingest_data(
            project_name=PROJECT_NAME, bucket_name=BUCKET_NAME, file_name=FILE_NAME
        )
        preprocessed_data: FlyteFile = preprocess(raw_data=raw_data)

        log_parameters(mlflow_run_id=mlflow_run_id)

        batches = create_batches(flyte_file=preprocessed_data)
        predicted_responses: list[int] = map_task(generate_ai_response)(
            batch=batches
        )
    ```
    where `create_batches` returns a `list[int]` of batch indices. In `generate_ai_response` I am trying to read a `FlyteFile` and then index into it using the `batches`.
    
    for now I am getting this error:
    
    ```
    [32]: code:"Error" message:"[fe32955ef6b4c42b984b-n5-0-32] terminated with exit code (1). Reason [Error]. Message:

    /opt/venv/lib/python3.11/site-packages/flytekit/exceptions/scopes.py:178 in system_entry_point
    ❱ 178 │   │   │   │   return wrapped(*args, **kwargs)

    /opt/venv/lib/python3.11/site-packages/flytekit/bin/entrypoint.py:426 in _execute_map_task
    ❱ 426 │   │   mtr = load_object_from_module(resolver)()

    /opt/venv/lib/python3.11/site-packages/flytekit/tools/module_loader.py:43 in load_object_from_module
    ❱ 43 │   class_obj_mod = importlib.import_module(".".join(class_obj_mod))

    /usr/local/lib/python3.11/importlib/__init__.py:126 in import_module
    ❱ 126 │   return _bootstrap._gcd_import(name[level:], package, level)
    in _gcd_import:1201
    in _sanity_check:1114

    ValueError: Empty module name"
    ... and many more.
    ```
    where `generate_ai_response` just does this for now:
    
    ```
    from flytekit import task


    @task
    def generate_ai_response(batch: int) -> int:
        return batch
    ```
    What could the issue be?
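
The last frames of the traceback above show `importlib.import_module(".".join(class_obj_mod))` raising `ValueError: Empty module name`. That error can be reproduced in isolation whenever the dotted location being loaded has no module part, so the joined name is an empty string. The sketch below is a minimal standalone illustration of that failure mode, not flytekit's actual code; `load_object_from_dotted_path` is a hypothetical stand-in, and the thread does not confirm which string the resolver actually received.

```python
import importlib


def load_object_from_dotted_path(object_location: str):
    """Hypothetical stand-in that mirrors the import call seen in the traceback."""
    parts = object_location.split(".")
    module_parts, attribute_name = parts[:-1], parts[-1]
    # With no dot in the input, module_parts is [] and the joined name is "".
    module = importlib.import_module(".".join(module_parts))
    return getattr(module, attribute_name)


# A fully qualified location imports normally.
join = load_object_from_dotted_path("os.path.join")

# A bare name leaves nothing for the module part, so the import fails.
try:
    load_object_from_dotted_path("generate_ai_response")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

If this is what happens in the map task container, the thing to check would be how the task's module was recorded at registration time, which is why the replies ask which file defines `generate_ai_response` and what `pyflyte -v` reports.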