Issue with FlyteDirectory File Uploads

Summary

The user reports that files in a FlyteDirectory are only partially uploaded to the blob store, with uploads capping at about 500MB. The truncated parquet files are unreadable because their trailing magic bytes are missing. The user has checked local file sizes and shared log output showing files ranging from roughly 472MB to 540MB, and is asking whether a configuration change can resolve this. Responders suggest raising the Flytekit logging level to get more detail about the upload process. After some experimentation, the user suspects the issue may not be in Flyte itself, since it also occurs when uploading files directly with s3fs; that would still affect FlyteDirectory/FlyteFile objects, which use s3fs by default.

Status
resolved
Tags
  • parquet
  • flyte
  • s3fs
  • File Upload
  • blob store
  • Support Need
  • Configuration
  • File Issue
  • Support Request
  • Blob Store
  • File Upload Issue
  • Bug Report
Source
#ask-the-community
    kumare

    10/28/2024

    It should not; that would break Flyte's promise. By default it uses s3fs too, much to our dislike.

    habuelfutuh

    10/28/2024

    It should output more verbose logging for the upload operation

    habuelfutuh

    10/28/2024

    Can you turn on a higher log level for flytekit? Launch with this env var: FLYTE_SDK_LOGGING_LEVEL=10
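For reference, the value 10 is the numeric value of Python's standard `logging.DEBUG` level. The sketch below only sets the variable in the current process and shows that mapping. It assumes the variable has to be in the environment before flytekit is imported, and for a remote task it would need to be set in the task container instead (for example through the `environment` argument of `@task`, if your flytekit version provides it).

```python
import logging
import os

# Variable name taken from the message above; 10 == logging.DEBUG.
# Assumption: this must be set before flytekit is imported.
os.environ["FLYTE_SDK_LOGGING_LEVEL"] = str(logging.DEBUG)

level = int(os.environ["FLYTE_SDK_LOGGING_LEVEL"])
level_name = logging.getLevelName(level)
print(f"FLYTE_SDK_LOGGING_LEVEL={level} ({level_name})")
```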

    habuelfutuh

    10/28/2024

    The task reports success, I presume? This is so odd... Can you double-check that the local files are correct? You can either log that or use @vscode to open VS Code in the browser from the running pod and inspect them.
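One way to log whether the local files are correct before upload is to record each file's size and check the parquet magic bytes. A parquet file with a plaintext footer starts and ends with the 4 bytes `PAR1`, so a truncated file fails the footer check. This is a minimal sketch rather than anything from the thread; the helper names `inspect_parquet` and `inspect_directory` are hypothetical, and it assumes the files use the `.parquet` extension.

```python
from pathlib import Path

# Parquet files with a plaintext footer begin and end with these 4 bytes.
PARQUET_MAGIC = b"PAR1"


def inspect_parquet(path):
    """Return the size of one file and whether its magic bytes are intact."""
    path = Path(path)
    size = path.stat().st_size
    with path.open("rb") as f:
        head = f.read(4)
        tail = b""
        if size >= 8:
            f.seek(-4, 2)  # 2 == os.SEEK_END: read the last 4 bytes
            tail = f.read(4)
    return {
        "name": path.name,
        "size_bytes": size,
        "header_ok": head == PARQUET_MAGIC,
        "footer_ok": tail == PARQUET_MAGIC,
    }


def inspect_directory(directory):
    """Check every *.parquet file directly inside the directory (hypothetical helper)."""
    return [inspect_parquet(p) for p in sorted(Path(directory).glob("*.parquet"))]
```

Calling `inspect_directory(...)` on the local output directory inside the task, just before returning the FlyteDirectory, and logging the result would show whether truncation happens before or during the upload. Running the same check on files downloaded back from the blob store gives the other half of the comparison.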