Summary
The user reports that files in a FlyteDirectory are only partially uploaded to the blob store: uploads cap at 500MB, which leaves the parquet files unreadable because the magic bytes at the end of each file are missing. They are asking which configuration would resolve this. The user has checked the local file sizes and shared log output showing files ranging from about 472MB to 540MB. A responder suggests raising the Flytekit logging level to get more detail about the upload process. After some experimentation, the user suspects the problem may not lie in Flyte itself, since it also occurs when uploading files directly with s3fs, although that would still affect the use of FlyteDirectory/FlyteFile objects.
ytong
kumare
It should not; this would break Flyte's promise. By default it uses s3fs too, much to our dislike.
habuelfutuh
It should output more verbose logging for the upload operation
habuelfutuh
Can you turn on a higher log level for flytekit? Launch with this env var: FLYTE_SDK_LOGGING_LEVEL=10
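For context, the value 10 is the numeric code for DEBUG in Python's standard logging module. A minimal sketch of building an environment with that variable set is shown here; the helper name `debug_env` is hypothetical, and how the variable reaches the task container depends on your deployment.

```python
import logging
import os

# 10 is the numeric value of DEBUG in Python's standard logging module.
DEBUG_LEVEL = logging.DEBUG


def debug_env(base_env=None):
    """Return a copy of the environment with Flytekit debug logging requested.

    Hypothetical helper: it only prepares the variables and does not
    launch anything or touch the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["FLYTE_SDK_LOGGING_LEVEL"] = str(DEBUG_LEVEL)
    return env
```

The resulting dict can be passed as `env=` to `subprocess.run` when launching a local run, or the same key and value can be set in the task's environment configuration.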
habuelfutuh
The task reports success, I presume? This is so odd. Can you double-check that the local files are correct? You can either log that or use @vscode to open VS Code in the browser from the running pod and inspect them.
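One way to log whether the local files are correct before upload is to check each file's size and its magic bytes. This is a rough sketch, assuming unencrypted parquet files, which begin and end with the 4 bytes `PAR1`; the function names `check_parquet_file` and `report_directory` are hypothetical and not part of flytekit.

```python
import os
from pathlib import Path

# Unencrypted Parquet files start and end with this 4-byte marker.
PARQUET_MAGIC = b"PAR1"


def check_parquet_file(path):
    """Return (size_in_bytes, has_header_magic, has_footer_magic)."""
    size = os.path.getsize(path)
    if size < 2 * len(PARQUET_MAGIC):
        # Too small to hold both markers.
        return size, False, False
    with open(path, "rb") as f:
        head = f.read(len(PARQUET_MAGIC))
        f.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = f.read(len(PARQUET_MAGIC))
    return size, head == PARQUET_MAGIC, tail == PARQUET_MAGIC


def report_directory(directory):
    """Check every *.parquet file under `directory`, recursively."""
    return {
        str(p): check_parquet_file(p)
        for p in sorted(Path(directory).rglob("*.parquet"))
    }
```

Calling `report_directory` on the local directory inside the task, just before returning the FlyteDirectory, and logging the result gives sizes to compare against the objects in the blob store. Running the same check on a downloaded copy shows whether the footer marker was lost in transit.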