Summary
The user is looking for a developer flow for data scientists that avoids specifying packages twice, once in an image spec and again in files like `requirements.txt` or a poetry configuration. They want to configure a default image in client or server settings to avoid passing `container_image=...` for every task. The user tested the configuration with `--config` but ran into issues: the default image being used is not the one they specified. A responder suspects there may be a bug and plans to look into it later. For now, the user may resort to using `ImageSpec` to expedite production deployment.
thomas571
Thanks, and just to clarify, this is on Flytekit Version: 1.13.9. Also no rush, I need to head home soon anyway. I just wanted to ask since I couldn't make any sense of this. Most likely we'll just use `ImageSpec` for now so we can get something in prod sooner, and later see if this concern was even worth following up on
kumare
I think the code may have a bug - AFK, will try later in the day once I get a chance
thomas571
`poetry run pyflyte --config=path/to/config-sandbox.yaml run --remote workflows/example.py wf --input=...`
with
```
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: localhost:30080
  insecure: true
# # This is not a needed configuration, only useful if you want to explore the data in sandbox. For non sandbox, please
# # do not use this configuration, instead prefer to use aws, gcs, azure sessions. Flytekit, should use fsspec to
# # auto select the right backend to pull data as long as the sessions are configured. For Sandbox, this is special, as
# # minio is s3 compatible and we ship with minio in sandbox.
storage:
  connection:
    endpoint: http://localhost:30002
    access-key: minio
    secret-key: miniostorage
images:
  default: localhost:30000/myimage:latest
```
as the config results in the default image `cr.flyte.org/flyteorg/flytekit:py3.11-1.13.9` being used
thomas571
I tried passing `--config` but it didn't seem to work. I'll try again in case I screwed something up while testing
kumare
It will get complicated for your users
kumare
Hmm you can pass that config to the client and it will work
thomas571
I might of course just be holding this wrong and fighting how flyte expects you to do stuff, but I'm kinda hoping there is some way to avoid duplicating the tracking of dependencies
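One way to keep a single source of truth, sketched under assumptions: read the requirement strings from the existing `requirements.txt` and pass that list to `ImageSpec` via its `packages` argument. The helper name `read_requirements` is hypothetical, and it only handles plain requirement lines (pip options such as `-r other.txt` are skipped). `ImageSpec` also has a `requirements` argument that takes a file path, which may make the helper unnecessary; check the flytekit version in use.

```python
from pathlib import Path


def read_requirements(path: str) -> list[str]:
    """Return requirement strings from a pip requirements file.

    Hypothetical helper: skips blank lines, comments, and pip options
    such as "-r other.txt" or "--index-url ...".
    """
    packages = []
    for raw in Path(path).read_text().splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop trailing inline comments
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        packages.append(line)
    return packages


# Assumed usage with flytekit (not executed here; name and registry are
# placeholders taken from the thread):
#
#   from flytekit import ImageSpec
#   image = ImageSpec(
#       name="myimage",
#       registry="localhost:30000",
#       packages=read_requirements("requirements.txt"),
#   )
```

With this, the package list lives only in `requirements.txt` and the image spec follows it.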
thomas571
Hi there, I'm trying to figure out a good developer flow for our data scientists where we don't need to specify packages both in the image spec and in other pre-existing places such as `requirements.txt`/poetry etc. What is the recommended way of solving this? Ideally I don't want to have to pass `container_image=...` to every task. Calling `pyflyte run/register` with `--image` and an image we created using an appropriate Dockerfile works just fine. But what I was hoping to be able to do was to add something like:
```
images:
  default: localhost:30000/myimage:latest
```
to the config (on the client or server) and then just have things work?
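Independent of the config question, a possible way to avoid repeating `container_image=...` on every task is to pre-bind the argument once with `functools.partial`. The sketch below is only an illustration: `task` here is a stand-in decorator so the snippet runs without flytekit installed, and `default_task`, `train`, and `evaluate` are hypothetical names. For real use, the assumption is that the stand-in is replaced by `from flytekit import task`.

```python
import functools

DEFAULT_IMAGE = "localhost:30000/myimage:latest"


def task(_fn=None, *, container_image=None, **options):
    """Stand-in for flytekit.task so this sketch runs without flytekit.

    It only records the image on the function; swap in
    `from flytekit import task` for real use.
    """
    def wrap(fn):
        fn.container_image = container_image
        return fn

    # Support both bare `@task` and parameterized `@task(...)` usage.
    return wrap(_fn) if _fn is not None else wrap


# Pre-bind the image once; every task decorated with `default_task` gets it.
default_task = functools.partial(task, container_image=DEFAULT_IMAGE)


@default_task
def train(n: int) -> int:
    return n * 2


@default_task(retries=2)  # extra task options still pass through
def evaluate(n: int) -> int:
    return n + 1
```

A single task can still override the image by passing `container_image=...` to `default_task(...)`, since keyword arguments given at call time take precedence over those bound by `functools.partial`.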