Summary
The user developed a custom agent with the Flyte agent framework and deployed it standalone within their organization. Workflows now fail with a permission error, and it is unclear whether the error originates in propeller or admin. eric901201 suspects propeller, since agent-service development involves no admin code; the user has not yet tried local execution. eric901201 asked several contributors for help and suggested printing TaskExecution.Phase to check whether it is readable by propeller; the printed value should be an integer. The user thanked everyone for the follow-up, said they may have identified the issue, and promised an update later.
eric901201
nice, thank you so much
shuliang
Thanks for the follow-up Han-Ru, Ethan and Kevin. I think I might know what went wrong, will update when I have more data later today.
eric901201
it will be an integer number
eric901201
I mean print the TaskExecution.Phase to check if it is readable for propeller or not
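A minimal sketch of the check suggested here, for readers following along. It does not import flytekit: `Phase` is a stand-in enum whose members and values are assumed to mirror flyteidl's `TaskExecution.Phase` and should be verified against the installed version, and `check_phase` is a hypothetical helper, not part of any Flyte API. The idea is only to confirm that whatever the agent returns as its phase is an integer propeller could recognize, rather than a string or an unknown value.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Stand-in for TaskExecution.Phase (values assumed; verify against flyteidl)."""
    UNDEFINED = 0
    QUEUED = 1
    RUNNING = 2
    SUCCEEDED = 3
    ABORTED = 4
    FAILED = 5
    INITIALIZING = 6
    WAITING_FOR_RESOURCES = 7


def check_phase(phase):
    """Hypothetical helper: return (ok, description) for a phase value.

    ok is True only when the value is an integer that maps to a known phase.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(phase, bool) or not isinstance(phase, int):
        return False, f"not an integer: {phase!r} ({type(phase).__name__})"
    try:
        name = Phase(phase).name
    except ValueError:
        return False, f"integer {phase} is not a known phase"
    return True, f"{int(phase)} ({name})"


if __name__ == "__main__":
    # A raw state string such as "SUCCEEDED" is the kind of value that
    # would need converting to a phase before being returned.
    for candidate in (Phase.RUNNING, 3, "SUCCEEDED", 42):
        print(check_phase(candidate))
```

In a real agent, the same print would go right before the phase is returned from the agent's status call, so the value shows up in the agent-service logs.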
ethan.brown
I haven't seen this one pop up at all in our agent (and we've been running with agents for ~ 6 months now)
shuliang
sure. what do you need? just you know that the task execution is readable when not running standalone?
eric901201
Can you help me check this? https://github.com/flyteorg/flytekit/blob/master/flytekit/extend/backend/utils.py#L18-L31 Maybe your phase is not readable for propeller.
eric901201
Hi all, sorry for the direct ping, but I could really use some help with an issue. Has anyone encountered this error while developing custom agents? <@U065GFDUU4X> <@U072SMW5S1G> <@U03T940SK40> <@U03CLARPEJ0> Thank you all!
eric901201
this is a hard one, let me ask some OSS contributors.
shuliang
yeah I saw the log from propeller, but didn't have more information on why propeller failed like that…
eric901201
it's all in propeller
eric901201
there's 0 code related to admin when developing agent-service
eric901201
> Transitioning/Recording event
I think it's from propeller
shuliang
I have not tried the local execution though. still figuring out where is this even coming from.. e.g. from propeller? admin, etc..
eric901201
does it work well in local execution?
eric901201
I've never seen this error before
shuliang
hi guys, I have managed to create my custom agent/sensor/task leveraging the flyte agent framework. It simplifies a lot of complicated tasks and works fine.
now I am decoupling my custom agent/task from the flytekit SDK and deploying it standalone to be managed by my org/teams. And then I am seeing an error that I have never seen before, and I don't have much info to dig deeper.
in propeller I only see the permission error:
s":"kk-flyte-dev3","res_ver":"10264174790","routine":"worker-2","src":"executor.go:297","wf":"shuliang-projects:development:gnn_agent_sensor_mpijob_ei_faro.distributed_multi_node_graph_training"},"level":"debug","msg":"Transitioning/Recording event for workflow state transition [Failing] -\u003e [Failed]","ts":"2024-09-23T07:20:30Z"}
any idea what’s happening? is there some other place we can look at for what caused the permission error when creating the custom task?