DockerIM should use operation mechanism properly#498
DockerIM should use operation mechanism properly#4980405ysj wants to merge 3 commits intogoogle:mainfrom
Conversation
| return nil, fmt.Errorf("operation not found for %q", name) | ||
| } | ||
| entry := val.(*operationEntry) | ||
| ctx, cancel := context.WithTimeout(context.TODO(), 3*time.Minute) |
There was a problem hiding this comment.
Can context.TODO() be context.Background() ?
| return entry.(*operationEntry).op | ||
| } | ||
| } | ||
| panic("Reached newOperationRetryLimit") |
There was a problem hiding this comment.
Shoud this be panic? looks like it stops service if newOperation is failed, how about just logging?
There was a problem hiding this comment.
I think it's okay to be panic, as retry count increments when uuid conflict happens. I believe it's very close to 0% with large retry couny, even less than sha256 hash conflict.
There was a problem hiding this comment.
Leave a comment explaining that the chances of hitting this are practically zero and not worth it of changing the API to return an error and that in the unlikely scenario this triggers a panic is better than running forever or returning an invalid/nil operation.
| case "POST /operations/deletingbar/:wait": | ||
| writeOK(w, apiv1.HostInstance{Name: "bar"}) | ||
| case "POST /operations/deletingbaz/:wait": | ||
| writeOK(w, apiv1.HostInstance{Name: "baz"}) |
There was a problem hiding this comment.
waiting for a delete host operation should return an empty response: https://github.com/google/cloud-android-orchestration/blob/main/pkg/app/instances/gce.go#L344. Please fix the DockerIM to return an empty response first.
| return | ||
| } | ||
| ins := &apiv1.HostInstance{} | ||
| if err := c.waitForOperation(&op, ins); err != nil { |
There was a problem hiding this comment.
waiting for a delete host operation should return an empty response: https://github.com/google/cloud-android-orchestration/blob/main/pkg/app/instances/gce.go#L344.
create a new helper: waitForOperation() where you don't have to pass a response argument, after fixing the DockerIM implementation.
| mutexes sync.Map | ||
| Config Config | ||
| Client *client.Client | ||
| mutexes sync.Map |
There was a problem hiding this comment.
The CO IM is designed to be stateless.
Why do you need to add this new state to the Docker IM implementation? What are the actual operations you need this for? You should be relying on Docker Engine to query whether a container was created, deleted and so on, to track such operations.
There was a problem hiding this comment.
It was motivated from taking long(> 1min) times to reply REST API, such as POST /hosts and DELETE /hosts.
The CO IM is designed to be stateless.
Then this PR should be definitely my fault.. How should I get such information? I think these circumstances are repeated, and I wish to retrieve how we manage those before working on it.
Context: b/453876231