fix: prevent race condition when marshalling cached operations#96
// A copy is needed because the top level declaration name (primitive type)
// of the operation is re-assigned.
opReplica := *cachedOperation.(*field)
operation = &opReplica
I am afraid we need deeper refactoring.
The current "buggy" implementation modifies operation data (Decl) stored in the cache after it has been retrieved from the cache. So supposedly the next time the entry is retrieved, it carries the modified data.
This implementation modifies the data on a copy of the cached entry, so the change is not visible to the next caller that hits the cache.
And maybe that is fine: the Decl part should be custom to each cycle. In that case we should probably remove it from the cache entry and keep it as a local variable in this function.
But if it is an integral part of the cache entry, we'll probably need a larger sync scope.
You are right. The updated field in Declaration is definitely not an integral part of the cache entry. Therefore, subsequent retrievals don't need to be aware of it.
Y'all are right on the money with your analysis as far as I can tell. I'll approve this in the interim though
What this PR does / why we need it
Re-bootstrapping with a recent version flags a race condition when running tests:
The issue occurs in the marshal method when multiple requests try to update the declaration name of the same cached operation
goql/query.go
Line 636 in 6f72c1e
The proposed fix copies the cached operation by value before it is updated. Since no other fields are updated during the marshalling process, there's no need to make a deep copy of the data the operation references.
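The copy-by-value approach can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual goql code: the `declaration` and `opCache` types, the shape of the `field` struct, and the `marshalName` helper are all assumptions made for the example. Only the two lines that dereference the cached pointer and take the address of the copy mirror the diff in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// declaration and field are hypothetical stand-ins for the cached
// operation type; the real goql structs may look different.
type declaration struct {
	Name string // primitive value, re-assigned on every request
}

type field struct {
	Decl     declaration
	Children []string // shared with the cache entry, only read here
}

// opCache is an assumed, simplified cache guarded by a RWMutex.
type opCache struct {
	mu      sync.RWMutex
	entries map[string]interface{}
}

func (c *opCache) get(key string) (interface{}, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.entries[key]
	return v, ok
}

// marshalName copies the cached operation by value and re-assigns the
// declaration name on the copy, so the cached entry is never written to.
func marshalName(c *opCache, key, name string) (*field, bool) {
	cachedOperation, ok := c.get(key)
	if !ok {
		return nil, false
	}
	opReplica := *cachedOperation.(*field) // shallow copy of the struct
	operation := &opReplica
	operation.Decl.Name = name // touches only this request's copy
	return operation, true
}

func main() {
	c := &opCache{entries: map[string]interface{}{
		"op": &field{Decl: declaration{Name: "original"}, Children: []string{"id"}},
	}}

	// Several goroutines rename the same cached operation concurrently.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			marshalName(c, "op", fmt.Sprintf("request-%d", i))
		}(i)
	}
	wg.Wait()

	cached, _ := c.get("op")
	fmt.Println(cached.(*field).Decl.Name) // prints "original"
}
```

The shallow copy is sufficient in this sketch only because the re-assigned name is a plain value inside the struct. Slices, maps, and pointers such as `Children` are still shared with the cached entry, so writing through them would need either a deep copy or a wider lock scope, as discussed in the review.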
Jira ID
N/A
Notes for your reviewers