The idea is to run 2 versions of code on production:
1 (control group) — is your old implementation which will be returned to the end user
2 (the experiment) — is a new implementation which results will be checked against the control group and logged
It will be run asynchronously and only for a small percentage of requests. And when you are sure, they are identical for all user requests for a long period (meaning that you checked all edge cases), you switch to a new implementation and remove the old one.
Here is the go library for this — https://github.com/jelmersnoeck/experiment.
And here is the talk on how my colleagues used it — https://www.youtube.com/watch?v=9TteA20_3qk