Thank you for this, Daniel! Leo does not currently learn from corrections made to transcriptions. However, this will be changing soon. The plan is:
- Leo will learn from corrections that users make to transcriptions. It won’t learn in real time, but in cycles of training for each release of the main model.
- To encourage users to correct machine-generated transcriptions, we’ll allow them to fine-tune the base model using their corrected transcriptions. I discuss this more here and here. The hope is that this will set in motion a data flywheel, where transcription accuracy increases in a positive feedback loop.
- It will also be possible for users to benefit from collaborating on and correcting each other’s transcriptions.
- Finally, we’re planning on introducing a “Retry transcription” modal that will harness stochasticity (like what you suggest here) to try to generate a better transcription. In addition, as part of this modal, we’ll ask the user to provide the opening text for that particular image as an in-context example, which may improve the output.