Avoiding "guessing" the transcription

Thanks for this! I see where you’re coming from about dangerously misleading transcriptions. In practice, Leo is guessing every single word of the transcript. For the model, the difference between legible and illegible is not binary but a matter of degree. We don’t want to include an [illegible] marker because it would limit what the model can learn in the future. To address this issue, we’re planning to add confidence metrics. See here: