Confused model on long document

Jon · March 14, 2025, 9:04pm

If I’m understanding correctly, the transcript begins from the second, main section, starting “Whereas”, and continues quite faithfully up to “Now therefore these presents Witnesse And the said”. At that point the text (i.e., the ground truth) repeats the names that were listed in the first section, but the AI transcription reproduces them in the form in which they appeared in that first, skipped section. Then, once that section concludes, the whole transcript finishes…

This could help to direct us to some errors that we need to “clean” from our training data, so I’ve made a note of the problem. I also think that the general improvements we’re planning for the next model will help to reduce the likelihood of an error like this occurring.

In the meantime, you might be able to get a better result if you try the tips I suggest here, e.g. cropping the image into smaller chunks.
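
If it helps, here is a rough sketch of how the crop regions for a tall page could be worked out before cutting it up. It is only an illustration, not how the platform processes images: the function name `strip_boxes`, the strip height and the overlap are all assumptions, and the overlap is there so that a line of text falling on a cut still appears whole in one of the chunks.

```python
def strip_boxes(width, height, strip_height, overlap=0):
    """Return (left, upper, right, lower) boxes that cover a page top to bottom.

    Hypothetical helper: strip_height and overlap are in pixels and should be
    tuned to the document (e.g. overlap of roughly one text line).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if strip_height <= 0:
        raise ValueError("strip_height must be positive")
    if not 0 <= overlap < strip_height:
        raise ValueError("overlap must be at least 0 and less than strip_height")

    boxes = []
    top = 0
    step = strip_height - overlap  # how far each new strip moves down the page
    while True:
        bottom = min(top + strip_height, height)  # last strip may be shorter
        boxes.append((0, top, width, bottom))
        if bottom >= height:
            break
        top += step
    return boxes


if __name__ == "__main__":
    # Example: a 1000 x 2500 px scan cut into 1000 px strips with 100 px overlap.
    for box in strip_boxes(1000, 2500, strip_height=1000, overlap=100):
        print(box)
```

The boxes use the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so each one could be passed to it directly if you use that library; any image editor's crop tool would do the same job by hand.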