Only transcribing the first paragraph (Nahuatl)

After spending a while uploading Spanish documents with moderate success, I’ve been experimenting with uploading some Nahuatl documents. I’m hitting a consistent bug with one set of documents: Leo transcribes only the first paragraph on a given image and skips the rest. I’m attaching an example so you can see. I didn’t run into this with my Spanish documents, so I wonder if it happens because Nahuatl is a less familiar language for the model? To compensate, I’ve been screenshotting individual paragraphs and uploading them one at a time. Any advice on how I might adjust would be appreciated!

Does it consistently stop transcribing beneath (what looks like) a signature at the end of the paragraph? If so, that’d be very helpful to know, as this may relate to a systematic issue in the training data, where the model “learns” not to keep transcribing after a passage seems to end.

Yes, that’s exactly what’s happening. When the paragraphs are written closer together and the space for the signature isn’t as well-defined and distinct, it tends to keep transcribing.
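
In case it helps anyone using the same paragraph-by-paragraph workaround, here is a minimal sketch of how the splitting could be automated instead of screenshotting by hand. It is not anything Leo does internally. It assumes the page has already been loaded as rows of grayscale values (0 = black, 255 = white), and the function name `split_at_blank_bands` and both parameters are made up for illustration; the thresholds would need tuning for real scans.

```python
def split_at_blank_bands(rows, ink_threshold=128, min_gap=3):
    """Return (start, end) row ranges of text blocks separated by blank bands.

    rows: list of rows, each a list of grayscale values (0 = black, 255 = white).
    A row counts as blank if no pixel in it is darker than ink_threshold.
    Two blocks are split only if at least min_gap blank rows lie between them.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    blocks = []
    start = None     # first row of the block currently being collected
    last_ink = None  # most recent row that contained ink
    for y, row in enumerate(rows):
        has_ink = any(px < ink_threshold for px in row)
        if not has_ink:
            continue
        if start is None:
            start = y
        elif y - last_ink > min_gap:
            # The blank band since the last inked row is tall enough: close the block.
            blocks.append((start, last_ink + 1))
            start = y
        last_ink = y
    if start is not None:
        blocks.append((start, last_ink + 1))
    return blocks
```

Each returned range could then be cropped out and uploaded as its own image, for example with Pillow's `Image.crop((0, start, width, end))`. A signature sitting in its own band of whitespace would come out as a separate block, which matches the spacing pattern described above.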

Hope this is helpful!

It’s very helpful—thanks Josh!