Bad transcriptions

Hi,

OK. I’ve got quite good transcriptions out of relatively clean documents, and some out of quite dirty ones. But I’ve also got bad ones - essentially unusable - from smooshed-together and highly abbreviated documents. Here’s one I really hoped it could read:

This is a highly abbreviated but by no means particularly messy early seventeenth-century Italian hand, writing in Latin. I’m not sure why it’s so hard for the model, but the transcript is not particularly helpful. (If it’s any consolation, this is a better effort than I could coax out of Transkribus for the same page, even with their specialized early modern Italian hand and neo-Latin training sets.)

Thanks for this, Noah. Would it be helpful to have an option for Leo to transcribe without attempting to expand the contractions? I think the model could probably make more sense of what’s actually written that way.

Yes. From my use, I think Leo is not great at guessing how contractions should be expanded, and the default should be ‘off’ rather than ‘on’. With less abbreviated material the incorrect expansions are usually just funny, but with highly abbreviated material they produce nonsense.
