Transcription Loop

This is a fascinating error. While reading a missive on Austro-Prussian negotiations (c. 1879), Leo got caught in a loop and transcribed “Königin” more than 800 consecutive times. This is concerning from a technical point of view, since getting caught in a loop like this is bad on its own, but on top of that, “Königin” is not even the word found in the text. The sentence Leo got stuck on reads “S[eine] kais[erliche] hoheit des Kronprinzen von uns veranlasst wurde”, but Leo read “v[. ]Gefäit des Königin zu Königin Königin…”, which makes no sense.

There are other minor transcription errors: at the beginning of that paragraph, Leo writes “Lange” where it should read “Lage”, and the page is given the wrong number. But to whatever degree Leo incorporates grammar into its output, it may be worth noting a few abbreviations that it continually misreads:

“Ew.” = “Euer/Eure”, as in “Ew. Hoheit” (your highness)
“S. Maj.” = “Seine Majestät” (his majesty)
“gez.” = “Gezeichnet” (signed)
“usw” = “und so weiter” (and so on/et cetera)
“z. B.” = “zum Beispiel” (for example)
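
In case it is useful as a stopgap, here is a minimal sketch of how a few of the abbreviations above could be expanded in a post-processing step on the transcribed text. This is only an illustration: the table and the function name `expand_abbreviations` are my own assumptions and have nothing to do with how Leo works internally.

```python
import re

# Hypothetical lookup table, taken from a few entries in the list above.
ABBREVIATIONS = {
    "S. Maj.": "Seine Majestät",
    "gez.": "Gezeichnet",
    "usw": "und so weiter",
    "z. B.": "zum Beispiel",
}


def expand_abbreviations(text: str, table: dict = ABBREVIATIONS) -> str:
    """Replace each known abbreviation in `text` with its expansion."""
    # Try longer abbreviations first so a short one never shadows a long one.
    keys = sorted(table, key=len, reverse=True)
    # The lookarounds stop a match from starting or ending inside a word,
    # e.g. "z. B." must not match inside "gez. B...".
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: table[m.group(0)], text)


if __name__ == "__main__":
    print(expand_abbreviations("gez. Bismarck, z. B. S. Maj. usw"))
```

A plain lookup like this only helps once the abbreviation itself has been read correctly, so it would not solve the “S.” versus “V.” confusion.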

This is not an exhaustive list, but I can compile more examples as they come to me, since recognizing them would make Leo a better product. “S.” is particularly hard for the program: in the handwriting styles of the period, a capital S looks like a modern capital V, so if Leo does take grammar into consideration, it may think it’s looking at an abbreviated “von”, which would be quite the red herring.

Thanks Christian! Please see here for some information on these kinds of transcription errors:

We’ll be releasing a new model some time next week, and we’re hoping that the updated version will get on better with this kind of material. Unfortunately, some German handwriting styles are still not well represented in the training data, but this should be fixed in the coming months.

I’m having the same issue.

When working with an early 19th-century birth record in French, Leo seems to have a number of problems. The first is inaccurate transcription, to the point that it’s easier to do this without AI aid. The second is a sort of doom loop where Leo keeps repeating the same sentence. I think this shows that the record is a bit beyond Leo’s capacities, at least with its present training.

Here is the transcription loop (cut off after a few repetitions; it goes on for about thirty). After that I’m pasting the corresponding part of the original.

j’acquiers et avons l’attuation a la grenouille aux
deux Bourhays Commune d’archibouy et avons nous à
vous et au soixant du Soi Suzanne
de la Chavannes et de la Chavannes
opposition ne nous soit garder ne un mois de la dite Commune
de la dite Commune de la Chavannes et que deux
parties sont Mme de la Chavannes et Suzanne et
avons donné de la soi l’ent droit qui
et Suzanne Clerk sont une le mariage de tout qui
nous avons donné le
d’atture qu’au nous de la soi l’ent droit qui
et Suzanne Clerk sont une le mariage de tout qui
nous avons donné le
d’atture qu’au nous de la soi l’ent droit qui
et Suzanne Clerk sont une le mariage de tout qui
nous avons donné le
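
For anyone who wants to flag loops like this in exported text, here is a rough sketch of a check for a block of lines (or words) that repeats back to back. The function name `find_loop` and its parameters are hypothetical, and it only matches exact repeats; it is not something Leo provides.

```python
from typing import Optional, Sequence, Tuple


def find_loop(
    items: Sequence[str], min_repeats: int = 2, max_period: int = 10
) -> Optional[Tuple[int, int, int]]:
    """Find the first block that repeats consecutively.

    Returns (start index, block length, number of repeats), or None.
    """
    n = len(items)
    for start in range(n):
        for period in range(1, max_period + 1):
            # Not enough items left for `min_repeats` copies of this block.
            if start + period * min_repeats > n:
                break
            block = list(items[start:start + period])
            repeats = 1
            pos = start + period
            # Count how many identical copies follow directly.
            while list(items[pos:pos + period]) == block:
                repeats += 1
                pos += period
            if repeats >= min_repeats:
                return start, period, repeats
    return None


if __name__ == "__main__":
    # Line-level loop: a three-line block repeated twice, then cut off.
    print(find_loop(["a", "b", "c", "d", "b", "c", "d", "b", "c"]))
    # Word-level loop: a single word repeated.
    print(find_loop("zu Königin Königin Königin".split()))
```

Run on the lines above, a check like this should pick up the three-line block beginning “et Suzanne Clerk”, and run on words it would catch a single word repeated over and over.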

Thanks Paul. Was it just this one or did you notice any patterns about the kinds of sources that this tends to happen with?