Generally reliable transcription

I uploaded several diaries to Leo and they were very accurately transcribed, at least from what I can tell. This may be facilitated by the neatness and consistency of the author’s handwriting, as well as the clarity of the scans, but I am so far impressed by the accuracy of Leo’s transcriptions.

That’s awesome to hear. Thank you! :grin:

Mine are often 60-70% accurate, and there are some I will go through and correct. Would it be helpful for you to see some of the corrected ones?

If you notice any systematic kinds of errors that Leo tends to make, or documents that it generally struggles with, that would be very helpful to know!

Around 95% accurate on the mid-19th century letters I’ve been transcribing (60 or so pages so far). Decent, but not perfect, handwriting, so overall I’ve been very satisfied. The speed is also about right – it takes as much time to transcribe a new page as it does for me to proofread the previous transcription and remove the line breaks.

Agreed. Particularly with rather neat handwriting, the transcription is excellent: in a 500-word letter written in French in the following script:

The first error rendered what should have been “plupart” as “plagiat.” I think the error here originated with seeing that second “p” as a “g,” even though the form wasn’t really different.

Elsewhere the AI interpreted a “p” as a “g” and this was the source of the error.
A third error transcribed “j’ai tracé sa” as “j’aurais dû.”

This one is a little odd because elsewhere the software picks up this author’s “j’ai” easily, but once it lost the thread, it converted the whole expression to “j’aurais.”

Another letter, in a hand that is much looser, less regular and also somewhat lighter, gave the AI much more trouble. Here is a sample:

Understandably, the AI wants to transcribe “des Malleval” (i.e. some members of the Malleval family) as “malheur,” and this leads to trouble all the way through the document. One thing that the AI got immediately, and that I would have struggled with, is “dissoute,” as in “la société fut dissoute,” on the 8th and 9th lines. Thereafter the real troubles begin: further along line 9, “Les Malleval” is again (incorrectly, though consistently) transcribed as “les malheurs,” and then the AI skips 3.5 lines to the (correct but misplaced) “plus tard, à de grosses difficultés.”

I tested this letter because it is one of the most difficult in the entire archive I’m working with. At the present level of accuracy, I’m not sure whether it’s worth correcting the AI transcription or simply transcribing by hand.

Seems like we definitely need to get Leo reading more French…

Thank you Paul!