As discussed in my previous post, the Apollo 17 PDFs contained an early attempt at recognizing the typewritten text using Adobe Acrobat’s built-in OCR functionality. Working from Adobe’s OCR output would mean a huge amount of manual correction, which kind of defeats the purpose of using OCR to begin with.
If you’re interested in this project, have a quick flip through the PDF of the technical air-to-ground mission audio transcript to get an idea of what the source material is like. The raw PDF document (55 MB) was published courtesy of Stephen Garber (NASA HQ) and Glen Swanson (JSC). These transcripts were originally typed in 1972 by NASA typists.