Denis Coquenet invites you to his thesis defense, which will take place on Thursday, September 29th at 10 am in amphitheater A (UFR Sciences et Techniques) on the Madrillet campus (Saint-Etienne-du-Rouvray).
This thesis was carried out at the University of Rouen within the LITIS Apprentissage team and is entitled:
"Towards End-to-end Handwritten Document Recognition".
The defense will take place in front of a jury composed of:
- Christian Wolf, Lecturer (HDR) at the INSA of Lyon, Referee
- Mathieu Aubry, Lecturer (HDR) at the École des Ponts ParisTech, Referee
- Elisa Fromont, Professor at the University of Rennes 1, Examiner
- Harold Mouchère, Professor at the University of Nantes, Examiner
- Thierry Paquet, Professor at the University of Rouen, thesis director
- Clément Chatelain, Lecturer (HDR) at the INSA of Rouen, thesis supervisor
Handwritten text recognition has been widely studied over the last decades for its numerous applications. Today, the state-of-the-art approach is based on a three-step process. The document is segmented into text lines, which are then ordered and recognized. However, this three-step approach has many drawbacks. The three steps are treated independently whereas they are closely related. Errors accumulate from one step to the next. The ordering step is based on heuristic rules that prevent its use for documents with complex layouts or for heterogeneous documents. The segmentation step requires its own additional annotations.
In this thesis, we propose a new paradigm to overcome these limitations; it is the first paradigm able to jointly recognize and analyze entire documents in a single step. We will see how attention-based deep neural networks enabled us to implement this paradigm, which relies on a learned reading order and proceeds character by character. The results obtained are comparable to the state of the art in terms of recognition error rate. We detail the steps that led to these results: we progressively increased the difficulty of the recognition task, going from isolated lines to paragraphs, then to whole documents.