Development of Linguistic Resources and Automatic Text Adaptation Tools - PhD thesis by Margot Madina Gonzalez
Margot Madina Gonzalez
Project lead: Dr. Melanie Siegel, Dr. Itziar Gonzalez Dios
Participating institutions: Hochschule Darmstadt, University of the Basque Country (UPV/EHU)
Duration: 09/2022 to 09/2025
Access to information, knowledge and culture is a right for all citizens. However, written text does not always match readers' ability to understand it. Easy-to-Read Language (E2R) aims to overcome these comprehension barriers, thereby promoting equal opportunities.
E2R is accessible to anyone with a minimum reading level, but it is mainly aimed at specific user groups such as people with intellectual or developmental disabilities, people with cognitive difficulties or people with low literacy, among others. It makes written text more accessible through the use of clear, direct and simple language. However, it does not consist only in summarizing a text or substituting complex words: E2R has a set of rules that may vary depending on the language, and an adaptation may also include examples and visual and/or auditory aids that are not present in the source text. The creation of such aids is also part of the adaptation process.
Furthering accessibility via existing resources
Unfortunately, few linguistic resources or adaptation tools are available for E2R. This is a major problem, as automatic adaptation into E2R lags years behind advances already made for standard language, such as machine translation. It is a great disadvantage for people with reading difficulties, since they cannot access every text they would like and can rely only on texts that have been created specifically in E2R. This project aims to evaluate the existing resources for E2R automatic adaptation and to create new resources and adaptation tools. It focuses on three European languages: German, Spanish, and Basque.
Language plays a pivotal role in today’s society. Automating the adaptation process will improve the current state of E2R automatic adaptation. The spread of E2R and the creation of linguistic resources and adaptation tools based on E2R will contribute to the inclusion of those who, for various reasons, have difficulty understanding standard language.
Main scientific contributions of this dissertation
- Establishing the theoretical background for E2R automatic adaptation, taking into account both automatic text simplification (ATS) techniques and the linguistic features that affect the outcome.
- Compiling the existing parallel databases and creating new ones when necessary.
- Proposing ATS techniques that enable the adaptation of standard language into E2R. These techniques will be designed to be adaptable to different languages, so that they can be used in multilingual environments.