This entry describes how a reading comprehension application was built with Freeling, an open source library for linguistic analysis. Freeling can analyze the structure of a sentence and assign each word a tag that identifies its part of speech (verb, noun, pronoun, adjective, etc.). This feature is very useful when analyzing texts.
The application reads a text and answers questions about it. The texts are in English, and the application was written in Python.
Figure 1 shows a diagram of how the program operates:
Fig. 1: Operation of the reading comprehension application
1. Select the directory
Select the directory that contains the files you want to analyze. The files must be in .txt format, and the directory can contain any number of them. Each file must include the set of questions you want answered, placed at the end of the text (Fig. 2).
Fig. 2. Selecting the directory
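As a rough illustration of this step, the sketch below loads every .txt file in a directory and separates the passage from the questions at the end. It is not the application's actual code: the function names are hypothetical, and it assumes that the questions are the trailing lines of the file that end with a question mark.

```python
from pathlib import Path


def split_text_and_questions(content):
    """Separate the passage from the block of questions at the end.

    Assumption: the questions are the last non-empty lines of the file,
    and each of them ends with a question mark.
    """
    lines = [line.strip() for line in content.splitlines() if line.strip()]
    questions = []
    # Walk backwards from the end, collecting lines that look like questions.
    while lines and lines[-1].endswith("?"):
        questions.insert(0, lines.pop())
    return " ".join(lines), questions


def load_directory(directory):
    """Return {file name: (passage, questions)} for every .txt file."""
    result = {}
    for path in sorted(Path(directory).glob("*.txt")):
        content = path.read_text(encoding="utf-8")
        result[path.name] = split_text_and_questions(content)
    return result
```

Calling load_directory on the selected folder would then give one passage and one list of questions per file, ready to be passed to the analysis step.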
2. Text analysis
The analysis consists mainly of determining which sentences in the text carry the greatest semantic load in relation to the question asked. Each sentence is given a percentage, and the most probable answer is chosen accordingly.
When writing the code it is important to take into account the grammatical structure of both the questions and the answers:
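One simple way to picture this scoring is to count how many of the question's content words appear in each sentence. The sketch below does that. The stop-word list, the function names and the exact formula are assumptions made for illustration; the application's real scoring may differ.

```python
import re

# Small illustrative stop-word list (an assumption, not the app's own list).
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "were", "of", "in", "to",
    "what", "who", "where", "when", "why", "do", "does", "did",
}


def content_words(text):
    """Lower-case the text and keep only the words that carry meaning."""
    words = re.findall(r"[a-z']+", text.lower())
    return {word for word in words if word not in STOP_WORDS}


def score_sentences(question, sentences):
    """Return (percentage, sentence) pairs, highest percentage first.

    The percentage is the share of the question's content words
    that also appear in the sentence.
    """
    question_words = content_words(question)
    scored = []
    for sentence in sentences:
        shared = question_words & content_words(sentence)
        if question_words:
            percentage = 100.0 * len(shared) / len(question_words)
        else:
            percentage = 0.0
        scored.append((percentage, sentence))
    return sorted(scored, key=lambda item: item[0], reverse=True)


ranking = score_sentences(
    "Who is Christopher Robin?",
    ["Pooh likes honey.", "Christopher Robin is a boy.", "Robin went home."],
)
for percentage, sentence in ranking:
    print(f"{percentage:5.1f}%  {sentence}")
```

In this small example the sentence that mentions both "Christopher" and "Robin" comes first, and the sentence that shares no content word with the question comes last.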
- The questions begin with the so-called W words (what, who, where, when and why).
- The sentences have a subject, a verb and a complement.
- The question and all possible answers must be analyzed. An example is described below:
For the question: Who is Christopher Robin?
- Freeling Analysis
Fig. 3: Freeling analysis
Freeling tags the words as follows: the W-word "Who" as WP (interrogative pronoun); "is" as VBZ (verb, third person singular, present tense); "Christopher Robin" as NP (proper noun); and "?" as Fit (question mark).
Once the question has been analyzed word by word, it is easier to find the answers. The answer to this question should have the form NP + VBZ + C (complement). The application gives higher priority to answers that meet this criterion.
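The pattern check can be sketched as follows. The tagged sentences are written by hand as (word, tag) pairs rather than produced by Freeling, so no Freeling call is shown. The function name is hypothetical. Requiring the answer to start with the same proper noun as the question, and treating every tag that starts with "F" as punctuation, are assumptions made for this example.

```python
def fits_who_is_pattern(question, candidate):
    """Check whether a candidate answer has the shape NP + VBZ + complement.

    Both arguments are lists of (word, tag) pairs. The check only applies
    to questions tagged WP + VBZ + NP, such as "Who is Christopher Robin?".
    """
    question_tags = [tag for _, tag in question]
    if question_tags[:3] != ["WP", "VBZ", "NP"]:
        return False
    if len(candidate) < 3:
        return False

    subject = question[2][0]
    first_word, first_tag = candidate[0]
    _, second_tag = candidate[1]
    # Assumption: tags starting with "F" (like Fit) mark punctuation,
    # so a complement needs at least one token with a different tag.
    has_complement = any(not tag.startswith("F") for _, tag in candidate[2:])
    return (
        first_tag == "NP"
        and first_word == subject
        and second_tag == "VBZ"
        and has_complement
    )


question = [("Who", "WP"), ("is", "VBZ"), ("Christopher Robin", "NP"), ("?", "Fit")]
answer = [("Christopher Robin", "NP"), ("is", "VBZ"), ("a", "DT"), ("boy", "NN"), (".", "Fp")]
other = [("He", "PRP"), ("likes", "VBZ"), ("honey", "NN"), (".", "Fp")]

print(fits_who_is_pattern(question, answer))  # True
print(fits_who_is_pattern(question, other))   # False
```

A sentence that passes this check could then have its percentage raised, which is one possible way to give it the higher priority described above.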
3. Answers
Once the whole text has been analyzed, all the possible answers to the given question are shown in the window. Next to each answer is a percentage: the higher the percentage, the more likely the answer is to be correct.
It is important to mention that, since the English language has a large number of grammatical structures, the answer with the highest percentage will not always be the correct one. Figure 4 shows the questions and answers from an example run in the app:

Fig. 4: Questions and answers
The application also lets you ask a new question. You can enter any question and the app will give you an answer (Fig. 5).
Fig. 5: Extra Questions
The next video shows how the app works:
Conclusion
Freeling is a very useful tool for text analysis, and it is easy to integrate. To create any application it is important to have the right tools, but it is even more important to know how to use them. For this, the functionality the application should have must be clear from the start, so that it can be programmed in a logical and rational way.
The application meets its main objective of reading texts and answering questions. There is a certain margin of error because the answers are based on the words each sentence shares with the question. Even so, the tests carried out show that, in general, the answer with the highest percentage is the correct one.

