Last Tuesday I had my first meeting with Dr. Atwell, my co-supervisor, and he suggested the following research areas:
– Extending the Quranic Arabic Corpus (http://corpus.quran.com/) to include Tafsir texts, and using Artificial Intelligence or Natural Language Processing to enrich the text with a formal representation of some of the knowledge it contains. For example, adding PoS-tags and morphological analyses to Tafsir texts, and/or building a network of related Tafsir texts.
– Extending the Quranic corpus to include other Classical Arabic texts, then using this extended corpus to provide more examples of use and more contexts for each Quranic word, in order to extract a concordance of each Quranic word showing its meaning in Classical Arabic. This can be accompanied by automating the building of a distributional lexical semantic model using Scooplex.
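To make the concordance idea in the second area concrete, here is a minimal keyword-in-context sketch. It assumes the texts are already tokenized by whitespace and normalized; real Classical Arabic text would first need diacritic handling and morphological analysis. The function name `build_concordance`, the `window` parameter, and the toy tokens are all made up for illustration.

```python
from collections import defaultdict

def build_concordance(texts, targets, window=2):
    """Collect keyword-in-context entries for each target word.

    texts   -- dict mapping a source name to its text (hypothetical input format)
    targets -- set of words to look up (e.g. the Quranic vocabulary)
    window  -- number of context tokens kept on each side
    """
    concordance = defaultdict(list)
    for source, text in texts.items():
        tokens = text.split()  # assumption: whitespace tokenization is good enough
        for i, token in enumerate(tokens):
            if token in targets:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                concordance[token].append((source, left, token, right))
    return dict(concordance)

# Toy placeholder data, not real corpus text.
texts = {
    "text_a": "w1 w2 kitab w3 w4",
    "text_b": "w5 kitab w6",
}
concordance = build_concordance(texts, {"kitab"}, window=2)
for entry in concordance["kitab"]:
    print(entry)
```

Each entry records the source text and the words on either side of the target, so the contexts of one word can be read side by side across texts.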
He also suggested creating a blog to record my research progress and to help me make contact with other people who are interested in Quranic Arabic Computing.
My next research steps are:
– Creating my own blog.
– Investigating the format of the Tafsir files available online (for example: http://altafsir.com) and checking whether they are compatible with the Quranic Arabic Corpus, so that they can be used to extend it.
– Exploring the “verse similarity network”, in which each verse is linked to its related verses (http://textminingthequran.com/wiki/Verse_relatedness_in_Ibn_Kathir), and checking whether the same can be applied to Tafsir texts.
– Studying the work of Justin Washtell on lexical semantic models derived from a large corpus (http://www.comp.leeds.ac.uk/washtell).
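As a rough sketch of what a verse similarity network could look like as a data structure, the snippet below builds an undirected graph from cross-references. It assumes that relatedness comes from a commentary on one verse mentioning other verses; I have not yet checked how the network on the wiki page above is actually built. The function name `build_relatedness_network` and the verse identifiers `v1`, `v2`, `v3` are placeholders, not real data.

```python
from collections import defaultdict

def build_relatedness_network(citations):
    """Build an undirected network of related verses.

    citations -- dict mapping a verse id to the verse ids mentioned in its
                 commentary (hypothetical input format)
    Returns a dict mapping each verse id to a sorted list of its neighbours.
    """
    network = defaultdict(set)
    for verse, cited in citations.items():
        for other in cited:
            if other != verse:  # ignore self-references
                network[verse].add(other)
                network[other].add(verse)  # links are symmetric
    return {verse: sorted(neighbours) for verse, neighbours in network.items()}

# Toy placeholder data, not taken from any Tafsir.
citations = {
    "v1": ["v2", "v3"],
    "v2": ["v1"],
}
network = build_relatedness_network(citations)
for verse, neighbours in sorted(network.items()):
    print(verse, "->", neighbours)
```

The same structure would work for Tafsir passages instead of verses, as long as each passage has an identifier and a list of the passages it refers to.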