My Classical Arabic Corpus is almost done!!!

Finally, I’m done collecting a 50-million-word Classical Arabic corpus. Here is a brief description of its contents:

| Genre | Subgenre | Number of books | Number of words | Percentage (%) |
|---|---|---|---|---|
| Religion | The Holy Quran | 1 | 78245 | 0.15 |
| | Hadith | 44 | 5784326 | 11.43 |
| | Exegesis of The Quran | 13 | 7061862 | 13.96 |
| | Quranic Studies | 29 | 3665288 | 7.24 |
| | Hadith Studies | 10 | 643144 | 1.27 |
| | Belief | 23 | 486801 | 0.96 |
| | Jurisprudence | 26 | 5567407 | 11.00 |
| | Principles of Jurisprudence | 4 | 358014 | 0.71 |
| Linguistics | Grammar and Morphology | 16 | 1400951 | 2.77 |
| | Language | 6 | 401308 | 0.79 |
| | Lexicons | 27 | 4855732 | 9.60 |
| | Proverbs | 7 | 435975 | 0.86 |
| Literature | Poetry | 42 | 1265696 | 2.50 |
| | Novels | 2 | 172695 | 0.34 |
| | Literature and Eloquence | 60 | 5786113 | 11.43 |
| Science | History | 19 | 3750498 | 7.41 |
| | Geography | 14 | 609979 | 1.21 |
| | Medicine | 3 | 1837452 | 3.63 |
| | Physics | 1 | 61347 | 0.12 |
| | Astronomy | 2 | 112695 | 0.22 |
| | Philosophy | 1 | 24760 | 0.05 |
| | Politics | 1 | 4674 | 0.01 |
| | Miscellaneous | 1 | 27728 | 0.05 |
| Sociology | Ethics and Morals | 23 | 1081566 | 2.14 |
| | Genealogy | 9 | 1628208 | 3.22 |
| Biography | Biography of Prophet Muhammad (peace be upon him) | 8 | 1163795 | 2.30 |
| | Other biographies | 18 | 2336153 | 4.62 |
| Total | | 410 | 50602412 | 100 |
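
For anyone who wants to check the percentage column, here is a minimal sketch of how each share can be recomputed from the word counts. The names `TOTAL_WORDS`, `word_counts` and `share` are just illustrative, not part of any corpus tooling, and only a few rows from the table above are included.

```python
# Total word count, copied from the "Total" row of the table above.
TOTAL_WORDS = 50_602_412

# A few subgenre word counts copied from the table (not the full list).
word_counts = {
    "The Holy Quran": 78_245,
    "Hadith": 5_784_326,
    "Exegesis of The Quran": 7_061_862,
    "Lexicons": 4_855_732,
    "Poetry": 1_265_696,
}


def share(words: int, total: int = TOTAL_WORDS) -> float:
    """Return the percentage of the corpus that `words` represents, rounded to 2 decimals."""
    return round(100 * words / total, 2)


# Print each subgenre with its word count and recomputed percentage.
for name, words in word_counts.items():
    print(f"{name:<24}{words:>12,}{share(words):>8.2f}")
```

The same function can be applied to any other row by adding its word count to `word_counts`.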

2 Responses to My Classical Arabic Corpus is almost done!!!

  1. mahmoud says:

    When will it be available to the public?
