GUI Application for Measuring Morphological Richness of Korean Texts
Published by Haerim Hwang
Korean, morphological complexity, morphological diversity, morphological richness, natural language processing, python
The Korean Morphological Richness Analyzer (KOMORA) 1.0 measures the morphological richness of Korean texts.
The 15 indices computed by this tool are:
(1) number of sentences,
(2) number of eojeols,
(3) token frequency of morphemes,
(4) type frequency of morphemes,
(5) type-token ratio (ttr; Chotlos, 1944),
(6) root type-token ratio (rttr; Guiraud, 1960),
(7) corrected type-token ratio (cttr; Carroll, 1964),
(8) Herdan (Herdan, 1960),
(9) Summer (Somers, 1966),
(10) Dugast (Dugast, 1978),
(11) Maas (Maas, 1972),
(12) mean segmental type-token ratio (msttr; Johnson, 1944),
(13) moving-average type-token ratio (mattr; Covington & McFall, 2010),
(14) measure of textual lexical diversity (mtld; McCarthy & Jarvis, 2010),
(15) HD-D (McCarthy & Jarvis, 2007).
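To make the list above more concrete, here is a minimal sketch of how several of these indices can be computed from a list of morphemes, using the formulas commonly given for them in the lexical diversity literature. It is not KOMORA's own code. The function names (`richness_indices`, `msttr`, `mattr`), the hand-segmented sample, and the small window sizes are assumptions made for illustration; a real analysis would take its morphemes from a tagger and would typically use much larger windows.

```python
import math


def richness_indices(morphemes):
    """Closed-form indices computed from token count N and type count V."""
    n = len(morphemes)        # token frequency of morphemes
    v = len(set(morphemes))   # type frequency of morphemes
    if v < 2 or v == n:
        # Summer needs log(V) > 0; Dugast divides by log(N) - log(V).
        raise ValueError("need at least two types and one repeated morpheme")
    log_n, log_v = math.log(n), math.log(v)
    return {
        "tokens": n,
        "types": v,
        "ttr": v / n,
        "rttr": v / math.sqrt(n),
        "cttr": v / math.sqrt(2 * n),
        "herdan": log_v / log_n,
        "summer": math.log(log_v) / math.log(log_n),
        "dugast": log_n ** 2 / (log_n - log_v),
        "maas": (log_n - log_v) / log_n ** 2,
    }


def msttr(morphemes, segment):
    """Mean TTR over consecutive, non-overlapping segments.

    A trailing segment shorter than `segment` is discarded.
    """
    if len(morphemes) < segment:
        raise ValueError("text is shorter than one segment")
    starts = range(0, len(morphemes) - segment + 1, segment)
    ratios = [len(set(morphemes[i:i + segment])) / segment for i in starts]
    return sum(ratios) / len(ratios)


def mattr(morphemes, window):
    """Mean TTR over every window of `window` tokens, sliding one token at a time."""
    if len(morphemes) < window:
        raise ValueError("text is shorter than one window")
    starts = range(len(morphemes) - window + 1)
    ratios = [len(set(morphemes[i:i + window])) / window for i in starts]
    return sum(ratios) / len(ratios)


# Hypothetical, hand-segmented morphemes for "나는 학교에 갔다. 나는 밥을 먹었다."
# (actual tagger output may segment these sentences differently).
sample = ["나", "는", "학교", "에", "가", "았", "다",
          "나", "는", "밥", "을", "먹", "었", "다"]

indices = richness_indices(sample)
for name, value in indices.items():
    print(name, value)
print("msttr", msttr(sample, segment=7))
print("mattr", mattr(sample, window=10))
```

Because ttr falls as texts get longer, the transformed indices (rttr through Maas) and the segment- or window-based indices (msttr, mattr) are the ones usually preferred when comparing texts of different lengths. mtld and HD-D are left out of the sketch because they require a factor-counting procedure and a hypergeometric calculation, respectively.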
This tool has been developed using Lexical Richness, Kivy, and KoNLPy.
The user manual for KOMORA is available here.
Depending on the performance of your device, the application can take 1 to 3 minutes to open because it installs a few natural language processing packages.
Hwang, H. (2024). Development of morphological diversity in second language Korean: An NLP analysis using the Korean Morphological Richness Analyzer 1.0. System, 121, 103260. https://doi.org/10.1016/j.system.2024.103260