00000ctm a22000004a 4500 UP-99796217610382301 Buklod 20120628112629.0 m |o d | ta 120628s xx d r |||| u| (iLib)UPD-00187102700 DENGII eng DMLUC
LG 995 2011 C65 O54
Ong, Darrel Alvin N.
Automated content scoring of Filipino essays using concept indexing Darrel Alvin N. Ong.
2011.
x, 55 p. ill.
Thesis (M.S. Computer Science)--University of the Philippines, Diliman.
Essay writing is the composition of a short piece presenting the author's personal views on a given topic. It is taught to every student and has become a major part of formal education in the Philippines. It is also an effective method of improving language proficiency, enriching students' vocabulary as they write. However, evaluating and scoring essays takes a considerable amount of time, which limits how many writing exercises language teachers can assign. Teachers also often experience boredom and fatigue, especially when checking large numbers of essays, resulting in inconsistent scores. These problems motivate the development of Automated Essay Graders (AEGs). A number of AEGs have already been developed for various languages, but none so far for the Filipino language. This study addresses these problems by developing the first automated content analysis system for the Filipino language. It is a computer application that automatically analyzes and scores the content of Filipino essays. It captures the semantic meaning of each essay and scores it accordingly. The corpus consisted of essays written by students from a public high school. These essays were checked by different teachers based on content, grammar, and organization. Only the scores for content were considered in this study. Experiments were conducted to determine the effects of spell checking, stop-word removal, stemming, sub-clustering, and normalized weighting schemes. Different cases were considered to determine the optimum parameters that improve the performance of the system.
Results show that spell checking and stemming yield no significant improvement. However, stop-word removal combined with a normalized raw term frequency weighting scheme improves system accuracy. The system is implemented using Concept Indexing (CI), a relatively new dimensionality reduction algorithm in the field of NLP. It is also implemented using Latent Semantic Indexing (LSI), the more common algorithm used by most AEG systems for other languages. Experiments were conducted for each teacher with both CI and LSI, using the optimum parameters from the previous experiments. Results show that the relatively new algorithm, CI, outperforms LSI and is faster. The system was compared with human checkers to determine whether it can mimic the way teachers score an essay. Results show that one cannot distinguish between the teacher and the computer mimicking the teacher's style of essay grading. As a result, students can be given more writing exercises, which will in turn improve their language proficiency. Lastly, Filipinos will be able to learn and better understand the Filipino language.
Essays -- Indexes -- Databases.
Latent semantic indexing.
Automatic indexing -- Software.
Automated Essay Grader (AEG).
Concept indexing.
Filipino essays.
FI UP
UPD DARCHIVES LG 995 2011 C65 O54
UPD DENG-II LG 995 2011 C65 O54 Thesis