Comparing Text-Matching Software Systems Using the Document Set in the Latvian Language
Journal of Academic Ethics 2020
Laima Kamzola, Alla Anohina-Naumeca

Many internationally developed text-matching software systems successfully help identify potentially plagiarized content in English texts, using both their internal databases and web resources. However, many other languages are less widely spoken yet are used daily to communicate, conduct research and acquire education. Each language has its peculiarities, so, in the context of finding content similarities, it is necessary to determine which systems are more suitable for a document set written in a specific language. The research focuses on testing existing text-matching software systems on a set of documents prepared in the Latvian language. The corpus includes documents containing verbatim plagiarism, paraphrasing, translation plagiarism and original text, so that both false positive and false negative cases can be tested. In total, 16 different text-matching software systems are compared on plagiarism coverage using the prepared document corpus. The research presented is part of the international initiative “Testing of Support Tools for Plagiarism Detection (TeSToP)” established under the European Network for Academic Integrity.

plagiarism detection, academic integrity, text-matching software, plagiarism coverage

Kamzola, L., Anohina-Naumeca, A. Comparing Text-Matching Software Systems Using the Document Set in the Latvian Language. Journal of Academic Ethics, 2020, Vol. 18, No. 2, pp. 129-141. ISSN 1570-1727. e-ISSN 1572-8544. Available from: doi:10.1007/s10805-019-09355-z

Publication language
English (en)
