Scientific Activity Support System

Publication: Co-occurrence of the Benford-like and Zipf Laws Arising from the Texts Representing Human and Artificial Languages

Publication type: Peer-reviewed scientific article published in scientific journals with an editorial board, issued in Latvia or abroad, including university publications
Funding attracted for core activity: Other research projects
Defense:
Publication language: English (en)
Title in original language: Co-occurrence of the Benford-like and Zipf Laws Arising from the Texts Representing Human and Artificial Languages
Research field: 2. Engineering and technology
Research subfield: 2.2. Electrical engineering, electronics, information and communication technologies
Authors: Evgeny Shulzinger
Irina Legchenkova
Edward Bormashenko
Keywords: human language; artificial language; Zipf’s law; Benford’s law; qualitative linguistics
Abstract: We demonstrate that large texts, representing human (English, Russian, Ukrainian) and artificial (C++, Java) languages, display quantitative patterns characterized by the Benford-like and Zipf laws. The frequency of a word following the Zipf law is inversely proportional to its rank, whereas the total numbers of a certain word appearing in the text generate the uneven Benford-like distribution of leading numbers. Excluding the most popular words essentially improves the correlation of actual textual data with the Zipfian distribution, whereas the Benford distribution of leading numbers (arising from the overall amount of a certain word) is insensitive to the same elimination procedure. The calculated values of the moduli of slopes of double logarithmic plots for artificial languages (C++, Java) are markedly larger than those for human ones.
Hyperlink: https://arxiv.org/abs/1803.03667
Reference: Shulzinger, E., Legchenkova, I., Bormashenko, E. Co-occurrence of the Benford-like and Zipf Laws Arising from the Texts Representing Human and Artificial Languages. arXiv, 2018, 1803.03667, pp. 1-1. ISSN 2331-8422.
ID 28584
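
Illustrative note: the two quantities named in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and not the authors' procedure: the regex tokenizer, the function names (word_counts, zipf_slope, leading_digit_shares, benford_expected) and the skip_top parameter are all hypothetical, and the classical Benford formula log10(1 + 1/d) is shown only for comparison, since the exact "Benford-like" form used in the paper is not given in this record.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word tokens (assumed tokenizer: runs of Unicode letters)."""
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def zipf_slope(counts, skip_top=0):
    """Least-squares slope of log(count) versus log(rank).

    skip_top drops that many of the most frequent words before fitting;
    the remaining words keep their original ranks. At least two points
    must remain for the fit to be defined.
    """
    ranked = sorted(counts.values(), reverse=True)
    points = [(math.log(rank), math.log(c))
              for rank, c in enumerate(ranked, start=1)][skip_top:]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def leading_digit_shares(counts):
    """Share of per-word totals whose decimal form starts with digit 1..9."""
    digits = Counter(int(str(c)[0]) for c in counts.values())
    total = sum(digits.values())
    return {d: digits.get(d, 0) / total for d in range(1, 10)}


def benford_expected(d):
    """Classical Benford probability of leading digit d (comparison only)."""
    return math.log10(1 + 1 / d)
```

When the counts are exactly proportional to 1/rank, zipf_slope returns -1 up to floating-point error; the modulus of this slope is the kind of quantity the abstract compares between human and artificial languages. Passing a positive skip_top mimics the exclusion of the most popular words described in the abstract.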