New Technologies and Innovative Solutions in the Development of Multimedia Corpus of Mezen Robinsons Texts

Tatiana V. Shvetsova, Veronika E. Shakhova, Svetlana A. Dulova

Abstract


Objective: This work applies new technologies and innovative solutions to create a multimedia corpus of texts about the "Mezen Robinsons", with the aim of preserving the memory of an 18th-century event and studying the history of the development of Spitsbergen. The article presents a multimedia corpus of Russian-language texts about the "Mezen Robinsons" written between 1766 and 2022. The story of the Mezen hunters' survival on Edge Island in 1743–1749 has repeatedly attracted the attention of specialists in various fields: historians, archaeologists, publicists, professional writers, translators, and others. The corpus brings together text, audio, video, and multimedia resources. Methods: The material was collected by continuous sampling; the data were analyzed and described using the descriptive method, the biographical method of literary study, statistical data processing, philological analysis, observation, assessment, and corpus modeling. Findings: The methodology and technology of building an independent multimedia corpus, together with its architecture and design, are described. Novelty: The multimedia corpus contributes to a new approach to studying the subjectology of Russian literature. Practical significance: The findings can serve as a basis for studying the biographies and creative work of the various authors who built their works on the story of the Mezen hunters, and for further comparison of different interpretations of a single event in the history of Arctic development.

 

DOI: 10.28991/HIJ-2023-04-01-07

Full Text: PDF


Keywords


Corpus Linguistics; Multimedia Text Corpus; Mezen Robinsons; Corpus-Based Research.



Copyright (c) 2023 Tatiana Vasilyeva Shvetsova, Veronika Evgenyevna Shakhova, Svetlana Alexeevna Dulova