Trainable Regularization in Dense Image Matching Problems

Vladimir Zh. Kuklin, Aslan A. Tatarkanov, Alexander A. Umyskov


This study addresses the development of specialized models for solving image matching problems. Its purpose is to develop a dense image matching technique based on energy tensor aggregation. The task is relevant to computer systems because image matching underlies current problems such as reconstructing a three-dimensional model of an object, creating a panoramic scene, and recognizing objects. The paper examines in detail the key features of image matching based on binocular stereo reconstruction, describes how energies are calculated during this process, and presents the main components of the proposed method as diagrams and formulas. A machine learning model is developed that solves image matching problems on real data using parallel programming tools, and the architecture of the convolutional recurrent neural network underlying the method is described in detail. Computational experiments were conducted to compare the results with those of methods proposed in the scientific literature. In these experiments, the proposed method performed better in terms of both execution speed and number of errors.


DOI: 10.28991/HIJ-2023-04-03-011



Keywords: Image Matching; Convolutional Recurrent Neural Network; Stereo Reconstruction; Method Error; Neural Network Architecture.




Copyright (c) 2023 Vladimir Zh. Kuklin, Aslan A. Tatarkanov, Alexander A. Umyskov