Improved Skyline-BP Network for Multi-Track MIDI Music Melody Extraction and Style Classification
With the rapid growth of the digital music industry, two core challenges have emerged for multi-track MIDI files: main melody extraction is not accurate enough, and style classification performs poorly. To address these issues, this study proposes a model that combines an improved Skyline algorithm with an optimized backpropagation (BP) neural network. The method first standardizes MIDI data into a Time-Pitch-Intensity feature matrix. The improved Skyline algorithm then integrates pitch saliency calculation with temporal continuity screening, which strengthens resistance to interference among multi-track melodies. For music style classification, the optimized BP network uses Adaptive Moment Estimation (Adam) gradient optimization and a Residual Connection (ResConnect) to improve learning efficiency and accuracy. Experiments on the Lakh MIDI Dataset and the MuseScore MIDI Library show that the proposed model outperforms the comparison models overall, reaching a main melody extraction accuracy of 94.6% on classical-style music and a 2-track separation accuracy of 95.2%. The model is also more robust to noise interference and converges faster. This study provides reliable technical support for applications such as music creation assistance and copyright retrieval.
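To make the first two stages of the pipeline concrete, the following is a minimal sketch of a time-pitch-intensity matrix and of the classic (unimproved) Skyline rule, which keeps the highest sounding pitch at each time step. It is not the authors' implementation: the note tuple layout, the function names `to_feature_matrix` and `skyline`, and the grid parameters `n_steps` and `step` are all assumptions made for illustration, and the pitch saliency and temporal continuity screening described in the abstract are not reproduced here because the abstract does not specify them.

```python
from typing import List, Optional, Tuple

# Hypothetical note layout (an assumption, not taken from the paper):
# (track, start_tick, end_tick, pitch, velocity)
Note = Tuple[int, int, int, int, int]


def to_feature_matrix(notes: List[Note], n_steps: int, step: int) -> List[List[int]]:
    """Build an n_steps x 128 grid: rows are time steps, columns are MIDI
    pitches, and each cell holds the loudest velocity sounding there (0 = silent)."""
    matrix = [[0] * 128 for _ in range(n_steps)]
    for _track, start, end, pitch, velocity in notes:
        first = start // step
        last = min(n_steps, -(-end // step))  # ceiling division, clipped to the grid
        for t in range(first, last):
            matrix[t][pitch] = max(matrix[t][pitch], velocity)
    return matrix


def skyline(matrix: List[List[int]]) -> List[Optional[int]]:
    """Classic Skyline: keep the highest sounding pitch per time step,
    or None when nothing is sounding."""
    melody: List[Optional[int]] = []
    for row in matrix:
        sounding = [pitch for pitch, velocity in enumerate(row) if velocity > 0]
        melody.append(max(sounding) if sounding else None)
    return melody


# Two tracks: an upper line (track 0) over a sustained bass note (track 1).
notes = [
    (0, 0, 4, 72, 90),    # C5
    (0, 4, 8, 76, 90),    # E5
    (1, 0, 8, 48, 70),    # C3, sustained under both upper notes
    (1, 10, 12, 55, 60),  # G3, sounding alone after a rest
]
matrix = to_feature_matrix(notes, n_steps=12, step=1)
melody = skyline(matrix)
```

In this toy input the upper line wins wherever it sounds, and the lone G3 is picked up once the upper track falls silent. That second behavior is the kind of interference from accompaniment tracks that a plain highest-pitch rule cannot filter out on its own.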
This work (including HTML and PDF files) is licensed under a Creative Commons Attribution 4.0 International License.