Bayesian Neural Networks: Why and How?

Document Type: Promotional Paper

Authors

1 School of Engineering Science, College of Engineering, University of Tehran, Tehran, Iran

2 School of Engineering Science, College of Engineering, University of Tehran, Tehran, Iran

3 Department of Algorithms and Computation, School of Engineering Science, University of Tehran, Tehran, Iran

Abstract

One of the main challenges in using neural networks is overfitting: the model fits the training data almost exactly but fails to generalize to data outside the training set. This failure is most often observed when the number of training samples is small relative to the number of features and the model is complex, that is, when the network has many weights and biases. In such situations, ensemble learning, and more specifically bagging, is commonly employed: resampling the training data injects variability into the fitted models and improves generalization. When the training sample is extremely small, however, resampling loses its effect, because the uncertainty it can introduce is very limited. Bayesian neural networks address this by quantifying the uncertainty in the parameters themselves, assigning probability to parameter values that the observed data alone may not reveal, which markedly improves generalization. Beyond mitigating overfitting, the Bayesian approach also yields the posterior predictive distribution, from which prediction intervals can be computed. In this article, we briefly review Bayesian neural networks, explain how they are trained, and then analyze data to compare these models with standard neural networks.
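As a concrete illustration of how such a model is trained, the following minimal PyTorch sketch follows the Bayes-by-Backprop recipe of [1]: each layer keeps a factorized Gaussian variational posterior over its weights, training minimizes a negative ELBO (data misfit plus a KL penalty toward the prior), and repeated stochastic forward passes yield posterior predictive samples from which prediction intervals are read off. The architecture, toy data, and hyperparameters below are illustrative assumptions, not the configuration used in our experiments.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    # Linear layer with a factorized Gaussian posterior over weights and biases.
    def __init__(self, in_features, out_features, prior_std=1.0):
        super().__init__()
        self.prior = torch.distributions.Normal(0.0, prior_std)
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))
        self.kl = torch.tensor(0.0)

    def forward(self, x):
        w_std = F.softplus(self.w_rho)  # softplus keeps the std positive
        b_std = F.softplus(self.b_rho)
        # Reparameterization trick: sample weights while keeping gradients.
        w = self.w_mu + w_std * torch.randn_like(w_std)
        b = self.b_mu + b_std * torch.randn_like(b_std)
        q_w = torch.distributions.Normal(self.w_mu, w_std)
        q_b = torch.distributions.Normal(self.b_mu, b_std)
        # Single-sample Monte Carlo estimate of KL(posterior || prior).
        self.kl = (q_w.log_prob(w) - self.prior.log_prob(w)).sum() + \
                  (q_b.log_prob(b) - self.prior.log_prob(b)).sum()
        return F.linear(x, w, b)

class BNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = BayesianLinear(1, 32)
        self.l2 = BayesianLinear(32, 1)

    def forward(self, x):
        return self.l2(torch.relu(self.l1(x)))

    def kl(self):
        return self.l1.kl + self.l2.kl

# Small 1-D regression problem (an illustrative assumption).
torch.manual_seed(0)
x = torch.linspace(-2.0, 2.0, 40).unsqueeze(1)
y = torch.sin(3.0 * x) + 0.1 * torch.randn_like(x)

model = BNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    pred = model(x)  # one stochastic forward pass = one weight sample
    nll = 0.5 * ((pred - y) ** 2).sum()  # Gaussian likelihood, unit noise, up to constants
    loss = nll + model.kl()              # negative ELBO
    loss.backward()
    opt.step()

# Posterior predictive: every forward pass resamples the weights.
with torch.no_grad():
    samples = torch.stack([model(x) for _ in range(200)])
lower = samples.quantile(0.025, dim=0)  # 95% interval from predictive samples
upper = samples.quantile(0.975, dim=0)

A standard network run through the same loop would return the identical output on every pass; here the spread between the two quantiles is exactly the prediction interval discussed above, and wide intervals flag inputs on which the model is effectively guessing.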

References

[1] C. Blundell, J. Cornebise, K. Kavukcuoglu and D. Wierstra, Weight uncertainty in neural network, Proc. of the International Conference on Machine Learning, PMLR, Lille, France, (2015) 1613–1622.
[2] J. P. Bharadiya, A review of Bayesian machine learning principles, methods, and applications, Int. J. Innov. Sci. Res. Technol., 8 no. 5 (2023) 2033–2038.
[3] R. Chandra, R. Chen and J. Simmons, Bayesian neural networks via MCMC: a Python-based tutorial, (2023). https://doi.org/10.48550/arXiv.2304.02595.
[4] B. Walsh, Markov chain Monte Carlo and Gibbs sampling, Lecture Notes for EEB 581, (2004) 24 pp.
[5] A. Graves, Practical variational inference for neural networks, Advances in Neural Information Processing Systems 24 (NIPS), (2011).
[6] A. Gelman, J. B. Carlin, H. S. Stern and D. B. Rubin, Bayesian Data Analysis, Chapman and Hall/CRC, 1995.
[7] Z. Q. Hong and J. Y. Yang, Lung cancer, UCI Machine Learning Repository, (1992). https://doi.org/10.24432/C57596.
[8] W. K. Hastings, Monte Carlo sampling methods using Markov chains and their applications, Biometrika, 57 no. 1 (1970) 97–109.
[9] L. V. Jospin, H. Laga, F. Boussaid, W. Buntine and M. Bennamoun, Hands-on Bayesian neural networks—A tutorial for deep learning users, IEEE Computational Intelligence Magazine, 17 no. 2 (2022) 29–48.
[10] H. D. Kabir, A. Khosravi, M. A. Hosen and S. Nahavandi, Neural network-based uncertainty quantification: A survey of methodologies and applications, IEEE Access, 6 (2018) 36218–36234.
[11] J. Ker, L. Wang, J. Rao and T. Lim, Deep learning applications in medical image analysis, IEEE Access, 6 (2018) 9375–9389.
[12] I. Oleksiienko, D. T. Tran and A. Iosifidis, Variational neural networks, Procedia Computer Science, 222 (2023) 104–113.
[13] M. R. Meshkani and A. Kavousi Dolanghar, Bayesian Statistical Methods, Shahid Beheshti University of Medical Sciences Press, (2022). [In Persian]
[14] C. P. Robert and G. Casella, Monte Carlo Statistical Methods, Springer, 1999.
[15] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow and R. Fergus, Intriguing properties of neural networks, (2013). arXiv preprint arXiv:1312.6199. https://doi.org/10.48550/arXiv.1312.6199.
[16] S. Sun, G. Zhang, J. Shi and R. Grosse, Functional variational Bayesian neural networks, (2019). arXiv preprint arXiv:1903.05779. https://doi.org/10.48550/arXiv.1903.05779.
[17] S. M. Taheri, Statistics and artificial neural networks, In Proceedings of the 8th Iranian Statistics Conference, Shiraz University, (2006) 81–91. [In Persian]
[18] M. N. Tran, T. N. Nguyen and V. H. Dao, A practical tutorial on variational Bayes, (2021) 43 pp. arXiv preprint arXiv:2103.01327. https://doi.org/10.48550/arXiv.2103.01327.
[19] M. J. Zaki and W. Meira Jr, Data Mining and Machine Learning: Fundamental Concepts and Algorithms, Cambridge University Press, 2020.