Application of a genetic algorithm to search for the optimal convolutional neural network architecture with weight distribution – Вісник Хмельницького національного університету

APPLICATION OF A GENETIC ALGORITHM TO SEARCH FOR THE OPTIMAL CONVOLUTIONAL NEURAL NETWORK ARCHITECTURE WITH WEIGHT DISTRIBUTION

ЗАСТОСУВАННЯ ГЕНЕТИЧНОГО АЛГОРИТМУ ДЛЯ ПОШУКУ ОПТИМАЛЬНОЇ АРХІТЕКТУРИ ЗГОРТКОВОЇ НЕЙРОННОЇ МЕРЕЖІ З РОЗПОДІЛЕННЯМ ВАГ

Pages: 7-11. Issue: No. 1, 2020 (281)
Authors:
P.M. RADIUK
Khmelnytskyi National University
П.М. РАДЮК
Хмельницький національний університет
DOI: https://www.doi.org/10.31891/2307-5732-2020-281-1-7-11
Peer review: 04.01.2020
Printed: 14.02.2020

Abstract in the original language

Over the past decade, a line of neural network research known as neural architecture search has produced notable results in the design of architectures for image segmentation and classification. Despite this success, architecture search remains an open and pressing problem, and it is also highly computationally expensive. This work proposes a new approach, based on a genetic algorithm, to searching for the optimal convolutional neural network architecture. We integrate a genetic algorithm with standard stochastic gradient descent in a way that implements weight distribution across all candidate architectures. The genetic algorithm designs a sub-graph of a convolution cell that maximises accuracy on the validation set. We evaluate the approach on the CIFAR-10 and CIFAR-100 datasets, where it reaches final accuracies of 93.21% and 78.89%, respectively. The main scientific contribution of this work is the combination of a genetic algorithm with weight distribution in architecture search, which achieves results close to the state of the art on a single GPU.
Keywords: convolutional neural networks, genetic algorithms, weight distribution, ablation study.
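
The abstract does not give the cell encoding, the genetic operators, or the training schedule, so the following is only a minimal sketch of the general idea under stated assumptions. It reads "weight distribution" as all candidate architectures drawing on one common pool of weights, which is an interpretation rather than something the abstract states. The operation set `OPS`, the cell size `NUM_NODES`, the `TOY_QUALITY` table, and every function name are hypothetical. A single float stands in for each weight tensor, and a toy score stands in for validation accuracy.

```python
import random

OPS = ["conv3x3", "conv5x5", "maxpool3x3", "identity"]  # hypothetical operation set
NUM_NODES = 4                                            # hypothetical cell size

# Toy "true quality" of each operation. A real search would not know this and
# would measure validation accuracy instead.
TOY_QUALITY = {"conv3x3": 0.9, "conv5x5": 0.8, "maxpool3x3": 0.5, "identity": 0.3}


def make_shared_weights():
    """One weight per (node, operation) pair, reused by every architecture."""
    return {(n, op): 0.0 for n in range(NUM_NODES) for op in OPS}


def sgd_step(arch, weights, lr=0.5):
    """Stand-in for an SGD step: update only the weights this architecture uses."""
    for node, op in enumerate(arch):
        grad = weights[(node, op)] - TOY_QUALITY[op]  # d/dw of 0.5 * (w - q)^2
        weights[(node, op)] -= lr * grad


def fitness(arch, weights):
    """Stand-in for validation accuracy, computed from the shared weights."""
    return sum(weights[(n, op)] for n, op in enumerate(arch)) / NUM_NODES


def crossover(a, b, rng):
    """One-point crossover of two parent encodings."""
    cut = rng.randrange(1, NUM_NODES)
    return a[:cut] + b[cut:]


def mutate(arch, rng, rate=0.2):
    """Replace each gene by a random operation with probability `rate`."""
    return [rng.choice(OPS) if rng.random() < rate else op for op in arch]


def search(generations=20, pop_size=8, seed=0):
    rng = random.Random(seed)
    weights = make_shared_weights()
    population = [[rng.choice(OPS) for _ in range(NUM_NODES)] for _ in range(pop_size)]
    for _ in range(generations):
        for arch in population:
            sgd_step(arch, weights)        # later candidates reuse these weights
        ranked = sorted(population, key=lambda a: fitness(a, weights), reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = max(population, key=lambda a: fitness(a, weights))
    return best, fitness(best, weights), weights
```

Calling `best, score, _ = search()` returns the highest-scoring encoding in the final population together with its toy score. Because children are evaluated with weights already trained by earlier candidates, no architecture is trained from scratch, which is the kind of saving that could make a single-GPU search plausible.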

Extended abstract in English

Over the past decade, a new direction in neural network research called neural architecture search has shown positive results in designing architectures for image segmentation and classification. Despite the considerable success of architecture search in image segmentation and classification tasks, it remains an unresolved and pressing problem. Moreover, neural architecture search is also very expensive in terms of computational resources. This work proposes a new approach based on a genetic algorithm to search for the optimal convolutional neural network architecture. We integrated a genetic algorithm with standard stochastic gradient descent, which implements weight distribution across all architecture solutions. The approach uses a genetic algorithm to design a part of the graph as a convolutional layer that provides maximum accuracy on the validation dataset. In this work, we demonstrate the effectiveness of our approach on the CIFAR-10 and CIFAR-100 datasets, with final accuracies of 93.21% and 78.89%, respectively. The main scientific contribution of our work is the combination of a genetic algorithm with weight distribution in architecture search tasks, which achieves image classification accuracy close to state-of-the-art results using a single GPU.
Keywords: convolutional neural networks, genetic algorithms, weight distribution, ablation study.
