A Gradient-Based Method Using Deep Neural Network Kernels for Generating Texture Images
Mohammad Hossein Shakoor
Faculty of Engineering, Department of Computer Engineering, Arak University, Arak, Iran
Keywords: image generation, data generation, gradient ascent, convolutional neural network.
Abstract:
Generating image databases is one of the necessities of machine vision. Various methods exist for augmenting image data, such as rotation, changes in brightness and viewing angle, and resizing. The drawback of these methods is that the generated images are very similar to the originals, which is not enough to prevent overfitting; generative adversarial networks suffer from the same problem. Among all types of images, texture images pose greater challenges, because textures cannot be photographed as easily as ordinary scenes: suitable textures are not available everywhere, and not every image contains texture. Deep neural networks today require large volumes of training data, and no large-scale texture dataset currently exists. The proposed method makes it possible to build such large texture datasets. In this paper, new textures are generated from the convolution coefficients of pretrained deep networks: applying gradient ascent to the images produced by the convolution filters synthesizes new texture images artificially. Unlike generative methods, this approach requires no initial images; instead, a new texture class is created. Once a new texture has been generated, its count is increased with image-processing methods. The method is 3 to 5 times faster than several well-known generative networks, and the image quality is considerably better. As a sample, a texture database of 2400 images in 80 classes has been produced and uploaded to the Kaggle site.
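The core generation step lends itself to a short illustration. Below is a minimal sketch of gradient-ascent texture synthesis from the convolution kernels of a pretrained network, assuming a Keras VGG16 backbone; the layer name, step count, and learning rate are illustrative assumptions, not the exact configuration used in the paper.

```python
# Minimal sketch: synthesize a texture by gradient ascent on one
# convolution filter of a pretrained network. Model (VGG16), layer,
# and hyperparameters are illustrative assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import VGG16

# Pretrained network whose convolution kernels seed the textures.
model = VGG16(weights="imagenet", include_top=False)
layer = model.get_layer("block4_conv1")  # assumed layer choice
feature_extractor = tf.keras.Model(model.input, layer.output)

def generate_texture(filter_index, size=224, steps=100, lr=10.0):
    """Maximize the mean activation of one filter via gradient ascent,
    starting from low-amplitude noise; the result is a texture image."""
    img = tf.Variable(tf.random.uniform((1, size, size, 3)) * 0.25 + 0.4)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            activation = feature_extractor(img)
            # Trim the borders to avoid edge artifacts, then average.
            loss = tf.reduce_mean(activation[:, 2:-2, 2:-2, filter_index])
        grad = tape.gradient(loss, img)
        grad = tf.math.l2_normalize(grad)  # stabilize the step size
        img.assign_add(lr * grad)
    # Rescale to a displayable 8-bit image.
    x = img.numpy()[0]
    x = (x - x.mean()) / (x.std() + 1e-5) * 0.15 + 0.5
    return np.clip(x * 255, 0, 255).astype("uint8")

texture = generate_texture(filter_index=0)  # one new texture class
```

Each filter index (and each layer) drives the ascent toward a different pattern, which is how distinct texture classes can be produced without any initial images.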
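The second stage, expanding each generated texture into many samples, can be sketched with ordinary image-processing operations. The particular transforms below (90-degree rotations, mirroring, a brightness shift) are illustrative assumptions rather than the paper's fixed recipe; `texture` is the output of the synthesis sketch above.

```python
# Minimal sketch: expand one generated texture into several samples
# with classic image-processing augmentations (assumed transforms).
import numpy as np

def augment_texture(texture, n_rotations=4):
    """Return a list of augmented copies of an (H, W, 3) uint8 texture."""
    samples = []
    for k in range(n_rotations):            # 0, 90, 180, 270 degrees
        rotated = np.rot90(texture, k)
        samples.append(rotated)
        samples.append(np.fliplr(rotated))  # horizontal mirror
        # Brightness shift, computed in int16 to avoid uint8 overflow.
        brighter = np.clip(rotated.astype(np.int16) + 25, 0, 255)
        samples.append(brighter.astype(np.uint8))
    return samples

dataset = augment_texture(texture)  # 12 variants from one texture
```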