{"id":12167,"date":"2022-04-20T10:34:46","date_gmt":"2022-04-20T07:34:46","guid":{"rendered":"http:\/\/journals.khnu.km.ua\/vestnik\/?p=12167"},"modified":"2022-04-20T10:34:46","modified_gmt":"2022-04-20T07:34:46","slug":"porivnyalnyj-analiz-efektyvnosti-nadbudov-selenium-ta-beautifulsoup","status":"publish","type":"post","link":"https:\/\/journals.khnu.km.ua\/vestnik\/?p=12167","title":{"rendered":"\u041f\u043e\u0440\u0456\u0432\u043d\u044f\u043b\u044c\u043d\u0438\u0439 \u0430\u043d\u0430\u043b\u0456\u0437 \u0435\u0444\u0435\u043a\u0442\u0438\u0432\u043d\u043e\u0441\u0442\u0456 \u043d\u0430\u0434\u0431\u0443\u0434\u043e\u0432 Selenium \u0442\u0430 Beautifulsoup"},"content":{"rendered":"<p><!--more--><\/p>\n<p style=\"text-align: center;\">\u041f\u041e\u0420\u0406\u0412\u041d\u042f\u041b\u042c\u041d\u0418\u0419 \u0410\u041d\u0410\u041b\u0406\u0417 \u0415\u0424\u0415\u041a\u0422\u0418\u0412\u041d\u041e\u0421\u0422\u0406 \u041d\u0410\u0414\u0411\u0423\u0414\u041e\u0412 SELENIUM \u0422\u0410 BEAUTIFULSOUP<\/p>\n<p style=\"text-align: center;\">COMPARATIVE ANALYSIS OF SELENIUM AND BEAUTIFULSOUP EFFICIENCY<\/p>\n<p><strong>\u0421\u0442\u043e\u0440\u0456\u043d\u043a\u0438: 50-52. \u041d\u043e\u043c\u0435\u0440: \u2116<\/strong><strong>1<\/strong><strong>, 202<\/strong><strong>2<\/strong><strong> (<\/strong><strong>305<\/strong><strong>)\u00a0<a href=\"http:\/\/journals.khnu.km.ua\/vestnik\/wp-content\/uploads\/2022\/04\/vknu-ts-2022-n1-305-50-52.pdf\"> <img loading=\"lazy\" class=\"size-full wp-image-69 alignnone\" src=\"http:\/\/journals.khnu.km.ua\/vestnik\/wp-content\/uploads\/2021\/01\/pdf.png\" alt=\"\" width=\"76\" height=\"32\" \/><\/a> <\/strong><br \/>\n<strong>\u00a0<\/strong><strong>\u0410\u0432\u0442\u043e\u0440\u0438:<\/strong><br \/>\n\u041a\u0420\u0418\u0412\u0415\u041d\u0427\u0423\u041a \u042e. 
\u041f.<br \/>\n<a href=\"https:\/\/orcid.org\/0000-0002-2504-5833\">https:\/\/orcid.org\/0000-0002-2504-5833<\/a><br \/>\ne-mail: <a href=\"mailto:Yurii.P.Kryvenchuk@lpnu.ua\">Yurii.P.Kryvenchuk@lpnu.ua<\/a><br \/>\n\u0411\u0423\u0420\u0410\u041a \u041c. \u0422.<br \/>\n<a href=\"https:\/\/orcid.org\/0000-0002-8979-3347\">https:\/\/orcid.org\/0000-0002-8979-3347<\/a><br \/>\ne-mail: <a href=\"mailto:burakmarko@gmail.com\">burakmarko@gmail.com<\/a><br \/>\n\u041d\u0430\u0446\u0456\u043e\u043d\u0430\u043b\u044c\u043d\u0438\u0439 \u0443\u043d\u0456\u0432\u0435\u0440\u0441\u0438\u0442\u0435\u0442 \u201c\u041b\u044c\u0432\u0456\u0432\u0441\u044c\u043a\u0430 \u043f\u043e\u043b\u0456\u0442\u0435\u0445\u043d\u0456\u043a\u0430\u201d<br \/>\nYurii KRYVENCHUK, Marko BURAK<br \/>\nLviv Polytechnic National University<br \/>\n<strong>DOI:<\/strong> <a href=\"https:\/\/www.doi.org\/10.31891\/2307-5732-2022-305-1-50-52\">https:\/\/www.doi.org\/10.31891\/2307-5732-2022-305-1-50-52<\/a><\/p>\n<p style=\"text-align: center;\"><strong>\u0410\u043d\u043e\u0442\u0430\u0446\u0456\u044f \u043c\u043e\u0432\u043e\u044e \u043e\u0440\u0438\u0433\u0456\u043d\u0430\u043b\u0443<\/strong><\/p>\n<p><strong>\u00a0<\/strong>\u041d\u0430 \u0441\u044c\u043e\u0433\u043e\u0434\u043d\u0456 \u043a\u0456\u043b\u044c\u043a\u0456\u0441\u0442\u044c \u0446\u0438\u0444\u0440\u043e\u0432\u043e\u0457 \u0456\u043d\u0444\u043e\u0440\u043c\u0430\u0446\u0456\u0457 \u0443 \u0432\u0441\u0435\u0441\u0432\u0456\u0442\u043d\u0456\u0439 \u043c\u0435\u0440\u0435\u0436\u0456 \u0437\u0431\u0456\u043b\u044c\u0448\u0443\u0454\u0442\u044c\u0441\u044f \u0435\u043a\u0441\u043f\u043e\u043d\u0435\u043d\u0446\u0456\u0430\u043b\u044c\u043d\u043e \u0437 \u043a\u043e\u0436\u043d\u0438\u043c \u0440\u043e\u043a\u043e\u043c. 
\u0422\u043e\u043c\u0443 \u0437\u0440\u0456\u0441 \u043f\u043e\u043f\u0438\u0442 \u043d\u0430 \u0430\u043d\u0430\u043b\u0456\u0437 \u0434\u0430\u043d\u0438\u0445 \u0437 \u0432\u0435\u0431\u0440\u0435\u0441\u0443\u0440\u0441\u0456\u0432. \u041f\u0440\u043e\u0442\u0435 \u0434\u043b\u044f \u043f\u0440\u043e\u0432\u0435\u0434\u0435\u043d\u043d\u044f \u043e\u043f\u0435\u0440\u0430\u0446\u0456\u0439 \u0437 \u0434\u0430\u043d\u0438\u043c\u0438 \u0457\u0445 \u043f\u043e\u0442\u0440\u0456\u0431\u043d\u043e \u0441\u043f\u043e\u0447\u0430\u0442\u043a\u0443 \u043e\u0442\u0440\u0438\u043c\u0430\u0442\u0438 \u0437 \u0434\u0436\u0435\u0440\u0435\u043b\u0430. \u0406\u0441\u043d\u0443\u0454 \u0434\u0443\u0436\u0435 \u0431\u0430\u0433\u0430\u0442\u043e \u0456\u043d\u0441\u0442\u0440\u0443\u043c\u0435\u043d\u0442\u0456\u0432, \u043d\u0430\u043f\u0438\u0441\u0430\u043d\u0438\u0445 \u043f\u0456\u0434 \u043c\u043e\u0432\u0443 python \u0434\u043b\u044f \u0440\u043e\u0431\u043e\u0442\u0438 \u0437 \u0432\u0438\u0434\u043e\u0431\u0443\u0442\u043a\u043e\u043c \u0456\u043d\u0444\u043e\u0440\u043c\u0430\u0446\u0456\u0457, \u0437 \u044f\u043a\u0438\u0445 selenium \u0442\u0430 beautifulSoup \u0432\u0432\u0430\u0436\u0430\u044e\u0442\u044c\u0441\u044f \u043d\u0430\u0439\u043f\u043e\u043f\u0443\u043b\u044f\u0440\u043d\u0456\u0448\u0438\u043c\u0438. 
\u041f\u043e\u043f\u0440\u0438 \u0442\u0435, \u0449\u043e \u043e\u0431\u0438\u0434\u0432\u0456 \u043d\u0430\u0434\u0431\u0443\u0434\u043e\u0432\u0438 \u043f\u0440\u0430\u0446\u044e\u044e\u0442\u044c \u0434\u043e\u0441\u0438\u0442\u044c \u0434\u043e\u0431\u0440\u0435,\u00a0 \u0440\u043e\u0437\u0440\u043e\u0431\u043d\u0438\u043a\u0438 \u043f\u0440\u0438\u043a\u043b\u0430\u0434\u043d\u043e\u0433\u043e \u043f\u0440\u043e\u0433\u0440\u0430\u043c\u043d\u043e\u0433\u043e \u0437\u0430\u0431\u0435\u0437\u043f\u0435\u0447\u0435\u043d\u043d\u044f \u043d\u0430\u043c\u0430\u0433\u0430\u044e\u0442\u044c\u0441\u044f \u043e\u0431\u0440\u0430\u0442\u0438 \u043d\u0430\u0439\u0431\u0456\u043b\u044c\u0448 \u043e\u043f\u0442\u0438\u043c\u0430\u043b\u044c\u043d\u0443 \u0437 \u043d\u0438\u0445. \u0422\u043e\u043c\u0443 \u0432\u0438\u043d\u0438\u043a\u043b\u0430 \u043d\u0435\u043e\u0431\u0445\u0456\u0434\u043d\u0456\u0441\u0442\u044c \u043f\u0435\u0440\u0435\u0432\u0456\u0440\u043a\u0438 \u0446\u0438\u0445 \u0434\u0432\u043e\u0445 \u043f\u0430\u043a\u0435\u0442\u0456\u0432 \u043d\u0430 \u0435\u0444\u0435\u043a\u0442\u0438\u0432\u043d\u0456\u0441\u0442\u044c. \u0423 \u0440\u043e\u0431\u043e\u0442\u0456 \u0440\u043e\u0437\u0433\u043b\u044f\u043d\u0443\u0442\u043e \u0442\u0440\u0438\u0432\u0430\u043b\u0456\u0441\u0442\u044c \u0440\u043e\u0431\u043e\u0442\u0438 \u043f\u0430\u0440\u0441\u0435\u0440\u0456\u0432 \u0434\u043b\u044f \u043f\u043e\u0448\u0443\u043a\u0443 \u0442\u0435\u0433\u0456\u0432 \u043d\u0430 \u0432\u0435\u0431\u0441\u0442\u043e\u0440\u0456\u043d\u0446\u0456 \u0437\u0430 \u0434\u043e\u043f\u043e\u043c\u043e\u0433\u043e\u044e \u0440\u0456\u0437\u043d\u0438\u0445 \u043c\u0435\u0442\u043e\u0434\u0456\u0432 \u0442\u0430 \u043f\u043b\u0430\u0442\u0444\u043e\u0440\u043c. 
\u0414\u043e\u0441\u043b\u0456\u0434\u0436\u0435\u043d\u043d\u044f \u043f\u0440\u043e\u0432\u0435\u0434\u0435\u043d\u043e \u043d\u0430 \u043e\u0441\u043d\u043e\u0432\u0456 \u043e\u043d\u043b\u0430\u0439\u043d-\u043f\u043b\u0430\u0442\u0444\u043e\u0440\u043c \u0434\u043b\u044f \u043f\u0440\u043e\u0434\u0430\u0436\u0443 \u0442\u043e\u0432\u0430\u0440\u0456\u0432. \u0420\u0435\u0437\u0443\u043b\u044c\u0442\u0430\u0442\u0438 \u043f\u043e\u043a\u0430\u0437\u0430\u043b\u0438, \u044f\u043a\u0456 \u0441\u0430\u043c\u0435 \u0456\u043d\u0441\u0442\u0440\u0443\u043c\u0435\u043d\u0442\u0438 \u0442\u0430 \u0444\u0443\u043d\u043a\u0446\u0456\u0457 \u043d\u0430\u0439\u043a\u0440\u0430\u0449\u0435 \u0432\u0438\u043a\u043e\u0440\u0438\u0441\u0442\u043e\u0432\u0443\u0432\u0430\u0442\u0438 \u0434\u043b\u044f \u0437\u043d\u0430\u0445\u043e\u0434\u0436\u0435\u043d\u043d\u044f \u0442\u043e\u0432\u0430\u0440\u0443 \u043d\u0430 \u0456\u043d\u0442\u0435\u0440\u043d\u0435\u0442-\u043c\u0430\u0433\u0430\u0437\u0438\u043d\u0430\u0445.<br \/>\n<strong>\u041a\u043b\u044e\u0447\u043e\u0432\u0456 \u0441\u043b\u043e\u0432\u0430:<\/strong> \u043f\u0430\u0440\u0441\u0435\u0440, selenium, beautifulSoup, python, \u0456\u043d\u0442\u0435\u0440\u043d\u0435\u0442-\u043c\u0430\u0433\u0430\u0437\u0438\u043d, \u0442\u0435\u0433, \u0432\u0435\u0431\u0441\u0442\u043e\u0440\u0456\u043d\u043a\u0430, \u043f\u043e\u0448\u0443\u043a.<\/p>\n<p style=\"text-align: center;\"><strong>\u0420\u043e\u0437\u0448\u0438\u0440\u0435\u043d\u0430 \u0430\u043d\u043e\u0442\u0430\u0446\u0456\u044f \u0430\u043d\u0433\u043b\u0456\u0439\u0441\u044c\u043a\u043e\u044e \u00a0\u043c\u043e\u0432\u043e\u044e<\/strong><\/p>\n<p>Nowadays, the amount of digital information on the World Wide Web is growing exponentially every year. Therefore, the demand for data analysis from web resources has increased. However, to perform data operations, information must first be obtained from the source. 
Today almost every popular programming language has at least one library for web scraping and extracting data from websites, although some of these libraries are hard to use or incompatible with the language of the project for which the data is intended. Therefore, many developers use Python as the main tool for such projects. It can be used to build almost any platform and to communicate with the parsers within a project. The language is also easy to use and has a large community. There are many Python-based tools for data mining, of which selenium and beautifulSoup are considered the most popular. Although both add-ons work quite well, developers strive to choose the better-suited one. Thus, there is a need to test these two packages for efficiency.<br \/>\nThe paper examines how long the parsers take to find tags on a web page using different methods and platforms. The study was conducted on online platforms for the sale of goods. The results showed which tools and functions are the best choices for finding products in online stores. The object of analysis was the website \u201cRozetka\u201d, the biggest and most popular online store in Ukraine. The article describes the advantages and disadvantages of these libraries, especially for scraping data from online stores. To analyze the add-ons, a special program was created that opens the website in a browser, finds the search bar, enters the name of the desired product, and then performs a product search using various methods of these libraries. The duration of each search was recorded. 
The results showed that beautifulSoup generally finds tags faster than selenium; however, for searching and scraping online stores, selenium can perform better and is more suitable.<br \/>\n<strong>Keywords:<\/strong> web scraper, selenium, beautifulSoup, python, online store, tag, web page, search.<\/p>\n<p style=\"text-align: center;\"><strong>\u041b\u0456\u0442\u0435\u0440\u0430\u0442\u0443\u0440\u0430<\/strong><\/p>\n<ol>\n<li>\u041f\u043e\u0440\u0456\u0432\u043d\u044f\u043d\u043d\u044f \u043c\u0456\u0436 Selenium \u0442\u0430 BeautifulSoup: \u044f\u043a\u0438\u0439 \u043d\u0430\u0439\u043a\u0440\u0430\u0449\u0438\u0439? Limeproxies. URL: https:\/\/limeproxies.netlify.com\/blog\/selenium-vs-beautifulsoup<\/li>\n<li>\u0410\u043d\u0434\u0440\u0430\u0434\u0435 \u0424. \u0412\u0435\u0431-\u043f\u0430\u0440\u0441\u0438\u043d\u0433 \u0437\u0430 \u0434\u043e\u043f\u043e\u043c\u043e\u0433\u043e\u044e beautifulSoup, Selenium \u0447\u0438 Scrapy? Medium. 2021. URL: https:\/\/towardsdatascience.com\/web-scraping-with-beautiful-soup-selenium-or-scrapy-62c6f3545de7<\/li>\n<li>\u0411\u0445\u0430\u0442\u0430\u0447\u0430\u0440\u0456\u044f \u0421. \u041f\u0430\u0440\u0441\u0438\u043d\u0433 \u0441\u0430\u0439\u0442\u0456\u0432 \u0435\u043b\u0435\u043a\u0442\u0440\u043e\u043d\u043d\u043e\u0457 \u043a\u043e\u043c\u0435\u0440\u0446\u0456\u0457 \u0437\u0430 \u0434\u043e\u043f\u043e\u043c\u043e\u0433\u043e\u044e Selenium \u0442\u0430 Python. Analytics Vidhya. 2020. 
URL: https:\/\/medium.com\/analytics-vidhya\/web-scraping-e-commerce-sites-using-selenium-python-55fd980fe2fc<\/li>\n<li>\u0406\u043d\u0442\u0435\u0440\u043d\u0435\u0442-\u043c\u0430\u0433\u0430\u0437\u0438\u043d ROZETKATM: \u043e\u0444\u0456\u0446\u0456\u0439\u043d\u0438\u0439 \u0441\u0430\u0439\u0442 \u043d\u0430\u0439\u043f\u043e\u043f\u0443\u043b\u044f\u0440\u043d\u0456\u0448\u043e\u0433\u043e \u043e\u043d\u043b\u0430\u0439\u043d-\u0433\u0456\u043f\u0435\u0440\u043c\u0430\u0440\u043a\u0435\u0442\u0443 \u0432 \u0423\u043a\u0440\u0430\u0457\u043d\u0456. URL: https:\/\/rozetka.com.ua\/ua\/<\/li>\n<li>\u041a\u0445\u0434\u0435\u0440 \u041c\u0410. \u041f\u0430\u0440\u0441\u0438\u043d\u0433 \u0430\u0431\u043e \u0432\u0435\u0431-\u0441\u043a\u0430\u043d\u0443\u0432\u0430\u043d\u043d\u044f: \u0421\u0443\u0447\u0430\u0441\u043d\u0438\u0439 \u0441\u0442\u0430\u043d, \u0442\u0435\u0445\u043d\u0456\u043a\u0438, \u043f\u0456\u0434\u0445\u043e\u0434\u0438 \u0442\u0430 \u0437\u0430\u0441\u0442\u043e\u0441\u0443\u0432\u0430\u043d\u043d\u044f. International Journal of Advances in Soft Computing and its Applications. 2021; 13(3):144\u201368.<\/li>\n<li>\u0422\u043e\u043c\u0430\u0441 \u0414\u041c, \u041c\u0430\u0442\u0443\u0440 \u0421. \u0414\u0430\u0442\u0430 \u0430\u043d\u0430\u043b\u0456\u0437 \u0434\u043b\u044f \u043f\u0430\u0440\u0441\u0438\u043d\u0433\u0443 \u0437 \u0432\u0438\u043a\u043e\u0440\u0438\u0441\u0442\u0430\u043d\u043d\u044f\u043c python. 2019. \u0441. 450\u20134.<\/li>\n<li>\u041c\u0430\u043a\u0425\u0435\u043d\u043b\u0456 \u0420. \u041d\u0430\u0432\u0447\u0430\u043d\u043d\u044f: \u0422\u0435\u043a\u0441\u0442\u043e\u0432\u0430 \u0430\u043d\u0430\u043b\u0456\u0442\u0438\u043a\u0430 \u0434\u043b\u044f \u043c\u043e\u0434\u0435\u043b\u044e\u0432\u0430\u043d\u043d\u044f \u0437\u0430 \u0434\u043e\u043f\u043e\u043c\u043e\u0433\u043e\u044e python. 2021. 
\u0441.\u00a068\u201382.<\/li>\n<\/ol>\n<p style=\"text-align: center;\"><strong>References<\/strong><\/p>\n<ol>\n<li>Comparison Between Selenium vs BeautifulSoup: Which Is the Best One? Limeproxies. URL: https:\/\/limeproxies.netlify.com\/blog\/selenium-vs-beautifulsoup<\/li>\n<li>Andrade F. Web Scraping with Beautiful Soup, Selenium or Scrapy? Medium. 2021. URL: https:\/\/towardsdatascience.com\/web-scraping-with-beautiful-soup-selenium-or-scrapy-62c6f3545de7<\/li>\n<li>Bhattacharya C. Web Scraping E-commerce sites using Selenium &amp; Python. Analytics Vidhya. 2020. URL: https:\/\/medium.com\/analytics-vidhya\/web-scraping-e-commerce-sites-using-selenium-python-55fd980fe2fc<\/li>\n<li>Online store ROZETKA: the official site of the most popular online hypermarket in Ukraine. URL: https:\/\/rozetka.com.ua\/ua\/<\/li>\n<li>Khder MA. Web scraping or web crawling: State of art, techniques, approaches and application. International Journal of Advances in Soft Computing and its Applications. 2021; 13(3):144\u201368.<\/li>\n<li>Thomas DM, Mathur S. Data Analysis by Web Scraping using Python. 2019. p. 450\u20134.<\/li>\n<li>McHaney R. Tutorial: Text analytics for simulation with python. 2021. p. 
68\u201382.<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[62],"tags":[],"_links":{"self":[{"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/posts\/12167"}],"collection":[{"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=12167"}],"version-history":[{"count":1,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/posts\/12167\/revisions"}],"predecessor-version":[{"id":12169,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=\/wp\/v2\/posts\/12167\/revisions\/12169"}],"wp:attachment":[{"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=12167"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=12167"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/journals.khnu.km.ua\/vestnik\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=12167"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}