Improvisation of Web Crawling on Web Maps Using Distributed Algorithm with Socket Programming-Based Coordination Model

  • Muhammad Ridho Rizqillah, Universitas Negeri Jakarta
  • Muhammad Eka Suryana, Universitas Negeri Jakarta
  • Med Irzal, Universitas Negeri Jakarta

Abstract

Search engines require large amounts of data to operate well, and gathering that data requires crawling at massive scale. This research aims to improve crawling efficiency and effectiveness by implementing a distributed crawler that uses sockets as the communication medium between devices. The distributed architecture of the crawler system consists of a tracker, a manager, and a client. In the final results, two crawlers working together acquired 30% more data than individual crawlers, with no duplicated data.
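
As an illustration of socket-based coordination without duplicated data, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how the tracker, manager, and client divide their responsibilities, so the sketch assumes a single coordinator that owns the URL frontier and two crawler workers that talk to it over sockets. All names (`Coordinator`, `crawl_worker`, `run_demo`), the JSON-lines message format, and the in-memory `fake_web` link graph that stands in for real page fetching are hypothetical.

```python
import json
import socket
import threading
from collections import deque


class Coordinator:
    """Hypothetical coordinator: hands each URL to exactly one crawler."""

    def __init__(self, seed_urls):
        self._frontier = deque(seed_urls)  # URLs waiting to be crawled
        self._seen = set(seed_urls)        # every URL ever queued
        self._lock = threading.Lock()

    def next_url(self):
        with self._lock:
            return self._frontier.popleft() if self._frontier else None

    def add_urls(self, urls):
        with self._lock:
            for url in urls:
                if url not in self._seen:  # deduplicate before queueing
                    self._seen.add(url)
                    self._frontier.append(url)

    def serve(self, conn):
        # Assumed protocol: one JSON object per line, either
        # {"op": "get"} or {"op": "found", "urls": [...]}.
        with conn, conn.makefile("r", encoding="utf-8") as reader, \
                conn.makefile("w", encoding="utf-8") as writer:
            for line in reader:  # loop ends when the crawler disconnects
                msg = json.loads(line)
                if msg["op"] == "found":
                    self.add_urls(msg["urls"])
                    reply = {"ok": True}
                else:
                    reply = {"url": self.next_url()}
                writer.write(json.dumps(reply) + "\n")
                writer.flush()


def crawl_worker(conn, fake_web, collected):
    """Hypothetical crawler: asks for a URL, 'fetches' it, reports its links."""
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:

        def ask(msg):
            writer.write(json.dumps(msg) + "\n")
            writer.flush()
            return json.loads(reader.readline())

        while True:
            url = ask({"op": "get"})["url"]
            if url is None:
                # Simplification: stop as soon as the frontier looks empty,
                # even if another crawler is still working.
                break
            collected.append(url)
            # fake_web replaces a real HTTP fetch and link extraction.
            ask({"op": "found", "urls": fake_web.get(url, [])})


def run_demo():
    fake_web = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
    coordinator = Coordinator(["a"])
    collected = []
    threads = []
    for _ in range(2):  # two crawlers, as in the reported experiment
        server_end, client_end = socket.socketpair()
        threads.append(threading.Thread(
            target=coordinator.serve, args=(server_end,)))
        threads.append(threading.Thread(
            target=crawl_worker, args=(client_end, fake_web, collected)))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return collected
```

Calling `run_demo()` returns the list of URLs crawled by both workers combined. Because only the coordinator decides which URL goes to which crawler, each page is visited once no matter how the work is split between them. A real deployment would replace `socket.socketpair()` with TCP connections between devices and would need a more careful termination rule than the early exit used here.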

Published
2024-06-28