UGC Approved Journal no 63975

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 8 Issue 7
July-2021
eISSN: 2349-5162

7.95 impact factor calculated by Google scholar

Unique Identifier

Published Paper ID:
JETIR2107479


Registration ID:
312733

Page Number

d690-d695

Title

DATA CRAWLERS TO COLLECT DATA

Abstract

A data crawler, also known as a spider or web crawler, is an Internet bot that methodically browses the World Wide Web, typically to build search engine indices. Companies such as Google and Facebook routinely use web crawling to acquire data. Web crawling mostly refers to downloading and storing the contents of a large number of websites by following the links in web pages. Information on the web is frequently changed without warning or notification. A web crawler scans the internet for new or updated content. Users can follow hypertext links to locate the resources they need. The web crawling process consists of three steps. The spider first crawls specific pages on a website. It then indexes the words and content of the website, and finally it visits all of the site's hyperlinks. The large number of web pages available creates additional problems for web search engines and makes the results obtained less relevant to the analysers. Recent web crawling, on the other hand, has focused primarily on acquiring links to the relevant documents. Today, multiple techniques and software tools are used to crawl links from the web, but these links must be processed further before they can be used, which overloads the analyser. This project focuses on crawling the links and extracting the associated information in order to make processing easier for various purposes.
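The three steps described in the abstract (crawl a page, index its words, visit its hyperlinks) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `PageParser`, `crawl`, `http_fetch`, and the `max_pages` limit are assumptions introduced here for illustration, and the sketch uses only the Python standard library. Page downloading is passed in as a `fetch` function so the crawl logic can be tried without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
import re
import urllib.request


class PageParser(HTMLParser):
    """Collects text nodes and href targets from one HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Record the target of every <a href="..."> element.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Simplification: script and style text is not filtered out.
        self.text_parts.append(data)


def crawl(seed_url, fetch, max_pages=50):
    """Breadth-first crawl restricted to the seed's host.

    `fetch` is an assumed callable mapping a URL to an HTML string,
    or None if the page could not be retrieved.
    Returns (index, links): word -> set of URLs, and URL -> outgoing links.
    """
    site = urlparse(seed_url).netloc
    queue = deque([seed_url])
    seen = {seed_url}
    index = {}
    links = {}
    while queue and len(links) < max_pages:
        url = queue.popleft()
        html = fetch(url)  # Step 1: crawl (download) the page.
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        # Step 2: index the words found on the page.
        text = " ".join(parser.text_parts).lower()
        for word in re.findall(r"[a-z0-9]+", text):
            index.setdefault(word, set()).add(url)
        # Step 3: resolve the page's hyperlinks and queue unseen same-site ones.
        outgoing = [urldefrag(urljoin(url, href))[0] for href in parser.links]
        links[url] = outgoing
        for target in outgoing:
            if urlparse(target).netloc == site and target not in seen:
                seen.add(target)
                queue.append(target)
    return index, links


def http_fetch(url):
    """One possible `fetch`: download over HTTP, returning None on any I/O error."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        return None
```

Calling `crawl("http://example.test/", http_fetch)` would crawl a live site (the URL here is a placeholder); passing a dictionary's `get` method as `fetch` runs the same logic over in-memory pages. A production crawler would additionally need to honour `robots.txt` and rate-limit its requests, which this sketch omits.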

Key Words

scraping, scraper, crawler, data crawlers

Cite This Article

"DATA CRAWLERS TO COLLECT DATA", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 8, Issue 7, page no. d690-d695, July-2021, Available: http://www.jetir.org/papers/JETIR2107479.pdf

Publication Details

Published Paper ID: JETIR2107479
Registration ID: 312733
Published In: Volume 8 | Issue 7 | Year July-2021
DOI (Digital Object Identifier):
Page No: d690-d695
Country: Siddipet, Telangana, India
Area: Engineering
ISSN Number: 2349-5162

