UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014

Published in:

Volume 9 Issue 6
June-2022
eISSN: 2349-5162

Impact factor: 7.95 (calculated by Google Scholar)

Unique Identifier

Published Paper ID:
JETIRFM06063


Registration ID:
403688

Page Number

353-357

Title

EFFICIENT SCRAPING OF DATA FROM WEBSITES USING SELENIUM

Abstract

The Internet is an ocean of information spread across many websites, where it is categorized, interlinked, and mostly freely available to everyone. A vast amount of data is created every second, and this ‘Big Data’ arrives in heterogeneous formats that must nevertheless be accessed quickly. Data can be extracted manually, but doing so is time-consuming and can be very complicated, which is why Web Scraping is used. Web Scraping is the technique of automating the process of navigating through links and collecting the relevant data from the pages they lead to. The proposed system is a method for the targeted, automated extraction and restructuring of information from web pages. It acquires non-tabular or poorly structured data from websites and converts it into a usable structured format. Its main objective is to extract information from one or many websites and process it into simple structures such as CSV files. The system uses the Text Grepping technique, and the data it extracts offers insight into prices, market dynamics, prevailing trends, the practices employed by competitors, and the challenges they face. The result is easy access to relevant data from websites. The proposed system can be modified to scrape dynamic websites, and it is expected to be beneficial in many areas of business and education.
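
As an illustration of the pipeline the abstract describes (text grepping over page markup, followed by export to CSV), here is a minimal Python sketch. It is not the authors' implementation. The sample markup, the `Rs.` price format, the regular expression, and the names `grep_products` and `to_csv` are all assumptions made for this example. To keep it runnable without a browser, the page is a hard-coded string; with Selenium, that string would typically be read from `driver.page_source`.

```python
import csv
import io
import re

# Hypothetical sample markup standing in for a fetched page. With Selenium,
# the same kind of string would typically come from `driver.page_source`.
SAMPLE_HTML = """
<div class="product"><h2>Blue Kettle</h2><span class="price">Rs. 1,299.00</span></div>
<div class="product"><h2>Steel Toaster</h2><span class="price">Rs. 2,450.50</span></div>
"""

# Assumed page layout: a product name in <h2>, followed by a price in a
# span with class "price". Real pages need their own pattern.
PRODUCT_RE = re.compile(
    r'<h2>(?P<name>.*?)</h2>\s*'
    r'<span class="price">Rs\.\s*(?P<price>[\d,]+(?:\.\d+)?)</span>',
    re.DOTALL,
)


def grep_products(html):
    """Return (name, price) tuples for every pattern match in the raw HTML."""
    rows = []
    for match in PRODUCT_RE.finditer(html):
        # Drop thousands separators before converting the price to a number.
        price = float(match.group("price").replace(",", ""))
        rows.append((match.group("name").strip(), price))
    return rows


def to_csv(rows):
    """Serialise the rows to CSV text, starting with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "price"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = grep_products(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

Pattern matching on raw text is brittle when the markup changes, so a sketch like this fits best where the page layout is stable and simple.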

Key Words

Web scraping, Big Data, CSV file, Structured and Unstructured data

Cite This Article

"EFFICIENT SCRAPING OF DATA FROM WEBSITES USING SELENIUM", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 9, Issue 6, pp. 353-357, June 2022. Available: http://www.jetir.org/papers/JETIRFM06063.pdf

Publication Details

Published Paper ID: JETIRFM06063
Registration ID: 403688
Published In: Volume 9 | Issue 6 | Year June-2022
DOI (Digital Object Identifier):
Page No: 353-357
Country: India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

