ISSN: 2349-5162 | Impact Factor: 5.87




Published in:

Volume 2 Issue 4
April-2015
eISSN: 2349-5162

Unique Identifier

JETIR1504084

Page Number

1292-1296


Title

A Survey on Text Mining and Sentiment Analysis for Unstructured Web Data

Abstract

Unstructured data is information that has no pre-defined data model. It is typically textual, but may also contain numerical data and factual details. The result is data that is obscure, irregular and ambiguous, which makes it difficult to analyse by conventional computing means. Much of the data on the web, in the form of blogs, news and social media content, is unstructured, yet it is a potentially vast source of information if processed efficiently. This paper discusses the basics of harnessing unstructured data from the web and the techniques used to process it. Web crawling, text mining and natural language processing (NLP) are discussed in brief to outline how web data is processed and analysed. Sentiment analysis, a major aspect of present-day NLP, is also described, along with the issues involved in mining data from Twitter, which has emerged as one of the most important data sources for NLP in recent years. The paper concludes with a brief outline of the uses of web data mining and analysis and the potential for future growth in the field.
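As a rough illustration of the kind of processing the abstract outlines, the sketch below cleans a short piece of web text, tokenises it, counts terms and assigns a sentiment label from a word list. It is a minimal sketch and not the method of the paper: the lexicons, the stopword list and the function names `tokenize`, `term_frequencies` and `sentiment` are all hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical toy word lists, for illustration only (not from the paper).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}
STOPWORDS = {"the", "a", "an", "is", "it", "this", "and", "but", "i", "to", "of"}


def tokenize(text):
    """Lowercase the text, drop URLs and @mentions, and return word tokens."""
    cleaned = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", cleaned)


def term_frequencies(tokens):
    """Count the tokens that are not in the stopword list."""
    return Counter(t for t in tokens if t not in STOPWORDS)


def sentiment(text):
    """Label text by counting positive and negative lexicon hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    post = "I love this phone, the camera is great! http://t.co/x"
    print(term_frequencies(tokenize(post)).most_common(3))
    print(sentiment(post))
```

A word-count approach like this ignores negation, sarcasm and context, which is one reason sentiment analysis of short, informal text such as tweets is usually treated as a harder problem than the sketch suggests.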

Key Words

Data Mining, Natural Language Processing (NLP), Sentiment Analysis, Text Mining, Web Crawling

Cite This Article

"A Survey on Text Mining and Sentiment Analysis for Unstructured Web Data", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 2, Issue 4, pp. 1292-1296, April 2015. Available: http://www.jetir.org/papers/JETIR1504084.pdf

Publication Details

Published Paper ID: JETIR1504084
Registration ID: 150376
Published In: Volume 2 | Issue 4 | Year April-2015
Page No: 1292-1296
ISSN Number: 2349-5162

