Identifying spam SMS using Apache Spark MLlib

Atanu Ghosh; Mr. Ajit Kumar Pasayat

Volume 5 Issue 5
May-2018
eISSN: 2349-5162

7.95 impact factor, calculated by Google Scholar

Published Paper ID:
JETIR1805753

Registration ID:
182952

Short Message Service (SMS) usage has grown enormously because it is flexible and makes communication effortless for mobile phone users. At the same time, the increasing popularity of SMS and the falling cost of messaging have made advertisers more interested in sending unsolicited commercial advertisements (spam) to users. Spam messages are annoying, and many people do not want to receive them. Many methods are available to detect spam messages, and classifiers based on Naïve Bayes, Support Vector Machines and many other machine learning algorithms have already been used. In this paper we compare different classifiers from the Apache Spark MLlib library in terms of accuracy and runtime. We used Logistic Regression with L-BFGS (Limited-memory Broyden–Fletcher–Goldfarb–Shanno), Naïve Bayes, Decision Tree and Gradient Boosted Trees to classify the feature vectors. Besides comparing the classifiers, we also describe the most important features used as input to the Decision Tree, Gradient Boosted Trees, Naïve Bayes and Logistic Regression with L-BFGS classifiers. The features that help most in detecting spam SMS are the existence of a URL and the number of digits present in an SMS. The experiment used the SMS spam dataset provided by the UCI Machine Learning Repository. The dataset was split into two parts: 80% of the data was used for training and 20% for testing. The experiments show that Naïve Bayes reached the best accuracy faster than the other algorithms: it took 3.16 seconds to achieve 95% accuracy on the test data.
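As an illustration of the two features the abstract singles out (the existence of a URL and the number of digits) and of the 80/20 split, the following is a minimal sketch in plain Python. It is not the authors' code and does not use Spark: the URL pattern, the function names `extract_features` and `train_test_split`, the random seed, and the sample messages are all assumptions made for this example. In the setting described above, feature vectors like these would be passed to the Spark MLlib classifiers.

```python
import random
import re

# Hypothetical pattern: the abstract does not say how URLs were detected.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def extract_features(message):
    """Return [has_url, digit_count] for one SMS message."""
    has_url = 1.0 if URL_PATTERN.search(message) else 0.0
    digit_count = float(sum(ch.isdigit() for ch in message))
    return [has_url, digit_count]


def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows and split it into (train, test) parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up messages for illustration only, not rows from the UCI dataset.
samples = [
    ("spam", "WIN a prize! Call 09061701461 or visit www.example.com now"),
    ("ham", "Are we still meeting for lunch today?"),
    ("spam", "Claim 500 cash at http://example.com/claim code 4471"),
    ("ham", "I'll be home by 7"),
    ("ham", "Thanks, see you tomorrow"),
]

# Pair each label with its two-element feature vector, then split 80/20.
labeled = [(label, extract_features(text)) for label, text in samples]
train, test = train_test_split(labeled)
```

With five sample messages, the split yields four training rows and one test row. Which rows land in each part depends on the seed, which is fixed here only to make the sketch repeatable.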

SMS Spam, Naïve Bayes, Decision Tree, Logistic Regression with L-BFGS, Gradient Boosted Trees, Apache Spark MLlib

"Identifying spam SMS using Apache Spark MLlib", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 5, Issue 5, pp. 893-897, May 2018. Available: http://www.jetir.org/papers/JETIR1805753.pdf

Country: Paschim Medinipur, West Bengal, India

Area: Science & Technology

Publisher: IJ Publication

Published in: UGC and ISSN approved journal (UGC Approved Journal no. 63975)
