UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014


Published in:

Volume 10 Issue 6
June-2023
eISSN: 2349-5162

Impact factor: 7.95 (calculated by Google Scholar)

Published Paper ID: JETIR2306562
Registration ID: 519474
Page Number: f447-f453


Title

Word embedding Technique with Similarity Measures based Approach for Author Profiling

Abstract

Author Profiling (AP) is a text classification and information extraction technique that infers hidden information about authors from their written texts. It extracts demographic features such as gender, age, location, native language, educational background, and personality traits by analysing those texts. Author profiling techniques are used in applications such as marketing, security, and forensic analysis. Researchers have proposed several methods for differentiating the writing styles of authors using stylistic features, machine learning techniques, and deep learning techniques. In this work, we propose two approaches to author profiling: word embedding techniques combined with machine learning algorithms, and word embedding techniques combined with similarity measures. We concentrate on predicting the gender and age group of authors. The PAN competition 2014 dataset is used for experimentation. In the first approach, word embedding techniques represent words as vectors, and these word vectors are used to represent documents as vectors. Machine learning algorithms are trained on these document vectors to build classification models for gender and age prediction, and the accuracy of these models is evaluated. In the second approach, word embedding techniques again represent words as vectors, and similarity measures are used to compute the similarity among documents. For both the gender and age dimensions, the second approach attained higher accuracies than the first approach. The proposed approaches attained the best accuracies for gender and age prediction when compared with other popular approaches to author profiling.
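The second approach in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, how word vectors are combined into document vectors, or how similarities are turned into a prediction, so everything below is an assumption made for illustration: a hand-written toy embedding table (`TOY_EMBEDDINGS`) stands in for a trained Word2Vec-style model, document vectors are plain averages of word vectors, the similarity measure is cosine similarity, and the predicted label is that of the most similar labelled document. The labels `group_a` and `group_b` are placeholders, not classes from the PAN 2014 dataset.

```python
import math

# Hypothetical toy embedding table with made-up values. In practice these
# vectors would come from a trained word embedding model.
TOY_EMBEDDINGS = {
    "exam": [0.9, 0.1],
    "homework": [0.8, 0.2],
    "campus": [0.7, 0.3],
    "pension": [0.1, 0.9],
    "mortgage": [0.2, 0.8],
    "retirement": [0.3, 0.7],
}


def document_vector(text, embeddings=TOY_EMBEDDINGS):
    """Average the vectors of all known words in the text (assumed aggregation)."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        raise ValueError("document contains no words from the embedding table")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_label(text, labelled_docs):
    """Return (label, score) of the labelled document most similar to the text."""
    query = document_vector(text)
    best_label, best_score = None, -2.0  # cosine similarity is never below -1
    for doc_text, label in labelled_docs:
        score = cosine_similarity(query, document_vector(doc_text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Placeholder training documents with placeholder labels.
TRAINING_DOCS = [
    ("exam homework campus", "group_a"),
    ("pension mortgage retirement", "group_b"),
]

label, score = predict_label("homework exam", TRAINING_DOCS)
print(label, round(score, 3))
```

For the first approach, the same `document_vector` output could instead be passed as a feature vector to any standard classifier; the abstract does not name the algorithms used.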

Key Words

Author Profiling, Gender Prediction, Age Prediction, Word2Vec, Doc2Vec, Similarity Measures.

Cite This Article

"Word embedding Technique with Similarity Measures based Approach for Author Profiling", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 10, Issue 6, page no. f447-f453, June-2023, Available: http://www.jetir.org/papers/JETIR2306562.pdf



Publication Details

Published Paper ID: JETIR2306562
Registration ID: 519474
Published In: Volume 10 | Issue 6 | Year June-2023
DOI (Digital Object Identifier):
Page No: f447-f453
Country: Guntur, Andhra Pradesh, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

