


Published in:

Volume 5 Issue 8
August-2018
eISSN: 2349-5162

Unique Identifier

JETIR1808823

Page Number

472-476



Title

Large Scale Data Processing from Multiple Data Centers

ISSN

2349-5162

Cite This Article

"Large Scale Data Processing from Multiple Data Centers", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 5, Issue 8, pp. 472-476, August 2018. Available: http://www.jetir.org/papers/JETIR1808823.pdf

Abstract

With the globalization of services, organizations continuously produce large volumes of data that need to be analyzed across geo-dispersed locations. The traditional centralized approach of moving all data to a single cluster is inefficient or infeasible because of limitations such as scarce wide-area bandwidth and the low-latency requirements of data processing. Processing big data across geo-distributed datacenters has continued to gain popularity in recent years. However, managing distributed MapReduce computations across geo-distributed datacenters poses a number of technical challenges: how to allocate data among a selection of geo-distributed datacenters to reduce the communication cost, how to determine the virtual machine provisioning strategy that offers high performance at low cost, and what criteria should be used to select a datacenter as the final reducer for big data analytics jobs. In this paper, these challenges are addressed by balancing bandwidth cost, storage cost, computing cost, migration cost, and latency cost between the two MapReduce phases across datacenters. We formulate this cost optimization problem for data movement, resource provisioning, and reducer selection as a joint stochastic integer nonlinear optimization problem that minimizes the five cost factors simultaneously. We further design an efficient online algorithm that minimizes the long-term time-averaged operation cost.
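To make the reducer-selection criterion concrete, the following is a minimal Python sketch that compares candidate datacenters by the sum of the five cost factors named above. It is not the paper's formulation or its online algorithm: it is a static, one-shot comparison, and every name, price, and cost model in it (`Datacenter`, `total_cost`, `pick_reducer`, the linear per-GB prices, the fixed per-transfer migration fee) is a hypothetical assumption chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datacenter:
    """Hypothetical datacenter description; all fields are illustrative."""
    name: str
    data_gb: float          # data currently held at this site
    storage_price: float    # assumed cost per GB stored at this site
    compute_price: float    # assumed cost per GB processed at this site
    latency_penalty: float  # assumed flat cost if this site is the reducer


def total_cost(reducer, datacenters, bandwidth_price, migration_price):
    """Sum the five cost factors for one candidate reducer (toy linear model)."""
    others = [dc for dc in datacenters if dc.name != reducer.name]
    moved_gb = sum(dc.data_gb for dc in others)
    total_gb = sum(dc.data_gb for dc in datacenters)

    bandwidth = bandwidth_price * moved_gb      # wide-area transfer volume
    storage = reducer.storage_price * total_gb  # all data lands at the reducer
    computing = reducer.compute_price * total_gb
    # Assumed fixed fee per site that has to ship data to the reducer.
    migration = migration_price * sum(1 for dc in others if dc.data_gb > 0)
    latency = reducer.latency_penalty
    return bandwidth + storage + computing + migration + latency


def pick_reducer(datacenters, bandwidth_price, migration_price):
    """Return the candidate with the lowest total cost under the toy model."""
    return min(
        datacenters,
        key=lambda dc: total_cost(dc, datacenters, bandwidth_price, migration_price),
    )
```

Under this toy model, a site that already holds most of the data tends to win because it minimizes the bandwidth term, unless its storage, computing, or latency prices are high enough to outweigh that advantage. The stochastic, long-term time-averaged setting described in the abstract would require re-evaluating such a trade-off as data arrives over time, which this sketch does not attempt.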

Key Words

Big Data Processing; Cloud Computing; Data Movement; Virtual Machine Scheduling; Online Algorithm

Publication Details

Published Paper ID: JETIR1808823
Registration ID: 185677
Published In: Volume 5 | Issue 8 | Year August-2018
DOI (Digital Object Identifier):
Page No: 472-476
ISSN Number: 2349-5162
