1.1 INTRODUCTION TO CLOUD COMPUTING
Cloud computing has become one of the most powerful and fastest-growing technologies of the current generation. The term “cloud” is a metaphor for a network or the Internet; in other words, the cloud is something that is present at a remote location. Cloud services can be provided over public or private networks, i.e., WAN, LAN or VPN. Applications such as e-mail, web conferencing and customer relationship management (CRM) all run in the cloud. Cloud computing, also known as on-demand computing, is a kind of Internet-based computing that provides shared processing resources and data to computers and other devices on demand. It offers on-demand self-service, which means that customers can request and manage their own resources.
History of Cloud Computing: The idea behind cloud computing is often credited to Joseph Carl Robnett Licklider, who in the 1960s worked on ARPANET (the Advanced Research Projects Agency Network) with the aim of connecting people and data from anywhere at any time. J.C.R. Licklider was one of America’s leading computer scientists.
Characteristics
The National Institute of Standards and Technology’s definition of cloud computing identifies “five essential characteristics”:
1) On-demand self-service: Cloud computing allows users to use web services and resources on demand. A user can log on to a website at any time and use them.
2) Broad network access: Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, tablets, laptops, and workstations).
3) Resource pooling: The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand.
4) Rapid elasticity: Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear unlimited and can be appropriated in any quantity at any time.
5) Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service.
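As a rough illustration of rapid elasticity and measured service, the sketch below scales a pool of instances out and in with demand and meters the instance-hours consumed. The function names, the capacity constant and the demand figures are assumptions chosen for illustration; they are not part of the NIST definition.

```python
import math

CAPACITY_PER_INSTANCE = 100  # assumed: requests per hour that one instance can serve


def instances_needed(demand: int) -> int:
    """Rapid elasticity: smallest pool size that covers the demand (at least 1)."""
    return max(1, math.ceil(demand / CAPACITY_PER_INSTANCE))


def meter_usage(hourly_demand: list[int]) -> int:
    """Measured service: total instance-hours consumed over the period."""
    return sum(instances_needed(d) for d in hourly_demand)


if __name__ == "__main__":
    demand = [50, 250, 900, 120]                   # hypothetical requests per hour
    print([instances_needed(d) for d in demand])   # [1, 3, 9, 2]
    print(meter_usage(demand))                     # 15 instance-hours
```

The pool grows to 9 instances at the peak and shrinks back afterwards, and the consumer is charged only for the 15 instance-hours actually metered.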
1.1.1 Architecture
The cloud computing architecture comprises many loosely coupled cloud components. We can broadly divide the cloud architecture into two parts:
• Front End
• Back End
The two ends are connected through a network, usually the Internet. The following diagram shows the graphical view of cloud computing architecture:
• FRONT END
Front End refers to the client part of cloud computing system. It consists of interfaces and applications that are required to access the cloud computing platforms, e.g., Web Browser.
• BACK END
Back End refers to the cloud itself. It consists of all the resources required to provide cloud computing services, including large-scale data storage, virtual machines, security mechanisms, services, deployment models, servers, etc.
Important Points
It is the responsibility of the back end to provide built-in security mechanisms, traffic control and protocols.
The server employs certain protocols, known as middleware, which help the connected devices communicate with each other.
1.1.2 SERVICE MODELS:
Cloud computing mainly consists of three service models that are provided to users:
a. Infrastructure as a Service (IaaS)
b. Platform as a Service (PaaS)
c. Software as a Service (SaaS)
There are many other service models, all of which take the form XaaS, i.e., Anything as a Service. Examples include Network as a Service, Business as a Service, Identity as a Service, Database as a Service and Strategy as a Service.
Fig : Service Models
1. Infrastructure as a Service (IaaS) :
IaaS provides access to fundamental resources such as physical machines, virtual machines, virtual storage, etc. Apart from these resources, IaaS also offers virtual machine disk storage, virtual local area networks (VLANs), load balancers, IP addresses and software bundles. All of the above resources are made available to the end user via server virtualization. Moreover, customers access these resources as if they owned them.
Benefits
IaaS allows the cloud provider to freely locate the infrastructure over the Internet in a cost-effective manner. Some of the key benefits of IaaS are listed below:
• Full Control of the computing resources through Administrative Access to VMs.
• Flexible and Efficient renting of Computer Hardware.
• Portability, Interoperability with Legacy Applications.
Issues
• IaaS shares issues with PaaS and SaaS, such as network dependence and browser-based risks. It also has some issues specific to it, which are listed below:
1) Compatibility with legacy security vulnerabilities
Because IaaS allows the consumer to run legacy software in the provider’s infrastructure, it exposes consumers to all of the security vulnerabilities of such legacy software.
2) Virtual machine sprawl
A VM can become out of date with respect to security updates because IaaS allows the consumer to keep virtual machines in the running, suspended or off state. The provider can update such VMs automatically, but this mechanism is hard and complex.
3) Robustness of VM-level isolation
IaaS offers an isolated environment to individual consumers through a hypervisor. A hypervisor is a software layer that relies on hardware support for virtualization to split a physical computer into multiple virtual machines.
4) Data erase practices
The consumer uses virtual machines that in turn use the common disk resources provided by the cloud provider. When the consumer releases a resource, the cloud provider must ensure that the next consumer to rent it does not observe data residue from the previous consumer.
Characteristics
Here are the characteristics of IaaS service model:
• Virtual machines with pre-installed software.
• Virtual machines with pre-installed Operating Systems such as Windows, Linux, and Solaris.
• On-demand availability of resources.
• Allows copies of particular data to be stored in different locations.
• The computing resources can be easily scaled up and down.
2. Platform-as-a-Service (PaaS):
PaaS offers the runtime environment for applications. It also offers the development and deployment tools required to develop applications. PaaS includes point-and-click tools that enable non-developers to create web applications. Google’s App Engine and Force.com are examples of PaaS offerings. Developers may log on to these websites and use the built-in API to create web-based applications. The disadvantage of using PaaS is that the developer becomes locked in to a particular vendor. For example, an application written in Python against Google’s API using Google’s App Engine is likely to work only in that environment. Vendor lock-in is therefore the biggest problem in PaaS. The following diagram shows how PaaS offers an API and development tools to developers and how it helps the end user to access business applications.
Benefits: Following are the benefits of PaaS model:
• Lower administrative overhead
The consumer need not bother much about administration because it is the responsibility of the cloud provider.
• Lower total cost of ownership
The consumer need not purchase expensive hardware, servers, power and data storage.
• Scalable solutions
It is very easy to scale up or down automatically based on application resource demands.
• More current system software
It is the responsibility of the cloud provider to maintain software versions and patch installations.
Issues : Like SaaS, PaaS also places significant burdens on consumers’ browsers to maintain reliable and secure connections to the provider systems. Therefore, PaaS shares many of the issues of SaaS. It also has some issues of its own:
1) Lack of portability between PaaS clouds
Although standard languages are used, the implementations of platform services may vary. For example, the file, queue, or hash table interfaces of one platform may differ from those of another, making it difficult to transfer workloads from one platform to another.
2) Event based processor scheduling
PaaS applications are event-oriented, which imposes resource constraints on them, i.e., they have to answer a request within a given interval of time.
3) Security engineering of PaaS applications
Since PaaS applications depend on the network, they must explicitly use cryptography and manage security exposures.
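One common way to limit the portability problem described in issue 1 is to keep application code behind a platform-neutral interface and to isolate each platform's queue calls in a small adapter. The sketch below is only an illustration of that idea: `PlatformAQueue` and `PlatformBQueue` are hypothetical stand-ins backed by in-memory containers, not the API of any real PaaS vendor.

```python
from abc import ABC, abstractmethod
from collections import deque


class TaskQueue(ABC):
    """Application-side queue interface, independent of any one platform."""

    @abstractmethod
    def put(self, item: str) -> None: ...

    @abstractmethod
    def get(self) -> str: ...


class PlatformAQueue(TaskQueue):
    """Adapter for a hypothetical platform A (a deque stands in for its queue service)."""

    def __init__(self) -> None:
        self._items = deque()

    def put(self, item: str) -> None:
        self._items.append(item)

    def get(self) -> str:
        return self._items.popleft()


class PlatformBQueue(TaskQueue):
    """Adapter for a hypothetical platform B (a list stands in for its message store)."""

    def __init__(self) -> None:
        self._messages = []

    def put(self, item: str) -> None:
        self._messages.append(item)

    def get(self) -> str:
        return self._messages.pop(0)


def process_all(queue: TaskQueue, jobs: list[str]) -> list[str]:
    """Application code: it depends only on TaskQueue, so the platform can be swapped."""
    for job in jobs:
        queue.put(job)
    return [queue.get().upper() for _ in jobs]


if __name__ == "__main__":
    print(process_all(PlatformAQueue(), ["resize", "encode"]))
    print(process_all(PlatformBQueue(), ["resize", "encode"]))
```

Moving the workload to another platform then means writing one new adapter rather than changing `process_all`.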
Characteristics
Here are the characteristics of PaaS service model:
• PaaS offers a browser-based development environment. It allows the developer to create databases and edit the application code via either an Application Programming Interface or point-and-click tools.
• PaaS provides built-in security, scalability, and web service interfaces.
• PaaS provides built-in tools for defining workflow and approval processes and defining business rules.
• It is easy to integrate with other applications on the same platform.
• PaaS also provides web service interfaces that allow us to connect to applications outside the platform.
3. Software as a Service (SaaS):
The Software as a Service (SaaS) model provides software applications as a service to end users. It refers to software that is deployed on a hosted service and is accessible via the Internet. Some SaaS applications are listed below:
• Billing and Invoicing System
• Customer Relationship Management (CRM) applications
• Help Desk Applications
• Human Resource (HR) Solutions
Some SaaS applications, such as an office suite, are not customizable. However, SaaS provides an Application Programming Interface (API), which allows developers to build customized applications.
Characteristics
Here are the characteristics of SaaS service model:
• SaaS makes the software available over the Internet.
• The software is maintained by the vendor rather than at the site where it is running.
• The software license may be subscription-based or usage-based, and it is billed on a recurring basis.
• SaaS applications are cost effective since they do not require any maintenance at end user side.
• They are available on demand.
• They can be scaled up or down on demand.
• They are automatically upgraded and updated.
• SaaS offers a shared data model, so multiple users can share a single instance of the infrastructure. It is not necessary to hard-code functionality for individual users.
• All users run the same version of the software.
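The shared data model can be pictured as one code path serving every tenant, with per-tenant differences kept in data rather than hard-coded. The following is a minimal sketch under that assumption; the tenant names, settings and the `invoice_total` function are hypothetical.

```python
# Per-tenant settings live in data, not in code branches (hypothetical tenants).
TENANT_SETTINGS = {
    "acme": {"currency": "USD", "tax_rate": 0.10},
    "globex": {"currency": "EUR", "tax_rate": 0.20},
}


def invoice_total(tenant: str, net_amount: float) -> str:
    """Single shared implementation; behaviour varies only through tenant settings."""
    settings = TENANT_SETTINGS[tenant]
    total = net_amount * (1 + settings["tax_rate"])
    return f"{total:.2f} {settings['currency']}"


if __name__ == "__main__":
    print(invoice_total("acme", 100.0))    # 110.00 USD
    print(invoice_total("globex", 100.0))  # 120.00 EUR
```

Adding a tenant means adding an entry to the settings table, and every tenant automatically runs the same version of the function.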
Benefits : Using SaaS has proved to be beneficial in terms of scalability, efficiency, performance and much more. Some of the benefits are listed below:
• Modest Software Tools
• Efficient use of Software Licenses
• Centralized Management & Data
• Platform responsibilities managed by provider
• Multitenant solutions
Issues : There are several issues associated with SaaS, some of them are listed below:
1) Browser based risks
2) Network dependence
3) Lack of portability between SaaS clouds
1.1.3 DEPLOYMENT MODELS:
Deployment models define the type of access to the cloud, i.e., how the cloud is located. A cloud can have any of four types of access:
1. Public Cloud
2. Private Cloud
3. Hybrid Cloud
4. Community Cloud
Fig : Deployment Models
1. PUBLIC CLOUD MODEL:
The Public Cloud allows systems and services to be easily accessible to the general public; e.g., Google, Amazon and Microsoft offer cloud services via the Internet.
Benefits : There are many benefits of deploying cloud as public cloud model.
• Cost effective : Since a public cloud shares the same resources among a large number of consumers, it has a low cost.
• Reliability : Since a public cloud employs a large number of resources from different locations, if any resource fails, the public cloud can employ another one.
• Flexibility : It is very easy to integrate a public cloud with a private cloud, which gives consumers a flexible approach.
• Location independence : Since public cloud services are delivered through the Internet, they are location independent.
• Utility style costing : The public cloud is based on a pay-per-use model, and resources are accessible whenever the consumer needs them.
• High scalability : Cloud resources are made available on demand from a pool of resources, i.e., they can be scaled up or down according to the requirement.
Disadvantages : Here are the disadvantages of public cloud model:
• Low security : In the public cloud model, data is hosted off-site and resources are shared publicly, so it does not ensure a high level of security.
• Less customizable : It is comparatively less customizable than private cloud.
2. PRIVATE CLOUD MODEL:
The Private Cloud allows systems and services to be accessible within an organization. It is operated only within a single organization; however, it may be managed internally or by a third party.
Benefits : There are many benefits of deploying cloud as private cloud model. The following diagram shows some of those benefits:
• Higher security and privacy : Private cloud operations are not available to the general public and resources are shared from a distinct pool of resources, which ensures high security and privacy.
• More control : An organization has more control over the resources and hardware of a private cloud than over those of a public cloud, because the private cloud is accessed only within the organization.
• Cost and energy efficiency : Private cloud resources are not as cost effective as those of public clouds, but they offer more efficiency than a public cloud.
Disadvantages
Here are the disadvantages of using private cloud model:
• Restricted area : Private cloud is only accessible locally and is very difficult to deploy globally.
• Inflexible pricing : In order to fulfill demand, purchasing new hardware is very costly.
• Limited scalability : Private cloud can be scaled only within capacity of internal hosted resources.
3. HYBRID CLOUD MODEL
The Hybrid Cloud is a mixture of public and private cloud. Non-critical activities are performed using public cloud while the critical activities are performed using private cloud.
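The placement rule just described can be sketched as a small routing function. The `Workload` type, its `critical` flag and the workload names are assumptions made for illustration; how criticality is decided is left to the organization.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    critical: bool  # assumed flag, set by the organization's own policy


def choose_cloud(workload: Workload) -> str:
    """Critical activities go to the private cloud, everything else to the public cloud."""
    return "private" if workload.critical else "public"


def place(workloads: list[Workload]) -> dict[str, list[str]]:
    """Group workload names by the cloud they are routed to."""
    placement = {"private": [], "public": []}
    for workload in workloads:
        placement[choose_cloud(workload)].append(workload.name)
    return placement


if __name__ == "__main__":
    jobs = [Workload("payroll", True), Workload("marketing-site", False)]
    print(place(jobs))  # {'private': ['payroll'], 'public': ['marketing-site']}
```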
Benefits : There are many benefits of deploying cloud as hybrid cloud model. The following diagram shows some of those benefits:
• Scalability : It offers the features of both public cloud scalability and private cloud scalability.
• Flexibility : It offers both secure private resources and scalable public resources.
• Cost efficiencies : Public clouds are more cost effective than private clouds, so a hybrid cloud can share in this saving.
• Security : The private cloud in a hybrid cloud ensures a higher degree of security.
Disadvantages
• Networking issues : Networking becomes complex due to the presence of both private and public clouds.
• Security compliance : It is necessary to ensure that cloud services are compliant with the organization’s security policies.
• Infrastructural dependency : The hybrid cloud model depends on internal IT infrastructure, so it is necessary to ensure redundancy across data centers.
4. COMMUNITY CLOUD MODEL
The Community Cloud allows systems and services to be accessible by a group of organizations. It shares the infrastructure between several organizations from a specific community. It may be managed internally or by a third party.
Benefits :There are many benefits of deploying cloud as community cloud model. The following diagram shows some of those benefits:
• Cost effective : Community cloud offers the same advantages as a private cloud at low cost.
• Sharing between organizations : Community cloud provides an infrastructure to share cloud resources and capabilities among several organizations.
• Security : Community cloud is comparatively more secure than the public cloud.
Issues
1) Since all data is housed at one location, one must be careful when storing data in a community cloud because it might be accessible to others.
2) It is also challenging to allocate responsibilities of governance, security and cost.
Advantages:
1) Easy implementation
Cloud hosting allows businesses to retain the same applications and business processes without having to deal with the backend technicalities. Readily manageable over the Internet, a cloud infrastructure can be accessed by enterprises easily and quickly.
2) Accessibility
Access your data anywhere, anytime. An Internet cloud infrastructure maximizes enterprise productivity and efficiency by ensuring your application is always accessible. This allows for easy collaboration and sharing among users in multiple locations.
3) No hardware required
Since everything will be hosted in the cloud, a physical storage center is no longer needed. However, a backup could be worth looking into in the event of a disaster that could leave your company’s productivity stagnant.
4) Cost per head
Overhead technology costs are kept at a minimum with cloud hosting services, enabling businesses to use the extra time and resources for improving the company infrastructure.
5) Flexibility for growth
The cloud is easily scalable so companies can add or subtract resources based on their needs. As companies grow, their system will grow with them.
6) Efficient recovery
Cloud computing delivers faster and more accurate retrievals of applications and data. With less downtime, it can serve as a highly efficient recovery plan.
Disadvantages:
1) No longer in control
When moving services to the cloud, you are handing over your data and information. Companies that have in-house IT staff will be unable to handle issues on their own and must rely on the provider, so round-the-clock support from the provider becomes important.
2) May not get all the features
Not all cloud services are the same. Some cloud providers tend to offer limited versions and enable the most popular features only, so you may not receive every feature or customization you want. Before signing up, make sure you know what your cloud service provider offers.
3) Doesn’t mean you should do away with servers
You may have fewer servers to handle which means less for your IT staff to handle, but that doesn’t mean you can let go of all your servers and staff. While it may seem costly to have data centers and a cloud infrastructure, redundancy is key for backup and recovery.
4) No Redundancy
A cloud server is not necessarily redundant, nor is it necessarily backed up. As technology may fail here and there, avoid getting burned by purchasing a redundancy plan. Although it is an extra cost, in most cases it will be well worth it.
5) Bandwidth issues
For ideal performance, clients have to plan accordingly and not pack large numbers of servers and storage devices into a small set of data centers.
1.2 INTRODUCTION TO LOAD BALANCING
What is load balancing?
Load balancing is the division of the work that one computer has to do between two or more computers, so that more work gets done in the same amount of time and all users are served faster. Load balancing can be implemented with hardware, software, or a combination of both. A load balancing algorithm that is dynamic in nature does not consider the previous state or behavior of the system; that is, it depends only on the present behavior of the system.
1.2.1 Goals of Load balancing
The goals of load balancing are:
• To improve the performance substantially
• To have a backup plan in case the system fails even partially
• To maintain the system stability
• To accommodate future modification in the system
1.2.2 Load balancing algorithms
Depending on whether they use the current state of the system, load balancing algorithms can be divided into two categories:
Fig : Types Of Load balancing algorithms
1) Static: It does not depend on the current state of the system. Prior knowledge of the system is needed.
2) Dynamic: Load balancing decisions are based on the current state of the system, and no prior knowledge is needed, so it adapts better than the static approach. Here we discuss various dynamic load balancing algorithms for clouds of different sizes.
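The difference between the two categories can be seen in a small sketch. Here the static rule is a fixed cyclic order decided in advance, and the dynamic rule looks at the current load before every decision; both functions and the task lengths are illustrative assumptions, not algorithms taken from a specific system.

```python
def static_assign(task_lengths: list[int], n_nodes: int) -> list[int]:
    """Static: the node is chosen by a fixed rule (task index modulo node count)."""
    loads = [0] * n_nodes
    for index, length in enumerate(task_lengths):
        loads[index % n_nodes] += length
    return loads


def dynamic_assign(task_lengths: list[int], n_nodes: int) -> list[int]:
    """Dynamic: each task goes to the node with the smallest current load."""
    loads = [0] * n_nodes
    for length in task_lengths:
        target = loads.index(min(loads))  # decision uses the current state
        loads[target] += length
    return loads


if __name__ == "__main__":
    tasks = [8, 1, 7, 1, 9, 1]          # hypothetical task lengths
    print(static_assign(tasks, 2))      # [24, 3]
    print(dynamic_assign(tasks, 2))     # [10, 17]
```

With these task lengths the static rule leaves one node with a load of 24 and the other with 3, while the dynamic rule ends with loads of 10 and 17.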
1) Static Load Balancing Algorithm:
In a static environment the cloud provider installs homogeneous resources, and the resources in the cloud are not flexible. In this scenario, the cloud requires prior knowledge of each node’s capacity, processing power, memory and performance, and statistics of user requirements. These user requirements are not subject to any change at run time. Algorithms proposed to achieve load balancing in a static environment cannot adapt to run-time changes in load. Although a static environment is easier to simulate, it is not well suited to a heterogeneous cloud environment. Static load balancing algorithms are:
i. Round Robin Algorithm
ii. First Come First Serve (FCFS)
i. Round Robin Algorithm
The round robin algorithm allocates each task to the next VM in the queue, irrespective of the load on that VM. The Round Robin policy does not consider resource capabilities, priority, or the length of the tasks, so high-priority and lengthy tasks end up with higher response times. The Round Robin algorithm [1] provides load balancing in a static environment: resources are provisioned to tasks on a first-come-first-serve basis (FCFS, i.e., the task that entered first is allocated the resource first) and scheduled in a time-sharing manner. In that formulation, the resource that is least loaded (the node with the least number of connections) is allocated to the task. Eucalyptus uses greedy (first-fit) with round-robin for VM mapping.
Round Robin is also one of the efficient algorithms for CPU scheduling. The basic idea of RR is to give an equal share of CPU time to each process; this share is known as the quantum. The performance of RR depends mainly on the quantum value [2,3], and no specific method is available for selecting it. If the quantum is too large, waiting time increases, whereas if it is too small, context switching increases. The quantum can be fixed for all iterations or changed dynamically based on some optimization technique. Simple RR uses a fixed time quantum, so every process executes for a fixed amount of time. When the quantum expires, the process is attached to the end of the ready queue and a new process executes on the CPU for the next quantum. This procedure continues until all processes complete their execution on the CPU.
The Round Robin (RR) algorithm focuses on fairness. RR uses a ring as its queue to store jobs. Each job in the queue is given the same time slice and is executed in turn. If a job cannot be completed during its turn, it is put back in the queue to wait for its next turn.
The advantage of the RR algorithm is that each job is executed in turn and does not have to wait for the previous one to complete. However, if the load is heavy, RR takes a long time to complete all the jobs. The CloudSim toolkit supports the RR scheduling strategy for internal scheduling of jobs. The drawback of RR is that the largest job takes a long time to complete.
RR ALGORITHM
Start Round Robin Scheduling Algorithm
Step1. Input the number of processes (nop), supplied by the user.
Step2. Randomly generate a burst time for each process (burst times are generated using the rand() function).
Step3. Allocate a time slice for each process.
Step4. Now calculate the waiting time of each process.
Step5. Calculate total waiting time
End Round Robin Scheduling Algorithm
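The steps above can be sketched in Python as follows. This is a minimal simulation, assuming that all processes arrive at time 0 and that context switching costs nothing; the function name, the burst-time range and the quantum value are illustrative choices, and Python's `random` module stands in for `rand()`.

```python
import random
from collections import deque


def round_robin_waiting_times(burst_times: list[int], quantum: int) -> list[int]:
    """Return the waiting time of each process under Round Robin (arrivals at time 0)."""
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    remaining = list(burst_times)
    finish = [0] * len(burst_times)
    ready = deque(i for i, burst in enumerate(burst_times) if burst > 0)
    clock = 0
    while ready:
        i = ready.popleft()
        run = min(quantum, remaining[i])   # run for one time slice at most
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            ready.append(i)                # quantum expired: back of the queue
        else:
            finish[i] = clock              # process completed
    # waiting time = completion time - burst time, since arrival time is 0
    return [f - b for f, b in zip(finish, burst_times)]


if __name__ == "__main__":
    nop = 5                                               # Step 1 (fixed here instead of user input)
    bursts = [random.randint(1, 10) for _ in range(nop)]  # Step 2
    waits = round_robin_waiting_times(bursts, quantum=3)  # Steps 3 and 4
    print("burst times:", bursts)
    print("waiting times:", waits)
    print("total waiting time:", sum(waits))              # Step 5
```

For burst times 5, 3 and 8 with a quantum of 2, the processes finish at times 12, 9 and 16, which gives waiting times of 7, 6 and 8 and a total of 21.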
ii. First Come First Serve (FCFS)
The simplest CPU scheduling algorithm is First Come First Serve (FCFS), which selects the first-arrived process from the ready queue for execution [4]. In priority scheduling [5], processes are executed according to priority, so that the process with the highest priority executes first, and the waiting time and turnaround time are calculated from the execution of each process; when two or more processes have the same priority, FCFS is used, i.e., the process that arrived first is executed first. FCFS is also used for parallel processing, where the resource with the smallest waiting queue time is selected for the incoming task. The CloudSim toolkit supports the FCFS scheduling strategy for internal scheduling of jobs. Allocation of application-specific VMs to hosts in a cloud-based data center is the responsibility of the Virtual Machine (VM) provisioner component. The default policy implemented by the VM provisioner is a straightforward policy that allocates a VM to a host on an FCFS basis.
The disadvantage of FCFS is that it is non-preemptive. Short tasks at the back of the queue have to wait for a long task at the front to finish, so its turnaround and response times are poor: a process with a small burst time waits unnecessarily long if the process with the longest burst time is selected first, because FCFS always selects the first process in the queue.
Start FCFS Algorithm
Step1. Input the number of processes (nop), supplied by the user.
Step2. Randomly generate a burst time for each process (burst times are generated using the rand() function).
Step3. Calculate waiting time of each process.
Step4. Calculate total waiting time.
End FCFS Algorithm
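The FCFS steps can be sketched in the same way. The example assumes that all processes arrive at time 0 in queue order; the function name and the burst times are illustrative.

```python
def fcfs_waiting_times(burst_times: list[int]) -> list[int]:
    """Each process waits for the total burst time of all processes ahead of it."""
    waiting = []
    clock = 0
    for burst in burst_times:
        waiting.append(clock)   # time spent in the queue before starting
        clock += burst          # non-preemptive: runs to completion
    return waiting


if __name__ == "__main__":
    long_first = fcfs_waiting_times([24, 3, 3])
    short_first = fcfs_waiting_times([3, 3, 24])
    print(long_first, sum(long_first))    # [0, 24, 27] 51
    print(short_first, sum(short_first))  # [0, 3, 6] 9
```

The same three jobs give a total waiting time of 51 when the long job arrives first and only 9 when it arrives last, which is the drawback of FCFS noted above.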
2) Dynamic Load balancing algorithm:
• These algorithms are more resilient than static algorithms, adapt easily to change, and provide better results in heterogeneous and dynamic environments.
• In a dynamic environment the cloud provider installs heterogeneous resources, and the resources are flexible. In this scenario the cloud cannot rely on prior knowledge; instead it takes run-time statistics into account. The requirements of the users are flexible (i.e., they may change at run time). Algorithms proposed to achieve load balancing in a dynamic environment can easily adapt to run-time changes in load. A dynamic environment is difficult to simulate but is highly adaptable to the cloud computing environment. Dynamic load balancing algorithms are of two types:
a) Distributed approach
b) Non Distributed approach
a) Distributed approach:
• In this approach the dynamic load balancing algorithm is executed by all nodes present in the system and the task of load balancing is shared among them.
• The interaction among nodes to achieve load balancing can take two forms:
i. Cooperative: In this, the nodes work side-by-side to achieve a common objective, for example, to improve the overall response time.
ii. Non-cooperative: In this, each node works independently toward a goal local to it, for example, to improve the response time of a local task.
b) Non Distributed approach:
• In this approach either one node or a group of nodes performs the task of load balancing. Non-distributed dynamic load balancing algorithms can take two forms:
i. Centralized: In this, the load balancing algorithm is executed only by a single node in the whole system: the central node. This node is solely responsible for load balancing of the whole system. The other nodes interact only with the central node.
ii. Semi-distributed: In this, the nodes of the system are partitioned into clusters, and load balancing within each cluster is centralized. A central node, elected in each cluster by an appropriate election technique, takes care of load balancing within that cluster.
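The two non-distributed forms differ mainly in where the load table lives. The sketch below models that structure only: class names and node names are hypothetical, the balancing rule (least-loaded node) is an assumption, and the election of the central node in each cluster is not modelled.

```python
class CentralBalancer:
    """Centralized: one node keeps the load table and makes every decision."""

    def __init__(self, node_names: list[str]) -> None:
        self.load = {name: 0 for name in node_names}

    def report(self, node: str, load: int) -> None:
        """Other nodes interact only with the central node, e.g. to report load."""
        self.load[node] = load

    def assign(self, task_cost: int) -> str:
        """Send the task to the least-loaded node and update the table."""
        target = min(self.load, key=self.load.get)
        self.load[target] += task_cost
        return target


class SemiDistributedBalancer:
    """Semi-distributed: nodes are split into clusters, each with its own central balancer."""

    def __init__(self, clusters: dict[str, list[str]]) -> None:
        self.clusters = {cid: CentralBalancer(nodes) for cid, nodes in clusters.items()}

    def assign(self, cluster_id: str, task_cost: int) -> str:
        return self.clusters[cluster_id].assign(task_cost)


if __name__ == "__main__":
    central = CentralBalancer(["n1", "n2"])
    central.report("n1", 5)
    print(central.assign(2))            # n2, because n1 already reports a load of 5
    semi = SemiDistributedBalancer({"c1": ["a", "b"], "c2": ["c"]})
    print(semi.assign("c1", 1), semi.assign("c1", 1), semi.assign("c2", 3))  # a b c
```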
1.2.3 Policies or Strategies in dynamic load balancing
There are four policies:
1. Transfer Policy: The part of the dynamic load balancing algorithm which selects a job for transferring from a local node to a remote node is referred to as Transfer policy or Transfer strategy.
2. Selection Policy: It specifies the processors involved in the load exchange (processor matching)
3. Location Policy: The part of the load balancing algorithm which selects a destination node for a transferred task is referred to as location policy or Location strategy.
4. Information Policy: The part of the dynamic load balancing algorithm responsible for collecting information about the nodes in the system is referred to as Information policy or Information strategy.
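To show how the four policies fit together, the sketch below implements one balancing step with one function per policy. The threshold, the choice of which job to move and the least-loaded destination rule are all assumptions made for illustration; real algorithms differ in exactly these choices.

```python
THRESHOLD = 3  # assumed queue length above which a node counts as overloaded


def information_policy(nodes: dict[str, list[str]]) -> dict[str, int]:
    """Collect information about the nodes: here, the queue length of each one."""
    return {name: len(queue) for name, queue in nodes.items()}


def selection_policy(state: dict[str, int]) -> tuple[list[str], list[str]]:
    """Decide which nodes take part in the exchange: overloaded senders, underloaded receivers."""
    senders = [n for n, load in state.items() if load > THRESHOLD]
    receivers = [n for n, load in state.items() if load < THRESHOLD]
    return senders, receivers


def transfer_policy(queue: list[str]) -> str:
    """Select the job to transfer from the local node (here: the most recently queued one)."""
    return queue.pop()


def location_policy(state: dict[str, int], receivers: list[str]) -> str:
    """Select the destination node for the transferred job (here: the least loaded receiver)."""
    return min(receivers, key=state.get)


def balance_once(nodes: dict[str, list[str]]) -> list[tuple[str, str, str]]:
    """One balancing step; returns the moves made as (job, source, destination)."""
    state = information_policy(nodes)
    senders, receivers = selection_policy(state)
    moves = []
    for sender in senders:
        if not receivers:
            break
        job = transfer_policy(nodes[sender])
        target = location_policy(state, receivers)
        nodes[target].append(job)
        state[sender] -= 1
        state[target] += 1
        if state[target] >= THRESHOLD:
            receivers.remove(target)    # destination is no longer underloaded
        moves.append((job, sender, target))
    return moves


if __name__ == "__main__":
    cluster = {"n1": ["a", "b", "c", "d", "e"], "n2": ["f"], "n3": []}
    print(balance_once(cluster))  # [('e', 'n1', 'n3')]
```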
1.4 Objective of the Project:
The main problem addressed in this project is that cloud computing must distribute workload dynamically across multiple nodes so that no single resource is overloaded or underutilized; this is an optimization problem.
To solve this problem, we use a Genetic Algorithm (GA) as the load balancing strategy.
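As a generic illustration of how a GA can be applied to this kind of optimization problem, the sketch below evolves task-to-VM assignments so as to reduce the makespan (the load of the busiest VM). It is not the strategy proposed in this report, which is described in later chapters; the encoding, the fitness function, the operators and every parameter value here are assumptions.

```python
import random


def makespan(assignment: list[int], task_lengths: list[int], n_vms: int) -> int:
    """Fitness (lower is better): total load of the most loaded VM."""
    loads = [0] * n_vms
    for task, vm in enumerate(assignment):
        loads[vm] += task_lengths[task]
    return max(loads)


def genetic_balance(task_lengths, n_vms, pop_size=30, generations=100,
                    mutation_rate=0.1, seed=0):
    """Return the best assignment found; gene i is the VM index for task i."""
    rng = random.Random(seed)
    n = len(task_lengths)

    def fitness(assignment):
        return makespan(assignment, task_lengths, n_vms)

    population = [[rng.randrange(n_vms) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[:pop_size // 2]        # selection: keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            parent1, parent2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = parent1[:cut] + parent2[cut:]     # single-point crossover
            child = [rng.randrange(n_vms) if rng.random() < mutation_rate else gene
                     for gene in child]               # mutation
            children.append(child)
        population = survivors + children
    return min(population, key=fitness)


if __name__ == "__main__":
    tasks = [4, 7, 2, 9, 5, 3, 8, 6]                  # hypothetical task lengths
    best = genetic_balance(tasks, n_vms=3)
    print("assignment:", best)
    print("makespan:", makespan(best, tasks, 3))
```

Because the better half of each generation is carried over unchanged, the best makespan in the population never gets worse from one generation to the next.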
1.5 Organization of the Project Report
The report is organized as follows.
Chapter 2 deals with System Environments.
Chapter 3 gives a detailed description of Related Work.
Chapter 4 describes the Cloud System, Load Balancing and the Genetic Algorithm.
Chapter 5 presents the UML Diagrams.
Chapter 6 discusses the Proposed Strategy.
Chapter 7 discusses CloudSim.
Chapter 8 shows the results of the whole work.
Chapter 9 presents the conclusion of the whole work and Future Work.