Why Is Cloud So Big?
In a series of articles over the last several weeks, I have been delving into various facets of Cloud Computing. If you have not yet had a chance to learn what this computing paradigm means, please refer to the first article in the series, "The Story of the Digital Cloud - Cloud Computing", and then read the more recent ones.
Moving ahead, let us discuss why people attach the tag 'big' when referring to the Cloud. The Cloud stores 'big' data (refer to the article, "Bizarre Data versus Big Data - A Normative Shift"), can handle computation on a large (read 'big') scale, and takes the entire Internet (a big, or rather giant, edifice) to operate. Like any computing paradigm, cloud computing is all about storing, crunching and delivering data on demand, but now on a 'big' scale.
The Cloud adopts the web architecture for data delivery to end users. The user sends a request for data over the age-old HTTP protocol using a convenient software tool called a browser. The request is processed in the Cloud, and the response with the resulting data is sent back over the same protocol. Thus, the Cloud may look like a simple extension of the web architecture that facilitates the transmission of 'big' volumes of data. If that were so, not much would need to change except the bandwidth capacity of our data transmission infrastructure. In the real world, this is not true. Despite the ever-increasing requirement for bandwidth, a mere facilitator of data transmission does not qualify for the description of Cloud. Instead, the demand has made way for a fresh look at the architecture itself.
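The request and response exchange described above can be tried out locally. What follows is only a minimal sketch using Python's standard library: the handler class `CloudHandler`, the helper `fetch_once`, and the `/report` path are hypothetical names chosen for illustration, and a tiny local server stands in for the Cloud.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class CloudHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the Cloud side of the exchange."""

    def do_GET(self):
        # Process the request and build the response data.
        body = json.dumps({"path": self.path, "data": "hello from the cloud"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_once(path="/report"):
    """Start a local server, send one HTTP GET, return (status, payload)."""
    server = HTTPServer(("127.0.0.1", 0), CloudHandler)  # port 0: the OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)          # the "browser" sends the request
        response = conn.getresponse()      # the "Cloud" answers over the same protocol
        status, payload = response.status, json.loads(response.read())
        conn.close()
        return status, payload
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_once("/report")` returns the HTTP status code together with the decoded JSON payload. Nothing in this exchange is specific to the Cloud, which is the point made above: the protocol stays the same, and what changes is the architecture behind it.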
It may be recalled (refer to my earlier article, "Adopting Cloud for Scalability - The Way forward") that the Cloud uses resources such as storage, processing and network together to meet its architectural goal of serving 'big' data to its consumers. The Cloud must scale its capacity, availability and performance on demand. Its architecture must be devised in such a way that it can serve the varying needs of its consumers, often concurrently, without fail.
Capacity may be expanded vertically, using more powerful machines with greater storage connected to higher-bandwidth lines. In reality, however, vertical scaling falls miserably short: the demand far outstrips what the available technology can deliver in a single machine. But life does not end here, as we have an alternative way of expanding our capacity - in a horizontal fashion. This is like matching the power of an elephant with a team of horses. For your own amusement, calculate the horsepower (hp) an elephant can exert; my number is 4 (four) for an Indian elephant. Back to our discussion: there has to be central coordination, and adding more horses should be seamless. This is exactly the concept of virtualization in a Cloud. Each horse can be thought of as an independent unit for the work it has to do, while a central command puts the horses to use in particular ways as the situation demands.
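The elephant-and-horses arithmetic can be written down in a few lines. This is only an illustrative sketch: the function names `horses_needed` and `split_load` are made up for this example, and the assumption that one horse delivers exactly 1 hp is a simplification.

```python
import math


def horses_needed(required_hp, hp_per_horse=1.0):
    """Number of equal units needed to match a required capacity."""
    if required_hp < 0 or hp_per_horse <= 0:
        raise ValueError("capacity values must be positive")
    return math.ceil(required_hp / hp_per_horse)


def split_load(total_load, units):
    """Divide an integer load as evenly as possible among the units."""
    if units <= 0:
        raise ValueError("at least one unit is required")
    base, extra = divmod(total_load, units)
    # The first `extra` units carry one item more than the rest.
    return [base + 1 if i < extra else base for i in range(units)]
```

With the figure of 4 hp quoted above, `horses_needed(4)` gives 4 horses, and `split_load(10, 4)` hands out ten pieces of work as `[3, 3, 2, 2]`. Adding a horse is just a matter of calling `split_load` with a larger `units` value, which is the seamless part of horizontal scaling.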
Going with our concept of the central command, a request from a consumer must be received there first. The central command then allocates the requisite resources for performing the task, without the consumer needing to know which resources are used. It manages the resources, allocating and de-allocating them as required. It also manages billing and monitoring for the consumer. It uses different software modules, including load balancers, billing systems and others, to achieve these tasks. The devil is in the details, which we shall not discuss here.
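To make the role of the central command a little more concrete, here is a toy sketch of it. Everything in it is an assumption made for illustration: the class names `Worker` and `CentralCommand`, the rule of picking the worker with the most free capacity, and the flat price per unit. Real cloud controllers are far more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """One 'horse': an independent unit with a fixed capacity."""
    name: str
    capacity: int
    used: int = 0

    @property
    def free(self):
        return self.capacity - self.used


class CentralCommand:
    """Toy coordinator: allocates, de-allocates and meters resources."""

    def __init__(self, workers, price_per_unit=1.0):
        self.workers = {w.name: w for w in workers}
        self.price_per_unit = price_per_unit
        self.allocations = {}  # ticket -> (consumer, worker name, units)
        self.usage = {}        # consumer -> total units allocated so far
        self._next_ticket = 1

    def add_worker(self, worker):
        # Adding capacity does not disturb existing allocations.
        self.workers[worker.name] = worker

    def allocate(self, consumer, units):
        candidates = [w for w in self.workers.values() if w.free >= units]
        if not candidates:
            raise RuntimeError("no worker has enough free capacity")
        worker = max(candidates, key=lambda w: w.free)  # simple load balancing
        worker.used += units
        ticket = self._next_ticket
        self._next_ticket += 1
        self.allocations[ticket] = (consumer, worker.name, units)
        self.usage[consumer] = self.usage.get(consumer, 0) + units
        return ticket  # the consumer sees a ticket, not the worker behind it

    def release(self, ticket):
        _consumer, worker_name, units = self.allocations.pop(ticket)
        self.workers[worker_name].used -= units

    def bill(self, consumer):
        return self.usage.get(consumer, 0) * self.price_per_unit
```

A consumer calls `allocate("alice", 3)` and receives only a ticket; which worker does the job stays hidden. When every worker is full, `add_worker` brings in another one and later requests succeed without any change on the consumer's side.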
Well, we now understand the way the Cloud manages its capacity and performance. Returning to the earlier discussion about data availability, the Cloud uses its distributed environment to keep programs and storage units redundantly at multiple locations. This enables the consumer to get data quickly - nearer is faster, a kind of 'horizontal' approach. Redundancy in data traffic is eliminated by introducing redundancy in data storage! This is nothing but edge-caching, and it is implemented through standard CDNs (Content Delivery Networks) and P2P networks.
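The trade between redundant traffic and redundant storage can be seen in a small simulation. This is a rough sketch under simplifying assumptions: locations are plain numbers on a line, and the names `Origin`, `EdgeNode` and `nearest_edge` are hypothetical, not part of any real CDN product.

```python
class Origin:
    """The central store; every fetch from it counts as long-haul traffic."""

    def __init__(self, content):
        self.content = dict(content)
        self.fetches = 0

    def fetch(self, key):
        self.fetches += 1
        return self.content[key]


class EdgeNode:
    """A cache placed close to a group of consumers."""

    def __init__(self, name, location, origin):
        self.name = name
        self.location = location
        self.origin = origin
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            # First request: go to the origin and keep a redundant copy.
            self.cache[key] = self.origin.fetch(key)
        return self.cache[key]


def nearest_edge(edges, user_location):
    """Pick the edge node closest to the user: nearer is faster."""
    return min(edges, key=lambda e: abs(e.location - user_location))
```

If two users near the same edge node ask for the same item, the origin is contacted only once; the second request is answered from the copy stored at the edge. The data is stored twice so that it does not have to travel twice.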
In short, the 'big' Cloud infrastructure is managed through a central command (or a number of commands adhering to a strict set of coordination rules) and a large set of distributed computing resources on a global scale. That's BIG! People also get sentimental when speaking about the scale of its impact, and attach all sorts of odd ideas to it. For example, some information scientists are trying to determine the extent of disruption that storms can cause to cloud services. Oceanic storms can disrupt undersea cables, and storms over land can affect Internet services to consumers and, with them, cloud services. Well, anything of such a big scale can be affected by a great many factors!