In Denmark, the discussion between security freaks and avant-garde computer nerds has raged this summer over whether the Danish public sector should pursue cloud computing aggressively. (My blog, in Danish, covers this discussion.) The opponents are led by the IT Council for Security, while the proponents, amazingly, are led by the Minister of Science, Charlotte Sahl-Madsen, who sees an opportunity to involve a number of innovative small companies in utilizing the vast amount of public data that has until now been stored away in departmental systems, in bunkers or behind firewalls.
Gartner has recently published its 'Hype Cycle' for Cloud Computing, which suggests that the concept is currently at the very peak of the hype and bound to go through the 'Trough of Disillusionment' before we start to see real large-scale public use of the technology.
The opponents point to three reasons why the public sector should be extremely careful about deploying Cloud Computing (CC) on a large scale:
Cloud Computing should comply with all existing legislation concerning data security.
Cloud Computing solutions should follow 'best practices' and commonly accepted standards, and providers should be certified as complying with them.
Critical data should be handled according to national and EU emergency standards, which currently demand that critical data (citizen records, health records, state financial records, military and police records) be guarded and kept under the full physical control of the responsible departments or public organizations.
The Minister of Science is calling for a fresh look at data security, and even the EU is expected to propose easing the European directives on data protection to stimulate the use of Cloud Computing.
But what are the facts, and what are the real security challenges in Cloud Computing? Are there new technological developments that will ensure that the intent of the legislation is honoured even if the rigorous requirement of 'physical control and inspection' cannot be met in its original sense?
First we need to agree on what Cloud Computing is and what it isn't. In February 2009 the UC Berkeley Reliable Adaptive Distributed Systems Laboratory (RAD Lab) published a technical report, 'Above the Clouds: A Berkeley View of Cloud Computing', describing the phenomenon.
They point out that Cloud Computing refers both to the applications delivered as services over the internet and to the hardware and systems software that provide these services. The applications themselves are labelled Software as a Service (SaaS). The underlying HW and SW is described as the cloud. When a cloud is made available to the public in a pay-as-you-go manner, we call it a Public Cloud. The service being sold is called Utility Computing. In contrast, a Private Cloud refers to internal data centers, or data centers not made available to anyone other than one customer (or a coherent set of customers). The Private Cloud is excluded from the Berkeley sense of Cloud Computing, even if pricing and, to a certain degree, scalability might be similar to 'Real Cloud Computing'.
The business model of Cloud Computing in its original form thus consists of this 'value chain': the SaaS user is the end user who buys the services, i.e. the web applications. He might buy these services from a SaaS provider, who can in turn be seen as a Cloud user that buys the underlying HW and basic SW capabilities from a Cloud provider. Of course, the Cloud provider and the SaaS provider might be one and the same organization.
The classes of Utility Computing on offer vary, of course, with the individual business models (Amazon, Microsoft Azure, Google, IBM etc.). The differences show up in the computation model (virtual machines running on 'hypervisors', or a 'farm' of VMs), in the storage model, including variations in database software provisioning, and finally in the network model, ranging from pre-defined IP address spaces and dynamic allocation to fixed topologies, 'elastic zones' etc.
This variation reflects the lack of common standards, but it is of course also a result of the intense competition in the marketplace. It has led to new, commonly used definitions apart from SaaS: Platform-as-a-Service (PaaS), typically covering major application areas (SAP, for instance), and Infrastructure-as-a-Service (IaaS). With the latter in particular, the organization that buys IaaS simply buys the basic building blocks and is otherwise left with the responsibility of populating the cloud with its own applications and development tools. This type of cloud business model is especially appealing to SW developers, while the PaaS model aims at solving the problems of enterprises (or government institutions).
To confuse matters further, the terms 'Community Cloud' and 'Hybrid Cloud' have emerged recently. The former simply refers to a private cloud with many tenants (municipalities or agencies, for instance), while Hybrid indicates that some of the applications in the portfolio are running in the Public Cloud while other services in the portfolio remain in the Private Cloud.
The application opportunities that have emerged over the last couple of years from this combination of highly scalable, robust 'back end' systems and an aggressive 'pay-as-you-go' pricing model cover interesting areas ranging from mobile interactive applications, large-scale batch processing jobs and heavy data-analytics jobs to more straightforward desktop service solutions that avoid the tedious updating and installation of new SW on each and every employee's desktop.
The economics of Cloud Computing explain why the Berkeley team did not want to include 'Private Clouds' in the definition: the huge requirements for back-end capacity and the real need for economies of scale make it almost mandatory that the players are very large companies that can afford the risk of large-scale investments and, on the other hand, are able to manage and balance load extremely efficiently. Sharp changes in capacity demand, seasonal fluctuations, back-up requirements, fail-over facilities etc., plus of course traffic bottlenecks in communication equipment, clearly make the unit costs of Cloud providers competitive compared to traditional data centers. This is the major argument for US CIO Vivek Kundra's Federal Cloud Strategy 'Cloud First', announced in February 2011, where it is stated that at least 20 billion US $ out of the total 80 billion $ spent by federal agencies in 2011 could be saved by moving to the cloud.
Other economic benefits of moving to the Cloud as stated by Vivek Kundra are the improved productivity in application development, fast response time to rising needs, better linkage to emerging technologies and alignment with private sector innovation.
It is tempting to argue that the huge difference between the European 'Cloud sceptics' and Vivek Kundra is that the US is desperately in need of savings and in any case not very concerned about privacy. The Europeans, on the other hand, have had privacy legislation for more than ten years, now adopted in almost all EU countries, and at least in some countries (led by Germany) privacy plays a very large role in public opinion. This is the reason why the EU Commission has requested an in-depth study of the legal and technical implications of Cloud Computing in light of the EU Data Protection Directive (95/46/EC). Any resulting changes to the directive are not due before spring 2012.
OK, but where are the REAL risks?
Looking at Figure 1 we can identify 4 classes of risks of malicious, unauthorized access:
End-user access: unauthorized attempts to read, steal or modify data, denial-of-service (DoS) attacks etc.
Inter-cloud access: using external services to hide malware, worms, Trojan horses etc.
Tenant-related 'insider' attempts to gain unauthorized access
Cloud Provider administrator 'insider' attacks
Add to these the risk of errors in a shared physical environment if VM instances, and in particular data, are not kept isolated from those of other tenants.
Let us try to address the question of how a secure cloud solution could be established following the 3 items mentioned by the Danish Council for IT Security.
Comply with Data Protection laws.
Their first concern is that a cloud and SaaS provider should comply with current legislation. At its root this is the requirement that only authorized persons be guaranteed access to data that contains PID (Personally Identifiable Data).
You could argue that it is first and foremost the responsibility of the department that collects and stores the data to manage access rights and to ensure that the Cloud provider follows the requirements in this respect: that it logs access as well as access attempts, and prevents unauthorized access to customer data, also by its own staff. Ultimately this requires that a government agency wanting to move to the cloud maintain a security organization that issues, or controls the issuance of, access certificates and up-to-date Access Control Lists.
The same requirements must be part of any procurement of, and contract with, cloud providers.
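To make the combination of Access Control Lists and logging of every access attempt concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the class name, the user IDs and the record class "health_records" are all hypothetical, and a real deployment would rely on certificates and a tamper-proof log rather than an in-memory list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessControlList:
    """Hypothetical ACL: maps a record class to the user IDs allowed to read it."""
    readers: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def grant(self, user_id: str, record_class: str) -> None:
        # The data-owning agency, not the cloud provider, decides who is listed.
        self.readers.setdefault(record_class, set()).add(user_id)

    def revoke(self, user_id: str, record_class: str) -> None:
        self.readers.get(record_class, set()).discard(user_id)

    def check(self, user_id: str, record_class: str) -> bool:
        allowed = user_id in self.readers.get(record_class, set())
        # Log every attempt, granted or denied, so attempts can be inspected later.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user_id,
            "record_class": record_class,
            "allowed": allowed,
        })
        return allowed


acl = AccessControlList()
acl.grant("nurse-17", "health_records")

staff_ok = acl.check("nurse-17", "health_records")             # listed user
admin_ok = acl.check("provider-admin-3", "health_records")     # provider's own staff, not listed
denied = [entry for entry in acl.audit_log if not entry["allowed"]]
print(staff_ok, admin_ok, len(acl.audit_log), denied[0]["user"])
```

The point of the design is that a denied attempt by the provider's own administrator leaves the same kind of trace as a successful read by an authorized user.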
Cloud provider should adopt 'best practices' and standards and be certified.
Such standards are only emerging at this stage, and this may be a tough requirement to impose before international standardization agencies and governments agree on them. But there are practices that could help guide the Cloud vendor. (See also Google's White Paper on Cloud Security.)
First of all, the psychological fear of privacy breaches could be allayed if the cloud provider could guarantee that no intruder passes the firewall and accesses user data. We have already discussed this first level. The second level of attack could be hidden in the cross-application communication that will eventually occur when external clouds exchange services and data with the tenants of one cloud operator. It might be simple things like services for currency exchange calculations, or it might be much more SOA-like, cross-domain data analytics, for instance health care research across several EHR systems.
The more basic services that protect against things like VLAN spoofing and firewall configuration errors should in any case be part of a contract with a trusted provider, but the Achilles heel is the second type: access to primary data across a number of domains.
We will discuss this in more detail later to see whether there are technological developments ahead.
The 3rd requirement from the Security Council concerns national security regulations. These requirements are probably more difficult to deal with if the legislation is taken literally: that inspectors should be able to physically identify databases and storage devices. That is, of course, unless you are able to prove that a) the data are protected from any 3rd party reading and interpretation, and b) even in the case of destruction you are able to reconstruct the data in a secure way. One would think that Vivek Kundra had this in mind but came to the conclusion that all major cloud providers are deeply rooted in the US.
There are, however, some inconsistencies in the current legislation. While it is understood that no foreign power should gain access to a complete Person Registration system, it is questionable why a financial system and its related data should be considered 'strategic' if the data are duplicated and can be reproduced. This is the position of the Danish Minister of Science. But the intention specified in national legislation, to keep strategic data within the country, seems to be more of a legal problem, especially if the true meaning of the word 'control' can be satisfied in other ways. If you have full back-up and recovery of data, full duplication, and have secured your data with top-class (maybe even dynamic) encryption, you could argue that you, the data owner, have full control.
So it seems that there are technical ways around at least most of the issues. But how realistic is the idea of having all public institutions manage access rights and secure encryption of the data, and still be able to exchange and exploit data across institutional borders in a 'community-like' public cloud environment? Let us assume that all policy aspects are solved, agreements on responsibilities between tenant and cloud provider (and SaaS provider) are in place, SLAs are established and best practices are certified. Is the idea of a grand scheme of encrypted data then at all feasible?
Already in 2009 the IBM researcher Craig Gentry proposed a method for what he called homomorphic encryption, which would enable users with appropriate access rights to access and calculate on encrypted data across domains in such a way that the final result, once decrypted, is the same as if the operation had been performed on the 'raw' data. If this is a feasible and economically practical solution, then most of the obstacles to putting sensitive data in the Cloud disappear. But as usual, there is no such thing as a free lunch. At the current state of development, the computational requirements of this 2-step encryption method are enormous, at present at least enough to cancel out the economic advantages of Cloud Computing. So we will still have to wait for the costs of computing to come down. As developments seem to be accelerating even beyond Moore's law, it is now believed that homomorphic encryption will become economically feasible within the next 5 years. And when that is solved, the interim requirements for close control of the Cloud provider might be relaxed.
But in any case we still need to agree on and clean up the standards, define and refine policies and practices, and implement large-scale ID management and authorization. The answer to the question of whether the technical risks of cloud computing are real is yes, and they will be solved. Eventually.
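To give a feel for what 'calculating on encrypted data' means, here is a toy sketch in Python. Note the assumptions: this is NOT Gentry's scheme but the much older and simpler Paillier cryptosystem, which supports only addition on encrypted values, and the primes are tiny illustrative values that offer no security whatsoever. The function names and the two amounts are hypothetical.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt m (0 <= m < n) under public key n as g^m * r^n mod n^2."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n: int, priv, c: int) -> int:
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the hidden plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


# Two agencies encrypt their own figures; the cloud adds them without seeing them.
n, priv = keygen(1009, 1013)
c_a = encrypt(n, 52000)
c_b = encrypt(n, 61000)
c_sum = add_encrypted(n, c_a, c_b)   # computed on ciphertexts only
total = decrypt(n, priv, c_sum)      # only the key holder can read the result
print(total)
```

Decrypting the combined ciphertext yields 113000, the sum of the two hidden amounts, although the party doing the addition never saw either of them. Gentry's contribution was to show that arbitrary computations, not just additions, can be supported this way, which is where the enormous computational cost comes from.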