In the field of computing, the storage of shared data has traditionally been undertaken using “centralised data management and processing servers” (Gorelik, 2013), referred to in this report as corporate databases. In the 21st century, data storage has developed beyond the corporate database to include mobile devices, e-mail and social media output (Villars, Olofson and Eastwood, 2011), to name but a few. An idea discussed amongst the computing community since the 1960s, this network of shared data between fixed and mobile devices is now widely described as “cloud computing” (Schmidt, 2006).
The concept of cloud computing is now at the heart of modern society: in every city and every walk of life, people engage with technology that interfaces with data stored in the cloud. Social media encourages us to upload photos remotely to share with friends. Our music libraries are stored externally to the devices we play them on; we stream the music we want to hear from the cloud, music we don’t physically own and cannot touch. Even the university (various) encourages us, the student, to upload our latest assignment to a remote drive, seemingly in the sky.
But if we don’t own this data, be it music, files or otherwise, then who does? Who is responsible for problems when and if they occur? What might these problems look like? Is this cloud-based data secure? Who controls it? Who else can see what we choose to share, and what of the data we don’t want to share?
This report aims to touch on some of the issues mentioned above whilst examining the question: Does cloud computing make large corporate databases unnecessary? The research methodology consisted almost exclusively of qualitative research and involved critically evaluating a wide array of academic material, including journals, peer-reviewed papers, books and theses. The main weakness of the conclusions drawn is that secondary quantitative data sourcing was not undertaken, preventing evaluation of statistical data relating to the topic.
Offen and Jeffery (1997) developed the “M3P framework”, which addressed their argument that “the design of an integrated data repository needs to model an organisation’s software development process” (Offen and Jeffery, 1997). In identifying lessons learnt from the analysis of IBM corporate databases, Kitchenham et al. (2009) make it clear that companies employing a physical, on-site solution can model their databases to factor in the software development process of the business concerned. This in theory allows a business to build in its own measures for data security, privacy and regulatory compliance during the design phase.
In theory, this sounds like an ideal solution for those companies and organisations interested in ensuring the security of their own sensitive data. The UK Ministry of Defence (MOD) and banks are two examples of customers which rely heavily on sensitive data and may therefore consider large physical, on-site databases an attractive option.
The perceived advantage of a physical, on-site database solution in maintaining a firm grip on data security arguably addresses one of the major disadvantages of cloud computing highlighted by Yang (2012), who argues that the protection of data must be seriously considered as part of any potential adoption of cloud computing services.
Yang’s argument is interesting in theory, but what does it look like in real life? In a traditional on-premises database, access control relies on the users of the data and the storage servers which host that data being locked within the same trusted domain. And yes, this domain is physical in nature, as opposed to the virtual, cloud-based alternative.
In cloud computing, access control between the parties authorised to access data and the data itself is provided by a number of security features which appear in various guises throughout the industry. One such technique is described by Zhou (2013), who states that “key management is a critical element in cloud computing”; in this technique, data is encrypted before it is outsourced in order to protect its privacy.
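To make the technique concrete, the short sketch below is purely illustrative: it assumes Python’s widely used cryptography library and invented file names, and simply shows data being encrypted on the client side before it is outsourced, so that the provider only ever holds ciphertext while the key, the critical element Zhou refers to, remains under the data owner’s control.

```python
# Illustrative sketch only: client-side encryption before outsourcing data.
# File names and the "upload" step are hypothetical; key management (generating,
# storing and rotating the key) stays with the data owner, not the cloud provider.
from cryptography.fernet import Fernet

# Generate a symmetric key and keep it locally (never send it to the provider).
key = Fernet.generate_key()
with open("local_secret.key", "wb") as key_file:
    key_file.write(key)

cipher = Fernet(key)

# Encrypt the data before it leaves the trusted domain.
plaintext = b"customer record: account 12345, balance 100.00"
ciphertext = cipher.encrypt(plaintext)

# Only the ciphertext would be uploaded to the cloud store.
with open("record_to_upload.bin", "wb") as outfile:
    outfile.write(ciphertext)

# The owner can later retrieve and decrypt the data with the locally held key.
assert cipher.decrypt(ciphertext) == plaintext
```

The design point is that whoever holds the key controls access to the data, which is why its generation, storage and rotation must be treated as seriously as the data itself.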
Whatever access control method is used, there remains the danger of multitenancy: the sharing of storage resources by multiple, distinct organisations. An information security risk could arise in a virtual environment in which numerous virtual servers are hosted entirely within a single physical host computer. In this scenario, as noted by Kajiyama (2012), there is a “risk of access privileges to another virtual machine being granted in error, resulting in a loss of data confidentiality”.
It is worth noting that whilst this kind of mix-up may sound unlikely to the novice, cyber criminals are (if the media is to be believed) attempting to extract access privileges from users of internet banking facilities on an almost daily basis.
If cloud-based computing is the selected option, there are steps which can be taken to ensure data protection within the spheres of transmission, storage and encryption. Buyya (2008) suggests that “in a cloud environment, access to a cloud data center from outside should be handled in a logically isolated communication channel, such as an encrypted IPSec virtual private network (VPN).”
Traditional best practices can be applied to virtualised systems, and implementation could include (Krutz and Vines, 2010):
“Hardening the local operating system, limiting physical access to the host, implementing file integrity checks and maintaining backups.”
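By way of illustration, the sketch below shows how one of these practices, the file integrity check, might be scripted. It is an assumption of how such a check could look (the monitored directory and output file are hypothetical), not a prescribed implementation: a baseline of SHA-256 hashes is recorded and later re-checked, with any changed or missing file flagged.

```python
# Illustrative sketch of a basic file integrity check (hypothetical paths).
# Step 1: record a baseline of SHA-256 hashes; step 2: re-scan and flag changes.
import hashlib
import json
from pathlib import Path

WATCHED_DIR = Path("/srv/database/config")   # example directory to monitor
BASELINE_FILE = Path("baseline_hashes.json")

def hash_file(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_baseline() -> None:
    """Hash every file under the watched directory and store the results."""
    hashes = {str(p): hash_file(p) for p in WATCHED_DIR.rglob("*") if p.is_file()}
    BASELINE_FILE.write_text(json.dumps(hashes, indent=2))

def check_integrity() -> list[str]:
    """Return the paths of files that have changed or disappeared since baseline."""
    baseline = json.loads(BASELINE_FILE.read_text())
    changed = []
    for path_str, old_hash in baseline.items():
        path = Path(path_str)
        if not path.exists() or hash_file(path) != old_hash:
            changed.append(path_str)
    return changed

if __name__ == "__main__":
    if not BASELINE_FILE.exists():
        record_baseline()
    else:
        print("Changed or missing files:", check_integrity())
```

In practice a hardened host would pair such checks with the other measures in the quotation above: a locked-down operating system, restricted physical access and regular backups.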
All of these tasks could potentially be covered, at cost, by the service provider of the cloud database itself, with the specifics of any support set out in a service level agreement between service provider and end user (Kumar and Ravali, 2012).
In terms of a traditional physical corporate database, data security would begin with robust physical-level security: monitoring and governing access to the database servers themselves on-site. This would require physical access controls and, in some circumstances, perhaps even a 24-hour guard presence.
This leads nicely into the question of cost differences between corporate and cloud-based database services. Dwivedi and Mustafee (2010) report the advantages of cloud computing, with specific reference to potential cost savings, including reduced hardware investment and maintenance costs coupled with lower on-site electricity consumption.
Smaller companies with a limited budget which experience peaks and troughs in demand for IT services may only want to pay for the services they use, with the option to increase server coverage as the business grows. As Lin and Chen (2012) point out, some cloud computing services offer a “measured service” (pay per use) which has “no fixed cost, allowing for lower investment with immediate access to cost saving improvement”.
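As a purely illustrative calculation, with every figure invented rather than taken from any provider’s price list, the snippet below compares a fixed-cost in-house server with a metered pay-per-use service and finds the monthly usage level at which the two break even.

```python
# Hypothetical break-even comparison between a fixed-cost in-house server
# and a metered, pay-per-use cloud service. All figures are invented.

FIXED_MONTHLY_COST = 2000.0     # in-house: hardware amortisation, power, staff (assumed)
PRICE_PER_HOUR = 0.50           # cloud: metered price per server-hour (assumed)

def monthly_cloud_cost(hours_used: float) -> float:
    """Pay-per-use: cost scales directly with consumption, with no fixed element."""
    return hours_used * PRICE_PER_HOUR

break_even_hours = FIXED_MONTHLY_COST / PRICE_PER_HOUR
print(f"Break-even at {break_even_hours:.0f} server-hours per month")

# A small business using 1,000 hours a month pays far less on the metered model...
print(monthly_cloud_cost(1_000))   # 500.0
# ...while heavy, constant usage (e.g. 5,000 hours) tips the balance the other way.
print(monthly_cloud_cost(5_000))   # 2500.0
```

The point of the sketch is simply that the metered model favours variable or modest demand, whereas sustained heavy usage can erode the advantage.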
For an in-house database, software updates, whether for security purposes or general application improvements, are likely to be completed by an in-house team. The financial cost to the company will include the staff themselves, the software and the downtime of the data server itself; this is particularly significant if the data relates to, for example, an online business which is open to customers 24/7.
Compare this to a cloud-based service, where the service provider is contractually committed to delivering seamless application and security software updates without any scheduled downtime to company data availability (known as “agile updating”) (Sultan, 2011; Yang, 2012), and it is clear that the cloud service offers a cost-competitive advantage over a corporate database.
That said, it cannot be assumed that a cloud-based system would be immune from the same technical issues that an organisation with a physical corporate database may suffer. Noor (2013) speculates that “server downtime, maturity and performance issues as well as internet service provider outage” represent issues that could potentially impact the availability of data from either a cloud-based or an on-site server.
So on the one hand there is a clearly defined argument that cloud-based databases should, in theory, be more reliable than large corporate databases; on the other, there is a fair weight of academic evidence and opinion that both approaches have the potential to suffer outages arising from similar technical issues.
Let us examine some real-world examples. Black Friday 2016: a growing phenomenon whereby the retail sector looks to increase profits by offering consumers vast savings on stock. The retailer GAME saw its website suffer technical issues as customers swamped the site to grab a bargain. Without digging into the technical data, GAME employs a physical on-site database (various) and suffered downtime due to the significant demands placed on its server hardware and software.
But what of their competitor, CURRYS PC WORLD, whose website utilises a cloud-based database server provided through a third-party contractor? Surely their site could not have suffered a similar fate at the hands of extreme volumes of customer traffic? Well, it did!
So there we have it: two major competitors, employing vastly different methods in the provision of database services to their respective websites, and yet both suffered downtime due to fairly similar technical issues. When considering database services in purely cost terms, does it really matter whether a system is cloud-based or hosted physically on-site? Or is the key question one of reliability, in terms of potential downtime and the subsequent loss of revenue incurred?
Moving on, another issue highlighted with “regard to the storage of digital data” (Gutierrez, 2015) is the implication of allowing a nominated third party to store data gathered by a business. The discussions of Romero (2012) and Dutta (2013), who agree that placing data in third-party hands could lead to “questions over data ownership”, are also worthy of consideration.
It is interesting that this ownership discussion has not stimulated greater interest amongst the UK general public. Surely a user of a social media site “owns” any data they upload to the site of which they are a member? Or does ownership of the data pass automatically to the site operator from the moment a user chooses to upload their information? Taking the social media provider Instagram as an example, you retain ownership of the data you upload (http://www.socialmedialawbulletin.com, 2015). Overall, this lack of interest is almost certainly the result of a lack of education and awareness amongst the average user; or perhaps, in simpler terms, they just don’t care what happens to the posts and pictures they upload.
Whilst the growing availability of social media sites is slowly raising awareness of data protection amongst the general public, organisations in some sectors are already showing themselves to be data savvy. In the banking sector, data of high sensitivity is now routinely stored on data servers over which customers “have no domain or ownership” (Bannister, 2011).
Romero (2012) cites this as an example when explaining why the move to cloud-based database services by organisations charged with handling sensitive data is one of slow progression rather than instant adoption. First the water is “tested” with the migration of less sensitive data, before gradually progressing to the “uploading of more sensitive data to the cloud” (Romero, 2012).
Certainly, with the vast majority of banks now seeking to offer financial protection to insure customers against online fraud, it seems a worthwhile long-term investment to ensure customer data is as secure as possible.
Another consideration is where exactly cloud-based data is hosted in geographical terms. Brender and Markov (2013) highlight that the “location of data centres is a critical consideration as privacy and data protection laws varies for each country.”
Let us consider the United Kingdom post-Brexit. At the time of writing, Britain remains part of the European Union and is therefore bound by its rules and regulations, including those relating to data protection, information sharing and other data-centric laws.
In an example loosely based on the comments of Dutta (2013), consider what happens to data stored by a cloud storage provider based, say, in France. At present, the data privacy laws obeyed by Britain and France are almost identical. However, the moment the United Kingdom formally ceases to be part of the European Union, this will change: the data laws followed by the French-based cloud provider may come to differ completely from our own.
What is interesting in the above discussion is the question of who is responsible for ensuring adherence to data protection law and, more importantly, what happens if an organisation employing cloud database services decides to terminate that agreement.
Herein lies a significant but often unmentioned disadvantage of cloud-based service provision: the “vendor lock-in” scenario, as it is known within the industry. Lock-in usually occurs in one of two situations. The first arises when a significant amount of data is held within a cloud storage solution; the sheer volume of the data held means that it becomes “extremely costly to transfer the data to another provider” (Marinescu, 2012).
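A rough, purely hypothetical calculation illustrates the scale of the problem; both the data volume and the per-gigabyte transfer fee below are invented for the sake of the example rather than drawn from any provider’s pricing.

```python
# Hypothetical illustration of data-egress cost when leaving a provider.
# Both the data volume and the per-gigabyte fee are invented figures.
data_held_tb = 500                    # total data held with the provider (assumed)
egress_fee_per_gb = 0.09              # outbound transfer fee, currency units per GB (assumed)

transfer_cost = data_held_tb * 1_000 * egress_fee_per_gb
print(f"One-off cost just to move the data out: {transfer_cost:,.0f}")   # 45,000
```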
The second circumstance often arises from a lack of common standards and protocols for application programming interfaces (APIs) (Armbrust, 2009). Here, if the API used for the cloud application is proprietary (as is commonly and purposefully the case), migrating data between different cloud-based database providers would prove significantly complex, and thus costly, in programming terms (Armbrust, 2009).
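One way to picture why a proprietary API causes lock-in is the sketch below. It is not modelled on any real provider’s SDK: the classes and method names are invented to show how provider-specific calls scattered through application code make migration costly, and how a thin abstraction layer confines the change to a single adapter.

```python
# Illustrative sketch only: the provider classes and method names below are
# invented to show the shape of the problem, not any real cloud API.
from abc import ABC, abstractmethod

class ObjectStore(ABC):
    """Minimal storage interface the application codes against."""
    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...
    @abstractmethod
    def get(self, key: str) -> bytes: ...

class ProviderAStore(ObjectStore):
    """Adapter wrapping hypothetical Provider A's proprietary calls."""
    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}      # stands in for Provider A's SDK
    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data
    def get(self, key: str) -> bytes:
        return self._blobs[key]

class ProviderBStore(ObjectStore):
    """Adapter for a second, hypothetical provider with a different API shape."""
    def __init__(self) -> None:
        self._records: list[tuple[str, bytes]] = []
    def put(self, key: str, data: bytes) -> None:
        self._records.append((key, data))
    def get(self, key: str) -> bytes:
        return next(data for k, data in self._records if k == key)

def archive_invoice(store: ObjectStore, invoice_id: str, contents: bytes) -> None:
    # Application logic depends only on the abstract interface, so swapping
    # providers means writing a new adapter rather than rewriting the application.
    store.put(f"invoices/{invoice_id}", contents)

archive_invoice(ProviderAStore(), "2016-001", b"...")
archive_invoice(ProviderBStore(), "2016-001", b"...")
```

Without such a layer, every provider-specific call would have to be found and rewritten during a migration, which is precisely the programming cost Armbrust describes.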
It is clear that vendor lock-in has the potential to reduce the cost effectiveness of employing cloud-based database technology. What is less clear is the potential impact of a service provider ceasing to operate; there is little defence against this from a customer perspective, short of hoping a market competitor buys the failing business or the data agreements it is responsible for. A useful historical example of such a failure is Linkup, a cloud data provider which ceased trading in 2008 at very short notice, resulting in the loss of 45% of customer data almost overnight (Armbrust, 2009).
Large corporate databases are not immune from the impact of company financial loss and a subsequent cessation of trading either, but the difference is that if the company concerned ceases to trade, it is unlikely to have any further requirement for the data it held, although that data could have some monetary value during the liquidation process and could be bought by a competitor.
In concluding this essay, it is clear that the employment of either a large corporate database or a cloud-based computing solution warrants careful consideration, with each option having distinct advantages and disadvantages.
As highlighted by Kitchenham (2009), physical, on-site solutions allow an organisation to plan its database from the concept phase up, considering data security, privacy and regulatory compliance. These same considerations (Yang, 2012) effectively become the responsibility of a nominated third party if a cloud-based computing solution is employed. Although numerous security solutions exist, such as “key management” (Zhou, 2013) and the use of “an encrypted IPSec VPN” (Buyya, 2008), the fact that a third party controls access to the data is one an organisation cannot escape.
As Brender and Markov (2013) discuss, the geographical location of cloud data servers is crucial to understanding how data privacy laws and governance potentially impact the way data is employed by an organisation. Key political events (think Brexit) could also produce second- and third-order effects on the business space in rapid fashion.
The cost effectiveness achieved in employing a cloud solution is amongst its most favourable advantages. The reduced initial capital expenditure (Dwivedi and Mustafee, 2010), the lower ongoing investment in hardware and software maintenance, and the ability to expand a data server network in line with company growth make for an attractive option, especially for smaller companies and start-ups.
However, as Noor (2013) points out, the use of cloud data rather than a physical corporate database cannot fully mitigate the potential for server downtime due to technical issues, and as such there can be no cast-iron financial guarantee insuring against revenue loss due to IT failure.
The costs associated with “vendor lock-in” (Marinescu, 2012) must also be considered and planned for; these, again, have the potential to significantly reduce any cost advantage gained by utilising a cloud-based computing service.
So, does cloud computing make large corporate databases unnecessary? The answer is: not entirely, perhaps not even nearly. What cloud computing does is offer an alternative to the traditional, large corporate database: an alternative that may prove more cost effective, particularly for new and smaller businesses, but one which must be weighed carefully at the strategic management level to ensure it fully supports and meets the strategic aims of the organisation concerned.
27.11.2016