Archive for February, 2009
As the architect of a cloud-based enterprise software-as-a-service (SaaS) product, Adobe LiveCycle Express, I get asked a lot about the security of cloud-based applications. They are all the questions you would expect: “Is my data secure in the cloud?”; “How safe are cloud service environments?”; “Aren’t you just asking to be hacked?”. My response to all of these questions is, first, to tell people to take a deep breath and relax, and second, to think of the cloud as any other computer system that faces security threats and for which you need to develop a threat response. Security for cloud-based applications is actually a multifaceted problem, with distinct threats and responses for network security, operating system security, data security, and virtualization security. Each of these areas, however, has an analog in non-cloud-based applications that has been analyzed and for which there are security procedures to help reduce threats. So, ultimately, we just need to ignore the hype surrounding cloud computing and focus on the fundamental computer system security issues that applications will face in this domain. Some brief commentary on each of the threat areas faced by cloud-based applications follows, along with a note about regulatory issues. As with anything you read on the internet, your mileage may vary.
Network Security for Cloud-based Enterprise Applications
The ideal network security paradigm for cloud-based enterprise applications would be to have the cloud services be an extension of the customer’s existing internal networks, using all of the same protection measures and security precautions that their administrators have developed. This implies a strategy that would allow a customer to extend their network security envelope to encompass the cloud services that they use. For my application, I have chosen to use an encrypted TCP port forwarding strategy to extend the customer’s network security envelope into the cloud. Specifically, I use an implementation of the ssh protocol that performs bi-directional TCP traffic forwarding across encrypted, compressed connections between cloud-based virtual instances and customer machines behind their corporate firewalls. The encryption strength is configurable, and is at least as secure as any corporate VPN encryption. This is another way of saying that it does not introduce any new network threats beyond what a customer already faces with their existing VPN technologies. Less paranoid customers can choose to downgrade the security of their cloud instances and access them directly via HTTP and HTTPS, but my recommended approach is to use encrypted traffic forwarding, both for robust security and to translate the problem of cloud-based network security into terms and concepts — VPNs and encryption — that system administrators will understand. In reducing the problem of cloud-based network security to one of corporate network security, we simplify the problem of assessing and responding to cloud-based network security threats.
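As a rough illustration of the forwarding strategy described above, here is a minimal sketch that assumes the stock OpenSSH client (the post does not say which ssh implementation is used). The host names, port numbers, and the `admin` login are hypothetical placeholders. The function only prints the command so that it can be reviewed before anyone runs it.

```shell
#!/bin/sh
# Build (but do not run) an OpenSSH command that forwards TCP traffic in both
# directions over a single encrypted, compressed connection.
# All host names, ports, and the "admin" user are hypothetical placeholders.
tunnel_command() {
    cloud_host=$1    # address of the cloud instance running sshd
    local_port=$2    # port opened on the customer machine
    app_port=$3      # application port on the cloud instance
    remote_port=$4   # port opened on the cloud instance
    internal=$5      # host:port of a service behind the corporate firewall

    # -N  forwarding only, no remote command
    # -C  compress the connection
    # -L  customer machine -> application on the cloud instance
    # -R  cloud instance -> service behind the corporate firewall
    echo "ssh -N -C -L ${local_port}:localhost:${app_port} -R ${remote_port}:${internal} admin@${cloud_host}"
}

tunnel_command cloud.example.com 8080 8080 1433 db.corp.example.com:1433
```

Because the connection is initiated from inside the corporate network, only outbound traffic to port 22 on the cloud instance has to be allowed through the customer's firewall.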
Operating System Security for Cloud-based Enterprise Applications
Cloud-based virtual instances face the same OS-level threats as standard hardware-based installs. The typical responses to these threats are also the same (e.g. block all non-essential listen ports via firewalls, keep the OS patches current, and run modern antivirus software). For my enterprise application, I recommend blocking all ports except the ssh port (port 22) at the firewall, updating patches periodically and rolling them into new virtual images, and installing antivirus software; these measures collectively ensure that the virtual instances face no new OS-level threats beyond what a customer already faces with their existing hardware. As with network security, by reducing the problem of cloud-based operating system security to one of traditional hardware-based operating system security we simplify the problem of assessing and responding to threats.
Data Security for Cloud-based Enterprise Applications
In cloud-based applications, data encryption is the simplest way to protect data. Data in a cloud-based application will likely be transiting public networks and shared resources, and it makes sense to protect it against simple observation. Modern encryption technologies will withstand all but the most determined attempts to crack them, and while they require some key sharing infrastructure to implement them properly, they can be deployed using traditional methods employed for non-cloud-based applications. There are potential performance implications for data encryption, but, in my opinion, when using modern hardware and fast networks with a well-designed application, the performance implications can be effectively minimized. In my application, virtual instances store their backups off-instance within a cloud storage service. These backups are encrypted using a PKI key pair, with only the public key stored on the virtual instance. The data storage needs of virtual instances in my application are simple and infrequent, so the application easily lends itself to a data encryption strategy. Not all applications will have these characteristics; however, there are very few alternatives that can ensure data security in cloud-based applications. Thus, data encryption in some form will likely be a necessary evil of all cloud-based applications…but it’s not really that evil. And once again, we have reduced the problem of data security in cloud-based applications to an existing, non-cloud-based technology domain.
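One way to encrypt backups when only the public key lives on the instance is hybrid encryption. The sketch below is an assumption about how this could be done with the `openssl` command-line tool, not a description of the application's actual backup code. The function name and file names are hypothetical.

```shell
#!/bin/sh
# Hybrid encryption of a backup file using only a PUBLIC key.
# Assumes the openssl command-line tool is installed; file names are
# hypothetical. The matching private key never has to be on the instance.
encrypt_backup() {
    backup=$1                # plaintext backup file
    pubkey=$2                # PEM-encoded RSA public key
    keyfile="${backup}.key"  # temporary one-time session key

    # 1. Generate a random one-time session key.
    openssl rand -hex 32 > "$keyfile" || return 1
    # 2. Encrypt the backup with AES-256 using the session key.
    openssl enc -aes-256-cbc -salt -in "$backup" -out "${backup}.enc" \
        -pass "file:${keyfile}" || return 1
    # 3. Encrypt the session key with the public key.
    openssl pkeyutl -encrypt -pubin -inkey "$pubkey" \
        -in "$keyfile" -out "${keyfile}.enc" || return 1
    # 4. Remove the plaintext session key; only the holder of the private
    #    key can now recover it and decrypt the backup.
    rm -f "$keyfile"
}

# Usage (hypothetical file names):
#   encrypt_backup backup.tar backup_public.pem
```

To restore, the holder of the private key would first recover the session key with `openssl pkeyutl -decrypt` and then run `openssl enc -d` with the same cipher. Recent OpenSSL releases also suggest adding `-pbkdf2` to both `enc` invocations.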
Virtualization Security for Cloud-based Enterprise Applications
Cloud-based virtual instances face resource-sharing threats that are inherent to the virtualization environment. These threats, however, are similar to the resource-sharing threats seen by enterprise applications that share hardware resources, either by design or necessity, within corporate data centers, and for which there are some procedures and assumptions that can be put to use in the cloud domain. For cloud-based resource-sharing threats, the problem is largely within the domain of the cloud virtualization provider, and it is their responsibility to respond to the threat. Customers of a virtualization service are ultimately trusting the virtualization service provider to guarantee the inter-instance security of the virtual instances, much as shared applications on common hardware are trusting the operating system to enforce inter-process security. Cloud-based applications rely on cloud services to enforce inter-instance security and guarantee that shared memory and disk resources cannot be used to access another customer’s applications. Many cloud service providers run their own business applications within the same virtualization fabric, so it is fair to assume that inter-instance security is being monitored by corporate security professionals. In my opinion, the security budgets of the major cloud service providers far exceed the security budgets of most cloud service consumers, and thus the degree of virtualization security provided in these service environments will be at least as good as the resource-sharing security found within a customer’s data center. For example, Amazon runs its flagship business within the same data centers as its cloud services, and I imagine that it has some of the best security professionals in the industry ensuring that its business applications are not at risk from resource-sharing threats originating in its cloud services.
Ultimately, however, the threat is not substantially different than the hardware-based shared resource threats that customers currently face in their own datacenters. We can reduce the issue of virtualization security to one that is similar to an existing threat, and for which there are standard practices to assess and respond to that threat.
Regulatory Issues for Cloud-based Enterprise Applications
A discussion of security for cloud-based enterprise applications would not be complete without some comments on the regulatory issues that come into play when enterprise applications, and specifically enterprise application data, are moved into cloud-based services. Legislation and regulation in many industries have led to strict guidelines regarding enterprise data, primarily to protect the privacy of individuals. For example, the Health Insurance Portability and Accountability Act (HIPAA) contains strict provisions on what can be done with individuals’ health records, and what permissions must be explicitly received from those individuals for any transfer of those records. For an enterprise application that contains health records, this presents an open question that must be answered before that application can integrate cloud services into its architecture: How do the regulations within HIPAA apply to cloud-based services, and what restrictions do they place, if any, upon the use of third-party cloud services within enterprise healthcare applications? The answer to this question, and similar questions for regulations in other industries, is evolving rapidly, but it will ultimately be answered by legislation and regulation rather than by technology. However, progress towards an answer will be made by software vendors and enterprise customers that make the case that threats faced by cloud-based enterprise applications are ultimately no different, and no more severe, than the existing threats faced by those applications in their current deployment formats.
I have tried to make the argument that the security threats seen by cloud-based enterprise applications are translations of threats seen by existing enterprise applications within corporate data centers, and thus there are existing procedures and responses that can be put to use in designing an appropriate security strategy. To paraphrase Bruce Schneier, security is theater, and the perception of security for cloud-based enterprise applications is likely more important than the actual manifestation of the security mechanisms. This has been true at each evolutionary stage for enterprise systems — does everyone remember the paranoia surrounding the first attempts to connect corporate networks to the public internet? — and the continuing deployment of ever more cloud-based applications will need to reach a critical mass, with respect to the perceived security story, before it becomes effectively mainstream. But that day is coming.
Hope this helps.
Recently, I needed to configure a Windows 2003 AMI in EC2 to run an ssh server. I would have expected this to be a simple job, with a variety of choices for making this work, but in the end it was far more time-consuming, complicated, and frustrating than I would have guessed. Here is a quick road map of what I did.
My initial thought was that there must be a free, native port of openssh for Windows that installs as a service and otherwise conforms to the Windows environment…wrong! I can’t tell you why this is the case (maybe ssh is just not a microsofty way of doing remote terminals and file transfers), but I couldn’t find anything resembling a free, functional port of openssh for Windows. I found a few blog posts that mentioned that people had tried this, but ultimately they gave up when faced with the integration between openssh’s user/group namespace functions and Windows’ user/group concepts (to say nothing of the differences between the Windows command prompt and the UNIX shells). And these blog posts ultimately suggested that it was easier to run sshd via cygwin than it would be to port sshd to run natively. So…cygwin time!
UNIX is my OS of choice, and I’ve had cygwin on every Windows box I have ever had, so it was a quick jump to download the cygwin installer and install the packages I needed on a freshly started Windows 2003 instance in EC2 (incidentally, I am running the 64-bit, large EC2 instance AMI of Windows 2003 Server with SQL Server Express and no Authentication Services). The openssh package comes with a simple script, ssh-host-config, to generate the server host keys and create the users needed for privilege separation, so it was a nice, simple, relatively painless install. There are a few things that the config script misses, however, which require you to run it several times before it ultimately succeeds (although it is nice enough to point out the problem each time and prompt you to fix it). After playing with it, I came up with the following actions to perform before running ssh-host-config in order to make it succeed the first time without errors:
0) Add the following line to /cygwin.bat:
set CYGWIN=binmode tty ntsec
1) Run a new cygwin bash shell (after the edit of cygwin.bat) and enter:
mount -s --change-cygdrive-prefix /
chmod +r /etc/passwd /etc/group
chmod 755 /var
2) Run a new cygwin bash shell (to pick up the cygdrive prefix change) and enter:
ssh-host-config
and answer its prompts as follows:
-- yes for privilege separation
-- "binmode tty ntsec" for CYGWIN environment variable setting for the service
-- enter your password of choice for the cyg_server account
3) Enter the following to start sshd:
net start sshd
4) Open the Windows Firewall editor, and add an exception for TCP traffic on port 22 for sshd.
5) If you haven’t already done so, open up port 22 for your EC2 instance group (assuming you are running your instance in the default group):
ec2-authorize -p 22 default
If everything went well, sshd is running and available on port 22, and you can login normally via ssh from other machines. All that is left to do is bundle up a new AMI to capture the cygwin installation…and that should be a piece of cake, right? The updated EC2 API has a new method — ec2-bundle-instance — that kicks off an AMI bundling job for an EC2 instance running Windows, so it should be as simple as calling this method and then grabbing a beer to wait for it to complete. If only it were that simple…
Unlike the AMI bundling scripts for Linux-based EC2 instances, which are ultimately just packaging up the existing file system, the Windows AMI bundling mechanism needs to perform several Windows-specific functions that are ultimately a real pain in the neck. First and foremost is sysprep. Sysprep is Microsoft’s answer to the problem of Windows virtualization; apparently the simple cloning of a Windows installation is not acceptable, and a new Windows SID should be generated for each new instantiation of a Windows virtual image. Sysprep does some other things, too (search for sysprep on Microsoft’s support web site for a more complete description — I am certainly not an expert on it), but ultimately the SID generation is the one that causes problems for a lot of installed software…like cygwin. After bundling a new AMI and starting a new instance with it, I found that sshd is hosed for no apparent reason. Attempts to start sshd via “net start sshd” produce the following cryptic error message:
The CYGWIN sshd service is starting.
The CYGWIN sshd service could not be started.
The service did not report an error.
More help is available by typing NET HELPMSG 3534.
After several time-consuming iterations of start new instance -> install cygwin -> bundle new AMI -> start new AMI instance -> wonder why sshd is hosed, I found something in the HKEY_USERS tree of the Windows registry that changes after the bundling step. Prior to bundling, with a functioning cygwin/sshd, I see the following in the registry:
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-1013\Software\Cygnus Solutions\Cygwin\mounts v2]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-1013\Software\Cygnus Solutions\Cygwin\Program Options]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-500\Software\Cygnus Solutions\Cygwin\mounts v2]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-500\Software\Cygnus Solutions\Cygwin\Program Options]
After bundling, in a new instance in which sshd is hosed, I see the following in the registry:
[HKEY_USERS\S-1-5-21-4261372910-2505678249-1238160980-500\Software\Cygnus Solutions]
[HKEY_USERS\S-1-5-21-4261372910-2505678249-1238160980-500\Software\Cygnus Solutions\Cygwin]
[HKEY_USERS\S-1-5-21-4261372910-2505678249-1238160980-500\Software\Cygnus Solutions\Cygwin\mounts v2]
[HKEY_USERS\S-1-5-21-4261372910-2505678249-1238160980-500\Software\Cygnus Solutions\Cygwin\Program Options]
All of the other registry entries related to cygwin remain the same before and after the bundling step, so my guess is that the loss of entries in the bundled instance is the source of the trouble. But what exactly are those entries?
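Comparing the two listings by eye is error-prone because the long numeric identifier changes after bundling. The following is a minimal sketch of how the comparison could be automated. It assumes the key names have been saved to two text files, one bracketed key per line in the form shown above. The file names and function names are hypothetical, and this is not the procedure actually used in this post.

```shell
#!/bin/sh
# Sketch: list Cygwin registry keys present before bundling but absent after,
# comparing by the trailing account number rather than by the full identifier
# (whose machine-specific part changes when sysprep runs).

# Reduce "[HKEY_USERS\S-1-5-21-a-b-c-<id>\<path>]" to "<id> <path>".
# Carriage returns are stripped in case the file came from a Windows editor.
normalize_keys() {
    tr -d '\r' < "$1" |
        sed -n 's/^\[HKEY_USERS\\S-1-5-21-[0-9]*-[0-9]*-[0-9]*-\([0-9]*\)\\\(.*\)]$/\1 \2/p' |
        LC_ALL=C sort -u
}

# Print normalized keys found in the "before" file but not in the "after" file.
missing_after_bundling() {
    before_tmp=$(mktemp) || return 1
    after_tmp=$(mktemp) || return 1
    normalize_keys "$1" > "$before_tmp"
    normalize_keys "$2" > "$after_tmp"
    LC_ALL=C comm -23 "$before_tmp" "$after_tmp"
    rm -f "$before_tmp" "$after_tmp"
}

# Usage (hypothetical file names):
#   missing_after_bundling before.txt after.txt
```

Run against the two listings above, this would report only the keys under the identifier ending in 1013, since the keys ending in 500 exist in both.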
Again, I’m no Windows expert, but the entries in question appear to consist of the machine’s SID followed by a user identifier (e.g. in S-1-5-21-4261372910-2505678249-1238160980-500, S-1-5-21-4261372910-2505678249-1238160980 is the machine SID, and 500 is the user id, which Windows calls the relative identifier, or RID). Looking at the /etc/passwd file for cygwin, the user id 500 corresponds to the Administrator account, and user id 1013 corresponds to the cyg_server account, used by sshd as a privileged account for switching effective user ids during login. So, my hypothesis is that the privileges for the cyg_server account are somehow lost by sysprep during the bundling step, and sshd is hosed without them in the new bundled AMI instance.
To test my hypothesis, I decided to configure the AMI bundling step to skip sysprep. The base Windows EC2 AMIs come with an application in the start menu called “ec2Service Setting” that has a check box to enable/disable sysprep during AMI bundling, so it is easy enough to test this. However, I have no idea what happens to Windows if I disable sysprep during bundling, and I was not able to find a satisfactory answer via internet searches. The closest I got to an answer was to see several of the Amazon admins on the EC2 forum comment that it was not a good idea to disable sysprep if you were going to instantiate multiple instances. I also found several documents online that discussed how sysprep was used to sanitize a Windows installation, generate a new SID, and make it generic for installation on any type of hardware. Since the virtual hardware of EC2 is, roughly speaking, identical (given that it is using Xen underneath the hood), I’m not too worried about the hardware issue. I have no idea about “sanitizing” the Windows instance or SID generation, though, so bundling without sysprep might mortally wound Windows (again…I’m no Windows expert). And I do want to run multiple instances from the bundled AMI, so that might be a non-starter as well.
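The identifier breakdown used in the hypothesis above can be sketched in a few lines of shell. This assumes, as observed in this post, that cygwin’s /etc/passwd uses the trailing number of the identifier (what Windows calls the relative identifier) as the uid. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: split a Windows account SID into its machine-specific prefix and the
# trailing per-account number, then look that number up in a passwd file.

sid_prefix() { printf '%s\n' "${1%-*}"; }    # everything before the last "-"
sid_rid()    { printf '%s\n' "${1##*-}"; }   # the number after the last "-"

# Print the account name whose uid (third passwd field) matches $1.
# $2 is the passwd file to read; it defaults to /etc/passwd.
account_for_uid() {
    awk -F: -v uid="$1" '$3 == uid { print $1 }' "${2:-/etc/passwd}"
}

sid="S-1-5-21-4261372910-2505678249-1238160980-500"
echo "machine part: $(sid_prefix "$sid")"
echo "account id:   $(sid_rid "$sid")"
```

Inside a cygwin shell, `account_for_uid "$(sid_rid "$sid")"` would then show which account a given registry subtree belongs to.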
So I guess I will try the ready-shoot-aim approach of seeing what happens when I turn it off…
Compressing time, I started with a fresh Windows instance, installed cygwin and configured sshd like before, turned off sysprep and bundled it, started a new instance from the new bundled AMI, and…sshd still works. The new instance retains the SID that it had prior to bundling, and the registry entries are still there for the cyg_server account. Windows also appears to be working in all respects, but I’m not sure I could detect problems that might result internally from the omission of sysprep in the bundling. I guess I can run one more test, starting a bunch of instances at once, to see if having the same SID causes them to interfere with one another. I started four instances, running concurrently, and they each seem to be working fine. Or at least I can’t detect any problems.
So, in closing, it looks like I may have a solution: turn off sysprep if you want to use cygwin sshd in a bundled Windows AMI. Someone with more Microsoft kung-fu might be able to figure out how to make sysprep retain the registry entries for the cyg_server account, or maybe they would write a script to insert them directly into the registry at restart if they are missing…who knows. But for me, disabling sysprep seems to be the way to go. I found lots of other complaints on the internet about sysprep and what it does to installed software when the SID changes, so I’m guessing that there will be a lot of bundled AMIs in EC2 that are created with sysprep disabled. If there are, in fact, issues with multiple instances using the same SID, then I expect we will be reading about it in the EC2 forums, since everyone who creates a new AMI from the base Windows AMIs without sysprep will have the same base SID in their AMIs, and so on….
Anyway, that’s it. Hope that helps.
About 18 months ago, I created an experimental search engine to play with some new ways to extract and distill information from the web and present it in a more topic-focused way. The site is called doryoku (Japanese: 努力, meaning supreme effort). It has been doing well, and my experiments and enhancements continue. My latest addition is automatic translation; the site is now available in 10 languages: English, Japanese, Chinese, Korean, French, German, Italian, Spanish, Russian, and Greek. The translation capabilities are courtesy of the Google Language API.
More new features to come. Stay tuned!