hacking

Entropy in Cloud Computing Applications

Entropy, as it pertains to computer science and cryptography, is one of those topics that most of us (myself included) largely take for granted these days. In this context, entropy is a source of randomness that is typically collected by the operating system and made available to applications via a pseudorandom number generator (PRNG). We tend to implicitly trust that our applications have a source of entropy that is sufficiently random to ensure that our cryptographic techniques (SSL handshakes, SSH keys, and the wide variety of other techniques used in modern public-facing applications that rely upon a pseudorandom number generator) are as strong, algorithmically speaking, as expected. But what happens when that source of entropy is not as strong as we think it is?

An excellent case study of what happens when a source of entropy is not as random as expected can be found in the weakness in the Debian Linux package of the OpenSSL library that was disclosed in May of 2008 (see http://www.debian.org/security/2008/dsa-1571). A small change had been made to the Debian package of the open source OpenSSL code in 2006 in order to clean up the output of purify and valgrind as part of the build and test sequence. This minor change had a side effect: it caused the pseudorandom number generator within the OpenSSL library to be predictable, because it left the process ID as essentially the only varying input to the seed and thus tightly constrained the number of possible seed values. Consequently, any cryptographic keys generated using this source of entropy could be guessed within a relatively short period of time using a brute force attack over the small set of possible seed values. The issue was addressed quickly once it was found, but it had gone unnoticed for well over a year, and that illustrates an excellent point about entropy in software applications: a reduction in the quality of a source of entropy can be very difficult to detect if you are not specifically looking for it.
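To get a feel for why a small seed space is fatal, here is a toy sketch in shell. It is not the OpenSSL code: awk's srand()/rand() stands in for the PRNG, the "key" format is made up, and the 32,768 seed limit is my own assumption, chosen to mirror the default Linux PID range (the process ID was essentially the only varying seed input in the affected Debian builds).

```shell
#!/bin/sh
# Toy model only: awk's srand()/rand() stands in for a real PRNG.
# SEED_SPACE mirrors the default Linux PID limit (an assumption for illustration).
SEED_SPACE=32768

# Derive a "key" deterministically from a seed (hypothetical key format).
toy_key() {
    awk -v seed="$1" 'BEGIN {
        srand(seed + 0)
        printf "%06d%06d\n", int(rand() * 1000000), int(rand() * 1000000)
    }'
}

# Walk the whole seed space and print every seed that reproduces the target key.
recover_seed() {
    awk -v target="$1" -v n="$SEED_SPACE" 'BEGIN {
        for (seed = 0; seed < n; seed++) {
            srand(seed)
            key = sprintf("%06d%06d", int(rand() * 1000000), int(rand() * 1000000))
            if (key == target) { print seed; found = 1 }
        }
        exit(found ? 0 : 1)
    }'
}
```

Generating a key with toy_key and handing it to recover_seed finds a matching seed after at most 32,768 tries. A real attack does more work per candidate, but it has the same shape.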

So what does any of this have to do with cloud computing? The current “best practice” for the collection of entropy by an operating system is to collect keyboard timings, mouse movements, network interrupts, disk drive head seeks, and other operating system events that are collectively random and can be processed to generate a stream of randomness to seed the pseudorandom number generator. This works reasonably well for a desktop or laptop that has a keyboard and a mouse and is being used interactively in an arbitrary fashion by a human. It also can be made to work for server hardware, although the rate of entropy generation is slower (and thus activities like key generation are slower) when a human with a keyboard and a mouse is not actively involved, since the technique relies more heavily on unattended events like network and disk use. And this is where a potential problem arises for entropy in cloud computing: a set of virtual machine instances running within a cloud-based virtualization service could potentially share a source of entropy from the underlying hardware. If the instances all share a single piece of underlying physical hardware, then they also all share the same set of network and disk events, and thus a clever attacker might be able to predict the stream of entropy that might be utilized by an application on one of those instances.

There are other techniques for entropy generation (e.g. hardware entropy generators, software techniques involving samples from a microphone or webcam, and entropy services available via the internet) that can be employed to attenuate or eliminate the potential threat of shared entropy sources in cloud computing environments, and as cloud computing environments continue to mature there will undoubtedly be advances in this area to address this issue. In the interim, however, we should all take a closer look at the use of entropy within our cloud-based applications to ensure that we haven’t introduced a “side effect” that will have serious security implications.
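As a first step in taking that closer look, a Linux guest will tell you how much entropy the kernel thinks it has. The sketch below reads the kernel's estimate from /proc/sys/kernel/random/entropy_avail; the function names and any threshold you pass in are my own assumptions, and newer kernels may report a constant value here, so treat it as a rough indicator rather than a measurement.

```shell
#!/bin/sh
# Linux-specific: the kernel exposes its entropy estimate (in bits) here.
ENTROPY_FILE=/proc/sys/kernel/random/entropy_avail

# Print the kernel's current entropy estimate, or "unknown" on other platforms.
entropy_estimate() {
    if [ -r "$ENTROPY_FILE" ]; then
        cat "$ENTROPY_FILE"
    else
        echo unknown
    fi
}

# Succeed only if the estimate is known and at least the given number of bits.
# The threshold is up to the caller; there is no universally correct value.
entropy_at_least() {
    estimate=$(entropy_estimate)
    [ "$estimate" != unknown ] && [ "$estimate" -ge "$1" ]
}
```

Something like "entropy_at_least 200 || echo low entropy" could be run before key generation on a freshly booted instance, which is exactly when the pool is most likely to be thin.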

Wednesday, August 5th, 2009 cloud computing, hacking 1 Comment

How to Jailbreak iPhone 3.01

Apple just released the iPhone 3.01 firmware update, and that means it is time to update my jailbroken iPhone to 3.01 and then jailbreak it again. In the past, I have been a happy user of PwnageTool for the jailbreak, and I would be again except that PwnageTool hasn’t been updated yet for the 3.01 firmware. Doh! I could just wait for the PwnageTool update, but the firmware update addresses an SMS crack that can give someone root on your phone. So I guess I had better find a way to do this without PwnageTool.

After the requisite sync and aptbackup, I decided I would first try a quick hack and see how smart PwnageTool is. I put PwnageTool in expert mode and browsed to the 3.01 firmware IPSW to see if I could trick PwnageTool into building a custom IPSW from the 3.01 IPSW. No such luck — PwnageTool checks the firmware and simply won’t do it if it isn’t a supported IPSW version (and 3.01 is not supported in the current version of PwnageTool). So I guess I really do need to use something other than PwnageTool for the jailbreak.

Luckily, I found a post on the dev-team blog that says you can use redsn0w 0.8 to jailbreak the 3.01 firmware provided that you use the 3.0 IPSW as a base. Apparently the changes in 3.01 are very minimal and the redsn0w jailbreak procedure only changes a few things within the existing firmware, rather than completely overwriting it as PwnageTool seems to do. I couldn’t find any good postings with a complete set of instructions on how to do this with redsn0w, but here is what ultimately worked for me:

  1. Connect your phone to iTunes and do a sync. Always good to start with this.
  2. Run aptbackup and select “Backup” so we can restore Cydia after the upgrade and jailbreak.
  3. In iTunes, restore your iPhone. This will also upgrade the firmware to the official 3.01 from Apple.
  4. Run redsn0w 0.8, and select the 3.0 IPSW (iPhone1,2_3.0_7A341_Restore.ipsw) firmware from ~/Library/iTunes/iPhone Software Updates
  5. Follow the instructions to put the phone in DFU mode. Note these are different than how PwnageTool does it, and you need to start with your phone off and connected to iTunes.
  6. Once you are in DFU mode, kick off the jailbreak.
  7. At some point during the jailbreak, redsn0w told me it was waiting for a reboot. I waited quite a while, and it seemed to be hung. As a last resort, I decided to unplug the iPhone and start over. I unplugged the iPhone and plugged it back in, and…voilà! The phone jumped into the redsn0w firmware loader screen and the jailbreak proceeded to completion. I don’t know if I was supposed to do this or not (like I said, I don’t normally use redsn0w)…but it worked.
  8. After a little while my phone came back to life and rebooted and the jailbreak appeared to have succeeded, with Cydia installed.
  9. Run aptbackup and select “Restore”. As part of the process, Cydia asked to upgrade a bunch of essential packages.
  10. One more reboot to check everything and…all done. The firmware revision is now 3.01 according to iTunes, and I have all of my jailbroken applications restored and in place.

That’s it. I hope this helps. And I hope to see PwnageTool updated in the near future, since it has several features (like custom boot images) that I would like to use with my iPhone.

Monday, August 3rd, 2009 hacking No Comments

How to Detect the Front (Home) Page of a Wordpress Blog

I recently wanted to add a Wordpress widget that would be conditionally visible in the sidebar of the front (home) blog page only. A reasonable search on this topic turns up a large collection of information and discussion, most of which turns out not to work. The following is a brief overview of what I tried and what I ultimately did.

Most of the information on this topic points to either the is_home() or is_front_page() Wordpress PHP functions. In theory, you can use either of these functions in a conditional expression and detect if you are in the front (home) page. In practice, however, it is a little more complicated. These functions are not based on the URI of the page that is being loaded; instead they are based on several global boolean variables that are set based on what queries have occurred in a page. I imagine there is probably a good reason for this in Wordpress-land (and I am not a Wordpress expert), but it seems to me that if I call is_front_page() within the front page of a Wordpress blog then it should reliably return true, and conversely if I call is_front_page() from any other page in the blog then it should reliably return false. In my case, I was trying to create a conditionally visible widget in the sidebar by modifying wp-includes/widgets.php to contain a conditional expression within wp_widgets_init(). What I found was that is_front_page() always returned true when called from within widgets.php for any page on my blog. I traced this to the fact that the values of the global variables upon which is_front_page() is based are changed by some of the standard activities performed within wp_widgets_init(), specifically any query to get posts or categories to populate standard widgets that show archives and categories. The is_home() function seems to suffer from the same issue, as it appears to be based on the same global variables. Several posts described interesting ways to combine is_front_page() or is_home() with other functions to get the correct conditional behavior, but none of these worked in my case, and all of them seemed to be specific to their context, rather than general purpose solutions to the problem.

After banging my head against this for a while, I decided to try a different approach. The is_front_page() and is_home() functions are based on global variables rather than the URI of the blog page. So forget those functions…and find an expression that checks the URI directly. It seems simple enough, and it is:

if ( $_SERVER["REQUEST_URI"] == '/' )
{
/* do something */
}

And that’s it. This block of code works reliably throughout the pages of my blog. I am willing to bet that there might be some peculiar Wordpress configurations (e.g. using a custom page as the front page, etc.) for which this does not work, but like I said, I am not a Wordpress expert, so your mileage may vary.

Hope this helps.

Sunday, August 2nd, 2009 hacking 1 Comment

How to Create an Amazon EC2 AMI That is Larger Than 10GB

Recently, I have been dealing with an issue surrounding the 10GB size limit for AMIs within Amazon’s EC2 service. If you don’t know what I’m talking about, here is a quick primer: a virtual instance running within Amazon’s Elastic Compute Cloud (EC2) service is launched from a read-only boot image that Amazon refers to as an Amazon Machine Image (AMI); Amazon has set the upper size limit for an AMI to be 10GB, and this restricts the amount of disk content that can be loaded on to the instance at boot. For a Windows-based EC2 instance, the 10GB AMI corresponds to the C:\ drive containing Windows; for a Linux-based instance, the 10GB AMI corresponds to the boot partition containing Linux. EC2 instances have several larger, ephemeral drives with capacities far in excess of 10GB, but those ephemeral drives have no persistence, and they will be empty when an EC2 instance boots. Amazon also has a service called the Elastic Block Store (EBS) that functions like a network mounted file system from a storage area network (but for various reasons EBS was not a feasible solution for my problem).

The problem I faced was that I needed about 16GB of data to be available on an EC2 instance at boot, and I needed it to otherwise operate like a standard instance launched from an AMI. It would be great if I could simply use a 16GB AMI, but Amazon does not permit this due to the 10GB size constraint. I was obviously going to need an alternate mechanism to load additional data on to the ephemeral drives at boot time.

My solution is ultimately derived from the same mechanism that Amazon uses to load an AMI at boot time. AMIs in EC2 are stored in Amazon’s Simple Storage Service (S3). When an instance is started in EC2, the AMI is loaded from S3 into the Xen domain that EC2 has provisioned for the instance (Xen is the open source virtualization software that is at the heart of Amazon’s EC2 service). I decided to take the same approach to populate the ephemeral drives at boot time. Specifically, I store a compressed archive in S3 that is downloaded and inflated on the first ephemeral drive in order to populate the instance with the additional content. The procedure to download the compressed archive from S3 and inflate it in the proper places is scripted and connected to the boot sequence (it’s a Windows service on Windows, and it is linked into the rc startup script mechanism on Linux).
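On the Linux side, the boot hook can be sketched roughly as follows. This is not my production script: the fetch command (curl), the archive URL, and the destination path are all assumptions for illustration, and it presumes the S3 object is readable through a plain or pre-signed URL.

```shell
#!/bin/sh
# Sketch of a boot-time hook. Everything named here is an assumption for
# illustration: the fetch command, the archive URL, and the destination path.
FETCH_CMD=${FETCH_CMD:-curl -fsS}

# Download a gzipped tar archive and unpack it into the destination directory.
populate_drive() {
    url=$1
    dest=$2
    archive="$dest/.bootstrap.tar.gz"

    mkdir -p "$dest" || return 1
    # $FETCH_CMD is intentionally unquoted so that its options split into words.
    $FETCH_CMD "$url" > "$archive" || return 1
    tar -xzf "$archive" -C "$dest" || return 1
    rm -f "$archive"
}

# Example call from an rc script (hypothetical bucket and mount point):
# populate_drive "https://example-bucket.s3.amazonaws.com/content.tar.gz" /mnt
```

Downloading to a temporary file before unpacking, instead of piping straight into tar, makes it easier to tell a failed download apart from a corrupt archive when something goes wrong at boot.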

The only issue I have found with this approach is latency. It takes a non-negligible amount of time to download and inflate several GB of data from S3, and this is all happening after the operating system boot has initiated. Amazon provides no quantitative guarantees about the network bandwidth that a given instance will be able to use, so the amount of time that the download will take is dependent on a variety of factors that are out of our control. In experiments I have measured download speeds from S3 to an EC2 instance to be in the range of 15 MB/sec to 25 MB/sec (those units are megabytes per second), so if you are downloading several GB of data to your instance via this method then you can expect a delay of several minutes before the ephemeral drives are populated and available. This might or might not be a problem, depending on what else is starting on your instance immediately after boot. In my case, an application is starting that will take up to 10 minutes to start, so I have plenty of time to populate the ephemeral drives. If you are starting up instances to add to a cluster in response to load, and you need the additional cluster capacity as soon as possible, then this method is likely not for you. But in either case it is important to keep in mind that the startup latency will be directly proportional to the size of the additional content.
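To put rough numbers on that delay, here is a back-of-the-envelope calculation in shell. The 6 GB archive size is a hypothetical figure for illustration; the two rates are the ends of the range I measured. It covers the download only, not the time to inflate the archive.

```shell
#!/bin/sh
# Seconds needed to move SIZE_MB megabytes at RATE megabytes per second,
# rounded up using integer arithmetic.
transfer_seconds() {
    size_mb=$1
    rate_mb_per_sec=$2
    echo $(( ($size_mb + $rate_mb_per_sec - 1) / $rate_mb_per_sec ))
}

# A hypothetical 6 GB (6144 MB) archive at the slow and fast ends of the range.
slow=$(transfer_seconds 6144 15)
fast=$(transfer_seconds 6144 25)
echo "download takes between $fast and $slow seconds"
```

For that archive the download alone works out to somewhere between about four and seven minutes, which is why this only fits instances that have other slow startup work to hide it behind.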

Hope this helps.

Thursday, June 4th, 2009 cloud computing, hacking 1 Comment

Perl DBI and DBD::mysql on Cygwin — Connecting to a Native Windows Build of MySQL on a Windows 2003 AMI Within Amazon EC2

In my ongoing project involving Amazon’s EC2 service, I had a frustrating problem to solve this past weekend. I have an EC2 instance running Windows 2003, and on that instance I have a native Windows version of MySQL 5 and Cygwin. I wanted to use the mysqlhotcopy Perl script from the Cygwin command line against the Windows-native MySQL instance. Once again, I would have expected this to be a simple job with a simple solution, but in the end it turned into an extensive hacking session. Here is a quick roadmap of what I did.

My initial thought was that this should just work: MySQL and its scripts should not care if they are running in native Windows mode or in Cygwin, and mysqlhotcopy is just a Perl script that should run fine in either Cygwin or Windows…wrong! The native Windows version of MySQL does not ship with the mysqlhotcopy script, probably because that script uses Perl and DBI and there is no guarantee that Perl/DBI will be available on Windows. So I grabbed the mysqlhotcopy script from a UNIX box and attempted to run it via Cygwin. I got this Perl error saying that the DBI module was not found:

Can't locate DBI.pm in @INC (@INC contains: /usr/lib/perl5/5.10/i686-cygwin /usr/lib/perl5/5.10 /usr/lib/perl5/site_perl/5.10/i686-cygwin /usr/lib/perl5/site_perl/5.10 /usr/lib/perl5/vendor_perl/5.10/i686-cygwin /usr/lib/perl5/vendor_perl/5.10 /usr/lib/perl5/vendor_perl/5.10 /usr/lib/perl5/site_perl/5.8 /usr/lib/perl5/vendor_perl/5.8 .) at ./mysqlhotcopy line 8.
BEGIN failed--compilation aborted at ./mysqlhotcopy line 8.

So I guess I just need to get DBI installed for Perl and we should be good to go…right? Perl modules can be installed on Cygwin using cpan, so I ran:

cpan DBI

This command completed without errors. Let’s try the mysqlhotcopy script again…it runs without errors and prints out the usage page. Progress! So now let’s test it out with a real call to take a hot copy of the database:

mysqlhotcopy -u <username> -p <password> <database> <backup directory>

This command gives me the following error, complaining about DBD::mysql (the MySQL driver used by DBI to actually connect to MySQL):

install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (@INC contains: /usr/lib/perl5/5.10/i686-cygwin /usr/lib/perl5/5.10 /usr/lib/perl5/site_perl/5.10/i686-cygwin /usr/lib/perl5/site_perl/5.10 /usr/lib/perl5/vendor_perl/5.10/i686-cygwin /usr/lib/perl5/vendor_perl/5.10 /usr/lib/perl5/vendor_perl/5.10 /usr/lib/perl5/site_perl/5.8 /usr/lib/perl5/vendor_perl/5.8 .) at (eval 9) line 3.
Perhaps the DBD::mysql perl module hasn't been fully installed, or perhaps the capitalisation of 'mysql' isn't right.
Available drivers: DBM, ExampleP, File, Gofer, Proxy, Sponge.  at ./mysqlhotcopy line 182

So we just need to install the DBD::mysql module and we should be good to go, right? I ran the following command:

cpan DBD::mysql

This command failed with a build error:

Can't exec "mysql_config": No such file or directory at Makefile.PL line 76.

The DBD::mysql module is compiled locally, using the mysql_config script to find the location of the local MySQL installation. But the native Windows version of MySQL does not contain the mysql_config script. Ugh. I tried copying this file over from a UNIX box, but the output from the script (which is just configuration info for the MySQL installation and the settings in my.ini) looked a little screwy. So I guess I need to figure out what mysql_config is actually used for when building DBD::mysql.

After some digging, it appears that the crux of the problem is that the MySQL client libraries are not available in the native Windows MySQL installation, and these libraries are required to build DBD::mysql. So if we can figure out a way to get these libraries to work in Cygwin, then we should have a working solution. Luckily, I found a note in the DBD::mysql readme file that pointed me in the right direction. Here is what I ultimately did:

0) Download and unzip the MySQL source code (I grabbed mysql 5.1.34).

1) Build the MySQL client libraries (without the server) via:
./configure --without-server --prefix=/usr/local/mysql-5.1.34
make

The build halts with an error for the file sys/ttydefaults.h (not found), so I copied that file from /usr/include/sys/ttydefaults.h on a UNIX box into /usr/include/sys within Cygwin. Running make again completes the build after this file is in place. There is little of consequence in this file, so I am hoping that copying it from a UNIX box into Cygwin won’t have any serious side effects.

2) Once the MySQL build has finally completed (and this takes a while), run a manual build of the cpan download of DBD::mysql in the .cpan cache directory, using parameters for the location of the MySQL client libraries (which eliminates the need for mysql_config to be used to find them):

cd ~/.cpan/build/DBD-mysql-4.011-ynTTNR
perl Makefile.PL --libs="-L/usr/local/mysql-5.1.34/lib/mysql -lmysqlclient -lz" --cflags=-I/usr/local/mysql-5.1.34/include/mysql --testhost=127.0.0.1
make
make install
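Before going back to mysqlhotcopy, it is worth confirming that Perl can now load both modules. The helper below is just a convenience sketch (the function names are mine); it relies only on perl's standard -M switch.

```shell
#!/bin/sh
# Succeeds if the named Perl module can be loaded by the perl on the PATH.
perl_module_available() {
    perl -M"$1" -e 1 2>/dev/null
}

# Report the status of the two modules that mysqlhotcopy needs.
check_hotcopy_modules() {
    for module in DBI DBD::mysql; do
        if perl_module_available "$module"; then
            echo "$module: ok"
        else
            echo "$module: missing"
        fi
    done
}
```

Running check_hotcopy_modules after each cpan or manual build step shows right away which of the two modules is still the problem.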

So now we are ready to try mysqlhotcopy again. The MySQL client build installed a copy of mysqlhotcopy in /usr/local/mysql-5.1.34/bin, so let’s use that one instead of the one that was copied in from a UNIX box. Here’s the command:

/usr/local/mysql-5.1.34/bin/mysqlhotcopy -u <username> -p <password> <database> <backup directory>

Still no joy; now we get this error:

DBI connect(';host=localhost;mysql_read_default_group=mysqlhotcopy','<database>',...) failed: Can't connect to local MySQL server through socket '/tmp/mysql.sock'(2) at /usr/local/mysql-5.1.34/bin/mysqlhotcopy line 177

This looks to me like DBI (using DBD::mysql) is trying to connect to a UNIX socket on the local machine instead of using TCP. Given that we’re on Windows, it will probably be a pain in the neck to figure out how to get the native Windows version of MySQL to listen on a local UNIX socket. Luckily, I’ve spent some time looking at the Perl code in mysqlhotcopy and it turns out that if you specify an IP address via the -h option, then this will override the use of the UNIX socket and will force DBI to use TCP to connect to MySQL. So let’s try the localhost loopback address (127.0.0.1) to see if that works:

/usr/local/mysql-5.1.34/bin/mysqlhotcopy -h '127.0.0.1' -u <username> -p <password> <database> <backup directory>

Success! The command runs to completion without errors, and I can verify that the backup has taken place.

Hope this helps.

Thursday, April 30th, 2009 cloud computing, hacking No Comments

Ephemeral Drives in Amazon EC2 – When Are They Mounted?

Virtual instances running in Amazon’s EC2 service have several ephemeral disk drives that can be used for temporary storage (temporary because they are not persisted as part of the AMI). Recently, I had to figure out exactly when those drives were mounted and available during boot. The specific issue I was seeing was that I had registered some services to start automatically during boot, and those services started software packages that relied upon the ephemeral drives. This is on Windows 2003 Server, by the way; this is a non-issue on Linux, where the drives are mounted before the init sequence starts application-level processes.

Through some trial and error (and I’ll abridge the details here), I was able to determine that the ephemeral drives are ready in all respects after the following two services have started: Ec2Config and Virtual Disk Service (vds). It was a simple matter of creating service dependencies for my registered services to ensure that they started after Ec2Config and VDS were started, and that fixed the glitch. I was using cygwin so I was able to use the cygrunsrv command to create the dependencies (via the --dep argument). People with more Windows kung fu would probably use regedit to do the same thing.
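For the cygrunsrv route, the install command looks roughly like the sketch below. The service name and program path are placeholders of my own; the two dependency names are the services mentioned above. It defaults to a dry run that only prints the command, so nothing is registered until you set DRY_RUN=0.

```shell
#!/bin/sh
# Build (and optionally run) a cygrunsrv install command whose service will
# not start until Ec2Config and vds have started.
install_service_after_drives() {
    name=$1
    program=$2
    set -- cygrunsrv --install "$name" --path "$program" --dep Ec2Config --dep vds
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Dry run (the default): show the command instead of executing it.
        echo "$*"
    else
        "$@"
    fi
}

# Example with a hypothetical service name and program path:
# install_service_after_drives myapp /usr/local/bin/myapp
```

Note that the dependencies are supplied when the service is installed, so a service that is already registered would need to be removed and installed again to pick them up this way.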

Hope this helps.

Friday, March 13th, 2009 cloud computing, hacking No Comments

Cygwin Lighttpd with SSL

Last week, I needed to configure a Windows 2003 AMI in EC2 to run lighttpd with SSL. Once again, a simple job that I thought would be quick and painless turned into an extensive hacking session. Here is a quick roadmap of what I did.

My initial thought was that there must be a native port of lighttpd with SSL support for Windows…wrong again! I don’t know why I continue to think that open source projects will be ported to Windows; Linux is long since mature enough that people can use it directly, rather than porting open source Linux software to Windows.

In any event, cygwin lists lighttpd in its package list, so I was left with the option to run lighttpd via cygwin. A few quick clicks of the cygwin installer gave me lighttpd 1.4.20-1, all installed and ready to start. I dropped in a lighttpd.conf file containing ssl.engine declarations and started the server. Lighttpd promptly informed me that SSL support had not been compiled in.

I really have no idea why lighttpd on cygwin ships without SSL support compiled in, especially since the openssl libraries are available within cygwin and SSL support can be compiled in with just a simple option to ./configure. But, I should just be able to download the lighttpd source and compile it myself, on cygwin, with SSL support enabled, right?

Wrong again! After downloading lighttpd source for 1.4.20 and installing gcc on cygwin, I ran ./configure with the following options:

./configure --disable-shared --enable-static --prefix=/usr --with-openssl

I kicked off the build with make, and went to grab a beer (the build proceeds very slowly). When I returned, I found the build had failed with a symbol undeclared error for EAI_SYSTEM. This error has lots of search hits and some hacking-oriented solutions, but none of those solutions seemed to work for me, owing to conflicts between the static build of lighttpd and the main cygwin DLL. Even when I did manage to get the build to successfully complete, I found that some features (e.g. CGI execution) just didn’t work. The last thing I want is a flaky build of lighttpd running within cygwin on Windows, so I needed a plan B.

After a few more searches, I came across the WLMP project. WLMP has a standalone build of lighttpd (optionally bundled with MySQL and PHP) that includes the cygwin DLLs but doesn’t otherwise require or run within cygwin. Funky, but promising. I installed the lighttpd-only version of WLMP, and I was able to bring lighttpd up with support for SSL, but only as a standalone application, outside of cygwin. If I tried to run it from within a cygwin shell, or even with the cygwin DLL loaded, it failed silently in all sorts of ways, without any error messages. I intended to use cygwin extensively while lighttpd was running, so this was a significant problem.

On to plan C. I’m starting to think this isn’t going to happen when it occurs to me that the WLMP lighttpd build is a cygwin build with some conflicting cygwin files. If I download a version of WLMP that uses the same lighttpd 1.4.20 version as cygwin, and remove the files that conflict, then maybe I can get the WLMP build to run in cygwin. It’s worth a shot, right?

After some trial and error, I was able to make this work as follows:

0) Download WLMP lighttpd 1.4.20, matching the version of lighttpd in cygwin.
1) Delete the Cygwin1.dll file from the WLMP lighttpd directory
2) Add execute permission to all of the DLLs in the WLMP lighttpd directory, and all of the mod DLLs in the WLMP lighttpd/lib directory.
3) Cd into the WLMP lighttpd directory, and from within the WLMP lighttpd directory start lighttpd with the following command:

./lighttpd -V -m /usr/lib/lighttpd

If all went according to plan, you should see the list of features compiled into lighttpd, and there should be a “+” sign in front of “SSL Support”. My next test was to start lighttpd with the same -m argument, using my lighttpd.conf with the SSL configuration, and it worked flawlessly without any cygwin conflicts.
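If you want to script that check rather than eyeball it, a small filter over the -V output will do. This is a sketch that assumes the feature line reads exactly "+ SSL Support" (with a single space), which is how it appeared in my output; the function name is my own.

```shell
#!/bin/sh
# Reads "lighttpd -V" output on stdin; succeeds if SSL support is marked with "+".
has_ssl_support() {
    grep -F -q -- '+ SSL Support'
}

# Typical use, from within the WLMP lighttpd directory:
# ./lighttpd -V -m /usr/lib/lighttpd | has_ssl_support && echo "SSL compiled in"
```

The -F flag makes grep treat the pattern as a fixed string, so the plus sign is matched literally.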

I can’t tell you why the current directory (within the WLMP lighttpd directory) is important in getting this hack to work. My guess is that the WLMP version of lighttpd is using a combination of DLLs from the standard cygwin lighttpd lib directory in /usr/lib/lighttpd and the WLMP lighttpd lib directory in the current directory, and somehow this combination works without cygwin DLL conflicts in this particular situation. Who knows? All I can say is that it would have been really nice if the cygwin folks had enabled SSL support in their build of lighttpd…

Anyway, that’s it. Hope that helps.

Friday, March 6th, 2009 cloud computing, hacking, lighttpd No Comments

Cygwin SSHd on a Windows 2003 AMI Within Amazon EC2

Recently, I needed to configure a Windows 2003 AMI in EC2 to run a ssh server. I would have expected this to be a simple job, with a variety of choices for making this work, but in the end it was far more time consuming, complicated, and frustrating than I would have guessed. Here is a quick road map of what I did.

My initial thought was that there must be a free, native port of openssh for Windows that installs as a service and otherwise conforms to the Windows environment…wrong! I can’t tell you why this is the case (maybe ssh is just not a microsofty way of doing remote terminals and file transfers), but I couldn’t find anything resembling a free, functional port of openssh for Windows. I found a few blog posts that mentioned that people had tried this, but ultimately they gave up when faced with the integration between openssh’s user/group namespace functions and Windows’ user/group concepts (to say nothing of the differences between the Windows command prompt and the UNIX shells). And these blog posts ultimately suggested that it was easier to run sshd via cygwin than it would be to port sshd to run natively. So…cygwin time!

UNIX is my OS of choice, and I’ve had cygwin on every Windows box I have ever had, so it was a quick jump to download the cygwin installer and install the packages I needed on a freshly started Windows 2003 instance in EC2 (incidentally, I am running the 64-bit, large EC2 instance AMI of Windows 2003 Server with SQL Server Express and no Authentication Services). The openssh package comes with a simple script — ssh-host-config — to generate the server host keys and create the users needed for privilege separation, so it was a nice, simple, relatively painless install. There are a few things that the config script misses, however, which requires you to run it several times before it ultimately succeeds (although it is nice enough to point out the problem each time and prompt you to fix it). After playing with it, I came up with the following actions to perform before running ssh-host-config in order to make it succeed the first time without errors:

0) Add the following line to /cygwin.bat:
set CYGWIN=binmode tty ntsec

1) Run a new cygwin bash shell (after the edit of cygwin.bat) and enter:
mount -s --change-cygdrive-prefix /
chmod +r /etc/passwd /etc/group
chmod 755 /var

2) Run a new cygwin bash shell (to pick up the cygdrive prefix change) and enter:
ssh-host-config
-- yes for privilege separation
-- "binmode tty ntsec" for CYGWIN environment variable setting for the service
-- enter your password of choice for the cyg_server account

3) Enter the following to start sshd:
net start sshd

4) Open the Windows Firewall editor, and add an exception for TCP traffic on port 22 for sshd.

5) If you haven’t already done so, open up port 22 for your EC2 instance group (assuming you are running your instance in the default group):
ec2-authorize -p 22 default

If everything went well, sshd is running and available on port 22, and you can log in normally via ssh from other machines. All that is left to do is bundle up a new AMI to capture the cygwin installation…and that should be a piece of cake, right? The updated EC2 API tools have a new command (ec2-bundle-instance) that kicks off an AMI bundling job for an EC2 instance running Windows, so it should be as simple as running this command and then grabbing a beer to wait for it to complete. If only it were that simple…

Unlike the AMI bundling scripts for Linux-based EC2 instances, which are ultimately just packaging up the existing file system, the Windows AMI bundling mechanism needs to perform several Windows-specific functions that are ultimately a real pain in the neck. First and foremost is sysprep. Sysprep is Microsoft’s answer to the problem of Windows virtualization; apparently the simple cloning of a Windows installation is not acceptable, and a new Windows SID should be generated for each new instantiation of a Windows virtual image. Sysprep does some other things, too (search for sysprep on Microsoft’s support web site for a more complete description — I am certainly not an expert on it), but ultimately the SID generation is the one that causes problems for a lot of installed software…like cygwin. After bundling a new AMI and starting a new instance with it, I found that sshd is hosed for no apparent reason. Attempts to start sshd via “net start sshd” produce the following cryptic error message:

The CYGWIN sshd service is starting.
The CYGWIN sshd service could not be started.
The service did not report an error.

More help is available by typing NET HELPMSG 3534.

WTF?

After several time-consuming iterations of start new instance -> install cygwin -> bundle new AMI -> start new AMI instance -> wonder why sshd is hosed, I found something in the HKEY_USERS tree of the Windows registry that changes after the bundling step. Prior to bundling, with a functioning cygwin/sshd, I see the following in the registry:

[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-1013\Software\Cygnus Solutions]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-1013\Software\Cygnus Solutions\Cygwin]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-1013\Software\Cygnus Solutions\Cygwin\mounts v2]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-1013\Software\Cygnus Solutions\Cygwin\Program Options]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-500\Software\Cygnus Solutions]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-500\Software\Cygnus Solutions\Cygwin]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-500\Software\Cygnus Solutions\Cygwin\mounts v2]
[HKEY_USERS\S-1-5-21-2574196159-1727499900-3384088469-500\Software\Cygnus Solutions\Cygwin\Program Options]

After bundling, in a new instance in which sshd is hosed, I see the following in the registry:

[HKEY_USERS\S-1-5-21-4261372910-2505678249-1238160980-500\Software\Cygnus Solutions]
[HKEY_USERS\S-1-5-21-4261372910-2505678249-1238160980-500\Software\Cygnus Solutions\Cygwin]
[HKEY_USERS\S-1-5-21-4261372910-2505678249-1238160980-500\Software\Cygnus Solutions\Cygwin\mounts v2]
[HKEY_USERS\S-1-5-21-4261372910-2505678249-1238160980-500\Software\Cygnus Solutions\Cygwin\Program Options]

All of the other registry entries related to cygwin remain the same before and after the bundling step, so my guess is that the loss of entries in the bundled instance is the source of the trouble. But what exactly are those entries?

Again, I’m no Windows expert, but each of the entries in question appears to consist of the machine SID followed by a per-account identifier (e.g. in S-1-5-21-4261372910-2505678249-1238160980-500, S-1-5-21-4261372910-2505678249-1238160980 is the machine SID, and 500 is the relative identifier, or RID, of the account). Looking at the /etc/passwd file for cygwin, the user id 500 corresponds to the Administrator account, and user id 1013 corresponds to the cyg_server account, used by sshd as a privileged account for switching effective user ids during login. So, my hypothesis is that the privileges for the cyg_server account are somehow lost by sysprep during the bundling step, and sshd is hosed without them in the new bundled AMI instance.

To test my hypothesis, I decided to configure the AMI bundling step to skip sysprep. The base Windows EC2 AMIs come with an application in the start menu called “ec2Service Setting” that has a check box to enable/disable sysprep during AMI bundling, so it is easy enough to test this. However, I have no idea what happens to Windows if I disable sysprep during bundling, and I was not able to find a satisfactory answer via internet searches. The closest I got to an answer was to see several of the Amazon admins on the EC2 forum comment that it was not a good idea to disable sysprep if you were going to instantiate multiple instances. I also found several documents online that discussed how sysprep was used to sanitize a Windows installation, generate a new SID, and make it generic for installation on any type of hardware. Since the virtual hardware of EC2 is, roughly speaking, identical (given that it is using Xen under the hood), I’m not too worried about the hardware issue. I have no idea about “sanitizing” the Windows instance or SID generation, though, so bundling without sysprep might mortally wound Windows (again…I’m no Windows expert). And I do want to run multiple instances from the bundled AMI, so that might be a non-starter as well.

So I guess I will try the ready-shoot-aim approach of seeing what happens when I turn it off…
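
To make the structure of those registry keys concrete, here is a minimal shell sketch (runnable from the cygwin prompt) that splits one of the SIDs above into its machine part and its trailing account identifier, and that lists which account identifiers own Cygwin entries in a list of key names. The function names, and the idea of feeding in key names saved as plain text, are my own assumptions for illustration; none of this is part of cygwin or the EC2 tools.

```shell
#!/bin/sh
# Split a full account SID into the machine SID and the trailing identifier.
sid_machine() { printf '%s\n' "${1%-*}"; }   # everything before the last '-'
sid_account() { printf '%s\n' "${1##*-}"; }  # everything after the last '-'

# Hypothetical helper: read registry key names on stdin and print the distinct
# trailing identifiers that own a "Cygnus Solutions" subtree, lowest first.
cygwin_account_ids() {
  sed -n 's/^\[HKEY_USERS\\S-1-5-21-[0-9-]*-\([0-9][0-9]*\)\\Software\\Cygnus Solutions.*/\1/p' |
    sort -un
}

# Example (file names are placeholders for key lists saved before and after bundling):
#   cygwin_account_ids < before.txt
#   cygwin_account_ids < after.txt
```

Run against the two listings above, the first would report both 500 and 1013, and the second only 500, which is exactly the difference I am chasing.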

Compressing time, I started with a fresh Windows instance, installed cygwin and configured sshd like before, turned off sysprep and bundled it, started a new instance from the new bundled AMI, and…sshd still works. The new instance retains the SID that it had prior to bundling, and the registry entries are still there for the cyg_server account. Windows also appears to be working in all respects, but I’m not sure I could detect problems that might result internally from the omission of sysprep in the bundling. I guess I can run one more test, starting a bunch of instances at once, to see if having the same SID causes them to interfere with one another. I started four instances, running concurrently, and they each seem to be working fine. Or at least I can’t detect any problems.

So, in closing, it looks like I may have a solution: turn off sysprep if you want to use cygwin sshd in a bundled Windows AMI. Someone with more Microsoft kung-fu might be able to figure out how to make sysprep retain the registry entries for the cyg_server account, or maybe they would write a script to insert them directly into the registry at restart if they are missing…who knows. But for me, disabling sysprep seems to be the way to go. I found lots of other complaints on the internet about sysprep and what it does to installed software when the SID changes, so I’m guessing that there will be a lot of bundled AMIs in EC2 that are created with sysprep disabled. If there are, in fact, issues with multiple instances using the same SID, then I expect we will be reading about it in the EC2 forums, since everyone who creates a new AMI from the base Windows AMIs without sysprep will have the same base SID in their AMIs, and so on….

Anyway, that’s it. Hope that helps.

Wednesday, February 18th, 2009 cloud computing, development, hacking 1 Comment

Installing Lighttpd, Ruby on Rails, FastCGI, and MySQL on RedHat Enterprise Linux 5

Recently, I needed to configure a RedHat Enterprise Linux 5 box with lighttpd, FastCGI, Ruby on Rails, and MySQL. The box was subscribed to RHN, so I assumed a few simple commands like “sudo yum install lighttpd”, etc., would do the trick. Imagine my surprise to find that lighttpd, Ruby, gem, and FastCGI were all not available via RHN. I don’t know what the folks at RedHat are thinking by not including these packages, but it made me wish the box were running Debian Linux (which, for the record, has all of these packages available via aptitude).

Anyhow, switching to Debian was not an option, so I had to go back and do this the old fashioned way. The following is a quick walkthrough of the steps to install lighttpd, FastCGI, Ruby on Rails, and MySQL on RHEL 5. I am going to assume that you possess a reasonable level of Linux kung-fu if you are going to attempt this, so I won’t go too deeply into the individual details.

First, make sure your box is fully up to date, Linux-wise:

sudo yum update

Next, stop and deactivate Apache, which is running by default (I think) on RHEL 5:

sudo /etc/init.d/httpd stop
sudo chkconfig httpd off

Next, install necessary libraries and build lighttpd from the source code. I am adding some configuration options to ./configure to install files into the /usr/* directories, rather than the /usr/local/* directories as would be typical. This is a personal choice, so feel free to strip the options if you prefer to have your manually built applications in /usr/local…

sudo yum install pcre-devel zlib-devel bzip2-devel openssl-devel
wget 'http://www.lighttpd.net/download/lighttpd-1.4.20.tar.gz'
tar xvfz lighttpd-1.4.20.tar.gz
cd lighttpd-1.4.20
./configure --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info --with-openssl
make
sudo make install

Next, create a non-root user and group that will run lighttpd. I typically use “www-data”, since I am a Debian guy, but feel free to use whatever you want. The important bit, of course, is that the user is something other than root for (hopefully) obvious reasons:

sudo /usr/sbin/groupadd www-data
sudo /usr/sbin/useradd -g www-data -s /sbin/nologin www-data

Next, configure lighttpd and create necessary config and log files and directories. The contents of lighttpd.conf will depend on the location of your Rails application, among other things. I have included a simple Rails config below…but I assume if you are doing this you probably already have that…

sudo mkdir /etc/lighttpd

sudo tee /etc/lighttpd/lighttpd.conf > /dev/null <<'!'

#
# Global config
#

server.port = 80
server.username = "www-data"
server.groupname = "www-data"
server.pid-file = "/var/run/lighttpd.pid"
server.errorlog = "/var/log/lighttpd/error.log"
server.indexfiles = ( "index.php", "index.html" )

accesslog.filename = "/var/log/lighttpd/access.log"

url.access-deny = ( "~", ".inc" )

server.modules = (
  "mod_rewrite",
  "mod_redirect",
  "mod_alias",
  "mod_access",
  "mod_auth",
  "mod_fastcgi",
  "mod_accesslog"
)

var.your.app = "/www/path-to-your-app"
server.document-root = var.your.app + "/public"
server.error-handler-404 = "/dispatch.fcgi"
url.rewrite = ( "^/$" => "index.html", "^([^.]+)$" => "$1.html" )
fastcgi.server = ( ".fcgi" =>
  ( "localhost" =>
    ( "min-procs" => 4,
      "max-procs" => 4,
      "socket" => "/tmp/your.app.fcgi.socket",
      "bin-path" => var.your.app + "/public/dispatch.fcgi",
     "bin-environment" => ( "RAILS_ENV" => "development" )
    )
  )
)
!

sudo chown -R www-data.www-data /etc/lighttpd

echo "LIGHTTPD_CONF_PATH=/etc/lighttpd/lighttpd.conf" | sudo tee /etc/sysconfig/lighttpd > /dev/null

sudo tee /etc/init.d/lighttpd > /dev/null <<'!'
#!/bin/sh
#
# lighttpd Startup script for the lighttpd server
#
# chkconfig: - 85 15
# description: Lighttpd web server
#
# processname: lighttpd
# config: /etc/lighttpd/lighttpd.conf
# config: /etc/sysconfig/lighttpd
# pidfile: /var/run/lighttpd.pid

# Source function library
. /etc/rc.d/init.d/functions

if [ -f /etc/sysconfig/lighttpd ]; then
  . /etc/sysconfig/lighttpd
fi

if [ -z "$LIGHTTPD_CONF_PATH" ]; then
  LIGHTTPD_CONF_PATH="/etc/lighttpd/lighttpd.conf"
fi

prog="lighttpd"
lighttpd="/usr/sbin/lighttpd"
RETVAL=0

start() {
  echo -n $"Starting $prog: "
  daemon $lighttpd -f $LIGHTTPD_CONF_PATH
  RETVAL=$?
  echo
  [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog
  return $RETVAL
}

stop() {
  echo -n $"Stopping $prog: "
  killproc $lighttpd
  RETVAL=$?
  echo
  [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog
  return $RETVAL
}

reload() {
  echo -n $"Reloading $prog: "
  killproc $lighttpd -HUP
  RETVAL=$?
  echo
  return $RETVAL
}

case "$1" in
  start)
    start
    ;;
  stop)
    stop
    ;;
  restart)
    stop
    start
    ;;
  condrestart)
    if [ -f /var/lock/subsys/$prog ]; then
      stop
      start
    fi
    ;;
  reload)
    reload
    ;;
  status)
    status $lighttpd
    RETVAL=$?
    ;;
  *)
    echo $"Usage: $0 {start|stop|restart|condrestart|reload|status}"
    RETVAL=1
esac

exit $RETVAL
!

sudo chmod 755 /etc/init.d/lighttpd
sudo /sbin/chkconfig --add lighttpd
sudo /sbin/chkconfig lighttpd on

sudo mkdir -p /var/log/lighttpd
sudo chown www-data.www-data /var/log/lighttpd
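
Before moving on, it can be worth checking that the config file at least parses. This is a minimal sketch, assuming the lighttpd binary built above is on the PATH and accepts its -t (test the config file and exit) option; check_lighttpd_conf is a hypothetical wrapper of my own, not something lighttpd ships:

```shell
#!/bin/sh
# Hypothetical helper: syntax-check a lighttpd config file without starting the server.
check_lighttpd_conf() {
  conf="${1:-/etc/lighttpd/lighttpd.conf}"
  if [ ! -r "$conf" ]; then
    echo "cannot read $conf" >&2
    return 1
  fi
  # -t asks lighttpd to test the config file named by -f and then exit
  lighttpd -t -f "$conf"
}

# Example:
#   check_lighttpd_conf /etc/lighttpd/lighttpd.conf
```

As far as I know this only validates the configuration itself; it says nothing about whether the Rails dispatcher will actually start later.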

Next, build and install Ruby from the source code. Again, I am adding configuration options to ./configure to install files into the /usr/* directories:

wget 'ftp://ftp.ruby-lang.org/pub/ruby/1.8/ruby-1.8.7-p72.tar.gz'
tar xvfz ruby-1.8.7-p72.tar.gz
cd ruby-1.8.7-p72
./configure --program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man --infodir=/usr/share/info --with-openssl
make
sudo make install

Next, install MySQL. This is the one step that RHN handles, thankfully…

sudo yum install mysql mysql-server mysql-devel
sudo /sbin/chkconfig mysqld on
sudo /etc/init.d/mysqld start

You should probably take this opportunity to change your MySQL root password, for (hopefully) obvious reasons:

/usr/bin/mysqladmin -u root password 'new password'
/usr/bin/mysqladmin -u root -p -h 'hostname' password 'new password'

Next, install FastCGI for use by lighttpd to spawn Rails dispatchers (again, using options for the /usr/* directories…):

wget http://www.fastcgi.com/dist/fcgi.tar.gz
tar xvfz fcgi.tar.gz
cd fcgi-2.4.0
./configure --prefix=/usr
make
sudo make install
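
The download above is named fcgi.tar.gz, so the directory it unpacks into is not obvious from the file name, and it may change if the upstream tarball is updated. A small sketch for looking it up instead of hard-coding it; tarball_topdir is a hypothetical helper name, and it assumes the archive has a single top-level directory:

```shell
#!/bin/sh
# Hypothetical helper: print the top-level directory of a gzipped tarball
# by listing its contents and keeping the first path component.
tarball_topdir() {
  tar tzf "$1" | head -n 1 | cut -d/ -f1
}

# Example:
#   cd "$(tarball_topdir fcgi.tar.gz)"
```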

Next, install RubyGems, so that we can grab Rails:

wget 'http://rubyforge.org/frs/download.php/45905/rubygems-1.3.1.tgz'
tar xvfz rubygems-1.3.1.tgz
cd rubygems-1.3.1
sudo ruby setup.rb

Next, grab Rails and install some other useful Gems:

sudo gem update --system
sudo gem update
sudo gem install rails --include-dependencies  # installs Rails 2.1.2
sudo gem install mysql -- --with-mysql-config=/usr/bin/mysql_config  # installs mysql-2.7
sudo gem install fcgi  # installs fcgi-0.8.7
sudo gem install packet   # installs packet 0.1.14

And, finally, if all has gone according to plan, you can bring up lighttpd and access your Rails application:

sudo /etc/init.d/lighttpd start

Hope this helps. Maybe some day soon this will be condensed to about five yum install commands via RHN…

Sunday, November 23rd, 2008 hacking, lighttpd, ruby on rails No Comments

Rails 2, Flex 3, and Form Authenticity Tokens

Recently, I was working with a Ruby on Rails application and I had the need to call a Rails controller method, with some parameters, from a remote Flex client. I would have thought that this would be a simple HTTP GET or POST to the Rails controller/method URL, using a Flex HTTPService object, with a tweak to the Rails method to render XML back to the client for parsing within Flex. However, Rails introduced the concept of form authenticity tokens in Rails 2.0, and these tokens are designed to block naive attempts to call Rails controller methods from outside of views rendered by Rails.

In simple terms, form authenticity tokens are hashcodes, tied to the user’s session, that are generated as a hidden parameter for any form that is rendered by Rails. When the form is submitted, the hashcode is passed as a hidden parameter to the Rails controller, and Rails validates this hashcode to ensure that the form submission came from a view generated by Rails. The feature is aimed at cross-site request forgery, but it also provides a measure of security against naive attempts to submit the form from other clients, since they will not have the proper hashcode needed to pass the Rails authenticity filter for the form submission. The specifics of the hashcode generation algorithm are covered elsewhere, but it suffices to say that they will resist uninspired hacking attempts, and it requires significant kung fu to bypass them without access to the Rails application.

In my case, I am not trying to hack the application; I just need to allow my Flex client to call my Rails methods. So I need to emulate the control flow of form generation in Rails, so that the view that kicks off my Flex client will contain a generated form authenticity token that can be passed to the Flex client as a startup parameter. There are three parts to this (or two if you want to condense parts 1 and 2):

  • Store the generated form authenticity token for the Flex launch view in a JavaScript variable, so that it can be substituted into the flashvars parameter of the Flex AC_FL_RunContent() JavaScript method. I chose to put this in the layout for the Flex launch view with:
<%= javascript_tag "const AUTH_TOKEN = #{form_authenticity_token.inspect};" if protect_against_forgery? %>
  • Modify the call to AC_FL_RunContent() in the Flex launch view to include the form authenticity token. The line of code for this in the AC_FL_RunContent parameters list (if you are using my method of storing this in JavaScript as AUTH_TOKEN) is:
AC_FL_RunContent(
[...]
"flashvars","authenticityToken="+AUTH_TOKEN,
[...]
);
  • I can now access the form authenticity token within Flex Actionscript code with a reference to:
Application.application.parameters.authenticityToken

Now that we have the form authenticity token in Flex, all that is left is to pass it as a parameter in the GET or POST to the Rails controller method. I found this last step to be surprisingly tricky. The Flex HTTPService object allows you to specify the parameters for an HTTP POST operation via an XML structure. Rails happily accepts and parses XML in POST operations, provided that the content type is set appropriately to application/xml. The tricky part is that the XML structure that is submitted by the Flex HTTPService object will be wrapped with a root XML tag of <request></request>, and all of the specified parameters will be contained within these tags. Rails will look for the form authenticity token as a root level tag named <authenticity_token>, and if it sees only a single root level tag of <request> (as sent by the HTTPService in Flex), then it fails the form authenticity test.

The workaround is to pass the form authenticity token as a URL parameter in the target URL of the HTTPService object, and to pass the other form variables within the standard request block of the HTTPService object, e.g.:

<mx:HTTPService id="httpService"
  url="http://mysite/method/?authenticity_token={Application.application.parameters.authenticityToken}"
  [...]>
  <mx:request>
    [...]
  </mx:request>
</mx:HTTPService>

The result of this is that Rails sees two parameters in the form submission: an XML document with a root tag of <request>, and the form authenticity token as a top-level parameter with its proper name of authenticity_token. The form parameters are accessible via the XML document, and the form authenticity token is automatically found and validated by the form authenticity filter in Rails.
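
For debugging outside of Flex, the same request shape can be reproduced from a shell. The sketch below is only an illustration under a few assumptions of mine: that the token may contain characters such as +, / or = (which depends on the Rails version and would need percent-encoding in a URL), that the session cookie belonging to the token is sent along, and that the URL, cookie file, and XML payload are placeholders. The helper names encode_token and build_url are hypothetical.

```shell
#!/bin/sh
# Percent-encode the characters that can appear in a base64-style token.
# '%' is handled first so that already-inserted escapes are not re-encoded.
encode_token() {
  printf '%s' "$1" | sed -e 's/%/%25/g' -e 's/+/%2B/g' -e 's,/,%2F,g' -e 's/=/%3D/g'
}

# Build the target URL with the token in the query string,
# mirroring the url attribute of the HTTPService example.
build_url() {
  printf '%s?authenticity_token=%s\n' "$1" "$(encode_token "$2")"
}

# Example (placeholders throughout; cookies.txt must hold the matching session cookie):
#   curl -b cookies.txt -H 'Content-Type: application/xml' \
#        -d '<request><name>value</name></request>' \
#        "$(build_url 'http://mysite/method/' "$AUTH_TOKEN")"
```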

That’s it. Hope this helps.

Wednesday, June 25th, 2008 development, hacking, ruby on rails 1 Comment