
How to create a custom RedHat / CentOS Amazon EC2 / Cloud image

Note: Cloud computing is currently a very dynamic segment of IT, and relevant changes - regarding technologies, available features and licensing - may happen at any time.

First of all, you may ask why you should build your own AMI when you can find many existing Amazon EC2 images - CentOS and other free ones.

The following sentences are taken from an official Amazon document.

"You launch AMIs at your own risk. Amazon cannot vouch for the integrity or security of AMIs shared by other EC2 users. Therefore, you should treat shared AMIs as you would any foreign code that you might consider deploying in your own data center and perform the appropriate due diligence.

Ideally, you should get the AMI ID from a trusted source (a web site, another EC2 user, etc). If you do not know the source of an AMI, we recommend that you search the forums for comments on the AMI before launching it. Conversely, if you have questions or observations about a shared AMI, feel free to use the AWS forums to ask or comment."

So what if you do not trust those public AMIs? And where are the official RedHat images? (Old document is here)

Update: When I started writing this post - 2011.04 - RedHat did not have any public AMIs available on Amazon. Besides that, RedHat licensing was really restrictive: using your own image required a special RedHat subscription and transferring 25 subscriptions to the cloud for at least half a year (at least as far as I understood the license).
 
Some historical documents:
RedHat Cloud Computing (has been removed)
RedHat Cloud FAQ (Archived)


Licensing in the cloud is another general problem. Amazon released the EC2 beta in 2006 and went public in Q4 2008. Although 3-5 years have passed, OEM licensing is still a very big problem in most cases. OEMs are trying - or, judging by the results, probably not really trying - to adapt their licensing, and they change their cloud (Amazon) licensing very frequently. Sometimes they release paid AMIs and cover the license fees that way, sometimes they require special licenses, and sometimes they allow you to use your existing licenses. So if you cannot find something now, or a license is not currently available, it will probably be available in a month, in a week or even in a day.

Update: RedHat has released some new paid AMIs. For an additional hourly fee you can use these instances and you will not have licensing problems. Search for the owner 309956199498 in the list of available AMIs and you can see the RedHat AMIs.
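The owner search can also be done from the command line. This is only a sketch, assuming the current AWS CLI (`aws`) is installed and configured, which is not the tooling that existed when this post was written. The function name `list_redhat_amis` and the default region `us-east-1` are my own choices for illustration.

```shell
# Owner ID of the RedHat AMIs, as given in the post.
REDHAT_OWNER_ID=309956199498

# list_redhat_amis [REGION]
# Prints the image ID and name of every AMI owned by that account.
# The default region is an assumption, pass your own as the first argument.
list_redhat_amis() {
    region=${1:-us-east-1}
    aws ec2 describe-images \
        --owners "$REDHAT_OWNER_ID" \
        --region "$region" \
        --query 'Images[].[ImageId,Name]' \
        --output text
}

# Usage:
#   list_redhat_amis eu-west-1
```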

See RedHat pricing here and standard pricing here for comparison.
Other historical RedHat announcements:
RedHat 5.5 32/64
RedHat 5.6 32/64
RedHat 6.0 32/64
RedHat 6.1 32/64
Common RedHat AMI updates
RedHat on Amazon EC2

So if you still want to build your own image - because of security, licensing or because you have no time to wait until it is officially released by the OEM - Amazon has a GOOD description of how to build your own image.

BUILDING YOUR OWN IMAGE IS ALWAYS A RISK FROM A LICENSING POINT OF VIEW!!!
MAKE SURE YOU HAVE THE PROPER LICENSES BEFORE USING YOUR OWN IMAGE!!!
OR BETTER, USE FREE SOFTWARE IN THE CLOUD.
(Debian, Ubuntu, CentOS)

I can only add some hopefully useful information.

If you want to create a CentOS image via yum, you have to use a pre-installed CentOS operating system as a build server. (CentOS is free to use in the cloud and has a central repo for both install and update.)

If you want to create a RedHat image via yum, you have to use a pre-installed RedHat operating system as a build server. (Licensing of RedHat might be a problem. And upgrading from the cloud via RedHat Network [RHN] is probably not the best idea.)

I recommend using a 64-bit OS version as the build platform, since you can use it to build both 32-bit and 64-bit images. On a 32-bit OS, however, you can build only 32-bit images.
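The rule above can be written as a small shell check to run on the build server before starting. This is a minimal sketch: the function name `buildable_arches` is hypothetical, and it only covers the x86 family discussed in this post.

```shell
# buildable_arches [HOST_ARCH]
# Prints the image architectures that can be built on the given host.
# HOST_ARCH defaults to the output of `uname -m` on the build server.
buildable_arches() {
    host_arch=${1:-$(uname -m)}
    case "$host_arch" in
        x86_64) echo "i386 x86_64" ;;   # 64-bit host: both 32 and 64 bit images
        i?86)   echo "i386" ;;          # 32-bit host: 32 bit images only
        *)      echo "unsupported host architecture: $host_arch" >&2
                return 1 ;;
    esac
}

# Usage:
#   buildable_arches          # checks the current machine
#   buildable_arches i686     # checks a given architecture string
```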

I also recommend using the latest version of the OS. It is possible to install an older version of the OS from the latest version. (It is most probably also possible to install the latest version of the OS from an older version, but some new features might be unavailable - for example, file system support might be missing.)

I prefer to use the original kernels from RedHat and CentOS rather than the available Amazon AKIs.

You can make your life easier by disabling SELinux - or at least running it in permissive mode. Once you have built a running image, you can try to enable SELinux again if you need it. (And I would suggest using it if possible.)
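One way to do this is to edit `etc/selinux/config` inside the image before bundling it. The sketch below assumes the image file system is mounted somewhere on the build server and that GNU sed is available. The function name `set_selinux_mode` and the mount point in the usage lines are hypothetical.

```shell
# set_selinux_mode IMAGE_ROOT MODE
# Rewrites the SELINUX= line in IMAGE_ROOT/etc/selinux/config.
# MODE must be one of: enforcing, permissive, disabled.
set_selinux_mode() {
    image_root=$1
    mode=$2
    config="$image_root/etc/selinux/config"

    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "unknown SELinux mode: $mode" >&2; return 1 ;;
    esac

    if [ ! -f "$config" ]; then
        echo "missing $config" >&2
        return 1
    fi

    # Only the SELINUX= line is touched, SELINUXTYPE= is left alone.
    sed -i "s/^SELINUX=.*/SELINUX=$mode/" "$config"
}

# Usage (the mount point is an example):
#   set_selinux_mode /mnt/image permissive   # while building
#   set_selinux_mode /mnt/image enforcing    # once the image boots fine
```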

When you install the base OS with yum, the Amazon document shows how to create a yum configuration file by hand. CentOS has a so-called "release" file which contains the yum config. Using that is more automatic and version safe. RedHat also has a release file, but a yum config is still needed there.
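As a rough sketch of the release file approach for CentOS: install the release package into the empty image root first, then let yum install the rest into the same root. I am assuming here that the "release" file is the centos-release RPM downloaded from a CentOS mirror, and that yum then takes its repo definitions and release version from the image root. The function name `bootstrap_image`, the paths and the `Base` group are illustrative, so adjust them to your setup.

```shell
# bootstrap_image IMAGE_ROOT RELEASE_RPM
# IMAGE_ROOT  - directory where the image file system is mounted
# RELEASE_RPM - path to a downloaded centos-release package
bootstrap_image() {
    root=$1
    release_rpm=$2

    # Create an empty RPM database inside the image root.
    mkdir -p "$root/var/lib/rpm" &&
    rpm --root "$root" --initdb &&

    # Install only the release package, which carries the yum repo config.
    rpm --root "$root" --nodeps -ivh "$release_rpm" &&

    # Install the base system into the image root.
    yum --installroot="$root" -y groupinstall Base
}

# Usage (run as root on the build server, paths are examples):
#   bootstrap_image /mnt/image /tmp/centos-release.rpm
```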
