
How to create a custom RedHat / CentOS Amazon EC2 / Cloud image

Note: Cloud computing is currently a very dynamic segment of IT, and relevant changes to technologies, available features and licensing might happen at any time.

First of all, you may ask why you should build your own AMI when there are many existing Amazon EC2 images available, for CentOS and other free distributions.

The following sentences are taken from an official Amazon document:

"You launch AMIs at your own risk. Amazon cannot vouch for the integrity or security of AMIs shared by other EC2 users. Therefore, you should treat shared AMIs as you would any foreign code that you might consider deploying in your own data center and perform the appropriate due diligence.

Ideally, you should get the AMI ID from a trusted source (a web site, another EC2 user, etc). If you do not know the source of an AMI, we recommend that you search the forums for comments on the AMI before launching it. Conversely, if you have questions or observations about a shared AMI, feel free to use the AWS forums to ask or comment."

So what if you do not trust those public AMIs? And where are the official RedHat images? (Old document is here)

Update: When I started writing this post (April 2011), RedHat did not have any public AMIs available on Amazon. Besides that, RedHat licensing was very restrictive: using your own image required a special RedHat subscription and transferring 25 subscriptions to the cloud for at least half a year (at least as far as I understood the license).
 
Some historical documents:
RedHat Cloud Computing (has been removed)
RedHat Cloud FAQ (Archived)


Licensing in the cloud is another general problem. Amazon released the EC2 beta in 2006 and went public in Q4 2008. Although 3-5 years have passed, OEM licensing is still a very big problem in most cases. OEMs are trying to adapt their licensing (or, judging by the results, probably not trying very hard) and change their cloud (Amazon) licensing very frequently. Sometimes they release paid AMIs and cover the license fees that way, sometimes they require special licenses, and sometimes they allow you to use your existing licenses. So if you cannot find something now, or a license is not currently available, it will probably become available in a month, in a week or even in a day.

Update: RedHat has released some new paid AMIs. For an additional hourly fee you can use these instances without any licensing problems. Search for the owner 309956199498 in the list of available AMIs to see the RedHat AMIs.

See RedHat pricing here and standard pricing here for comparison.
Other historical RedHat announcements:
RedHat 5.5 32/64
RedHat 5.6 32/64
RedHat 6.0 32/64
RedHat 6.1 32/64
Common RedHat AMI updates
RedHat on Amazon EC2

So if you still want to build your own image, whether because of security, licensing or because you have no time to wait until it is officially released by the OEM, Amazon has a GOOD description of how to build your own image.

BUILDING YOUR OWN IMAGE IS ALWAYS A RISK FROM A LICENSING POINT OF VIEW!!!
MAKE SURE YOU HAVE THE PROPER LICENSES BEFORE USING YOUR OWN IMAGE!!!
OR BETTER, USE FREE SOFTWARE IN THE CLOUD.
(Debian, Ubuntu, CentOS)

I can only add some hopefully useful information.

If you want to create a CentOS image via yum, you have to use a pre-installed CentOS operating system as the build server. (CentOS is free to use in the cloud and has a central repository for both installation and updates.)

If you want to create a RedHat image via yum, you have to use a pre-installed RedHat operating system as the build server. (Licensing of RedHat might be a problem, and upgrading from the cloud via RedHat Network [RHN] is probably not the best idea.)

I recommend using a 64-bit OS version as the build platform, since you can use it to build both 32-bit and 64-bit images. On a 32-bit OS, however, you can build only 32-bit images.
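The rule above can be written down as a small lookup. This is only an illustrative sketch: the helper name `buildable_arches` is made up for this post, and the architecture strings are an assumption based on what `uname -m` typically reports on x86 hosts.

```python
import platform

# Hypothetical mapping from the build host's machine type to the image
# architectures it can produce. It encodes the rule of thumb above only.
_BUILDABLE = {
    "x86_64": ["i386", "x86_64"],  # 64-bit host: 32-bit and 64-bit images
    "i686": ["i386"],              # 32-bit host: 32-bit images only
    "i386": ["i386"],
}


def buildable_arches(host_machine=None):
    """Return the image architectures that can be built on the given host."""
    if host_machine is None:
        # Default to the machine type of the host running this script.
        host_machine = platform.machine()
    return list(_BUILDABLE.get(host_machine, []))
```

Calling `buildable_arches()` without an argument checks the machine the script runs on. An unknown machine type yields an empty list.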

I also recommend using the latest version of the OS on the build server. It is possible to install an older version of the OS from a build server running the latest version. (It is most probably also possible to install the latest version from an older build server, but some new features might be unavailable, for example because file system support is missing.)

I prefer to use the original kernels from RedHat and CentOS rather than the available Amazon AKIs.

You can make your life easier by disabling SELinux, or at least running it in permissive mode. Once you have managed to build a running image, you can try to enable SELinux again if you need it. (And I would suggest using it if possible.)
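Switching the mode comes down to changing the `SELINUX=` line in the image's SELinux configuration file (normally `etc/selinux/config` under the image root). A minimal sketch, assuming you read the file yourself and pass its text in. The function name `set_selinux_mode` is hypothetical:

```python
import re


def set_selinux_mode(config_text, mode="permissive"):
    """Return config_text with its SELINUX= line set to the given mode."""
    if mode not in ("enforcing", "permissive", "disabled"):
        raise ValueError("unknown SELinux mode: %s" % mode)
    new_line = "SELINUX=" + mode
    # Match only the active SELINUX= line. Commented lines and the
    # SELINUXTYPE= line do not start with "SELINUX=" and are left alone.
    pattern = re.compile(r"^SELINUX=.*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(new_line, config_text)
    # No SELINUX= line yet: append one at the end.
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"
```

Once the image boots properly, the same function with `mode="enforcing"` switches it back.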

When you install the base OS with yum, the Amazon document shows how to create a yum configuration file by hand. CentOS has a so-called "release" file which contains the yum config. Using it is more automatic and version safe. RedHat also has a release file, but there you still need to provide the yum config yourself.
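For the hand-written variant, the sketch below generates a minimal single-repository yum config and the matching install command. It is not the exact config from the Amazon document: the function names, the repository id and any mirror URL you pass in are assumptions for illustration. Only the yum options used (`-c`, `--installroot`, `-y`, `groupinstall`, `reposdir`) are standard.

```python
def yum_conf(repo_id, baseurl):
    """Build the text of a minimal, single-repository yum configuration."""
    lines = [
        "[main]",
        # Ignore the build host's own repository definitions.
        "reposdir=/dev/null",
        "",
        "[%s]" % repo_id,
        "name=%s" % repo_id,
        "baseurl=%s" % baseurl,
        "enabled=1",
    ]
    return "\n".join(lines) + "\n"


def yum_install_command(conf_path, install_root, group="Base"):
    """Build the argument list that installs a package group into install_root."""
    return [
        "yum", "-c", conf_path,
        "--installroot=" + install_root,
        "-y", "groupinstall", group,
    ]
```

Write the output of `yum_conf` to a file and run the list from `yum_install_command` with `subprocess.call`, pointing `install_root` at the mounted image directory.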
