Thursday, 5 February 2009

Linux Kernel Installation with VirtualBox

Following the installation of VirtualBox, I downloaded and built my own Linux kernel and applications from scratch. This blog entry is a step-by-step description of what I did.

Contents

Build Machine
Development Tools
Ipkg Manager
Directory Structure
bin Directory and Tools
etc/ipk Directory
makefiles Directory
Packages
Downloading
Building Packages
Creating the Filesystem
Building the Kernel
Building Glibc
Building Busybox
Installing Packages
Building Image
Creating CDROM
Making Bootable


Build Machine

The build machine is a Packard Bell desktop with an Intel Quad CPU Q6600 @ 2.40GHz (19200 BogoMIPS total) and 1 GB of RAM.

Each of the compile steps below was performed in sequence, and each has an indication of how long it took on my machine. Once you have performed one of the steps, you should have a feel for how long the others will take (relatively).


Development Tools

A number of development tools are needed to build a software release - with Mandriva, all of these tools can be installed through the admin menu.
  • GCC 4.3.2 or later
  • GNU Make
  • GNU Bison
  • GNU binutils 2.9.1.0.23 or later
  • Other standard GNU/Unix tools
  • LZO 1.02 (devel) or later
  • Ruby

IPKG Manager

From the installation files, you will see that I am always trying to keep the install size down, which is why I selected Busybox. It is also why I chose the Itsy Package Manager (ipkg), a cut-down version of the Debian package manager. It creates ipk files, and the tools can be downloaded from here:
  • ipkg-tools
  • ipkg-build

Directory Structure

I've created a directory hierarchy on my Linux machine, which will contain files as follows:
  • bin - contains scripts and programs I used to develop the build
  • build - working directory where all of the compilation is performed
  • etc - directory, where I include all of my configuration files for the build
  • ipk - directory where all of the ipk install files are stored
  • makefiles - I use make to control all of the building, and the makefiles live here
  • sources - holds the downloaded source code
  • image - where a build image is created
  • iso - where the iso cdrom images are stored
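
The hierarchy above can be created in one go. This is a minimal sketch: the helper name make_tree and the scratch location are illustrative assumptions, and in practice the root would be whatever setpath defines as ROOT.

```shell
#!/bin/sh
# Create the build tree described above under a given root directory.
# make_tree is a hypothetical helper name, not one of the original scripts.
make_tree() {
    root="$1"
    for dir in bin build etc etc/ipk ipk makefiles sources image iso
    do
        mkdir -p "$root/$dir" || return 1
    done
}

# Example: build the tree under a scratch directory and list it.
DEMO_ROOT="${TMPDIR:-/tmp}/blin-demo.$$"
make_tree "$DEMO_ROOT"
ls "$DEMO_ROOT"
```

Using mkdir -p means the helper can be run again safely on an existing tree.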

Bin Directory and Tools

In the bin directory, I have several scripts, the first being 'setpath', which is used to set up the build environment, and is sourced by typing a dot, space and setpath, i.e.: . setpath

setpath file contents:
#
# Root work area
#
ROOT="/home/blin"
export ROOT

PATH="$ROOT/bin:$PATH"
export PATH

#
# Source Tarball directory
#
SRC="$ROOT/sources"
export SRC

#
# SVN Archive directory
# Used for locally developed applications
#
SVN="$ROOT/svn"
export SVN

#
# Build directory
# Used to compile programs
#
BUILD="$ROOT/build"
export BUILD
#
# Installation directory
# Used to install programs (full development tree)
#
INSTALL="$ROOT/install"
export INSTALL

#
# IPK directory
# Used to create distribution packages from installation
#
IPK="$ROOT/ipk"
export IPK

#
# Image directory
# Used to create distribution image from packages
#
IMAGE="$ROOT/image"
export IMAGE

#
# Aliases
#
alias cdp="cd $ROOT"
alias cdb="cd $BUILD"

I manage all of my source bundles with ipk control files, which are stored in etc/ipk (see below). Two scripts extract information from these files: getversion and download.

getversion looks into the appropriate ipk control file, and extracts the version number of the requested application:

getversion file contents:
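#!/bin/sh
# Print the Version field from the ipk control file of an application.

# Plain sh only understands "=" or -z here, not "=="
if [ -z "$ROOT" ]
then
echo "\$ROOT not set, did you source setpath?"
exit 1
fi

cd "$ROOT/build"

if [ -z "$1" ]
then
echo "getversion application"
echo ""
echo "e.g. getversion glibc"
echo ""
exit 1
fi

if [ -f "$ROOT/etc/ipk/$1" ]
then
# Match only the Version: header line and print everything after it
VERS="$(grep '^Version:' "$ROOT/etc/ipk/$1" | cut -d' ' -f2-)"
echo "$VERS"
else
echo "0.0.0.0.0.0.0"
fi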
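
To see what getversion extracts, the following sketch writes a sample control file to a scratch directory and pulls out the Version field in the same way. The helper name version_of and the scratch path are illustrative assumptions; the version number is simply the Busybox release mentioned later in this entry.

```shell
#!/bin/sh
# version_of is a hypothetical helper mirroring the lookup done by getversion.
version_of() {
    # $1 = directory holding control files, $2 = package name
    if [ -f "$1/$2" ]
    then
        grep '^Version:' "$1/$2" | cut -d' ' -f2-
    else
        echo "0.0.0.0.0.0.0"
    fi
}

# Write a sample control file into a scratch directory.
CTRL_DIR="${TMPDIR:-/tmp}/ipk-demo.$$"
mkdir -p "$CTRL_DIR"
cat > "$CTRL_DIR/busybox" <<EOF
Package: busybox
Version: 1.13.2
Depends: glibc
EOF

version_of "$CTRL_DIR" busybox    # prints 1.13.2
version_of "$CTRL_DIR" missing    # prints 0.0.0.0.0.0.0
```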

download fetches the requested file from its source and stores the resulting tar / bzip file in the sources directory.

download file contents:
#!/bin/sh
# Fetch the sources named in the ipk control files into $ROOT/sources.
# With no argument every control file is read, otherwise only the named one.

DATE="`date +%y%m%d`"

if [ -z "$ROOT" ]; then
echo "ROOT environment variable is not set."
echo "has setpath been sourced?"
exit 1
fi

if [ -z "$(which wget 2> /dev/null)" ]; then
echo "wget is required, please install."
exit 1
fi

cd "$ROOT/sources" || exit 1

#
# Download via wget
#

# Control files are read relative to $ROOT/sources
if [ -z "$1" ]; then
SOURCES="`cat ../etc/ipk/* | grep Source: | grep wget, | cut -d',' -f2-`"
else
SOURCES="`cat ../etc/ipk/$1 | grep Source: | grep wget, | cut -d',' -f2-`"
fi

for SOURCE in $SOURCES
do
FILE=`basename $SOURCE`
NAME="`echo $FILE | cut -d- -f1`"
if [ -f $FILE ]
then
echo "$NAME has already been downloaded, skipping."
else
echo "Downloading $NAME from $SOURCE."
# Move any older version aside; the glob must stay outside the quotes
mkdir -p old
mv -f "$NAME"-* old > /dev/null 2>&1
wget -nv "$SOURCE"
fi
done

#
# Download via git
#

if [ -z "$1" ]; then
SOURCES="`cat ../etc/ipk/* | grep Source: | grep git, | cut -d',' -f2-`"
else
SOURCES="`cat ../etc/ipk/$1 | grep Source: | grep git, | cut -d',' -f2-`"
fi

for SOURCE in $SOURCES
do
NAME=`basename $SOURCE`
FILE="$NAME-$DATE.tar.bz2"
# ls copes with several dated snapshots, which [ -f ] does not
if ls $NAME-*.tar.bz2 > /dev/null 2>&1
then
echo "$NAME has already been downloaded, skipping. (Manually remove $FILE to force download)"
else
echo "Downloading $NAME from $SOURCE."
mkdir -p ../tmp
cd ../tmp
rm -rf ../tmp/*
git clone $SOURCE > /dev/null
mv $NAME $NAME-$DATE
rm -rf $NAME-$DATE/.git
tar cf ../sources/$NAME-$DATE.tar $NAME-$DATE
rm -rf $NAME-$DATE
cd ../sources
bzip2 $NAME-$DATE.tar
fi
done

To unpack the downloaded bundles for compilation, I use an unpack script called unpack!

unpack file contents:

#!/bin/sh
# Unpack the source tarball of an application into $ROOT/build/$ARCH.

if [ -z "$ROOT" ]
then
echo "\$ROOT not set, did you source setpath?"
exit 1
fi

if [ -z "$ARCH" ]
then
echo "\$ARCH not set, did you remember to set and export it?"
exit 1
fi

# Check the argument before it is used to look up the control file
if [ -z "$1" ]
then
echo "unpack application"
echo ""
echo "e.g. unpack glibc"
echo ""
exit 1
fi

# Packages that depend on host-development-environment are built for the host
HOSTP="`grep Depends $ROOT/etc/ipk/$1 | cut -d' ' -f2`"
if [ "$HOSTP" = "host-development-environment" ]
then
export ARCH="host"
fi

mkdir -p $ROOT/build/$ARCH
cd $ROOT/build/$ARCH

# Take the first matching tarball; the glob may match several versions
FILE="`ls $ROOT/sources/$1-* 2> /dev/null | head -1`"
if [ -n "$FILE" ]
then
EXT="`echo $FILE | awk -F . '{print $NF}'`"
DONE="0"
if [ "$EXT" = "tar" ]
then
echo "Unpacking $FILE"
tar xf $FILE
touch $1-*
DONE="1"
fi
if [ "$EXT" = "gz" ]
then
echo "Unpacking compressed $FILE"
tar xzf $FILE
touch $1-*
DONE="1"
fi
if [ "$EXT" = "bz2" ]
then
echo "Unpacking bz2 compressed $FILE"
bunzip2 -c $FILE | tar xf -
touch $1-*
DONE="1"
fi
if [ "$DONE" = "0" ]
then
echo "Unknown file extension for $FILE"
fi
else
echo "package $1 not found in $ROOT/sources"
echo "ensure that the ipk control file exists in $ROOT/etc/ipk"
echo "then run download"
exit 1
fi

etc/ipk Directory

The etc/ipk directory contains the control files for all of the applications to be downloaded and used in the image. The ipkg manager understands the concept of dependencies, and requires that specific packages are installed in order to use other packages. Each package has a control file, which contains this and more information. Control files are in the following format:
Package: packagename
Section: admin / base / comm / editors / extras / graphics / libs / misc / net / text / web / x11
Priority: required / optional / standard / important / extra
Version: version.number
Architecture: ARCHITECTURE / all
Maintainer: www.vizier.co.uk
Source: wget,url / git,url
Depends: dependencies,separated,by,commas
Description: Textual Description of the package
Packages are downloaded and stored in files containing their package name and version number, e.g. packagename-version.number.tar.bz2. The word ARCHITECTURE is automatically replaced during compilation with the processor architecture of the release, e.g. i586. When they are built, they end up in an ipk file with a name such as busybox-1.16.0-i586.ipk.
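
The naming rule above can be sketched in a few lines of shell. This is only an illustration of the substitution and the resulting file name, not the code used by the build: the helper names fill_arch and ipk_name and the scratch path are assumptions.

```shell
#!/bin/sh
# Fill in the ARCHITECTURE placeholder and derive the ipk file name.
# fill_arch and ipk_name are hypothetical helpers, not part of ipkg.
fill_arch() {
    # $1 = control file, $2 = architecture, e.g. i586
    sed "s/^Architecture: ARCHITECTURE\$/Architecture: $2/" "$1"
}
ipk_name() {
    # $1 = control file, $2 = architecture
    pkg=$(grep '^Package:' "$1" | cut -d' ' -f2)
    ver=$(grep '^Version:' "$1" | cut -d' ' -f2)
    echo "$pkg-$ver-$2.ipk"
}

# Sample control file in a scratch location.
CTRL="${TMPDIR:-/tmp}/arch-demo.$$"
cat > "$CTRL" <<EOF
Package: busybox
Version: 1.16.0
Architecture: ARCHITECTURE
EOF

fill_arch "$CTRL" i586    # control file with Architecture: i586
ipk_name "$CTRL" i586     # prints busybox-1.16.0-i586.ipk
```

Packages marked "Architecture: all" are left untouched by the substitution, since only the literal placeholder is matched.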

The Source field is used by the download script to fetch the appropriate source code and place it in the sources directory.
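
A Source line carries the fetch method and the URL separated by a comma. A minimal sketch of splitting the two, assuming a single Source line per control file; the helper names are hypothetical and the URL is a placeholder, not a real download location:

```shell
#!/bin/sh
# Split "Source: method,url" into its method and URL parts.
# source_method and source_url are hypothetical helper names.
source_method() {
    grep '^Source:' "$1" | cut -d' ' -f2- | cut -d',' -f1
}
source_url() {
    grep '^Source:' "$1" | cut -d',' -f2-
}

# Sample control file in a scratch location (placeholder URL).
CONTROL="${TMPDIR:-/tmp}/control-demo.$$"
cat > "$CONTROL" <<EOF
Package: example
Version: 1.0
Source: wget,http://example.invalid/example-1.0.tar.bz2
EOF

echo "method: $(source_method "$CONTROL")"
echo "url:    $(source_url "$CONTROL")"
```

Cutting on the first comma only (-f2-) keeps any further commas that happen to be part of the URL.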

Depends is a list of the packages that must be installed for this package to work. The root package is named filesystem, onto which the linux kernel is installed. The main library glibc needs the kernel, and most applications need glibc. The hierarchy of dependencies is shown in the Packages section below.
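
The Depends fields are enough to work out an install order. The following is a rough sketch rather than anything ipkg itself does: install_order is a hypothetical helper, the sample control files are cut down to the two fields that matter, and it assumes there are no circular dependencies.

```shell
#!/bin/sh
# Print a package and everything it depends on, dependencies first.
# install_order is a hypothetical helper; it assumes there are no
# circular dependencies between control files.
install_order() {
    # $1 = control file directory, $2 = package name
    deps=$(grep '^Depends:' "$1/$2" 2> /dev/null | cut -d' ' -f2- | tr ',' ' ')
    for dep in $deps
    do
        install_order "$1" "$dep"
    done
    echo "$2"
}

# Sample control files following the chain described above.
DEPS_DIR="${TMPDIR:-/tmp}/deps-demo.$$"
mkdir -p "$DEPS_DIR"
printf 'Package: filesystem\n' > "$DEPS_DIR/filesystem"
printf 'Package: linux\nDepends: filesystem\n' > "$DEPS_DIR/linux"
printf 'Package: glibc\nDepends: linux\n' > "$DEPS_DIR/glibc"
printf 'Package: busybox\nDepends: glibc\n' > "$DEPS_DIR/busybox"

# awk drops repeats when two packages share a dependency.
install_order "$DEPS_DIR" busybox | awk '!seen[$0]++'
```

For the sample files this lists filesystem, linux, glibc and busybox, one per line, which is the order in which they would need to be installed.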

There are also 4 scripts which may be used during installation / removal of a package - these are:
preinst [runs prior to install]
postinst [runs after files are installed]
prerm [runs prior to removal]
postrm [runs after files are removed]
These scripts may also exist in the etc/ipk directory, and be named packagename-preinst, packagename-postinst etc.

makefiles Directory

Building packages is performed using makefiles. In order to keep all of the customisation in a single place, it is all done in the appropriate makefile.

... add words on makefiles here ..

Packages

The packages are all identified in the etc/ipk directory, and have the following dependency hierarchy:
filesystem
linux
glibc
busybox
ipkg
brltty
Rather than show the contents of all of the packages here, they can be downloaded as part of the install tree from here.

Downloading

With the configuration files created, downloading is as simple as typing 'download', which runs the script from the bin directory.

Building Packages

Downloaded packages all need to be configured and compiled for the target architecture.

Creating the Filesystem

The target machine filesystem needs a basic directory structure and some system device 'files' created. This can all be done inside an ipk install script.
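
A minimal sketch of what such a script might do, assuming a conventional Linux layout. The directory list is illustrative rather than the actual install script, and the helper name make_skeleton is hypothetical. Device nodes need root privileges, so that part only runs as root.

```shell
#!/bin/sh
# Create a basic root filesystem skeleton under a target directory.
# make_skeleton is a hypothetical helper; the directory list is an assumption.
make_skeleton() {
    target="$1"
    for dir in bin dev etc lib proc sbin sys tmp usr var
    do
        mkdir -p "$target/$dir" || return 1
    done
    # World-writable with the sticky bit, as is usual for /tmp
    chmod 1777 "$target/tmp"

    # mknod needs root; console is char 5,1 and null is char 1,3
    if [ "$(id -u)" -eq 0 ]
    then
        mknod "$target/dev/console" c 5 1 2> /dev/null || echo "skipping dev/console"
        mknod "$target/dev/null" c 1 3 2> /dev/null || echo "skipping dev/null"
    fi
}

# Example: build the skeleton in a scratch directory and list it.
SKEL="${TMPDIR:-/tmp}/skeleton-demo.$$"
make_skeleton "$SKEL"
ls "$SKEL"
```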

.. list the ipk install script here as it is interesting ..

Once the ipk install script has been created, it should be saved as a source tarball (remember that this one is created and cannot be downloaded, so be careful about removing everything in the sources directory).

.. create the filesystem tarball ..

.. create the filesystem ipk file ..

Building the Kernel

Building the Linux kernel is probably one of the longest individual tasks that you will need to do. For me, the compilation took 25 minutes.



Building Glibc
Building Busybox


Installing Packages


Building Image


Creating CDROM


Making Bootable





Unpacking, Configuring and Building the Kernel (25 minutes)

For my first build, I've decided to build the kernel with its default configuration. I've downloaded the kernel source, and unpacked into source/linux-2.6.28.10 and compiled it:
# unpack

cd sources
tar xzf ../0-tarballs/linux-2.6.28.10.tar.gz

# configure kernel

cd source/linux-*
make mrproper
make menuconfig
cp .config* ../kernel-config

# compile

make
make bzImage
make modules
rm -rf ../../images/kernel
mkdir ../../images/kernel ../../images/kernel/boot
make INSTALL_MOD_PATH=../../images/kernel INSTALL_PATH=../../images/kernel/boot modules_install install
* Note - need to install the kernel and header files in images, so that glibc can use them when it is being compiled.

This installation created .... in ....


Unpacking and Building GLibC (15 minutes)

GLibC is configured and built in the images directory:
# unpack

cd sources
tar xzf ../tarballs/glibc-2.10.tar.gz


# build

mkdir ../root ../root/etc
cd ../images/glibc
../../source/glibc*/configure --prefix=`pwd`/../root CFLAGS="-march=i586 -O2"
make
echo > ../root/etc/ld.so.conf
make install
This installation creates libc.a and libc.so in images/glibc, along with the header files.

* Note need to install glibc and header files in images so that applications can use them when compiling.

Unpacking and Building the Bootloader
# cd bld/grub-1.96
# DST=`pwd`/../../dst
# ./configure --prefix=$DST CFLAGS="-march=i586 -O2"
# make
# make install
Unpacking and Building Busybox

I've downloaded and unpacked busybox-1.13.2 into the bld/busybox-1.13.2 directory.
# unpack

cd source
tar xvf ../tarballs/busybox*.tar.bz2
cd busybox*
make defconfig
make
Busybox needs libm.so.6, libc.so.6, linux-gate.so.1 and ld-linux.so.2. The above script needs modifying to link against the freshly compiled libraries.

This installation creates 'busybox' in the source/busybox* directory.


Building and Installing Grub

cd source/grub-1.95
./configure --prefix=`pwd`/../../root
make
make install

Building an Image
# Directories here need to be sorted out

cd distribution
rm -rf iso-boot
mkdir iso-boot
(cd kernel ; tar cf - . ) | ( cd iso-boot ; tar xf - )
(cd iso-boot/boot ; ln -s vm* vmlinuz)

genisoimage -o iso/boot.iso \
-b source/syslinux*/core/isolinux.bin -c source/syslinux*/core/boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table \
images/iso-boot
The Automated Makefile

Booting VirtualBox

Modifying the Image

Distributing the Image


Notes

Builds should be done with the correct architecture in mind, for example:
CFLAGS="-march=i586 -O2"
The -march flag tells the compiler to generate code for the Intel 586 architecture. This is particularly important when compiling low-level code, such as parts of GLibC and kernel drivers, which use architecture-dependent assembler.

You can change the architecture, but you need to do it consistently.
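
One way to keep this consistent is to define the architecture once and derive the flags from it. A minimal sketch, assuming the ARCH variable that the unpack script already expects; the i586 default is just the value used throughout this entry:

```shell
#!/bin/sh
# Define the target architecture once and derive the compiler flags from it.
# ARCH is the variable the unpack script checks; i586 is only a default.
ARCH="${ARCH:-i586}"
CFLAGS="-march=$ARCH -O2"
export ARCH CFLAGS

echo "ARCH=$ARCH"
echo "CFLAGS=$CFLAGS"
```

Exporting CFLAGS this way suits configure-based packages, which can then be given CFLAGS="$CFLAGS" instead of a hand-typed flag string.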




