Unix Kernel Compilation and 64-bit Computing
UNIX Operating System (DatZ6007)
Instructors: doc. Normunds Grūzītis, prof. Guntis Bārzdiņš
Co-authors: Ģirts Folkmanis, Juris Krūmiņš, Kristaps Džonsons, Leo Trukšāns, Artūrs Lavrenovs
University of Latvia
Kernel Overview
[Figure labels: distro, kernel]
Compile a kernel from source
Mostly you don’t need to compile the kernel
  When there is a critical update, use APT (etc.) to upgrade the distro’s kernel
There are a few situations where you may have to compile it
  To install the latest kernel whose deb/rpm is not available yet
  To enable experimental features that are not part of the default kernel
  To enable support for new hardware that the default kernel does not yet support
  To learn how the kernel works: you might want to explore the kernel source code [and make some changes], and compile it on your own
Note: you don’t need to compile the kernel just to compile a driver
  You need only the kernel headers
Linux
Kernel source
Download the source from www.kernel.org
Unpack:
  cd /usr/src/
  tar xzf linux-<version>.tar.gz
Source root: /usr/src/linux-<version>/
12 Nov 2019 (next: 14 Nov)
Kernel version example 2.6.20-pre4: major = 2, minor = 6, patchlevel = 20 (the figure also carries the label "distro")
Kernel source tree
/usr/src/linux-<version>/
  Documentation
  arch: alpha arm i386 ia64 m68k mips mips64 ppc s390 sh sparc sparc64
  drivers: acorn atm block cdrom char dio fc4 i2c i2o ide ieee1394 isdn macintosh misc net …
  fs: adfs affs autofs autofs4 bfs coda cramfs devfs devpts efs ext2 fat hfs hpfs …
  include: asm-alpha asm-arm asm-generic asm-i386 asm-ia64 asm-m68k asm-mips asm-mips64 linux math-emu net pcmcia scsi video …
  init
  ipc
  kernel
  lib
  mm
  net: 802 appletalk atm ax25 bridge core decnet econet ethernet ipv4 ipv6 ipx irda khttpd lapb …
  scripts
Kernel source (linux-2.4.0)
size    directory         entries  files  LOC
90M     /usr/src/linux/   19       7645   2.6M
4.5M    Documentation     97       380    N/A
16.5M   arch              12       1685   466K
54M     drivers           31       2256   1.5M
5.6M    fs                70       489    150K
14.2M   include           19       2262   285K
28K     init              2        2      1K
120K    ipc               6        6      4.5K
332K    kernel            25       25     12K
80K     lib               8        8      2K
356K    mm                19       19     12K
5.8M    net               33       453    162K
400K    scripts           26       42     12K
arch/
Subdirectories for each current port.
Each contains kernel, lib, mm, boot and other directories whose contents override code stubs in architecture-independent code.
lib contains highly optimized common utility routines such as memcpy, checksums, etc.
arch as of 2.4: alpha, arm, i386, ia64, m68k, mips, mips64, ppc, s390, sh, sparc, sparc64
drivers/
Largest amount of code in the kernel tree
Device, bus, platform and general directories
  drivers/char – n_tty.c (the default line discipline)
  drivers/block – elevator.c, genhd.c, linear.c, ll_rw_blk.c, raidN.c
  drivers/net – specific drivers and general routines (net_init.c)
  drivers/scsi – scsi_*.c (generic), sd.c (disk), sr.c (CD-ROM)
General: cdrom, ide, isdn, parport, pcmcia, pnp, sound, video
Buses: fc4, i2c, nubus, pci, sbus, tc, usb
Platforms: acorn, macintosh, s390, sgi
fs/
Contains:
  Virtual filesystem (VFS) framework
  Subdirectories for actual filesystems
VFS-related files:
  exec.c, binfmt_*.c – files for mapping new process images
  devices.c, blk_dev.c – device registration, block device support
  super.c, filesystems.c
  inode.c, dcache.c, namei.c, buffer.c, file_table.c
  open.c, read_write.c, select.c, pipe.c, fifo.c
  fcntl.c, ioctl.c, locks.c, dquot.c, stat.c
include/
include/asm-*:
  Architecture-dependent include subdirectories
include/linux:
  Header info needed both by the kernel and by user apps
  Usually linked to /usr/include/linux
  Kernel-only portions guarded by #ifdefs:
    #ifdef __KERNEL__
    /* kernel stuff */
    #endif
Other directories: math-emu, net, pcmcia, scsi, video
init/
Just two files (as of 2.4.x): version.c, main.c
version.c – the version banner that prints at boot
main.c – architecture-independent boot code
  start_kernel is the primary entry point
ipc/
System V IPC facilities
If disabled at compile-time, util.c exports stubs that simply return -ENOSYS
One file for each facility:
  sem.c – semaphores
  shm.c – shared memory
  msg.c – message queues
kernel/
The core kernel code
sched.c – “the main kernel file”: scheduler, wait queues, timers, alarms, task queues
Process control: fork.c, exec.c, signal.c, exit.c, etc.
Kernel module support: kmod.c, ksyms.c, module.c
Other operations: time.c, resource.c, dma.c, softirq.c, itimer.c, printk.c, info.c, panic.c, sysctl.c, sys.c
lib/
Kernel code cannot call standard C library functions
Files:
  brlock.c – “Big Reader” spinlocks
  cmdline.c – kernel command line parsing routines
  errno.c – global definition of errno
  inflate.c – “gunzip” part of gzip.c used during boot
  string.c – portable string code (usually replaced by optimized, architecture-dependent routines)
  vsprintf.c – libc replacement
  …
mm/
Paging and swapping:
  swap.c, swapfile.c (paging devices), swap_state.c (cache)
  vmscan.c – paging policies, kswapd
  page_io.c – low-level page transfer
Allocation and deallocation:
  slab.c – slab allocator
  page_alloc.c – page-based allocator
  vmalloc.c – kernel virtual-memory allocator
Memory mapping:
  memory.c – paging, fault-handling, page table code
  filemap.c – file mapping
  mmap.c, mremap.c, mlock.c, mprotect.c
scripts/
Scripts for:
  Menu-based kernel configuration
  Kernel patching
  Generating kernel documentation
Linux Kernel Configuration
Download the source code and extract it under /usr/src
Customize kernel configuration
  make config
  make menuconfig
  make xconfig
Device support
  Three options: Y, M, N
    Y (compile directly into the kernel)
    N (do not compile at all)
    M (compile as a kernel module)
Kernel Config
1. make menuconfig
2. make dep
3. make bzImage
4. make modules
5. make modules_install
6. copy bzImage to /boot
7. edit bootloader config
Linux Kernel Configuration (Cont’d)
cd /usr/src/linux-2.4.20-8
make xconfig
This command runs an X-based configuration tool that asks you a specific question about every kernel configuration option
Linux Kernel Configuration (Cont’d)
make config - plain text interface
make menuconfig - text-based interface with menus and radio lists; lets you save your progress; ncurses must be installed (apt-get install libncurses5-dev)
make xconfig - X Window interface; Qt is required (KDE)
make gconfig - X Window interface; GTK is required (GNOME)
make defconfig - creates a config file that uses default settings for the current system’s architecture
make localmodconfig - creates a config file based on the currently loaded modules and system configuration
make oldconfig - updates an existing config file to match newer kernel source code (asks only the questions that are new)
make randconfig - makes random choices for the kernel options
…
Building and Installing Kernel
Compiling the kernel
  [ make dep ]
  [ make clean ]
  make bzImage
  make modules
Installing the kernel
  make modules_install
  make install
  [ cp arch/i386/boot/bzImage /boot/ ]
  [ Edit lilo.conf and run /sbin/lilo ]
  Reboot
Building the Kernel (Cont’d)
[ make dep ]
  Creates dependency information, so that the build knows each component’s dependencies and can recompile components as appropriate.
[ make clean ]
  Cleans up some miscellaneous object files.
make bzImage
  Compiles the Linux kernel proper. The result is a kernel file called bzImage, located in /usr/src/linux-2.4.20-8/arch/i386/boot
make modules
  Compiles the kernel module files
Building the Kernel (Cont’d)
make modules_install
  Installs the kernel modules into the directory path /lib/modules/2.4.20-8/kernel/drivers
make install
  Copies the new kernel and its associated files to the /boot directory
  Builds a new initrd image
  Adds new entries to the boot loader configuration file
Use ls /boot to make sure the initrd-2.4.20-8.img file was created
Check that the file /boot/grub/grub.conf contains a title section with the same version as the kernel package just installed
initrd
/boot
unix boot # pwd; ls -lRp
/boot
.:
total 1582
lrwxrwxrwx 1 root root       1 Sep 23 14:11 boot -> ./
drwxr-xr-x 2 root root    1024 Sep 23 15:34 grub/
-rw-r--r-- 1 root root  458622 Sep 23 14:58 initrd-2.4.26-gentoo-r9
-rw-r--r-- 1 root root 1137878 Sep 23 14:50 kernel-2.4.26-gentoo-r9

./grub:
total 846
-rw-r--r-- 1 root root     30 Sep 23 15:34 device.map
-rw-r--r-- 1 root root  11264 Sep 23 15:34 e2fs_stage1_5
-rw-r--r-- 1 root root  10256 Sep 23 15:34 fat_stage1_5
-rw-r--r-- 1 root root   9216 Sep 23 15:34 ffs_stage1_5
-rw-r--r-- 1 root root    245 Sep 23 15:34 grub.conf
-rw-r--r-- 1 root root   1495 Sep 23 15:32 grub.conf.sample
-rw-r--r-- 1 root root  11456 Sep 23 15:34 jfs_stage1_5
lrwxrwxrwx 1 root root      9 Sep 23 15:32 menu.lst -> grub.conf
-rw-r--r-- 1 root root   9600 Sep 23 15:34 minix_stage1_5
-rwxr-xr-x 1 root root 196836 Sep 23 15:32 nbgrub
-rwxr-xr-x 1 root root 197860 Sep 23 15:32 pxegrub
-rw-r--r-- 1 root root  12864 Sep 23 15:34 reiserfs_stage1_5
-rw-r--r-- 1 root root  33856 Sep 23 15:32 splash.xpm.gz
-rw-r--r-- 1 root root    512 Sep 23 15:34 stage1
-rw-r--r-- 1 root root 135148 Sep 23 15:34 stage2
-rwxr-xr-x 1 root root 196900 Sep 23 15:32 stage2.netboot
-rw-r--r-- 1 root root   8896 Sep 23 15:34 vstafs_stage1_5
-rw-r--r-- 1 root root  12840 Sep 23 15:34 xfs_stage1_5
GRUB
# /etc/grub.conf generated by anaconda
timeout=10
splashimage=(hd0,1)/grub/splash.xpm.gz
password --md5 $1$ÕpîÁÜdþï$J08sMAcfyWW.C3soZpHkh.
title Red Hat Linux (2.4.18-3custom)
    root (hd0,1)
    kernel /vmlinuz-2.4.18-3custom ro root=/dev/hda5
    initrd /initrd-2.4.18-3.img
title Red Hat Linux (2.4.18-3) Emergency kernel (no afs)
    root (hd0,1)
    kernel /vmlinuz-2.4.18-3 ro root=/dev/hda5
    initrd /initrd-2.4.18-3.img
title Windows 2000 Professional
    rootnoverify (hd0,0)
    chainloader +1
Linux Loader (LILO)
# sample /etc/lilo.conf
boot = /dev/hda
delay = 40
password=SOME_PASSWORD_HERE
default=vmlinuz-stable
vga = normal
root = /dev/hda1
image = vmlinuz-2.5.99
    label = net test kernel
    restricted
image = vmlinuz-stable
    label = stable kernel
    restricted
other = /dev/hda3
    label = Windows 2000 Professional
    restricted
    table = /dev/hda
Build kernel for other platform
make menuconfig ARCH=x86_64
/usr/src/linux/arch:
  alpha cris ia64 mips parisc ppc64 s390x sh64 sparc64
  arm i386 m68k mips64 ppc s390 sh sparc x86_64
Cross-compile:
  make HOSTCC="gcc -m32" ARCH="x86_64" bzImage
Get a gcc wrapper in order to cross-compile on an i386 host:
  http://www.jukie.net/~bart/debian/amd64/scripts/gcc.bart
BSD
Source Code Control
The entire source code for FreeBSD is stored in a CVS / SVN repository
The logs, and individual changes for each file can be traced back to 1994
The source tree can be checked out at any state, or corresponding to any release
CDs are available taking the history back a further 20 years
Building World
The entire distribution, including all libraries and utilities can be built with a single command: “make world”
The kernel is built separately with a single command: “make kernel”
The source code is located in /usr/src
Building Releases
“make release”: a single command to build a complete release of FreeBSD
Used by large companies to produce special versions of FreeBSD with special patches or additional software installed by default
  Makes it easier to deploy hundreds or thousands of systems pre-configured for a specific environment
It is also the well-documented way in which the release engineering team makes all official releases of FreeBSD
P.S. Running a FreeBSD binary
Code like
  fd = open("/etc/passwd", O_RDONLY);
Becomes
  syscall(5, ...)
Kernel knows it’s a FreeBSD binary, uses the freebsd_syscalls[ ] array
  freebsd_syscalls[5] = freebsd_open(…);
File is opened
P.S. Running a Linux binary
Code like
  fd = open("/etc/passwd", O_RDONLY);
Becomes
  syscall(5, ...)
Kernel knows it’s a Linux binary, uses the linux_syscalls[ ] array
  linux_syscalls[5] = linux_open(…);
File is opened
  All Linux file operations are redirected to /compat/linux first
Linux Kernel Modules (LKM)
Linux Modules
Driver modules in /lib/modules
[root@dafinn net]# pwd; ls
/lib/modules/2.4.22-1.2166.nptlsmp/kernel/drivers/net
3c509.o     b44.o    eepro100.o  netconsole.o   pppox.o        tg3.o
3c59x.o     bonding  epic100.o   ns83820.o      ppp_synctty.o  tlan.o
8139cp.o    de4x5.o  ethertap.o  pcmcia         r8169.o        tulip
8139too.o   dl2k.o   fealnx.o    pcnet32.o      sis900.o       tun.o
82596.o     dmfe.o   irda        ppp_async.o    sk98lin        typhoon.o
8390.o      dummy.o  mii.o       ppp_deflate.o  slhc.o         via-rhine.o
acenic.o    e100     natsemi.o   ppp_generic.o  smc9194.o      wireless
amd8111e.o  e1000    ne2k-pci.o  pppoe.o        starfire.o
Recompiling the kernel
  Make the kernel smaller
  Add a new device
  Modify system parameters
LKM Utilities
insmod: Insert an LKM into the kernel.
rmmod: Remove an LKM from the kernel.
depmod: Determine interdependencies between LKMs.
kerneld / kmod: Kernel module daemon program.
lsmod: List currently loaded LKMs.
modinfo: Display contents of the .modinfo section in an LKM object file.
modprobe: Insert or remove an LKM or set of LKMs intelligently. For example, if you must load A before loading B, modprobe will automatically load A when you tell it to load B.
Linux modules
Device drivers can be dynamically loaded into the kernel
List loaded modules:
  lsmod
Module dependencies
Load a module manually:
  insmod [-k] 3c509
  modprobe smc-ultra
Generate the dependencies:
  depmod -a
Remove a module:
  rmmod
Ways for loading a module
Manually: using insmod / modprobe
Automatically: the kernel discovers the need for loading a module (e.g., the user mounts a file system)
  => requests the kmod daemon to load the module
  => kmod loads the module using modprobe
Using commands in one of the boot-time rc scripts
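Hello World !
#define MODULE
#include <linux/module.h>

/* Runs when the module is loaded; a nonzero return aborts the load */
int init_module(void)
{
    printk("<1>Hello, world\n");   /* "<1>" is the message priority */
    return 0;
}

/* Runs just before the module is removed */
void cleanup_module(void)
{
    printk("<1>Goodbye !\n");
}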
Loaded and Unloaded
root# gcc -c hello.c
root# insmod ./hello.o
Hello, world
root# rmmod hello
Goodbye !
root#
Must be root to load / unload a module
Check /var/log/messages if nothing shown
The ‘insmod’ mechanism for loading a module
Reads the module into virtual memory
Fixes unresolved references to kernel routines and resources using the exported symbols from the kernel
Asks the kernel for enough space to hold the new module
Kernel allocates a new module data structure and enough kernel memory to hold the new module and puts it at the end of the kernel modules list
The ‘insmod’ mechanism for loading a module (Cont’d)
insmod copies the module into the allocated space and relocates it so that it will run from the kernel address that it has been allocated
The new module exports symbols to the kernel and insmod builds a table of these exported symbols
If the new module depends on another module, that module holds a reference to the new module
The kernel calls the module initialization routine and carries on installing the module
Unloading a module
Two ways for unloading a module
  Manually: the rmmod command
  Automatically: when the idle timer expires, kmod calls the service routines for all unused loaded modules
The mechanism of unloading
  If the module can be unloaded, its cleanup routine is called to free up the kernel resources that it has allocated
  The module data structure is unlinked from the list of kernel modules
  All of the kernel memory that the module needed is deallocated
User Space drivers
Advantages
  Full C library can be linked in
  Can be run in a conventional debugger
  A driver is unlikely to hang the entire system
  User memory is swappable (more to use)
  Still allows concurrent access to a device for a well-designed driver
Disadvantages
  Interrupts are not available in user space
  Direct access to memory only by mmapping /dev/mem (privileged user only)
  Access to I/O ports only after calling ioperm or iopl (not supported on all platforms), and access to /dev/port can be too slow (privileged user only)
  Response time is slower (context switch, driver swapped to disk)
  The most important devices can’t be handled in user space (block, network)
64bit vs. 32bit kernels/apps
[Figure built around myApp.c. Abbreviations: SIMD = Single Instruction, Multiple Data; GPR = General-Purpose Registers]
64-bit OS & Application Interaction
32-bit Compatibility Mode
  64-bit OS runs existing 32-bit apps with leading-edge performance
  No recompile required, 32-bit code directly executed by CPU
  64-bit OS provides 32-bit libraries and a translation layer for 32-bit system calls.
64-bit Mode
  64-bit OS requires all kernel-level programs & drivers to be ported.
  Any program that is linked or plugged in to a 64-bit program (ABI-level) must be ported to 64 bits.
[Figure: USER space holds a 32-bit application (32-bit thread, 4GB expanded address space) and a 64-bit application (64-bit thread, 512GB (or 8TB) address space); KERNEL space holds the 64-bit operating system with 64-bit device drivers and a translation layer]
Increased Memory for 32-bit Applications
[Figure: two memory maps with virtual memory axes marked 0 GB, 2 GB, 4 GB. Left panel: 32-bit server, 4 GB RAM (32-bit OS and 32-bit apps, shared, 4GB DRAM). Right panel: 64-bit server, 12 GB RAM (32-bit apps, not shared, 64-bit OS, 12GB DRAM)]
32-bit server, 4 GB RAM:
• OS & App share small 32-bit VM space
• 32-bit OS & applications all share 4GB RAM
• Leads to small dataset sizes & lots of paging
64-bit server, 12 GB RAM:
• App has exclusive use of 32-bit VM space
• 64-bit OS can allocate each application large dedicated portions of 12GB RAM
• OS uses VM space way above 32 bits
• Leads to larger dataset sizes & reduced paging
Multilib
Where to put the extra libraries is a problem without an obvious solution.
Sorting out which directory is which is done by the dynamic loader, and thus it is transparent to the 32-bit program running on a 64-bit processor.