What tool do you use to generate these cute schemas and illustrations?
Wow, that sounds very complicated. Is the process really that complicated?
No, he made it all up. What were you thinking?!
Only for people that spam.
@FAb: cool, you’re welcome. The diagrams were all done in Visio 2007.
@Frank: thanks for reading.
@xxx: hahaha.
@Maurice: Yea, that comment cum URL is borderline. Sigh.
I have been seeing that ultimate anonymity crap come up in so many comment threads lately. Funny, always under different URLs too. Other than that, great article (although I really could have used it about two weeks ago when trying to fix some weird boot errors; ah well, I muddled through them in the end).
Alright, then I’m deleting it. Thanks for the heads up.
It fills in a few blanks in my vague knowledge of this process. Being pretty humble about my knowledge, I don’t think you under-complicated it at all; however, I am now curious for more detail.
Oh, yeah. Did an upgrade which crashed and it deleted /sbin/init – at least now I know what step of the process that was … in hindsight. LOL
What software do you use to create those nice diagrams?
However, there’s a little error in the article:
CR3 is the PDBR (Page Directory Base Register, holds the physical address of the page directory) so is only needed when paging is enabled. The Global Descriptor Table is loaded into GDTR (special register just like IDTR) by the lgdt instruction.
I also highly recommend Linux Kernel Development by Robert Love
Much less dry than Understanding the Linux Kernel, and also more recent. http://www.amazon.com/Linux-Kernel-Development-Novell-Press/dp/0672327201
This would be infinitely more useful if you showed the Windows boot process first since that is what most computers actually use. And THEN at the end you can use the academic Linux boot process for completeness. Nope, sorry, this gets a thumbs down from me on SU. Next.
Incidentally, I've just skimmed your entire blog, and I'm rather impressed. Not only is your English quite good (a concern you voiced in one post – IMO, reading TCP/IP Illustrated is a damn good start if you're doing technical writing), but _every_ post you've written so far looks interesting and substantial.
May I translate this and post it on my blog, with a reference back here? Regards.
It’s fake, photoshopped. Look, you can see the blurred pixel area
I'm a huge fan of the W. Richard Stevens books as well, so I fully agree they're a damn good start. What I meant by the English comment was that I sometimes feel a lack of non-tech reading has hampered my English. Say, when it's time to come up with a metaphor or the 'right word', that kind of thing. But I've been here in the US for a few years now, so it's less of a problem now.
@Christian: thank you, and you can translate it without any problem, as long as it has the link. If you want, I can send you the Visio 2007 files for the images, or translate them for you.
@eto: hahaha, fake computer pr0n. Anyhow, thanks for the kind words
Hello Gustavo!
Don't worry, I will certainly include the link.
Please send me the files so I can translate them. When I finish translating everything, I'll send it to you for a review, if you'd like?
Thank you and best regards!
Excellent article, but I have one question:
If decompression happens in-place, how come the compressed parts don't get overwritten by uncompressed data before those compressed parts are read?
@Idefix: the compressed image is temporarily moved up in memory a notch, creating a 'buffer zone' between the place in memory where uncompressed contents are being written to and the place where compressed contents are read from.
The code is here.
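To make the 'buffer zone' idea concrete, here is a minimal sketch in Python. It is not the kernel's decompressor: it uses a toy run-length format (pairs of count and byte value), and the names `rle_compress` and `decompress_in_place`, as well as the margin rule, are assumptions made up for this illustration. The compressed bytes sit at the high end of a single buffer, output is written from the low end, and the margin keeps the write position from ever passing the read position.

```python
def rle_compress(data):
    """Toy run-length encoder: emits (count, value) byte pairs, count <= 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


def decompress_in_place(compressed, out_len):
    """Decompress inside one buffer, reading from the top and writing from 0."""
    # Assumption for this toy format only: every 2-byte pair yields at least
    # one output byte, so half the compressed size is a sufficient margin.
    margin = len(compressed) // 2
    buf = bytearray(out_len + margin)

    # "Move the compressed image up": park it at the very end of the buffer.
    read = len(buf) - len(compressed)
    buf[read:] = compressed

    write = 0
    while read < len(buf):
        count, value = buf[read], buf[read + 1]
        read += 2
        # The buffer zone guarantees output never reaches unread input.
        assert write + count <= read, "output would overwrite unread input"
        buf[write:write + count] = bytes([value]) * count
        write += count
    return bytes(buf[:out_len])
```

With the margin set to 0 the assertion can fire on inputs whose early runs expand a lot, which is exactly the overwrite the question is worried about. A real decompressor needs its own bound for the extra space.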
Hi Gustavo,
The whole process of computer boot up from memory map to kernel loading was amazing. I linked your articles to http://www.osnews.com/story/20064/Computer_Boot_Up_Process. It is refreshing to see articles that are succinct and resourceful.
"In real-mode the interrupt vector table for the processor is always at memory address 0, whereas in protected mode the location of the interrupt vector table is stored in a CPU register called IDTR" – This is not true. Also in real mode, the CPU uses the IDTR to locate the (real mode) IVT. In practice, the IDTR is always set to 0, but it could be changed.
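As a small illustration of the lookup being discussed, the sketch below computes where a real-mode vector's entry lives and decodes it. The function names are made up for this example; the 4-byte entry layout (16-bit offset followed by 16-bit segment) and the segment * 16 + offset address rule are standard real-mode behaviour. The `idtr_base` parameter defaults to 0, the usual value mentioned above.

```python
import struct


def ivt_entry_address(vector, idtr_base=0):
    """Linear address of the 4-byte real-mode IVT entry for a vector."""
    if not 0 <= vector <= 255:
        raise ValueError("interrupt vector must be in 0..255")
    return idtr_base + vector * 4


def handler_linear_address(entry):
    """Decode one IVT entry: little-endian 16-bit offset, then 16-bit segment."""
    offset, segment = struct.unpack("<HH", entry)
    return (segment << 4) + offset
```

For example, `ivt_entry_address(0x13)` gives 0x4C with the default base, and an entry holding offset 0x1000 and segment 0xF000 decodes to linear address 0xF1000.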
Hi Gustavo,
That was a great article. I have a question: is it possible to monitor the boot parameters to find out whether everything is OK, such as:
- MBR parameters (its code and partition table), their place in memory, and which of them are still in RAM or have been removed.
Actually, I would like to write a program for this, to use as a utility in a shell environment. Honestly, I don't know what I must do, and I don't even know whether it must be an assembly program or can be a C one.
I would be very thankful if you could point me in the right direction.
Thanks a lot, -Kimia
Hi Duartes,
You have done a very good job… I have a few queries: is there any fixed size in memory reserved for user and kernel space? Otherwise, how can we know the size of the memory used by the kernel and by user applications?
@Kimia: Nearly all of the real-mode data from the kernel is wiped out once the protected mode part starts running, so some of these early parameters are lost.
You can, however, read the MBR and partition table right off of the disk, by reading for example a device like /dev/hdX or /dev/sdX corresponding to your hard disk. It would not be too hard to read the partition table and MBR that way. You might want to read the source code for fdisk and other Linux disk utilities.
Does this help?
@kavitha: There is memory that the kernel reserves for itself, yes. But there's also dynamic memory that the kernel allocates and frees as it runs. The kernel keeps a database of all of the memory and how it has been distributed (which process owns it, etc).
I'll write a post on memory this weekend that will cover some of this.
Hi Gustavo,
First of all, thanks a lot for your attention and help.
I see. So, as you said, I can read from the disk, but unfortunately I don't know exactly which command to use to read the first sector from a disk in Linux. I'm new to Linux system programming, so I need a lot of help in this field. Would you please point me to a good approach and references to learn more about this? My knowledge is limited and I haven't applied it on real systems; I'm new to this world, with a hungry mind.
best regards,
Kimia
@Kimia: no problem at all. Regarding reading from the disk, it's Unix tradition to expose hardware devices as magic files in the filesystem, usually under the /dev directory.
Many devices are exposed there as files, including disks. The exact name of the file depends on the nature of the device (hard disk, USB disk, scanner, sound card, etc), the bus (ide, scsi, sata, usb, etc), the order of the device (1st hard drive on bus, 2nd, etc).
So to read the hard drive, you'd need to find the right device, and then you can use regular C functions like open(), read() to read raw bytes out of disk.
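A minimal sketch of that approach, in Python rather than C (its `os.open()` and `os.read()` are thin wrappers over the same system calls). The function names are illustrative, and the device path you pass in is an assumption that depends on your system; reading a raw disk device usually requires root. The layout used is the classic MBR one: four 16-byte partition entries starting at byte 446, and the 0x55 0xAA signature in the last two bytes of the sector.

```python
import os
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # the boot code occupies the bytes before this
PARTITION_ENTRY_SIZE = 16


def read_first_sector(path):
    """Read the first 512 bytes of a device node (or a disk image file)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, SECTOR_SIZE)
    finally:
        os.close(fd)


def parse_mbr(sector):
    """Return the non-empty primary partition entries of a classic MBR."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected a 512-byte sector")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")

    partitions = []
    for i in range(4):
        start = PARTITION_TABLE_OFFSET + i * PARTITION_ENTRY_SIZE
        entry = sector[start:start + PARTITION_ENTRY_SIZE]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped),
        # first sector (LBA), sector count; all little-endian.
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:
            continue  # type 0 marks an unused slot
        partitions.append({
            "index": i + 1,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return partitions
```

Calling `parse_mbr(read_first_sector(path))` on the right device file lists the primary partitions. Trying it on a disk image file first is a safe way to experiment.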
However, what I _really_ suggest you do is _read source code_. It's one of the _best_ ways to learn, and in this case there are tools that do exactly what you want (MBR and partition table manipulation) and whose code is open source. So get yourself the code for Linux fdisk, maybe the GRUB configuration installer, and read the code. It can teach you a lot.
Of course, you need some books too. My favorite Unix author was W. Richard Stevens, he's got some great books, but sadly he died and the books haven't been updated. Look in Amazon for his books, maybe see what the commenters are saying, and find some 5-star books on Unix programming.
hope this helps,
gustavo
I really enjoyed this article, and it is a very good one for understanding the kernel boot process. Thank you very much for this wonderful article, and in such a simplified form too.
Hi Gustavo, I was reading your article to see whether it would clear up a few things for me about the kernel boot process, since I'm fairly new to these matters.
I'm working with an embedded system and I want to update the kernel. I compiled it and ended up with a kernel image and another file containing the filesystem, rootfs.ext2. My question is: do I need to store both in my embedded system's flash, or is the kernel image alone enough?
I tried booting my new kernel and it starts decompressing, but then it gives me a panic:
Kernel panic – not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
Maybe you can clarify a bit how this works. I'm a little lost…
Thanks in advance.
Nice details about the kernel boot-up process. Never found better than this. Thank you.
[…] The Kernel Boot Process by Gustavo Duarte explains the Linux kernel boot process on an x86 platform. Very well written with descriptive and good looking diagrams. […]
Hi Gustavo again,
Hi, thanks for this great work.
I am trying to read MBR using linux 2.6 kernel.
I need the source code for the fdisk utility for the Intel x86 architecture.
Could you please guide me to the correct link to follow ?
The easiest way to do it is to just install source packages for the package that contains fdisk. Not sure which distro you have, but you would need to:
1. Find which package fdisk came from (rpm or yum or apt should be able to tell you)
2. Install the source package that corresponds to that package. There's a 1-to-1 correspondence between binary packages and source packages.
I think in Debian-based distros the package is util-linux, see here:
The FSF has an fdisk project too, but I'm not sure if it's the same thing because they say they provide an alternative to util-linux fdisk. The FSF page for their fdisk is here:
But go with the distro's source.
Great article. Any idea where the source code for /sbin/init itself is? I'm also interested in the "events" process that is spawned by the "init" process, and would like to know where the source code for that is as well, if you know.
Frank -
One minor additional complexity. The initial root filesystem is, by default, assembled from the contents of the usr/ subdirectory in the kernel source tree (it’s a compressed cpio archive) and linked into the kernel image; alternatively, a compressed filesystem image can be linked into the kernel image or provided in a separate file. Part of the boot process (in init/main.c:do_basic_setup()) involves executing ‘initcalls’, which are stored in an array of pointers to functions to be called at boot time, constructed by the linker. One of these initcalls is init/initramfs.c:populate_rootfs(), which initializes the nonswappable memory-backed filesystem which is always mounted at / (the *real* root filesystem is mounted over the top of it, later on). The rootfs is never unmounted: you can see it as the first entry in /proc/mounts. Then it uncompresses the cpio archive or arranges for the filesystem to be backed by that compressed filesystem image, if either are present, and executes /init on that filesystem, if present, to complete the boot via an ‘early userspace’, chroot to the real root filesystem once it’s found it, and exec the real init. So the job of finding root filesystems is *completely* customizable. You can assemble it from a RAID array with some components pulled over the network if you like (I’ve done this in extremis as part of disaster recovery).
Finally, if that didn’t work and we still don’t have a useful root filesystem with an /sbin/init on it, just before calling init_post(), the system may call prepare_namespace() in init/do_mounts.c. This can try to dig up a root filesystem in a variety of ways: pausing for a configurable amount of time so the user can do something to provide a filesystem, waiting for delayed device probes in case the root filesystem is on some slow-to-start thing like a SCSI disk or a USB key, doing automated RAID probing (somewhat dangerous because it can’t tell if the array it’s assembling is actually made of pieces that are meant to go together: the recommended way to boot off RAID is to use one of the earlier customizable boot processes and run the mdadm tool in there to do the assembly), mounting a block device specified via root= on the kernel command line, or even asking the user to insert a separate floppy containing the root filesystem (I’m not sure *anyone* does this anymore, even in emergencies).
I haven’t got into the half a dozen horrible ways the various early userspaces can signal their completion (echoing the real device numbers into a file in /proc, executing the horrible ‘pivot_root()’ syscall, or just deleting everything on the rootfs and doing a ‘chroot exec /sbin/init’ into the real root filesystem, which is the modern way to boot up because it doesn’t rely on any horrible early-userspace-specific hacks). For more, see Documentation/filesystems/ramfs-rootfs-initramfs.txt and Documentation/initrd.txt in your favourite Linux kernel tree.
Gustavo, thanks a ton for this article. Excellent explanation of a really complex process; even a newbie like me has no problems following.
I really liked your article. You have a nice Linux-like sense of humor :)
[quote]This would be infinitely more useful if you showed the Windows boot process first since that is what most computers actually use. And THEN at the end you can use the academic Linux boot process for completeness. Nope, sorry, this gets a thumbs down from me on SU. Next.[/quote]
Yeah, all those Windows kernel hackers out there must be disappointed…
This is *easily* the best article on this subject I’ve been able to find on the web. Very clear, right level of detail (for me, at least), and good references. Thanks very much for writing it.
I infer from some prior comments that you aren’t a native English speaker. Fear not — your English is better than most natives’.
Any chance you’ll write something comparable, describing EFI? (For that matter, I’d like coverage of OSX as well, but suppose that’s asking too much.)
I’ve bookmarked your homepage.
hi Gustavo,
Thanks for the wonderful article, but I have a question: what is that “0x3aeb”? I found only “0xeb” (unconditional jump) in header.S. What does that “3a” represent? And why do we need this unconditional jump here? Why can’t we put the start_of_setup code directly there?
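One way to read that value, sketched in Python: x86 is little-endian, so the 16-bit word 0x3aeb is stored as the byte 0xeb followed by 0x3a. 0xeb is the short-jump opcode and 0x3a is its signed 8-bit displacement, counted from the end of the 2-byte instruction. The function name and the 0x200 address used in the note below are only illustrative assumptions.

```python
import struct


def decode_short_jump(word, address):
    """Split a 16-bit word into opcode and displacement; return the target."""
    # Lay the word out in memory order (little-endian), then read it back as
    # an unsigned opcode byte followed by a signed displacement byte.
    opcode, displacement = struct.unpack("<Bb", struct.pack("<H", word))
    if opcode != 0xEB:
        raise ValueError("not a short jump (opcode 0xEB)")
    # The displacement is relative to the byte after the 2-byte instruction.
    return address + 2 + displacement
```

If the jump were located at 0x200, `decode_short_jump(0x3AEB, 0x200)` would give 0x23C. A likely reason for the jump: the bytes right after it hold the setup header fields that the boot loader reads and writes, so execution has to hop over them to reach start_of_setup.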
Try this, very easy to understand: http://www.redhatlinux.info/2010/11/steps-of-boot-process.html
I have a doubt.
The decompression routines will also be a part of the compressed kernel image. These routines themselves need to be decompressed.
How does this happen?
Thanks in advance.
Which tool did you use to draw the images in your blog’s articles?
You know, they are all so beautiful!
Thanks for sharing your Linux knowledge…
Nice job, Gustavo.
I discovered your blog after doing my own trace through the kernel (using a JTAG on my PowerPC-based platform).
What I couldn’t find was which thread (if any) does the scheduler actually run in? It must be able to pre-empt other threads so I assume it is run in some sort of high-priority parent thread?
Let me know if you find the answer…
Daniel -
Can you explain a bit about booting Linux in VirtualBox?
Could you explain how a C program can run during boot? Is there a compiler there? (I bet one exists.) So what is going on? Thanks for your help!
And I have a question: how can I learn what the Linux kernel files mean? What should I do to become a Linux developer?
