iPhones, Armed Robbery, and Hacking

(Some security recommendations are summarized at the end.)

I. The Robbery

This past summer I was walking around in the neighborhood where I grew up, happy-go-lucky, when some guy jumped off a motorcycle pointing a gun at me. It was my first time at gunpoint, and from the outset the weapon was positively spellbinding. As I gazed at it, strange thoughts hit me: "Am I going to get shot by this rusty piece of shit? What a sorry way to die! And what if I get tetanus?"

Those were thoughts I wouldn't have anticipated, but as Dan Carlin says, humans in extreme situations often behave unexpectedly. And while a gun-toting thug is a far cry from the Battle of Verdun, it is pretty extreme for me. This post tells the story of the robbery and its surprising information security developments. There are lessons here for both users and designers of technology.

Robbery scene

My daughter and I were visiting Brazil in July, taking a carefree walk in a boulevard lined with lush trees. She had just gotten into "good kid, m.A.A.d. city", ironically enough an album about growing up in Compton amid dire violence. So we were deep in conversation about the US criminal justice system, drug laws, and the ideas of people like Bryan Stevenson and Michelle Alexander[1].

Growing up in Brazil you get a crash course in street smarts. I was mugged twice as a 10-year-old and once at 15. That's counting only the times when stuff was actually taken. There were scores of near-muggings I dodged by either talking my way out or running my way out.

But after 20 trouble-free years, I let my guard down. Absorbed in conversation, I barely noticed the motorcycle driving on the other side of the street. By the time it veered the wrong way into traffic and sped towards the sidewalk we were on, it was too late. The passenger jumped out while the motorcycle was still moving, gun in hand pointed squarely at me.

The scene felt strangely removed - it's cliche, but it really did feel like a movie. Instead of panicked confusion, there was a strong pragmatic voice in my head. I had thought about "what if" scenarios plenty of times before and they kicked in. Who is the attacker? What is their motivation? What's the best course of action?

There are career criminals in Brazil who are downright professional. I know somebody whose house was invaded while they were home and the robbers let them know how long the "job" was going to take, offered them water, and made sure nobody freaked out. Better than some moving companies I've used.

But when someone is robbing random people on the street using a gun, that's pretty far from professional. Way too volatile a situation with huge risks and beggarly payoff. These were at best lowlifes and at worst jittery crackheads. I felt two strong imperatives. First, keep the situation as absolutely relaxed as possible. When they get nervous, they get scared. And when they get scared, that's when I accidentally get shot. But second, and more importantly: if they want to kidnap my daughter, fight it at any cost whatsoever. Better to die on that sidewalk than to let them take her.

I remember thinking, "take a deep breath, raise hands slowly, move smoothly, stay relaxed, hand everything over." It worked. Who knows, maybe Andy from the Headspace app saved my life. The bandits were gone, along with our two iPhones and my watch. But the real fun was still to come.

After we got home, I logged into iCloud and put both phones in lost mode. They had been turned off, predictably. Plenty of crooks have been caught by way of "Find iPhone," but they've learned by now. Thinking of my data in criminal hands was uncomfortable, but the fact that iOS exploits sell for $1.5 million made me feel a lot better. No small-timer is breaking into an iPhone. I figured they would wipe it out and sell it.

I have two-factor authentication in all the accounts that matter, and whenever possible my second factor is an iPhone app that generates time-based one-time passwords (TOTP) for authentication. Google Authenticator is a popular app for this, but I use OTP Auth instead because it is more flexible (more on this in the recommendations). Here's what it looks like, slightly sped up to make it more exciting:

Time-based one-time passwords

When it's time to log into one of your accounts, you provide your login and password as you normally would, plus the temporary code being shown by the app.
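
These codes are simple to compute. Below is a minimal sketch of the TOTP algorithm (RFC 6238) in C, assuming OpenSSL for the HMAC-SHA1 step; the app and the server share a secret key, and each independently derives the same six-digit code from the current time:

#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

/* Sketch of RFC 6238 TOTP: HMAC-SHA1 the count of 30-second periods since
   the Unix epoch, keyed with the shared secret, then dynamically truncate
   the digest down to a 6-digit code. */
uint32_t totp(const uint8_t *secret, size_t secret_len, time_t now)
{
    uint64_t counter = now / 30;                /* 30-second time step */
    uint8_t msg[8];
    for (int i = 7; i >= 0; i--, counter >>= 8)
        msg[i] = counter & 0xff;                /* counter as big-endian bytes */

    uint8_t digest[20];
    unsigned int len;
    HMAC(EVP_sha1(), secret, secret_len, msg, sizeof(msg), digest, &len);

    int off = digest[19] & 0x0f;                /* dynamic truncation */
    uint32_t bin = ((uint32_t)(digest[off] & 0x7f) << 24) |
                   ((uint32_t)digest[off + 1] << 16) |
                   ((uint32_t)digest[off + 2] << 8)  |
                    (uint32_t)digest[off + 3];
    return bin % 1000000;                       /* keep 6 digits */
}

Print the result with %06u and you get the familiar six digits. Both sides can derive the matching value only while the 30-second window lasts, and only if they hold the secret.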

I also use a password manager with unique, long passwords for each site. So my main concern at that point was minimizing the impact of this whole thing on my kid. We had dinner planned with friends, tasting menu at a good Japanese place, so I thought it best to go, have a good time, laugh and hopefully cushion the blow. Later I could call T-Mobile and suspend the cell lines.

II. The Hacking

A couple of hours later we were back, much happier, imbued with friendship and, in my case, plenty of sake. I opened Gmail and got some shockers:

Password reset emails

Wuh-wait what? I wasn't expecting to see any of these, but least of all the Facebook password reset. Before you read on, take a good look at those emails. It's fun to work out what happened here. Done? Let's dig in:

Facebook password reset email

Whoah! Facebook password reset by phone number? How? Did they unlock my phone? But also... why? At once I felt the sinking realization that I had misread the situation. They seemed to be more sophisticated than I thought - not the motorcycle crew themselves, but someone else in the operation (his identity would be unmasked later that night).

The idea that somebody was hacking into my accounts right at that moment, with my phone in hand, was deeply unsettling. A malevolent twist to the emotional roller coaster of that evening. But this was a technical problem, so it was time to sober up as best as I could and work methodically.

The "how" was simple. The attacker took the SIM card out of the stolen iPhone and put it in another phone. At that point he found out my phone number, whereas previously he had no information on me. More importantly, he could also receive my SMS text messages. He then attempted to log into Facebook using my phone number as a login, clicked on "Forgot Password," and reset the password via SMS.

So here is a big screw-up and a couple of lessons. As I said, I have 2FA (two-factor authentication) in the accounts "that matter." But I rarely use Facebook, so I didn't enable 2FA there. Oops. Turns out it's not such a great idea to have an account in the world's most popular app as a weak link in your defenses.

Now consider Facebook's account recovery policies. If the account has 2FA enabled, passwords can only be reset by email. That's good. But without 2FA, if an account has an associated phone number, the password can be reset via SMS. In such a case, a SIM card is an instant ticket to the account: find it and reset its password in one fell swoop.

That's a disaster. Facebook single-handedly provides a way for attackers to go from a SIM card, or hijacked SMS messages, to a trove of personal information for the vast majority of people out there. By contrast, attackers made zero progress in hacking my kid's accounts, mostly because she doesn't use Facebook.

But why the hell was this wretch logging into my FB account? I suspect it wasn't for my cousin's mad political rants. Already shaken from the armed robbery, my mind played tricks on me as paranoid thoughts of identity theft and fraudulent bank transfers loomed.

I immediately logged into t-mobile.com and suspended both cell lines, disabling the attacker's main weapon. As an aside, T-Mobile has been great for international travel. I love you guys, keep your website safe. I tested sending SMS messages to my suspended numbers and happily all attempts generated errors.

On to the other emails. The Facebook password reset arrived at 9:46pm Brazil time. Curiously, at 11:23pm they briefly turned my phone back on with its SIM card, and the phone went into lost mode and flashed on Find iPhone. But then there is that fourth email, with a subject line of "iPhone SE 64GB Silver Was Found!", arriving at 12:20am. Here it is:

iCloud phishing email

The phone model and storage capacity are exactly right. The spelling, grammar, and layout are pretty well done. It was sent to my primary personal email, lifted from Facebook. Imagine a regular user receiving this right after their phone has been stolen, while they're somewhat shaken, and when they're not native English speakers to boot. What are the odds they'll realize this is a phishing attempt for iCloud credentials?

Apart from checking the URL, the biggest clue is the exclamation mark in the subject line, a little too enthusiastic for Apple. Either way, this is a nearly perfect phishing piece, made more so by impeccable timing. Maybe iCloud accounts should be placed in some sort of restricted state after a device is put into lost mode.

It's stressful to face an ongoing, targeted, personal attack. Deep breath again. Time to methodically check every account for suspicious activity, change passwords just in case, and recover compromised accounts. My main Google account, protected by 2FA, was safe throughout the ordeal. I reset my Facebook password by email and got back in. GitHub, AWS, and other professional accounts were also on 2FA and had no unauthorized activity. Audit logs never tasted so sweet. It was a relief knowing I wouldn't have to tell clients, "Hey, how are you? Great! So, listen, this iPhone thieving ring probably has all your data, isn't that funny? Hah! But never mind that! Those Bitcoin prices, huh?"

Then I tried a secondary Gmail account I use for some mailing lists and other non-critical tasks. You know... the kind of account for which one might leave 2FA disabled. Sure enough, the wretch had been there, and the password had been changed via SMS password reset. He had found the account through my phone number in the first place. Sound familiar? Here's a quick recap:

SMS hacking diagram

This Gmail account did not have a recovery email set up, and ironically I couldn't use SMS anymore. Google offers a recovery algorithm where you try to answer different questions with the ability to "Try another way" if you don't have a particular answer (quick: in which month and year did you create your Google account?). I was locked out for a while, long enough to start thinking I had lost the account, but eventually produced a couple of answers and got back in.

Finally, all of my accounts were safe again. It was getting late, but I had to find out why this person was frantically probing my accounts, and maybe, with some luck, who they were.

I knew the data in the iPhones was safe, as per the Apple vs. government showdown after the San Bernardino terrorist attack. Earlier, though, I had assumed the phones could still be wiped clean and used normally. But maybe they couldn't, and this whole rigmarole was about breaking into my iCloud account. Hence the phishing.

A quick search confirmed the idea. Since iOS 7, released in 2013, Apple has provided the Activation Lock feature, whereby if a device is linked to an iCloud account, activating it requires the password to that account. This has created some misery among people buying and selling used iPhones: if the seller forgets to unlink the device from their iCloud account, the device is bricked until they do so.

A warm wave of righteous schadenfreude washed over me: all the robbers had were parts! They would fetch little money from this whole thing, especially since my kid's screen was cracked. You go, kid! Glad I hadn't replaced it yet. Also glad for activation lock, though perversely my digital torment was its side effect: the world is complicated. It turns out there was no sinister plot, just a miserable scheme for a few hundred dollars. Straight to the depths of hell is where those cowards are going.

It was sobering to realize the attacker almost succeeded. Up until a few months prior to the robbery, my iCloud account did not have 2FA enabled and it used the compromised Gmail address as the recovery email. If the robbery had happened then, they would have been able to get in, unlink the phones, and sell them at full (used) price. They would have changed my iCloud password in the process, and might have erased and locked my other Apple devices for the hell of it, which would have been disastrous and possibly ruinous, depending on whether I could get back into the account. Whatever little data I have in iCloud would have been stolen as well.

I hope this motivates you to enable 2FA on all of your accounts, even the unimportant ones. They can interact in unexpected ways to become your undoing. Moreover, using TOTP apps as the second factor is far safer than SMS.

Apple has done a fantastic job with iOS security and Find iPhone, curbing everything from malware to exploits to theft. But further improvements can be made to better protect its customers. In the next post you'll see week-long sustained hacking attempts and meet the maggot behind the attacks, operating in a wretched hive of "iPhone unlockers."

III. Recommendations

  • Make sure your accounts cannot be hacked via text message (SMS) password resets. You can often do so by enabling two-factor authentication (2FA) for an account, particularly if you use a time-based one-time password (TOTP) app as the second factor. Two such apps are Google Authenticator and OTP Auth. You could also withhold your phone number from certain accounts. Another advantage of TOTP is that if you're unable to receive SMS messages for whatever reason, you can still log in.

  • Beware of your unprotected "less critical" accounts. They might provide a path to your sensitive ones.

  • If you decide to go with a TOTP app, choose one that allows you to make an encrypted backup of your account secrets. OTP Auth provides that along with encrypted iCloud sync, all optional and controlled by the user. Authy is another good option. If you use Google Authenticator, make sure losing your phone won't lock you out of any accounts.

  • If you design apps, be careful with password resets via SMS. SIM cards are an easy target, cell providers are subject to social engineering that could lead to intercepted messages[2], and SMS notifications can be seen on lock screens on most phones. Allow users to choose TOTP as a second factor.

  • If your iPhone is lost or stolen, go to iCloud.com immediately, put it in lost mode, and provide a phone number where you can be reached. Once you've done that, you might want to temporarily suspend your phone line (many carriers offer this on their websites). If you do so, you can no longer call your own phone, and unless it's on wifi, you also can't "Play Sound" or "Erase iPhone" via iCloud - keep that in mind. On the upside, nobody can use your SIM card to hack your accounts via SMS password reset. It's a trade-off.

  • If your iPhone is definitely stolen, rather than lost, it will probably appear off in iCloud. Put it in lost mode anyway. If you provide a phone number, know that it might be targeted for iCloud phishing or social engineering as crooks try to hack into your iCloud account (that's why the attacker briefly turned my phone on: to get a phone number to target). You almost surely want to suspend your cell service immediately. You lose the tracking and other goodies, but thieves generally know to keep the phone off, and handing them a working SIM card is fraught with peril. Tread carefully and may the force be with you.

  • You might want to protect your SIM card with a PIN. This requires you to enter the PIN whenever you turn your phone on. Attackers are thus unable to transplant your SIM card to another device and use it. However, if you lose your iPhone and the battery dies, or the person who finds it turns it off, it's game over. Even if the phone is later turned on, it won't connect to the Internet, enter Lost Mode, show a phone number where you can be reached, or "Play Sound" (this is true even if a known wifi is in range[3]). A phone that otherwise might have been found could be bricked and lost forever.

  • Beware of "Your iPhone was found" emails, text messages, and Whatsapp messages. Scammers attempt to phish for your iCloud credentials in devious ways soon after an iPhone is stolen. If you provided a lost mode phone number, thieves will attempt to use it against you while trying to break into your iCloud account.

  • Read Tech Solidarity's security guide. It's overkill for a regular user, but know the rules before breaking them.

Thank you for reading.


  1. If you're interested in US criminal justice, Ghettoside is a great book with better-than-fiction LA detective stories interwoven with a serious discussion of criminality, murder clearance rates, and other pressing topics. The New Jim Crow by Michelle Alexander is an interesting read on mass incarceration, while Bryan Stevenson's Just Mercy offers a piercing look at the injustices we sometimes create. Ezra Klein has a good interview with Stevenson. Glenn Loury's interview with Sam Harris offers a somewhat different perspective.

  2. VICE reported on a T-Mobile website bug that leaked personal data based on phone number alone, giving social engineers a leg up. But this type of attack has worked against multiple carriers all over the world. Prominent hacks have happened this way.

  3. You can try this at home: turn your iPhone off and back on. Until the passcode is entered at least once, it won't connect to wifi. See this. If you have a better link, please let me know.

Goto and the folly of dogma

Many programmers are surprised to find out that the goto statement is still widely used in modern, high-quality codebases. Here are some examples, using the first codebases that come to mind:

Repo             goto usages   ratio to continue
Linux kernel     150k          6.27
.NET CLR         5k            2.13
git              960           0.76
Python runtime   5k            16.9
Redis            554           2.14

The ratio to usages of the continue keyword is provided to normalize for lines of code and the prevalence of loops in the code. This is not limited to C code bases. Lucene.net for example has 1,511 goto usages and a ratio of 3 goto usages to each continue usage. The C# compiler, written itself in C#, clocks in at 297 goto usages and 0.22 ratio.

People who take "goto is evil" as dogma will point out that each of these usages could be rewritten as gotoless alternatives. And that's true, of course. But it will come at a price: duplication of code, introduction of flags, several if statements, and overall added complexity. These are highly reviewed codebases written by talented people. When they use goto, it's because they find it to be the simplest approach.
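
Here's the shape of many of those usages - a hypothetical sketch of the classic cleanup pattern, not code from any of the repos above:

#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical example of goto-based cleanup: acquire resources in order,
   release them in reverse order on every exit path. */
int process_device(const char *path)
{
    int err = -1;

    char *buf = malloc(4096);
    if (!buf)
        goto out;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        goto out_free;

    if (read(fd, buf, 4096) < 0)    /* stand-in for the real work */
        goto out_close;

    err = 0;    /* success and failure share the same unwind path below */

out_close:
    close(fd);
out_free:
    free(buf);
out:
    return err;
}

The gotoless version needs nested ifs, a flag, or cleanup duplicated at each exit point - exactly the price described above.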

This is exactly how dogma hurts software development. We take a sensible rule that works most of the time and promote it to sacred edict, deeming violators as inferior programmers, producers of unclean code. Thus something that would have been a helpful guideline becomes a hard constraint. Pile up enough of these, and code that could have been simple ends up in a tangled mess, all in the name of "purity."

We have a long tradition of dogmas, but goto is the seminal example, denounced in Edsger Dijkstra's famous letter, Go To Statement Considered Harmful. Just barely over a page, it's a good case study. The letter is good advice in the vast majority of cases: misuse of goto will quickly land you in a maze of twisty little passages, all alike. Less helpful were the creation of a social taboo (goto is the province of inferior programmers) and the absolutist calls for abolition. Dijkstra himself came to regret how "others were making a religion" out of his position, as quoted in Donald Knuth's more level-headed paper, Structured Programming with go to Statements.

Taboos tend to accrete over time. For example, overzealous object-oriented design has produced a lot of lasagna code (too many layers) and a tendency towards overly complex designs. Chasing semantic markup purity, we sometimes resorted to hideous and even unreliable CSS hacks when much simpler solutions were available in HTML. Now, with microservices, people sometimes break up a trivial app into a hard-to-follow spiderweb of components. Again, these are cases of people taking a valuable guideline for an end in itself. Always keep a hard-nosed pragmatic aim at the real goals: simplicity, clarity, generality.

When Linus Torvalds started the Linux kernel in 1991, the dogma was that "monolithic" kernels were obsolete and that microkernels, a message-passing alternative analogous to microservices, were the only way to build a new OS. GNU had been working on microkernel designs since 1986. Torvalds, a pragmatist if there ever was one, tossed out this orthodoxy to build Linux using the much simpler monolithic design. Seems to have worked out.

Every programmer pays lip service to simplicity. But when push comes to shove, most will readily give up simplicity to satisfy dogma. We should be willing to break generic rules when the circumstances call for it. Keep it simple.

Grokbit

TLDR: I launched Grokbit, a code search and browsing tool.

When I was programming as a kid, I longed for a hardcore codebase, like a compiler or operating system, to really understand computers. That stuff just sounded so magical, some kind of Elvish secret way beyond mortals. There were decent books explaining how things worked, but that's a poor substitute for code.

Then the Internet reached Brazil when I was about 14, and all of a sudden there was this "GNU C" compiler that was allegedly better than Microsoft's, and the code was completely open! And what's more, there was an open Unix you could run on your 386! No need to convince your parents to sell the car and buy a SPARCstation!

This was the best present any kid could hope for. This is why, despite a lot of issues in the tech community, my gratitude to these people is overwhelming. You might think Stallman is a lunatic, and you might be right, but damn - he's the geek black-bearded Santa who brought the source to the children.

Now, reading this code was hard. The kernel, in particular, is tough to follow because entry points and flow of execution are unclear. There was no tmux, I didn't know Vim, my regexes were weak. So I printed out a whole bunch of code on my parents' dot matrix printer and spread it on the floor. A poor man's multi-pane Vim session. Here's an interrupt handler, there's a "bottom half", and hey, look! The syscall is right over there by the socks.

I think reading code is second only to writing code in making you a better programmer. When I write stuff like Anatomy of a program in memory or What does an idle CPU do?, a big part of the kick is sharing what I think are beautiful designs with people who haven't seen them before.

Still, I always wished I could give readers a better interface to actually dive into the code. But our tools for handling an entire code base, especially in the browser, are just not good enough at the moment. Searching also needs a lot of improvement: I think we can do better than regexes and general full-text searching when it comes to searching code.

That's why I built Grokbit. It has an indexing and search engine that's entirely tailored to code, so you can search semantically, like "give me the definition of foo" or "search for an identifier named bar". It's also wicked fast: you get real-time suggestions even in the largest repos I could try.

But searching was only half the battle. Having a rich UI - especially a multi-pane one - was an absolute requirement for me. When you have function A calling B calling C, it can be enormously helpful to have all three of them on the screen at once and navigate seamlessly. Plus being able to load multiple large files, having back/forward in the browser work well, and an overall smooth UX.

So that's what I went for with Grokbit. It's still crude, lots of low hanging fruit, but it has already been very useful, as you'll be able to tell in future blog posts. But before I put more weekends into it, I'd like some feedback. Try it out, let me know what could be better, or which search features I should build sooner (many easy wins here). I hope it's as fun to use as it was to build.

Finally, if you are interested in working on the project, reach out. I don't know if this will become a SaaS app, or a feature in another product, or an open source project, but I am hellbent on solving this problem. Let's carry the ring into Mordor.

Home Row Computing on Macs

For a number of years I've configured my desktops so that most tasks can be done using only home row keys on the keyboard, a technique I call home row computing. It takes the Vi idea of staying on the home row to every app, all the time, but without using modes so things are simpler.

I've described an implementation for Windows, but I have since moved to Macs and back to a qwerty keyboard (away from Dvorak). The current setup is described in this post. It uses familiar Vi key bindings and suits me far better. It's fairly painless to configure on the Mac and has never given me any problems, thanks to Takayama Fumihiko's awesome keyboard apps.

Using this is a joy. It's really fast, easy on the hands, and makes you feel like a geek god. If you don't use Vim, you'll now have one of its benefits in your favorite editor and in other apps, plus a weapon against smug Vimmers. If you already use Vim, your cherished hjkl keys become universal and pressing Esc gets a hell of a lot easier.

Some of the important keys that must be moved to home row are the arrow keys, Esc, delete (backspace) and forward delete. Another helpful home row task is moving and resizing windows. The key to all this is remapping Caps Lock to allow combinations of Caps Lock plus a home key to do these tasks. Again, there are no modes involved here, Caps Lock works as a modifier like the cmd and fn keys. Here's a good start:

I have left several keys unmapped so you can customize your own setup, and we'll get to window management in a moment. The first step is to set Caps Lock to No Action in System Preferences > Keyboard > Modifier keys:

Now we must remap the Caps Lock key code to something else. To do so, you need a small tool called Seil (open source). You can map Caps Lock to any other key, like cmd or option. So if you don't want to go all-out home row, you can still benefit from the remapping.

I like to remap Caps Lock into something that guarantees no conflicts ever for our combos. So I use key code 110, which is the Apps key on a Windows keyboard and is safely absent from Apple keyboards:

Now we're in business, the world - or at least the keyboard - is our oyster. The maker of Seil also makes Karabiner, likewise open source and an outstanding keyboard customizer for OS X. I have no affiliation with these tools, apart from being a happy user for years. If you end up using them, please donate. So go ahead and install Karabiner, and you'll see a plethora of keyboard tweak possibilities:

Each of the tweaks can be toggled on and off. There are even native Vi, Vim, and Emacs modes. However, I don't like the built-in ones, so I built my own config. Go to Misc & Uninstall and click Open private.xml:

In this file, ~/Library/Application Support/Karabiner/private.xml, you can define your own keyboard remapping scheme. I actually symlink that to a Dropbox file to keep the configuration consistent across my machines, but at any rate, here is a file you can use to implement what we have discussed so far. Drop the file in, click ReloadXML and you'll have this:

Home Row Computing is at the top (prefixed with ! for sorting). Toggle it on, and you're done. Enjoy your new keyboard layout, do a search on Spotlight and see how fast and smooth it is to choose an option.

Finally, there is window management. That's an area where you can fumble quite a bit, resizing and moving windows about clumsily with a mouse. My favorite options for making it fast and home-row friendly are ShiftIt (open source) and Moom (best $10 I ever spent, no affiliation). There are some others, but to me Moom towers above the rest. It has a great two-step usage, where one hot key activates it:

And the following key triggers a command you get to define using window primitives like move, zoom, resize, and change monitors. You can also define shortcuts that run commands directly. Moom has some handy default actions:

Out of the box, arrow keys can be used to send a window to the left, right, top, or bottom of the screen, and Moom natively interprets hjkl as arrows, making it easy to stay on home row. You can associate keys with various commands and precise window positions:

This is gold for large monitors like Apple Thunderbolts. I remap Caps Lock + M into the global Moom shortcut for painless activation. This allows me to set the shortcut itself to something bizarre that won't conflict with anything but would be a dog to type. Currently it's an improbable Fn + Control + Command + M. I also have Caps Lock + N activating a Moom command that cycles a window between my two monitors. Both of these shortcuts are in the keyboard map I provided.

If you have any questions, let me know. I know a number of keyboard nuts out there use this scheme on Windows and Linux, and I hope this makes it easy to do so on Macs.

System Calls Make the World Go Round

I hate to break it to you, but a user application is a helpless brain in a vat:

Every interaction with the outside world is mediated by the kernel through system calls. If an app saves a file, writes to the terminal, or opens a TCP connection, the kernel is involved. Apps are regarded as highly suspicious: at best a bug-ridden mess, at worst the malicious brain of an evil genius.

These system calls are function calls from an app into the kernel. They use a specific mechanism for safety reasons, but really you're just calling the kernel's API. The term "system call" can refer to a specific function offered by the kernel (e.g., the open() system call) or to the calling mechanism. You can also say syscall for short.

This post looks at system calls, how they differ from calls to a library, and tools to poke at this OS/app interface. A solid understanding of what happens within an app versus what happens through the OS can turn an impossible-to-fix problem into a quick, fun puzzle.

So here's a running program, a user process:

It has a private virtual address space, its very own memory sandbox. The vat, if you will. In its address space, the program's binary file plus the libraries it uses are all memory mapped. Part of the address space maps the kernel itself.

Below is the code for our program, pid, which simply retrieves its process id via getpid(2):

pid.c

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main()
{
    pid_t p = getpid();
    printf("%d\n", p);
}

In Linux, a process isn't born knowing its PID. It must ask the kernel, so this requires a system call:

It all starts with a call to the C library's getpid(), which is a wrapper for the system call. When you call functions like open(2), read(2), and friends, you're calling these wrappers. This is true for many languages where the native methods ultimately end up in libc.

Wrappers offer convenience atop the bare-bones OS API, helping keep the kernel lean. Lines of code are where bugs live, and all kernel code runs in privileged mode, where mistakes can be disastrous. Anything that can be done in user mode should be done in user mode. Let the libraries offer friendly methods and fancy argument processing a la printf(3).

Compared to web APIs, this is analogous to building the simplest possible HTTP interface to a service and then offering language-specific libraries with helper methods. Or maybe some caching, which is what libc's getpid() does: when first called it actually performs a system call, but the PID is then cached to avoid the syscall overhead in subsequent invocations.

Once the wrapper has done its initial work, it's time to jump into hyperspace: the kernel. The mechanics of this transition vary by processor architecture. In Intel processors, arguments and the syscall number are loaded into registers, then an instruction is executed to put the CPU in privileged mode and immediately transfer control to a global syscall entry point within the kernel. If you're interested in details, David Drysdale has two great articles in LWN (first, second).

The kernel then uses the syscall number as an index into sys_call_table, an array of function pointers to each syscall implementation. Here, sys_getpid is called:

In Linux, syscall implementations are mostly arch-independent C functions, sometimes trivial, insulated from the syscall mechanism by the kernel's excellent design. They are regular code working on general data structures. Well, apart from being completely paranoid about argument validation.

Once their work is done they return normally, and the arch-specific code takes care of transitioning back into user mode where the wrapper does some post processing. In our example, getpid(2) now caches the PID returned by the kernel. Other wrappers might set the global errno variable if the kernel returns an error. Small things to let you know GNU cares.

If you want to be raw, glibc offers the syscall(2) function, which makes a system call without a wrapper. You can also do so yourself in assembly. There's nothing magical or privileged about a C library.
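
Putting the last two ideas together, here's a sketch of a caching wrapper in the spirit of libc's getpid(), built on syscall(2) - an illustration of the idea, not glibc's actual code:

#include <sys/syscall.h>
#include <unistd.h>

/* Sketch of a caching getpid() wrapper: the first call crosses into the
   kernel via the raw syscall(2) interface; later calls are answered
   entirely in user space. (A real implementation must also invalidate
   the cache across fork().) */
static pid_t cached_pid;

pid_t my_getpid(void)
{
    if (cached_pid == 0)
        cached_pid = syscall(SYS_getpid);
    return cached_pid;
}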

This syscall design has far-reaching consequences. Let's start with the incredibly useful strace(1), a tool you can use to spy on system calls made by Linux processes (in Macs, see dtruss(1m) and the amazing dtrace; in Windows, see sysinternals). Here's strace on pid:

~/code/x86-os$ strace ./pid

execve("./pid", ["./pid"], [/* 20 vars */]) = 0
brk(0) = 0x9aa0000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7767000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=18056, ...}) = 0
mmap2(NULL, 18056, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7762000
close(3) = 0

[...snip...]

getpid() = 14678
fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 1), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7766000
write(1, "14678\n", 614678
) = 6
exit_group(6) = ?

Each line of output shows a system call, its arguments, and a return value. If you put getpid(2) in a loop running 1000 times, you would still have only one getpid() syscall because of the PID caching. We can also see that printf(3) calls write(2) after formatting the output string.

strace can start a new process and also attach to an already running one. You can learn a lot by looking at the syscalls made by different programs. For example, what does the sshd daemon do all day?

~/code/x86-os$ ps ax | grep sshd
12218 ? Ss 0:00 /usr/sbin/sshd -D

~/code/x86-os$ sudo strace -p 12218
Process 12218 attached - interrupt to quit
select(7, [3 4], NULL, NULL, NULL

[
... nothing happens ...
No fun, it's just waiting for a connection using select(2)
If we wait long enough, we might see new keys being generated and so on, but
let's attach again, tell strace to follow forks (-f), and connect via SSH
]

~/code/x86-os$ sudo strace -p 12218 -f

[lots of calls happen during an SSH login, only a few shown]

[pid 14692] read(3, "-----BEGIN RSA PRIVATE KEY-----\n"..., 1024) = 1024
[pid 14692] open("/usr/share/ssh/blacklist.RSA-2048", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 14692] open("/etc/ssh/blacklist.RSA-2048", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[pid 14692] open("/etc/ssh/ssh_host_dsa_key", O_RDONLY|O_LARGEFILE) = 3
[pid 14692] open("/etc/protocols", O_RDONLY|O_CLOEXEC) = 4
[pid 14692] read(4, "# Internet (IP) protocols\n#\n# Up"..., 4096) = 2933
[pid 14692] open("/etc/hosts.allow", O_RDONLY) = 4
[pid 14692] open("/lib/i386-linux-gnu/libnss_dns.so.2", O_RDONLY|O_CLOEXEC) = 4
[pid 14692] stat64("/etc/pam.d", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 14692] open("/etc/pam.d/common-password", O_RDONLY|O_LARGEFILE) = 8
[pid 14692] open("/etc/pam.d/other", O_RDONLY|O_LARGEFILE) = 4

SSH is a large chunk to bite off, but it gives a feel for strace usage. Being able to see which files an app opens can be useful ("where the hell is this config coming from?"). If you have a process that appears stuck, you can strace it and see what it might be doing via system calls. When an app quits unexpectedly without a proper error message, check whether a syscall failure explains it. You can also use filters, time each call, and so on:

~/code/x86-os$ strace -T -e trace=recv curl --silent www.google.com. > /dev/null

recv(3, "HTTP/1.1 200 OK\r\nDate: Wed, 05 N"..., 16384, 0) = 4164 <0.000007>
recv(3, "fl a{color:#36c}a:visited{color:"..., 16384, 0) = 2776 <0.000005>
recv(3, "adient(top,#4d90fe,#4787ed);filt"..., 16384, 0) = 4164 <0.000007>
recv(3, "gbar.up.spd(b,d,1,!0);break;case"..., 16384, 0) = 2776 <0.000006>
recv(3, "$),a.i.G(!0)),window.gbar.up.sl("..., 16384, 0) = 1388 <0.000004>
recv(3, "margin:0;padding:5px 8px 0 6px;v"..., 16384, 0) = 1388 <0.000007>
recv(3, "){window.setTimeout(function(){v"..., 16384, 0) = 1484 <0.000006>

I encourage you to explore these tools in your OS. Using them well is like having a super power.

But enough useful stuff, let's go back to design. We've seen that a userland app is trapped in its virtual address space running in ring 3 (unprivileged). In general, tasks that involve only computation and memory accesses do not require syscalls. For example, C library functions like strlen(3) and memcpy(3) have nothing to do with the kernel. Those happen within the app.

The man page sections for a C library function (the 2 and 3 in parentheses) also offer clues. Section 2 is used for system call wrappers, while section 3 contains other C library functions. However, as we saw with printf(3), a library function might ultimately make one or more syscalls.

If you're curious, here are full syscall listings for Linux (also Filippo's list) and Windows. They have ~310 and ~460 system calls, respectively. It's fun to look at those because, in a way, they represent all that software can do on a modern computer. Plus, you might find gems to help with things like interprocess communication and performance. This is an area where "Those who do not understand Unix are condemned to reinvent it, poorly."

Many syscalls perform tasks that take eons compared to CPU cycles, for example reading from a hard drive. In those situations the calling process is often put to sleep until the underlying work is completed. Because CPUs are so fast, your average program is I/O bound and spends most of its life sleeping, waiting on syscalls. By contrast, if you strace a program busy with a computational task, you often see no syscalls being invoked. In such a case, top(1) would show intense CPU usage.

The overhead involved in a system call can be a problem. For example, SSDs are so fast that general OS overhead can be more expensive than the I/O operation itself. Programs doing large numbers of reads and writes can also have OS overhead as their bottleneck. Vectored I/O can help some. So can memory mapped files, which allow a program to read and write from disk using only memory access. Analogous mappings exist for things like video card memory. Eventually, the economics of cloud computing might lead us to kernels that eliminate or minimize user/kernel mode switches.
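
To make the memory-mapped option concrete, here's a minimal sketch that reads a file via mmap(2) instead of repeated read(2) calls (the "data.bin" file name is made up):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);    /* hypothetical input file */
    if (fd < 0)
        return 1;

    struct stat sb;
    fstat(fd, &sb);

    /* One syscall sets up the mapping; after that, plain memory reads do
       the job and the kernel pages the file in on demand. */
    char *p = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return 1;

    long sum = 0;
    for (off_t i = 0; i < sb.st_size; i++)
        sum += (unsigned char)p[i];         /* no syscall per access */
    printf("%ld\n", sum);

    munmap(p, sb.st_size);
    close(fd);
    return 0;
}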

Finally, syscalls have interesting security implications. One is that no matter how obfuscated a binary, you can still examine its behavior by looking at the system calls it makes. This can be used to detect malware, for example. We can also record profiles of a known program's syscall usage and alert on deviations, or perhaps whitelist specific syscalls for programs so that exploiting vulnerabilities becomes harder. We have a ton of research in this area, a number of tools, but not a killer solution yet.
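
The kernel already ships a building block for that last idea: seccomp. In its original strict mode, a process can irrevocably restrict itself to read(2), write(2), _exit(2), and sigreturn(2). A minimal sketch:

#include <unistd.h>
#include <sys/prctl.h>
#include <linux/seccomp.h>

int main(void)
{
    prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT);

    /* write(2) is still on the whitelist... */
    write(STDOUT_FILENO, "sandboxed\n", 10);

    /* ...but the getpid(2) below is not: the kernel kills the process
       with SIGKILL the moment the syscall is attempted. */
    getpid();
    return 0;
}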

And that's it for system calls. I'm sorry for the length of this post, I hope it was helpful. More (and shorter) next week, RSS and Twitter. Also, last night I made a promise to the universe. This post is dedicated to the glorious Clube Atlético Mineiro.

What does an idle CPU do?

In the last post I said the fundamental axiom of OS behavior is that at any given time, exactly one and only one task is active on a CPU. But if there's absolutely nothing to do, then what?

It turns out that this situation is extremely common, and for most personal computers it's actually the norm: an ocean of sleeping processes, all waiting on some condition to wake up, while nearly 100% of CPU time is going into the mythical "idle task." In fact, if the CPU is consistently busy for a normal user, it's often a misconfiguration, bug, or malware.

Since we can't violate our axiom, some task needs to be active on a CPU. First because it's good design: it would be unwise to spread special cases all over the kernel checking whether there is in fact an active task. A design is far better when there are no exceptions. Whenever you write an if statement, Nyan Cat cries. And second, we need to do something with all those idle CPUs, lest they get spunky and, you know, create Skynet.

So to keep design consistency and be one step ahead of the devil, OS developers create an idle task that gets scheduled to run when there's no other work. We have seen in the Linux boot process that the idle task is process 0, a direct descendant of the very first instruction that runs when a computer is first turned on. It is initialized in rest_init, where init_idle_bootup_task initializes the idle scheduling class.

Briefly, Linux supports different scheduling classes for things like real-time processes, regular user processes, and so on. When it's time to choose a process to become the active task, these classes are queried in order of priority. That way, the nuclear reactor control code always gets to run before the web browser. Often, though, these classes return NULL, meaning they don't have a suitable process to run - they're all sleeping. But the idle scheduling class, which runs last, never fails: it always returns the idle task.
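
In code, that query order boils down to a simple loop. Below is a simplified, kernel-style sketch of the scheduler's dispatch - it leans on kernel types and the for_each_class() macro, and the real pick_next_task has more going on:

/* Simplified sketch: ask each scheduling class, in priority order, for a
   runnable task. The idle class sits last and never returns NULL. */
struct task_struct *pick_next_task(struct rq *rq)
{
    const struct sched_class *class;
    struct task_struct *p;

    for_each_class(class) {
        p = class->pick_next_task(rq);
        if (p)
            return p;
    }
    /* unreachable: the idle scheduling class always returns its task */
}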

That's all good, but let's get down to just what exactly this idle task is doing. So here is cpu_idle_loop, courtesy of open source:

cpu_idle_loop

while (1) {
    while (!need_resched()) {
        cpuidle_idle_call();
    }

    /*
    [Note: Switch to a different task. We will return to this loop when the
    idle task is again selected to run.]
    */
    schedule_preempt_disabled();
}

I've omitted many details, and we'll look at task switching closely later on, but if you read the code you'll get the gist of it: as long as there's no need to reschedule, meaning change the active task, stay idle. Measured in elapsed time, this loop and its cousins in other OSes are probably the most executed pieces of code in computing history. For Intel processors, staying idle traditionally meant running the halt instruction:

native_halt

static inline void native_halt(void)
{
    asm volatile("hlt": : :"memory");
}

hlt stops code execution in the processor and puts it in a halted state. It's weird to think that across the world millions and millions of Intel-like CPUs are spending the majority of their time halted, even while they're powered up. It's also not terribly efficient, energy-wise, which led chip makers to develop deeper sleep states for processors, which trade off less power consumption for longer wake-up latency. The kernel's cpuidle subsystem is responsible for taking advantage of these power-saving modes.

Now once we tell the CPU to halt, or sleep, we need to somehow bring it back to life. If you've read the last post, you might suspect interrupts are involved, and indeed they are. Interrupts spur the CPU out of its halted state and back into action. So putting this all together, here's what your system mostly does as you read a fully rendered web page:

Other interrupts besides the timer interrupt also get the processor moving again. That's what happens if you click on a web page, for example: your mouse issues an interrupt, its driver processes it, and suddenly a process is runnable because it has fresh input. At that point need_resched() returns true, and the idle task is booted out in favor of your browser.

But let's stick to idleness in this post. Here's the idle loop over time:

In this example the timer interrupt was programmed by the kernel to happen every 4 milliseconds (ms). This is the tick period. That means we get 250 ticks per second, so the tick rate or tick frequency is 250 Hz. That's a typical value for Linux running on Intel processors, with 100 Hz being another crowd favorite. This is defined in the CONFIG_HZ option when you build the kernel.

Now that looks like an awful lot of pointless work for an idle CPU, and it is. Without fresh input from the outside world, the CPU will remain stuck in this hellish nap getting woken up 250 times a second while your laptop battery is drained. If this is running in a virtual machine, we're burning both power and valuable cycles from the host CPU.

The solution here is to have a dynamic tick so that when the CPU is idle, the timer interrupt is either deactivated or reprogrammed to happen at a point where the kernel knows there will be work to do (for example, a process might have a timer expiring in 5 seconds, so we must not sleep past that). This is also called tickless mode.

Finally, suppose you have one active process in a system, for example a long-running CPU-intensive task. That's nearly identical to an idle system: these diagrams remain about the same, just substitute the one process for the idle task and the pictures are accurate. In that case it's still pointless to interrupt the task every 4 ms for no good reason: it's merely OS jitter slowing your work ever so slightly. Linux can also stop the fixed-rate tick in this one-process scenario, in what's called adaptive-tick mode. Eventually, a fixed-rate tick may be gone altogether.

That's enough idleness for one post. The kernel's idle behavior is an important part of the OS puzzle, and it's very similar to other situations we'll see, so this helps us build the picture of a running kernel. More next week, RSS and Twitter.

When Does Your OS Run?

Here's a question: in the time it takes you to read this sentence, has your OS been running? Or was it only your browser? Or were they perhaps both idle, just waiting for you to do something already?

These questions are simple but they cut through the essence of how software works. To answer them accurately we need a good mental model of OS behavior, which in turn informs performance, security, and troubleshooting decisions. We'll build such a model in this post series using Linux as the primary OS, with guest appearances by OS X and Windows. I'll link to the Linux kernel sources for those who want to delve deeper.

The fundamental axiom here is that at any given moment, exactly one task is active on a CPU. The task is normally a program, like your browser or music player, or it could be an operating system thread, but it is one task. Not two or more. Never zero, either. One. Always.

This sounds like trouble. For what if, say, your music player hogs the CPU and doesn't let any other tasks run? You would not be able to open a tool to kill it, and even mouse clicks would be futile as the OS wouldn't process them. You could be stuck blaring "What does the fox say?" and incite a workplace riot.

That's where interrupts come in. Much as the nervous system interrupts the brain to bring in external stimuli - a loud noise, a touch on the shoulder - the chipset in a computer's motherboard interrupts the CPU to deliver news of outside events - key presses, the arrival of network packets, the completion of a hard drive read, and so on. Hardware peripherals, the interrupt controller on the motherboard, and the CPU itself all work together to implement these interruptions, called interrupts for short.

Interrupts are also essential in tracking that which we hold dearest: time. During the boot process the kernel programs a hardware timer to issue timer interrupts at a periodic interval, for example every 10 milliseconds. When the timer goes off, the kernel gets a shot at the CPU to update system statistics and take stock of things: has the current program been running for too long? Has a TCP timeout expired? Interrupts give the kernel a chance to both ponder these questions and take appropriate actions. It's as if you set periodic alarms throughout the day and used them as checkpoints: should I be doing what I'm doing right now? Is there anything more pressing? One day you find ten years have got behind you.

These periodic hijackings of the CPU by the kernel are called ticks, so interrupts quite literally make your OS tick. But there's more: interrupts are also used to handle some software events like integer overflows and page faults, which involve no external hardware. Interrupts are the most frequent and crucial entry point into the OS kernel. They're not some oddity for the EE people to worry about, they're the mechanism whereby your OS runs.

Enough talk, let's see some action. Below is a network card interrupt in an Intel Core i5 system. The diagrams now have image maps, so you can click on juicy bits for more information. For example, each device links to its Linux driver.

Let's take a look at this. First off, since there are many sources of interrupts, it wouldn't be very helpful if the hardware simply told the CPU "hey, something happened!" and left it at that. The suspense would be unbearable. So each device is assigned an interrupt request line, or IRQ, during power up. These IRQs are in turn mapped into interrupt vectors, a number between 0 and 255, by the interrupt controller. By the time an interrupt reaches the CPU it has a nice, well-defined number insulated from the vagaries of hardware.

The CPU in turn has a pointer to what's essentially an array of 256 functions, supplied by the kernel, where each function is the handler for that particular interrupt vector. We'll look at this array, the Interrupt Descriptor Table (IDT), in more detail later on.
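
Conceptually, the dispatch amounts to an indexed function call. Here's a toy model in C - it leaves out the real gate-descriptor format and privilege machinery entirely:

/* Toy model of interrupt dispatch. The real IDT holds 256 gate
   descriptors and is registered with the CPU via the lidt instruction,
   but the net effect is an indexed call like this. */
typedef void (*interrupt_handler)(void);

static interrupt_handler idt[256];

void on_interrupt(unsigned char vector)
{
    idt[vector]();    /* runs in kernel mode, in the current task's context */
}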

Whenever an interrupt arrives, the CPU uses its vector as an index into the IDT and runs the appropriate handler. This happens as a special function call that takes place in the context of the currently running task, allowing the OS to respond to external events quickly and with minimal overhead. So web servers out there indirectly call a function in your CPU when they send you data, which is either pretty cool or terrifying. Below we show a situation where a CPU is busy running a Vim command when an interrupt arrives:

Notice how the interrupt's arrival causes a switch to kernel mode and ring zero but it does not change the active task. It's as if Vim made a magic function call straight into the kernel, but Vim is still there, its address space intact, waiting for that call to return.

Exciting stuff! Alas, I need to keep this post-sized, so let's finish up for now. I understand we have not answered the opening question and have in fact opened up new questions, but you now suspect ticks were taking place while you read that sentence. We'll find the answers as we flesh out our model of dynamic OS behavior, and the browser scenario will become clear. If you have questions, especially as the posts come out, fire away and I'll try to answer them in the posts themselves or as comments. Next installment is tomorrow on RSS and Twitter.

Closures, Objects, and the Fauna of the Heap

The last post in this series looks at closures, objects, and other creatures roaming beyond the stack. Much of what we'll see is language neutral, but I'll focus on JavaScript with a dash of C. Let's start with a simple C program that reads a song and a band name and outputs them back to the user:

stackFolly.c

#include <stdio.h>
#include <string.h>

char *read()
{
    char data[64];
    fgets(data, 64, stdin);
    return data;
}

int main(int argc, char *argv[])
{
    char *song, *band;

    puts("Enter song, then band:");
    song = read();
    band = read();

    printf("\n%sby %s", song, band);

    return 0;
}

If you run this gem, here's what you get (=> denotes program output):

./stackFolly
=> Enter song, then band:
The Past is a Grotesque Animal
of Montreal

=> ?ǿontreal
=> by ?ǿontreal

Ayeee! Where did things go so wrong? (Said every C beginner, ever.)

It turns out that the contents of a function's stack variables are only valid while the stack frame is active, that is, until the function returns. Upon return, the memory used by the stack frame is deemed free and liable to be overwritten in the next function call.

Below is exactly what happens in this case. The diagrams now have image maps, so you can click on a piece of data to see the relevant gdb output (gdb commands are here). As soon as read() is done with the song name, the stack is thus:

At this point, the song variable actually points to the song name. Sadly, the memory storing that string is ready to be reused by the stack frame of whatever function is called next. In this case, read() is called again, with the same stack frame layout, so the result is this:

The band name is read into the same memory location and overwrites the previously stored song name. band and song end up pointing to the exact same spot. Finally, we didn't even get "of Montreal" output correctly. Can you guess why?

And so it happens that the stack, for all its usefulness, has this serious limitation. It cannot be used by a function to store data that needs to outlive the function's execution. You must resort to the heap and say goodbye to the hot caches, deterministic instantaneous operations, and easily computed offsets. On the plus side, it works.
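
For our program, the fix can be as small as copying the string into the heap before returning - a sketch:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Heap-based version of read(): fgets() still fills a stack buffer, but
   strdup() copies the result into heap memory that survives the return.
   The caller now owns the string and must free() it. */
char *read_line(void)
{
    char data[64];
    if (fgets(data, 64, stdin) == NULL)
        return NULL;
    return strdup(data);
}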

The price is you must now remember to free() memory or take a performance hit on a garbage collector, which finds unused heap objects and frees them. That's the fundamental tradeoff between stack and heap: performance vs. flexibility.

Most languages' virtual machines take a middle road that mirrors what C programmers do. The stack is used for value types, things like integers, floats and booleans. These are stored directly in local variables and object fields as a sequence of bytes specifying a value (like argc above). In contrast, heap inhabitants are reference types such as strings and objects. Variables and fields contain a memory address that references these objects, like song and band above.

Consider this JavaScript function:

function fn()
{
    var a = 10;
    var b = { name: 'foo', n: 10 };
}

This might produce the following:

I say "might" because specific behaviors depend heavily on implementation. This post takes a V8-centric approach with many diagram shapes linking to relevant source code. In V8, only small integers are stored as values. Also, from now on I'll show strings directly in objects to reduce visual noise, but keep in mind they exist separately in the heap, as shown above.

Now let's take a look at closures, which are simple but get weirdly hyped up and mythologized. Take a trivial JS function:

function add(a, b)
{
    var c = a + b;
    return c;
}

This function defines a lexical scope, a happy little kingdom where the names a, b, and c have precise meanings. They are the two parameters and one local variable declared by the function. The program might use those same names elsewhere, but within add that's what they refer to. And while lexical scope is a fancy term, it aligns well with our intuitive understanding: after all, we can quite literally see the bloody thing, much as a lexer does, as a textual block in the program's source.

Having seen stack frames in action, it's easy to imagine an implementation for this name specificity. Within add, these names refer to stack locations private to each running instance of the function. That's in fact how it often plays out in a VM.

So let's nest two lexical scopes:

function makeGreeter()
{
    return function hi(name) {
        console.log('hi, ' + name);
    }
}

var hi = makeGreeter();
hi('dear reader'); // prints "hi, dear reader"

That's more interesting. Function hi is built at runtime within makeGreeter. It has its own lexical scope, where name is an argument on the stack, but visually it sure looks like it can access its parent's lexical scope as well, which it can. Let's take advantage of that:

function makeGreeter(greeting)
{
    return function greet(name) {
        console.log(greeting + ', ' + name);
    }
}

var heya = makeGreeter('HEYA');
heya('dear reader'); // prints "HEYA, dear reader"

A little strange, but pretty cool. There's something about it though that violates our intuition: greeting sure looks like a stack variable, the kind that should be dead after makeGreeter() returns. And yet, since greet() keeps working, something funny is going on. Enter the closure:

The VM allocated an object to store the parent variable used by the inner greet(). It's as if makeGreeter's lexical scope had been closed over at that moment, crystallized into a heap object for as long as needed (in this case, the lifetime of the returned function). Hence the name closure, which makes a lot of sense when you see it that way. If more parent variables had been used (or captured), the Context object would have more properties, one per captured variable. Naturally, the code emitted for greet() knows to read greeting from the Context object, rather than expect it on the stack.
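If it helps to demystify the magic, here's a rough C analog of what the VM does (a sketch with made-up names, not V8's actual machinery): the captured variable lives in a heap-allocated context struct, and the inner function takes a pointer to that struct instead of looking for the value on the stack.

#include <stdio.h>
#include <stdlib.h>

/* The "closure context": one field per captured variable. */
typedef struct {
    const char *greeting;
} Context;

/* The inner function reads its captured variable from the context,
   not from the stack. */
void greet(Context *ctx, const char *name)
{
    printf("%s, %s\n", ctx->greeting, name);
}

/* Closing over the parameter: copy it into a heap object that outlives
   this function's stack frame. */
Context *makeGreeter(const char *greeting)
{
    Context *ctx = malloc(sizeof(Context));
    ctx->greeting = greeting;
    return ctx;
}

int main(void)
{
    Context *heya = makeGreeter("HEYA");
    greet(heya, "dear reader"); /* prints "HEYA, dear reader" */
    free(heya);
    return 0;
}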

Here's a fuller example:

function makeGreeter(greetings)
{
    var count = 0;
    var greeter = {};

    for (var i = 0; i < greetings.length; i++) {
        var greeting = greetings[i];

        greeter[greeting] = function(name) {
            count++;
            console.log(greeting + ', ' + name);
        }
    }

    greeter.count = function() { return count; }

    return greeter;
}

var greeter = makeGreeter(["hi", "hello", "howdy"])
greeter.hi('poppet'); // prints "howdy, poppet"
greeter.hello('darling'); // prints "howdy, darling"
greeter.count(); // returns 2

Well... count() works, but our greeter is stuck in howdy. Can you tell why? What we're doing with count is a clue: even though the lexical scope is closed over into a heap object, the values taken by the variables (or object properties) can still be changed. Here's what we have:

There is one common context shared by all functions. That's why count works. But the greeting is also being shared, and it was set to the last value iterated over, "howdy" in this case. That's a pretty common error, and the easiest way to avoid it is to introduce a function call that takes the closed-over variable as an argument. In CoffeeScript, the do keyword provides an easy way to do so (and ES6's let creates a fresh binding per loop iteration, which sidesteps the problem entirely). Here's a simple solution for our greeter:

function makeGreeter(greetings)
{
    var count = 0;
    var greeter = {};

    greetings.forEach(function(greeting) {
        greeter[greeting] = function(name) {
            count++;
            console.log(greeting + ', ' + name);
        }
    });

    greeter.count = function() { return count; }

    return greeter;
}

var greeter = makeGreeter(["hi", "hello", "howdy"])
greeter.hi('poppet'); // prints "hi, poppet"
greeter.hello('darling'); // prints "hello, darling"
greeter.count(); // returns 2

It now works, and the result becomes:

That's a lot of arrows! But here's the interesting feature: in our code, we closed over two nested lexical contexts, and sure enough we get two linked Context objects in the heap. You could nest and close over many lexical contexts, Russian-doll style, and you end up with essentially a linked list of all these Context objects.

Of course, just as you can implement TCP over carrier pigeons, there are many ways to implement these language features. For example, the ES6 spec defines lexical environments as consisting of an environment record (roughly, the local identifiers within a block) plus a link to an outer environment record, allowing the nesting we have seen. The logical rules are nailed down by the spec (one hopes), but it's up to the implementation to translate them into bits and bytes.
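In C, that linked structure might be sketched like this (field names are illustrative, continuing the context idea above rather than describing V8's real layout):

#include <stddef.h>

/* Each lexical environment: its own slots plus a link to the outer one. */
typedef struct Env {
    struct Env *outer;    /* enclosing environment, or NULL at top level */
    const char *greeting; /* a captured slot, as in makeGreeter */
} Env;

/* Resolving a name walks the chain outward. Real VMs typically know the
   exact slot and depth at compile time, so no runtime search is needed. */
const char *lookupGreeting(const Env *env)
{
    while (env != NULL && env->greeting == NULL) {
        env = env->outer;
    }
    return env ? env->greeting : NULL;
}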

You can also inspect the assembly code produced by V8 for specific cases. Vyacheslav Egorov has great posts explaining this process and V8 closure internals in detail. I've only started studying V8, so pointers and corrections are welcome. If you know C#, inspecting the IL code emitted for closures is enlightening - you will see the analog of V8 Contexts explicitly defined and instantiated.

Closures are powerful beasts. They provide a succinct way to hide information from a caller while sharing it among a set of functions. I love that they truly hide your data: unlike object fields, callers cannot access or even see closed-over variables. That keeps the interface cleaner and safer.

But they're no silver bullet. Sometimes an object nut and a closure fanatic will argue endlessly about their relative merits. Like most tech discussions, it's often more about ego than real tradeoffs. At any rate, this epic koan by Anton van Straaten settles the issue:

The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said "Master, I have heard that objects are a very good thing - is this true?" Qc Na looked pityingly at his student and replied, "Foolish pupil - objects are merely a poor man's closures."

Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire "Lambda: The Ultimate..." series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.

On his next walk with Qc Na, Anton attempted to impress his master by saying "Master, I have diligently studied the matter, and now understand that objects are truly a poor man's closures." Qc Na responded by hitting Anton with his stick, saying "When will you learn? Closures are a poor man's object." At that moment, Anton became enlightened.

And that closes our stack series. In the future I plan to cover other language implementation topics like object binding and vtables. But the call of the kernel is strong, so there's an OS post coming out tomorrow. I invite you to subscribe and follow me.

Tail Calls, Optimization, and ES6

In this penultimate post about the stack, we take a quick look at tail calls, compiler optimizations, and the proper tail calls landing in the newest version of JavaScript.

A tail call happens when a function F makes a function call as its final action. At that point F will do absolutely no more work: it passes the ball to whatever function is being called and vanishes from the game. This is notable because it opens up the possibility of tail call optimization: instead of creating a new stack frame for the function call, we can simply reuse F's stack frame, thereby saving stack space and avoiding the work involved in setting up a new frame. Here are some examples in C and their results compiled with mild optimization:

Simple Tail Calls

int add5(int a)
{
    return a + 5;
}

int add10(int a)
{
    int b = add5(a); // not tail
    return add5(b); // tail
}

int add5AndTriple(int a)
{
    int b = add5(a); // not tail
    return 3 * add5(b); // not tail, doing work after the call
}

int finicky(int a)
{
    if (a > 10) {
        return add5AndTriple(a); // tail
    }

    if (a > 5) {
        int b = add5(a); // not tail
        return finicky(b); // tail
    }

    return add10(a); // tail
}

You can normally spot tail call optimization (hereafter, TCO) in compiler output by seeing a jump instruction where a call would have been expected. At runtime TCO leads to a reduced call stack.

A common misconception is that tail calls are necessarily recursive. That's not the case: a tail call may be recursive, such as in finicky() above, but it need not be. As long as caller F is completely done at the call site, we've got ourselves a tail call. Whether it can be optimized is a different question whose answer depends on your programming environment.
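For example, here's a contrived sketch of mutual recursion in which every call is a tail call, yet neither function ever calls itself:

#include <stdio.h>

int isOdd(unsigned n);

int isEven(unsigned n)
{
    if (n == 0) return 1;
    return isOdd(n - 1); // tail
}

int isOdd(unsigned n)
{
    if (n == 0) return 0;
    return isEven(n - 1); // tail
}

int main(void)
{
    printf("%d\n", isEven(10)); // prints 1
    return 0;
}

Mutual recursion is also where guaranteed TCO earns its keep: unlike the factorial we'll see below, a pair like this is much harder for a compiler to rewrite as a simple loop.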

"Yes, it can, always!" is the best answer we can hope for, which is famously the case for Scheme, as discussed in SICP (by the way, if when you program you don't feel like "a Sorcerer conjuring the spirits of the computer with your spells," I urge you to read that book). It's also the case for Lua. And most importantly, it is the case for the next version of JavaScript, ES6, whose spec does a good job defining tail position and clarifying the few conditions required for optimization, such as strict mode. When a language guarantees TCO, it supports proper tail calls.

Now some of us can't kick that C habit, Heartbleed and all, and the answer there is a more complicated "sometimes" that takes us into compiler optimization territory. We've seen the simple examples above; now let's resurrect our factorial from the last post:

Recursive Factorial

#include <stdio.h>

int factorial(int n)
{
    int previous = 0xdeadbeef;

    if (n == 0 || n == 1) {
        return 1;
    }

    previous = factorial(n-1);
    return n * previous;
}

int main(int argc)
{
    int answer = factorial(5);
    printf("%d\n", answer);
}

So, is the recursive call in previous = factorial(n-1) a tail call? It's not, because of the multiplication by n afterwards. But if you're not used to optimizations, gcc's result at -O2 might shock you: not only does it transform factorial into a recursion-free loop, but the factorial(5) call is eliminated entirely and replaced by the compile-time constant 120 (5! == 120). This is why debugging optimized code can be hard sometimes. On the plus side, if you call this function it will use a single stack frame regardless of n's initial value. Compiler algorithms are pretty fun, and if you're interested I suggest you check out Building an Optimizing Compiler and ACDI.
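To make the first transformation concrete, the optimized factorial behaves roughly like this loop (a sketch of the logic gcc arrives at, not the actual emitted code):

int factorialLoop(int n)
{
    int result = 1;
    while (n > 1) {
        result *= n; /* multiply going down from n, no frames piled up */
        n--;
    }
    return result;
}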

However, what happened here was not tail call optimization, since there was no tail call to begin with. gcc outsmarted us by analyzing what the function does and optimizing away the needless recursion. The task was made easier by the simple, deterministic nature of the operations being done. By adding a dash of chaos (e.g., getpid()) we can throw gcc off:

Recursive PID Factorial

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int pidFactorial(int n)
{
    if (1 == n) {
        return getpid(); // tail
    }

    return n * pidFactorial(n-1) * getpid(); // not tail
}

int main(int argc)
{
    int answer = pidFactorial(5);
    printf("%d\n", answer);
}

Optimize that, unix fairies! So now we have a regular recursive call and this function allocates O(n) stack frames to do its work. Heroically, gcc still does TCO for getpid in the recursion base case. If we now wished to make this function tail recursive, we'd need a slight change:

tailPidFactorial.c

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int tailPidFactorial(int n, int acc)
{
    if (1 == n) {
        return acc * getpid(); // not tail
    }

    acc = (acc * getpid() * n);
    return tailPidFactorial(n-1, acc); // tail
}

int main(int argc)
{
    int answer = tailPidFactorial(5, 1);
    printf("%d\n", answer);
}

The accumulation of the result is now a loop and we've achieved true TCO. But before you go out partying, what can we say about the general case in C? Sadly, while good C compilers do TCO in a number of cases, there are many situations where they cannot. For example, as we saw in our function epilogues, the caller is responsible for cleaning up the stack after a function call in the standard C calling convention. So if function F takes two arguments, it can only make TCO calls to functions taking two or fewer arguments. This is one among many restrictions. Mark Probst wrote an excellent thesis, Proper Tail Recursion in C, discussing these issues along with C stack behavior. He also does insanely cool juggling.
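To give a flavor of another restriction (a contrived sketch of my own, not an example from the thesis): if a local variable's address escapes into the call, the caller's frame must stay alive, so it cannot be recycled no matter how tail-ish the call looks.

#include <stdio.h>

int sum(const int *arr, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += arr[i];
    }
    return total;
}

int wrapper(void)
{
    int data[4] = {1, 2, 3, 4};
    return sum(data, 4); /* syntactically a tail call, but data lives in
                            wrapper's frame, so the frame can't be reused */
}

int main(void)
{
    printf("%d\n", wrapper()); /* prints 10 */
    return 0;
}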

"Sometimes" is a rocky foundation for any relationship, so you can't rely on TCO in C. It's a discrete optimization that may or may not take place, rather than a language feature like proper tail calls, though in practice the compiler will optimize the vast majority of cases. But if you must have it, say for transpiling Scheme into C, you will suffer.

Since JavaScript is now the most popular transpilation target, proper tail calls become even more important there. So kudos to ES6 for delivering them along with many other significant improvements. It's like Christmas for JS programmers.

This concludes our brief tour of tail calls and compiler optimization. Thanks for reading and see you next time.

Recursion: dream within a dream

Recursion is magic, but it suffers from the most awkward introduction in programming books. They'll show you a recursive factorial implementation, then warn you that while it sort of works it's terribly slow and might crash due to stack overflows. "You could always dry your hair by sticking your head into the microwave, but watch out for intracranial pressure and head explosions. Or you can use a towel." No wonder people are suspicious of it. Which is too bad, because recursion is the single most powerful idea in algorithms.

Let's take a look at the classic recursive factorial:

Recursive Factorial - factorial.c
#include <stdio.h>

int factorial(int n)
{
    int previous = 0xdeadbeef;

    if (n == 0 || n == 1) {
        return 1;
    }

    previous = factorial(n-1);
    return n * previous;
}

int main(int argc)
{
    int answer = factorial(5);
    printf("%d\n", answer);
}

The idea of a function calling itself is mystifying at first. To make it concrete, here is exactly what is on the stack when factorial(5) is called and reaches n == 1:

Each call to factorial generates a new stack frame. The creation and destruction of these stack frames is what makes the recursive factorial slower than its iterative counterpart. The accumulation of these frames before the calls start returning is what can potentially exhaust stack space and crash your program.

These concerns are often theoretical. For example, the stack frames for factorial take 16 bytes each (this can vary depending on stack alignment and other factors). A modern x86 Linux kernel normally gives you 8 megabytes of stack space, so factorial could handle n up to ~512,000 (8 MB / 16 bytes = 524,288 frames). That is a monstrously large result that takes 8,971,833 bits to represent, so stack space is the least of our problems: a puny integer - even a 64-bit one - will overflow tens of thousands of times over before we run out of stack space.

We'll look at CPU usage in a moment, but for now let's take a step back from the bits and bytes and look at recursion as a general technique. Our factorial algorithm boils down to pushing integers N, N-1, ... 1 onto a stack, then multiplying them in reverse order. The fact we're using the program's call stack to do this is an implementation detail: we could allocate a stack on the heap and use that instead. While the call stack does have special properties, it's just another data structure at your disposal. I hope the diagram makes that clear.

Once you see the call stack as a data structure, something else becomes clear: piling up all those integers to multiply them afterwards is one dumbass idea. That is the real lameness of this implementation: it's using a screwdriver to hammer a nail. It's far more sensible to use an iterative process to calculate factorials.

But there are plenty of screws out there, so let's pick one. There is a traditional interview question where you're given a mouse in a maze, and you must help the mouse search for cheese. Suppose the mouse can turn either left or right in the maze. How would you model and solve this problem?

Like most problems in life, you can reduce this rodent quest to a graph, in particular a binary tree where the nodes represent positions in the maze. You could then have the mouse attempt left turns whenever possible, and backtrack to turn right when it reaches a dead end. Here's the mouse walk in an example maze:

Each edge (line) is a left or right turn taking our mouse to a new position. If either turn is blocked, the corresponding edge does not exist. Now we're talking! This process is inherently recursive whether you use the call stack or another data structure. But using the call stack is just so easy:

Recursive Maze Solver

#include <stdio.h>
#include "maze.h"

int explore(maze_t *node)
{
    int found = 0;

    if (node == NULL) {
        return 0;
    }

    if (node->hasCheese) {
        return 1; // found cheese
    }

    found = explore(node->left) || explore(node->right);
    return found;
}

int main(int argc)
{
    int found = explore(&maze);
}

Below is the stack when we find the cheese in maze.c:13. You can also see the detailed GDB output and commands used to gather data.

This shows recursion in a much better light because it's a suitable problem. And that's no oddity: when it comes to algorithms, recursion is the rule, not the exception. It comes up when we search, when we traverse trees and other data structures, when we parse, when we sort: it's everywhere. You know how pi or e come up in math all the time because they're in the foundations of the universe? Recursion is like that: it's in the fabric of computation.

Steven Skiena's excellent Algorithm Design Manual is a great place to see that in action as he works through his "war stories" and shows the reasoning behind algorithmic solutions to real-world problems. It's the best resource I know of for developing your intuition for algorithms. Another good read is McCarthy's original paper on LISP. Recursion is both in its title and in the foundations of the language. The paper is readable and fun; it's always a pleasure to see a master at work.

Back to the maze. While it's hard to get away from recursion here, it doesn't mean it must be done via the call stack. You could, for example, use a string like RRLL to keep track of the turns, and rely on the string to decide the mouse's next move. Or you can allocate something else to record the state of the cheese hunt. Either way, you'd still be implementing a recursive process, just rolling your own data structure, as sketched below.
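Here's a sketch of that do-it-yourself route (reusing the maze_t node type from maze.h above; the fixed-size stack is an assumption for brevity):

#include "maze.h"

int exploreExplicit(maze_t *root)
{
    maze_t *stack[128]; /* our own stack, instead of the call stack */
    int top = 0;

    stack[top++] = root;

    while (top > 0) {
        maze_t *node = stack[--top];

        if (node == NULL) {
            continue;
        }
        if (node->hasCheese) {
            return 1; /* found cheese */
        }

        /* push right first so the left turn is attempted first */
        stack[top++] = node->right;
        stack[top++] = node->left;
    }

    return 0;
}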

That's likely to be more complex because the call stack fits like a glove. Each stack frame records not only the current node, but also the state of computation in that node (in this case, whether we've taken only the left, or are already attempting the right). Hence the code becomes trivial. Yet we sometimes give up this sweetness for fear of overflows and hopes of performance. That can be foolish.

As we've seen, the stack is large and frequently other constraints kick in before stack space does. One can also check the problem size and ensure it can be handled safely. The CPU worry is instilled chiefly by two widespread pathological examples: the dumb factorial and the hideous O(2^n) recursive Fibonacci without memoization. These are not indicative of sane stack-recursive algorithms.
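And if you want a safety net, a cheap depth check turns a potential overflow into a clean failure (a sketch; the bound is arbitrary and maze_t is again the node type from above):

#include "maze.h"

#define MAX_DEPTH 10000

int exploreBounded(maze_t *node, int depth)
{
    if (node == NULL || depth > MAX_DEPTH) {
        return 0; /* refuse to recurse further rather than blow the stack */
    }
    if (node->hasCheese) {
        return 1;
    }
    return exploreBounded(node->left, depth + 1) ||
           exploreBounded(node->right, depth + 1);
}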

The reality is that stack operations are fast. Often the offsets to data are known exactly, the stack is hot in the caches, and there are dedicated instructions to get things done. Meanwhile, there is substantial overhead involved in using your own heap-allocated data structures. It's not uncommon to see people write something that ends up more complex and less performant than call-stack recursion. Finally, modern CPUs are pretty good and often not the bottleneck. Be careful about sacrificing simplicity, and as always with performance, measure.

The next post is the last in this stack series, and we'll look at Tail Calls, Closures, and Other Fauna. Then it'll be time to visit our old friend, the Linux kernel. Thanks for reading!