Understanding and Visualizing Covid Growth in the US

This post takes a look at Covid data with a particular focus on the number of new daily cases and the growth (or reduction) of those daily cases over time. If this were physics, we’d be looking at speed and acceleration, rather than the total distance traveled. I won’t try to convince you of anything, but rather just try to build an understanding of where we’ve been, where we are, and what to expect in the next few months.

Let’s start with the growth in daily cases for US states since March 10th, for states reporting at least 20 cases:

Each dot represents the growth in the number of new daily cases for a US state on a given day. I discuss methodology further at the end of this post if you’re interested. [1]

We can clearly see a few crucial trends in this chart. Growth was furious for all states in mid-March (20% daily growth means doubling in 3.8 days, as you’ve surely heard) and showed a lot of variance. Then nearly all states issued stay-at-home orders between March 23rd and April 3rd. [2] These orders, no doubt coupled with some amount of anxiety and precautions from the population, quickly reduced growth rates, which were clustered around 0% by mid-April. This was a significant accomplishment. Sadly, we were unable to improve from there, and never brought growth figures consistently or substantially below zero. Here’s the same data seen by week in a slightly different way:

We started out red-hot and worsening in mid-March, but that gave way to slower growth and calmer colors. The initial success was followed by stagnation, and a slight worsening in the last two weeks. Let’s look at our nationwide figures:

New daily cases peaked on April 10 in the US at about 32,000 cases/day. They have since fallen to 21,500 cases/day. [1] Growth peaked at 40% on March 24, shortly before the lockdowns started, then fell sharply, hitting 0% on April 15.

Now consider this: we had about 7,000 cases/day on March 25, as we headed into lockdowns, and we have 21,500 cases/day now, as we are leaving them. That might feel a little disheartening. What happened? Was there any point to this whole thing? Did we just destroy countless jobs, businesses, and dreams for no good reason?

There are three good answers here. The first is that the precipitous fall in growth brought about by the lockdowns was a major win that probably averted total disaster. However, unless you look at a plot of growth rates, or at least look at daily cases and appreciate the trend, this win is somewhat hidden. I hope the charts so far have done a decent job of showing this aspect of our journey.

The second is that the lockdowns were indeed somewhat pointless. Not because they are inherently so, but because we’ve done a bad job and failed to significantly bring down case numbers while we had a perfect opportunity to do it. We bought the lockdown with trillions of dollars and untold sacrifice, and then squandered it.

The third answer is that we have to consider states separately to really analyze the situation, because national data is just too blunt. States had varying levels of success and peaked at different times, and to understand what worked we need to factor that in.

Let’s look at what other countries achieved with their lockdowns:

Those curves show the kind of drastic reduction in the number of daily cases that well-organized societies can achieve. They are able to push growth significantly below zero and keep it there long enough to bring case numbers down an order of magnitude or more. A smaller outbreak is then more amenable to containment by well-designed policies while economic and social activity is restored.

Let’s look at more countries for better context. Here are the ten countries most successful at containing the pandemic from a peak of at least 70 cases:

I have excluded China from the list due to controversies around their data. They would have been 4th place with a 99.8% reduction from a peak of 4,687 cases/day. We see some islands in there, some smaller populations, and also small peaks. It’s worth pointing out that neither islandness nor a small population is any guarantee, as the history of smallpox in Iceland can attest. [3] Still, countries like Switzerland and Austria vanquished pretty large outbreaks and are not islands last I checked. Social cohesion and good policies seem like the overriding factors. But let’s look at a more diverse group of places:

Sweden is the only wealthy country in this list doing worse than the US. This was not cherry picked: that remains true when you look at the whole world, where the US ranks 62nd by this metric. In the last week Sweden’s top epidemiologist has admitted mistakes in their strategy. [4] [5] However, the overall number of infections is low in Sweden, and their growth has been kept mostly in check, never spiraling out of control. They are a highly conscientious society that took a daring (and often misrepresented) approach with a clear understanding of the trade-offs involved.

The situation in the bottom countries is catastrophically different. They all have strong growth of already sizable outbreaks, with Brazil in an especially dire situation, no doubt the worst in the world, having recently overtaken the US for the top spot in daily cases amid continued growth. Their president is now attempting to censor Covid numbers, and it’s possible Brazilian data will no longer be reliable over the next few weeks. [6]

Even if we ignore any mistakes made before mid-March, it is clear from this data that the US has not done a great job containing the pandemic. Despite remaining in a fairly strict lockdown for weeks, we performed worse than all but one rich nation in reducing case numbers. But let’s not yet worry about whether we’re a failed state or have been made great again.

After all, the US is a large and heterogeneous place, and looking at national aggregate data obscures a lot of the story. States like Alaska, Montana and Wyoming never had more than 25 cases/day, while New York reached 9,900 cases/day, a peak greater than every nation’s except for Brazil and Russia. Having seen what other countries look like, here is what happened in US states with a peak of at least 70 cases/day:

A handful of states managed substantial reductions in daily cases, including New York, which had by far the largest outbreak in the US. That’s cup half full. Still, at a 91.3% decrease, New York is behind most developed countries. It is striking that none of our states have managed to do as well as Spain, Italy, or Germany when it comes to reducing case numbers.

And then there are the states at the bottom of this list. When you see 0% that means no reduction: these states are currently at their historical maximum and growing, and we don’t know when and where they’ll peak.

Keep in mind the decreases in the chart above show the reduction in each state’s daily cases measured against its own peak. To get an idea of how states changed since the national peak, and how the outbreak decreased in some areas and increased in others, here are the most substantial deltas in daily cases by state since the US peaked on April 10th:

Since we peaked nationally on April 10, we have reduced daily cases by about 10,500/day, with most of the reduction coming from New York (9,000 cases/day) and New Jersey (3,000 cases/day). It might strike you as odd that the national decrease (10,500 cases/day) is smaller than the decrease from just New York and New Jersey (a combined 12,000 cases/day). And sure enough, if we exclude those two states, daily cases have actually increased in the rest of the US since our national peak. Without NY and NJ, on April 10 we were at 18,300 cases/day, then we peaked on May 6 at 21,400 cases/day, and are now at 20,000 cases/day: up about 9% since April 10, and down only about 7% from that later peak.
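For readers who want to check that arithmetic, here is a minimal sketch using only the rounded figures quoted in this post. The helper name `pct_change` is mine, not something from the original analysis, and because the inputs are rounded the results are approximate.

```python
def pct_change(old, new):
    """Percentage change from old to new (negative means a decrease)."""
    return (new - old) / old * 100


# Rounded 7-day-average figures quoted in the post, in cases/day.
national_apr10, national_now = 32_000, 21_500
rest_apr10, rest_may6, rest_now = 18_300, 21_400, 20_000  # US without NY and NJ

# Whole country since the national peak: a clear decrease.
national_change = pct_change(national_apr10, national_now)

# Rest of the US since the national peak: an increase.
rest_since_national_peak = pct_change(rest_apr10, rest_now)

# Rest of the US measured against its own, later peak: a small decrease.
rest_since_own_peak = pct_change(rest_may6, rest_now)

print(f"US overall since April 10:      {national_change:+.1f}%")
print(f"US minus NY/NJ since April 10:  {rest_since_national_peak:+.1f}%")
print(f"US minus NY/NJ since May 6:     {rest_since_own_peak:+.1f}%")
```

The point of separating the three numbers is that the sign flips depending on the baseline: the national figure falls by roughly a third, while the same period shows growth once New York and New Jersey are removed.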

So let’s talk about the future and make some predictions. Think about these two questions:

  1. How many states will see a daily cases peak at least 30% greater than any peak they’ve had so far?
  2. How many states will be forced back into lockdown?

Then consider these facts: compared to other developed nations we have done a much worse job reducing our outbreak; we did not use our lockdown period to develop comprehensive policies to fight Covid; we have not used leadership to galvanize the population to fight the pandemic and adopt practices that mitigate spread - quite the opposite, we have started a culture war around wearing masks, social distancing and whether to even take Covid seriously; many American leaders undermine mitigation by deed and word; even while in lockdown, we have only been able to achieve modest daily reductions in case numbers; people feel like they have done their duty and should now be able to resume life, being generally sick of hearing about Covid and all its controversies and conspiracies; places highly prone to spread, such as gyms, churches, and restaurants, will resume operations; domestic travel will resume so that any counties with larger outbreaks might seed those with fewer cases; finally, if daily growth increases even to a modest 5%, cases will double in two weeks under the inexorable march of exponential increase.

Offsetting these is the fact that a large part of the population is much more careful and attuned to the spread of Covid. Humans are remarkably adaptable, and maybe smart on-demand interventions at the county and state levels can curb local outbreaks.

Before answering those two questions, let’s take a look at the familiar case-and-growth plots for the 40 states with cases/day currently over 70:

Many of those curves don’t look great. Keep in mind some of the spikes we see mid-graph are due to specific incidents like outbreaks at a prison.

But enough of the charts, let’s try our hand at divination. Only eight states have managed a decrease of 70% or more in their daily cases (nine if we count Pennsylvania at 69.8%). These are the states most likely to keep things under control: most have seen a serious situation, all have been effective by US standards, and they are further down from their peaks. I’ll round up and say 10 states will avoid a greater peak in the future. The other 40 will see a peak at least 30% greater than their current peak. And of these 40, at least half will adopt lockdown measures before the end of the year that affect a majority of their population.

This is all the data I’ve got for now, but if you’ve read this far, you might as well stick around for a few broader considerations.

First, the trade-off between economic outcomes and epidemiological outcomes has become grossly overstated. The more infection we have, the more the economy will be affected as people shy away from economic activity. [7] A failure to intelligently fight Covid is an economic failure as well. Brazil is a sad example of this, as the out-of-control outbreak has wreaked havoc on the economy.

Almost every containment strategy - personal behaviors, contact tracing, widespread testing, effective quarantine of sick patients, etc. - ultimately benefits the economy. Every leader who has mocked or sabotaged Covid containment is hurting economic output. And plenty of economic activity can be encouraged with low risk, especially if smart mitigation is applied. Even where a trade-off seems obvious, say opening up restaurants without restrictions, things are not so simple: the net economic effect needs to account for the consequences of the greater spread of Covid, which unfortunately is very likely in restaurants.

The trade-off is much more direct when it comes to personal freedom. Church services are a perfect example. They are simultaneously: 1) prone to spreading Covid, 2) not responsible for a lot of economic activity, and 3) extremely important to a large part of the population.

Or to pick a different demographic, look at skiing in Colorado. Plenty of people here would be willing to risk infection in order to ski, yet this choice was denied to them. This may seem like a trivial sacrifice, but to many it is deeply meaningful. Skiing is a complex trade-off since it does involve a lot of economic activity and also enormous Covid risk, as we saw when tourists started various outbreaks in our ski towns. Yet there is also a strong personal freedom component embedded in it. It is interesting that the restrictions which most incensed Michigan protesters were related to personal freedoms, like the use of personal boats.

The moral calculus around Covid trade-offs is complex. Risk to self; risk to others you might infect; risk to society at large if we overrun the health system; how to weigh death against hardship, enjoyment, and freedom; how much we value the life of elderly people and those at greater risk of complications, and so on.

But there are a lot of actions and personal decisions that remain invariant no matter how you feel about trade-offs. God knows we are all sick of Covid, now that the novelty has worn off and this looks like a long haul. But stay as safe as possible, and for whatever degree of risk-taking you decide on, mitigate as much as possible.

I hope this has been useful and informative. Thanks for reading!


  1. All of the data for this post comes from either the European CDC or the New York Times state-level dataset for Covid. The Covid Tracking Project dataset has also been extremely helpful, but is not used for the charts here.

     I used 7-day rolling averages for all Covid figures. The county, state, and national reporting is very noisy, with frequent spikes and troughs. It also tends to be very sensitive to the day of the week, and particularly to weekends. The 7-day average smooths this out, with the nice benefit of capturing exactly one week, which further helps with the day-of-week variations. I also use a 7-day interval to compute growth. This again smooths out noise and allows for more meaningful comparisons. The growth factor is simply the seventh root of the ratio obtained by dividing the figure for day N by the figure for day N-7; the daily growth rate is that factor minus one.

     Whether to use cases, hospitalizations, or deaths is another interesting decision. Cases and deaths data is more robust and widespread. Deaths are a lot more sensitive to particularities of an outbreak: a high percentage of deaths is linked to elderly care facilities, for example, so it is possible to have high death figures that overstate the size of an outbreak. Deaths also depend on quality of care, and are far more delayed, frequently happening anywhere from 2 to 12 weeks after infection. Symptoms and detection of a new case are much quicker and vary less. I feel that to understand the dynamics of an outbreak, cases are more useful. Since these charts are all generated by code, I did an experiment using deaths instead of cases and the trends held up consistently, albeit delayed by 2-3 weeks.

     Cases are sensitive to the amount of testing being done. If the amount of testing is somewhat constant, and the percentage of detected cases is consistent, then at least the relative changes in the number of cases will be meaningful, even if they only capture a fraction of the total. But if testing is increased, this can show up as more daily cases, when in reality only detection increased. Looking at the percentage of positive tests vs. total tests can help detect that issue. I have used the data from the Covid Tracking Project, which does provide testing information, and also the figures for deaths, to see whether changes in testing play a big role in these trends. That does not seem to be the case looking at the data.

  2. U.S. state and local government response to the COVID-19 pandemic

  3. https://www.newyorker.com/magazine/2020/06/08/how-iceland-beat-the-coronavirus

  4. https://www.bloomberg.com/news/articles/2020-06-03/man-behind-sweden-s-virus-strategy-says-he-got-some-things-wrong

  5. https://www.theguardian.com/world/2020/jun/03/architect-of-sweden-coronavirus-strategy-admits-too-many-died-anders-tegnell#maincontent

  6. https://www1.folha.uol.com.br/equilibrioesaude/2020/06/governo-deixa-de-informar-total-de-mortes-e-casos-de-covid-19-bolsonaro-diz-que-e-melhor-para-o-brasil.shtml

  7. Morning Consult tracks how safe consumers feel and consumer confidence more broadly. It will be interesting to see the relationship between economic recovery and successful containment in various countries.
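
The smoothing and growth calculation described in note 1 can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the code that produced the charts: the function names are hypothetical, the average is a trailing one, and the growth rate is taken as the seventh root of the 7-day ratio minus one. The `doubling_time` helper covers the doubling figures quoted in the post.

```python
import math


def rolling_mean(values, window=7):
    """Trailing average over `window` days; None until a full window exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def daily_growth(series, interval=7):
    """Average daily growth rate over `interval` days.

    Computed as (x[n] / x[n - interval]) ** (1 / interval) - 1, and None
    wherever either endpoint is missing or the earlier value is not positive.
    """
    out = []
    for i, current in enumerate(series):
        previous = series[i - interval] if i >= interval else None
        if current is None or previous is None or previous <= 0:
            out.append(None)
        else:
            out.append((current / previous) ** (1 / interval) - 1)
    return out


def doubling_time(rate):
    """Days needed to double at a constant daily growth rate (e.g. 0.20)."""
    return math.log(2) / math.log(1 + rate)


# Synthetic series growing 20% per day, only to exercise the functions.
cases = [100 * 1.2 ** day for day in range(21)]
growth = daily_growth(rolling_mean(cases))
print(f"recovered daily growth: {growth[-1]:.1%}")
print(f"doubling time at 20%/day: {doubling_time(0.20):.1f} days")
print(f"doubling time at 5%/day:  {doubling_time(0.05):.1f} days")
```

Smoothing before computing growth means the first 13 days of any series produce no growth value, which is one reason the charts only start once a state has enough reported history.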

Covid-19 Data Sources for Programmers

I’ve been doing analysis of Covid cases to try to understand what to expect in terms of lockdown length and disease progress, especially in Colorado and Brazil, the places I spend the most time in. There are a lot of data sources around, and it took me a few hours to find and test a number of them. I hope this saves time for anyone interested in crunching Covid numbers. If you have suggestions and tips on data sources, please open a PR or issue in my GitHub repo. Here we go.

John Burn-Murdoch and his team at the Financial Times have done a great job reporting visually on the pandemic. They have fewer and simpler charts than many other sites but their charts are done exquisitely and distill a lot of data to provide you the clearest picture available of each country’s situation, plus a few of the regional hotspots around the world.

Our World in Data is a wonderful project based at Oxford University that attempts to explain the world using rigorous data sources and beautiful charts. They have been producing a lot of great Covid content since the pandemic broke out. If you have some time, I suggest exploring the non-covid areas of the site as well (and if you enjoy that, I highly recommend the book Factfulness). All of their work is open sourced.

The OWID data is in a GitHub repo. Their main source is the European CDC, which publishes confirmed cases and deaths aggregated by date and country for most of the world (not just Europe) in JSON, CSV, and XML files.

Johns Hopkins University has built a wildly popular dashboard available in desktop and mobile versions. Their repository is public and it aggregates data from a variety of sources into easy-to-use CSV files (there’s also a JSON mirror). In addition to worldwide national totals, data is available for individual US counties and states. It includes number of cases and deaths along with recovered and active patients. Since they aggregate data from the US CDC, China CDC, European CDC and several other national institutions, this is a great way to get your hands on worldwide data.

The New York Times offers a plethora of high-quality Covid maps and visualizations. It’s not a surprise Mike Bostock, creator of the D3.js library, used to work there. The NYT open sourced a repository providing high-quality and painstakingly verified data for US cases at both the state and county level. This is probably the best source of data for analyzing number of cases and deaths in the US.
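
As a starting point for working with the NYT state-level file, here is a hedged sketch using only the Python standard library. It assumes the file has the columns `date`, `state`, `fips`, `cases`, and `deaths`, with `cases` being a cumulative count, and that it lives at the URL below; both are my recollection of the repository layout and should be checked against its README. The function name `daily_new_cases` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Assumed location of the state-level CSV; verify against the NYT repository.
NYT_STATES_URL = (
    "https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv"
)


def daily_new_cases(csv_text):
    """Convert cumulative per-state case counts into new cases per day.

    Returns a dict mapping state name to a list of (date, new_cases) tuples
    in chronological order.
    """
    cumulative = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        cumulative[row["state"]].append((row["date"], int(row["cases"])))

    daily = {}
    for state, rows in cumulative.items():
        rows.sort()  # ISO dates (YYYY-MM-DD) sort chronologically as strings
        previous_total = 0
        series = []
        for date, total in rows:
            # Data corrections can make a cumulative count drop, so a
            # negative daily value is possible and is left for the caller.
            series.append((date, total - previous_total))
            previous_total = total
        daily[state] = series
    return daily
```

With network access, the text could be fetched with `urllib.request.urlopen(NYT_STATES_URL).read().decode("utf-8")` and passed straight to `daily_new_cases`. Keeping the parsing separate from the download makes it easy to run the same function on a saved copy of the file.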

Another outstanding US data source is the Covid Tracking Project, powered by dozens of volunteers attempting to collect data on number of tests performed, positive and negative results, hospitalizations, patients in the ICU and on ventilators, and so on. They face a severe dearth of information in the US and the complete lack of centralized reporting, but they’re making the best of it. If you want to attempt more sophisticated analysis, this is a good source. But mind the gaps.

I’m sure you’ve hit Worldometer while googling Covid information. They provide encyclopedic amounts of data about Covid infection worldwide through an effective bare-bones interface with good charts. Data is aggregated by country and includes deaths and active cases, both by day and cumulatively.

Finally, if you are interested in more regional data for other countries, there are great repositories for Spain and Italy. It’s not easy to aggregate UK data, but Tom White has a good repo. Álvaro Justen has done the same for Brazil, while research lab Fiocruz has a good web UI for Brazilian data.

If you know of other high-quality regional repos, please send a PR or GitHub issue. I’d love to expand this post with the best repos for each region.

iPhones, Armed Robbery, and Hacking

(Some security recommendations are summarized at the end.)

I. The Robbery

This past summer I was walking around in the neighborhood where I grew up, happy-go-lucky, when some guy jumped off a motorcycle pointing a gun at me. It was my first time at gunpoint, and from the outset the weapon was positively spellbinding. As I gazed at it, strange thoughts hit me: “Am I going to get shot by this rusty piece of shit? What a sorry way to die! And what if I get tetanus?”

Those were thoughts I wouldn’t have anticipated, but as Dan Carlin says, humans in extreme situations often behave unexpectedly. And while a gun-toting thug is a far cry from the Battle of Verdun, it is pretty extreme for me. This post tells the story of the robbery and its surprising information security developments. There are lessons here for both users and designers of technology.

Robbery scene

My daughter and I were visiting Brazil in July, taking a carefree walk in a boulevard lined with lush trees. She had just gotten into “good kid, m.A.A.d. city”, ironically enough an album about growing up in Compton amid dire violence. So we were deep in conversation about the US criminal justice system, drug laws, and the ideas of people like Bryan Stevenson and Michelle Alexander[1].

Growing up in Brazil you get a crash course in street smarts. I was mugged twice as a 10-year-old and once at 15. That’s counting only the times when stuff was actually taken. There were scores of near-muggings I dodged by either talking my way out or running my way out.

But after 20 trouble-free years, I let my guard down. Absorbed in conversation, I barely noticed the motorcycle driving on the other side of the street. By the time it veered the wrong way into traffic and sped towards the sidewalk we were on, it was too late. The passenger jumped out while the motorcycle was still moving, gun in hand pointed squarely at me.

The scene felt strangely removed - it’s cliche, but it really did feel like a movie. Instead of panicked confusion, there was a strong pragmatic voice in my head. I had thought about “what if” scenarios plenty of times before and they kicked in. Who is the attacker? What is their motivation? What’s the best course of action?

There are career criminals in Brazil who are downright professional. I know somebody whose house was invaded while they were home and the robbers let them know how long the “job” was going to take, offered them water, and made sure nobody freaked out. Better than some moving companies I’ve used.

But when someone is robbing random people on the street using a gun, that’s pretty far from professional. Way too volatile a situation with huge risks and beggarly payoff. These were at best lowlifes and at worst jittery crackheads. I felt two strong imperatives. First, keep the situation as absolutely relaxed as possible. When they get nervous, they get scared. And when they get scared, that’s when I accidentally get shot. But second, and more importantly: if they want to kidnap my daughter, fight it at any cost whatsoever. Better to die on that sidewalk than to let them take her.

I remember thinking, “take a deep breath, raise hands slowly, move smoothly, stay relaxed, hand everything over.” It worked. Who knows, maybe Andy from the Headspace app saved my life. The bandits were gone, along with our two iPhones and my watch. But the real fun was still to come.

After we got home, I logged into iCloud and put both phones in lost mode. They had been turned off, predictably. Plenty of crooks have been caught by way of “Find iPhone,” but they’ve learned by now. Thinking of my data in criminal hands was uncomfortable, but the fact that iOS exploits sell for $1.5 million made me feel a lot better. No small-timer is breaking into an iPhone. I figured they would wipe it out and sell it.

I have two-factor authentication in all the accounts that matter, and whenever possible my second factor is an iPhone app that generates time-based one-time passwords (TOTP) for authentication. Google Authenticator is a popular app for this, but I use OTP Auth instead because it is more flexible (more on this in the recommendations). Here’s what it looks like, slightly sped up to make it more exciting:

Time-based one-time passwords

When it’s time to log into one of your accounts, you provide your login and password as you normally would, plus the temporary code being shown by the app.
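
For the curious, the temporary code is not magic. Below is a minimal sketch of the standard TOTP algorithm (RFC 6238 on top of RFC 4226, using HMAC-SHA1, a 30-second step, and 6 digits). I am not claiming this is how OTP Auth or Google Authenticator is implemented internally; the function name and parameters are mine, and real apps usually store the shared secret base32-encoded, whereas this sketch takes raw bytes.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at_time=None, step=30, digits=6):
    """Compute a time-based one-time password from a shared secret (bytes)."""
    if at_time is None:
        at_time = time.time()
    # Number of whole time steps since the Unix epoch.
    counter = int(at_time) // step
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the last `digits` decimal digits, left-padded with zeros.
    return str(value % 10 ** digits).zfill(digits)


# Shared secret from the RFC 6238 test vectors, not a real account secret.
print(totp(b"12345678901234567890"))
```

Because the code depends only on the shared secret and the current time, the server can compute the same value independently and nothing needs to travel over SMS, which matters for the rest of this story.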

I also use a password manager with unique, long passwords for each site. So my main concern at that point was minimizing the impact of this whole thing on my kid. We had dinner planned with friends, tasting menu at a good Japanese place, so I thought it best to go, have a good time, laugh and hopefully cushion the blow. Later I could call T-Mobile and suspend the cell lines.

II. The Hacking

A couple of hours later we were back, much happier, imbued with friendship and, in my case, plenty of sake. I opened Gmail and got some shockers:

Facebook password reset email

Wuh-wait what? I wasn’t expecting to see any of these, but least of all the Facebook password reset. Before you read on, take a good look at those emails. It’s fun to work out what happened here. Done? Let’s dig in:

Facebook password reset email

Whoa! Facebook password reset by phone number? How? Did they unlock my phone? But also… why? At once I felt the sinking realization that I had misread the situation. They seemed to be more sophisticated than I thought - not the motorcycle crew themselves, but someone else in the operation (his identity would be unmasked later that night).

The idea that somebody was hacking into my accounts right at that moment, with my phone in hand, was deeply unsettling. A malevolent twist to the emotional roller coaster of that evening. But this was a technical problem, so it was time to sober up as best as I could and work methodically.

The “how” was simple. The attacker took the SIM card out of the stolen iPhone and put it in another phone. At that point he found out my phone number, whereas previously he had no information on me. More importantly, he could also receive my SMS text messages. He then attempted to log into Facebook using my phone number as a login, clicked on “Forgot Password,” and reset the password via SMS.

So here is a big screw up and a couple of lessons. As I said, I have 2FA (two-factor authentication) in the accounts “that matter.” But I rarely use Facebook, so I didn’t enable 2FA there. Oops. Turns out it’s not such a great idea to have an account in the world’s most popular app as a weak link in your defenses.

Now consider Facebook’s account recovery policies. If the account has 2FA enabled, passwords can only be reset by email. That’s good. But without 2FA, if an account has an associated phone number, the password can be reset via SMS. In such a case, a SIM card is an instant ticket to the account: find it and reset its password in one fell swoop.

That’s a disaster. Facebook single-handedly provides a way for attackers to go from a SIM card, or hijacked SMS messages, to a trove of personal information for the vast majority of people out there. By contrast, attackers made zero progress in hacking my kid’s accounts, mostly because she doesn’t use Facebook.

But why the hell was this wretch logging into my FB account? I suspect it wasn’t for my cousin’s mad political rants. Already shaken from the armed robbery, my mind played tricks on me as paranoid thoughts of identity theft and fraudulent bank transfers loomed.

I immediately logged into t-mobile.com and suspended both cell lines, disabling the attacker’s main weapon. As an aside, T-Mobile has been great for international travel. I love you guys, keep your website safe. I tested sending SMS messages to my suspended numbers and happily all attempts generated errors.

On to the other emails. The Facebook password reset arrived at 9:46pm Brazil time. Curiously, at 11:23pm they briefly turned my phone back on with its SIM card, and the phone went into lost mode and flashed on Find iPhone here. But then there is that fourth email with a subject line of “iPhone SE 64GB Silver Was Found!” arriving at 12:20am. Here it is:

iCloud phishing email

The phone model and storage capacity are exactly right. The spelling, grammar, and layout are pretty well done. It was sent to my primary personal email, lifted from Facebook. Imagine a regular user receiving this right after their phone has been stolen, while they’re somewhat shaken, and when they’re not native English speakers to boot. What are the odds they’ll realize this is a phishing attempt for iCloud credentials?

Apart from checking the URL, the biggest clue is the exclamation mark in the subject line, a little too enthusiastic for Apple. Either way, this is a nearly perfect phishing piece, made more so by impeccable timing. Maybe iCloud accounts should be placed in some sort of restricted state after a device is put into lost mode.

It’s stressful to face an ongoing, targeted, personal attack. Deep breath again. Time to methodically check every account for suspicious activity, change passwords just in case, and recover compromised accounts. My main Google account, protected by 2FA, was safe throughout the ordeal. I reset my Facebook password by email and got back in. GitHub, AWS, and other professional accounts were also on 2FA and had no unauthorized activity. Audit logs never tasted so sweet. It was a relief knowing I wouldn’t have to tell clients, “Hey, how are you? Great! So, listen, this iPhone thieving ring probably has all your data, isn’t that funny? Hah! But never mind that! Those Bitcoin prices, huh?”

Then I tried a secondary Gmail account I use for some mailing lists and other non-critical tasks. You know… the kind of account for which one might leave 2FA disabled. Sure enough, the wretch had been there, and the password was changed via SMS password reset. And he only found the account by the phone number in the first place. Familiar? Here’s a quick recap:

SMS hacking diagram

This Gmail account did not have a recovery email set up, and ironically I couldn’t use SMS anymore. Google offers a recovery algorithm where you try to answer different questions with the ability to “Try another way” if you don’t have a particular answer (quick: in which month and year did you create your Google account?). I was locked out for a while, long enough to start thinking I had lost the account, but eventually produced a couple of answers and got back in.

Finally, all of my accounts were safe again. It was getting late, but I had to find out why this person was frantically probing my accounts, and maybe, with some luck, who they were.

I knew the data in the iPhones was safe, as per the Apple vs. government showdown after the San Bernardino terrorist attack. But earlier I had assumed the phones could still be wiped clean and used normally. But maybe they couldn’t, and this whole rigmarole was about breaking into my iCloud account. Hence the phishing.

A quick search confirmed the idea. Since iOS 7, released in 2013, Apple has provided the Activation Lock feature, whereby if a device is linked to an iCloud account, activating it requires the password to that account. This has created some misery among people buying and selling used iPhones: if the seller forgets to unlink the device from their iCloud account, the device is bricked until they do so.

A warm wave of righteous schadenfreude washed over me: all the robbers had were parts! They would fetch little money from this whole thing, especially since my kid’s screen was cracked. You go, kid! Glad I hadn’t replaced it yet. Also glad for activation lock, though perversely my digital torment was its side effect: the world is complicated. It turns out there was no sinister plot, just a miserable scheme for a few hundred dollars. Straight to the depths of hell is where those cowards are going.

It was sobering to realize the attacker almost succeeded. Up until a few months prior to the robbery, my iCloud account did not have 2FA enabled and it used the compromised Gmail address as the recovery email. If the robbery had happened then, they would have been able to get in, unlink the phones, and sell them at full (used) price. They would have changed my iCloud password in the process, and might have erased and locked my other Apple devices for the hell of it, which would have been disastrous and possibly ruinous, depending on whether I could get back into the account. Whatever little data I have in iCloud would have been stolen as well.

I hope this motivates you to enable 2FA on all of your accounts, even the unimportant ones. They can interact in incremental and unexpected ways to become your undoing. Moreover, using TOTP apps as the second factor is far safer than SMS.

Apple has done a fantastic job with iOS security and Find iPhone, curbing everything from malware to exploits to theft. But further improvements can be made to better protect its customers. In the next post you’ll see week-long sustained hacking attempts and meet the maggot behind the attacks, operating in a wretched hive of “iPhone unlockers.”

III. Recommendations

  • Make sure your accounts cannot be hacked via text message (SMS) password resets. You can often do so by enabling two-factor authentication (2FA) for an account, particularly if you use a time-based one-time password (TOTP) app as the second factor. Two such apps are Google Authenticator and OTP Auth. You could also withhold your phone number from certain accounts. Another advantage of TOTP is that if you’re unable to receive SMS messages for whatever reason, you can still log in.

  • Beware of your unprotected “less critical” accounts. They might provide a path to your sensitive ones.

  • If you decide to go with a TOTP app, choose one that allows you to make an encrypted backup of your account secrets. OTP Auth provides that along with encrypted iCloud sync, all optional and controlled by the user. Authy is another good option. If you use Google Authenticator, make sure losing your phone won’t lock you out of any accounts.

  • If you design apps, be careful with password resets via SMS. SIM cards are an easy target, cell providers are subject to social engineering that could lead to intercepted messages[2], and SMS notifications can be seen on lock screens in most phones. Allow users to choose TOTP as a second factor.

  • If your iPhone is lost or stolen, go to iCloud.com immediately, put it in lost mode, and provide a phone number where you can be reached. Once you’ve done that, you might want to temporarily suspend your phone line (many carriers offer this on their websites). If you do so, you can no longer call your own phone, and unless it’s on wifi, you also can’t “Play Sound” or “Erase iPhone” via iCloud - keep that in mind. On the upside, nobody can use your SIM card to hack your accounts via SMS password reset. It’s a trade-off.

  • If your iPhone is definitely stolen, rather than lost, it will probably appear off in iCloud. Put it in lost mode anyway. If you provide a phone number, know that it might be targeted for iCloud phishing or social engineering as crooks try to hack into your iCloud account (that’s why the attacker briefly turned my phone on: to get a phone number to target). You almost surely want to suspend your cell service immediately. You lose the tracking and other goodies, but thieves generally know to keep the phone off, and handing them a working SIM card is fraught with peril. Tread carefully and may the force be with you.

  • You might want to protect your SIM card with a PIN. This requires you to enter the PIN whenever you turn your phone on. Attackers are thus unable to transplant your SIM card to another device and use it. However, if you lose your iPhone and the battery dies, or the person who finds it turns it off, it’s game over. Even if the phone is later turned on, it won’t connect to the Internet, enter Lost Mode, show a phone number where you can be reached, or “Play Sound” (this is true even if a known wifi is in range[3]). A phone that otherwise might have been found could be bricked and lost forever.

  • Beware of “Your iPhone was found” emails, text messages, and WhatsApp messages. Scammers attempt to phish for your iCloud credentials in devious ways soon after an iPhone is stolen. If you provided a lost mode phone number, thieves will attempt to use it against you while trying to break into your iCloud account.

  • Read Tech Solidarity’s security guide. It’s overkill for a regular user, but know the rules before breaking them.

Thank you for reading.


  1. If you’re interested in US criminal justice, Ghettoside is a great book with better-than-fiction LA detective stories interwoven with a serious discussion of criminality, murder clearance rates, and other pressing topics. The New Jim Crow by Michelle Alexander is an interesting read on mass incarceration, while Bryan Stevenson’s Just Mercy offers a piercing look at the injustices we sometimes create. Ezra Klein has a good interview with Stevenson. Glenn Loury’s interview with Sam Harris offers a somewhat different perspective. ↩︎

  2. VICE reported on a T-Mobile website bug that leaked personal data based on phone number alone, giving social engineers a leg up. But this type of attack has worked against multiple carriers all over the world. Prominent hacks have happened this way. ↩︎

  3. You can try this at home: turn your iPhone off and back on. Until the passcode is entered at least once, it won’t connect to wifi. See this. If you have a better link, please let me know. ↩︎

Goto and the folly of dogma

Many programmers are surprised to find out that the goto statement is still widely used in modern, high-quality codebases. Here are some examples, using the first codebases that come to mind:

| Repo | goto usages | Ratio to continue |
|---|---|---|
| Linux kernel | 150k | 6.27 |
| .net CLR | 5k | 2.13 |
| git | 960 | 0.76 |
| Python runtime | 5k | 16.9 |
| Redis | 554 | 2.14 |

The ratio to usages of the continue keyword is provided to normalize for lines of code and the prevalence of loops in the code. This is not limited to C codebases. Lucene.net, for example, has 1,511 goto usages and a ratio of 3 goto usages to each continue usage. The C# compiler, itself written in C#, clocks in at 297 goto usages and a 0.22 ratio.
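To make the metric concrete, here is a rough sketch of how such a ratio could be computed for a chunk of C source held in memory. This is not the method used to produce the numbers above, which isn't described here; the function names `count_keyword` and `goto_continue_ratio` are made up for illustration, and the sketch makes no attempt to skip comments or string literals.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if c can be part of a C identifier. */
static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/*
 * Counts occurrences of `keyword` in `src` as a whole token, so that
 * "goto" does not match inside "goto_label" or "my_goto".
 * Simplification (assumption): comments and string literals are NOT skipped.
 */
size_t count_keyword(const char *src, const char *keyword) {
    size_t klen = strlen(keyword);
    size_t count = 0;
    const char *p;

    if (klen == 0)
        return 0;

    for (p = strstr(src, keyword); p != NULL; p = strstr(p + klen, keyword)) {
        int starts_token = (p == src) || !is_ident_char(p[-1]);
        int ends_token = !is_ident_char(p[klen]);
        if (starts_token && ends_token)
            count++;
    }
    return count;
}

/*
 * Ratio of goto usages to continue usages in `src`.
 * Returns -1.0 when there are no continue statements to divide by.
 */
double goto_continue_ratio(const char *src) {
    size_t gotos = count_keyword(src, "goto");
    size_t continues = count_keyword(src, "continue");

    if (continues == 0)
        return -1.0;
    return (double)gotos / (double)continues;
}
```

To cover a whole repo you would read each source file into a buffer, sum the two counts across files, and divide once at the end rather than averaging per-file ratios.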

People who take “goto is evil” as dogma will point out that each of these usages could be rewritten as gotoless alternatives. And that’s true, of course. But it will come at a price: duplication of code, introduction of flags, several if statements, and overall added complexity. These are highly reviewed codebases written by talented people. When they use goto, it’s because they find it to be the simplest approach.
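As an illustration, here is a minimal sketch of the most common pattern: a single cleanup path at the end of a function, a style the Linux kernel's coding guidelines recommend. The function `count_lines` and the `CHUNK_SIZE` constant are hypothetical, invented for this example rather than taken from any of the codebases above.

```c
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096  /* arbitrary read size chosen for this sketch */

/*
 * Counts newline characters in the file at `path`.
 * Returns the count, or -1 on any failure.
 * Each resource is released exactly once, in reverse order of
 * acquisition, by jumping to the matching cleanup label.
 */
long count_lines(const char *path) {
    long result = -1;   /* pessimistic default, overwritten on success */
    long lines = 0;
    char *buf = NULL;
    FILE *fp = NULL;
    size_t n, i;

    fp = fopen(path, "rb");
    if (fp == NULL)
        goto out;           /* nothing acquired yet */

    buf = malloc(CHUNK_SIZE);
    if (buf == NULL)
        goto out_close;     /* only the file needs releasing */

    while ((n = fread(buf, 1, CHUNK_SIZE, fp)) > 0) {
        for (i = 0; i < n; i++) {
            if (buf[i] == '\n')
                lines++;
        }
    }
    if (ferror(fp))
        goto out_free;      /* read error: release both resources */

    result = lines;

out_free:
    free(buf);
out_close:
    fclose(fp);
out:
    return result;
}
```

A gotoless version has to either repeat the `free` and `fclose` calls at every early return or nest each step inside the success branch of the previous one. With two resources that is tolerable; with five or six, the goto version tends to stay flat and readable while the alternatives do not.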

This is exactly how dogma hurts software development. We take a sensible rule that works most of the time and promote it to sacred edict, deeming violators as inferior programmers, producers of unclean code. Thus something that would have been a helpful guideline becomes a hard constraint. Pile up enough of these, and code that could have been simple ends up in a tangled mess, all in the name of “purity.”

We have a long tradition of dogmas, but goto is the seminal example, denounced in Edsger Dijkstra’s famous letter, Go To Statement Considered Harmful. Just barely over a page, it’s a good case study. The letter is good advice in the vast majority of cases: misuse of goto will quickly land you in a maze of twisty little passages, all alike. Less helpful were the creation of a social taboo (goto is the province of inferior programmers) and the absolutist calls for abolition. Dijkstra himself came to regret how “others were making a religion” out of his position, as quoted in Donald Knuth’s more level-headed paper, Structured Programming with go to Statements.

Taboos tend to accrete over time. For example, overzealous object-oriented design has produced a lot of lasagna code (too many layers) and a tendency towards overly complex designs. Chasing semantic markup purity, we sometimes resorted to hideous and even unreliable CSS hacks when much simpler solutions were available in HTML. Now, with microservices, people sometimes break up a trivial app into a hard-to-follow spiderweb of components. Again, these are cases of people taking a valuable guideline for an end in itself. Always keep a hard-nosed pragmatic aim at the real goals: simplicity, clarity, generality.

When Linus Torvalds started the Linux kernel in 1991, the dogma was that “monolithic” kernels were obsolete and that microkernels, a message-passing alternative analogous to microservices, were the only way to build a new OS. GNU had been working on microkernel designs since 1986. Torvalds, a pragmatist if there ever was one, tossed out this orthodoxy to build Linux using the much simpler monolithic design. Seems to have worked out.

Every programmer pays lip service to simplicity. But when push comes to shove, most will readily give up simplicity to satisfy dogma. We should be willing to break generic rules when the circumstances call for it. Keep it simple.

Grokbit

TLDR: I launched Grokbit, a code search and browsing tool.

When I was programming as a kid, I longed for a hardcore codebase, like a compiler or operating system, to really understand computers. That stuff just sounded so magical, some kind of Elvish secret way beyond mortals. There were decent books explaining how things worked, but that’s a poor substitute for code.

Then the Internet reached Brazil when I was about 14, and all of a sudden there was this “GNU C” compiler that was allegedly better than Microsoft’s, and the code was completely open! And what’s more, there was an open Unix you could run on your 386! No need to convince your parents to sell the car and buy a SPARCstation!

This was the best present any kid could hope for. This is why, despite a lot of issues in the tech community, my gratitude to these people is overwhelming. You might think Stallman is a lunatic, and you might be right, but damn - he’s the geek black-bearded Santa who brought the source to the children.

Now, reading this code was hard. The kernel, in particular, is tough to follow because entry points and flow of execution are unclear. There was no tmux, I didn’t know Vim, my regexes were weak. So I printed out a whole bunch of code on my parents’ dot matrix printer, and spread it on the floor. A poor man’s multi-pane Vim session. Here’s an interrupt handler, there’s a “bottom half”, and hey, look!, the syscall is right over there by the socks.

I think reading code is second only to writing code in making you a better programmer. When I write stuff like Anatomy of a program in memory or What does an idle CPU do?, a big part of the kick is sharing what I think are beautiful designs with people who haven’t seen them before.

Still, I always wished I could give readers a better interface to actually dive into the code. But our tools for handling an entire code base, especially in the browser, are just not good enough at the moment. Searching also needs a lot of improvement: I think we can do better than regexes and general full-text searching when it comes to searching code.

That’s why I built Grokbit. It has an indexing and search engine that’s entirely tailored to code, so you can search semantically, like “give me the definition of foo” or “search for an identifier named bar”. It’s also wicked fast: you get real-time suggestions even in the largest repos I could try.

But searching was only half the battle. Having a rich UI, especially a multi-pane one, was an absolute requirement for me. When you have function A calling B calling C, it can be enormously helpful to have all three on the screen at once and navigate seamlessly between them. I also wanted to be able to load multiple large files, have back/forward in the browser work well, and get an overall smooth UX.

So that’s what I went for with Grokbit. It’s still crude, lots of low hanging fruit, but it has already been very useful, as you’ll be able to tell in future blog posts. But before I put more weekends into it, I’d like some feedback. Try it out, let me know what could be better, or which search features I should build sooner (many easy wins here). I hope it’s as fun to use as it was to build.

Finally, if you are interested in working on the project, reach out. I don’t know if this will become a SaaS app, or a feature in another product, or an open source project, but I am hellbent on solving this problem. Let’s carry the ring into Mordor.