Untangling Obamacare's Web Glitches
The White House has released the most recent numbers on visitors to healthcare.gov -- 4.7 million unique visitors, reported Jake Tapper on Twitter (presumably, he means since the Oct. 1 launch). Other accounts put the number of visitors on the first day at 2.8 million. According to Tapper, the White House claims not to know how many people have enrolled. The cynic in me wonders if that number isn’t in the single digits.
No one I know has managed to create an account on the federal exchanges, as opposed to those operated by the states. Either they’re stopped at a “please wait” page that never moves them on to enrollment, or they get to the enrollment page and are presented with a drop-down list of security questions -- an empty drop-down list. I tried putting random words in the boxes and hit "enter"; that brought me to a screen announcing that an account couldn’t be created at this time. Most people, though, got stopped at the security questions. Apparently some folks got through, only to be routed back out to the beginning of the account-creation process.
I put the question to one of my favorite programmer friends, one with quite a bit of experience working on high-traffic web applications. What the heck could be going on? My friend stated the obvious: “It's clear that they're getting more traffic than they can handle. The question is why they can't handle the traffic they're getting.” Load problems could explain servers hanging in California and New York … but the drop-downs? The standard explanation for this is “high load,” but high server loads don’t cause your security-question drop-downs to empty out.
“The drop-down thing is mystifying,” he told me. If the federal exchanges decided to populate the security question fields by calling up a list of possible questions from another server -- one that didn’t have a lot of capacity -- then that might be causing the sign-up process to stall at that step. For an application that expects a lot of traffic, this is a very bad idea.
“Just cache them on the front ends, for heaven's sake, so you only need to ask once,” he said. “A database call to get questions shouldn't be in the critical serving path. If you're hitting the database just to load the security questions, then just serving individual pages is going to be expensive.”
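To make the "ask once" idea concrete, here is a minimal sketch in Python. It is an illustration only: nobody outside the project knows how healthcare.gov is actually built, and the names `QuestionBackend` and `CachedQuestions`, along with the sample questions, are invented for this example. The point is simply that a front-end server can fetch the list a single time and then serve it from memory.

```python
import threading


class QuestionBackend:
    """Hypothetical stand-in for a slow, low-capacity server or database."""

    def __init__(self):
        self.calls = 0  # counts how often the backend is actually hit

    def fetch_questions(self):
        self.calls += 1
        # Sample data, invented for illustration.
        return [
            "What was the name of your first pet?",
            "In what city were you born?",
        ]


class CachedQuestions:
    """Fetches the question list once, then serves it from memory."""

    def __init__(self, backend):
        self._backend = backend
        self._lock = threading.Lock()
        self._questions = None

    def get(self):
        # Fast path: no lock and no backend call once the list is loaded.
        if self._questions is None:
            with self._lock:
                # Re-check inside the lock so concurrent first requests
                # still trigger only one backend call.
                if self._questions is None:
                    self._questions = tuple(self._backend.fetch_questions())
        return self._questions


backend = QuestionBackend()
cache = CachedQuestions(backend)

# Simulate 1,000 page views of the sign-up form.
for _ in range(1000):
    cache.get()

print(backend.calls)  # 1
```

A thousand simulated page views produce one backend call instead of a thousand, which is the difference my friend is describing.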
The various glitches, he pointed out, “could very easily be because deadline pressure caused them to take some shortcuts that impacted their ability to scale.” One example of such a shortcut: “The aforementioned let's-hit-the-database-for-security-questions thing.”
Why would they use such a seemingly obviously poor design?
“It can be easier to make a call to another server to get something when you need it than to implement a cache that you prepopulate either from static files or from the database on startup. Making a call to another server is also something you'd naturally think to do if you hadn't had to focus on scalability before. The security question page is probably not the thing you're most concerned about, so you give it to the new hire to do as their starter project. They don't know what they're doing, so they implement it the straightforward way … and since you're under unbelievable deadline pressure to get something working now, nobody reviews it in detail.”
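The alternative he mentions, a cache prepopulated from a static file on startup, might look something like the following Python sketch. Again, this is a guess at the general shape and not the exchange's real code: the file name `security_questions.json`, the functions `load_questions` and `render_dropdown`, and the sample questions are all assumptions made up for illustration.

```python
import html
import json
import os
import tempfile


def load_questions(path):
    """Read the question list from a static JSON file at startup."""
    with open(path, encoding="utf-8") as f:
        questions = json.load(f)
    # Fail loudly at startup rather than serve an empty drop-down later.
    if not questions:
        raise ValueError("security question list is empty")
    return tuple(questions)


def render_dropdown(questions):
    """Build the <option> tags from the in-memory list; no network call."""
    return "\n".join(f"<option>{html.escape(q)}</option>" for q in questions)


# For the demo, write a sample file; a real deployment would ship this
# file alongside the application (hypothetical file name and contents).
sample = [
    "What was the name of your first pet?",
    "In what city were you born?",
]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "security_questions.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    # Loaded once, when the server process starts.
    SECURITY_QUESTIONS = load_questions(path)

print(render_dropdown(SECURITY_QUESTIONS))
```

With this approach, serving the sign-up page never depends on another server being up, and a missing or empty list shows up as a startup error rather than as a blank drop-down in front of users.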
Obviously, we don’t know if this theory is correct -- but it does fit the particulars. If this is the problem, that’s relatively good news for the federal exchanges, since it shouldn’t take that long to fix -- according to my friend, “the coding change should be relatively minimal. Say a day's work for someone who knows what they're doing.” But he cautioned that this is wild speculation, since he doesn’t know what system they’re using, or what their change order and quality-assurance procedures are. Or whether they’ll decide that they absolutely need to hit a database for the security questions, and therefore need to scale up that database so it can handle the traffic load.
On the other hand, since almost nobody has gotten into the system, we don’t know if other bits will break when more than a few people try to use them. So it’s impossible to say whether the federal exchanges will actually be open for business in short order, or breaking down for weeks to come.