In which we change some things in the database, and it's not exactly a simple task, and some things break. Lessons learned. etc.
tl;dr: SparkFun was down for a while last night, while we worked on some database stuff. We should get better at uptime, but right now we're mostly trying to get better at databases. This morning we realized we'd broken currency conversion, and thus the display of prices, on the site. Last night's downtime message / web nerd musings and some post-mortem on what we messed up follow the llama, for those interested.
So tonight, sparkfun.com's going to be down for a while. We're hoping for no more than an hour or two, although it could very well run longer.
This is where I concede that we could really get better at keeping the site up all the time than we are right now. We're working on that bit, but right now what I want to ramble about instead is databases.
Since approximately the dawn of time (which is to say since some time last decade, before I worked here), SparkFun has run the core of its business on MySQL (and recently MariaDB). When I got hired in the fall of 2007, a lot of the important code was still split between a hacked-up osCommerce installation and a pile of MS Access clients. That stuff has been gone for years now, but the database itself persists. In places it's practically a fossil record of offhand design decisions made by other people writing a shopping cart some time in 2003.
This probably happens everywhere. Application code may change drastically, but in a working shop, the data store has a way of becoming the scaffolding around which (for better or worse) everything is built and must be maintained.
As it happens, when we first realized that we needed a better toolset than not-very-object-oriented old-school PHP, we made some decisions with long-term consequences:
A lot of things informed those decisions: We didn't want to have to replace all the working software at once just to use better tooling for new code. Everyone on our team already knew PHP. The frameworks we looked at seemed lacking in comparison to the stuff we'd used in languages like Ruby and Perl. The available PHP ORM libraries were either heavier than we wanted or built on assumptions that didn't fit our existing (frequently terrible) schema. Finally, rolling our own sounded kind of interesting, and we had no idea what we were getting ourselves into.
(Ok, so in the interest of full disclosure, what happened is that I talked louder than people who knew what we were getting ourselves into.)
Fast-forward a couple of years during which we hire more programmers and SparkFun gets another half million orders. We're sitting around in the mountains drinking beer and talking about what we really should get better at and the subject of PostgreSQL comes up.
See, MySQL and its forks, for all of the really huge and amazing stuff built on them in the last decade, have got some drawbacks. On this point, I risk triggering a lot of strong feelings from the sort of people who have strong feelings about databases, so for the time being, let's just say that it started to seem like we'd have a better time writing business software on a database that embodied a more rigorous approach to formal types and relationships. And because of years of our nerd friends telling us we should probably just use PostgreSQL, it kind of seemed like PostgreSQL might be that database, especially since we already knew we were going to use something with open code.
So we had some more beers and we talked about it and we had some more beers and we talked about it some more, and maybe six months later we finally decided to go for it.
So that's what we're doing.
Sort of.
It turns out that, when you write your own ad hoc ORM layer against a piece of software like MySQL without the faintest idea of how far you're about to be in over your head, it's pretty easy to make a bunch of assumptions without even noticing them. Rumsfeldian unknown-unknowns, if you will, creep in at every turn. Like fish and the concept of water, people who work with a given data store for a long time begin to forget that certain abstractions are even objects of possible consideration. These bits of data are represented as strings in PHP? Well, that's natural, right? What else would they be? String comparisons in your database of choice are all case-insensitive? Well, heck, that's almost a feature, right?
So then you decide you're going to point things at a different database and you get a crash course in all of the stuff you were wrong about, and all of the things you made impossible to test, and all of the ways your implicit assumptions about the two or three or five different type systems you rely on across languages and representations of your model are actually pretty fractured and subtly wrong.
So what we're doing tonight is rearranging a bunch of MySQL schema and deploying a terrifyingly long list of code changes that get us about three quarters of the way to being able to actually migrate to PostgreSQL without all of our software exploding continuously. We sincerely hope. (As regards exploding software, I also have a lot of recently developed thoughts about how I'll never again write a large project in a dynamic language without just tons and tons of unit tests.)
I should really write something coherent about all of this, but I just realized we actually need to deploy the code and such now.
Anyway, thanks for bearing with us while the site's down. The colored wires and blinkenlights and such will be for sale again soon.
Postmortem: Ok, so deploying all that stuff went way more smoothly than expected. Then this morning I woke up to discover that we'd been showing customers who logged in after midnight MDT a price of $0.00 (or its equivalent in your favorite currency) for most products.
This is sort of an infuriating bug, because it comes down to a single line of code I personally missed changing while making a few thousand really mechanical, straightforward tweaks. Here's what that actually looks like:
diff --git a/services/update_currencies.php b/services/update_currencies.php
index 8c59226..a504378 100644
--- a/services/update_currencies.php
+++ b/services/update_currencies.php
@@ -135,7 +135,7 @@ class ExchangeRateParser {
}
if ($row=CurrencySaurus::findByCurrencyCode($this->code)){
- $row->value = self::$dbi->escape($this->multiplier);
+ $row->value = $this->multiplier;
$row->last_updated = $date;
print "updating record for {$this->code}: {$this->multiplier}...";
if ($row->update()) {
That call to $dbi->escape()
was always superfluous, but before it didn't break
anything. After changes we made to other libraries, it added a couple of quotation
marks to the value, and we wound up stashing 0s when we updated all of the
rates at midnight.
Of course, it's not really as simple as that, because it turns out that:
In short, this is like a textbook example of how not to do half a dozen things on the web.
Maybe we'll start doing a regular "learn from SparkFun's web development mistakes" series.
If you happen to notice anything else broken around here, do me a favor and leave a comment below. And thanks to everybody who has been super chill about the breakage. You all are approximately the nicest and most forgiving customers a nerd could hope for.