The devs switch underlying databases for SparkFun.com and somehow, miraculously, the sky stays put firmly up above.
Back in August we pushed out a big change to the entire SparkFun.com stack. All the good stuff was happening under the hood and by all rights it should have been one of those "transparent" changes - nobody outside of our own dev group would know or care what changed and everything would still work. Meanwhile, said developers would have a cleaner playground on which to play and build.
But as it goes, the one untested thing that did in fact break was our currency converter. In short it made all of our prices look like $0.00 in every currency for the morning, so that was fun.
Well, last night we pulled everything down again. The last time was in preparation for making the leap from MySQL (actually MariaDB) to PostgreSQL, and this one was actually making that leap. If you're reading this then it worked! These words are being piped to you from a Postgres database wherein lies the rest of our many years' worth of data.
->
Last night's order of operations <-
There are plenty of angles on such a transition that could be explored. The fact that we used MySQL's query cache as a speed crutch for years without fully realizing it and having to dive deep into well-formed indexing and materialized views to retain speed with finer control is one. The nightmarish hodge-podge of character encodings that took herculean acrobatics to cram everything into UTF-8 is another. If you've got questions or just want to jaw about database geekery post a comment below; the devs are watching the site like hawks today.
Instead I'm going to avoid the dirty technical details and explore something even dirtier: business politics!
As mentioned here before, SparkFun.com is really the tip of a much bigger iceberg called Sparkle. Sparkle is an internal website where we hang every tool we need. It does everything but the core accounting. Think about what a business like SparkFun might possibly need to do besides the bookkeeping and you'll begin to get an idea of what Sparkle does. It's what the shippers use to ship almost a thousand orders a day. It's what the production techs use to track the builds of millions of widgets. It's what gets used to manage every product, customer, order, tutorial, blog post, job applicant... we even use it to manage the Dog Tribunal to an extent (that's another blog post entirely).
But Sparkle is woefully incomplete. One big shortcoming is our lack of inventory location (as recounted in this blog post from the beginning of 2013. A lot has happened over the course of the year and Sparkle's grown plenty of new tentacles but inventory location - the notion that Sparkle knows more than just how many of a given thing we have in the whole building - remains the most critically needed feature. We also have lofty plans for how rebuild checkout, push our education site into new territory, and about a centillion other projects screaming for attention.
It took some internal wrangling to argue the point but the devs and I put this Postgres transition on the road map firmly before ascending inventory location peak... and every other peak we'll be scaling in all coming years. Such things can be tough sell. After all, MySQL got us this far. Why dump it now, with so much effort and time, when we could be doing so much else?
For us it came down to using the right tool for the job.
->
That tool and that job are incompatible <-
Inventory location is going to be hard. We have a lot of data now, but it's a blip compared to what we'll have once we get to tracking every movement of bunches of items around the building and beyond. It will literally be the scaling of our database by at least an order of magnitude, possibly two. It could be done with MySQL, but what was the trade-off?
From the development perspective Postgres was a necessity. Not a slam dunk but objectively a better tool for our use case. At the end of the day, though, we're building SparkFun.com and Sparkle for the SparkFun community - the customers and the staff that makes it all go. If we're going to divert for a few months to do this thing then, well, we better have a good reason, right? And if we have countless reasons, all good, but they're nuanced and technical and tricky to impart to a non-technical audience, where does that leave us?
In the end it came down to trust, and this is the point of this blog post. The vast majority of the company probably still doesn't get why we went through this painful contortion at the expense of doing, well, a lot of other things. Sound a fury signifying nothing and all that. But I was relieved and energized by the fact that, after probing the technical comparisons and time lines to near exhaustion, all of the stakeholders from the high-up-manager-types on down didn't really need the deep understanding of the problem because they trusted me. They trusted the IT group on whose behalf I spoke and pushed and campaigned. Truth be told I didn't even need to push all that hard - some folks took me at my word that this was a necessary step. Such implicit trust could only be earned and must never be squandered. It's truly one of the strongest assets at this company: there is a lot of trust between departments. People expect that you generally know what you're doing and will do it well. I consider this a vital feature of a healthy organization so count us fortunate in that regard.
My team and I tested our company's trust pretty hard with this one. But at the end of the day I trust my team that we made the right move and it'll pay off.