You ever been nerd-sniped?
It happens to me regularly, and on Friday I was nerd-sniped by Jamie when he posed this into the ether:
I've been doing data projects for yrs & still don't have a consistent way of organising things. It's doing my head in. Anyone have a system for organising #rstats, #python #qgis & web stuff on a project by project basis, but with a way of keeping them accessible across projects?
— Jamie Whyte (@northernjamie) November 29, 2018
System and data design is kind of my bag, so I immediately shot back that a database would be best but a web API would be super cool.
And then, because I’ve been itching to try out something serverless, I built a tiny proof of concept. Tiny in capability. It took me most of today to actually do.
Jamie pointed me towards a juicy dataset from the Department for Transport concerning road accidents. I threw three giant CSVs into a database, and then struggled to put them into an AWS relational database (RDS). Setting up a cloud-based database is difficult, because the default way of doing things appears to involve spinning up a separate machine to export data onto, as a sort of staging server, and then moving that onto the database server.
Eventually I hacked around it by temporarily opening the database up to everyone, frantically throwing stuff in there, and then locking it up again. It doesn’t seem sustainable. There’s another column in there that I don’t really need, but I can’t work out how to drop it without the enormous palaver I went through just trying to get it on there in the first place.
It’s at this point that my brain does the thing that is the basis for a joke that I quite like.
Suppose you are given a phone with a smashed screen that seems otherwise to be working. What should you do if you wish to keep it? Naturally, the answer is to take it to be repaired, pay a fee, wait some time, collect it, and wipe it, ready for either resale or reprogramming. So far so good.
What, though, should you do if you are given a phone whose screen is unsmashed?
Simple: smash it, thereby reducing the problem to one you already know how to solve.
There are a number of variations on this joke, involving Newton and cats or physicists and kettles. The reason I’ve written it like that is that I, or at least the bit of my brain that gets itchy around imperfect systems, would very much like to just burn the thing I’ve built to the ground, because the solution to my current frustration is hard and unknown while rebuilding it is just hard.⁰
In any case: getting data into the database was very hard. Actually building a tiny little function to grab data and return it was easy: the whole code runs to about 30 lines. It’s also not much use at the moment. It just returns the primary key of an incident report, but does at least give a genuinely horrifying glimpse into the number of traffic accidents that happen on our roads: 129,982.
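For a flavour of what a function of that shape might look like, here is a minimal sketch. It is not the actual code: it assumes Python, uses an in-memory `sqlite3` database as a stand-in for RDS so that it runs anywhere, and the table name `accidents`, the column `accident_index`, the `limit` parameter and the sample rows are all made up for illustration. The `statusCode`/`body` return shape follows the convention API Gateway uses for Lambda proxy responses.

```python
import json
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database standing in for the real one.

    Table, column and rows are hypothetical, purely for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accidents (accident_index TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO accidents (accident_index) VALUES (?)",
        [("A001",), ("A002",), ("A003",)],
    )
    conn.commit()
    return conn


def handler(event, context, conn=None):
    """Lambda-style entry point: return some primary keys and a total count."""
    conn = conn or make_demo_db()
    # Query-string parameters may be missing or None in a proxy event.
    params = (event or {}).get("queryStringParameters") or {}
    limit = int(params.get("limit", 10))
    rows = conn.execute(
        "SELECT accident_index FROM accidents ORDER BY accident_index LIMIT ?",
        (limit,),
    ).fetchall()
    total = conn.execute("SELECT COUNT(*) FROM accidents").fetchone()[0]
    body = {"total": total, "ids": [row[0] for row in rows]}
    return {"statusCode": 200, "body": json.dumps(body)}
```

Calling `handler({"queryStringParameters": {"limit": "2"}}, None)` returns a JSON body with the total row count and the first two keys. Swapping the stand-in for a real database driver would change the connection line and the placeholder style, but not the overall shape.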
I’ve not open-sourced the code yet because it’s got connection details hard-coded in it (bad practice), but I’m going to continue tinkering with it. I’m not sure how more persistent things, like a database connection or an object relational mapper, fit into this invocation model. Only one way to find out.
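One pattern commonly used on AWS Lambda for the persistence question is to cache the connection at module level, outside the handler. A warm execution environment keeps module state between invocations and so reuses the connection, while a cold start builds a fresh one. Reading settings from environment variables also keeps them out of the source. The sketch below is an assumption-laden illustration, not the code from this project: `sqlite3` again stands in for the real database, and the environment variable name `ACCIDENTS_DB_PATH` is hypothetical.

```python
import os
import sqlite3

# Module-level cache: lives as long as the process (the warm environment) does.
_conn = None


def get_connection():
    """Create the connection on first use, then hand back the same one."""
    global _conn
    if _conn is None:
        # ACCIDENTS_DB_PATH is a made-up variable name; sqlite3 is a stand-in.
        db_path = os.environ.get("ACCIDENTS_DB_PATH", ":memory:")
        _conn = sqlite3.connect(db_path)
    return _conn


def handler(event, context):
    """Lambda-style entry point that reuses the cached connection."""
    conn = get_connection()
    value = conn.execute("SELECT 1").fetchone()[0]
    return {"statusCode": 200, "body": str(value)}
```

Whether an object relational mapper is worth its start-up cost in this model is a separate question, and this sketch does not try to answer it.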
If you’d use an API like this, what data would your web application like to consume? Answers in the comments or to me on Twitter.
⁰ I’m skating close to something honest about my romantic relationships, so I’m going to do a quick pirouette and skate in the opposite direction.