We have to stop insisting that software updates, etc. need to be distributed over HTTPS.
Very often when I say that, all my security friends (and even non-friends) have strong reactions:
No, you don’t know what you’re talking about. It’s very important, because we need to mitigate this or that attack.
My answer goes like this:
No, no, no… Sure, distribute the metadata (list of packages, versions, checksums) over HTTPS all you want. But the big bits - you can serve that over HTTP, FTP, etc.
The reason is that serving over HTTPS costs a lot of money. Not because TLS is complicated, but because if you’re using HTTP or FTP, then you can just let the world mirror your stuff.
That’s the way that Debian and Slackware and all these distros have operated for decades, on a shoestring budget.
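To make the split concrete, here is a minimal sketch of the check a client could do: the expected checksum comes from the trusted metadata (fetched over HTTPS), and the big payload can come from any untrusted mirror. The function name `verify_payload` and the sample bytes are hypothetical, made up for illustration; real package managers typically also sign the metadata itself, which this sketch does not cover.

```python
import hashlib
import hmac


def verify_payload(payload: bytes, expected_sha256_hex: str) -> bool:
    """Return True if the payload matches the checksum from trusted metadata."""
    actual = hashlib.sha256(payload).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# Hypothetical stand-ins: in practice the digest would be read from metadata
# fetched over HTTPS, and the payload would be downloaded from a plain
# HTTP or FTP mirror.
package = b"example package contents"
trusted_digest = hashlib.sha256(package).hexdigest()

tampered = package + b" plus something a bad mirror added"

print(verify_payload(package, trusted_digest))
print(verify_payload(tampered, trusted_digest))
```

A mirror that alters the payload produces a different digest, so the client can reject the download and fall back to another mirror or to the origin.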
Let’s take Docker Hub. I’m not going to give you numbers from when I was at Docker, because I don’t even know these numbers, and I wouldn’t remember them anyway. Just taking the public numbers from the beginning of this year… Docker said in some PR stuff that they had about 15 petabytes of images on Docker Hub.
Storing that on S3 would be at least $300,000/month, and that’s not counting transfer costs.
Transfer (again, I took some numbers that Docker published in the beginning of this year) is like 8 billion pulls per month. If we go with an average of 10 megs per pull, which is really low…
That’s 80 petabytes of transfer, which would give you a bill of roughly four million dollars per month just to operate Docker Hub. And these are pretty optimistic estimates.
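The back-of-the-envelope math can be sketched as follows. The per-gigabyte prices are assumptions, not figures from the talk: they are rough cloud list-price ballparks chosen because they reproduce the numbers quoted here. Decimal units are assumed too (1 PB = 1,000,000 GB).

```python
# Assumed prices in USD (not from the source), for illustration only.
STORAGE_USD_PER_GB_MONTH = 0.02
EGRESS_USD_PER_GB = 0.05

GB_PER_PB = 1_000_000
MB_PER_GB = 1_000


def monthly_storage_cost(petabytes, usd_per_gb_month=STORAGE_USD_PER_GB_MONTH):
    """Cost of keeping `petabytes` of data in object storage for one month."""
    return petabytes * GB_PER_PB * usd_per_gb_month


def monthly_transfer_cost(pulls, mb_per_pull, usd_per_gb=EGRESS_USD_PER_GB):
    """Cost of serving `pulls` downloads averaging `mb_per_pull` megabytes."""
    gigabytes = pulls * mb_per_pull / MB_PER_GB
    return gigabytes * usd_per_gb


# Figures quoted in the text: 15 PB stored, 8 billion pulls, 10 MB per pull.
storage = monthly_storage_cost(15)
transfer = monthly_transfer_cost(8_000_000_000, 10)

print(f"storage:  ${storage:,.0f} per month")
print(f"transfer: ${transfer:,.0f} per month")
```

Under these assumed prices, transfer dwarfs storage by more than a factor of ten, and transfer is exactly the part that volunteer mirrors would absorb.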
So if only that were easily mirrorable over plain HTTP, FTP, etc. and you just served the metadata over TLS (and perhaps kept an origin copy over TLS for the one odd scenario where somebody is running an attack against you that would otherwise prevent you from updating)… I’m not saying that this would have changed the fate of Docker, but I’m curious to see what the parallel universe where things had been done differently in that regard looks like.
A world where you can have something like Docker Hub that doesn’t end up costing some company somewhere in the six-, seven-, or eight-digit range per month.
How much would that actually save?
I think it would save like 99%, or something like that… Which sounds completely like “What?!”
But if you look at Linux distros (and I’m talking about stuff like Debian, Slackware, Arch Linux)… you know, there is not a Debian Inc. or Arch Linux LLC or whatever paying for all the mirrors.
It’s just companies, universities, labs, ISPs, etc. who decide to just mirror all that. Because they feel like it’s the public good. It’s the commons. It’s something that we maintain.
At some point when I was running a hosting company in France a while ago, we had mirrors as well. First for our own convenience, because when we deployed machines, it was so convenient to have something in our network, and it was also good to make that available for others.
At the end of the day, I think it would slash the costs by a factor of maybe 100 or 1,000. Something like that.
We ran a Twitter poll to see what people thought of Jérôme’s opinion (like we do with each and every unpopular opinion). The results were split, but leaned slightly toward the 👎
Tell us what you think of this idea in the comments below! And don’t forget to follow Go Time in your favorite podcast app so you don’t miss future unpopular opinions. ✌️