Consider SQLite

If you were creating a web app from scratch today, what database would you use? Probably the most frequent answer I see to this is Postgres, although there are a wide range of common answers: MySQL, MariaDB, Microsoft SQL Server, MongoDB, etc. Today I want you to consider: what if SQLite would do just fine?

For those who are unfamiliar, SQLite is a implementation of SQL as a library — this means that rather than running a SQL server that you program talks to, you embed the SQL implementation directly in your program, and it uses a single file on disk as a backing store. SQLite is a incredibly popular database — you likely have dozens or hundreds of SQLite database on whatever device you're reading this on, regardless of whether that's a smartphone or a tablet, and regardless of what manufacturer or browser it is. Historically, use of SQLite has usually been constrained to user programs (like browsers or apps), rather than servers. However, due to improvements in SQLite, hardware performance, and third-party libraries over the past ~decade, it now makes a lot of sense to use SQLite for many web projects.

In this post, I'll compare a typical "multitier architecture" with a web server and separate database server (postgres, etc) to running that same app all on a single machine using SQLite. I won't be comparing to more "cloudy" solutions (that'll be the subject of another post), but I think it's still useful to consider this sort of thing even if you end up using some sort of IaaS product. We'll look at scalability, performance, availability, operational complexity, and a few other things.

How scalable is SQLite, really?

There are three main metrics on which we might want a database to be able to scale: the total amount of data it can store, the read throughput, and the write throughput.

Write througput is the area where SQLite struggles the most, but there's not a ton of compelling data online about how it fares, so I got some of my own: I spun up a Equinix m3.large.x86 instance, and ran a slightly modified¹ version of the SQLite kvtest² program on it. Writing 512 byte blobs as separate transactions, in WAL mode with synchronous=normal³, temp_store=memory, and mmap enabled, I got 13.78μs per write, or ~72,568 writes per second. Going a bit larger, at 32kb writes, I got 303.74μs per write, or ~3,292 writes per second. That's not astronomical, but it's certainly way more than most websites being used by humans need. If you had 10 million daily active users, each one could get more than 600 writes per day with that.

Looking at read throughput, SQLite can go pretty far: with the same test above, I got a read throughput of ~496,770 reads/sec (2.013μs/read) for the 512 byte blob. Other people also report similar results — Expensify reports that you can get 4M QPS if you're willing to make some slightly more involved changes and use a beefier server. Four million QPS is enough that every internet user in the world could make ~70 queries per day, with a little headroom left over⁴. Most websites don't need that kind of throughput.

While mixed read/write workloads will be slightly slower than these numbers, the main problem that SQLite had with mixed read/write workloads (that writes would block reads) was fixed with WAL mode around a decade ago.

If the server I rented seems too expensive to you, I also tested it out with the cheapest ($5/month) machine I could get on DigitalOcean, and ran the 512-byte test on it. While one should expect more volatile results, since it's a shared host, I got 35.217μs/write and 3.340μs/read (28,395 writes/sec, and 299,401 reads/sec) — ~2.6× the time for writes and ~1.7× the time for reads — not bad for ~3% the price.

In terms of the amount of data that it can store, SQLite scales pretty well: it can hold up to 281TB of data, which is I think more than the sum total storage space of every disk I've owned in my life. While Postgres can store a theoretically unlimited amount of data in total, it has a lower limit on table size than SQLite does — 32TB with the default block size, up to 128TB with the maximum allowed block size.

How do these numbers compare to real apps? Let's look at two examples: recurse.social, a Mastodon instance serving ~100 users on Postgres, and Lobsters⁵ a news aggregation site with ~14.5k users running on MariaDB.

The Lobste.rs database is the slightly larger of the two, weighing in around 2200MB (largely comprised of the story_texts, votes, and comments tables, respectively). It sees around 28 SELECTs / sec, 0.1 INSERTs / sec, 0.1 UPDATEs / sec, and 0.004 DELETEs / sec⁶. This is quite small — well within the capabilities of any competent database system on reasonable hardware⁷ — when someone looked at it a few years ago, they concluded that Lobsters would need to see about 2,000× the traffic before the database was a scaling bottleneck on a 8-core machine.

recurse.social, despite the much smaller number of users still has a reasonably large database, since it has to sync with many other servers due to the nature of ActivityPub and the Fediverse. The total size of the database is 1543 MB (largely contained in statuses, mentions, conversations, and status_stats), just 0.0005% of SQLite's maximum database size. It sees around 2.7 transactions per second (Postgres doesn't easily let you separate this out into read/write transactions, sadly) — numbers that are trivially achievable on SQLite with no performance tuning and ancient hardware.

You might think that it'll be harder to scale SQLite, since it forces your web server and your database server to be on the same machine, which in turn forces you to only have a single server, meaning that you can't scale by just adding more web servers. However, in most cases, it's noticeably less efficient to run your database on a separate server from your webserver, since it massively increases the latency and decreases the bandwidth available between the database and the webserver. Comparing point query latency on SQLite and Postgres, Postgres is 9.5× slower when running on the same machine as the one doing the query, 17.8× slower when running on a different machine in the same AWS Availability Zone, and 50× slower when running in a different AZ but the same region.⁸

A significant part of the reason that people discount SQLite as a web database server is that in the past, it probably wasn't a good choice! SQLite has relatively quietly gotten significantly faster over the past decade — on my laptop, a recent version of SQLite completes the speedtest1 suite ~4.1× faster than a version of SQLite from 2010 on the same hardware (5.328s now, vs 21.849s then). Running speedtest1 on a $5/month DigitalOcean instance is around 10.8× faster than running the same benchmark with 2010 SQLite on a VPS running on a Intel Xeon E5-2450L with a HDD, which was a fairly typical midrange server in ~2012.

Not only has low-end server hardware improved significantly, but the upper limits of how much you can scale by just buying a bigger (but still commodity) machine have massively increased. These days, you can get servers with 8TB of RAM, hundreds of cores, multiple 100Gbps NICs, and SSDs with speeds approaching the same order of magnitude as RAM, which makes being limited to a single machine much less worrisome from a scaling perspective.

In the past decade, high-end servers have gotten two or three orders of magnitude better on nearly every metric, while website traffic has remained relatively steady. At the same time, SQLite has made significant performance gains, and fixed many of the problems that prevented from making sense as a webapp database — for instance, WAL mode (enabling concurrent reads and writes), JSON support, upsert support, improvements to ALTER TABLE, and strict typing. SQLite has been extremely impressive in the number of features and performance improvements that they've been able to add while maintaining extremely high levels of backwards compatibility and code quality⁹.

SQLite does have some legitimate scaling limitations, just like every system — if you need to write dozens of MB/s to the database, or have very complex or long-lived write transactions, it probably isn't for you. But the limits are likely higher than you expect!

But what about reliability?

"But Wesley!" I hear you shouting at your screen "If I run my website on one server, and that server goes down, then my entire website goes down!"

That's true. But is that a problem for you? No one who takes reliability seriously aims for 100% availability — even (and especially) giant companies like Google and Amazon work on reliability budgets. Say, for instance, you want an availability of 99.99% — that translates to around an hour of downtime per year. There are many VPS providers that will give you VMs that have better uptime than this (which they do via a combination of redundant hardware and by live migrating off machines with flaky hardware behind the scenes).

Most sites only need ~99.99% availability, since you get significantly diminishing returns as soon as you're more reliable than your end user's ISP or home network. Most internet users have less than 99.99% availability just from their ISP having problems or their cat eating their ethernet cable or whatever — why try to be orders of magnitude more available than that, when it won't make a significant dent in the uptime as it's experienced by the end user?

If you really do want more availability, though, you can still get that using SQLite without going for a fully distributed system, by using Litestream or Verneuil to replicate your database to a backup server, and cutting over if your main server goes down. This isn't quite as good as running a distributed database cluster for two reasons: failovers aren't automatic (unless you want split-brain problems), and you may lose a few seconds worth of writes when you fail over, but that's often an acceptable tradeoff. If you automate your ability to do failovers, you should be able to get to 99.999% availability using this technique.

Litestream is another example of a tool that's only recently come on the scene that makes using SQLite as a webapp database viable in significantly more situations than it has been in the past — instead of having daily or hourly backups, you can have a streaming backups that will rarely be more than a few seconds out of date.

It's only when you start needing significantly more than 99.999% uptime that you start to need truly distributed systems, and ironically, it's often the complexity of those systems that ends up causing downtime! For instance, GitLab lost ~6 hours of data and was down for ~18 hours due to difficulties operating their distributed Postgres cluster.

What does it get you, though?

It's all well and good that SQLite is scalable and reliable enough for most needs, but what's the point of using it instead of a more traditional client/server database?

The most obvious reason is simplicity and ease of deployment: with SQLite, your database is a single file, and you don't need a to configure and run database server separately from your app server. This also makes testing extremely easy and fast — you can store test databases in your source repo if you'd like, and you don't need to have some way of installing and configuring postgres in a docker container or whatever to run tests. It shouldn't be understated what an advantage this is — one of the main complaints about end-to-end tests is how long they take, and replacing a heavy database server with SQLite can cut down significantly on that time, allowing you to write many of all of your tests as end-to-end tests, while still maintaining a quick test suite.

Beyond that, SQLite is significantly more efficient than Postgres/etc in many cases, since there's no per-request client/server roundtrip latency — this allows you to care significantly less about the N+1 query problem, which in turn, lets you structure your application in ways that are nicer:

Using the N+1 Query pattern in Fossil does not harm the application. But the N+1 Query pattern does have benefits. For one, the section of the code that creates the timeline query can be completely separate from the section that prepares each timeline entry for display. This provides a separation of responsibility that helps keep the code simple and easy to maintain. Secondly, the information needed for display, and the queries needed to extract that information, vary according to what type of objects to be shown. Check-ins need one set of queries. Tickets need another set of queries. Wiki pages need a different query. And so forth. By implementing these queries on-demand and in the part of the code dealing with the various entities, there is further separation of responsibility and simplification of the overall code base.

So, what's the catch?

There are a few legitimate downsides to using SQLite. First off, the type system. It's bad. Luckily, as of a month ago, you can use strict typing instead, which somewhat improves the situation. It'll take a little while for this to get rolled out to things like the python stdlib, but it'll be excellent when it's there.

Support for migrations is worse — SQLite has essentially no support for live migrations, so you need to instead make a new table, copy the data from the old table into the new one, and switch over. This is slow if you have a very large number of rows, and quite difficult to do without downtime in general. For most applications, it's fine to take a few seconds or a minute of downtime when doing a complex migration, but if you have many gigabytes or more of data that you want to migrate, it's worth looking carefully at how difficult that'll be.

It's significantly harder to geo-shard your app, although you can get very far with CDNs and caches, and once Litestream supports read replicas, there'll be another excellent tool for improving this.

SQLite is often not well-supported by web frameworks, because it's historically been seen as a toy database within that field — for instance, while Django supports it, they say that their migration tool "can be slow and occasionally buggy" when used with SQLite (despite SQLite being the Django's default database!). Many web frameworks also push you towards using many connections, when SQLite is best used with each thread having a single connection to the database that stays open for the entire duration of the program (while connection pools almost do this well, it can be difficult to configure them in a way that's ideal for SQLite).

Running on a single server also may be a poor choice if the domain you're in requires running very heavy computations, or if the language you're using is extremely slow.

Using SQLite has some failure modes that most programmers are not used to — most notably, if you begin a transaction, then go off and do some blocking operation, you will be blocking all writes for the time that you're doing that, unlike in a connection-based database with a connection pool. If you want to write fast and scalable apps with SQLite, you need to think carefully about your architecture and schema. These are things that you should be thinking carefully about anyways, SQLite just makes some of the failures worse if you don't.

It's worth noting that if you find out that these things are problems down the line, it's generally relatively easy to migrate from SQLite to Postgres — Postgres provides a tool to do the migration, and is generally good at labeling which of their features are non-standard¹⁰.

On the whole, I think using SQLite is a good tradeoff for a lot of projects, including webapps that expect to have a potentially large number of users. As long as you don't expect to need tens of thousands of small writes per second, thousands of large writes, or long-lived write transactions, it's highly likely that SQLite will support your usecase. It significantly reduces complexity and operational burden and eases testing, with the primary downside that it's somewhat harder to get levels of availability and uptime that almost no one needs in the first place. I've been using it for thoughts.page for the past ~9 months and have been quite happy with it, and anticipate using it more in the future.

Thanks to Jamie Brandon, Julia Evans, Pierre-Yves Baccou, Adrien Lamarque, Roland Crosby, Michael Malis, and Andrew Emil for early feedback/comments on this post.

See this gist for the details, including patches, compiler version info, etc. ↩
This test is synthetic and quite limited — it's read/writing a blob, it doesn't have to update any indices or triggers, etc. It's about the simplest thing that can be done, so if you have a complex schema, you should expect to get worse performance. This is also testing 100% write load and 100% read load separately, whereas in reality, writing in WAL mode will slow down reading somewhat. But it's useful in benchmarking to get a upper bound on performance, and many common queries in real systems are surprisingly simple! ↩
It's worth noting that synchronous=normal does lose some durability in the case of a system crash — transactions that were committed just before a OS crash or power loss may be rolled back. This is a reasonable tradeoff in many cases, but you probably want to stick with synchronous=full until you need the extra performance. ↩
Obviously you won't be able to actually serve that many HTTP requests, for instance, since there's more overhead and bottlenecks than just the database. But a million QPS on one server was trivial nearly a decade ago, and with some care, 9 million QPS on one medium-end serer was doable half a decade ago. See this excellent post from Julia Evans for some more details about this sort of thing. ↩
Thanks to pushcx for running some queries on the Lobsters database to get these numbers for me. ↩
These numbers are averaged across a ~24-hour period, and as such, one should be careful to remember that they aren't representative of peak load. However, these numbers are so low that that doesn't really matter. ↩
I don't mean to suggest here that any of this stuff is easy, necessarily — Lobsters has had database scaling issues in the past (although one might say that was due to the lack of a competent database system...). My point is more that most sites do not deal with very much traffic, when compared to the speed of modern computers, and that many of the scaling problems that people have are due to poorly written code or poorly thought out schemas/architecture, rather than not using a scalable enough database. And in fact, many database scaling problems become significantly easier to fix if you don't have to worry about the N+1 query problem, as with SQLite. ↩
This makes a lot of sense if you think about what's actually happening here: for RAM access, latency is probably ~100ns and bandwidth ~170 GB/s, whereas latency between machines is more like 90,000ns in a good case, and bandwidth around 12.5 GB/s if you have absolutely top of the line hardware, and is more typically 1.25 GB/s or even 0.125 GB/s. You'll also see much more consistent performance with SQLite than you will with postgres, since the latency distribution of RAM access is tiny compared to the latency distribution of network requests (even requests on the same machine). ↩
The lead developer of SQLite actually said "We still are adding a lot of features, and we do a lot of changes. We don’t talk about the rate of code churn very much, because that would scare people…" If you're scared, though, taking a look at how they do testing will likely assuage your concerns. ↩
Jamie Brandon tells me that SQLite is slightly less compliant than Postgres: The SQLite logic test suite has a few percent failures on Postgres, and the Join Order Benchmark gets the wrong results on SQLite. On the whole, though, the differences are quite small. ↩