Migrating from SQLite to CockroachDB
Sunday 23 June 2024 ยท 14 mins read ยท Viewed 101 timesTable of contents ๐
Introduction ๐
CockroachDB has gone mad and is now proprietary (source-available). While this can be tolerated, the issue is that they don't off a free tier anymore.
I therefore recommend trying YugabyteDB.
The last 3 days, I have been working on migrating my application from SQLite to CockroachDB. The reason is that I'm actually using NFS to share the SQLite database between my servers, and it's not really efficient. Grafana started to show serious "sync" issues, meaning that the database was locked for too long, causing my alerts to go wild.
This article is a simple review of the migration to CockroachDB.
What is CockroachDB? ๐
CockroachDB is a distributed SQL database built on a transactional and strongly-consistent key-value store. The syntax is similar to PostgreSQL, but some features are missing, and some defaults are different (It's saner in my opinion).
CockroachDB offers quite a lot of features, such as:
- Distributed SQL: CockroachDB is a distributed SQL database that scales horizontally using the Raft consensus algorithm to ensure that data is always consistent.
- Geo-partitioning: CockroachDB allows you to partition your data based on the location of your users, ensuring that data is always close to where it's needed.
- Automatic failover: CockroachDB automatically handles failover, ensuring that your data is always available.
- Backup and restore: CockroachDB offers a simple backup and restore mechanism that allows you to easily recover from data loss over S3.
Missing features and critical differences compared to PostgreSQL ๐
To migrate from SQLite to CockroachDB, you should know there are some critical features missing compared to PostgreSQL:
ALTER TABLE ... ALTER COLUMN ... TYPE
: This feature is experimental and can be enabled withenable_experimental_alter_column_type_general=true
as a session variable. This feature does not work with transactions. You may need to manually update your schema.- The default integer type is
INT8
instead ofINT4
in PostgreSQL, which can cause conversion issues (like with Rust or JS). To set the default behavior, usedefault_int_size=4
as a session variable. SERIAL
is not same as in PostgreSQL. CockroachDB defaults toINT8
and doesn't use sequences. Instead, it callsunique_rowid()
to generate a unique ID. To set the default behavior, useserial_normalization=sql_sequence_cached
as session variable.- Locale "C" is not supported.
- A
rowid
column is automatically created if the table doesn't have a primary key. This can conflict with some migration steps (like adding a primary key later).
Deployment ๐
CockroachDB deployment is quite simple. Today, CockroachDB recommends using the cockroach-operator
to deploy CockroachDB on Kubernetes. The operator is a Kubernetes operator that manages CockroachDB clusters. It automates tasks like scaling, backups, and upgrades.
While it's quite powerful, I prefer using the Helm chart, which is more explicit and easier to manage. The Helm chart is available on the official Helm repository.
Since the deployment is quite simple, I won't go into details. You can find the official documentation on the CockroachDB website. Just remember to use cert-manager
instead of their signer.
Migration ๐
The migration steps are:
-
Dump the SQLite database:
1sqlite3 mydb.sqlite 2 3.output mydb.sql 4.dump 5.quit
-
Create a new database and its user in Cockroach:
1cockroach sql --certs-dir=certs --host=my-cockroachdb-public
1CREATE DATABASE mydb; 2CREATE USER myuser WITH PASSWORD 'mypassword'; 3GRANT ALL ON DATABASE mydb TO myuser;
-
Apply compatibility flags:
1ALTER ROLE grafana SET enable_experimental_alter_column_type_general=true; 2ALTER ROLE grafana SET default_int_size=4; 3ALTER ROLE grafana SET serial_normalization=sql_sequence_cached;
-
Start the application to be migrated with PostgreSQL:
1export DATABASE_URL=postgresql://myuser:mypassword@my-cockroachdb-public:26257/mydb?sslmode=require
-
Wait for crashes and fix them.
-
Import the data from the SQL dump. Basically, edit the file
mydb.sql
to only have theINSERT
statements. Check also forBOOL
since SQLite doesn't support it and use integers instead. Replace the value withTRUE
orFALSE
. After that, runs the SQL commands manually to insert the data.
Adding monitoring ๐
CockroachDB offers monitoring out of the box by using Prometheus and the ServiceMonitor
CRD. You can find the official documentation on the CockroachDB website.
Adding backup ๐
Since we are now using a complex database, it is good practice to back up regularly. CockroachDB offers a simple backup and restore mechanism that allows you to easily recover from data loss over S3.
I'm using a S3-compatible storage, therefore I can use the following command to backup my database:
1CREATE EXTERNAL CONNECTION backup_bucket AS 's3://my-bucket?AWS_ACCESS_KEY_ID=my-access-key&AWS_SECRET_ACCESS_KEY=my-secret-key&AWS_REGION=my-region&AWS_ENDPOINT=my-endpoint';
2BACKUP INTO 'external://backup_bucket';
To schedule backups, CockroachDB is able to schedule:
1CREATE SCHEDULE backup_schedule FOR BACKUP INTO 'external://backup_bucket' RECURRING '@daily' WITH SCHEDULE OPTIONS first_run='now';
Last improvements ๐
On the S3 provider, remember to add lifecycle rules to archive the backups and change the storage class to GLACIER
after a certain period. This will save you a lot of money.
What has been migrated? What couldn't be migrated? ๐
Grafana was the first application to be migrated. The issues are already been cited:
ALTER TABLE ... ALTER COLUMN ... TYPE
not working with transactions.SERIAL
not the same as in PostgreSQL.- Integer type default is
INT8
instead ofINT4
. - Locale "C" is not supported.
There is still one issue about ctid
not being found.
Second application to be migrated: VaultWarden. The issues are the same as Grafana.
Lastly, the blog, which was developed with CockroachDB in mind.
One thing that couldn't be migrated is Joplin, the note-taking application. Let me lash out on this product for a moment:
- Coded in TS and it LAGS like hell.
- Hard-coded PostgreSQL configuration compared to Knex.js.
- SQLite database is 5000MB?! While there is only 12MB of data?!
- After rebooting back to SQLite ('cause I couldn't migrate to CockroachDB), the password is denied?!
Nope, f- Joplin. I'm going with Obsidian + Syncthing.
Conclusion ๐
CockroachDB is cool, cool like cool for kids. To deploy CockroachDB, I didn't need to configure much, but the migration was a bit painful, and I'm not sure if there are any runtime issues.
Monitoring-wise:
The database seems to reach around 2GB of memory, with Grafana being the intensive player. You can see that the cockroach-client-secure
is leaking, I had to kill it.
Other than that, I can just the that the dashboards are pretty cool:
Overall, yeah, it's cool. Grafana is stable, VaultWarden is replicated, CockroachDB is replicated and backed up. I guess I reached a good level of stability.