Hacker News

Can you elaborate? I think a cron solution would be unable to synchronize with your application code to determine when to stop writing, so more background on your solution would be of interest.

As I understand it, while the backup runs, other writes should go to the WAL and only get committed after the backup?



I did some reading, and there are 2 ways to use the SQLite backup API:

1. call backup_init, backup_step with a step size of -1, then backup_finish. This will lock the db the whole time the backup is taking place and back up the entire db.

2. call backup_init, then backup_step in a loop with a positive step size (the number of pages to copy per call) until it returns SQLITE_DONE, then backup_finish.

With method 2, no db lock is held between backup_step calls. If a write occurs between backup_step calls, the backup API automagically detects this and restarts. I don't know if it looks at the commit count and restarts the backup from the beginning or is smart enough to know the first changed page and restarts from there. Because the lock is released, a continuous stream of writes could prevent the backup from completing.
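For reference, both call patterns are exposed by Python's stdlib sqlite3 wrapper (Connection.backup, Python 3.7+), where pages=-1 corresponds to method 1 and a positive pages value to method 2. A minimal sketch; the file names here are made up:

```python
import sqlite3

# Hypothetical database names; Connection.backup wraps the C backup API.
src = sqlite3.connect("app.db")
src.execute("CREATE TABLE IF NOT EXISTS t (x)")
src.execute("INSERT INTO t VALUES (1)")
src.commit()

# Method 1: pages=-1 copies the whole db in a single backup_step,
# holding the lock for the entire copy.
dest = sqlite3.connect("method1.db")
src.backup(dest, pages=-1)
dest.close()

# Method 2: a positive page count copies in steps; the lock is released
# between steps, so concurrent writes can force a restart.
dest = sqlite3.connect("method2.db")
src.backup(dest, pages=100)  # same step size the sqlite3 shell uses
dest.close()
src.close()
```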

I looked in the sqlite3 shell command source, and it uses method 2. So if using the .backup command with continuous concurrent writes, you have to take a read lock on the db before .backup to ensure it finishes. It would be nice if the .backup command took a -step option. That would enable the -1 step size feature of method 1. The sqlite3 shell uses a step size of 100.

Another option would be to check backup_remaining() and backup_pagecount() after each step, and if the backup isn't making progress, increase the step size. Once the step size is equal to backup_pagecount() it will succeed, though it may have to lock out concurrent writes for a long time on a large db. There's really no other choice unless you get into managing db logs.
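A rough sketch of that escalation in Python. The stdlib wrapper drives the step loop itself and doesn't expose backup_remaining() between steps, so this version detects a restart via the progress callback (remaining growing between calls) and retries with a doubled step size; the function name and the starting/retry numbers are made up:

```python
import sqlite3

class BackupRestarted(Exception):
    """A concurrent write forced the backup to start over."""

def adaptive_backup(src, dest_path, pages=64, max_attempts=5):
    """Double the step size each time the backup restarts; the last
    attempt uses pages=-1, i.e. one locked pass over the whole db."""
    for attempt in range(max_attempts):
        last = None

        def progress(status, remaining, total):
            nonlocal last
            # remaining growing between callbacks means a restart happened
            if last is not None and remaining > last:
                raise BackupRestarted
            last = remaining

        dest = sqlite3.connect(dest_path)
        try:
            step = pages if attempt < max_attempts - 1 else -1
            src.backup(dest, pages=step, progress=progress)
            return step
        except BackupRestarted:
            pages *= 2  # copy more pages per lock acquisition next time
        finally:
            dest.close()
```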


That's not what I've experienced!

In my experience, as soon as new data comes into the DB, the .backup command keeps restarting, and if the writes never stop, the backup never finishes either :D

In Gladys' case, we added a blocking transaction to the application logic to lock out writes during the backup. I haven't found any other way to avoid never-ending backups on write-heavy databases.
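If I understand the approach, it amounts to something like this (a sketch using Python's stdlib sqlite3, not Gladys' actual code; BEGIN IMMEDIATE takes the write lock, so no other writer can restart the backup while it runs):

```python
import sqlite3

def backup_with_write_lock(db_path, dest_path):
    # Take the write lock on a separate connection; isolation_level=None
    # stops the stdlib from injecting its own transaction statements.
    locker = sqlite3.connect(db_path, isolation_level=None)
    locker.execute("BEGIN IMMEDIATE")  # blocks all other writers
    try:
        src = sqlite3.connect(db_path)
        dest = sqlite3.connect(dest_path)
        try:
            src.backup(dest)  # no writes can sneak in, so this terminates
        finally:
            src.close()
            dest.close()
    finally:
        locker.execute("ROLLBACK")  # nothing was changed; release the lock
        locker.close()
```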


I'm using VACUUM INTO, which does basically that: https://sqlite.org/lang_vacuum.html#vacuuminto

> The VACUUM command with an INTO clause is an alternative to the backup API for generating backup copies of a live database....The VACUUM INTO command is transactional in the sense that the generated output database is a consistent snapshot of the original database.

EDIT: the Litestream docs also recommend that: https://github.com/benbjohnson/litestream.io/issues/56
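A minimal example, assuming SQLite 3.27+ (file names made up; VACUUM INTO refuses to overwrite an existing file, hence the unlink):

```python
import os
import sqlite3

if os.path.exists("snapshot.db"):
    os.remove("snapshot.db")  # VACUUM INTO won't overwrite an existing file

con = sqlite3.connect("live.db")  # hypothetical live database
con.execute("CREATE TABLE IF NOT EXISTS t (x)")
con.execute("INSERT INTO t VALUES (1)")
con.commit()

# Writes a transactionally consistent snapshot of the live db.
con.execute("VACUUM INTO 'snapshot.db'")
con.close()
```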


I ran into the case of needing to back up a write-heavy database without blocking anything, and came up with a solution: writing a VFS ( https://www.sqlite.org/vfs.html ) that makes the application move the SQLite journal files to a directory for processing instead of deleting them. Another process reads them to see which pages were touched and can very quickly apply the changes to the backup.



