Thanks! I guess I had some gaps in my mental model of how postgres handles these...

da_chicken · on Aug 29, 2018

* Correct. Adding a column requires providing a value for that column, and it will generally use NULL or the default value of the column when doing so. It does that because it has to (a column must have a value even if it's NULL). The only other time you may see DDLs "changing" data is certain data type changes. For example, if you change an approximate numeric data type to a precise numeric data type (e.g., float vs decimal) or vice-versa or some other data type with different levels of precision (e.g., SQL Server datetime vs datetime2), but you may still get errors or warnings.

* Correct. Adding a constraint or unique index will always require scanning the data to complete the transaction to ensure that it's consistent with the new rules. It's safe to assume that modifying the schema of a table itself in essentially any way will always require an exclusive table lock to complete. Even creating an index will often lock a table. PostgreSQL does allow you to create some indexes without a lock with the CONCURRENTLY option [0] and other RDBMSs may have similar options, but that's an additional feature that you'll want to check in your specific implementation with their own caveats.

* Correct. The design of the relational model which traditional SQL RDBMSs implement as a whole is to focus on ACID [1] compliance and strict determinism. It's focused on being correct first and foremost, and things like performance and availability take a backseat to that (which is where NoSQL data stores step in with different focuses). You wouldn't want to have that kind of focus on a social media site where it's more important to be available first and who cares if a bit of data is lost or broken or out of date now and then, but a financial, government, or medical system must store data correctly and must correctly retrieve data that it claimed to store correctly. Accuracy is paramount in those systems. This is one reason why DBAs have fairly universal reputations as extremely cautious perfectionists (the other reason is that if you don't design your tables and normalize your data it becomes extremely difficult to work with down the line).

MySQL is somewhat notorious for breaking many of the above rules, especially prior to MySQL 5.0. That is half the reason why it tends to have such a poor reputation among DBAs (the other reason being that Oracle is a terrible vendor). MySQL used to allow you to store invalid dates, would silently modify or truncate data rather than error, etc. It's much, much better now, but reputations are hard to repair in a world of strict rules and deterministic functions.

[0]: https://www.postgresql.org/docs/9.1/static/sql-createindex.h...

[1]: https://en.wikipedia.org/wiki/ACID_(computer_science)