I’ve put in my fair share of time with PostgreSQL. It has no shortage of functionality and I believe its expansive feature set drove a lot of its growing popularity in the 2010s. As my time with the database has worn on I’ve come up with a handful of features that I wish the database had. Some of these are ideas borrowed from other systems, some are inventions of my own. The real tragedy is that I simply don’t have the time to implement the patches myself.
–i-am-a-dummy mode
This idea comes from MySQL whose stock CLI client supports an –i-am-a-dummy flag. When enabled, MySQL identifies any UPDATE or DELETE that lacks a WHERE clause and fails them with an error. In other words, if you intended to DELETE FROM table1 WHERE pk = 10
, started typing it into your prompt, and only got as far as DELETE FROM table1
before accidentally executing the query, the query errors out instead of having every row deleted. More modern versions of MySQL also use the flag to block queries that return excessively large result sizes, require the WHERE to reference a key column, or allow a LIMIT in place of a WHERE.
Unit test mode (random result sorting)
The order of a query’s results is famously unspecified unless you nail it down with an ORDER BY. The flexibility is useful: sometimes you need a stable order for whatever code consumes the query results and sometimes you just don’t. But there is a secret, third thing: some code comes to accidentally depend on a coincidental ordering of the results. To help flush these out it would be neat if PostgreSQL could intentionally shuffle the order of query results if a given query did not specify an ORDER BY. I imagine this as a run-time configuration field for the database that you’d turn on when running automated tests or in a testing environment. The alternative, folding every query in your application into a subquery that randomly sorts the results, is ugly and invasive, although perhaps some ORMs could implement the feature at their level.
Update: User avianlyric on Hacker News pointed out that CLUSTER can can be used to reorder results in a pinch.
Query progress in psql
This is another feature from MariaDB. It’s possible in both databases to query the database to get progress information on several types of queries, but the ALTER TABLE, CREATE INDEX, DROP INDEX and LOAD DATA INFILE queries also automatically inform the CLI client on their progress. The CLI uses this to draw a nice animated status bar with some informational messages. PostgreSQL lets you query the system catalog for the progress of several query types but has no live update functionality in the psql client.
Pandas-like join validation
I am a true database expert who has never screwed up a join before. But there are many beginners out there and it would be neat to extend the SQL language to support validation features like Pandas has for its join function. When doing a join you’d specify what relation you expect to have unique keys and which one has potential duplicates and PostgreSQL could warn or panic when this assertion is triggered. Some hypothetical syntax:
SELECT x, y FROM t1 JOIN t2 USING (key) VALIDATE 1:m
Which would have PostgreSQL check that t1.key has no duplicate rows and errors if it comes across any. This is another place where it would be useful to have a runtime option to disable the expensive uniqueness checks.
Pandas’ “suffixes” argument, which automatically renames overlapping columns in the output, and “indicator”, which emits a ne