lunes, 4 de septiembre de 2023

Yet another example of "code the stupidest thing first"

 I've read a post called "Code the shortest path first", and I couldn't agree more, really. The core idea is a well known one to the reader of this blog (me, mostly). It's not even about YAGNI, or "The Wrong Abstraction". It's about having that "tracer bullet" or "steel thread" that guides you through the implementation phases. Also, if the feature looks too big from the beginning, you can try "SPIDR".

Also, what I find is that my first code, it's usually highly compressed, and that also gives a first soul to the implementation that lives on during the implementation. 

jueves, 31 de agosto de 2023

postgres jsonb_diff and array_unique

 I recently copypasted a couple of functions from stackoverflow for my prostgres projects that are worth mentioning, cause they show a pattern on how to write plpsql functions:



Clojure and aplisms

The more I do APL and K, the more I see forks and hooks everywhere else.

That's just a couple of little helpers I've added in some clojure projects. One is "apply-if", which more than from APL, I'm taking it from any procedural language, where you can do `x= if p(x) then f(x) else x`. We used to do this a lot in lua (using ands and ors).


Second one is an attempt at the fork pattern.

Look how the avg looks like. it's verbose, but it captures the idea very close to how APL does it.



martes, 22 de agosto de 2023

2 classes of system designers

Here's a devastating quote from Andy Kinslow (NATO SOFTWARE ENGINEERING CONFERENCE ,1968) 


There are two classes of system designers. The first, if given five problems will solve them one at a time.

The second will come back and announce that these aren’t the real problems, and will eventually propose a solution to the single problem which underlies the original five.

This is the ‘system type’ who is great during the initial stages of a design project. However, you had better get rid of him after the first six months if you want to get a working system.


I found it in the Software Engineering conference pdf , but I didn't know anything about that "Andy Kinslow". And looking for that person, led me to some interesting findings, like an APL paper, or finding out that Andy Kinslow pioneered work on Time-Sharing-System, following leads of Corbato and Strachey

miércoles, 16 de agosto de 2023

Copy table from another db in postgres

 This one is somehow a followup on "Duplicate row in plain sql". What I needed in this case was to copy 

Both dbs live in the same physical postgres, and the main user has rights over everything, so the process was painless, and took 5 minutes to set it up (accounting for reading the docs).


The trick is to use the postgres_fdw and create a schema "dev" inside db_test, so I can access the tables in db_dev.


Now you can easily do your  `insert into table_to_copy select * from dev.table_to_copy;`

jueves, 29 de junio de 2023

Duplicate row in plain sql

 Another no-magic-here post, but quite a useful one when working with data in dev, and needing to massage it.

Have you found the need to duplicate a row, only with slight modification? SQL doesn't have a standard way to do it, and while some rdbms accept some sort of "SELECT * EXCEPT(id) FROM xyz", you might not have it available in your db (postgres)

The clearest solution I found is here:

It's all about creating a temporary table, and use "like" and "insert into", to avoid having to write all the fields:

CREATE TEMP TABLE tmp (like web_book);
INSERT INTO tmp SELECT * FROM web_book WHERE id = 3;
UPDATE tmp SET id = nextval('web_book_id_seq');
INSERT INTO web_book SELECT * from tmp;


In fact, we could simplify it further (at least in postgres) with:

CREATE TEMP TABLE tmp as SELECT * FROM web_book WHERE id = 3;
UPDATE tmp SET id = nextval('web_book_id_seq');
INSERT INTO web_book SELECT * from tmp;

viernes, 23 de junio de 2023

"delete limit X" is not possible in postgres...

And how to fix it.

 

So I was pretty sure that I had done some sort of large-ish destructive (as in side-effect-y) operations in batches of small numbers.  And today I had to delete a bunch of rows from a bunch of tables.

It turns out that postgres doesn't support "LIMIT" in UPDATE nor DELETEs.

 First of all, the misunderstanding when I remembered doing destructive operations in batches comes from calling postgres functions via select.

Let's say you have to kill the db connections to a db:

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname='db_1234567890abcdef';

This, being a select, you can do things like:

SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname like 'db_%' limit 100;

So yes, you are batching operations, but it's because "selects" are not always side-effect-free.

So now, back to our real problem of doing updates or deletes in batches.

 It's not that it hasn't been asked or thought about. Here's one of the mail threads about it. Tom Lane's answer highlights some of the issues from the correctness perspective.


Ok, but if I really want to do it... what's the best way? Crunchydata has a blog post explaining multiple ways of doing it, but the main takeaway is to use CTEs to select the amount you want (using limit), and then delete by id "where exists" or "where id in".

WITH rows AS (
  SELECT
 something
  FROM
    big_table
  LIMIT 10
)
DELETE FROM big_table
WHERE something IN (SELECT something FROM rows);

Cool stuff. Nothing magical, but it's a good pattern to be familiar with.