Using GORM – Part 3: Models and Idempotency
Previous posts in the series:
Part 1: Introduction
Part 2: Transactions and Save Points
So far, GORM has helped us abstract our data-access logic and removed the need to write SQL queries directly, but is everything perfect? In this post, I will cover some of the pitfalls I’ve noticed and advise on how to avoid them.
Modeling
With GORM, it is common to use the Model() function when updating existing records. This is because it removes the need to specify the table name, the primary key, and possibly other arguments.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | func TestUpdateWithModel(t *testing.T) { db := createIsolatedDatabaseConnection(t) // Assuming we have a bill object var bill *Bill err := db. First(&bill, Bill{UUID: "a49140b2-ac55-488d-883d-941c7d4bc81c"}). Error require.NoError(t, err) err = db. Model(bill). Updates(Bill{DueDate: time.Now().Add(24 * time.Hour)}). Error require.NoError(t, err) } |
This will generate the wanted SQL query to make the update as intended:
1 | [1.677ms] [ROWS:1] UPDATE "bills" SET "due_date"='2023-03-08 03:39:51.653' WHERE "uuid" = 'a49140b2-ac55-488d-883d-941c7d4bc81c' |
But now, what will happen if the “User” object is set on the bill?
Let’s change the code to preload the user object to find out.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | func TestUpdateWithModel(t *testing.T) { db := createIsolatedDatabaseConnection(t) // Assuming we have a bill object var bill *Bill err := db. Preload("User"). First(&bill, Bill{UUID: "a49140b2-ac55-488d-883d-941c7d4bc81c"}). Error require.NoError(t, err) err = db. Model(bill). Updates(Bill{DueDate: time.Now().Add(24 * time.Hour)}). Error require.NoError(t, err) } |
With this example, the generated SQL queries will look like this:
1 2 | [1.486ms] [ROWS:0] INSERT INTO "users" ("uuid","first_name","last_name") VALUES ('04469cf4-322b-4ce4-95c4-3bc3b1a529f6','Alexander','Sirotin') ON CONFLICT DO NOTHING [3.309ms] [ROWS:1] UPDATE "bills" SET "due_date"='2023-03-08 03:42:28.23' WHERE "uuid" = 'a49140b2-ac55-488d-883d-941c7d4bc81c' |
Wait. What? It looks like GORM is trying to create the User object as it is associated with the bill object. This is a typical pitfall developers usually miss, as this is a side-effect that is not always intended. In this example, no harm is done as we have the user, and nothing changed, but you may model objects with other details that may accidentally create new objects or modify existing ones.
One thing to prevent this from happening is to omit all associations when using GORM with any mutating operations (INSERT, UPDATE, DELETE). The updated, relevant piece of code will look like this:
1 2 3 4 5 | err = db. Omit(clause.Associations). Model(bill). Updates(Bill{DueDate: time.Now().Add(24 * time.Hour)}). Error |
Furthermore, it gets even more interesting when you are updating multiple records. Assuming I have a few bills (not necessarily related to the same user), I want to update their due date. My tables look like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 | test_db=# SELECT * FROM users; uuid | first_name | last_name --------------------------------------+------------+----------- 79cc783f-9a2f-4ca5-a1ce-c21b81b6602f | Alexander | Sirotin c2641724-6e2c-4cb7-8aed-506d4c87e12b | Bob | Marley (2 ROWS) test_db=# SELECT * FROM bills; uuid | user_id | amount | due_date --------------------------------------+--------------------------------------+--------+------------------------------- 43069918-ea84-4ac2-886b-9abe64ef78d9 | 79cc783f-9a2f-4ca5-a1ce-c21b81b6602f | 1000 | 2023-03-07 03:56:19.908971+00 8fc031b8-9b32-4838-9ef9-13c2427bea00 | c2641724-6e2c-4cb7-8aed-506d4c87e12b | 500 | 2023-03-07 03:56:19.913301+00 (2 ROWS) |
Now, let’s have the following piece of code to update the due date of all the overdue bills.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | func TestUpdateWithMultipleModels(t *testing.T) { db := createIsolatedDatabaseConnection(t) var bills *Bill err := db. Where("due_date < ?", time.Now()). Find(&bills). Error require.NoError(t, err) err = db. Model(bills). Updates(Bill{DueDate: time.Now().Add(24 * time.Hour)}). Error require.NoError(t, err) } |
The generated SQL queries will look like this:
1 2 | [2.387ms] [ROWS:1] SELECT * FROM "bills" WHERE due_date < '2023-03-07 04:01:38.68' [1.433ms] [ROWS:1] UPDATE "bills" SET "due_date"='2023-03-08 04:01:38.683' WHERE "uuid" = '43069918-ea84-4ac2-886b-9abe64ef78d9' |
That’s interesting, right? I expected two updates, but instead, I got only 1 row. The reason is that GORM is not intelligent enough to complain that although I am using Find() and there are more than one record that matches the condition, I passed a single Bill variable, not a slice.
Practical Advice: In case I am passing “real” objects into GORM to generate some mutating operations on my data, I always use Omit(clause.Associations)
Tip: Always have some logging or metrics around these things so you can monitor your program and find such issues. It is also a good practice to have unit tests that test calls with multiple records.
After fixing the code to have a slice of bills, the result is as expected:
1 2 | [2.356ms] [ROWS:2] SELECT * FROM "bills" WHERE due_date < '2023-03-07 04:08:29.434' [1.694ms] [ROWS:2] UPDATE "bills" SET "due_date"='2023-03-08 04:08:29.437' WHERE ("uuid" = '43069918-ea84-4ac2-886b-9abe64ef78d9' OR "uuid" = '8fc031b8-9b32-4838-9ef9-13c2427bea00') |
So GORM is smart enough to combine multiple updates into a single UPDATE clause. Let’s make it more interesting by preloading the bill associations before calling updates.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | func TestUpdateWithMultipleModels(t *testing.T) { db := createIsolatedDatabaseConnection(t) var bills []*Bill err := db. Preload("User"). Where("due_date < ?", time.Now()). Find(&bills). Error require.NoError(t, err) err = db. Model(bills). Updates(Bill{DueDate: time.Now().Add(24 * time.Hour)}). Error require.NoError(t, err) } |
The generated SQL queries will look like this:
1 2 3 4 | [1.239ms] [ROWS:2] SELECT * FROM "users" WHERE "users"."uuid" IN ('79cc783f-9a2f-4ca5-a1ce-c21b81b6602f','c2641724-6e2c-4cb7-8aed-506d4c87e12b') [3.368ms] [ROWS:2] SELECT * FROM "bills" WHERE due_date < '2023-03-07 04:15:03.868' [1.572ms] [ROWS:0] INSERT INTO "users" ("uuid","first_name","last_name") VALUES ('79cc783f-9a2f-4ca5-a1ce-c21b81b6602f','Alexander', 'Sirotin'),('c2641724-6e2c-4cb7-8aed-506d4c87e12b','Bob','Marley') ON CONFLICT DO NOTHING [3.299ms] [ROWS:2] UPDATE "bills" SET "due_date"='2023-03-08 04:15:03.872' WHERE ("uuid" = '43069918-ea84-4ac2-886b-9abe64ef78d9' OR "uuid" = '8fc031b8-9b32-4838-9ef9-13c2427bea00') |
So GORM is trying to create the users and then updates the bill’s due date as expected. We already know that the attempt to create the users is due to the object associations that GORM tries to resolve. But now comes the more exciting part, what if I change Updates() to UpdateColumn() ?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | func TestUpdateWithMultipleModels(t *testing.T) { db := createIsolatedDatabaseConnection(t) var bills []*Bill err := db. Preload("User"). Where("due_date < ?", time.Now()). Find(&bills). Error require.NoError(t, err) err = db. Model(bills). UpdateColumn("due_date", time.Now().Add(24 * time.Hour)). Error require.NoError(t, err) } |
At this stage, another weird thing happened, and the generated SQL queries look like this:
1 2 3 4 | [1.961ms] [ROWS:2] SELECT * FROM "users" WHERE "users"."uuid" IN ('79cc783f-9a2f-4ca5-a1ce-c21b81b6602f','c2641724-6e2c-4cb7-8aed-506d4c87e12b') [4.978ms] [ROWS:2] SELECT * FROM "bills" WHERE due_date < '2023-03-07 04:18:28.694' [2.108ms] [ROWS:0] INSERT INTO "users" ("uuid","firs," ," ," name","last_name") VALUES ('79cc783f-9a2f-4ca5-a1ce-c21b81b6602f','Alexander','Sirotin'),('c2641724-6e2c-4cb7-8aed-506d4c87e12b','Bob','Marley') ON CONFLICT DO NOTHING [4.445ms] [rows:2] UPDATE "bills" SET "due_date"='2023-03-08 04:18:28.699',"user_id"='c2641724-6e2c-4cb7-8aed-506d4c87e12b' WHERE ("uuid" = '43069918-ea84-4ac2-886b-9abe64ef78d9' OR "uuid" = '8fc031b8-9b32-4838-9ef9-13c2427bea00') |
For some reason, GORM decided to update the user_id of both items (in addition to their due date) and linked both bills to the same user !!! This is a severe issue and an actual data corruption caused by GORM.
As before, omitting associations solves the problem, and by adding it, the generated SQL query will be:
1 | [1.575ms] [ROWS:2] UPDATE "bills" SET "due_date"='2023-03-08 04:23:57.905' WHERE ("uuid" = '43069918-ea84-4ac2-886b-9abe64ef78d9' OR "uuid" = '8fc031b8-9b32-4838-9ef9-13c2427bea00') |
Practical Advice: Using Model() with your actual objects is dangerous, and I would advise against using it unless you know what you are doing. Instead, I would be more explicit and use Table() and Where() functions.
Note: More information regarding GORM handling of associations, can be found in the official documentation.
Idempotency
When creating records in the database, we usually have an idempotency rule different from the primary key. For example, we don’t want more than one user with the same first and last name. GORM allows us to do so by using the FirstOrCreate function. For example, let’s look at the following code:
1 2 3 4 5 6 7 8 9 | func TestIdempotency(t *testing.T) { db := createIsolatedDatabaseConnection(t) user := User{FirstName: "Alexander", LastName: "Sirotin"} err := db. FirstOrCreate(&user). Error require.NoError(t, err) } |
The generated SQL query is far from what I wanted it to be:
1 | [2.869ms] [ROWS:1] SELECT * FROM "users" ORDER BY "users"."uuid" LIMIT 1 |
GORM assumed I didn’t have any search criteria, so it returned the first record from the Users table, sorted by UUID. Because the table has some users, no “Alexander Sirotin” was created.
What would happen if I also initialize the {UUID: “something”} within the User object? Let’s see:
1 2 | [3.740ms] [ROWS:0] SELECT * FROM "users" WHERE "users"."uuid" = 'something' ORDER BY "users"."uuid" LIMIT 1 [1.299ms] [ROWS:1] INSERT INTO "users" ("uuid","first_name","last_name") VALUES ('something','Alexander','Sirotin') |
Huh. So GORM will attempt to use the primary key of the passed object (in this case, the user’s UUID) as the matcher. There was no match on the user.UUID = “something” so GORM created the user.
To fix it and be able to specify exactly the fields you want, you need to be more explicit and pass the condition as the second argument to FirstOrCreate, or use the Where() function:
1 2 3 4 5 6 7 8 9 10 | func TestIdempotency(t *testing.T) { db := createIsolatedDatabaseConnection(t) user := User{FirstName: "Alexander", LastName: "Sirotin"} err := db. Where(user). FirstOrCreate(&user). Error require.NoError(t, err) } |
And the generated SQL query will look like this:
1 | [2.668ms] [ROWS:1] SELECT * FROM "users" WHERE "users"."first_name" = 'Alexander' AND "users"."last_name" = 'Sirotin' ORDER BY "users"."uuid" LIMIT 1 |
Practical Advice: Always be explicit in your queries and don’t use a “real” object, but instead construct the objects you need for the query or use specific columns by name. The reason is the “user” object I pass to the Where() function might come from a different place in the code that can load other attributes into it, changing my idempotency criteria.
– Alexander