PostgreSQL Arrays and EF Core
26 Jan 2023We recently faced the challenge in our team of storing a collection of primitive values (in our case string
) within PostgreSQL. I’d like to share some interesting insights and learnings, so let’s go!
Imitating arrays by string concatenation
As we’re using EF Core, our first naive approach was to use a field-only property which looked like this:
That worked pretty well, and we built our business logic around this pattern. But to answer questions like which books are written by this particular author, all Book
entities had to be loaded from the database into memory - dark performance problem clouds darken the friendly Swiss Post sky. So this search was moved to the database like this:
Which had some downsides as well:
- A text search with wildcards (
%
) is necessary. - The field
_concatenatedAuthorNames
must be known but it isprivate
. - What if there are authors with double names like Shakespeare and Shakespeare-Doyle?
Using arrays on the database
Some days later, a colleague nudged me to give the PostgreSQL Array Type Mapping a try - a feature I was simply unaware of and which lets you store and search arrays on the database-level.
There are basically two ways for using the PostgreSQL data type text[]
in .NET POCOs:
string[]
List<string>
We strove for using the latter and here’s what our code looked like:
Super nice from the .NET perspective because…
- the business logic doesn’t need to know about any EF Core specific details like
_concatenatedAuthorNames
, - the private field
_concatenatedAuthorNames
is gone at all, - no need to manually override the model builder in
OnModelCreating()
, - well-adopted types like
List<T>
or LINQ APIs like.Contains()
can be used, feeling very natural.
Testing with SQLite
All our system integration tests (using a containerized PostgreSQL database) passed after switching the approach - but the unit and integration test suite literally exploded 🤯💣 What happened?
First of all let’s have a look how EF Core translates the BookContext
:
We see that AuthorNames
is mapped to the database type text[]
which exists for PostgreSQL but not for SQLite. Since our unit and integration tests are using an in-memory SQLite database, this has to be adapted.
To not pollute the BookContext
with testing concerns, a dedicated TestBookContext
is used:
Please notice three important aspects:
listComparer
→ the EF Core Change Tracker will use this comparer to identify changed entities. Without this comparer, EF Core would only detect a completely changed collection (e. g.book.AuthorNames = new List<string> { "New author" };
) but not changes to the collection’s content (e. g.book.AuthorNames.Add("New author");
).listConverter
→ as SQLite has no equivalent array datatype, we pick up the string concatenation approach from this post’s beginning and treat the .NET collection (List<string>
) as a plain string on the database-level.- Datatype of column
AuthorNames
is nowtext
instead oftext[]
and uses the customlistConverter
andlistComparer
.
Now all unit and integration tests were green 🥳
Summary
The colleague who nudged me to use this feature found the perfect summary: thank you for going down the rabbit hole with me. I couldn’t agree more! 😅 It was quite a journey to get it all up and running, but in the end I appreciate this solution due to its simplicity and performance.