If you spend any time writing code to talk to a database from an object-oriented language, you will come across the notion of an object-relational impedance mismatch. This is the idea that there are fundamental differences between the data models of relational databases and object-oriented programming languages that make it difficult to map between the two. This idea has spawned a large number of object-relational mapping (ORM) tools that attempt to bridge the divide. In this post, I’ll argue that while there are many practical differences between SQL and typical OO languages, there is no fundamental disagreement between the two – so long as we adopt an appropriate logical view of objects.
Many developers these days are more familiar with OO programming languages than with relational databases. So modelling a problem domain often starts with objects and classes, followed by working out how to “map” those down to a set of database tables. This is often seen as the problem of persistence: how do I make my objects live beyond the lifetime of my process?
I tend to view a relational database quite differently. The relational model is expressively similar to a cut-down version of first-order logic, and a relational database can be seen as a theorem-prover for that fragment of logic. We can see this most clearly in Datalog, a variant of Prolog designed to work well with relational databases. Datalog with negation can express all queries of relational algebra, plus recursive queries that go beyond the pure relational model.
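To make the “beyond relational algebra” point concrete, here is a minimal sketch (in Python, not Datalog itself) of the classic recursive ancestor query: computing the transitive closure of a parent relation by fixed-point iteration, which is exactly the kind of query Datalog adds on top of relational algebra.

```python
def transitive_closure(pairs):
    """Repeatedly join the relation with itself until no new facts appear."""
    closure = set(pairs)
    while True:
        new = {(a, c) for (a, b) in closure for (b2, c) in closure if b == b2}
        if new <= closure:
            return closure
        closure |= new

# parent(alice, bob). parent(bob, carol).
parent = {("alice", "bob"), ("bob", "carol")}
ancestor = transitive_closure(parent)
# ancestor also contains the derived fact ("alice", "carol")
```

In Datalog the same query is just two rules (ancestor(X, Y) :- parent(X, Y). and ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).); the fixed-point loop above is what a Datalog engine does for you.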
When deciding how to map a set of classes to a relational database, a natural approach for most developers would be to map fields of a class to columns in a table, and instances of that class to rows in the table. Inheritance (where still used) and collection-valued fields can be handled with additional tables that are JOINed in queries.
A different, more logical, but initially quite surprising view would be instead to view each class as a table with just one column—a unary predicate in logical terms—and every field as a table with two columns. Madness! But let me explain.
Objects, logically
Let’s suppose you have a class for representing people in your application. A person has a name, a date of birth, and a home address.
class Person {
    String name;
    Date dateOfBirth;
    Address homeAddress;
}

class Address {
    String streetAddress;
    String townOrCity;
    String region;
    String postalCode;
}
Person peter = new Person(...);
In a traditional mapping you might end up with two tables: people and addresses. But in a purely logical model you would instead view this as a collection of assertions about a person:
person(peter).
name(peter, "Peter Person").
date_of_birth(peter, 1944-02-04).
home_address(peter, peters_house).
address(peters_house).
street_address(peters_house, "1 Person Drive").
And so on. These would then form the relations: person (or people), name, date_of_birth, etc.
Of course, having lots of tiny tables that require a huge number of joins to do anything is not going to make your DBAs happy. But my argument is that this is the right way to model your data initially, which you can then denormalize for performance.
(For query-only data models, such as the Q side of a CQRS architecture, you can actually do something like this: define denormalized tables but add a layer of unary and binary views on top. A good query optimizer will eliminate the views).
The reason I think this is the right way to think about things is that the relational-OO impedance mismatch largely disappears when you model things this way. Collection-valued attributes? Just have multiple entries in the field table:
address(peter, home).
address(peter, work).
Inheritance? Easy! Just insert rows into the relations for the superclass as well.
class Employee extends Person {
    String jobRole;
}
person(peter).
name(peter, …).
…
employee(peter).
job_role(peter, manager).
Optional/nullable fields? Just don’t insert anything into the field relation for that object.
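All three cases can be sketched directly in plain Python, treating each class as a set of ids and each field as a set of (id, value) pairs. The variable names are illustrative.

```python
# Each class is a set of ids; each field is a binary relation (a set of pairs).
person   = {"peter"}
employee = {"peter"}                               # inheritance: same id in both sets
name     = {("peter", "Peter Person")}
address  = {("peter", "home"), ("peter", "work")}  # collection-valued field: two rows
nickname = set()                                   # optional field: simply no rows

def values(field, obj):
    """All values of a field for one object (a join on the first column)."""
    return {v for (o, v) in field if o == obj}

# values(address, "peter") yields {"home", "work"};
# values(nickname, "peter") yields the empty set.
```

There is no NULL, no join table for the collection, and no discriminator column for the subclass: each awkward ORM case is just the presence or absence of rows in a relation.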
I have a pet project programming language that I occasionally work on that is a logic programming language with a little syntactic sugar to make this connection more apparent. The person example can be written in it as follows:
person peter:
    name "Peter Person".
    date_of_birth 1944-02-04.
    address home:
        street "1 Person Way".
        ...
    end
end
This looks like the OO version but expands into the logical form.
The logic of JSON
This view of objects also allows us to give a logical interpretation of object/record-based data formats such as JSON. In this case, each field of a JSON object can be taken to be an assertion about some object. But what is the object? In a programming language you’d create instances of objects and give them names. But in JSON you just have a naked object. The correct logical interpretation is to view a JSON object as an existential statement: asserting the existence of some (unnamed) individual. Thus a JSON object like
{
  "name": "Peter Person",
  "dob": "1944-02-04",
  "address": { ... }
}
can be viewed as a logical statement of the form:
∃x. name(x, "Peter Person") ∧ dob(x, "1944-02-04") ∧ ∃a. (address(x, a) ∧ street(a, …) ∧ …)
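This existential reading can be made mechanical: flatten a JSON object into (subject, predicate, value) facts, inventing a fresh witness (a Skolem constant) for each anonymous object. A minimal sketch, with illustrative names:

```python
import itertools

_ids = (f"_x{n}" for n in itertools.count())  # fresh Skolem constants

def to_facts(obj):
    """Turn a JSON-like dict into a subject id plus a list of facts about it."""
    subject = next(_ids)  # "there exists some x such that..."
    facts = []
    for key, value in obj.items():
        if isinstance(value, dict):
            child, child_facts = to_facts(value)   # nested ∃ for the inner object
            facts.append((subject, key, child))
            facts.extend(child_facts)
        else:
            facts.append((subject, key, value))
    return subject, facts

peter, facts = to_facts({
    "name": "Peter Person",
    "dob": "1944-02-04",
    "address": {"street": "1 Person Drive"},
})
# facts now contains (peter, "name", "Peter Person"), (peter, "dob", ...),
# (peter, "address", <fresh id>), (<fresh id>, "street", "1 Person Drive")
```

This is essentially how RDF handles the same problem: anonymous JSON objects become blank nodes.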
There is some person, named Peter Person, who was born in February 1944, and has a home address of … and so on.
Thinking in this way can clarify conceptual difficulties in the use of objects and JSON. For example, a JSON Web Token is defined as a set of claims represented as a JSON object and then signed and/or encrypted. A JWT Claims Set is an object like the following:
{
  "iat": 123456,
  "exp": 124056,
  "sub": "peter",
  "iss": "https://login.example.com"
}
If we apply the logical interpretation to this JSON object you can see that it is making some claims (and some constraints) about some thing. But that thing isn’t a person – these are claims about the JWT itself. It is the token that was issued at a given time, and the token that will expire in 10 minutes. I certainly hope so anyway, for Peter’s sake! Even the subject (sub) claim is making a claim about the token: this token is about Peter.
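In the same notation as before, the claims set above reads as a single existential statement (writing t for the token):

∃t. iat(t, 123456) ∧ exp(t, 124056) ∧ sub(t, peter) ∧ iss(t, "https://login.example.com")

Every conjunct has t – the token itself – as its first argument.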
Many applications of JWTs go on to define additional claims and they usually do so as new top-level claims. But this can lead to confusion as these new top-level claims are usually about some other object than the token. For example, OpenID Connect defines many claims about a user: preferred_username, email, given_name, and so on. Under the logical view, having these claims mixed with claims about the token itself is nonsensical – a single object can’t be both a person and a token:
There exists some thing that was issued at 9am today. It will expire in 10 minutes. Its name is Peter. It was born in 1944… wait, what?
The correct thing to do according to the logical interpretation is to define a new sub-object to hold these claims:
{
  "iat": 123456,
  "iss": "https://login.example.com",
  "sub": {
    "id": "peter",
    "email": "peter@example.com",
    "given_name": "Peter",
    ...
  }
}
There exists some thing that was issued at 9am today. It will expire in 10 minutes. It’s about some subject, whose name is Peter, who was born in 1944, …
Suddenly this conceptually makes more sense. It is not the token that was born in 1944, but the subject of the token.
Does this matter?
You might disagree with my logical characterisation of objects, or you might let it stand but argue that it makes no real difference. Nobody really gets confused about what claims are being made in OpenID Connect.
In most cases, I’d agree that nobody does get confused about these things, but sometimes they do. JWTs are often used as a format for OAuth2 access tokens. There is also a standard for introspecting OAuth access tokens by calling an HTTP endpoint on the authorization server (AS). The AS returns a response indicating whether the token is still valid and some metadata about the token, such as what client it was issued to, which user approved it, and so on. Some of the claims in the response are based on claims defined for JWT: the AS can return the time when the access token was originally issued as an “iat” claim, or its expiry as the “exp” claim. All well and good.
The problem is that now there is a proposal to let the AS sign the introspection response (for legal/non-repudiation purposes), making the response itself a JWT. But now, is the “iat” claim a claim about the new JWT or about the original access token? With our logical hat on, the solution seems clear: the top-level claims are claims about the JWT and any claims about the original token should be enclosed in an inner object:
{
  "iat": 123456,
  "iss": "as.example.org",
  "token_info": {
    "active": true,
    "iat": 123000,
    "sub": "peter",
    ...
  }
}
Justin Richer has argued the same point on the OAuth mailing list, and I agree with everything he says there. Conceptual clarity about what is being said about what is crucial, especially when security (and legal) issues are involved.
For me, logical analysis is the best tool for ironing out such conceptual issues.