Some intuitions for understanding joins

This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the databases category.

Last Updated: 2024-04-19

Join without an on clause

This means no constraints - i.e. a CROSS JOIN - a result set in which every row in each table is joined to every row in the other table; this is also called a cartesian product.

e.g. in Laravel I can join my locations table (which contains data specific to my web-app) to my migrations table (which exclusively contains db meta-data). This join makes no sense logically since they have connection. But it can still be done:

select * from locations join migrations;

id  test_station_id house_number    street  city    postcode    country latitude    longitude   created_at  updated_at  id  migration   batch
2   2   52  Martinistrasse  Hamburg 20251   Germany 53.59024800 9.97848000  2020-05-05 17:05:08 2020-05-05 17:05:08 3   2020_05_04_123257_create_test_stations  1
1   1   1   Charitépl  Berlin  10117   Germany 52.51635000 13.37755000 2020-05-05 17:05:08 2020-05-05 17:05:08 4   2020_05_05_115937_create_slots  1
2   2   52  Martinistrasse  Hamburg 20251   Germany 53.59024800 9.97848000  2020-05-05 17:05:08 2020-05-05 17:05:08 4   2020_05_05_115937_create_slots  1
1   1   1   Charitépl  Berlin  10117   Germany 52.51635000 13.37755000 2020-05-05 17:05:08 2020-05-05 17:05:08 5   2020_05_05_123541_create_tickets    1
2   2   52  Martinistrasse  Hamburg 20251   Germany 53.59024800 9.97848000  2020-05-05 17:05:08 2020-05-05 17:05:08 5   2020_05_05_123541_create_tickets    1
...

Left Outer Join vs Left Join

There is no difference between these two!

Another way to think about this: an inner join cannot be left/right since it exclusively returns entries that match on both sides. Therefore a left join or a right join have to be outer joins.

How many records are returned when products inner joins users?

Say you have the following schema - a User has many Products, which might be modelled in Rails with:

class User < ActiveRecord::Base
  has_many :products
end

class Product < ActiveRecord::Base
  belongs_to :user
end

These are the numbers of records in each table.

Product.count 
=> 2040


User.count 
=> 400

Assuming that each product has to be have a user, what is the count of their inner join?

User.joins(:products).count

Most people think it's upper limit has to be 400, being limited by the number of records in the smaller table. That thinking is wrong. It will probably be some number between 400 and 2040. Why? Because inner join basically creates a virtual table with one row per match. If somes users have multiple products, then there will be multiple rows, bringing the number closer to the number of rows in the products table.