Has and belongs to many tables often need unique indexes

This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the databases category.

Last Updated: 2024-03-28

A LawCase had and belonged to many LawDisciplines, and vice-versa thanks to a join table with law_case_id and a law_discipline_id.

I wrote some code that in effect amounted to adding the same law_case multiple times into the association.

law_case = LawCase.last
law_case.law_disciplines << law_case
law_case.law_disciplines << law_case
law_case.law_disciplines << law_case

Wrongly, I assumed that duplicates would not be allowed by Rails. This was not the case. Here's what law_case.law_disciplines contained - i.e. the same discipline "Skills" many times

=> [#<LawDiscipline:0x00007fbbb114b788 id: 306, name: "Skills", level: "LPC", created_at: Sat, 07 Dec 2019 17:30:05 UTC +00:00, updated_at: Fri, 10 Jan 2020 14:47:18 UTC +00:00>,
 #<LawDiscipline:0x00007fbbaed136e0 id: 306, name: "Skills", level: "LPC", created_at: Sat, 07 Dec 2019 17:30:05 UTC +00:00, updated_at: Fri, 10 Jan 2020 14:47:18 UTC +00:00>,
 #<LawDiscipline:0x00007fbbabf5b338 id: 306, name: "Skills", level: "LPC", created_at: Sat, 07 Dec 2019 17:30:05 UTC +00:00, updated_at: Fri, 10 Jan 2020 14:47:18 UTC +00:00>]

Lesson

You need a unique index on a has_and_belongs_to_many table if you wish to avoid duplicates

add_index :law_cases_disciplines, 
  %i[law_case_id law_discipline_id],
  unique: true,
  name: 'law_case_discipline_unique_pair'

In order for this migration to even run, I needed to deduplicate my existing elements, which I located as follows

SELECT * FROM law_cases_disciplines 
GROUP BY law_case_id, law_discipline_id 
HAVING COUNT(*) > 1;