Hold back on string validations until very sure of data format

This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job. You're in the user-input-and-output category.

Last Updated: 2024-03-28

In OxbridgeNotes, I added intense validations to (law) case names, requiring them to match the form "X v Y". But a few cases were named "Re X", and others, in the EU, used "and" instead of "v".

In the end it was validation whack-a-mole, so I removed this code, despite investing hours into writing it and writing tests for it. Three hours down the drain, without pay. It would have taken me an hour to check them all manually instead.

Lesson

Be very wary about adding regex validations to strings you don't fully understand the format of. Understanding comes from working with a wide variety of representative data (in my case, law cases from perhaps 8 different files containing 20 each). Only at that point should I consider whether regex validations are realistic or helpful.
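
One way to act on this is to run a candidate pattern over a sample of real data as a report, before turning it into a hard validation. The following is a minimal sketch in Python, not the code I actually wrote: the pattern, the function name audit_case_names, and the three case names are made-up placeholders for illustration.

```python
import re

# Hypothetical strict pattern of the form "<party> v <party>".
STRICT_CASE_NAME = re.compile(r"^\S.* v \S.*$")


def audit_case_names(names, pattern=STRICT_CASE_NAME):
    """Split names into those that match the pattern and those that do not.

    Nothing is rejected or raised here; the point is only to see how
    well the pattern fits the data before enforcing it.
    """
    matching, rejected = [], []
    for name in names:
        if pattern.match(name):
            matching.append(name)
        else:
            rejected.append(name)
    return matching, rejected


# Placeholder sample; in practice this would be loaded from real files.
sample = [
    "Smith v Jones",
    "Re Smith",
    "Commission and Council",
]

matching, rejected = audit_case_names(sample)
print(f"{len(rejected)} of {len(sample)} names would be rejected:")
for name in rejected:
    print("  ", name)
```

If the rejected list keeps surfacing legitimate formats, that is a sign the format is not yet understood well enough for a strict validation.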