Never assume data from a vendor is valid

This is part of the Semicolon&Sons Code Diary - consisting of lessons learned on the job.

Last Updated: 2024-04-25

After someone buys something with my guest checkout, I get the user information from PayPal, including the email address. I then create a user out of this email address without much thought.

However, it turns out that some PayPal users have wildly invalid (i.e. impossible) email addresses on file. One had a space inside the address:

"aa_10 person@example.com"

This led to lots of downstream errors when I tried to contact this person or email them their access details. They eventually filed a chargeback.

Instead, when importing data from PayPal, I should have had sanity checks in place to verify it (e.g. a regex on the email). Of course, in a production checkout flow this isn't as simple as adding validates to the user model: the customer has already paid, so I can't just raise an unrescued error. But I could have redirected to a special "repair" page for the incorrect record.
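A minimal sketch of such a sanity check in plain Ruby, outside any Rails model. The method names (plausible_email?, checkout_next_step) and the :create_user / :repair_page symbols are hypothetical and only illustrate the branching; URI::MailTo::EMAIL_REGEXP is the email pattern that ships with Ruby's standard library, and a stricter or looser regex may suit a real checkout better.

```ruby
require "uri"

# Returns true when the vendor-supplied value looks like a usable email.
# nil and other non-string values are coerced to a string first, so they
# simply fail the check instead of raising.
def plausible_email?(raw)
  URI::MailTo::EMAIL_REGEXP.match?(raw.to_s)
end

# Hypothetical decision point in the checkout flow: rather than raising,
# send bad records to a "repair" step where the customer can fix the email.
def checkout_next_step(vendor_email)
  plausible_email?(vendor_email) ? :create_user : :repair_page
end

checkout_next_step("aa_10 person@example.com") # => :repair_page
checkout_next_step("person@example.com")       # => :create_user
```

Returning a symbol instead of raising keeps the paid-for checkout from blowing up and leaves it to the controller to decide where to redirect.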

Moreover, even if their data is "valid" by an objective standard, it might still differ from my own conventions (e.g. I lowercase emails, they don't).
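As a rough illustration, vendor data could be passed through a small formatting step before it touches my own records. The method name normalize_vendor_email is hypothetical, and stripping surrounding whitespace is an extra assumption on top of the lowercasing mentioned above.

```ruby
# Apply my own formatting conventions to an email received from a vendor.
# Assumption: besides lowercasing, leading/trailing whitespace is dropped.
def normalize_vendor_email(raw)
  raw.to_s.strip.downcase
end

normalize_vendor_email("  Person@Example.COM ") # => "person@example.com"
```

Normalizing before validating and saving means lookups by email behave the same regardless of how the vendor happened to store the address.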

Lesson

Never assume a vendor will give you valid data. Treat it with the same skepticism you'd apply to user input: validate it and format it to your own conventions.