data-quality

Normalize Tags at Write Time

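Tags are user input, so normalize them aggressively: lowercase, trim leading and trailing whitespace, collapse internal whitespace runs, and remove duplicates. Doing this at write time keeps search and filter logic simple, because queries never have to treat variants such as "Python", " python ", and "PYTHON" as separate tags.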
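The normalization steps can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name normalize_tags is hypothetical, and dropping empty tags and keeping first-seen order are assumptions the text above does not specify.

```python
def normalize_tags(raw_tags):
    """Return lowercased, whitespace-normalized, de-duplicated tags.

    Assumptions (not stated in the original note): empty tags are
    dropped, and the first occurrence of each tag decides its position.
    """
    seen = set()
    normalized = []
    for tag in raw_tags:
        # str.split() with no arguments trims both ends and splits on any
        # run of whitespace, so joining with a single space collapses runs.
        cleaned = " ".join(tag.split()).lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            normalized.append(cleaned)
    return normalized


print(normalize_tags(["  Python ", "python", "Machine   Learning", ""]))
# ['python', 'machine learning']
```

Calling this in the write path (for example, just before the record is saved) means stored tags are already canonical, so read-side code can compare them with plain equality. If tags may contain non-English text, str.casefold() could be swapped in for lower() for more aggressive case folding.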

Great Expectations checks for dataset health before retraining

Before retraining, I want hard guarantees that the data feed still looks structurally sane. Great Expectations gives teams a shared validation language that analysts, ML engineers, and data engineers can all inspect. I use it to codify invariants that