Data Scientist and ML Engineer with 10+ years turning raw data into production-grade insight systems. Expert in statistical analysis, pandas workflows, feature engineering, scikit-learn, PyTorch, and MLOps. I care about reproducibility, model quality, explainability, and building pipelines that survive real operational pressure.

OpenCV image preprocessing for OCR and vision pipelines

A lot of computer vision performance comes from cleaner inputs rather than larger models. I use OpenCV for resizing, denoising, thresholding, and contour extraction when preparing images for OCR or downstream classification. These classical steps ofte

Geospatial analysis with GeoPandas for location intelligence

Location data becomes useful when spatial joins and distance-based features are handled correctly. GeoPandas is enough for many routing, service coverage, and market analysis tasks before you need heavier GIS infrastructure. I care about coordinate sy
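
A minimal sketch of a spatial join plus a distance-based feature, assuming a recent GeoPandas (the `predicate` argument of `sjoin` and `estimate_utm_crs` are not available in older releases). The store points, the single zone polygon, the depot location, and all column names are made up for illustration.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical inputs in WGS84 longitude/latitude.
stores = gpd.GeoDataFrame(
    {"store_id": ["a", "b"]},
    geometry=[Point(0.5, 0.5), Point(3.0, 3.0)],
    crs="EPSG:4326",
)
zones = gpd.GeoDataFrame(
    {"zone": ["north"]},
    geometry=[Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])],
    crs="EPSG:4326",
)

# Spatial join: attach the zone each store falls within (NaN when outside every zone).
joined = gpd.sjoin(stores, zones, how="left", predicate="within")

# Distances in degrees are not meaningful, so reproject to a metric CRS first.
utm = stores.estimate_utm_crs()
stores_m = stores.to_crs(utm)
depot = gpd.GeoSeries([Point(0.0, 0.0)], crs="EPSG:4326").to_crs(utm).iloc[0]
stores_m["dist_to_depot_m"] = stores_m.distance(depot)
```

Both layers share a CRS before the join, and the distance is computed only after projecting to metres. A single UTM zone is a reasonable assumption for a small region but not for data spanning many zones.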

Regular expressions for extracting structured entities from raw text

Regex is not glamorous, but it remains one of the fastest ways to turn messy text into useful structured fields. I use it for IDs, dates, codes, and log fragments before reaching for heavier NLP. The important part is making patterns specific enough t
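
A minimal sketch of pulling IDs and dates out of a log line with deliberately specific patterns. The `ORD-` ID format, the sample line, and the helper name `extract_fields` are hypothetical and stand in for whatever the real data uses.

```python
import re

# Hypothetical ID format: "ORD-" followed by exactly six digits.
ORDER_ID = re.compile(r"\bORD-\d{6}\b")
# ISO-style dates with range-checked month and day (calendar validity is not checked).
ISO_DATE = re.compile(r"\b\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])\b")


def extract_fields(text):
    """Return the order IDs and dates found in a raw text fragment."""
    return {
        "order_ids": ORDER_ID.findall(text),
        "dates": ISO_DATE.findall(text),
    }


sample = "2024-03-15 ERROR order ORD-004217 failed; retry of ORD-00421 skipped on 2024-13-01"
fields = extract_fields(sample)
# The five-digit ID and the month-13 date are rejected rather than half-matched.
```

The word boundaries and fixed digit counts keep the patterns from matching truncated IDs or impossible months.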

Web scraping pipelines with requests and BeautifulSoup

For lightweight data collection, I prefer reliable HTML parsing over brittle browser automation. That means stable headers, polite rate limiting, retries, and explicit extraction rules. If scraping becomes core infrastructure, then I graduate it into
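
A minimal sketch of those pieces with `requests` and BeautifulSoup: a session with a fixed User-Agent and retry policy, a fetch helper with a fixed delay, and one explicit extraction rule. The User-Agent string, the CSS selector, the sample HTML, and the function names are assumptions for illustration. Nothing here touches the network until `fetch` is called.

```python
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session():
    """Session with a stable User-Agent and retries on transient HTTP errors."""
    session = requests.Session()
    session.headers.update({"User-Agent": "example-collector/0.1 (contact: ops@example.com)"})
    retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[429, 500, 502, 503, 504])
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


def fetch(session, url, delay_seconds=1.0):
    """Polite fetch: wait a fixed delay before each request, fail loudly on errors."""
    time.sleep(delay_seconds)
    response = session.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_links(html, base_url):
    """Explicit rule: only anchors directly inside <li class="item"> count."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"title": a.get_text(strip=True), "url": urljoin(base_url, a["href"])}
        for a in soup.select("li.item > a[href]")
    ]


SAMPLE_HTML = """
<ul>
  <li class="item"><a href="/p/1">First</a></li>
  <li class="item"><a href="/p/2"> Second </a></li>
  <li class="ad"><a href="/x">Sponsored</a></li>
</ul>
"""
session = build_session()
links = extract_links(SAMPLE_HTML, "https://example.com")
```

Keeping extraction separate from fetching means the parsing rules can be tested against saved HTML without any network access.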

SQL window functions for feature extraction and behavioral ranking

A surprising amount of feature engineering is best done in SQL before Python ever runs. ROW_NUMBER, LAG, rolling windows, and partitioned aggregates are ideal for deriving customer behavior signals close to the source. I use SQL here when it reduces m
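
A minimal sketch of ROW_NUMBER, LAG, and a rolling partitioned aggregate, run through Python's built-in `sqlite3` so it is self-contained (window functions need SQLite 3.25 or newer). The `orders` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, order_day INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("c1", 1, 10.0), ("c1", 4, 20.0), ("c1", 9, 30.0), ("c2", 2, 5.0)],
)

QUERY = """
SELECT
    customer_id,
    order_day,
    ROW_NUMBER() OVER w                   AS order_rank,       -- nth order per customer
    order_day - LAG(order_day) OVER w     AS days_since_prev,  -- NULL on the first order
    SUM(amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_day
        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    )                                     AS rolling_2_amount  -- current + previous order
FROM orders
WINDOW w AS (PARTITION BY customer_id ORDER BY order_day)
ORDER BY customer_id, order_day
"""
rows = conn.execute(QUERY).fetchall()
```

Each output row is already a per-order feature vector, so the Python side only has to load it.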

Great Expectations checks for dataset health before retraining

Before retraining, I want hard guarantees that the data feed still looks structurally sane. Great Expectations gives teams a shared validation language that analysts, ML engineers, and data engineers can all inspect. I use it to codify invariants that
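
The sketch below shows the kind of invariants such a suite might encode. It deliberately does not use the Great Expectations API, whose entry points differ between releases. It is a plain-Python stand-in that mirrors three standard expectations (`expect_column_values_to_not_be_null`, `expect_column_values_to_be_between`, `expect_column_values_to_be_in_set`). The column names, bounds, and allowed values are hypothetical.

```python
def check_not_null(rows, column):
    """Every row must have a non-null value in the column."""
    bad = [i for i, r in enumerate(rows) if r.get(column) is None]
    return {"check": f"not_null({column})", "success": not bad, "failing_rows": bad}


def check_between(rows, column, low, high):
    """Non-null values must lie in the closed range [low, high]."""
    bad = [
        i for i, r in enumerate(rows)
        if r.get(column) is not None and not (low <= r[column] <= high)
    ]
    return {"check": f"between({column})", "success": not bad, "failing_rows": bad}


def check_in_set(rows, column, allowed):
    """Non-null values must come from a known set of categories."""
    bad = [
        i for i, r in enumerate(rows)
        if r.get(column) is not None and r[column] not in allowed
    ]
    return {"check": f"in_set({column})", "success": not bad, "failing_rows": bad}


def validate(rows):
    """Run all invariants; a retraining job would be gated on the first return value."""
    results = [
        check_not_null(rows, "user_id"),
        check_between(rows, "age", 0, 120),
        check_in_set(rows, "plan", {"free", "pro"}),
    ]
    return all(r["success"] for r in results), results


good_batch = [
    {"user_id": 1, "age": 34, "plan": "free"},
    {"user_id": 2, "age": 51, "plan": "pro"},
]
bad_batch = [
    {"user_id": None, "age": 34, "plan": "free"},
    {"user_id": 3, "age": 212, "plan": "trial"},
]

good_ok, _ = validate(good_batch)
bad_ok, bad_results = validate(bad_batch)
```

In a real setup these rules would live in an expectation suite so the same definitions are visible to everyone who inspects the validation results.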