Static application security testing with Semgrep in CI

SAST is most useful when rules are actionable and fit the stack. I use Semgrep to catch dangerous patterns like command injection, weak crypto, SSRF sinks, and raw SQL interpolation. The signal stays high when teams tune rules and suppressions deliber
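
To make the command-injection case concrete, here is a minimal Python sketch of the kind of code such a rule is meant to flag, next to a safer shape. The function names and the echo helper are hypothetical, and this illustrates the target pattern, not a Semgrep rule.

```python
import subprocess
import sys

# Tiny child program that prints its first argument (stand-in for a real tool).
ECHO_SNIPPET = "import sys; print(sys.argv[1])"


def run_flagged(user_input: str) -> str:
    # Pattern a command-injection rule would typically report: user input is
    # interpolated into a string that a shell then parses.
    command = f"echo {user_input}"
    return subprocess.run(command, shell=True, capture_output=True, text=True).stdout


def run_safe(user_input: str) -> str:
    # Argument list and no shell: the input stays one literal argument.
    argv = [sys.executable, "-c", ECHO_SNIPPET, user_input]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

With `run_safe`, shell metacharacters such as `;` reach the child process as plain text instead of being interpreted.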

Secure random token generation for sessions and recovery flows

Predictable tokens become account compromise. I use cryptographically secure randomness, store only token digests when possible, and keep token purpose and expiry specific. Reset tokens, magic links, and API secrets should all be treated like credenti
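
A minimal sketch of that approach using only the Python standard library. The record layout, the function names, and the choice of SHA-256 for the stored digest are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional, Tuple


def issue_token(purpose: str, ttl_seconds: int, now: Optional[float] = None) -> Tuple[str, dict]:
    """Return (token to hand to the user, record to store server-side)."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
    record = {
        # Only the digest is stored, so a leaked table does not leak usable tokens.
        "digest": hashlib.sha256(token.encode()).hexdigest(),
        "purpose": purpose,
        "expires_at": now + ttl_seconds,
    }
    return token, record


def verify_token(token: str, record: dict, purpose: str, now: Optional[float] = None) -> bool:
    """Check digest, purpose, and expiry; the digest comparison is constant-time."""
    now = time.time() if now is None else now
    candidate = hashlib.sha256(token.encode()).hexdigest()
    return (
        hmac.compare_digest(candidate, record["digest"])
        and record["purpose"] == purpose
        and now < record["expires_at"]
    )
```

Binding the purpose into the record means a token issued for one flow is rejected by another, even before it expires.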

Word embeddings with gensim for semantic similarity tasks

Dense embeddings help when lexical overlap is weak but semantic similarity matters. I use them for retrieval prototypes, clustering, and feature enrichment when transformer infrastructure is overkill. The main discipline is keeping training data clean

SSH daemon hardening and key-based access only

SSH hardening is basic but still worth doing carefully. I disable password auth, restrict root login, and pair strong settings with operational practices like host key monitoring and per-user key lifecycle management. Security without maintainability
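
One way to keep those settings checkable is a small audit script. The sketch below is an assumption-heavy illustration: it reads `sshd_config` text, ignores `Include` files and `Match` blocks, does not handle the `keyword=value` form, and treats `PermitRootLogin no` as the required value. The helper names are hypothetical.

```python
# Expected values for this sketch; adjust to the policy actually in force.
REQUIRED = {
    "passwordauthentication": "no",
    "permitrootlogin": "no",
}


def effective_settings(config_text: str) -> dict:
    """Collect global keyword values, keeping the first occurrence of each."""
    settings = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts[0].lower(), parts[1].strip().lower()
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        # sshd uses the first value it obtains for a keyword.
        settings.setdefault(keyword, value)
    return settings


def audit(config_text: str) -> dict:
    """Return {keyword: found_value} for every setting that is not as required."""
    settings = effective_settings(config_text)
    return {
        keyword: settings.get(keyword)
        for keyword, wanted in REQUIRED.items()
        if settings.get(keyword) != wanted
    }
```

A missing keyword is reported as `None`, so relying on a compiled-in default shows up as a finding instead of passing silently.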

Experiment tracking and model registry workflows with MLflow

If experiments matter, they should be searchable after the notebook is closed. MLflow gives me parameter tracking, metric history, artifact storage, and a lightweight model registry without much ceremony. It is one of the fastest ways to make a small

Parameterized queries in Python with psycopg

Even outside ORMs, parameterized database access needs to be the default habit. The query string should describe structure while the driver binds user values separately. That sounds basic, but it is still where too many internal tools quietly fail sec
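
A minimal sketch of the idea. To stay runnable without a database server it uses the standard-library `sqlite3` module as a stand-in for psycopg; the principle is the same, but psycopg uses `%s` placeholders where sqlite3 uses `?`. The `users` table and `find_user` helper are hypothetical.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, email: str) -> list:
    # The SQL text is a fixed string; the value is passed separately and bound
    # by the driver, so it can never change the structure of the query.
    cursor = conn.execute("SELECT id, email FROM users WHERE email = ?", (email,))
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)
```

A classic injection string such as `x' OR '1'='1` is treated as an ordinary (non-matching) email value here, not as SQL.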

Fail2ban filters to slow SSH and application abuse

Fail2ban is not a complete defense, but it is a useful friction layer for noisy abuse. I use it where login failures or repeated 401s clearly indicate hostile automation. It works best when paired with centralized logs and upstream rate limiting, not
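
The counting logic behind a jail can be sketched in a few lines of Python. `maxretry` and `findtime` mirror the Fail2ban option names; the regular expression and the log line format are assumptions for illustration and are not the filter Fail2ban ships.

```python
import re
from collections import defaultdict, deque

# Assumed sshd-style failure line; real filters are more thorough.
FAILURE_RE = re.compile(
    r"Failed password for .* from (?P<host>\d{1,3}(?:\.\d{1,3}){3})"
)


def hosts_to_ban(events, maxretry=5, findtime=600):
    """events: iterable of (unix_timestamp, log_line), ordered by time.

    A host is banned once it has `maxretry` failures within `findtime` seconds.
    """
    recent = defaultdict(deque)
    banned = set()
    for timestamp, line in events:
        match = FAILURE_RE.search(line)
        if not match:
            continue
        window = recent[match.group("host")]
        window.append(timestamp)
        # Drop failures that have aged out of the sliding window.
        while timestamp - window[0] > findtime:
            window.popleft()
        if len(window) >= maxretry:
            banned.add(match.group("host"))
    return banned
```

Slow, widely spaced failures never fill the window, which is one reason this layer needs upstream rate limiting alongside it.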

Cleaning missing values and normalizing messy CSV exports

Real data arrives dirty. I usually start with missing-value audits, duplicate removal, explicit type conversion, and canonical text cleanup. The trick is to make each cleanup rule reproducible rather than burying it in notebook state. I prefer small,
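
A minimal sketch of reproducible cleanup rules as small functions, using only the standard library. The column names (`name`, `amount`), the list of missing-value markers, the title-casing rule, and the comma thousands separator are all assumptions about a hypothetical export.

```python
import csv
import io

# Markers treated as missing, compared case-insensitively.
MISSING = {"", "na", "n/a", "null", "none", "-"}


def clean_text(value):
    """Collapse whitespace and map missing-value markers to None."""
    value = " ".join((value or "").split())
    return None if value.lower() in MISSING else value


def to_float(value):
    """Explicit conversion; strips thousands separators such as 1,200.50."""
    return None if value is None else float(value.replace(",", ""))


def clean_rows(csv_text):
    """Apply the rules in a fixed order and drop exact duplicates."""
    rows, seen = [], set()
    for raw in csv.DictReader(io.StringIO(csv_text)):
        name = clean_text(raw.get("name"))
        row = {
            "name": name.title() if name is not None else None,
            "amount": to_float(clean_text(raw.get("amount"))),
        }
        key = (row["name"], row["amount"])
        if key not in seen:
            seen.add(key)
            rows.append(row)
    return rows


def missing_counts(rows):
    """Audit: how many cleaned rows lack each field."""
    return {field: sum(r[field] is None for r in rows) for field in ("name", "amount")}
```

Because every rule is a named function, the same cleanup can be rerun on the next export and reviewed like any other code.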

Signed and encrypted Rails cookies for tamper-resistant state

Client-side cookies should be treated as attacker-controlled even when the framework signs them. I use encrypted cookies for sensitive state, keep payloads minimal, and avoid long-lived authorization decisions inside the browser. The convenience of co
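
To show what signing does and does not provide, here is a conceptual Python sketch of an HMAC-signed cookie value. It is not how Rails implements `cookies.signed` or `cookies.encrypted`; the format and function names are hypothetical. Note that the payload is only encoded, not encrypted, so anyone holding the cookie can read it.

```python
import base64
import hashlib
import hmac
import json


def sign(payload: dict, key: bytes) -> str:
    """Encode the payload and append an HMAC-SHA256 tag over the encoded body."""
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}--{tag}"


def verify(cookie: str, key: bytes):
    """Return the payload if the tag matches, otherwise None."""
    body, separator, tag = cookie.rpartition("--")
    if not separator:
        return None
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking how much of the tag matched.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(body))
```

A valid tag only proves the server produced the value at some point; it says nothing about whether the state inside is still current, which is why the payload should stay minimal.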

Time series resampling and rolling windows in pandas

For operational metrics and forecasting features, I standardize timestamps first and then resample into stable windows. Rolling statistics like 7D means, lagged deltas, and volatility bands are easy wins for exploratory analysis. I avoid mixing timezo
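
A minimal pandas sketch of that sequence: normalize timestamps to UTC, resample to daily means, then derive rolling features. The column names (`timestamp`, `value`), the daily bucket, and the two-standard-deviation band are assumptions for illustration.

```python
import pandas as pd


def build_features(raw: pd.DataFrame) -> pd.DataFrame:
    # Standardize first: parse mixed UTC offsets into a single UTC index.
    index = pd.DatetimeIndex(pd.to_datetime(raw["timestamp"], utc=True))
    series = pd.Series(raw["value"].to_numpy(), index=index).sort_index()

    # Resample into stable daily windows.
    daily = series.resample("1D").mean()

    features = pd.DataFrame({"value": daily})
    features["mean_7d"] = daily.rolling("7D").mean()  # trailing 7-day mean
    features["delta_1d"] = daily.diff(1)              # lagged day-over-day change
    std_7d = daily.rolling("7D").std()
    features["band_upper"] = features["mean_7d"] + 2 * std_7d
    features["band_lower"] = features["mean_7d"] - 2 * std_7d
    return features
```

Converting to UTC before resampling keeps bucket boundaries consistent when the source mixes offsets.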

Linux privilege escalation checks for suspicious local state

Privilege escalation detection is rarely one command. I look for unexpected SUID binaries, writable service units, dangerous sudo rules, and kernel or package drift. These checks are not glamorous, but they catch a lot of real misconfigurations that a
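
Two of those checks can be sketched in Python with the standard library. This assumes a POSIX filesystem; the finding labels, the `.service` suffix heuristic, and the `known_suid` allowlist parameter are hypothetical choices for the example.

```python
import os
import stat


def is_suid(mode: int) -> bool:
    """True for regular files with the set-user-ID bit."""
    return stat.S_ISREG(mode) and bool(mode & stat.S_ISUID)


def is_world_writable(mode: int) -> bool:
    return bool(mode & stat.S_IWOTH)


def scan(root: str, known_suid=frozenset()) -> list:
    """Walk `root` and report unexpected SUID files and world-writable unit files."""
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if is_suid(mode) and path not in known_suid:
                findings.append(("unexpected-suid", path))
            if name.endswith(".service") and stat.S_ISREG(mode) and is_world_writable(mode):
                findings.append(("writable-unit", path))
    return sorted(findings)
```

Comparing the output against a stored baseline on each run turns a one-off check into drift detection.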

Convolutional neural networks for image classification in PyTorch

For image work, I start with a compact CNN before reaching for heavy pretrained models. That baseline helps confirm whether labels, normalization, and augmentation are sane. It also makes failure cases easier to explain because the model architecture
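
A minimal sketch of such a baseline in PyTorch. The layer sizes, the class name `SmallCNN`, and the `train_step` helper are assumptions chosen for brevity, not a recommended architecture.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Two conv blocks and a linear head; small enough to reason about."""

    def __init__(self, num_classes: int = 10, in_channels: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool: any input size works
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))  # raw logits


def train_step(model, optimizer, images, labels) -> float:
    """One optimization step with cross-entropy loss; returns the loss value."""
    model.train()
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

If this baseline cannot overfit a handful of batches, the problem is more likely in labels or preprocessing than in model capacity.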