Work
Publications, writing, and code.
A running list of what I've published, written, and shipped: research papers, book chapters, technical tutorials, and open-source packages.
Research
Publications
-
2025
AI-driven drug repurposing: a graph neural network and self-supervised learning approach
Computational drug discovery using GNNs and self-supervised pretraining over biomedical knowledge graphs.
-
2025
Multi-modality medical image fusion using ML and deep learning
Methods for integrating CT, MRI, and ultrasound modalities for clinical decision support. Book chapter.
-
2024
Reinforcement learning for decision making
Applications of RL across modern Industry 5.0 systems. Book chapter for Routledge.
-
2023
Do trust-based social recommendation algorithms work as intended?
Empirical study examining whether trust signals deliver on their promise in recommender systems.
-
2020
Influence of social circles on user recommendations
Master's thesis on social-graph-aware recommendation models. San Jose State University.
-
2019
Machine translation of English videos to Indian regional languages
End-to-end pipeline for translating spoken English video into regional Indian languages.
Code
Open source
-
Python
cloudfit
Cloud-agnostic machine type advisor for batch and bioinformatics workloads. Given a workload spec (CPU, RAM, region, optimize for cost / performance / availability), returns ranked instance recommendations with transparent per-factor scoring. Multi-package OSS ecosystem: scoring engine (cloudfit-core), GCP provider (cloudfit-provider-gcp), and a stateless FastAPI service (cloudfit-api) with a multi-region bundled snapshot. Built to fill the pre-launch and batch-workload sizing gap that incumbent free tools (Compute Optimizer, GCP Recommender) don't cover.
$ pip install cloudfit-core cloudfit-provider-gcp
-
Python
clinops
Clinical ML pipeline toolkit: MIMIC-IV / FHIR loaders, temporal feature windows, and patient-aware train/test splits that don't leak across cohorts. Distilled from production work in clinical and genomic data engineering.
$ pip install clinops
-
Python
samplesheet-parser
Format-agnostic parser for Illumina SampleSheet.csv files. Auto-detects IEM v1 vs. BCLConvert v2, validates index integrity with Hamming distance checks, and converts, diffs, or merges sheets across mixed sequencing fleets.
$ pip install samplesheet-parser
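To give a feel for what a Hamming distance check on sample indexes involves, here is a minimal standalone sketch. It is not samplesheet-parser's API: the function names, the dictionary input, and the minimum distance of 3 are all assumptions chosen for illustration.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length index sequences differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_index_collisions(indexes: dict[str, str], min_distance: int = 3):
    """Return (sample_a, sample_b, distance) for every pair of indexes
    closer than min_distance. The default of 3 is an illustrative
    assumption, not a value taken from any tool."""
    collisions = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(indexes.items(), 2):
        distance = hamming_distance(seq_a, seq_b)
        if distance < min_distance:
            collisions.append((id_a, id_b, distance))
    return collisions


# Hypothetical sample IDs and index sequences.
samples = {"S1": "ATCACGTT", "S2": "ATCACGTA", "S3": "CGATGTCC"}
print(find_index_collisions(samples))  # [('S1', 'S2', 1)]
```

Indexes that sit too close together can be confused with one another after a sequencing error or two during demultiplexing, which is why pairs below the threshold are worth flagging before a run.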
Writing
Tutorials & articles
-
2025
Understanding recommender systems: the engine behind personalized experiences
A primer on collaborative filtering, content-based, and hybrid approaches to recommendation. Why personalization engines work the way they do, and where they break.
-
KDnuggets
An introduction to explainable AI and explainable boosting machines
A primer on XAI fundamentals and how EBMs combine accuracy with interpretability.
-
Towards Data Science
Build kNN from scratch in Python
A from-the-ground-up implementation of k-nearest neighbors with the math made explicit.
-
Medium
Understanding the NLP pipeline
A walk-through of the typical stages in a modern natural-language-processing system.
-
Medium
Intro to ETL pipelines
A walk-through of extract-transform-load patterns and why they matter for data platforms.
Want to chat about data or research?
Always happy to swap notes on data systems, ML in production, or open-source projects, and to hear feedback on my writing or research.