About · Chicago, IL
A data engineer who likes shipping systems that don't break.
Currently building cloud-based bioinformatics infrastructure at Egen in Chicago. Five years on data systems, plus a few more on software before that.
I'm a Software Engineer (Data) at Egen with 5+ years of experience designing ETL pipelines and cloud-based data systems. Based in the Greater Chicago area. My day-to-day is mostly Python on Google Cloud (with some AWS), and a lot of thinking about data contracts.
I have a master's in Computer Engineering from San Jose State University. I work across data engineering, ML deployment, and cloud architecture, and have collaborated with scientific and medical teams on systems that process structured and unstructured data at scale.
Outside of paid work, I maintain a handful of open-source Python packages: cloudfit, a cloud-agnostic machine-type advisor for batch workloads; clinops, a clinical ML pipeline toolkit; and samplesheet-parser, an Illumina SampleSheet parser. I also write about data engineering and ML, and publish research. My most recent papers are on graph neural networks for drug repurposing and multi-modal medical image fusion.
Experience
Where I've been
-
Apr 2024 to Present
Software Engineer, Data
Egen · Cloud-based bioinformatics applications, ETL optimization. Streamlined a lab workflow, cutting turnaround time by 25%.
-
Oct 2022 to Apr 2024
Senior Associate Engineer, Data
Egen · Orchestrated ETL pipelines across cloud data stores. Revamped partner onboarding and automated data pipelines, reducing manual workload by 20%.
-
Aug 2021 to Oct 2022
Associate Software Engineer, Data
Egen · Built ETL pipelines for structured and unstructured data, delivering to cloud data stores for downstream analytics.
-
May 2021 to Jul 2021
Data Science Intern
Predmatic AI · Built SKU baseline forecasting, integrated Explainable AI (EBM) into the forecasting workflow, and cut forecast error by 10%.
-
Mar 2020 to Apr 2021
Graduate Research Assistant
San Jose State University · Research on the influence of online social circles on recommendation algorithms. Web-scraping pipeline for social media data enrichment.
-
Aug 2017 to Jun 2018
University Innovation Fellow
Stanford University d.school program (KL University cohort) · One of a small group nominated by the Center for Innovation, Incubation & Entrepreneurship. Led peer-mentoring and design-thinking initiatives.
-
Aug 2018 to Dec 2020
MS, Computer Engineering
San Jose State University
-
Jul 2014 to May 2018
BTech, Computer Science & Engineering
Koneru Lakshmaiah University
Skills
What I work with
- Python
- TypeScript
- Node.js
- SQL
- GCP
- BigQuery
- Cloud Storage
- Pub/Sub
- Google Batch
- AWS
- ETL
- Airflow
- Spark
- Docker
- Terraform
- Git
- Linux
- PostgreSQL
- REST APIs
- FastAPI
- NestJS
- Pydantic
- Scikit-learn
- PyTorch
Interests
What I like working on
- Data Engineering
- Open Source
- Cloud Cost Optimization
- Workflow Orchestration
- Data Analytics
- Recommendation Systems
- Machine Learning
- Natural Language Processing
- Model Deployment
- Cloud Architecture
- Bioinformatics
Certifications
What I'm certified in
- GCP Professional Data Engineer
- Generative AI Leader (Google Cloud)
- Machine Learning Engineer Nanodegree (Udacity)
- Data Streaming Nanodegree (Udacity)
- Consumer Neuroscience & Neuromarketing
Honors
Selected recognition
- Stanford University Innovation Fellow · Nominated by the Center for Innovation, Incubation & Entrepreneurship, 2017
- UC Berkeley Open Innovation Hackathon, India · Winning team
- Best Technocrat · KL University, Class of 2018
- Davidson Student Scholar
- Project Expo Winner
Want to connect?
Open to conversations about data engineering, ML, and cloud architecture, and to feedback on my writing or research.