The following are some of the things I have worked on that I can publicly share:



Trained and open-sourced code completion LLMs end-to-end at Replit.

Before that, I interned on the data science team at a PE firm in Toronto and on an ML team at Carfax.





  • Hotspot and Binding Residue Prediction Using Embeddings
    • Did undergraduate research in 2021-2022 on large transformers for protein sequences, focusing on hotspot prediction. Showed results supporting the feasibility of using embeddings from large models (ESM-1b, ProtT5) to predict hotspot, binding and non-binding residues.



These are some interesting things that I worked on:

  • Orientalism in the Music of French Opera, A Technical Research Paper at the Intersection of Music and Continental Philosophy

    • Wrote my IB Extended Essay (capstone high school paper) on how music depicts “intangible” social, cultural and philosophical ideas, through a technical analysis of 19th-century French opera.
  • Covid Army

    • Led the backend team for a web app used to search for medicines, ventilators and other supplies during Covid-related emergencies in India. The app reached 240K+ users.
  • Data Modelling for a Theory of Variable Physical Constants

    • Was asked by the author to set up basic data modelling code to verify a theory of varying physical constants in the paper “Cosmology with relativistically varying physical constants” by Rajendra P. Gupta, published in the Monthly Notices of the Royal Astronomical Society, staa2472, 25 August 2020.
    • This early work later led to results supporting a theory claiming that the universe is twice as old as currently estimated.
  • Implemented and Trained StylePoseGAN from scratch

    • Implemented the StylePoseGAN paper on top of lucidrains’ stylegan2-pytorch and trained the GAN-based network from scratch. This was in 2021, before Stable Diffusion and similar models were released; we were trying to build a product that let users pay for new, AI-generated, realistic photos of themselves for social media, Tinder, etc.
    • Built web scrapers to collect 800K+ training images from ecommerce sites, paired by identity, and built the custom data processing pipeline to assemble curated training sets and custom eval sets.