Work
The following are some of the things I have worked on that I can publicly share:
Open Source
Trained and open-sourced code completion LLMs at Replit.
Replit V1.5
Replit V1
- Hugging Face Open Source Repo
- Technical Blog
- Designed the training data infrastructure and pipelines, built feature engineering pipelines to improve data quality, trained the tokenizer and vocabulary, and ran the model training and evaluation.
Replit AI Chat
- Designed the new standalone backend service for Replit’s AI Chat.
- Designed, prototyped and implemented the first set of agent features.
Presentations/Talks
2023
- A Hacker’s Guide to Building with LLMs: Hacking, training, customizing, and building with LLMs without requiring ridiculous resources
- Top-voted Tech Talk at Hack the North 2023
2022
- Hotspot and Binding Residue Prediction Using Embeddings
- Did undergraduate research in 2021-2022 on large transformers for protein sequences, focusing on applications to hotspot prediction. Showed results supporting the feasibility of using embeddings from large models (ESM-1b, ProtT5) for hotspot, binding and non-binding residue prediction.
Miscellaneous
These are some interesting things that I worked on:
Orientalism in the Music of French Opera, A Technical Research Paper at the Intersection of Music and Continental Philosophy
- Wrote my IB Extended Essay (capstone high school paper) on how music depicts social, cultural and philosophical “intangible” ideas, through a technical analysis of 19th-century French opera.
- Led the backend team for a web app to search for medicines, ventilators and other supplies for Covid-related emergencies in India. 240K+ users used the app.
Data Modelling for a Theory of Variable Physical Constants
- Was asked by the author to set up basic data modelling code to verify a theory of varying physical constants in the paper: “Cosmology with relativistically varying physical constants” by Rajendra P. Gupta, published in the Monthly Notices of the Royal Astronomical Society, staa2472, https://doi.org/10.1093/mnras/staa2472, 25 August 2020.
- This early work later led to results supporting a theory claiming the universe is twice as old as currently known: https://phys.org/news/2023-08-universe-theory-believed.html.
Implemented and Trained StylePoseGAN from scratch
- Implemented the StylePoseGAN paper on top of lucidrains’ stylegan2-pytorch and trained the GAN-based network from scratch. This was in 2021, before Stable Diffusion and similar models were released; we were trying to build a product that lets users pay for new, AI-generated, realistic photos for their social media, Tinder, etc.
- Built web scrapers to collect 800K+ training images from ecommerce sites, paired on identity, and built the custom data processing pipeline to assemble curated training sets and custom eval sets.