The following are some of the things I have worked on that I can publicly share:
Trained and open-sourced code completion LLMs at Replit.
- Hugging Face Open Source Repo
- Technical Blog
- Designed the training data infrastructure and pipelines, built feature engineering pipelines to improve data quality, trained the tokenizer and vocabulary, and ran the model training and evaluation.
- Results Preview
Replit AI Chat
- Designed the new standalone backend service for Replit’s AI Chat.
- Designed, prototyped, and implemented the first set of agent features.
A Hacker’s Guide to Building with LLMs: Hacking, training, customizing, and building with LLMs without requiring ridiculous resources
- Top voted Tech Talk at Hack the North 2023
Hotspot and Binding Residue Prediction Using Embeddings
- Did undergraduate research in 2021-2022 on large transformers for protein sequences, focusing on hotspot prediction. Showed results supporting the feasibility of using embeddings from large models (ESM-1b, ProtT5) for hotspot, binding, and non-binding residue prediction.
These are some interesting things that I worked on:
Orientalism in the Music of French Opera, A Technical Research Paper at the Intersection of Music and Continental Philosophy
- Wrote my IB Extended Essay (capstone high school paper) on how music depicts “intangible” social, cultural, and philosophical ideas, with a technical analysis of 19th-century French opera.
- Led the backend team for a web app to search for medicines, ventilators, and other supplies for COVID-related emergencies in India. 240K+ users used the app.
- Requested by the author to set up basic data modelling code to verify a theory of varying physical constants in the paper: “Cosmology with relativistically varying physical constants” by Rajendra P. Gupta, published in the Monthly Notices of the Royal Astronomical Society, staa2472, https://doi.org/10.1093/mnras/staa2472, 25 August 2020.
- This early work later led to results supporting a theory claiming the universe is twice as old as currently known: https://phys.org/news/2023-08-universe-theory-believed.html.
- Implemented the StylePoseGAN paper on top of lucidrains’ stylegan2-pytorch and trained the GAN-based network from scratch. In 2021, before Stable Diffusion and similar models were released, we were trying to build a product that lets users pay for new, realistic, AI-generated photos for their social media, Tinder, etc.
- Built web scrapers to collect 800K+ training images from e-commerce sites, paired on identity, and built the custom data processing pipeline to assemble curated train and custom eval sets.