I like building AI applications. I co-founded and was CEO of Copilot, where we helped sales teams spot risk in their deals by fine-tuning custom risk models on their CRM and conversational data.
- Co-built the early product
- Led the development of four core AI products, including a deal risk engine, a GPT-4-powered deal summarization platform, a memory-augmented autonomous sales agent, and a real-time coaching tool using live transcriptions
- Fine-tuned models (e.g., Mistral 7B) and built retrieval-augmented generation (RAG) pipelines used by 30+ customers, including Fortune 100 companies such as IBM
- Hired and managed a team of 10 outstanding engineers and a 3-person product/design/customer success pod
- Raised over $3M in venture capital funding and led product, engineering, and go-to-market efforts
I've worked on several open-source projects in areas such as AI agentic systems, Model Context Protocol (MCP), optical character recognition (OCR) models, on-device inference for edge and mobile deployment, model evaluation, and knowledge distillation. I'm particularly interested in designing AI systems that are modular, reliable, and production-ready. A selection of my open-source AI applications can be found here.
I completed an M.S. in Computer Science at Stanford University, with a focus on Artificial Intelligence, specializing in Natural Language Processing and Understanding. I worked closely with Chris Manning and served as a teaching assistant for two of his courses — CS224N (Natural Language Processing with Deep Learning) and CS276 (Information Retrieval and Web Search). In CS276, I also had the opportunity to collaborate with Pandu Nayak, VP of Search at Google.
Before that, I was a consultant at McKinsey & Company, where I worked on large-scale technology M&A and operational transformation initiatives for technology and telecom clients. I also hold a B.A. in Economics and an M.S. in Management Science & Engineering, both from Stanford University.
A mobile app that performs offline image-to-text translation entirely with local models. It explores CoreML-based pipelines vs. multimodal architectures, built in Swift for iOS. Key result: using a small, locally deployed multimodal model (i) improves text recognition and translation quality and (ii) enhances support for low-resource languages, outperforming the Apple Translate app.
Keywords: CoreML, Apple Vision, Apple Language, Apple Translation, OCR, Swift
A writing assistant that lets you co-author screenplays in multiple genres and formats with fine-grained control over collaboration. Built with LangGraph (soon moving to the OpenAI Agents SDK), it orchestrates a multi-agent system in which a supervisor agent coordinates specialized worker agents, and a human and an LLM "Writer Agent" alternate turns drafting, rewriting, and structuring scenes.
This approach preserves authorial voice and creative direction while drastically reducing the time to a complete screenplay. Includes tooling for test-time LLM selection, turn management protocols, and structured prompt libraries tailored for screenwriting.
Key differentiator: Configurable control over how often the LLM "Writer Agent" takes turns vs. the human writer — useful for both novice and expert screenwriters.
Keywords: LLM agents, LangGraph, LangSmith (observability), Model Context Protocol, multi-agent orchestration, OpenAI Agents SDK
(Repo currently private due to active commercial discussions.)
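As a rough illustration of the turn-management idea, here is a minimal plain-Python sketch. This is an assumption of mine, not the LangGraph implementation: the function name and the credit-based scheduling scheme are invented for illustration.

```python
# Illustrative turn-management sketch (not the actual LangGraph graph):
# `llm_ratio` controls how often the LLM "Writer Agent" takes a turn
# relative to the human writer.

def plan_turns(num_scenes: int, llm_ratio: float) -> list:
    """Assign each scene to 'llm' or 'human' based on the configured ratio."""
    turns = []
    credit = 0.0
    for _ in range(num_scenes):
        credit += llm_ratio           # accumulate the agent's share of turns
        if credit >= 1.0:             # enough credit: the agent writes this scene
            turns.append("llm")
            credit -= 1.0
        else:
            turns.append("human")
    return turns

# llm_ratio=0.5 alternates turns; 0.0 is human-only; 1.0 is agent-only.
```

A ratio-style knob like this is what makes the same system useful to both a novice (high agent share) and an expert screenwriter (low agent share).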
A lightweight head-to-head evaluation harness that compares two LLMs, off-the-shelf or fine-tuned, on a custom dataset. Designed for domain-specific testing (e.g., legal, finance, enterprise) and for test suites curated by non-technical users: a lawyer or financial analyst, for example, can define a small evaluation suite and compare model performance without writing code.
Built using vLLM, a high-throughput and memory-efficient inference engine for LLMs.
Key differentiator: Enables evaluation on custom datasets, not just public benchmarks — ideal for quick comparison and selection of models for specific domains or use cases.
Keywords: Hugging Face, vLLM, model evaluation, custom datasets, eval harness infrastructure
(Hugging Face Space - demo and proof of concept on a single A100 GPU.)
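The core loop of such a harness can be sketched in a few lines of plain Python. The names below are mine, not the project's actual API, and generation in the real tool runs through vLLM rather than bare callables.

```python
# Illustrative head-to-head harness: score two model callables on the same
# custom dataset and tally wins and ties.

def head_to_head(model_a, model_b, dataset, score):
    """dataset: list of (prompt, reference); score: (output, reference) -> float."""
    tally = {"a": 0, "b": 0, "tie": 0}
    for prompt, reference in dataset:
        score_a = score(model_a(prompt), reference)
        score_b = score(model_b(prompt), reference)
        if score_a > score_b:
            tally["a"] += 1
        elif score_b > score_a:
            tally["b"] += 1
        else:
            tally["tie"] += 1
    return tally

# A no-code user supplies only (prompt, reference) pairs; scoring here is
# exact match, but an LLM judge could slot in instead.
def exact(output, reference):
    return float(output.strip().lower() == reference.strip().lower())
```

Keeping the scorer pluggable is what lets the same harness cover both crisp tasks (exact match) and open-ended domain answers (judge models).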
A seat-by-seat forecasting platform for Canadian federal elections, built ahead of the April 2025 federal election. It combines regional and national polling with historical swing data, incumbent effects, demographic data, and multivariate modeling to simulate likely outcomes across all 343 electoral districts (hence the name). The projection engine uses Monte Carlo simulation to incorporate uncertainty and generate probabilistic seat counts and riding-level forecasts.
Users can interactively simulate elections by adjusting party vote shares and observing how seat-by-seat outcomes shift in response.
Key features: a polling aggregation pipeline, plus multivariate distributions and Monte Carlo simulation to model multi-party dynamics. Read the full methodology here.
Keywords: React, Node.js, Python backend for Chat (see below, link), Multivariate distributions, Monte Carlo engine, Polling aggregation pipeline
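The Monte Carlo idea can be sketched in a few lines. This is a deliberately simplified stand-in for the production engine (independent Gaussian noise per party, no correlated swings), with invented names throughout.

```python
import random

# Simplified Monte Carlo seat simulation: sample party vote shares per
# riding with noise, award each riding to the plurality winner, and
# aggregate into probabilistic expected seat counts.

def simulate_seats(riding_means, n_sims=10000, noise=0.03, seed=0):
    """riding_means: {riding: {party: expected vote share}} -> expected seats per party."""
    rng = random.Random(seed)
    seat_counts = {}
    for _ in range(n_sims):
        for riding, means in riding_means.items():
            draws = {p: rng.gauss(m, noise) for p, m in means.items()}
            winner = max(draws, key=draws.get)
            seat_counts[winner] = seat_counts.get(winner, 0) + 1
    return {p: count / n_sims for p, count in seat_counts.items()}
```

Repeating this over all 343 ridings yields a distribution of total seat counts rather than a single point estimate, which is what makes the forecast probabilistic.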
A Model Context Protocol server that brings Canadian elections data into LLM tools, tested specifically with Claude Desktop. It is paired with a Model Context Protocol client with a multi-agent architecture, used at threefortythree.ca/chat. Key feature: fine-grained querying of all electoral and polling data.
Keywords: MCP, Model Context Protocol, LLM tools, Canadian elections, Claude Desktop, multi-agent architecture
Server:
PyPI
Client:
In Progress
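To illustrate what "fine-grained querying" means here, below is a hedged sketch of the kind of filter function such a server exposes. The field names and the two toy rows are illustrative, not the real dataset; in the actual server a function like this would be registered as an MCP tool so a client such as Claude Desktop can call it.

```python
# Toy rows standing in for the electoral dataset (illustrative only).
RIDINGS = [
    {"riding": "Ottawa Centre", "province": "ON", "winner_2021": "LPC"},
    {"riding": "Calgary Centre", "province": "AB", "winner_2021": "CPC"},
]

def query_ridings(province=None, winner_2021=None):
    """Filter ridings by any combination of fields (None = no filter).
    In the real server, this logic sits behind an MCP tool definition."""
    rows = RIDINGS
    if province is not None:
        rows = [r for r in rows if r["province"] == province]
    if winner_2021 is not None:
        rows = [r for r in rows if r["winner_2021"] == winner_2021]
    return rows
```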
Inference Time & Cost Estimator
A calculator-style tool that estimates the latency and cost of running various ML models on different hardware. Input model size, architecture type, batch size, and hardware profile to get real-time insights on cost-performance tradeoffs.
Keywords: latency, inference cost, model architecture, hardware benchmarking, deployment planning
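A back-of-envelope version of the estimator's arithmetic, under the common assumption that autoregressive decode is memory-bandwidth-bound (one full read of the weights per generated token). This is my simplification, not the tool's exact cost model.

```python
# Rough decode-latency and cost estimate for an autoregressive LLM,
# assuming decode is bound by weight memory bandwidth.

def estimate_decode(params_b, bytes_per_param, mem_bw_gbs, tokens, price_per_hr):
    """params_b: model size in billions of parameters;
    mem_bw_gbs: hardware memory bandwidth in GB/s;
    returns (seconds per token, total seconds, dollar cost)."""
    weight_gb = params_b * bytes_per_param      # model weights in GB
    sec_per_token = weight_gb / mem_bw_gbs      # one full weight read per token
    total_sec = sec_per_token * tokens
    cost = price_per_hr * total_sec / 3600.0
    return sec_per_token, total_sec, cost

# e.g. a 7B fp16 model (~14 GB) on ~2 TB/s HBM gives ~7 ms/token:
# estimate_decode(7, 2, 2000, tokens=500, price_per_hr=2.0)
```

A real estimator would also account for prefill (compute-bound), batching, KV-cache reads, and quantization, which is where the cost-performance tradeoffs get interesting.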
DistillKit: Knowledge Distillation Library
A plug-and-play Python library to distill knowledge from larger teacher models into smaller, deployable student models.
Keywords: knowledge distillation, teacher and student models
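The standard distillation objective the library builds on (Hinton-style soft targets) can be written out in pure Python for a single example: KL divergence between temperature-softened teacher and student distributions, scaled by T². The library itself operates on framework tensors; this scalar version is just for illustration.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher_T || student_T) * T^2 for one example."""
    p = softmax(teacher_logits, T)   # soft teacher targets
    q = softmax(student_logits, T)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * T * T
```

In training, this term is typically mixed with the ordinary cross-entropy on hard labels; raising T softens the teacher's distribution so the student learns from the relative probabilities of wrong classes, not just the argmax.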
🧠 Interests: My other interests include Linguistics, Psephology (Election Analysis), Cartography, Skiing, Swimming, Tennis, and Road Biking. I've done several multi-day road bike tours (SF to LA, Oregon Coast, Vancouver Island).
I created ThreeFortyThree Canada, a seat-by-seat election forecasting platform for Canadian federal elections. It uses multivariate modeling and Monte Carlo simulation to translate national and regional polling into seat-level projections across all 343 electoral districts in Canada. I also created Guess the Constituency, a game for the 2024 Indian general election.
🎙️ Podcast: I hosted the AIs Wide Open Podcast, where I spoke with CEOs and executives about how they’re deploying AI in their businesses, as well as with academics on emerging trends in AI. Past guests include the CEOs of G2, PandaDoc, People.ai, and Degreed — all companies with $1B+ valuations. As of May 2025, the podcast has been downloaded 25,000+ times.
🌍 I'm a Third Culture Individual — I've lived in 4 countries (US, Canada, UAE, and India), and speak 4 languages (English, Hindi, Tamil, and Arabic).