In Progress

Historical Document Transcription

Comparing LLMs (Qwen2.5-VL) vs CNNs for transcribing historical documents. Python layout engine for multi-column reconstruction.

LLMs, PyTorch, Computer Vision, Layout Reconstruction
In Progress

Shellock CLI

Node.js CLI that sets up dev environments. Handles multi-module projects with plan files and rollback.

Node.js, CLI, Developer Tooling, Automation

Spotify Audio Feature Viz

Altair dashboard with 114,000+ Spotify tracks. "Mood Maps" (Valence vs Energy), brush-linked selection, multi-genre filtering.

Python, Altair, Data Visualization, Pandas

Transformers from Scratch

Custom GPT-style transformer architectures. Token/position embeddings and multi-head attention, applied to arithmetic and boolean logic tasks.

PyTorch, NLP, Transformers, Deep Learning
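
For readers unfamiliar with the mechanism, here is a minimal sketch of causal multi-head self-attention. It is written in plain Python rather than PyTorch so it runs without dependencies, and it is not the project's code. The function names are hypothetical, and the learned query/key/value/output projections of a real transformer are replaced by identity projections to keep the example short.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(q, k, v, causal=True):
    """Scaled dot-product attention for one head; q, k, v are lists of vectors."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        # GPT-style causal mask: position i attends only to positions <= i.
        limit = i + 1 if causal else len(k)
        scores = [
            sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
            for j in range(limit)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights))
            for c in range(len(v[0]))
        ])
    return out


def multi_head_attention(x, n_heads, causal=True):
    """Split each vector into n_heads slices, attend per head, concatenate."""
    d_model = len(x[0])
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_head = d_model // n_heads
    heads = []
    for h in range(n_heads):
        sl = [row[h * d_head:(h + 1) * d_head] for row in x]
        # Assumption: identity projections (q = k = v = slice). A trained
        # model would apply learned weight matrices here and after the concat.
        heads.append(attention_head(sl, sl, sl, causal))
    return [[val for head in heads for val in head[i]] for i in range(len(x))]


# Three positions with 4-dimensional embeddings, split across two heads.
x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
out = multi_head_attention(x, n_heads=2)
```

With the causal mask on, the first position can only attend to itself, so its output equals its input; later positions blend in earlier ones.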

Lucene Search Engine

Java-based search engine using Apache Lucene 9.9. Custom analyzers, field-specific boosting, BM25 scoring with TREC evaluation.

Java, Lucene, Information Retrieval, TREC
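
As a rough illustration of what BM25 scoring computes, here is a small dependency-free Python sketch of the textbook Okapi BM25 formula with a smoothed, non-negative IDF. It is not the project's Java code and does not call Lucene. The function name `bm25_scores`, the toy documents, and the pre-tokenized input are assumptions for illustration; `k1 = 1.2` and `b = 0.75` are commonly used defaults, and Lucene's own implementation may differ in constant factors.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        # Length normalization: longer-than-average documents are penalized.
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in query:
            tf = term_freq[term]
            if tf == 0:
                continue
            n_t = doc_freq[term]
            # Smoothed IDF, always non-negative.
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            # Term-frequency saturation controlled by k1.
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Hypothetical toy corpus, already tokenized and lowercased.
docs = [
    ["custom", "analyzer", "lucene"],
    ["bm25", "scoring", "lucene", "lucene"],
    ["trec", "evaluation"],
]
scores = bm25_scores(["lucene", "bm25"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

The second document ranks first because it matches both query terms, and the rarer term ("bm25") carries the larger IDF weight.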

ShutterBoxd — Movie Knowledge Graph

Movie knowledge graph built on IMDb and Letterboxd data. RML mappings, SHACL validation, SPARQL queries — all in Protege and GraphDB.

Protege, RML, SHACL, Knowledge Graphs, Semantic Web

Fencer-PRO Innovation Pitch

Research pitch for fencing performance tech. Motion capture, real-time feedback, market sizing.

Research, Ideation, Product Discovery

Heart Murmur Detection

Deep learning model hitting 87% accuracy on heart murmur classification from stethoscope audio. Runs on a digital stethoscope for cheap cardiac screening.

Deep Learning, Audio Classification, Healthcare, Python
Live

Is It Legal To

Legal guidance platform I co-founded. Case tracking, AI summaries, lawyer matching — live and used daily.

AI, Deep Learning, LegalTech, LLM Optimization
Published

Hybrid Time Series Forecasting

Combining ARIMA, Echo State Networks, and LSTM for time series prediction. Tested on Brent Oil, Gold, and Sunspot datasets.

ARIMA, LSTM, Echo State Networks, Time Series

WAV2Lip-HQ Inference

Lip-sync pipeline reaching 95% accuracy at 30 FPS, with 4K support. Built for video dubbing at GoWarm.

Python, PyTorch, Deep Learning, Video Processing
Published

Cetacean Species Detection

Classifies 93 cetacean species from underwater audio using CNNs, CRNNs, and SoundNet. IEEE published + Indian patent filed.

CNN, CRNN, SoundNet, Bioacoustics
Published

CRISP

Real-time GPS tracking for public buses. Built so students could actually figure out when their bus was coming.

GPS, Real-Time Tracking, UX Design

VISU-NG Robot

Robot that talks back. TTS/STT with LLM integration so it can hold a conversation.

TTS, STT, LLMs, Human-Robot Interaction

V-SAT (1U CubeSat)

EM spectrum analysis payload for a 1U CubeSat. Redundant systems, weather-resistant build.

CubeSat, EM Analysis, CAD Design

HackAP — Dance Floor Energy

Dance floor that generates electricity. Airtight syringes + piezoelectric sensors under the tiles.

Hydraulics, Piezoelectric, Sustainable Energy