I study practical machine learning: making models behave, scale, and hold up in the wild.

ReCollect: Learned Garbage Collection for Python

A reinforcement learning approach that replaces static GC heuristics in Python runtimes with learned control policies, achieving up to 200% throughput improvement on amenable workloads.

Optimizations for Matching Algorithms in Generative Graph Models

Improving the scalability of Graph VAEs by replacing brute-force matching with binning strategies based on node degree and community structure, achieving a 20x speedup.
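
As a rough illustration of the degree-binning idea only, the sketch below restricts candidate node matches between two graphs to nodes of equal degree. The function names (`degree_bins`, `candidate_pairs`) and the adjacency-dict representation are assumptions made for this example, not the project's code, and community-structure binning is not shown.

```python
from collections import defaultdict

def degree_bins(adjacency):
    """Group nodes by degree. `adjacency` maps node -> list of neighbors."""
    bins = defaultdict(list)
    for node, neighbors in adjacency.items():
        bins[len(neighbors)].append(node)
    return bins

def candidate_pairs(adj_a, adj_b):
    """Return (node_a, node_b) pairs whose degrees match.

    Brute-force matching would consider every pair; binning only
    considers pairs that fall in the same degree bin (an assumed,
    simplified criterion for this sketch).
    """
    bins_a, bins_b = degree_bins(adj_a), degree_bins(adj_b)
    shared_degrees = sorted(set(bins_a) & set(bins_b))
    return [(u, v) for d in shared_degrees for u in bins_a[d] for v in bins_b[d]]

# Two 3-node path graphs with different node labels.
graph_a = {0: [1], 1: [0, 2], 2: [1]}
graph_b = {"x": ["y"], "y": ["x", "z"], "z": ["y"]}
pairs = candidate_pairs(graph_a, graph_b)
```

On these toy graphs the binned search considers 5 candidate pairs instead of the 9 a brute-force comparison would.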

Mitigating Reward Hacking With Vision-Language Models as Rewards

Detecting reward hacking via Jensen-Shannon divergence and applying VLM-based regularization to steer RL policies toward human-aligned objectives on MuJoCo robotic control tasks.
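
For readers unfamiliar with the metric, here is a minimal sketch of Jensen-Shannon divergence between two discrete distributions. Which distributions the project actually compares is not stated here, so the inputs below are purely illustrative, and the helper names are assumptions.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1].

    Uses the mixture m = (p + q) / 2, so m_i > 0 whenever p_i > 0
    or q_i > 0 and the KL terms are always finite.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Illustrative distributions only (not data from the project).
identical = js_divergence([0.5, 0.5], [0.5, 0.5])
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
```

Identical distributions give a divergence of 0 and fully disjoint ones give the maximum of 1 bit, which is what makes the measure convenient as a bounded detection signal.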

Differentiable Permutation Self-Consistency

Mitigating positional bias in LLM-based re-rankers via a differentiable relaxation of permutation self-consistency, enabling training-time optimization for information retrieval.

ToT-UI & OneToT: Practical Tools for Tree-of-Thought Reasoning

A web interface for visualizing and debugging Tree-of-Thought prompting workflows, alongside a single-pass variant of ToT that reduces token usage by up to 79% without significant accuracy loss.