Before this, I completed my B.S. in Computer Engineering at Boston University and my
M.S.E. in Computer and Information Science at the University of Pennsylvania.
My research explores how artificial and natural neural networks integrate multiple sensory modalities to form predictive models of the world's dynamics, supporting decision-making and adaptive behavior in both animals and robots.
Conceptually, I build biologically inspired world models, with a focus on spatial cognition. My recent work studies how the brain represents space through place and grid cells [1, 2, 3], and explores how artificial agents can form similar representations from multimodal sensory input.
My most recent work investigates how such models can support cue-triggered goal retrieval while visualizing expected experiences along planned routes [3]; this work was accepted to NeurIPS 2025.
We propose REMI, a unified, system-level theory of the hippocampus-medial entorhinal cortex (MEC) loop. The model explains how interactions between known spatial representation cell types could support cue-triggered goal retrieval, path planning, and reconstruction of sensory experiences along planned routes. Concretely, the framework unifies goal-directed memory retrieval and model-based planning in a compact single-layer RNN (<3K units) coupled with a modified vision foundation model (BtnkMAE); trained on random exploration in Habitat-Sim, it can internally plan trajectories and reconstruct expected views along imagined routes.
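As a rough illustration of the planning loop (a sketch under assumed shapes and names, not the released code): a frozen visual encoder compresses each view to a 1024-d latent, and a small recurrent world model rolls that latent forward under a candidate action sequence so a decoder can render the expected views.

```python
import torch
import torch.nn as nn

class LatentWorldModel(nn.Module):
    """Single-layer RNN that predicts the next visual latent from (latent, action).

    Dimensions are assumptions for illustration: 1024-d latents (as in BtnkMAE),
    2-d actions, and a <3K-unit hidden state.
    """
    def __init__(self, latent_dim=1024, action_dim=2, hidden_dim=2048):
        super().__init__()
        self.rnn = nn.RNNCell(latent_dim + action_dim, hidden_dim)
        self.readout = nn.Linear(hidden_dim, latent_dim)

    def rollout(self, z0, actions):
        """Imagine a latent trajectory from a start latent and an action plan."""
        h = torch.zeros(z0.shape[0], self.rnn.hidden_size)
        z, imagined = z0, []
        for a in actions.unbind(dim=1):           # actions: (batch, T, action_dim)
            h = self.rnn(torch.cat([z, a], dim=-1), h)
            z = self.readout(h)                   # predicted next latent
            imagined.append(z)
        return torch.stack(imagined, dim=1)       # (batch, T, latent_dim)

# Usage: encode the current view, imagine a route, decode expected views.
world_model = LatentWorldModel()
z0 = torch.randn(1, 1024)                         # stands in for encoder(current_view)
plan = torch.randn(1, 10, 2)                      # candidate 10-step action sequence
imagined = world_model.rollout(z0, plan)          # feed to the decoder for expected views
```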
Time Makes Space: Emergence of Place Fields in Networks Encoding Temporally Continuous Sensory Experiences. NeurIPS, 2024
Zhaoze Wang†, Ronald W. Di Tullio†, Spencer Rooke, Vijay Balasubramanian
Paper / Code / Video / Project Page
We model hippocampal CA3 as a recurrent autoencoder trained to recall temporally continuous sensory experiences during spatial traversal, a computational account of place cells ("SLAM in the brain"). Place-cell-like, spatially localized activity emerges in the hidden layer; these representations reorganize across environments and recover with relearning, forming a scaffold for continual and multimodal sensory integration. The model reproduces key CA3 properties, including remapping, orthogonal representations, and stability during continual learning. See also Trading Place for Space.
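A toy version of this training setup (dimensions, the smoothed random stream standing in for sensory traversal, and the plain MSE objective are all simplifying assumptions, not the paper's exact model):

```python
import torch
import torch.nn as nn

n_sensory, n_hidden, T = 50, 200, 500

# RNN "autoencoder": recurrent hidden layer plus a linear readout that must
# reproduce the temporally continuous sensory stream it receives.
rnn = nn.RNN(n_sensory, n_hidden, batch_first=True)
readout = nn.Linear(n_hidden, n_sensory)
opt = torch.optim.Adam(list(rnn.parameters()) + list(readout.parameters()), lr=1e-3)

# Temporally continuous input: a smoothed random walk in sensory-feature space,
# standing in for what an agent experiences while traversing an environment.
sensory = torch.cumsum(torch.randn(1, T, n_sensory) * 0.1, dim=1)

for epoch in range(200):
    hidden, _ = rnn(sensory)               # hidden: (1, T, n_hidden)
    recon = readout(hidden)                # recall of the sensory stream
    loss = ((recon - sensory) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# After training, plotting each hidden unit's activity against position along the
# traversal is how place-field-like tuning would be assessed.
```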
Trading Place for Space: Increasing Location Resolution Reduces Contextual Capacity in Hippocampal Codes. NeurIPS 🏆 Oral Presentation, 2024
Spencer Rooke, Zhaoze Wang, Ronald W. Di Tullio, Vijay Balasubramanian
Paper / Video
We show that increasing the resolution of spatial encoding reduces the number of distinct contexts that place cells can store, revealing a trade-off between positional accuracy and contextual capacity. Modeling population activity as high-dimensional manifolds, we derive theoretical bounds on this trade-off from manifold geometry and neural noise models: the number of separable contexts scales exponentially with network size but trades off against spatial resolution. See also Time Makes Space.
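A toy numerical illustration of the direction of this trade-off (my sketch, not the paper's bound): sharper place fields raise the effective dimensionality of each context's activity manifold, leaving room for fewer separable contexts in a fixed-size population.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_pos = 200, 500                      # cells, sampled positions (assumed values)
positions = np.linspace(0, 1, n_pos)

def manifold_dim(sigma):
    """Participation ratio of one context's population-activity manifold."""
    centers = rng.random(N)              # a random remapping defines a "context"
    act = np.exp(-(positions[:, None] - centers[None, :]) ** 2 / (2 * sigma ** 2))
    eig = np.linalg.eigvalsh(np.cov(act.T))
    return eig.sum() ** 2 / (eig ** 2).sum()

for sigma in [0.3, 0.1, 0.03, 0.01]:     # narrower fields = finer spatial resolution
    d = manifold_dim(sigma)
    print(f"sigma={sigma:5.2f}  manifold dim ~ {d:5.1f}  rough capacity ~ N/dim ~ {N / d:5.1f}")
```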
Digital pathology assessment of kidney glomerular filtration barrier
ultrastructure in an animal model of podocytopathy
Aksel Laudon, Zhaoze Wang, Anqi Zou, Richa Sharma, Jiayi Ji, Winston Tan, Connor Kim, Yingzhe Qian, Qin Ye, Hui Chen, Joel M. Henderson, Chao Zhang, Vijaya B. Kolachalama, Weining Lu
Biology Methods and Protocols, Vol. 10, Issue 1, 2024
Publisher's Page / PubMed
We developed a segmentation and measurement pipeline for Transmission Electron Microscopy
(TEM) images using U-Net to automate the diagnosis of proteinuria-related kidney disease.
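A minimal sketch of this kind of pipeline, assuming the segmentation-models-pytorch package and a generic untrained U-Net in place of the paper's trained model and morphometric measurements:

```python
import torch
import segmentation_models_pytorch as smp

# Generic U-Net standing in for the paper's trained segmentation model.
model = smp.Unet("resnet34", encoder_weights=None, in_channels=1, classes=1)
model.eval()

# A normalized grayscale TEM image would be loaded here; a random tensor stands in.
tem_image = torch.rand(1, 1, 512, 512)

with torch.no_grad():
    logits = model(tem_image)
    mask = (torch.sigmoid(logits) > 0.5).float()   # binary ultrastructure mask

# Illustrative downstream "measurement": area fraction of the segmented structure.
print("segmented area fraction:", mask.mean().item())
```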
A PyTorch library for building complex RNNs (CTRNN, GRU, etc.) with controllable sparsity and positivity in their connections. It uses custom initialization and backpropagation hooks to handle these constraints automatically, making it easier to build biologically realistic and efficient recurrent models.
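The core trick can be sketched in a few lines: keep a fixed binary mask and apply it, together with a sign constraint, to the weight in the forward pass so the gradients are masked automatically. The class below is illustrative, not the library's actual API:

```python
import torch
import torch.nn as nn

class ConstrainedRNNCell(nn.Module):
    """Vanilla RNN cell with fixed connection sparsity and nonnegative recurrent weights."""
    def __init__(self, n_units, sparsity=0.8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_units, n_units) * 0.1)
        # Fixed binary mask: pruned entries stay zero because the mask also
        # zeroes their gradients during backpropagation.
        self.register_buffer("mask", (torch.rand(n_units, n_units) > sparsity).float())

    def forward(self, x, h):
        w = torch.abs(self.weight) * self.mask     # positivity + sparsity
        return torch.tanh(x + h @ w.T)

cell = ConstrainedRNNCell(100)
h = torch.zeros(1, 100)
for _ in range(10):
    h = cell(torch.randn(1, 100), h)
```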
A modified Masked Autoencoder (MAE) that encodes each image into a single 1024-dimensional feature vector and reconstructs the image from that vector. Designed as a general vision foundation model for compact and interpretable representations.
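In use, such a model behaves like an autoencoder with a single-vector bottleneck. The toy convolutional version below only mirrors that interface (the real BtnkMAE is a modified MAE, and these layer choices are assumptions):

```python
import torch
import torch.nn as nn

class BottleneckAE(nn.Module):
    """Toy autoencoder with a single 1024-d bottleneck, mimicking the interface."""
    def __init__(self, latent_dim=1024):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=4), nn.ReLU(),    # 224 -> 56
            nn.Conv2d(32, 64, 4, stride=4), nn.ReLU(),   # 56 -> 14
            nn.Flatten(),
            nn.Linear(64 * 14 * 14, latent_dim),         # whole image -> one vector
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 14 * 14), nn.ReLU(),
            nn.Unflatten(1, (64, 14, 14)),
            nn.ConvTranspose2d(64, 32, 4, stride=4), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=4),      # back to 224 x 224
        )

    def forward(self, img):
        z = self.encoder(img)            # (batch, 1024) compact representation
        return self.decoder(z), z

model = BottleneckAE()
recon, z = model(torch.rand(2, 3, 224, 224))
print(z.shape, recon.shape)              # [2, 1024], [2, 3, 224, 224]
```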
An OpenAI Gym-style suite for simulating spatial navigation and neural responses (grid, place, and sensory cells). It provides an integrated interface for defining environments, sensory tunings, and movement behaviors to study spatial coding and train world models of spatial cognition.
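A self-contained toy mimicking the suite's Gym-style loop (the real package's class names, observation contents, and constructor arguments are assumptions):

```python
import numpy as np

class ToySpatialEnv:
    """2-D open field whose observations bundle pose with simulated place-cell rates."""
    def __init__(self, n_place_cells=64, field_width=0.2, seed=0):
        self.rng = np.random.default_rng(seed)
        self.centers = self.rng.random((n_place_cells, 2))   # random field centers
        self.width = field_width

    def reset(self):
        self.pos = self.rng.random(2)
        return self._obs()

    def step(self, action):
        self.pos = np.clip(self.pos + 0.05 * action, 0.0, 1.0)
        return self._obs(), 0.0, False, {}

    def _obs(self):
        d2 = ((self.centers - self.pos) ** 2).sum(axis=1)
        rates = np.exp(-d2 / (2 * self.width ** 2))          # Gaussian place fields
        return {"position": self.pos.copy(), "place_cells": rates}

env = ToySpatialEnv()
obs = env.reset()
for _ in range(100):                                         # random exploration
    action = env.rng.standard_normal(2)
    obs, reward, done, info = env.step(action)
```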