GaussianWorld

Building a system for robust 3D scene understanding and natural language interaction within complex environments.

Yue Li, Qi Ma, Runyi Yang, Mengjiao Ma, Bin Ren, Nikola Popovic, Martin R. Oswald, Danda Pani Paudel
University of Amsterdam, INSAIT, ETH Zurich

About Us

We are building a system for robust 3D scene understanding and natural language interaction within complex environments. To that end, we are developing a foundation model that tokenizes complex 3D scenes, performs tasks such as instance detection, and reasons over the scene to answer complex language queries about the state of the environment. The model also supports natural language interaction with the scene, including spatial content grounding.

Our goal is to make 3D scene understanding as accessible and powerful as its 2D counterpart.

Our Research Trajectory

Our projects build upon each other over time to expand the system's capabilities.

Demos

Key capabilities of our 3D systems.

3D Feature Extraction

Our system takes a 3D scene, represented as 3D Gaussian Splatting (3DGS) or a point cloud (PC), as input and, in a single neural network forward pass, outputs a feature vector for each 3D primitive.
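
The input/output contract described above can be sketched as follows. This is only an illustration of the shapes involved, not our model: `extract_features`, the random `weights`, the 14 attributes per primitive, and the 64-dimensional feature size are all hypothetical placeholders, and a single matrix product stands in for the network.

```python
import numpy as np


def extract_features(primitives: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the network: one pass, one feature per primitive.

    primitives: (N, C) array, one row of attributes per 3D primitive.
    weights:    (C, D) array, a placeholder for learned parameters.
    Returns an (N, D) array of unit-length feature vectors.
    """
    features = primitives @ weights  # (N, C) @ (C, D) -> (N, D)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, 1e-8)  # normalize each row


# Toy scene with assumed sizes: 1000 primitives, 14 attributes each.
rng = np.random.default_rng(0)
num_primitives, in_dim, feat_dim = 1000, 14, 64
scene = rng.normal(size=(num_primitives, in_dim))
weights = rng.normal(size=(in_dim, feat_dim))

features = extract_features(scene, weights)
print(features.shape)
```

The point of the sketch is the contract: N primitives go in and N feature vectors come out of one call, so later steps can reuse the features without running the network again.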

Language Interactions

Our system provides real-time open-vocabulary 3D content search by reusing the previously extracted 3D features.
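
One common way to run this kind of search is to rank primitives by cosine similarity between their features and an embedding of the text query. The sketch below assumes that approach; it is not necessarily how our system implements search. The `search` function is hypothetical, and the query embedding is assumed to come from a text encoder that shares the feature space, which is replaced here by a hand-written toy vector.

```python
import numpy as np


def search(features: np.ndarray, query_embedding: np.ndarray, top_k: int = 5):
    """Rank 3D primitives by cosine similarity to a query embedding.

    features:        (N, D) array, one precomputed feature per primitive.
    query_embedding: (D,) array, assumed to live in the same feature space.
    Returns the indices of the top_k primitives and their similarity scores.
    """
    f = features / np.maximum(np.linalg.norm(features, axis=1, keepdims=True), 1e-8)
    q = query_embedding / max(np.linalg.norm(query_embedding), 1e-8)
    scores = f @ q                       # cosine similarity per primitive
    top = np.argsort(-scores)[:top_k]    # highest similarity first
    return top, scores[top]


# Toy data: four primitives with 3-dimensional features.
features = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([1.0, 0.0, 0.0])  # placeholder for an encoded text query

indices, scores = search(features, query, top_k=2)
print(indices, scores)
```

Because the per-primitive features are computed once up front, each query costs a single matrix-vector product plus a sort, which is what makes interactive search plausible in a setup like this.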

The Team

Multi-institutional research group.