Tang receives NSF CAREER award to understand compositionality of the physical world

Wei Tang

Assistant Professor Wei Tang received an NSF CAREER award to develop a computer vision framework that learns and understands the physical world in a compositional manner.

Compositionality has not been well studied in previous research. Current AI models can recognize objects and scenes and can produce 3D reconstructions, but they are far less capable of understanding the compositionality of the objects and scenes in the physical world.

The world we live in is compositional. For robots or autonomous vehicles to interact with their environment and the physical world, they must be able not only to reconstruct objects but also to understand the parts that make up a whole.

For example, an office may include a chair, a desk, a bookshelf, and a lamp; a chair could consist of legs, arms, a seat, and a back. Understanding these typical compositions and relationships is imperative.

“These objects are not independent, they are governed by physical laws,” Tang said. “Generally, an object cannot float in the air–a table is on the floor, a computer monitor is on the desk, in a supporting relationship.”

Also, objects are often grouped, such as pillows on a bed, or a lamp on a nightstand.

While we tend to imbue AI with the ability to think or “be intelligent,” AI systems must be trained on all aspects of a task. We know that when we bring someone a cup of tea, the cup must remain upright and be set down on a surface, such as the top of a coffee table.

Tang said current AI models focus on understanding individual objects, but understanding the parts that make up the whole is what reveals how an object functions.

“Perhaps when you get older there will be a robot that can help you sit on a seat,” Tang said. “To achieve this goal, the robot needs to understand which part of the chair a person can sit on, and which part can support them.”

The new framework Tang is developing will enable intelligent systems to engage in richer physical interactions and accomplish more complex tasks. The method could also improve on the data inefficiency of current AI models by reducing their reliance on large-scale annotated datasets.

“This is difficult to model from the computational perspective, but decomposing these complex entities, such as shape or motion, into simpler substructures will make the computational modeling much easier,” Tang said.

Tang’s $539,051 grant, “CAREER: Compositional Learning and Understanding of the Physical World,” runs through June 2030.

Tang is actively seeking motivated PhD students to join his research group.