Abstract
SHAPE-IT marks a breakthrough as the inaugural AI-enabled shape display that renders dynamic 3D shapes leveraging Large Language Models. Unlike conventional shape displays requiring pre-programmed behaviors for shape and animation creation, SHAPE-IT integrates Generative AI (GPT4) with pin-based shape displays, facilitating on-demand and on-the-fly authoring of dynamic shapes. This innovative approach empowers users to dictate shape, motion, and interaction through real-time natural language inputs, bypassing the need for coding. The implementation showcases a software bridge between custom prompts for GPT4 and a Unity-based shape rendering software, generating scripts to control shape display based on user instructions. Applications extend to gaming, entertainment, on-demand teaching aids, adaptive furniture, and controllers, demonstrating a rich potential for this technology. A ten-participant user study sheds light on the promising concept and unveils insights for addressing future research challenges.
Contribution
- The significant contribution of this paper is the integration of Generative AI (specifically GPT4) with pin-based shape displays. This integration allows users to generate dynamic shapes on-demand.
- Users can interact with the system using real-time natural language inputs, which eliminates the need for coding.
- Implemented a software bridge between custom prompts for GPT4 and Unity-based shape rendering software. This bridge facilitates the generation of scripts that control the shape display based on user instructions.
Implementation
- Implemented a software system that processes language instructions and translates them into dynamic and interactive shape-changing behavior.
- The system is a proof-of-concept prototype that integrates: Hardware-based multi-modal interaction using inFORCE(5x10) and GUI-based interaction on Unity.
Evaluation
- A user study with 12 participants was conducted to evaluate the concept of SHAPE-IT. The study revealed the promise of the concept and also provided insights into potential research challenges.
- Performance of the system is evaluated based on text complexity as well as demonstrated the system’s versatility in creating shapes, motions, and interactions.
This work is currently under review for CHI2024