PRESENCE

A TOOLSET OF HYPER-REALISTIC AND XR BASED INTERACTIONS

PRESENCE is dedicated to revolutionizing human interactions in virtual environments by enhancing the concept of presence in eXtended Reality (XR) experiences through innovative technologies and interdisciplinary research.

The vision of the PRESENCE project is to transform the XR landscape, aiming to overcome key technological limitations. By advancing real-time volumetric reconstructions, enhancing haptic feedback technologies, and improving virtual human behavior, we’re redefining XR experiences. Our ambition is to make XR a seamless part of daily life, revolutionizing how we collaborate, train, and engage across diverse sectors. Join us on the journey to a future where immersive technologies become an integral and natural part of our reality.

Technical Pillars

Holoportation: The creation of realistic visual interactions among remote humans, delivering high-end holoportation based on live volumetric capturing, compression and optimization techniques, under heterogeneous computation and network conditions.

Haptics: The creation of realistic touch among remote users and synthetic objects, developing novel haptic systems and enabling spatial multi-device synchronization in multi-user scenarios.

Virtual Humans: The creation of realistic social interactions among avatars and agents, generating AI virtual humans, representing actual users or AI agents.

In the PRESENCE project, we focus on the technical pillar Virtual Humans.

Virtual Humans

Virtual humans are interactive digital characters that users can engage with in realistic 3D environments, including virtual, mixed, and augmented reality environments. This technical pillar focuses on the research, design, and development of advanced methods to create realistic humanoid 3D models utilizing cost-efficient technology. These models will serve as avatars for users, known as Smart Avatars (SAs), or artificial agents, known as Intelligent Virtual Agents (IVAs). The primary focus lies in enabling natural, multimodal communication and interaction through a range of expressive capabilities, including speech, facial expressions, gaze tracking, and full-body animations. By enhancing the realism and responsiveness of virtual characters, this technical pillar strives to facilitate effective human-computer interaction, ultimately leading to more engaging and intuitive user experiences. Furthermore, the objectives extend beyond basic human-human communication, aiming to integrate hybrid interaction models that involve multiple real and virtual participants operating alongside SAs and IVAs.

Speech and Facial Interaction

To enable Intelligent Virtual Humans (IVHs) to hold conversations with human users, understand human emotions, and respond appropriately to users’ requests, they need natural, human-like speech and facial interaction. This can be achieved by extending recent neural network advances to analyze, process, and synthesize speech and facial expressions, enhancing the AI capabilities of humanoid 3D models. Using existing humanoid models from our project partner DIDIMO, the project will employ speech-to-text and text-to-speech synthesis so that users can communicate naturally with both other humans and agents.
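As an illustration of how one conversational turn could be wired together, the following minimal Python sketch chains speech-to-text, a dialogue model, and text-to-speech. All names in it (ConversationTurnPipeline, fake_stt, fake_model, fake_tts) are hypothetical placeholders and not part of the PRESENCE software; the toy components only echo text so that the example runs without external services.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical component types: any real STT, language model, or TTS
# service could be wrapped to match these signatures.
SpeechToText = Callable[[bytes], str]
DialogueModel = Callable[[List[Tuple[str, str]]], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class ConversationTurnPipeline:
    """Runs one user utterance through STT -> dialogue model -> TTS."""
    transcribe: SpeechToText
    respond: DialogueModel
    synthesize: TextToSpeech
    history: List[Tuple[str, str]] = field(default_factory=list)

    def handle_utterance(self, audio: bytes) -> bytes:
        user_text = self.transcribe(audio)           # speech-to-text
        self.history.append(("user", user_text))
        reply_text = self.respond(self.history)      # model sees the full history
        self.history.append(("agent", reply_text))
        return self.synthesize(reply_text)           # text-to-speech


# Toy stand-ins (assumptions for illustration only).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_model(history: List[Tuple[str, str]]) -> str:
    last_user_text = history[-1][1]
    return f"You said: {last_user_text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = ConversationTurnPipeline(fake_stt, fake_model, fake_tts)
reply_audio = pipeline.handle_utterance(b"hello")
```

Passing the components in as plain callables keeps the turn logic independent of any particular speech or language model provider, which is one possible way to swap services without touching the conversation loop.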

Multimodal Interactions

Combining the advantages of photorealistic humanoid 3D models, motion tracking, action classification, speech and facial interaction, and physics-based full-body animations, Intelligent Virtual Humans (IVHs) have the potential to revolutionize human-computer interaction by delivering highly immersive and context-aware experiences. These systems will be capable of perceiving and responding to complex social and environmental cues, enabling natural and lifelike human-agent interactions and collaborations.

For example, consider the following scenario: A user interacts with their smart avatar in a virtual environment and notices a humanoid agent standing in the distance. The user greets the agent while raising their hand. The agent detects both the verbal greeting and the raised hand using advanced motion tracking and auditory processing. In response, the agent walks toward the user, stops at an appropriate social distance, and returns the greeting with a friendly gesture and a verbal response. This seamless exchange enables a natural, meaningful conversation to unfold, driven by the agent’s ability to interpret and adapt to the user’s actions and context in real time.
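The greeting scenario can be sketched in a few lines of Python. This is only a simplified illustration under stated assumptions, not the project's implementation: the greeting word list, the rule that a hand above head height counts as raised, the 1.2 m stopping distance, and the 2D ground-plane positions are all invented for the example.

```python
import math
from dataclasses import dataclass
from typing import Tuple

GREETING_WORDS = {"hello", "hi", "hey"}   # assumed vocabulary
SOCIAL_DISTANCE_M = 1.2                   # assumed stopping distance in metres

Point = Tuple[float, float]               # position on the ground plane


@dataclass
class Observation:
    transcript: str        # text recognized from the user's speech
    hand_height_m: float   # tracked hand height
    head_height_m: float   # tracked head height


def is_greeting(obs: Observation) -> bool:
    """Require both cues: a spoken greeting and a raised hand."""
    words = {w.strip(".,!?").lower() for w in obs.transcript.split()}
    said_greeting = bool(words & GREETING_WORDS)
    hand_raised = obs.hand_height_m > obs.head_height_m
    return said_greeting and hand_raised


def approach_target(agent: Point, user: Point,
                    keep: float = SOCIAL_DISTANCE_M) -> Point:
    """Point on the line toward the user that stops `keep` metres short."""
    dx, dy = user[0] - agent[0], user[1] - agent[1]
    dist = math.hypot(dx, dy)
    if dist <= keep:
        return agent                      # already close enough, stay put
    scale = (dist - keep) / dist
    return (agent[0] + dx * scale, agent[1] + dy * scale)


obs = Observation("Hello there!", hand_height_m=1.9, head_height_m=1.7)
if is_greeting(obs):
    # The agent would walk to this point and then return the greeting.
    target = approach_target((0.0, 0.0), (5.0, 0.0))
```

Requiring both the verbal and the gestural cue is one simple fusion rule; a real system would more likely combine confidence scores from several recognizers.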

Intelligent Virtual Human SDK

To easily create virtual humans, we are developing the Intelligent Virtual Human SDK. It allows developers to design diverse embodied anthropomorphic IVAs by customizing their behavior through expressive nonverbal cues, integrating different foundation models, speech-to-text (STT) and text-to-speech (TTS) technologies, and tailoring system prompts to guide interactions. Additionally, we incorporate features like proximity detection, trajectory-based action recognition, and vision-based multimodal prompting to facilitate natural human-IVA interactions within immersive XR environments.
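To give a feel for what trajectory-based action recognition can mean, the Python sketch below flags a waving motion by counting left-right direction reversals in a sequence of tracked hand positions. It is a deliberately simple heuristic chosen for illustration; the function names, the 2 cm jitter threshold, and the reversal count are assumptions and do not describe how the SDK recognizes actions.

```python
from typing import Sequence


def count_direction_reversals(xs: Sequence[float],
                              min_step: float = 0.02) -> int:
    """Count sign changes in lateral hand movement, ignoring small jitter.

    xs       -- lateral hand positions in metres, one per tracking frame
    min_step -- movements smaller than this (assumed 2 cm) are ignored
    """
    reversals = 0
    last_direction = 0                      # 0 means no movement seen yet
    for previous, current in zip(xs, xs[1:]):
        delta = current - previous
        if abs(delta) < min_step:
            continue                        # treat as tracking noise
        direction = 1 if delta > 0 else -1
        if last_direction != 0 and direction != last_direction:
            reversals += 1
        last_direction = direction
    return reversals


def looks_like_wave(xs: Sequence[float], min_reversals: int = 2) -> bool:
    """Heuristic: a wave is a trajectory that reverses at least twice."""
    return count_direction_reversals(xs) >= min_reversals


wave_trajectory = [0.0, 0.1, 0.2, 0.1, 0.0, 0.1, 0.2]   # right, left, right
reach_trajectory = [0.0, 0.1, 0.2, 0.3]                 # steady movement

wave_detected = looks_like_wave(wave_trajectory)
reach_detected = looks_like_wave(reach_trajectory)
```

A heuristic like this is easy to inspect and tune, whereas a learned classifier over full-body trajectories would generalize to more actions at the cost of training data.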

The SDK is available on GitHub.

Contact

Prof. Dr. Frank Steinicke

Dr. Ke Li

Dr. Fariba Mostajeran

Julia Hertel

Publications

K. Li, F. Mostajeran, S. Rings, L. Kruse, S. Schmidt, M. Arz, E. Wolf, F. Steinicke, "I Hear, See, Speak & Do: Bringing Multimodal Information Processing to Intelligent Virtual Agents for Natural Human-AI Communication", 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), Saint Malo, France, 2025, pp. 1648-1649, doi: 10.1109/VRW66409.2025.00469.

F. Mostajeran, K. Li, S. Rings, L. Kruse, E. Wolf, S. Schmidt, M. Arz, J. Llobera, P. Nagorny, C. Charbonnier, H. Fassold, X. Alvarez, A. Tavares, N. Santos, J. Orvalho, S. Fernández, F. Steinicke, "A Toolkit for Creating Intelligent Virtual Humans in Extended Reality", 2025 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW), Saint Malo, France, 2025, pp. 736-741, doi: 10.1109/VRW66409.2025.00149.

K. Li, M. Masuda, S. Schmidt, S. Mori, "Radiance Fields in XR: A Survey on How Radiance Fields are Envisioned and Addressed for XR Research", IEEE Transactions on Visualization and Computer Graphics, vol. 31, no. 11, pp. 9709-9719, Nov. 2025, doi: 10.1109/TVCG.2025.3616794.

J. Llobera, K. Li, P. Nagorny, C. Charbonnier, F. Steinicke, "A Conversational Virtual Agent with Physics-based Interactive Behaviour", 2025 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), Daejeon, Republic of Korea, 2025, pp. 973-974, doi: 10.1109/ISMAR-Adjunct68609.2025.00273.

A. Schwedler, C. H. Da Costa, L. Korkmaz, R. Karanzie, L. Kruse, K. Li, F. Mostajeran, F. Steinicke, "Nuance in Non-Verbal Communication: How Emotional Granularity Impacts Perception of Intelligent Virtual Agents in Virtual Reality", 2025 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), Daejeon, Republic of Korea, 2025, pp. 327-333, doi: 10.1109/ISMAR-Adjunct68609.2025.00070.

D. Egelhofer, J. Gao, N. Heinsohn, S. Khabari, L. Kruse, K. Li, F. Mostajeran, F. Steinicke, "Effects of Verbal Interruption in Conversations with an Intelligent Virtual Agent in Virtual Reality", Proceedings of the 2025 ACM Symposium on Spatial User Interaction (SUI '25), Association for Computing Machinery, New York, NY, USA, Article 18, pp. 1-8, doi: 10.1145/3694907.3765918.

Project Website

For more information about the PRESENCE project, please check out the project website:
https://presence-xr.eu

Funding

Presence XR is a Horizon Europe Innovation Project co-financed by the European Commission (EC) under Grant Agreement ID 101135025.