Synchronizing the Physical Devices with the Metaverse

By Mary Freund

26 Apr 2024

The January 2023 issue of IEEE JSAC is a special issue on “Beyond Transmitting Bits: Context, Semantics, and Task-Oriented Communications”. The phrase “semantic communications” has come to occupy a significant share of the discussion on future wireless systems; yet its objective and scope often remain fuzzy. The tutorial article by the Guest Editors is an excellent piece that brings clarity to the discourse on semantic communications. Of the technical articles published in this issue, the one selected to be featured in this blog is:

Z. Meng, C. She, G. Zhao and D. De Martini, "Sampling, Communication, and Prediction Co-Design for Synchronizing the Real-World Device and Digital Model in Metaverse," in IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 288-300, January 2023.

The process of fusing the physical and the digital world has been going on for decades, but it is believed that the emergence of the metaverse, digital twins, and similar concepts will bring it to a qualitatively new level. Synchronization of events in the physical and the digital world is of central importance in the metaverse, since it ensures consistency and proper causal relationships between those events. One important measure of synchronization performance is Motion-To-Photon (MTP) latency, the time between a user’s action and the corresponding effect displayed in the virtual world. The other two measures are standard measures of communication performance: data rate and packet loss rate. In practice, it is hard to meet all three performance measures simultaneously, and one needs to look into the tradeoffs.

The main premise of this paper is that, rather than looking only at communication, one has to consider three operations in an integrated way: sampling, communication, and prediction. Treating them jointly can plausibly improve the performance indicators of synchronization. A key observation is that a good prediction decreases the amount of data that needs to be communicated. The authors introduce a co-design framework spanning all three operations, including an algorithm that combines Deep Reinforcement Learning (DRL) techniques with expert knowledge on sampling, communication, and prediction. The approach is evaluated on an actual prototype of a robotic arm, which lends additional credibility to the proposed approach.
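The observation that a good prediction reduces the data to be communicated can be illustrated with a minimal sketch. It is not the paper's DRL-based co-design: it assumes a simple send-on-delta rule, a linear extrapolation predictor shared by sender and receiver, and a link with no delay or loss. The function names, the sinusoidal trajectory, and the thresholds are all hypothetical.

```python
import math

def linear_predict(prev, last, k):
    """Extrapolate from two (step, value) samples to step k."""
    (k1, x1), (k2, x2) = prev, last
    slope = (x2 - x1) / (k2 - k1)
    return x2 + slope * (k - k2)

def synchronize(trajectory, threshold):
    """Return (packets_sent, max_abs_error) of the receiver's model.

    Sender and receiver run the same predictor on the samples sent so far;
    the sender transmits only when the prediction error exceeds `threshold`.
    Assumption: the link is delay-free and loss-free.
    """
    # The first two samples are always sent so the predictor can start.
    prev, last = (0, trajectory[0]), (1, trajectory[1])
    packets, max_err = 2, 0.0
    for k in range(2, len(trajectory)):
        predicted = linear_predict(prev, last, k)
        if abs(trajectory[k] - predicted) > threshold:
            # Prediction too far off: transmit the true sample.
            prev, last = last, (k, trajectory[k])
            packets += 1
            estimate = trajectory[k]
        else:
            # Prediction is good enough: nothing is transmitted.
            estimate = predicted
        max_err = max(max_err, abs(trajectory[k] - estimate))
    return packets, max_err

# Hypothetical joint-angle trajectory: 0.5 Hz sinusoid sampled every 10 ms.
trajectory = [math.sin(2 * math.pi * 0.5 * k * 0.01) for k in range(400)]
for threshold in (0.0, 0.01, 0.05):
    packets, err = synchronize(trajectory, threshold)
    print(f"threshold={threshold}: {packets} packets, max error {err:.4f}")
```

By construction the receiver's error never exceeds the threshold, so the threshold acts as the knob that trades tracking error against the number of packets sent.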

JSAC: In the paper you make certain assumptions in the communication model about synchronization, slotted structure, or similar. Which practical scenarios would challenge this model? How would the model change if you assume bandwidth-intensive transmissions, such as video streaming along the trajectory?

Authors: In this paper, we adopted a standard orthogonal frequency division multiplexing (OFDM) communication model to carry out the cross-system design of sampling, communication, and prediction. An idealized assumption in our work is that the communication delay is bounded and known to the system.

In practical scenarios, the upper bound of the communication delay may not be available to the system. The wireless link is only one part of the whole communication system, and the latency and jitter in the backhaul and core networks depend heavily on the specific scenario. Although we showed in the paper that our system can be easily extended to some other communication models, the pre-trained deep reinforcement learning algorithm needs to be fine-tuned in each new scenario.

For bandwidth-intensive transmissions, our approach can reduce the communication load, and thus save bandwidth, if the state of the system is predictable. Let’s take virtual reality (VR) as an example. In VR video streaming, a head-mounted device (e.g., VR glasses) needs to transmit the trajectory of the human user to the access point, and the access point sends the video within the field-of-view requested by the user. Since the trajectory of the user is predictable, our framework can be used to predict the future trajectory and the corresponding field-of-view. In this way, we can reduce the required bandwidth for VR streaming.
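As a rough illustration of the VR example (a sketch under stated assumptions, not the authors' framework), the snippet below extrapolates head yaw at constant angular velocity and selects only the video tiles that overlap the predicted field-of-view. The tiling of the 360-degree panorama into 12 equal yaw tiles, the 90-degree field-of-view, the 15-degree safety margin, and all function names are hypothetical.

```python
def predict_yaw(yaw_prev, yaw_now, dt, horizon):
    """Constant-angular-velocity extrapolation of head yaw, in degrees.

    `dt` is the time between the two samples and `horizon` is how far
    ahead to predict (e.g., the expected communication delay), in seconds.
    """
    # Shortest signed angular difference, so 359 -> 1 counts as +2 degrees.
    delta = (yaw_now - yaw_prev + 180.0) % 360.0 - 180.0
    rate = delta / dt
    return (yaw_now + rate * horizon) % 360.0

def tiles_in_view(yaw, fov=90.0, margin=15.0, n_tiles=12):
    """Indices of equal-width yaw tiles overlapping the padded field-of-view."""
    tile_width = 360.0 / n_tiles
    half = fov / 2.0 + margin
    selected = []
    for i in range(n_tiles):
        center = (i + 0.5) * tile_width
        # Smallest angular distance between tile centre and predicted yaw.
        diff = abs((center - yaw + 180.0) % 360.0 - 180.0)
        # The tile overlaps the view if the centres are close enough.
        if diff <= half + tile_width / 2.0:
            selected.append(i)
    return selected

# Hypothetical samples: yaw moved from 10 to 12 degrees in 20 ms;
# predict 50 ms ahead and request only the tiles around that direction.
predicted = predict_yaw(10.0, 12.0, dt=0.02, horizon=0.05)
tiles = tiles_in_view(predicted)
print(f"predicted yaw {predicted:.1f} deg, sending {len(tiles)} of 12 tiles")
```

The margin is the design choice that absorbs prediction error: a larger margin tolerates worse predictions at the cost of sending more tiles.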

JSAC: What was criticized by the reviewers and how did you address it?

Authors: The reviewers’ comments were generally positive. The reviewers paid particular attention to how expert knowledge is used to assist the constrained deep reinforcement learning algorithm. This comment helped us to better clarify the role of expert knowledge as an aid for training. To address it, we discussed the impact of different kinds of expert knowledge on the training performance of the proposed knowledge-assisted constrained twin-delayed deep deterministic algorithm (KT-TD3). In addition, we showed that the proposed strategy with full expert knowledge has the best performance in terms of stability and can meet the average tracking error constraint.

JSAC: What are the main communication challenges, not necessarily wireless, you are seeing on the way forward towards enabling fusion of digital with the physical world, such as in the Metaverse?

Authors: There are several communication challenges in enabling the fusion of the digital and physical worlds. One of them is to define new key performance indicators (KPIs) for Metaverse applications, e.g., motion-to-photon (MTP) latency in human-avatar interaction. Existing communication KPIs are not sufficient to support emerging applications in the Metaverse.

Scalability is another issue for both wireless and wired communications. As the communication, computing, and storage resources are limited, it is difficult to support a large number of users or devices with diverse applications and human-computer interfaces in one communication network. Integrating our sampling, communication, and prediction framework with different types of human-computer interfaces or devices is challenging.

Another challenge is the design approach. Most of the existing communication design and optimization methods are independent of sensing, computing, and robotics systems. Without the domain knowledge and dynamics of these applications, we can only obtain sub-optimal solutions with high communication and computing overhead. Thus, task-oriented cross-system design would be the approach to tailor the communication system to specific use cases and applications.

Statements and opinions given in a work published by the IEEE or the IEEE Communications Society are the expressions of the author(s). Responsibility for the content of published articles rests upon the author(s), not IEEE nor the IEEE Communications Society.