One of the driving forces behind early 6G research is that these types of applications will impose extreme latency, reliability, and bandwidth requirements, beyond what even 5G’s staggering improvements over 4G can meet. Unfortunately, this has resulted in a 6G vision that is simply “the more bits per second, the more bandwidth, and the more base stations the better.” In other words, the wireless industry is envisioning a 6G that is an evolution of 5G: wireless networks that are X orders of magnitude better or faster or more reliable.
The approach is neither scalable nor sustainable, whether you look at it in terms of energy footprint or network deployment and operational costs. Despite multiple 6G initiatives around the world, the current consensus is that—as it stands today—6G will be just an incremental evolution of 5G.
That’s not to say that 6G will only include technologies already seen in 5G. Research into terahertz waves, for example, could open new bands of spectrum for use. Open RAN might make it more feasible to mix and match radio components from different vendors, allowing network operators to build highly specialized custom wireless networks. Integrated sensing and communication, as the name implies, would make it possible to recycle wireless signals by using them for both purposes (sensing and communication) at once. Reconfigurable intelligent surfaces are manipulable surfaces that can enhance the performance of transmitted signals by controlling how reflected signals impact the surface, to provide better sensing capabilities and reduce interference. Native artificial intelligence and machine learning would allow radios to adapt on the fly to changing environmental or transmission conditions. In addition, 6G will likely include new requirements for network privacy, trustworthiness, resiliency, sustainability, and more.
While these are all interesting and exciting areas of research, these communication technologies (indeed, all our communications systems today) are still fundamentally rooted in what Claude Shannon referred to in 1948 as the “technical problem,” or “level A problem,” in his seminal work, A Mathematical Theory of Communication. Level A stipulates that communication is the process of reproducing at one point, either exactly or approximately, a message selected at another point. In other words, the bits of information at point A make it, in the same order and correctly, to point B. The semantics and contextual meaning of data are ignored, and all that matters is how to reproduce more and more bits of information more accurately at point B. This focus leads to the extreme requirements we see today: more bandwidth, more base stations, and so on.
The focus on the technical problem has, against Shannon’s own word of caution, meant that two additional levels of communication have been ignored. They are the semantic problem (level B) and the effectiveness problem (level C). The semantic problem is concerned with how precisely the transmitted symbols convey the desired meaning of a string of bits. Compare that to the technical problem (level A), in which information is devoid of any context or meaning, and what matters is reproducing it as accurately as possible. The effectiveness problem is concerned with how effectively the recipient’s inferred meaning matches the intended meaning that was transmitted. Unlike level A, levels B and C change communication from a task of reconstructing bits (that is, ensuring the output equals the input) to a process of inducing “behavioral changes” among devices and networks to complete a task or goal.
Consider video conferencing, something many of us have become more familiar with during the pandemic. In a level A communication scheme, video conferencing is accomplished by sending large amounts of data between people on the call. The raw video frames and audio must also be encoded at the source and decoded at the destination, with error-correction techniques standing by to fix any mistakes in the transmission.
Including level B technologies would look something like each video call participant locally predicting and rendering any missing portions of the video data if a glitch or network hiccup happens. Currently, we allocate significant amounts of time, energy, and computational resources to ensure very high transmission reliability. With level B technologies, each participant’s machine, whether it’s a laptop, phone, or something else, would instead “fill in the blanks” by inferring what was missing based on what had arrived. On a deeper level, machines would be able to reconstruct data with the same meaning as what was sent, even if it’s not the same on a bit-by-bit or pixel-by-pixel level. The machine learning techniques to do this already exist, though they are still relatively new. Two examples are variational autoencoders and generative adversarial networks. The latter in particular has gained attention in recent years because of its ability to generate deepfake images.
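To make the “fill in the blanks” idea concrete, here is a minimal sketch in Python. It rests on loud assumptions: a “frame” is just a short list of pixel intensities, a dropped frame arrives as `None`, and plain linear interpolation between the nearest received frames stands in for a learned generative model such as a variational autoencoder or a generative adversarial network. The function name `fill_missing_frames` is hypothetical and is not taken from any real video stack.

```python
from typing import List, Optional

Frame = List[float]  # toy stand-in for a video frame (a few pixel intensities)


def fill_missing_frames(frames: List[Optional[Frame]]) -> List[Frame]:
    """Replace each missing frame (None) with an estimate inferred from context.

    Assumption: linear interpolation between the nearest received frames is a
    placeholder for a learned generative model; it only illustrates the idea.
    """
    received = [i for i, f in enumerate(frames) if f is not None]
    if not received:
        raise ValueError("no frames arrived, nothing to infer from")

    result: List[Frame] = []
    for i, frame in enumerate(frames):
        if frame is not None:
            result.append(list(frame))  # frame arrived intact, keep it
            continue
        # Nearest received neighbors on either side of the gap, if any.
        before = max((j for j in received if j < i), default=None)
        after = min((j for j in received if j > i), default=None)
        if before is None:
            result.append(list(frames[after]))   # gap at the start: copy next frame
        elif after is None:
            result.append(list(frames[before]))  # gap at the end: copy last frame
        else:
            w = (i - before) / (after - before)  # position of the gap between neighbors
            result.append(
                [(1 - w) * a + w * b for a, b in zip(frames[before], frames[after])]
            )
    return result


stream = [[0.0, 0.0], None, [2.0, 4.0]]
repaired = fill_missing_frames(stream)
```

The receiver never asks for a retransmission here. It accepts a plausible reconstruction rather than a bit-exact one, which is the trade a level B scheme makes.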
Using semantic, level B technologies—such as making it possible for a device to infer missing video data based on context clues—would also reduce the bandwidth, data rates, and energy consumption required for transmitting data, without sacrificing reliability. This is level C, which you can recall is the effectiveness problem. You can imagine transmitting information more effectively by sending a synopsis and bibliographic information instead of an entire book. In such an analogy, the transmitted information requires less bandwidth and energy consumption. And video conferencing is just one example. Because we’re talking about a different approach to communications, rather than developing new technologies, we can apply these ideas to any type of communication.
At the University of Oulu in Finland, where I am a professor and head of the Intelligent Connectivity and Networks/Systems group, we’re working on a new research vision called VisionX. Our overarching goal, which we began working toward in 2020, is two-fold. First, we want to research how better to discover higher-order concepts—or semantic representations—of data. Second, we want to be able to distill that understanding and knowledge into devices, base stations, and machines to solve various reasoning tasks including, but not limited to, communication, motion planning, and control.
If we better understand semantic representations, we could create devices and communication technologies that are able to “reason,” to an extent, about the information they are sending and receiving. Rather than blindly beaming data back and forth—and learning statistical patterns of the data with no ability to comprehend what is being sent—technologies would be able to infer missing or incorrect knowledge and act upon it. These reasoning capabilities would allow devices and networks to be more autonomous, robust, resilient, and sustainable. They would be able to continuously adapt and generalize across different tasks, environments, and types of communication.
Another important part of our work at VisionX is to develop new communication protocols from data, as opposed to relying on the handcrafted rules developed by 3GPP, the standards body behind cellular standards today. Such an approach would make it possible to tailor protocols to specific parts of communications networks, allowing them to be more effective and flexible.
In short, we’re moving beyond the types of machine learning that dominate today, ones that learn simple statistical correlations from data (what we call “system 1” machine learning, to borrow a concept from psychologist Daniel Kahneman’s influential book on human cognition, Thinking, Fast and Slow). We’re developing semantic communication methods that integrate “system 2” machine learning, which is capable of reasoning. Crucially, this would also be an entirely different foundation for building 6G than the current 5G+ approach that’s already underway.
That said, there are still plenty of grand challenges that need to be addressed to make VisionX a reality. One challenge is understanding how, and under what conditions, cooperative communication emerges among agents (a catch-all term encompassing cellular base stations, people on cellphones and laptops, drones, and more) to solve a common task. We also need to be able to measure effective communication and signaling in some way, so that we can quantify whether agents are acting upon received information rather than ignoring it in favor of their own, local information while making decisions.
On the machine learning front, we need to move away from machines that learn from correlations in the data (pattern matching, in other words) and towards the ability to understand and reason over the data and how that data is being generated. Machines also need to be able to communicate their understandings and reasonings with each other to successfully construct new communication protocols from data.
Additionally, to solve the level C (effectiveness) problem, we need to be sure that this kind of semantic communication is more sustainable and efficient than level A communications. Semantic communications spend less energy sending bits, because fewer bits need to be sent and less error correction is needed, and in that respect they may be orders of magnitude more efficient than Shannon communication. But machine learning incurs computational costs of its own, and those costs still need to be characterized.
Our preliminary findings demonstrated how semantic communication can emerge between two agents with a shared context, by reasoning with one another. We were specifically inspired by the two modes of cognition in humans that Kahneman described. Within the first mode (system 1 semantic communication), one agent extracts all the concepts from a data set to communicate them to the other agent. In the second mode (system 2 semantic communication), one agent communicates only the most minimal and efficient number of concepts to the second, which then uses its own reasoning to deduce what’s being communicated. We found that the system 2 approach resulted in more efficient and reliable communication compared to system 1. However, this research represents only a fraction of the research we still need to do.
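The contrast between the two modes can be sketched with a toy example in Python. Everything in it is an illustrative assumption rather than the VisionX method: the shared context is a small hand-written rule table (`SHARED_RULES`), the receiver’s “reasoning” is simple forward chaining (`deduce`), and the system 2 sender greedily drops any concept the receiver could re-derive. The greedy pass yields a message with no redundant concept left in it, though not necessarily the smallest possible one.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical shared context: each entry reads "these premises imply this concept".
Rules = Dict[str, FrozenSet[str]]

SHARED_RULES: Rules = {
    "animal": frozenset({"dog"}),
    "outdoors": frozenset({"park"}),
    "playing": frozenset({"dog", "ball"}),
    "fetch": frozenset({"playing", "outdoors"}),
}


def deduce(concepts: Set[str], rules: Rules) -> Set[str]:
    """Receiver side: forward-chain over the shared rules until nothing new appears."""
    known = set(concepts)
    changed = True
    while changed:
        changed = False
        for conclusion, premises in rules.items():
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def system1_message(concepts: Set[str]) -> Set[str]:
    """System 1 sender: transmit every extracted concept."""
    return set(concepts)


def system2_message(concepts: Set[str], rules: Rules) -> Set[str]:
    """System 2 sender: greedily drop concepts the receiver can deduce on its own."""
    message = set(concepts)
    for concept in sorted(concepts):  # sorted only to make the result deterministic
        trial = message - {concept}
        if concepts <= deduce(trial, rules):
            message = trial  # receiver can still recover everything without it
    return message


scene = {"dog", "ball", "park", "animal", "outdoors", "playing", "fetch"}
full = system1_message(scene)
compact = system2_message(scene, SHARED_RULES)
recovered = deduce(compact, SHARED_RULES)
```

In this toy scene the system 1 sender transmits all seven concepts, while the system 2 sender transmits only `dog`, `ball`, and `park` and leaves the rest to the receiver. The saving in transmitted concepts is paid for with the receiver’s extra computation in `deduce`, which is the cost the paragraph above says still needs to be characterized.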
If we genuinely seek to unlock a new generation of wireless that would be radically different than the ones before, we need to go back to the fundamentals instead of pursuing incremental advances. 6G may not be commercialized until 2030, but the heavy lifting is happening now. VisionX is one opportunity to go back to the roots of communication and start fresh, rather than pursue ultimately unsustainable and unscalable advances of more of the same.