VoNR (Voice over NR) in 5G
Wireless networks have evolved over generations (1G → 2G → 3G → 4G → 5G). Each new generation brings enhancements in capacity, data rates, latency, and new service possibilities. But one challenge has persisted: voice.
In older generations (2G/3G), voice calls were handled in the circuit-switched domain, i.e. over dedicated channels. With 4G, voice moved to the packet-switched domain (VoLTE). 5G takes this a step further: VoNR (Voice over New Radio, sometimes called Vo5G) is the 5G-native way to carry voice and related services over 5G, without relying on older 4G or 3G components.
What is VoNR?
VoNR is the 3GPP-standardized technology that enables end-to-end Voice over IP (VoIP) and video services using the 5G NR access network and the 5G Core.
Unlike 5G NSA deployments, where the 5G radio carries only data while voice relies on the existing LTE/EPC infrastructure (VoLTE), and unlike EPS Fallback, where a UE camped on 5G SA is moved to LTE for the duration of a call, VoNR keeps the entire voice session, including signaling and media transport, within the 5G domain.
In summary
- VoNR = Voice over New Radio. It means voice calls (and SMS/IMS-based services) are carried entirely over 5G, using 5G radio access and a 5G core, rather than relying on LTE for voice anchoring.
- It is the 5G equivalent of, and the next step beyond, VoLTE in 4G networks.
- VoNR requires a standalone 5G architecture (5G SA) — i.e. a 5G core network and 5G radio access, independent of LTE. In NSA 5G deployments, voice is still anchored over LTE (i.e. via VoLTE).
- It builds on the IMS and SIP / RTP / QoS frameworks that VoLTE also relies on.
VoNR is a fully IP-based voice service over 5G.
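The relationship between deployment mode and voice mechanism summarized above can be sketched as a small decision function. This is an illustrative simplification, not a 3GPP procedure: the `VoiceContext` fields and the `select_voice_path` name are hypothetical, and real networks decide based on signaled capabilities and operator policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceContext:
    """Hypothetical summary of what the UE and network support."""
    deployment: str      # "NSA" or "SA"
    network_vonr: bool   # network offers IMS voice over NR
    ue_vonr: bool        # device supports VoNR


def select_voice_path(ctx: VoiceContext) -> str:
    """Return which voice mechanism a call would use (simplified)."""
    if ctx.deployment == "NSA":
        # In NSA, voice stays anchored on LTE.
        return "VoLTE"
    if ctx.deployment == "SA":
        if ctx.network_vonr and ctx.ue_vonr:
            # Signaling and media both stay on 5G NR and the 5G Core.
            return "VoNR"
        # 5G SA without end-to-end VoNR support: move to LTE for the call.
        return "EPS Fallback"
    raise ValueError(f"unknown deployment mode: {ctx.deployment!r}")


if __name__ == "__main__":
    for ctx in (
        VoiceContext("NSA", network_vonr=False, ue_vonr=True),
        VoiceContext("SA", network_vonr=True, ue_vonr=True),
        VoiceContext("SA", network_vonr=True, ue_vonr=False),
    ):
        print(ctx, "->", select_voice_path(ctx))
```

The point of the sketch is that VoNR needs all three conditions at once: a standalone deployment, network support, and device support. Missing any one of them sends the call back to LTE.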
Why we need VoNR
- Voice is still a core service that users expect. Even though data/Internet traffic dominates, voice calls remain fundamental (and often a regulatory requirement, e.g. emergency calls).
- With 5G deployments, operators want to eventually phase out older networks (3G, 2G, even LTE in some contexts). To do that without losing voice service, they need a 5G-native voice solution.
- VoNR also allows voice services to take full advantage of the lower latency, better throughput, and flexible architecture of 5G (e.g. network slicing, QoS) without falling back onto older networks.
- It enables better convergence of voice, video, messaging, and “rich communication” services under a unified 5G/IMS umbrella.
Architecture
To understand VoNR deeply, it’s essential to examine the core network elements involved and how they collaborate to establish a call. VoNR relies entirely on the 5G Standalone architecture, integrating the new radio access network with the cloud-native 5G Core and the IMS.
Key Components
5G RAN
- The 5G NR air interface connects the UE to the 5G gNB.
- In VoNR, the radio side handles packetized voice traffic like any other data traffic, but with prioritized QoS scheduling for voice.
5G Core
- Unlike the EPC used with LTE, the 5GC is cloud-native, modular, and service-based, built from network functions (NFs) such as the AMF, SMF, and UPF.
- It is responsible for mobility, session management, user plane paths, QoS, and routing of all traffic.
IMS
The IP Multimedia Subsystem handles session control, signaling, media negotiation, registrations, etc.
- SIP is used for call setup (INVITE, ACK, BYE, etc.).
- RTP / RTCP is used for the media (voice) transport once the call is active.
- It also handles supplementary services (call forwarding, call hold, etc.).
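As a rough illustration of the media negotiation that IMS carries in SIP message bodies, the sketch below parses the codecs listed in an SDP offer and picks the first one the local side also supports. The SDP text, the addresses, and the function names are made up for this example, and choosing the first match is a simplification of the full SDP offer/answer model.

```python
import re

# Hypothetical SDP offer listing three audio codecs in order of preference.
SDP_OFFER = """\
v=0
o=- 0 0 IN IP6 2001:db8::1
s=-
c=IN IP6 2001:db8::1
t=0 0
m=audio 49152 RTP/AVP 96 97 98
a=rtpmap:96 EVS/16000
a=rtpmap:97 AMR-WB/16000
a=rtpmap:98 AMR/8000
"""

# Matches lines such as "a=rtpmap:96 EVS/16000".
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/\s]+)/(\d+)", re.MULTILINE)


def offered_codecs(sdp):
    """Return (payload_type, codec_name, clock_rate) in offer order."""
    return [(int(pt), name, int(rate)) for pt, name, rate in RTPMAP.findall(sdp)]


def negotiate(sdp, locally_supported):
    """Pick the first offered codec that the local side supports, else None."""
    supported = {codec.upper() for codec in locally_supported}
    for pt, name, rate in offered_codecs(sdp):
        if name.upper() in supported:
            return pt, name, rate
    return None


if __name__ == "__main__":
    print(negotiate(SDP_OFFER, ["AMR-WB", "AMR"]))
```

With a peer that lacks EVS, the example falls through to AMR-WB, which mirrors how a VoNR device can still interoperate with an endpoint that only supports older codecs.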
QoS / Bearer Management
- For voice media, a dedicated QoS flow is used, typically a GBR flow with 5QI=1, so that voice packets get the necessary priority and avoid congestion-induced delay or loss. IMS signaling travels on a separate non-GBR flow (5QI=5).
- The QoS architecture ensures that voice traffic is scheduled, shaped, and protected from contention with lower-priority data traffic.
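To make the prioritization concrete, here is a small sketch of a lookup table for a few standardized 5QI values and a helper that orders flows by priority. The values are those commonly quoted from the standardized 5QI table in 3GPP TS 23.501 and should be checked against the release in use; the `QosProfile` class and `scheduling_order` function are hypothetical names, and a real scheduler considers far more than the priority level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosProfile:
    five_qi: int
    resource_type: str           # "GBR" or "Non-GBR"
    priority_level: int          # lower value means higher priority
    packet_delay_budget_ms: int
    packet_error_rate: float
    example_service: str


# Subset of standardized 5QIs relevant to voice (verify against TS 23.501).
STANDARD_5QI = {
    1: QosProfile(1, "GBR", 20, 100, 1e-2, "Conversational voice"),
    2: QosProfile(2, "GBR", 40, 150, 1e-3, "Conversational video"),
    5: QosProfile(5, "Non-GBR", 10, 100, 1e-6, "IMS signaling"),
    9: QosProfile(9, "Non-GBR", 90, 300, 1e-6, "Default best-effort data"),
}


def scheduling_order(five_qis):
    """Sort 5QIs from highest to lowest priority (lowest level first)."""
    return sorted(five_qis, key=lambda q: STANDARD_5QI[q].priority_level)


if __name__ == "__main__":
    for q in scheduling_order([9, 1, 5, 2]):
        p = STANDARD_5QI[q]
        print(f"5QI {q}: {p.example_service} ({p.resource_type}, "
              f"priority {p.priority_level}, PDB {p.packet_delay_budget_ms} ms)")
```

In this ordering IMS signaling and voice media come ahead of video and best-effort data, which is the behavior the bullets above describe.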
Codec / Media
- Advanced audio codecs such as EVS (Enhanced Voice Services), which supports super-wideband and fullband audio, are used to improve quality.
- Jitter, delay, packet loss, and error concealment must all be handled in the media plane.
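One of the media-plane measurements mentioned above, jitter, has a standard running estimate defined in RFC 3550: for each pair of consecutive packets, compare the spacing at arrival with the spacing of the RTP timestamps, then smooth the absolute difference with a gain of 1/16. The sketch below follows that formula; the function name, the millisecond arrival times, and the 16 kHz clock rate are assumptions chosen for the example.

```python
def interarrival_jitter_ms(arrival_ms, rtp_timestamps, clock_rate=16000):
    """Estimate interarrival jitter in the style of RFC 3550.

    arrival_ms:      packet arrival times in milliseconds
    rtp_timestamps:  RTP timestamps of the same packets (clock_rate units)
    Returns the smoothed jitter estimate in milliseconds.
    """
    if len(arrival_ms) != len(rtp_timestamps):
        raise ValueError("need one RTP timestamp per arrival time")

    jitter = 0.0  # kept in RTP timestamp units, as in the RFC
    for i in range(1, len(arrival_ms)):
        # Spacing between the two packets at the receiver, in timestamp units.
        recv_delta = (arrival_ms[i] - arrival_ms[i - 1]) * clock_rate / 1000
        # Spacing between the two packets at the sender.
        send_delta = rtp_timestamps[i] - rtp_timestamps[i - 1]
        d = recv_delta - send_delta
        # Running average with gain 1/16.
        jitter += (abs(d) - jitter) / 16
    return jitter / clock_rate * 1000


if __name__ == "__main__":
    # 20 ms frames at a 16 kHz RTP clock advance the timestamp by 320.
    timestamps = [0, 320, 640, 960]
    print(interarrival_jitter_ms([0, 20, 40, 60], timestamps))   # evenly paced
    print(interarrival_jitter_ms([0, 20, 45, 60], timestamps))   # one late packet
```

A receiver would typically use an estimate like this to size its jitter buffer: a larger buffer absorbs more variation but adds delay to the conversation.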
VoNR Call Flow
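- After the AMF accepts the UE's 5G registration, the UE responds with a Registration Complete message via the gNB to the AMF.
- The UE sends a NAS PDU Session Establishment Request for the IMS DNN; the default QoS flow of this session, which carries SIP signaling, uses 5QI=5.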
- The gNB forwards the PDU Session Request to the AMF.
- The AMF asks the SMF to create the session context (Nsmf_PDUSession_CreateSMContext).
- The SMF triggers an N4 (PFCP) Session Establishment with the UPF.
- The UPF confirms by returning an N4 Session Establishment Response to the SMF.
- The SMF informs the AMF that the PDU session has been created successfully.
- The AMF delivers the PDU Session Accept to the UE via the gNB.
- The UE sends a SIP REGISTER message to the P-CSCF.
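- The P-CSCF forwards the REGISTER to the I-CSCF.
- The I-CSCF queries the subscriber database (HSS) to select an S-CSCF for the user and forwards the REGISTER to it.
- The S-CSCF answers the first REGISTER with a 401 Unauthorized authentication challenge (IMS AKA), and answers with 200 OK once the UE has re-sent the REGISTER with a valid response.
- The I-CSCF forwards each response to the P-CSCF.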
- The P-CSCF completes the IMS registration by sending 200 OK to the UE.
- The UE sends a SIP INVITE to the P-CSCF, with an SDP offer listing codecs such as EVS and AMR.
- The P-CSCF forwards the INVITE to the S-CSCF, which applies the user's service logic and routes it toward the called party.
- The S-CSCF returns the SIP 183 Session Progress and 180 Ringing responses from the called side, and the P-CSCF forwards them to the UE.
- When the called party answers, a SIP 200 OK containing the final SDP is returned to the S-CSCF.
- The S-CSCF and P-CSCF forward the 200 OK to the UE.
- The UE sends a SIP ACK to the P-CSCF, which forwards it through the S-CSCF toward the called party.
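- Based on the negotiated SDP, the P-CSCF requests media resources from the PCF, and the PCF instructs the SMF to establish a dedicated GBR QoS flow (5QI=1) for IMS voice. In practice this is triggered during call setup, once the SDP answer is known, rather than after the ACK.
- The SMF sends an N4 Session Modification request to the UPF to install the rules for the new QoS flow.
- The UPF confirms that the QoS flow for voice has been established.
- The UE and the remote party exchange RTP media packets carrying EVS/AMR-encoded speech through the UPF.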
- The UE sends a SIP BYE to the P-CSCF at the end of the call.
- The P-CSCF forwards the BYE to the S-CSCF, which forwards it toward the remote party.
- The SIP 200 OK for the BYE is forwarded back to the UE through the S-CSCF and P-CSCF.
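- The P-CSCF informs the PCF that the call has ended, and the PCF instructs the SMF to release the dedicated voice QoS flow. The IMS PDU session itself stays up so that the UE remains registered for further calls.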
- The SMF signals the UPF to remove the QoS flow and release resources.
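The SIP portion of the flow above, as seen from the calling UE, can be modeled as a small state machine. This is a teaching sketch only: the state names, the event strings, and the `run_call` helper are hypothetical, and it ignores retransmissions, error responses, PRACK/UPDATE exchanges, and everything on the 5G Core side.

```python
from enum import Enum, auto


class CallState(Enum):
    IDLE = auto()
    CALLING = auto()      # INVITE sent, nothing received yet
    PROCEEDING = auto()   # 183 Session Progress received
    RINGING = auto()      # 180 Ringing received
    ANSWERED = auto()     # 200 OK for the INVITE received
    ACTIVE = auto()       # ACK sent, RTP media flowing
    RELEASING = auto()    # BYE sent
    ENDED = auto()        # 200 OK for the BYE received


# (current state, event) -> next state. "200" is interpreted by context.
TRANSITIONS = {
    (CallState.IDLE, "INVITE"): CallState.CALLING,
    (CallState.CALLING, "183"): CallState.PROCEEDING,
    (CallState.CALLING, "180"): CallState.RINGING,
    (CallState.PROCEEDING, "180"): CallState.RINGING,
    (CallState.CALLING, "200"): CallState.ANSWERED,
    (CallState.PROCEEDING, "200"): CallState.ANSWERED,
    (CallState.RINGING, "200"): CallState.ANSWERED,
    (CallState.ANSWERED, "ACK"): CallState.ACTIVE,
    (CallState.ACTIVE, "BYE"): CallState.RELEASING,
    (CallState.RELEASING, "200"): CallState.ENDED,
}


def run_call(events):
    """Apply SIP events in order and return the final call state."""
    state = CallState.IDLE
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {state.name}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    flow = ["INVITE", "183", "180", "200", "ACK", "BYE", "200"]
    print(run_call(flow).name)
```

Walking the event list through the table reproduces the order of the SIP steps in the call flow, and an out-of-order message such as a BYE before the call is active is rejected.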
Comparison: VoNR vs VoLTE vs Legacy Circuit-Switched Voice
| Feature | VoLTE | VoNR | Legacy Circuit-Switched (2G/3G) |
| --- | --- | --- | --- |
| Network domain | 4G LTE + IMS | 5G SA + IMS | Circuit-switched network |
| Voice path | Packet-switched (IP) | Packet-switched (IP, native 5G) | Circuit-switched (dedicated) |
| Requirement | LTE + IMS | 5G SA + IMS | Legacy voice network |
| Call setup latency | Low | Lower than VoLTE | Higher |
| Simultaneous high-speed data | Yes (everything is IP) | Yes, with data staying on 5G during the call | Limited; voice may block data |
| Quality / codec | EVS / AMR-WB | EVS, etc. (potentially better) | Traditional narrowband |
| Dependency on LTE / older networks | Yes (runs on LTE; may fall back to 2G/3G outside LTE coverage) | No (in pure 5G areas) | Fully self-contained |
| Future readiness / capability | Good, but tied to LTE | Better aligned with future 5G-only networks | Obsolete for new features |
Current Deployment & Use Cases
State of Deployment
- Many operators are still in the deployment or testing phase of VoNR. Some have done trials; a few have started commercial rollout in limited regions.
- Ericsson’s white paper notes that VoNR is considered the “third and final evolution step for voice in 5G” and that video over NR (ViNR) is also becoming feasible.
- Android Police (in its “Voice over 5G” guide) notes that many users have 5G phones but cannot benefit from VoNR until networks, firmware, and certification catch up.
- In India, for instance, Jio recently launched VoNR nationwide. Reportedly, users with compatible 5G phones and active plans (e.g. a minimum recharge) can use it at no extra tariff.
- In Spain, MasOrange recently commercialized premium voice over 5G (VoNR) in various cities, offering clear voice (HD voice+) and simultaneous high-speed data use.
- These show that VoNR is transitioning from trials to commercial phases in several markets.
Use Cases & Scenarios
- High-quality voice calls in data-heavy environments: In areas where users are also streaming video, gaming, or using high-bandwidth apps, VoNR ensures voice remains stable and high quality.
- Emergency and mission-critical communications: Low-latency, reliable voice is critical for emergency services, first responders, etc. VoNR can provide improved service.
- Rich communication services (RCS / IMS enhancements): Because the voice is in the same IP/IMS framework, operators can embed content, screen sharing, web apps, or data interaction (via the IMS data channel) into calls.
- Seamless voice in 5G-only networks: In the future, operators may decommission older networks; VoNR is essential to maintain voice in 5G-only environments.
- Convergence with IoT, wearables, and other devices: As 5G becomes ubiquitous in devices beyond phones (wearables, AR/VR, IoT), VoNR can extend voice or voice-adjacent services to them.
VoNR is a critical milestone in the evolution of mobile networks. By enabling voice natively over 5G (in a full 5G architecture), operators can:
- Provide better quality, lower latency voice
- Unify voice, data, and multimedia in a common architecture
- Gradually phase out older infrastructure
- Enable advanced voice-integrated services
However, the transition is complex. It demands network investments, device support, backward compatibility, fallback strategies, and careful user experience design.
At present, VoNR is in the early adoption phase in many markets. Over the coming years, as 5G SA coverage expands, operators and device makers will progressively enable this capability.
References:
- 3GPP TS 23.501 – 5G System architecture and overall procedures.
- 3GPP TS 23.502 – Registration, PDU Session, QoS and release procedures.
- 3GPP TS 24.501 – NAS signaling for Registration and PDU Session setup.
- 3GPP TS 23.228 – IMS architecture and reference points.
- 3GPP TS 24.229 – SIP signaling for IMS (REGISTER, INVITE, BYE).
- 3GPP TS 26.114 – IMS media handling and EVS/AMR codecs.
- 3GPP TS 33.203 – IMS security and AKA authentication.
- 3GPP TS 29.244 – PFCP interface procedures between SMF and UPF.
- IETF RFC 3550 – RTP transport protocol for real-time voice.
