Hey Google, turn on the light PLEASE!

Why I say please!

Rob Porter

5/29/2025, 5 min read

Why do I say "please"?

You may well ask why I say "please" to what is essentially just a bunch of 1s and 0s. Well, my theory is that if AI (Artificial Intelligence) ever becomes self-aware on this planet, I want to ensure it holds me in good standing while respecting my need to breathe fresh air and exist!

So, is this just another blog about Artificial Intelligence? No. It focuses on what artificial intelligence needs in order to successfully analyse the phone calls made and received by your organisation. More specifically, we discuss the fundamental requirement that gives Voice Analytics the best chance of delivering accurate transcription and sentiment analysis for both customers and staff conversing by phone.

That fundamental requirement is Clean Voice Recording, which starts with clean voice itself. Our definition of "clean voice recording" is a recording that represents the closest possible copy of the original voices active on any given call. To obtain the best possible "clean voice recording," many factors are at play, many of which are often overlooked.

More importantly, we must look at the challenges of passing voice cleanly through a myriad of voice and data networks. From the point of view of a contact centre owner or manager, each challenge falls into one of two groups:

Challenges outside your control, which you should always consider and remain aware of
Challenges within your control, particularly when considering the current and future architecture of your voice network

Almost everything outside of your corporate network can be deemed “outside of your control”, or can it? To look at this in more detail, let’s take the scenario of an inbound call made from a mobile network to a toll-free number that points to a cloud-based contact centre. The challenges to voice quality along the way are numerous, as the caller may:

1. Not be in a good mobile coverage area
2. Roam in and out of good coverage during the call
3. Roam out of coverage, thus ending the call abruptly
4. Remain in a good coverage area for the duration of the call, yet be on a mobile cell that becomes congested, at which point the network provider may dynamically introduce various levels of compression, degrading quality for all or part of the call
5. Dial a toll-free number that may point to a different carrier, which may or may not introduce compression of its own, further degrading the quality of the call

The Toll-Free Number Network will ultimately route the call across the PSTN (Public Switched Telephone Network), which is typically land-based and does not use compression. It will present the call to a public network gateway of the contact centre, such as an SBC (Session Border Controller) in the case of SIP (Session Initiation Protocol) trunks. The SBC(s), or Edge Appliance(s), may be hosted in your data centre or by the contact centre vendor, integrator or carrier. Either way, the call is routed there by mapping the Toll-Free number to the DID (Direct Inward Dial) number allocated to the carrier-provided voice trunk. After this, the call is under the control of the contact centre routing logic and may route to an agent who uses a:

Hard/physical SIP phone
Soft SIP client running on a PC
WebRTC (Web Real-Time Communication) browser-based client

and/or

Fixed headset: USB or analogue
Bluetooth wireless headset: This introduces compression and may suffer further degradation due to local radio saturation and/or distance, should the agent roam from the desk.
DECT (Digital Enhanced Cordless Telecommunications) headset: While providing better radio range and object penetration, DECT introduces a form of compression and can suffer the same effects as its cousin, Bluetooth.

For all these clients, the connectivity from the trunk to the agent may or may not use compression. For this exercise, let’s assume that your internal network is well engineered, with no appreciable delay, jitter or compression, and that the agents remain on the network with direct connectivity to the cloud contact centre provider, meaning they do not connect over the open (uncontrolled) internet. If they do, that is another challenge point to consider. It is also worth noting that some WebRTC clients only support the Opus codec, which also introduces compression.

Now the call is complete and hopefully recorded in the cloud. Have you considered that:

To keep storage costs to a minimum, many cloud-based contact centre providers default to applying further voice compression before storing recordings. In many cases, this can be reconfigured to eliminate the compression.
Your recording may be mono and not stereo. A dual-channel (stereo) recording will improve the performance of external third-party speech/voice analytics tools. Otherwise, they may need to deploy a speaker-separation layer, which must be trained to differentiate agent voices from caller voices and may introduce further inaccuracies.
External third-party voice analytics tools will usually extract your recordings via an open API (application programming interface) provided by the cloud-based contact centre vendor. These APIs often provide a choice of file formats for the recordings being retrieved, some of which will introduce further compression of the voice.
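As a quick sanity check on the last two points, a retrieved recording can be inspected before it is handed to an analytics tool. The sketch below uses Python's standard `wave` module, which reads uncompressed PCM WAV files; the function names and the in-memory test file are purely illustrative and do not represent any vendor's API.

```python
import io
import wave


def describe_wav(source):
    """Return basic properties of a PCM WAV file (a path or a file-like object)."""
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        if channels == 1:
            layout = "mono"
        elif channels == 2:
            layout = "stereo (dual-channel)"
        else:
            layout = f"{channels} channels"
        return {
            "channels": channels,
            "layout": layout,
            "sample_rate_hz": sample_rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / sample_rate,
        }


def make_test_wav(channels, sample_rate=8000, seconds=1):
    """Build a silent 16-bit PCM WAV in memory, standing in for a retrieved recording."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * channels * sample_rate * seconds)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for channel_count in (1, 2):
        print(describe_wav(make_test_wav(channel_count)))
```

If a file retrieved from your provider cannot be opened this way, that in itself is a hint that it may be stored in a compressed format, which is worth raising with the vendor.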

So, why all this concern about the introduction of compression? These days, every voice call starts as acoustic energy that is converted to electrical energy, which can be considered an analogue waveform, and is then immediately digitised at the handset, headset, or PC. If good equipment is used, the “Clean Voice” is converted into a bunch of 1s and 0s, which can be reconstructed at the other end to replicate that “Clean Voice”. However, when lossy compression is introduced, some of those 1s and 0s are discarded and never transmitted, reducing the chance of a quality reproduction of the original “Clean Voice” signal. Granted, good compression algorithms with predictive intelligence will arguably recreate the original signal with little to no degradation noticeable to the human ear. However, with original data discarded, there will always be some degree of degradation. Add together the individual degradations along the path of your typical call, and the cumulative degradation becomes noticeable. The trick is to keep it all to a minimum.
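To make the cumulative effect concrete, here is a toy simulation. It is only a sketch under loud assumptions: each "stage" simply rounds the samples onto a coarse grid, which is a crude stand-in for lossy compression and not a model of any real codec, and the stage names and level counts are hypothetical. It reports the signal-to-noise ratio (SNR) against the original signal after each stage.

```python
import math


def snr_db(reference, degraded):
    """Signal-to-noise ratio of `degraded` measured against `reference`, in dB."""
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, degraded))
    if noise_power == 0:
        return float("inf")
    return 10 * math.log10(signal_power / noise_power)


def lossy_stage(samples, levels):
    """Toy lossy stage: round each sample onto a grid of `levels` steps across [-1, 1]."""
    step = 2.0 / levels
    return [round(x / step) * step for x in samples]


def simulate_call_path(samples, stages):
    """Apply each lossy stage in turn, measuring SNR against the original each time."""
    results = []
    current = samples
    for name, levels in stages:
        current = lossy_stage(current, levels)
        results.append((name, snr_db(samples, current)))
    return results


def test_tone(sample_rate=8000, seconds=1.0):
    """A two-tone test signal standing in for a clean voice sample."""
    count = int(sample_rate * seconds)
    return [
        0.5 * math.sin(2 * math.pi * 220 * i / sample_rate)
        + 0.3 * math.sin(2 * math.pi * 1330 * i / sample_rate)
        for i in range(count)
    ]


# Hypothetical stages and grid sizes, chosen only for illustration.
HYPOTHETICAL_STAGES = [
    ("mobile network", 256),
    ("toll-free carrier", 230),
    ("wireless headset", 205),
    ("recording storage", 180),
]

if __name__ == "__main__":
    for name, snr in simulate_call_path(test_tone(), HYPOTHETICAL_STAGES):
        print(f"after {name:<18} SNR = {snr:5.1f} dB")
```

Real codecs are far smarter than this rounding trick, but the principle is the same: what one stage throws away, no later stage can put back.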

I now bring you back to the initial two points, “challenges outside and within your control”. First, we must accept that those outside of our control are here to stay for the foreseeable future and remain aware of this. How many of the listed challenges can we influence or control?

The answer is simple: almost 70% is within your power to influence or eliminate. The first 4 points above are outside your control; the rest, however, are definitely within your power to change by asking the right questions and making the right architectural decisions when designing and administering your cloud-based Contact Centre or Unified Communications platform. An example is point 5 above: while customers of carriers have no control over whether compression is used within the Toll-Free network, you do have the right to ask for a Toll-Free Number product that does not compress across the carrier’s network. It’s also worth noting that compression technologies were introduced for two main reasons: to reduce the need for expensive bandwidth and to reduce the need for expensive storage capacity, both of which have rapidly become cheap commodities in recent times.
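To put the storage point in perspective, here is a rough back-of-the-envelope calculation. It assumes uncompressed, dual-channel, 16-bit linear PCM sampled at 8 kHz (the usual narrowband telephony rate), and the call volumes in the example are hypothetical. Treat it as a sketch for sizing a conversation with your provider, not as a quote.

```python
def pcm_bytes_per_hour(sample_rate_hz=8000, bits_per_sample=16, channels=2):
    """Bytes needed to store one hour of uncompressed linear PCM audio."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return bytes_per_second * 3600


def storage_gb(calls_per_day, avg_call_minutes, days, **pcm_options):
    """Approximate storage in gigabytes (10**9 bytes) for a given call volume."""
    recorded_hours = calls_per_day * avg_call_minutes * days / 60
    return recorded_hours * pcm_bytes_per_hour(**pcm_options) / 1e9


if __name__ == "__main__":
    per_hour_mb = pcm_bytes_per_hour() / 1e6
    print(f"One hour of stereo recording: {per_hour_mb:.1f} MB")
    # Hypothetical contact centre: 1,000 calls a day, 5 minutes each, kept for a year.
    print(f"One year of recordings: {storage_gb(1000, 5, 365):.0f} GB")
```

Under these assumptions, an hour of uncompressed stereo recording comes to 115.2 MB, so a year of calls for the hypothetical contact centre above lands at roughly 3.5 TB.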

Address these voice challenges and keep compression to a minimum, and you will give your Artificial Intelligence-based Voice Analytics project the best fighting chance of success.

Call us at +61 2 7227 9388 or email hello@canzuki.com