Ensuring the User Experience for Global Mobile Device Launches
DESCRIPTION
Service providers are more focused on user experience than ever before, actively measuring and comparing the experience of voice over LTE (VoLTE) and IR.94 services against over-the-top (OTT) and legacy services to determine whether VoLTE devices and services are ready to launch. At a recent IWPC event, Spirent joined other experts to discuss how the industry can stay on top of mobile device performance and user experience.

TRANSCRIPT
September 2014
Measure What Matters: Ensuring the User Experience for Global Device Launches
The ‘bottom-up’ approach is unsustainable
We can’t just keep adding more and more tests
The industry needs a better way
A ‘top-down’ approach focused on the user gives a sustainable path forward
Following are our best practice recommendations
Evolution
Our experience:
• Functional testing is increasingly insufficient for assuring user experience
• Combined live network and lab evaluation of user experience is a better approach
• If you don’t measure everything that matters to users, problems will emerge
Best practices:
• Focus on objective assessment of what the user experiences
• Measure all key factors impacting user experience
• Measure quality and consistency
Practice #1: Measure What Matters
VoLTE UX Metrics:
• MOS (wideband mobile-to-mobile and narrowband landline)
• Mobile Originated and Terminated Block/Drop Rates
• Conversational Audio Delay
• Video Delivery
• Audio / Video Sync
• Battery Life
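As a concrete illustration, these per-call measurements could be captured in a record like the following minimal Python sketch; the field names and units are our assumptions, not any standard schema.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class VolteCallRecord:
        """UX measurements for one test call (illustrative fields/units)."""
        mos_wideband: float           # mobile-to-mobile wideband MOS, 1.0-5.0
        mos_narrowband: float         # mobile-to-landline narrowband MOS
        blocked: bool                 # call failed to set up
        dropped: bool                 # call ended abnormally
        audio_delay_ms: float         # conversational (mouth-to-ear) delay
        av_sync_ms: Optional[float]   # audio/video skew; None for voice-only
        battery_drain_pct: float      # battery consumed during the call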
Our experience:
• Factors that directly impact user experience are highly variable
• Each user experience category has a different variability
• If you don’t take enough samples, your results won’t be repeatable
Best practices:
• Measure the variability of each test
• Estimate a confidence interval based on the appropriate distribution
• Compare performance to a reference device based on that interval (see the sketch below)
Practice #2: Ensure Repeatability
[Chart: Dropped Call Rate vs. Samples, with a 90% confidence interval. Annotations: "Too few samples: results may change on re-test!" and "Results are repeatable."]
Q: How many samples are needed to reliably detect differences in DCR of 2% or more for a DUT vs. a reference?
A: 1,500-2,000 samples ensure results are repeatable with 90% confidence.
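As a rough illustration of where a number in that range comes from, the following Python sketch applies the standard normal-approximation sample-size formula for comparing two proportions; the baseline DCR values and the 90% power figure are our assumptions, and the exact count depends on the variability observed in practice.

    import math
    from statistics import NormalDist

    def samples_to_detect_dcr_delta(p_ref: float, delta: float,
                                    confidence: float = 0.90,
                                    power: float = 0.90) -> int:
        """Samples per device needed to detect a dropped-call-rate (DCR)
        difference of `delta` between a DUT and a reference device whose
        DCR is `p_ref`, via the two-proportion z-test approximation."""
        z_conf = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
        z_power = NormalDist().inv_cdf(power)
        p_dut = p_ref + delta
        variance = p_ref * (1 - p_ref) + p_dut * (1 - p_dut)
        return math.ceil((z_conf + z_power) ** 2 * variance / delta ** 2)

    # Higher baseline DCRs push the required count up toward the
    # 1,500-2,000 quoted above (baselines assumed for illustration):
    print(samples_to_detect_dcr_delta(0.02, 0.02))  # ~1,242
    print(samples_to_detect_dcr_delta(0.04, 0.02))  # ~2,030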
Our experience:
• Measurements of UX are only valid if they reflect real user behavior
• Multi-service voice and data usage stresses the device
• Dropped calls and other factors can be significantly impacted by multi-service usage
Best practices:
• Implement multi-service use cases for pre-launch device evaluation
• Set up continuous push emails during test calls (sketched below)
Practice #3: Stress the Device Like a User
Dropped Call Rate by Device Manufacturer: Voice Only vs. Multi-Service

                 Manufacturer 1   Manufacturer 2   Manufacturer 3
Voice only            1.1%             0.9%             1.1%
Multi-service         1.3%             1.4%             1.9%
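A minimal sketch of such a multi-service use case in Python, where place_call() and send_push_email() are hypothetical stand-ins for a real device-automation harness:

    import threading
    import time

    # Hypothetical hooks: a real harness would drive the device under
    # test and a push-mail server; these stubs just mark the timeline.
    def place_call(duration_s: int) -> None:
        time.sleep(duration_s)          # stands in for an active voice call

    def send_push_email() -> None:
        print("push email delivered")   # stands in for real push traffic

    def multi_service_call(call_duration_s: int = 180,
                           email_interval_s: int = 20) -> None:
        """Hold a voice call while push email keeps the data bearer busy,
        mimicking the multi-service usage that raises dropped-call rates."""
        stop = threading.Event()

        def email_loop() -> None:
            while not stop.wait(email_interval_s):
                send_push_email()

        worker = threading.Thread(target=email_loop, daemon=True)
        worker.start()
        try:
            place_call(call_duration_s)
        finally:
            stop.set()
            worker.join()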
Our experience:
• There is a wide variance in UX across device models
• It’s a competitive marketplace: showing rank by UX category is a powerful motivator
Best practices:
• Consider the relative performance of devices and not just the absolute
• Rank and compare all pre-launch devices by UX category
• Set thresholds based on population performance, and raise them over time (see the sketch below)
Practice #4: Rank Devices
[Chart: Speech quality (MOS) by device, downlink vs. uplink, for Devices A-E; annotation: "No one wants to be in this range."]
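To make ranking and population-based thresholds concrete, a short Python sketch; the device names and MOS values below are invented for illustration:

    from statistics import quantiles

    # Invented per-device mean downlink MOS scores (illustrative only).
    mos = {"Device A": 3.9, "Device B": 3.7, "Device C": 3.4,
           "Device D": 3.2, "Device E": 2.9}

    # Rank devices within the UX category, best first.
    for rank, (device, score) in enumerate(
            sorted(mos.items(), key=lambda kv: kv[1], reverse=True), 1):
        print(f"#{rank}: {device} (MOS {score:.1f})")

    # Threshold from population performance, e.g. the 25th percentile of
    # the current portfolio; raise it over time as the population improves.
    threshold = quantiles(list(mos.values()), n=4)[0]
    failing = [d for d, s in mos.items() if s < threshold]
    print(f"Launch threshold: {threshold:.2f}; below threshold: {failing}")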
Our experience:
• You can spend a lot of time & money gathering stack-loads of data that you don’t use
• A broad statistical assessment with focused drilldown into problem areas is more cost-effective
• RTP and RF tracing can provide key insight into root causes of poor experience or help you triage
Best practices:
• Only collect RTP/IP and RF DM logs at the UE (and IMS) where problems are identified, or in focused, pre-planned instances (sketched below)
• Use the data to accelerate triage/isolation of UX issues to network/device/service
Practice #5: Assess from the Top Down
Isolating the Root Causes of Poor VoLTE Speech Quality
[Chart 1: Received Packets/s vs. Missing Packets/s over time, with handover (HO) events marked.]
[Chart 2: Downlink MOS over time. RTP metrics help identify a cell-specific issue.]
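As one illustration of this drilldown, a Python sketch that derives per-second received and missing RTP packet counts from captured (arrival time, sequence number) pairs, which can then be lined up against handover timestamps; the input format is our assumption about what an RTP trace yields.

    def rtp_loss_per_second(packets):
        """Per-second received/missing RTP packet counts for one stream.

        packets: iterable of (arrival_time_s, rtp_seq) pairs; 16-bit
        sequence wraparound is ignored for brevity. A gap between
        consecutive sequence numbers is charged to the second in which
        the next packet actually arrived.
        """
        received, missing = {}, {}
        prev_seq = None
        for t, seq in sorted(packets, key=lambda p: p[1]):
            sec = int(t)
            received[sec] = received.get(sec, 0) + 1
            if prev_seq is not None and seq > prev_seq + 1:
                missing[sec] = missing.get(sec, 0) + (seq - prev_seq - 1)
            prev_seq = seq
        return received, missing

    # Seconds with missing-packet spikes can then be compared against
    # handover (HO) timestamps to separate mobility loss from cell issues.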
Our experience:
• Devices are increasingly going global
• Fewer regional / operator variations
• But carrier-specific requirements and band combinations are still critical
Best practices:
• Link R&D in Asia-Pacific to operators and QA teams in the US
• Build the ability to consistently replicate tests around the world, close to each team
• Do so both in the lab and on the live network
Practice #6: Come Together
Partnering with service providers and device manufacturers to improve the user experience of devices and services
Our Background: Improving User Experience
Effect of Fit4Launch on Portfolio Speech Quality over First Two Years
Over its first two years, the Fit4Launch (F4L) program led to decreased variance between devices and more consistent speech quality. It also improved average speech quality across the portfolio. Fit4Launch programs for evaluating the user experience of pre-launch devices are deployed at three US operators.
Inventing systems and methodologies for measuring and analyzing the user experience of mobile devices and services
Our Background: Inventing Systems & Methods
Spirent User Experience Analytics timeline (2003-2014):
[Timeline: Speech; e911; File Transfer & Web Browsing; Call Battery Life; Voice & Video Calling; Video; Live Lab.]
Thank You