April 22, 2021
In November 2018, Uber launched Uber Pro, a rewards program aimed at recognizing drivers’ quality and commitment.
The beta launch was a complex, multi-market initiative—which required multi-market, multi-method research.
Alongside global discovery research, we conducted a series of mixed-methods evaluative initiatives to ensure Uber Pro’s product-market fit and determine the essentials of the program: a tiering system, the categories of rewards and the mechanisms to access those rewards.
The research approach and scope were as important as the research methods we selected. We had to account for a lot of complex variables: different beta-market sizes (e.g. Chicago is 10 times bigger than Orlando), different driver engagement levels (some drove a few hours a week, some drove six days out of seven), different incentive structures, different levels of competition, and different qualifying criteria. Mix all of the above together, and you could find yourself hard-pressed to isolate the causes and effects of the new program.
After discussing it with Saswati (UXR manager) and Sally (Uber Pro Lead Researcher in the US), we embraced a research program that would be executed simultaneously in seven US markets, and could later be replicated in global markets.
We started by mapping each research question to the method(s) best suited to answer it:
In each US market, we recruited 4 to 5 participants for each focus group and 7 participants for the diary study. Based on their diary study entries, we selected the most expressive drivers and asked their permission to bring a videography team to their homes to make a mini-documentary. This resulted in a large sample size for qualitative research: over 100 participants. That was mainly due to the number of segments we had to cover: a matrix of 7 markets with 4 driver segments in each.
We started with a beta launch in the US, and its results determined how and whether the program would scale globally. We combined large-scale A/B experiments with qualitative research to inform the ‘what’, the ‘why’, and the ‘how’ of questions like:
Right after the beta launch, we started a 5-week diary study in dscout to understand the impact of Uber Pro on drivers’ experience, motivation to drive, and sentiment towards Uber.
Unlike other research methods, dscout Diary brought us very specific findings about participants’ engagement with the program, their motivation to drive, and their comprehension of each official communication.
Generally speaking, longitudinal, remote qual offers a few distinctive advantages:
In short, diary studies gave us a deeper level of insight into what changed for participants, rather than leaving us to rely on metrics alone. They showed us participants’ in-the-moment, unbiased thoughts and their interactions with the product across a variety of settings and circumstances. And they meant we didn’t have to constantly meet with participants in person to understand the triggers for change.
Drivers were prompted to answer three dscout “parts”—completing a repeated series of tasks each week:
Part one focused on planning at the beginning of the week to capture their intentions and other commitments.
Example Question:
Part two captured the end of their week. The goal was to understand how their driving behavior matched (or didn’t match) the plan, and why. This gave us insight into the effects of Uber Pro’s components and into drivers’ satisfaction with the rewards they experienced.
Example Questions:
Part three asked for a deep dive into a specific reward, in a prompt we called “Your thoughts on…”
Example Questions:
A bonus video question was optional for drivers to answer. We phrased it as: “Did you just have a moment related to the rewards program that you want to share with us? Describe where you are and what happened. Be as specific as you can. The more entries, the better!”
Drivers would share their delight and appreciation right after enjoying airport priority dispatch. They’d recount their frustration with a trip cancellation at 2 a.m. Or they’d reflect on getting home after a full day of driving.
A number of factors helped this bonus question become so revealing: it was the only contextual, optional, “answer whenever you want” type of question. This gave drivers the flexibility to be themselves and to share videos with palpable emotion, in the context where the moment happened.
Some drivers in the Diary study didn't complete all 5 weeks. We anticipated this possibility and over-recruited to make sure we had drivers from all tiers and markets at the end of the study.
There was a lot of pressure to get the program right: the volume of investment, the competitive landscape, the number of teams involved, the company preparing to go public. In light of this, we overdid it. I forgot the principle of “less is more” and went above and beyond to uncover objective insights for each market. In hindsight, we could have limited the duration to three weeks, divided the study into lighter-weight studies, and capped the participant numbers to make sure we could effectively analyze and compare responses.
A few things we learned to make analysis easier for such longitudinal, remote studies:
We had a researcher focused on each market, running the study and sharing preliminary findings as soon as the evidence came in. This setup allowed us to capture video reactions, usage, perceptions, and product bugs, and to share them in a broad internal newsletter on a weekly basis.
Once the study closed, we created a market-level findings report template to standardize reporting and make comparison across markets easier. After completing the reports for each market, we gathered all of the researchers, data scientists, and designers in one room for a one-week research sprint.
Analyzing as a cross-functional team allowed us to triangulate qualitative findings with experiment results to produce very powerful insights. We first established a baseline, then compared markets and generated insights and concrete opportunities for each theme: the user journey, the rewards, the qualifying criteria, and the communication, comprehension, and impact of the program.
The collective research approach also made buy-in for the insights a lot easier. Many people contributed to creating the reports and got the opportunity to present to a very senior audience. The findings were broadly shared via internal conferences and weekly newsletters to hundreds of internal stakeholders, generating excitement and a sense of purpose among everyone involved. Through the research shareout, we felt how we were improving drivers’ lives and making them feel proud of their service.
We didn’t stop after the first presentation, but persevered, organizing dedicated sessions with different product teams, marketing, and support to drive the intended impact. The research team identified and helped fix a number of UX problems, renamed and re-ordered some of the rewards, launched special email communications and a push-notification campaign to complement in-product educational tooltips, adapted the eligibility criteria, and reprioritized the product roadmap to improve program awareness and comprehension.
These findings green-lit a rapid global expansion that brought Uber Pro to markets across 4 continents within 6 months. Today, the program is live in 58 countries and used by millions of drivers.
Study design:
Execution:
Analysis: