digital signals & access to finance in kenya - slides
DESCRIPTION
In emerging markets, eight out of ten small businesses cannot access the loans they need to grow. USAID’s Development Credit Authority (DCA) uses risk-sharing agreements to mobilize local private capital to fill this financing gap. The goal of this collaboration between UN Global Pulse and USAID is to explore how big data could support the work of USAID’s Development Credit Authority. Kenya has become an established tech leader in Africa in recent years – generating greater volumes of digital data as a result. The goal of this study is to explore what new sources of digital data, and methods for analysis, could be helpful in answering the question: “What barriers to accessing loans do small businesses in Kenya face?” Accordingly, this presentation paints a picture of the big data landscape in Kenya, shows preliminary findings, and lays the groundwork for further investigation.
TRANSCRIPT
Landscaping Study:
Digital Signals & Access
to Finance in Kenya
UN Global Pulse USAID Development Credit Authority (DCA) October 2013
Purpose of the project: The goal of this study is to determine the feasibility of answering the question “what barriers to accessing loans do small businesses in Kenya face?” through analysis of new sources of digital data. The research involved the following elements:
- Digital landscape in Kenya
- Custom analysis of select sources of social media data and online search data
- Assessment of the digital footprint of DCA clients and menu of potentially relevant data sources for further investigation
- Conclusions & recommendations
The exercise is intended to inspire new thinking in how USAID’s Development Credit Authority can use new sources of digital data to inform its work.
Relevant sources of Big Data for Development:
WHAT PEOPLE SAY - (e.g., international and local online news sources, publicly accessible blogs, forum posts, comments and public social media content, online advertising, e-commerce sites and websites created by local retailers that list prices and inventory)
WHAT PEOPLE DO - (e.g., aggregated transactional data from the use of digital services such as financial services (including purchases, money transfers, savings and loan repayments), communications services (such as anonymized records of mobile phone usage patterns) or information services (such as anonymized records of search queries))
For reference, UN Global Pulse’s introductory guide, “Big Data for Development: A Primer,” is available online and for
download at: http://unglobalpulse.org/bigdataprimer
Social Data?
What is social data?
• Social data is the text that individuals share digitally, e.g.
via Twitter, blogs, Facebook
• Social data is a massive amount of qualitative data.
How is social data analyzed?
• Trends in social data can be analyzed by aggregating
volumes of text relating to a set of predefined keywords.
• Computer algorithms can also automatically detect words
that co-occur with predefined keywords, giving context.
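The two analysis steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in the study: the sample posts, the keyword set, and the stopword list are all hypothetical, and matching is done on whole lowercase words.

```python
from collections import Counter
import re

# Hypothetical sample posts as (date, text) pairs; not data from the study.
POSTS = [
    ("2013-01-01", "Need a loan for my biashara"),
    ("2013-01-01", "Bank loan interest is too high"),
    ("2013-01-02", "Watching the rally today"),
    ("2013-01-02", "Applied for a loan, interest is double"),
]
KEYWORDS = {"loan", "loans", "mkopo"}                    # assumed keyword set
STOPWORDS = {"a", "for", "my", "is", "the", "too"}       # assumed stopword list


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def daily_volume(posts, keywords):
    """Count posts per day that mention at least one keyword."""
    counts = Counter()
    for date, text in posts:
        if keywords & set(tokenize(text)):
            counts[date] += 1
    return counts


def co_occurring(posts, keywords, stopwords):
    """Count words that appear in the same post as a keyword."""
    counts = Counter()
    for _, text in posts:
        tokens = set(tokenize(text))
        if keywords & tokens:
            counts.update(tokens - keywords - stopwords)
    return counts


volume = daily_volume(POSTS, KEYWORDS)
context = co_occurring(POSTS, KEYWORDS, STOPWORDS)
```

Here `volume` gives the trend line (matching posts per day) and `context.most_common()` ranks the words that appear alongside the keywords.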
EXAMPLE 1: Rice prices in Indonesia
The number of tweets discussing the price of rice in Indonesia follows a similar
trend to the official inflation statistics for the food basket.
http://unglobalpulse.org/projects/twitter-and-perceptions-crisis-related-stress
EXAMPLE 2: Finance chatter in the US
Twitter chatter on the topic of finance in the US increased significantly
during the US debt ceiling debate in 2009.
http://unglobalpulse.org/projects/twitter-and-perceptions-crisis-related-stress
• Dec 2012: 78.0% of Kenya’s adults had mobile phones.
• Sept 2012: smartphones were estimated at only 7%.
• But in Jan 2013, Safaricom launched a new smartphone that sold out in
less than two weeks.
• Dec 2012: internet subscriptions stood at 9.4m, a growth of 75.1% over
the previous year.
• Including non-subscribers, 41.1% of the population was accessing the internet
by Dec 2012.
Sources: Communications Commission of Kenya Quarterly Sector Statistics Report (2012/13),
AudienceScapes, Internet Access and Use in Kenya (2010)
Kenya’s Digital Landscape
Phone Ownership in Kenya
A 2009 survey showed that there are comparable rates of mobile ownership
among every income bracket in the country.
Source: Ownership and Usage Patterns in Kenya, Amy Wesolowski, Nathan Eagle, Abdisalan M. Noor, Robert
W. Snow, Caroline O. Buckee
10 DCA clients from Kenya Commercial Bank were surveyed. All of these
clients were farmers, from peri-urban or rural areas in Central Kenya.
Gathering Contextual Knowledge:
DCA Client Survey
Social Media Monitoring
• Various tools/platforms, both proprietary and open-source, allow for social media filtering & analysis
• For this analysis, Global Pulse used the Crimson Hexagon ForSight platform, which:
– Provides access to the full archive of public Tweets
– Can automate categorization of tweets, once an analyst creates a set of rules and filters
Step One
• The field survey asked DCA clients to describe, in colloquial language, the words they tend to use when discussing loans/finance.
Step Two
• Use the keywords gleaned from the survey to create a taxonomy
• Test and refine the taxonomy iteratively by exploring Twitter data
Step Three
• Exclude words that create “noise” in the data (i.e. irrelevant posts)
• For example, Kenyan bank KCB sponsors sporting events, so those tweets are excluded:
– Sample tweet: @theARsite Kenya: Amwari to Test New Evo 9 Car At Ngong Ahead of Next Month's KCB Nyeri Rally (All Africa): Share With Fr... http://bit.ly/YtiS9v
Building a taxonomy of keywords
Keywords: (loan OR loans OR mkopo OR wakopo) AND ("Top up" OR "Payback period" OR installments OR expansion OR mpesa OR mbesa OR financing OR "business financing" OR biashara OR dairy OR msoto OR red OR doh OR qualify OR stocking OR application OR maximum OR duration OR interests OR delay OR security OR "land title" OR deed OR "deposit dates" OR tembelea OR "fixed deposit receipts" OR secured OR "calculated interest" OR interest OR guarantees OR guarantor OR lawyer OR Agricultural OR agriculture OR development OR application OR procedures OR payback OR improvement OR n’gombe OR wakora OR repay OR balance OR "agreement letter" OR period OR clear OR siri OR security OR sambaza OR defaulted OR "cooperative society" OR Faulu OR credit OR Agrovets OR mfugo OR zidisha OR "penalty charges" OR penalty OR Emergency OR "ketes temiship" OR inflation OR expectations OR capital OR terms OR payment OR "nilitemelea banki" OR farm OR status OR assets OR asset OR mshwari OR land OR animal OR animals OR "long term" OR "short term" OR "mini statement" OR "mini statements" OR ministatements OR "shamba shape ups" OR "fixed accounts" OR mshwari OR zidisha OR bank OR banki) AND -helb AND -@MweuDeh AND -hooker AND -@helbpage AND -Hooker AND -@HELBpage AND -"car-jacker" AND -Chelsea AND -Manchester
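The logic of the query above (a loan term AND a context term AND none of the excluded terms) can be illustrated with a short Python sketch. The term lists here are abbreviated from the query, and the matching rules (case-insensitive, whole words or phrases) are assumptions; the monitoring platform may match terms differently.

```python
import re

# Abbreviated term lists taken from the query above; the full taxonomy is longer.
LOAN_TERMS = ["loan", "loans", "mkopo", "wakopo"]
CONTEXT_TERMS = ["top up", "payback period", "interest", "mpesa", "biashara", "bank"]
EXCLUDED_TERMS = ["helb", "chelsea", "manchester"]


def contains_any(text, terms):
    """True if any term appears in text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(term) + r"\b", text) for term in terms)


def matches_query(text):
    """Apply (loan term) AND (context term) AND NOT (excluded term)."""
    lowered = text.lower()
    return (
        contains_any(lowered, LOAN_TERMS)
        and contains_any(lowered, CONTEXT_TERMS)
        and not contains_any(lowered, EXCLUDED_TERMS)
    )


relevant = matches_query("I need a bizness loan but the interest is double")
student = matches_query("HELB loan application at the bank")
```

In this sketch the first post is kept and the second is dropped because it mentions HELB, the student loan authority.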
General loan monitor
Original categories:
- I want a loan (Business, Personal)
- I have a loan, negative (Business, Personal)
- I have a loan, positive (Business, Personal)
- I have a loan, neutral (Business, Personal)
- Information seeking
- Information provision
Final categories:
- General Loan, positive
- General Loan, negative
- Seeking information on loans
- Providing information on loans
Categories were rationalized due to lack of data to break down to a more granular level
*Jokes, sports chat, extraneous noise filtered out
Much of the growth in chatter about loans is related to the launch of M-Shwari,
a new mobile-based savings and small loans service available to M-Pesa customers.
General Nature of Loan Chatter, Jan 1, 2013 to March 14, 2013 (before and after M-Shwari is launched)
Sample tweet content: “I need a bizness loan…interest is double”
Understanding sentiment around bank loans helps people make the best decisions when they need loans.
CONTEXT IS KEY!
Another spike in relevant tweets came after an announcement by a Vice Presidential candidate that, if elected, the government would offer an interest-free loan to women and youth. While most tweets were neutral in response, the announcement was also met with skepticism, with people tweeting things like “gullible” and “silly season”.
Loans by Sector
From January 1, 2012 to August 25, 2013, there were 5,317 relevant posts, representing an average of 9.2 posts per day. Twitter chatter grew in a way similar to that seen in the first monitor, with sustained growth after the launch of M-Shwari. This growth is driven by chatter about business and personal loans, as opposed to government loans.
Looking at volume and sentiments of tweets
related to a specific bank
Google Trends
Google Trends makes tools publicly available to track the volume of
searches over time by country. Using Google Trends, it is possible to:
- Track relative changes in search volumes over time.
- Compare different search volumes.
Limitations
- Can’t create subcategories within one search term
- Only one word that commonly occurs with the initial keyword is
given.
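Because Google Trends reports relative rather than absolute search volumes, it can help to see what a relative index looks like. The sketch below is a simplified illustration under stated assumptions: the weekly counts are hypothetical, and scaling the series so that the peak equals 100 is a stand-in for whatever normalization Google actually applies, which is not reproduced here.

```python
def relative_index(counts):
    """Scale one series so that its peak equals 100."""
    peak = max(counts)
    if peak == 0:
        return [0 for _ in counts]
    return [round(100 * count / peak) for count in counts]


def relative_index_shared(series_by_term):
    """Scale several series against their shared peak so they can be compared."""
    peak = max(max(series) for series in series_by_term.values())
    if peak == 0:
        return {term: [0 for _ in series] for term, series in series_by_term.items()}
    return {
        term: [round(100 * count / peak) for count in series]
        for term, series in series_by_term.items()
    }


# Hypothetical weekly search counts; not figures from the study.
weekly_loan = [40, 50, 80, 200, 120]
single = relative_index(weekly_loan)
compared = relative_index_shared({"loan": [40, 200], "helb": [100, 50]})
```

The first function mirrors tracking relative changes for one term; the second mirrors comparing terms, where every series is scaled against the same peak.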
Searches for “loan”
• There is no straightforward way to create sub-categories within overall loan
searches. In Google Trends there are two ways to approximate this.
• First is to specify a full search phrase in quotes, for example “business loan”
or “personal loan.” No data was returned.
• Second is to exclude words from the search, for example “Loan -student.”
• Google Trends shows the top co-occurring word with “loan,” which is HELB, the
student loan authority.
How to access other potentially relevant sources of data
This study included a preliminary analysis of readily available data (Twitter and Google search). However, other sources of data which may reveal highly relevant “digital signals” about the topic would require more effort to access. Namely, this includes mobile phone data and information found on disparate websites.
Data from Mobile Services
Partnerships or subscriptions with service providers (like M-Shwari) would be required to access the data.
Content from Websites
A great deal of information is available online and is updated as things change. While it is not feasible to gather this information by hand, a “scraper” can be built to automatically collect the data and integrate it into a usable format.
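As a rough idea of what such a scraper does, the sketch below parses a small HTML table into structured records using only the Python standard library. The page fragment, product names, and rates are hypothetical placeholders, not content from any bank's website; a real scraper would also need to fetch pages and handle each site's own layout.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; real bank pages will be structured differently.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Interest rate</th></tr>
  <tr><td>Business loan</td><td>18%</td></tr>
  <tr><td>Farm input loan</td><td>16%</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return table rows as dictionaries keyed by the header row."""
    parser = TableScraper()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(SAMPLE_HTML)
```

Running the scraper on a schedule and storing each day's `records` would give a time series of published terms.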
Example of website which publishes
relevant real-time content: Equity Bank
Opportunities
• Despite small numbers of contextually relevant tweets available in
2013, there is an emergence of a Kenya-specific Twitter culture.
• Twitter is being used to seek, access and share information about
loans, especially mobile loans, as well as to comment on the news
related to personal and business loans.
• Much of the chatter is related to M-Shwari and other Safaricom
products. A future iteration of the monitors could focus solely on
non-traditional banking or exclude M-shwari to focus solely on
traditional bank loans.
• Monitoring banks’ Twitter handles could provide insight into (1)
products & services available, and, to a lesser extent, (2) information
seeking behavior.
• Chance to get in early and set up monitors now, rather than
reverse-engineer the analysis later
Challenges
• Kenya’s current digital landscape: quantity of relevant social
media data restricts its utility (small changes, for example due to
a popular retweet or the behavior of one Twitter user, can create
spikes in the charts)
• The social media chatter is largely driven by news/events or
information-seeking, rather than being substantively about loans
• There is a lot of “noise” in the data. For KCB, this noise includes
sports chatter. For both banks, this includes news items related
to the overall business of the bank, not necessarily directly
related to bank services.
• Short-term, social data can likely only provide supplementary
insight about barriers to finance in Kenya. (e.g. analysis could be
useful for revealing early themes or trends, to inform the topics
of focus groups to validate)
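The first challenge above (spikes caused by one popular retweet or one user) can be screened for with a simple check. The sketch below is one possible approach under stated assumptions: the thresholds (`ratio`, `dominance`) and the sample data are hypothetical, and a day counts as a spike when its volume is several times the median daily volume.

```python
from collections import Counter, defaultdict
from statistics import median


def flag_spikes(posts, ratio=3.0, dominance=0.5):
    """Find spike days and report whether a single user drives each one.

    posts: iterable of (day, user) pairs.
    Returns {day: True/False}, where True means one user accounts for more
    than `dominance` of that day's posts.
    """
    by_day = defaultdict(list)
    for day, user in posts:
        by_day[day].append(user)
    baseline = median(len(users) for users in by_day.values())
    flagged = {}
    for day, users in by_day.items():
        if len(users) >= ratio * baseline:
            _, top_count = Counter(users).most_common(1)[0]
            flagged[day] = top_count / len(users) > dominance
    return flagged


# Hypothetical sample: two quiet days, then a day dominated by user "x".
sample = (
    [("d1", "a"), ("d1", "b"), ("d2", "c"), ("d2", "d")]
    + [("d3", "x")] * 6
    + [("d3", "e"), ("d3", "f")]
)
spikes = flag_spikes(sample)
```

A spike flagged `True` is a candidate for exclusion or manual review before it is read as a real change in chatter.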
• Big data projects work best when there are several iterations and collaboration
between topical domain experts (who understand both the context, and the
programmatic information gaps or needs), and data scientists/analysts.
• Need to be imaginative in how new data sources could be used to supplement
traditional data-collection and decision-making processes within organizations
• While social media is on the rise in Kenya, it is also clear that for the purposes of
informing research on financial inclusion, it is not saturated enough yet.
• If this project were to be extended in Kenya, other digital data sources might be
more useful for exploration. Accessing that data would require establishment of
partnership agreements (in the case of mobile phone data), or new capacities (in
the case of scraping data from websites).
• If DCA is interested in beginning to use Twitter data to inform its work
now, it might be a good idea to pilot the methodology in a country
that has a stronger social media culture.
What comes next?