what is a data scientist
TRANSCRIPT
#DataTalk What is a Data Scientist?
LIVE TWEETCHAT
FEATURING: Dr. Michael Wu
Chief Scientist, Lithium (@mich8elwu)
Join our #DataTalk on Thursdays at 5 p.m. ET.
This week, we tweeted with Dr. Michael Wu, the Chief Scientist at Lithium, where he applies data-driven methodologies to investigate the complex dynamics of the social web.
Check out all tweets from this Twitter chat:
ex.pn/scientist
What type of work does a data scientist do?
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“A data scientist’s work includes everything from data infrastructure (capture, store, process) to data service (retrieval).”
#DataTalk ex.pn/datatalk
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“A data scientist converts data into business intelligence.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“A data scientist’s work includes decision science, business intelligence, customer analytics, marketing analytics, fraud, security, etc.”
What are attributes of a good data scientist?
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“To be a data scientist, you need technical expertise in computer science and statistics, and knowledge/experience with large data sets.”
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“Data scientists should have good intuition, strong coding capability, solid training in statistics & machine learning.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Educational background for data scientists can be computational genomics, astrophysics, fluid dynamics, chemistry, biophysics (like me) ...”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“To be a good data scientist, you need more than tech expertise. You must be a good communicator to explain complex data/analysis.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Good data scientists also need to be passionate about data. I’d highly value curiosity, creativity and perseverance when hiring one.”
What kinds of companies have (or need) data scientists?
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Any company that is already invested in modern big data infrastructure will need data scientists to crunch the data.”
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“All companies need to have data scientists to stay competitive.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“All businesses use data, and data will grow so big that our brain and databases eventually can’t handle … they all need data scientists eventually.”
What types of teams do data scientists work with?
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“It all depends on how the data organization is structured within the enterprise: independent team, hub & spoke, or silo in dept.”
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“Data scientists can work in R&D, product development and support business operations.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“SMBs use hub and spoke data teams where they report to different departments, but collaborate and work together, so data expertise is shared.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Large companies typically have entire teams of data scientists within each department and they usually don’t collaborate.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Personally, I work internally with engineering, product, marketing, best practice, service, consulting, strategy, even sales and human resources.”
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“I process data, build models, engage with clients, and facilitate collaboration among Experian Data Labs.”
What are some big challenges that data scientists face?
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“One of the biggest challenges for data scientists is communication. Many data scientists speak tech & stats, but they don’t speak business.”
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“Challenges for data scientists: data governance and what data can be used for.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Other challenges for data scientists: data access, data integration, and motivation.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“If a company is starting a data science initiative, their data scientist may not have access to all data due to security and compliance.”
Is there an art and science to working with big data?
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“Absolutely! Good intuition and domain knowledge are the keys for successful big data projects.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“There’s definitely science to working with big data… there are rigorous stats and implementation details you learn from statistics and computer science.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“There’s also art in working with big data, and this only comes with years of experience working with big data.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Picking a good problem is sort of an art; choosing the right features from an infinite number of features, too (feature engineering).”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Exploratory data analysis (EDA): getting a feel or a hunch for how the data behaves is definitely an art.”
How can data scientists make a big impact in their business?
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“First, data scientists need to learn about the business, so they have the context to interpret the data and results of models/analyses.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Second, they need to pick a good problem: the most impactful problem that can be addressed with data they have access to.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Third, they must communicate effectively and make businesses understand the analysis and business implication of the insight they found.”
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“To be impactful, data scientists need to keep an open mind and concentrate efforts on the most impactful problems.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“A good data scientist must be a good communicator, storyteller, teacher, etc. who can simplify complex data science for business.”
What are some big data trends?
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“Big data will be embraced by more and more businesses. More decisions will be driven by data and analytics.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“In the past 5 years, most big data tech operated at the infrastructure layer. Now more people are focused on the algorithm layer.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“If there’s a big data trend, it’s the shift from infrastructure to analytics/algorithms on people’s big data assets.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“It used to be that data scientists could do everything about any data; now there is data engineering, algorithms, decision science.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Now there are experts in natural language processing, image analysis, video/audio processing, streaming data, etc.”
Why should companies invest more in data science?
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“Businesses that invest in big data wisely will have a huge competitive advantage over their peers.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Companies should invest more in data science. They will need to eventually anyway.”
Any final tips for those who want to work in data science?
Shanji Xiong, Global Chief Scientist, Experian (@ShanjiXiong)
“Tips for new data scientists: Keep an open mind, think outside the box, and work hard. The future is bright.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Final tips: Learn the tech and stats, learn the business context, learn to communicate the tech/stats to business to bridge the gap.”
Dr. Michael Wu, Chief Scientist, Lithium (@mich8elwu)
“Be patient, follow your passion (which should be data), and pick a good problem to solve.”
Join our #DataTalk on Twitter on Thursdays at 5 p.m. ET.
experian.com/datatalk