Using Machine Learning for Network Capacity Management
Speaker: Taghrid Samak
Host: Lori Pollock
CRA-W Undergraduate Town Hall, November 9th, 2017

Speaker & Moderator

Taghrid Samak
Taghrid Samak holds a doctorate degree in computer science from DePaul University, a BSc and MSc in computer science from Alexandria University in Egypt, and is currently pursuing her Juris Doctorate degree at the University of San Francisco. At Google, Taghrid applies statistical modeling for diverse network applications from capacity planning to wireless networks. Previously, she worked at Lawrence Berkeley National Laboratory where her research focused on applying data analysis and machine learning to enable cross-discipline scientific discovery.

Taghrid is co-founder and steering committee member of the Arab Women in Computing organization and volunteers as a mentor for various women in computing organizations.

Lori Pollock
Dr. Lori Pollock is a Professor in Computer and Information Sciences at University of Delaware. Her current research focuses on program analysis for building better software maintenance tools, software testing, energy-efficient software and computer science education. Dr. Pollock is an ACM Distinguished Scientist and was awarded the University of Delaware’s Excellence in Teaching Award and the E.A. Trabant Award for Women’s Equity.

About Me

Background
• Originally from Alexandria, Egypt
• BSc and MSc Computer Science, Alexandria University, Egypt
• PhD in Computer Science, DePaul University
• JD Student, University of San Francisco School of Law

Career
• Currently: Sr. Data Analyst @ Google, Corporate Networking
• Research Scientist @ Lawrence Berkeley National Lab (data analysis for Biology, Physics, Systems, …)
• Research Intern @ Bell Labs
• Research Assistant @ DePaul University
• Teaching Assistant @ Alexandria University

“A Day in the Life… of a Data Scientist”

Validation, Verification, Exploration, …

Modeling, Learning, Optimizations, …

Business Intelligence, Reporting, …

Coding, Analysis, Communication

Data, Knowledge, Action

Using Machine Learning for Network Capacity Management

Taghrid Samak
Senior Data Analyst
Google Corporate Networking

Agenda

• Background
  – Capacity planning in enterprise networks
  – Machine learning

• Usage forecast for Google’s enterprise network
  – Data
  – Model - knowledge
  – Results - action

Networking Overview

Home Network Office Network

Images from: https://www.lucidchart.com/pages/examples/network-diagram

Network Capacity Management

• Ensuring sufficient resources on the network to satisfy performance requirements

• Home network
  – choosing subscription from the Internet Service Provider

• Enterprise network
  – interconnected office networks
  – office design optimizations
  – managing bandwidth inside and outside of the enterprise

Capacity Management Points

Wide Area Network (WAN)
- Multiple Offices
- Offices to Internet

Local Area Network (LAN)
- Within Offices

Capacity Management Points

Access Layer

Users’ point of connection

Enterprise Network Data Analysis

• Data
  – Traffic passing through the network at each level from access layer to WAN

• Knowledge → Action
  – Which users or applications are using the network?
  – Can we optimize the network design?
  – When do we need capacity upgrade or downgrade for a specific office?
  – Can we predict performance problems?
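
The upgrade/downgrade question can be illustrated with a minimal sketch. It assumes a per-office list of utilization samples and a known circuit capacity; the 95th-percentile summary and the 80% / 20% thresholds are hypothetical choices made for illustration, not values taken from the talk.

```python
def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    # Integer ceiling of q * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (q * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def capacity_action(utilization_mbps, capacity_mbps, high=0.8, low=0.2):
    """Suggest an action from how busy the circuit is at its 95th percentile.

    `high` and `low` are hypothetical thresholds, expressed as fractions
    of the circuit capacity.
    """
    busy = percentile(utilization_mbps, 95) / capacity_mbps
    if busy > high:
        return "upgrade"
    if busy < low:
        return "downgrade"
    return "keep"


# Made-up samples for one office on a 1000 Mbps circuit.
samples = [420, 510, 630, 700, 850, 910, 880, 640, 530, 480]
action = capacity_action(samples, capacity_mbps=1000)
```

A rule like this only reacts to what has already happened; the forecasting work described later in the talk aims to answer the same question ahead of time.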

What is Machine Learning?

“Field of study that gives computers the ability to learn without being explicitly programmed”
- Wikipedia definition

“How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?”
- Tom M. Mitchell, CMU

The Data

• Table of observations/samples/objects/…
• Each observation has dimensions/attributes/features/fields/variables/measurements/…

                Feature 1   Feature 2   Feature 3   …   Feature k
Observation 1
Observation 2
…
Observation n

Typical Workflow

• Data Collection
• Preprocessing (filtering, scaling, sampling, …)
• Feature Extraction
• Dimensionality Reduction
• Learning the model (build knowledge)
  – Training
  – Testing
  – Validation
• Use the model
  – predictions
  – actions
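
The workflow above can be sketched end to end in a few lines. This is only an illustration under simple assumptions: the data is made up, preprocessing is reduced to min-max scaling, and the "model" is the simplest possible baseline (predict the mean training label). All function names are hypothetical.

```python
import random


def min_max_scale(values):
    """Preprocessing: rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_mean_model(train_rows):
    """Learning: a baseline model that always predicts the mean label."""
    labels = [y for _, y in train_rows]
    return sum(labels) / len(labels)


def mean_absolute_error(model, rows):
    """Testing: average absolute gap between the prediction and the label."""
    return sum(abs(y - model) for _, y in rows) / len(rows)


# Data collection (made-up numbers): (feature, label) pairs.
raw = [(10, 1.0), (20, 2.1), (30, 2.9), (40, 4.2),
       (50, 5.1), (60, 5.8), (70, 7.0), (80, 8.1)]

features = min_max_scale([x for x, _ in raw])
rows = list(zip(features, [y for _, y in raw]))
train_rows, test_rows = train_test_split(rows)
model = fit_mean_model(train_rows)
test_error = mean_absolute_error(model, test_rows)
```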

The Learning Process

• Can we find a function that accurately fits the data?

• Supervised learning
  – the function predicts a feature of interest accurately for the training data and new samples
• Unsupervised learning
  – the function creates the correct pattern/groups of data

• What’s needed
  – Data
  – Hypothesis space of potential functions
  – Optimization function to minimize the error
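
The three ingredients can be made concrete with a small sketch. It assumes, purely for illustration, that the hypothesis space is the set of straight lines y = slope * x + intercept and that the error to minimize is the sum of squared differences; for that choice the best line has a closed-form solution. The data points are made up.

```python
def fit_line(xs, ys):
    """Return the (slope, intercept) that minimizes the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def squared_error(params, xs, ys):
    """The quantity being minimized: sum of squared prediction errors."""
    slope, intercept = params
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))


# Data: made-up observations.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

best = fit_line(xs, ys)                      # the learned function
best_error = squared_error(best, xs, ys)     # its error on the data
guess_error = squared_error((2.0, 1.0), xs, ys)  # any other line does no better
```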

Machine Learning Methods

[Figure: two panels, Supervised and Unsupervised, each plotted on axes x1 and x2]

Supervised Learning

- Independent variables (X): dimensions/features/…
- Dependent variables (Y): classifications/labels/…
- Learning Y’s from values of X’s
- f: X → Y
- Usually only one “y”

                x1   x2   x3   y1   y2   (y columns are the labels)
Observation 1
Observation 2
…
Observation n
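
As one concrete instance of learning f: X → Y, here is a minimal sketch of a nearest-neighbor classifier: a new observation receives the label of the closest labeled observation. The technique is chosen only for illustration (the slides do not name a specific method), and the data and labels are made up.

```python
import math


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training observation closest to `query`."""
    best_index = min(
        range(len(train_x)),
        key=lambda i: math.dist(train_x[i], query),  # Euclidean distance
    )
    return train_y[best_index]


# Made-up labeled observations: (x1, x2) -> y
train_x = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 7.5)]
train_y = ["low", "low", "high", "high"]

prediction = nearest_neighbor_predict(train_x, train_y, (7.0, 9.0))
```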

Unsupervised Learning

- No labels
- Learning “patterns” from values of X’s

                x1   x2   x3   ...   xk
Observation 1
Observation 2
…
Observation n
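
A minimal sketch of learning groups without labels, using k-means clustering (Lloyd's algorithm). The technique is picked for illustration only; the points and the choice of starting centers are made up.

```python
import math


def k_means(points, centers, iterations=10):
    """Lloyd's algorithm: alternate assignment and center-update steps."""
    centers = list(centers)
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignment = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its points.
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment


# Made-up unlabeled observations with two visible groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]

centers, assignment = k_means(points, centers=[points[0], points[3]])
```

The result depends on the starting centers; practical implementations usually try several initializations and keep the best one.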

The Learning Process - Data Flow

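[Figure: data flow. Raw Data is divided into Training Data and Validation Data. Training Data is divided into Split 1 … Split k, each with its own training and testing portion (Training 1 / Testing 1 … Training k / Testing k), producing Model 1 … Model k. Cross Validation selects Model*, which goes to Model Evaluation on the Validation Data.]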
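
The data flow can be sketched as follows. This is an illustration under stated assumptions: the data is made up, the two candidate model families (a constant and a straight line) are hypothetical stand-ins for "Model 1 … Model k", and folds are formed by taking every k-th row.

```python
def k_fold_splits(rows, k):
    """Yield (training, testing) pairs; fold i tests on every k-th row from i."""
    for i in range(k):
        testing = rows[i::k]
        training = [r for j, r in enumerate(rows) if j % k != i]
        yield training, testing


def fit_mean(rows):
    """Candidate 1: always predict the mean label."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def fit_line(rows):
    """Candidate 2: least-squares straight line."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    slope = sum((x - mx) * (y - my) for x, y in rows) / sxx
    return lambda x: my + slope * (x - mx)


def mean_squared_error(model, rows):
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def cross_validated_error(fit, rows, k):
    """Average testing error over the k splits."""
    errors = [mean_squared_error(fit(tr), te) for tr, te in k_fold_splits(rows, k)]
    return sum(errors) / len(errors)


# Made-up raw data, roughly y = 2x + 1 with a small alternating offset.
data = [(float(x), 2.0 * x + 1.0 + (0.3 if x % 2 else -0.3)) for x in range(12)]
training_data, validation_data = data[:9], data[9:]

candidates = {"mean": fit_mean, "line": fit_line}
cv_errors = {name: cross_validated_error(fit, training_data, k=3)
             for name, fit in candidates.items()}

# Model*: the candidate with the lowest cross-validated error,
# refit on all training data and then scored on the held-out validation data.
best_name = min(cv_errors, key=cv_errors.get)
final_model = candidates[best_name](training_data)
validation_error = mean_squared_error(final_model, validation_data)
```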

Google Offices WAN Capacity Forecast

• Forecasting network usage for each office based on historical data
  – When do we need circuit capacity upgrade or downgrade for a specific office?
  – How to model usage changes for changing headcounts?

Taghrid Samak, Mark Miklic, “WAN Capacity Forecasting for Large Enterprises.” IEEE/IFIP International Workshop on Analytics for Network and Service Management, AnNet 2016

Google Enterprise Network

Data

• Historical inbound/outbound bandwidth utilization for each office - SNMP
• Historical and forecast headcount - HR
• BW = f(hc, T)
• Approximately 400K samples per office
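
Before BW = f(hc, T) can be fit, the two sources have to share a time axis. The sketch below shows one way this might look, assuming (hypothetically) that utilization arrives as many samples per day while headcount is known only on some days. The daily-peak aggregation, the linear interpolation, and the trailing moving average are illustrative choices, not the method from the talk, and all numbers are made up.

```python
def daily_peak(samples):
    """Alignment: collapse (day, mbps) samples into one peak value per day."""
    peaks = {}
    for day, mbps in samples:
        peaks[day] = max(mbps, peaks.get(day, mbps))
    return peaks


def interpolate_headcount(known, day):
    """Interpolation: linear estimate from sorted (day, headcount) points."""
    for (d0, h0), (d1, h1) in zip(known, known[1:]):
        if d0 <= day <= d1:
            return h0 + (h1 - h0) * (day - d0) / (d1 - d0)
    raise ValueError("day outside the known headcount range")


def moving_average(values, window):
    """Smoothing: trailing moving average over up to `window` values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical inputs: utilization samples and sparse headcount records.
samples = [(0, 40.0), (0, 55.0), (1, 60.0), (1, 52.0), (2, 70.0), (2, 65.0)]
headcount = [(0, 100), (2, 120)]

peaks = daily_peak(samples)
aligned = [(day, interpolate_headcount(headcount, day), peaks[day])
           for day in sorted(peaks)]
smoothed = moving_average([bw for _, _, bw in aligned], window=2)
```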

Modeling Process Per Office

[Figure: modeling pipeline, run per office]

Inputs: Historical Utilization, Historical Headcount, Forecast Headcounts

Data Preparation
- Alignment
- Interpolation
- Smoothing

Modeling (Model 1 … n, Model Parameters)
- Regression
- Non-negative least squares
- Model order

Model Selection
- Cross validation
- Minimize error for the most recent data

Output: Forecast
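
The modeling and selection steps might look roughly like the sketch below. It rests on several assumptions that are not in the slides: the model is a polynomial in (scaled) headcount, "model order" is the polynomial degree, the non-negative least squares problem is solved by cyclic coordinate descent, and the "most recent data" is simply the last few samples. The headcount and utilization numbers are made up.

```python
def design_row(x, order):
    """Polynomial features [1, x, x**2, ..., x**order]."""
    return [x ** j for j in range(order + 1)]


def nnls_fit(rows, targets, sweeps=2000):
    """Non-negative least squares by cyclic coordinate descent."""
    columns = list(zip(*rows))
    norms = [sum(v * v for v in col) for col in columns]
    coef = [0.0] * len(columns)
    residual = list(targets)  # residual = targets - rows @ coef
    for _ in range(sweeps):
        for j, column in enumerate(columns):
            if norms[j] == 0.0:
                continue
            # Best value for coef[j] with the others fixed, clipped at zero.
            step = sum(v * e for v, e in zip(column, residual)) / norms[j]
            new = max(0.0, coef[j] + step)
            delta = new - coef[j]
            if delta != 0.0:
                residual = [e - delta * v for e, v in zip(residual, column)]
                coef[j] = new
    return coef


def predict(coef, x):
    return sum(c * x ** j for j, c in enumerate(coef))


def select_order(xs, ys, orders, recent):
    """Fit each order on older samples; keep the one with the least
    mean squared error on the `recent` newest samples."""
    old_x, old_y = xs[:-recent], ys[:-recent]
    new_x, new_y = xs[-recent:], ys[-recent:]
    best = None
    for order in orders:
        coef = nnls_fit([design_row(x, order) for x in old_x], old_y)
        error = sum((predict(coef, x) - y) ** 2
                    for x, y in zip(new_x, new_y)) / recent
        if best is None or error < best[0]:
            best = (error, order, coef)
    return best


# Hypothetical history for one office, oldest sample first.
headcount = [20, 40, 60, 80, 100, 120, 140, 160, 180, 200]
scale = 200.0                               # keeps features in [0, 1]
xs = [h / scale for h in headcount]
ys = [20.0 + 80.0 * x for x in xs]          # made-up utilization in Mbps

error, order, coef = select_order(xs, ys, orders=[0, 1, 2], recent=3)
forecast = predict(coef, 220 / scale)       # utilization at a forecast headcount of 220
```

The non-negativity constraint keeps every term from pulling the forecast downward as headcount grows, which is one plausible reason to prefer it over ordinary least squares in this setting.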

Forecast Results

Extracurricular Activities and Time Management

Speaker: Taghrid Samak
Host: Lori Pollock

Before we start

• What works for you might not work for everyone
• Find your own pace and balance

• Here is some anecdotal advice :)

Taghrid’s Extracurriculars

• As a graduate student
  – UPE Honor Society, DePaul Chapter President
  – ACM-W, DePaul Chapter Treasurer
  – Egyptian Student Association National Executive Committee
• As a professional
  – Program committee member for technical conferences
  – Arab Women in Computing Co-founder (arabwic.org)
  – Mentor for US Dept of State TechWomen Program (techwomen.org)
  – Internal diversity efforts within Google
  – USF Law School Dean’s Merit Scholarship

Extracurricular Activities

• Activities outside of the “normal” realm of study/work

• As a student
  – Normal → school work, part-time job
  – Extracurricular → student organizations, sports, non-profit, …

• As a professional
  – Normal → full-time job
  – Extracurricular → sports, non-profit, mentoring, …

Extracurricular Activities

• Which activity is right for me?
  – Follow your passion
  – Commit and follow through

• Personal benefits
  – Helping causes you care about
  – Building friendships

• Professional benefits
  – Building resume and network
  – Learning new skills
  – Getting experience

Time Management Strategies - prep work

• Clear and focused goals/tasks
  – Your to-do list
  – Dynamic and flexible
  – Learn to say “no”

• Prioritize
  – By value
  – By time needed
  – By deadline

Time Management Strategies - steps

• Planning
  – By priority
  – Day-to-day
  – Short- and long-term

• Execution
  – Avoid procrastination
  – Limit interruptions
  – Limit multitasking

• Evaluate and readjust
  – Identify areas of high/low productivity
  – Redefine priorities
  – Ask for help