Building a Machine Learning App with AWS Lambda
TRANSCRIPT
![Page 1: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/1.jpg)
BUILDING A MACHINE LEARNING APPLICATION WITH AWS LAMBDA
Ludi [email protected]
Silicon Valley Big Data Science Meetup, March 17, 2016
(+ help from Tom and Prithvi)
![Page 2: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/2.jpg)
BUILDING A MACHINE LEARNING APPLICATION WITH AWS LAMBDA
Q: What is AWS Lambda?
A: AWS Lambda is a compute service that runs code (a Lambda function) on demand. It simplifies the process of running code in the cloud by managing compute resources automatically.
Offloads DevOps tasks related to VMs:
• Server and operating system maintenance
• Capacity provisioning
• Scaling
• Code monitoring and logging
• Security patches
![Page 3: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/3.jpg)
MAJOR STEPS
Step 1: Identify problem to solve
Step 2: Train model on data
Step 3: Export the model as a POJO
Step 4: Write code for Lambda handler
Step 5: Build deployment package (.zip file) and upload to Lambda
Step 6: Map API endpoint to Lambda function
Step 7: Embed endpoint in application
![Page 4: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/4.jpg)
A CONCRETE USE CASE: DOMAIN NAME CLASSIFICATION
Malicious domains
• Carry out malicious activity: botnets, phishing, malware hosting, etc.
• Names are generated by algorithms to defeat security systems
Goal: Classify domains as legitimate vs. malicious

| Legitimate | Malicious |
| --- | --- |
| h2o | zyxgifnjobqhzptuodmzov |
| zen-cart | c3p4j7zdxexg1f2tuzk117wyzn |
| fedoraforum | batdtrbtrikw |
![Page 5: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/5.jpg)
FEATURES
• String length
• Shannon entropy
  o Measure of uncertainty in a random variable
• Number of substrings that are English words
• Proportion of vowels
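
The four features above can be sketched in a few lines of Python 3. This is only an illustration, not the code from the talk: the function names `entropy`, `p_vowels`, and `num_valid_substrings` match the ones that appear in the Jython slide, but their bodies, the `min_len` cutoff, and the `features` helper are assumptions.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def entropy(s):
    """Shannon entropy (bits per character) of the character distribution in s."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def p_vowels(s):
    """Proportion of characters in s that are vowels."""
    if not s:
        return 0.0
    return sum(ch in VOWELS for ch in s.lower()) / len(s)


def num_valid_substrings(s, words, min_len=2):
    """Count substrings of s with at least min_len characters found in `words`.

    min_len is an assumed cutoff; the talk does not say how short a
    substring may be and still count as an English word.
    """
    s = s.lower()
    return sum(
        s[i:j] in words
        for i in range(len(s))
        for j in range(i + min_len, len(s) + 1)
    )


def features(domain, words):
    """Feature vector for the part of the domain before the first dot."""
    name = domain.split(".")[0]
    return [len(name), entropy(name), p_vowels(name),
            num_valid_substrings(name, words)]
```

In practice `words` would be a `set` loaded from the English word list on the DATA slide, so that each membership test is constant time.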
![Page 6: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/6.jpg)
DATA
• Domains and whether they are malicious
  o http://datadrivensecurity.info/blog/data/2014/10/legit-dga_domains.csv.zip
  o 133,927 rows
• English words
  o https://raw.githubusercontent.com/dwyl/english-words/master/words.txt
  o 354,985 rows
![Page 7: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/7.jpg)
MODEL INFORMATION
Malicious Domain Model
Algorithm: GLM
Model family: Binomial
Regularization: Ridge
Threshold (max F1): 0.4935

Confusion matrix on validation data (rows: actual class, columns: predicted class)

| Class | 0 | 1 | Error |
| --- | --- | --- | --- |
| 0 | 15889 | 315 | FPR 0.0194 |
| 1 | 346 | 10043 | FNR 0.0333 |
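
As a quick check of the error rates reported with the confusion matrix, the sketch below recomputes them from the four counts. It assumes rows are actual classes and columns are predicted classes, which is the reading that reproduces the reported FPR and FNR. The helper name `binary_rates` is made up for this example.

```python
def binary_rates(tn, fp, fn, tp):
    """Error rates and F1 from the four cells of a binary confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "fpr": fp / (tn + fp),  # actual 0 predicted as 1
        "fnr": fn / (fn + tp),  # actual 1 predicted as 0
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts taken from the confusion matrix on validation data
rates = binary_rates(tn=15889, fp=315, fn=346, tp=10043)
print({name: round(value, 4) for name, value in rates.items()})
```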
![Page 8: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/8.jpg)
WORKFLOW FOR THIS APP
[Flowchart nodes: Visit web page; Input domain name; Get predictions; Malicious domain? (Yes: Malicious, No: Legitimate)]
![Page 9: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/9.jpg)
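APP ARCHITECTURE DIAGRAM
[Diagram: the JavaScript app sends the domain name by HTTPS POST to a REST endpoint, which invokes Lambda. Inside Lambda, the Lambda function handler, the Jython feature munging code, and the H2O model POJO produce the prediction, which is returned as JSON with the prediction.]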
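
The request side of this architecture can be sketched as follows. The app in the talk is written in JavaScript; Python is used here only for illustration. The endpoint URL is a placeholder, the `domain` field name is inferred from `request.domain` in the Lambda handler, and the shape of the JSON response is not specified in the slides, so `classify` simply returns whatever JSON comes back.

```python
import json
import urllib.request

# Placeholder URL: substitute the endpoint that API Gateway assigns to your function.
ENDPOINT = "https://example.execute-api.us-east-1.amazonaws.com/prod/predict"


def build_request(domain, endpoint=ENDPOINT):
    """Build (but do not send) the HTTPS POST carrying the domain name."""
    body = json.dumps({"domain": domain}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(domain, endpoint=ENDPOINT, timeout=15):
    """Send the request and decode the JSON reply."""
    request = build_request(domain, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```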
![Page 10: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/10.jpg)
LAMBDA FUNCTION HANDLER

    public static ResponseClass myHandler(RequestClass request, Context context) throws PyException {
        PyModule module = new PyModule();
        // Prediction code is in pymodule.py
        double[] predictions = module.predict(request.domain);
        return new ResponseClass(predictions);
    }

![Page 11: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/11.jpg)
JYTHON FEATURE MUNGING

    def predict(domain):
        domain = domain.split('.')[0]
        row = RowData()
        functions = [len, entropy, p_vowels, num_valid_substrings]
        eval_features = [f(domain) for f in functions]
        names = NamesHolder_MaliciousDomainModel().VALUES
        beta = MaliciousDomainModel().BETA().VALUES
        feature_coef_product = [beta[len(beta) - 1]]
        for i in range(len(names)):
            row.put(names[i], float(eval_features[i]))
            feature_coef_product.append(eval_features[i] * beta[i])

        # prediction
        model = EasyPredictModelWrapper(MaliciousDomainModel())
        p = model.predictBinomial(row)

![Page 12: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/12.jpg)
H2O MODEL POJO

    static final class BETA_0 implements java.io.Serializable {
        static final void fill(double[] sa) {
            sa[0] = 1.49207826021648;
            sa[1] = 2.8502716978560194;
            sa[2] = -8.839804567200542;
            sa[3] = -0.7977065034624655;
            sa[4] = -14.94132841574946;
        }
    }

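
To make the role of these coefficients concrete, here is a minimal sketch of how a binomial GLM turns a feature vector into a probability. It rests on several assumptions: that the last coefficient is the intercept (as `beta[len(beta) - 1]` in the Jython code suggests), that the first four line up with string length, entropy, proportion of vowels, and number of valid substrings in that order, that the features are used unscaled, and that the probability refers to the malicious class. In the app itself the prediction comes from `EasyPredictModelWrapper`, not from code like this.

```python
import math

# Coefficients copied from the BETA_0 class on the slide.
BETA = [1.49207826021648, 2.8502716978560194, -8.839804567200542,
        -0.7977065034624655, -14.94132841574946]
THRESHOLD = 0.4935  # max-F1 threshold from the model information slide


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def glm_probability(features, beta=BETA):
    """Logistic of intercept + sum(coefficient * feature); intercept assumed last."""
    if len(features) != len(beta) - 1:
        raise ValueError("expected %d features" % (len(beta) - 1))
    eta = beta[-1] + sum(b * x for b, x in zip(beta, features))
    return sigmoid(eta)


def is_malicious(features, beta=BETA, threshold=THRESHOLD):
    """Apply the classification threshold to the predicted probability."""
    return glm_probability(features, beta) >= threshold
```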
![Page 13: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/13.jpg)
HANDS-ON DEMONSTRATION
STEP 1: Build

    $ git clone https://github.com/h2oai/app-malicious-domains
    $ cd app-malicious-domains
    $ gradle wrapper
    $ ./gradlew build

STEP 2: Create Lambda function and set API endpoint
See instructions and screenshots in README.md
STEP 3: Use the app in a web browser

    $ ./gradlew jettyRunWar -x generateModel

http://localhost:8080
![Page 14: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/14.jpg)
TROUBLESHOOTING
• Common Py errors
  o Another H2O is already running
    • Py script can't find the data in h2o.import_file()
• Common Java errors
  o Java not installed at all
    • Also, must install a JDK (Java Development Kit) so that the Java compiler is available (JRE is not sufficient)
  o Not connected to the internet
    • Gradle needs to fetch some dependencies from the internet
• Common Lambda errors
  o Error in uploading .zip file
    • Check if the function already exists and, if not, try again. For slower internet connections, try uploading .zip file with S3 link.
  o Timeout error when testing Lambda function
    • Go to advanced settings and increase Timeout value
  o Gateway Timeout (504 error)
    • This is Lambda's cold start behavior. Keep trying; eventually Lambda kicks in
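
The "keep trying" advice for 504 errors can be automated on the client side. The sketch below retries with exponential backoff. `call` is a hypothetical callable that returns a `(status_code, body)` pair, and the attempt count and delays are arbitrary choices, not values from the talk.

```python
import time


def call_with_retries(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Invoke call() until it returns something other than HTTP 504.

    Waits base_delay, then 2x, 4x, ... between attempts. Returns the last
    (status, body) pair, which is still a 504 if every attempt timed out.
    """
    status, body = None, None
    for attempt in range(1, max_attempts + 1):
        status, body = call()
        if status != 504:
            break
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.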
![Page 15: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/15.jpg)
CAVEATS
• Stateless
  o Can access stateful data by calling other web services, such as Amazon S3 or Amazon DynamoDB.
• Cold start behavior
  o Containers are instantiated and reused after the first request and stay active for a window of time (10-20 minutes)
  o "The longer I leave it between invocations, the longer the function takes to warm up"
• API Gateway timeout of 10 secs
  o Can request longer timeout
![Page 16: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/16.jpg)
CONFIGURING LAMBDA FUNCTIONS
• Memory
  o Allocates proportional CPU power, network bandwidth, and disk I/O
  o Easy single-dial solution
  o Log shows how much memory was used for tuning and cost savings
• Timeout
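
One way to use the memory figure in the log is to parse the `REPORT` line that Lambda writes at the end of each invocation. This is a rough sketch: the sample line below is illustrative, with made-up values and a dummy request ID, and the exact field layout should be checked against your own logs before relying on the pattern.

```python
import re

REPORT_RE = re.compile(
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>[\d.]+) ms\s+"
    r"Memory Size: (?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<used_mb>\d+) MB"
)


def parse_report(line):
    """Extract duration and memory figures from a REPORT log line, or None."""
    match = REPORT_RE.search(line)
    if match is None:
        return None
    return {
        "duration_ms": float(match.group("duration_ms")),
        "billed_ms": float(match.group("billed_ms")),
        "memory_mb": int(match.group("memory_mb")),
        "used_mb": int(match.group("used_mb")),
    }


# Illustrative line only; the values are invented for this example.
SAMPLE = ("REPORT RequestId: 00000000-0000-0000-0000-000000000000\t"
          "Duration: 102.25 ms\tBilled Duration: 200 ms\t"
          "Memory Size: 512 MB\tMax Memory Used: 75 MB")

report = parse_report(SAMPLE)
unused_fraction = 1 - report["used_mb"] / report["memory_mb"]
```

A large unused fraction suggests the memory setting could be lowered, though a lower setting also reduces the CPU share and may lengthen the duration.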
![Page 17: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/17.jpg)
LAMBDA RESOURCE LIMITS

| Resource | Default limit |
| --- | --- |
| Memory | 512 MB |
| Number of file descriptors | 1,024 |
| Number of processes and threads (combined total) | 1,024 |
| Maximum execution duration per request | 300 seconds |
| Invoke request body payload size | 6 MB |
| Invoke response body payload size | 6 MB |
| Concurrent executions per region | 100 |

| Item | Default limit |
| --- | --- |
| Lambda function deployment package size (.zip/.jar file) | 50 MB |
| Size of code/dependencies that you can zip into a deployment package (uncompressed zip/jar size) | 250 MB |
![Page 18: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/18.jpg)
LAMBDA PRICING
• Lambda
  o Requests
    • First 1 million per month are free
    • $0.20 per 1 million requests thereafter
  o Duration
    • First 400,000 GB-seconds of compute time per month are free
    • $0.00001667 for every GB-second thereafter
• API Gateway
  o $3.50 per million API calls received plus data transfer costs
• Estimate for Malicious Domain Application:
  • Lambda: $0.37/hour with 10 threads after free-tier
  • API Gateway: $0.71/hour
  • Total: ~$1/hr
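
The pricing rules above translate into a small calculator. This is a sketch under stated assumptions: it ignores the free tier and data transfer, and it rounds each invocation up to a 100 ms billing increment, which was Lambda's billing granularity when the talk was given. The function and parameter names are made up for this example.

```python
import math

REQUEST_PRICE = 0.20 / 1000000   # USD per Lambda request (from the slide)
GB_SECOND_PRICE = 0.00001667     # USD per GB-second (from the slide)
API_CALL_PRICE = 3.50 / 1000000  # USD per API Gateway call (from the slide)


def hourly_cost(calls_per_sec, duration_ms, memory_mb, billing_increment_ms=100):
    """Return (lambda_cost, api_gateway_cost) in USD for one hour of steady load."""
    calls = calls_per_sec * 3600
    # Each invocation is billed in whole increments, rounded up.
    increments = math.ceil(duration_ms / billing_increment_ms)
    billed_seconds = increments * billing_increment_ms / 1000
    gb_seconds = calls * billed_seconds * memory_mb / 1024
    lambda_cost = calls * REQUEST_PRICE + gb_seconds * GB_SECOND_PRICE
    api_cost = calls * API_CALL_PRICE
    return lambda_cost, api_cost


lambda_cost, api_cost = hourly_cost(calls_per_sec=44, duration_ms=102, memory_mb=512)
```

With the 10-thread figures from the performance slide (44 calls/sec, 102 ms median, 512 MB), this gives roughly $0.30/hour for Lambda and $0.55/hour for API Gateway. That is somewhat below the estimate on the slide, whose inputs are not stated.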
![Page 19: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/19.jpg)
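LAMBDA PERFORMANCE

| Memory (MB) | Threads | Loops | Samples | Median (ms) | Min (ms) | Max (ms) | % Error | Throughput (calls/sec) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 512 | 1 | 10000 | 10000 | 102 | 85 | 2137 | 0 | 8.4 |
| 512 | 10 | 1000 | 10000 | 102 | 85 | 30330 | 0.18 | 44 |
| 512 | 100 | 100 | 10000 | 149 | 85 | 30307 | 0.43 | 168 |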
![Page 20: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/20.jpg)
LAMBDA SCALING
• Automatically scales to support the rate of incoming requests
• "No limit to the number of requests your code can handle"
• Starts as many instances of Lambda function as needed
![Page 21: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/21.jpg)
RELATED EXAMPLES
• H2O Generated Model POJO in a Java Servlet container
  o GitHub: h2oai/app-consumer-loan
• H2O Generated Model POJO in a Storm bolt
  o GitHub: h2oai/h2o-world-2015-training
  o tutorials/streaming/storm
• H2O Generated Model POJO in Spark Streaming
  o GitHub: h2oai/sparkling-water
  o examples/src/main/scala/org/apache/spark/examples/h2o/CraigslistJobTitlesStreamingApp.scala
![Page 22: Building a Machine Learning App with AWS Lambda](https://reader034.vdocument.in/reader034/viewer/2022051705/589aa6fb1a28abfc1a8b6611/html5/thumbnails/22.jpg)
RESOURCES ON THE WEB
• Slides
  o GitHub: h2oai/h2o-tutorials/tree/master/tutorials/aws-lambda-app
• Source code
  o GitHub: h2oai/app-malicious-domains
• Latest stable H2O for Python release
  o http://h2o.ai/download/h2o/python
• Generated POJO model Javadoc
  o http://h2o-release.s3.amazonaws.com/h2o/rel-turan/3/docs-website/h2o-genmodel/javadoc/index.html
• AWS Lambda
  o http://docs.aws.amazon.com/lambda/latest/dg/welcome.html