White Paper
Best Practices for Evaluating Storage Using Workload Testing & Validation
Version 1.0 | November 2019
When evaluating new or changed storage platforms to determine the optimal storage technology, product, or configuration for your organization, three factors are critical:
• The workload must accurately reflect the environment; it must be realistic
• The environment must be properly configured for each platform tested
• The tests must follow a consistent methodology
This paper offers a proven methodology, focused primarily on block-based systems, for framing these tests to provide a fair and accurate representation of the new or changing technologies, and how these technologies should be tested and validated using workload profiles representative of the environment into which they will be installed.
TABLE OF CONTENTS
Introduction
Objectives
The Need to Test and Validate
Characteristics of a Production Workload
Defining the Test Project and Building Your Own Benchmark
Document Subjective Factors
Define Testing Objectives
Define Evaluation Criteria for Proof-of-Concept Testing, aka Bake-offs
Identify Test Environment / Protocol
Define Growth Assumptions
Develop Testing Schedule
Candidate Selection
Notify Suppliers
Testing
Optimal Test Environment
Storage Testing Workflow
Importing and Exporting Workloads
Prepare and Configure Storage Target – Precondition
Execute Tests – Current Production Workload
Execute Tests with TDE for models that exceed WorkloadWisdom-E functionality
Review Results
Common Testing Pitfalls
#1 Skipping Conditioning
#2 Using Bad Workload Models
#3 Forcing Unnatural Array Configurations
#4 Not Agreeing on Terminology
Conclusions
INTRODUCTION
Virtana is the leading provider of storage performance testing and validation solutions, serving IT organizations as well as leading storage technology vendors. Storage performance testing and validation matters to IT storage teams because it's critical for improving their job performance – enabling them to more intelligently purchase, configure, and deploy networked storage. For many years, other IT disciplines like networking and application management have enjoyed relatively modern pre-production testing and performance validation tools. The storage team just had freeware tools with no professional support. So, they were mostly condemned to explaining why I/O is slow… on the production floor.
Estimating capacity needs is pretty easy; rarely have we heard of IT organizations unexpectedly running out of storage. But it's a good bet that everyone reading this document has personally experienced the effects of a "performance outage" or "brownout". It's partially because it's really difficult to project performance needs unless you have the right performance tools and the right methodology. So, overprovisioning happens not just for capacity, but just as often, to mitigate performance risks.
Doing storage performance testing well matters because it helps to answer these common questions:
• Can I improve application performance by implementing new storage technologies or products, or changing configurations? If so, by how much?
• Can I afford the performance improvement? Will new techniques like compression/deduplication/snapshots reduce the effective $/GB without substantially impacting performance?
• How do I select the best technology, product, or configuration to match my application workloads?
• Which of my workloads will gain the most by moving to new architectures or products?
• Where are the performance limits of potential new configurations?
• How will storage behave when it reaches its performance limits?
• What are the performance limits?
The primary economic benefits of storage performance testing come from three areas: (1) improved application performance, (2) storage cost optimization, and (3) risk mitigation, which allows IT architects to innovate faster.
So why aren't more people doing this today? Simple: it has been hard and too time-consuming using DIY or shareware tools, and frankly, the results simply do not reflect reality. These tools cannot accurately simulate real-world production workloads. That's why Virtana sells LoadDynamiX Enterprise and, specifically, why we wrote this document: to expose storage engineers and architects to a Best Practices methodology to accompany the use of LoadDynamiX Enterprise.
OBJECTIVES
What are the specific objectives of this document? To go beyond the technology and to:
• Provide guidance for customers to ensure a fair evaluation of new technologies and vendor products
• Ensure architectural differences in new technologies, vendor platforms, and vendor-specific configuration best practices are accurately represented in customer testing
• Ensure customers' needs are clearly communicated to vendors, and vendors provide a configuration to meet those specific needs for the testing process
• Allow technologies to be accurately evaluated based on testing results and evaluation criteria
• Provide customers the opportunity to include the initial array setup and configuration process in their evaluation criteria
• Discuss common pitfalls associated with testing and why these should be avoided
THE NEED TO TEST AND VALIDATE
There are several benefits and use-cases for test and validation:
• Moving to new platforms, like from Fibre Channel to iSCSI, or to Software Defined Storage, can involve leaps of faith, and risk. Test and validation is by far the best way to mitigate this risk.
• If storage vendors could custom-build products and configurations based on individual customer application profiles, there might not be such a need to test. But they do not, and so storage engineers and architects must go beyond the simple performance profiles that appear in spec sheets, and the guidelines provided by "industry standard" benchmarks. Customers must have data that's relevant to their environment.
• Storage workloads are different across different applications and platforms. No one would expect a streaming video application to stress a storage array in the same way as an email server.
• Applications have wildly different requirements for storage performance. Latency for some applications must be measured in microseconds; for others, multiple seconds are acceptable.
• No one storage array, technology, or configuration can universally provide the best price/performance for all application types. Different storage systems perform differently for different workloads: there is no one storage that fits all. In fact, we know that both storage vendors and component manufacturers optimize their arrays for particular workload types. But their sales teams generally are not aware of this level of detail, and often sell a "one size fits all" solution.
• It is important to know how a storage system performs in your environment, including expected growth.
• It can literally save millions of dollars in new storage procurements to understand the configuration best suited for your application. We have examples where customers halved the number of scale-out nodes that were originally proposed by their vendors.
• Industry standard benchmarks are known to be "gamed" by vendors who rightly want to present their products in the best possible light. Often, configurations are tested and reported on which bear no resemblance to real customer configurations.
• Vendors are under pressure from their marketing departments, so they routinely publish specs that are not real-world in nature. A huge IOPS number may be offered, but it may have been achieved by using block sizes that are not representative of most applications. Or they may claim "random access" numbers based on a small dataset or data read from cache. And there never is one block size for the composite workloads running on modern arrays. It's a distribution of sizes for different LUNs, reads and writes, and it changes over time. Running a test showing only an average block size isn't helpful to the customer.
CHARACTERISTICS OF A PRODUCTION WORKLOAD
Production workloads have variations of I/O sizes, rates, sequentiality, and working sets over time, on both small (milliseconds) and large (hours) scales, across hundreds of LUNs, and affect tens of terabytes of data during a 24-hour work cycle. In the following graphs, you can see the workload I/O rates and size variations over time. You can see variations in throughput from 200 to over 1400 MB/s, and in IOPS from 15K to 50K. A more typical graph you might see from simple performance measurement tools would average these numbers, producing a flatter, and much less accurate, portrayal of performance.
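The gap between averaged and instantaneous behavior is easy to quantify. Below is a minimal Python sketch (the sample values are hypothetical, chosen to echo the 15K–50K IOPS range described above) showing how much an average-only view understates the load the array must actually absorb:

```python
# Hypothetical IOPS samples over a work cycle; a real series would come
# from your monitoring tool at per-second or finer granularity.
samples = [15_000, 22_000, 48_000, 50_000, 31_000, 18_000, 44_000, 25_000]

mean_iops = sum(samples) / len(samples)
peak_iops = max(samples)

print(f"average: {mean_iops:,.0f} IOPS")  # what a simple averaging tool reports
print(f"peak:    {peak_iops:,.0f} IOPS")  # what the array must actually absorb
print(f"sizing to the average under-drives the peak by "
      f"{(peak_iops - mean_iops) / peak_iops:.0%}")
```

A test driven at the mean would never exercise the array at its true peaks, which is exactly where latency problems surface.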
Past simple system-level metrics, it's extremely useful to measure SAN components, such as front-end ports and LUNs. In the graph below, you can see a real-world example of how production workloads can use LUNs unevenly. In this example, a very small number of LUNs are handling nearly all the real-world I/O. It is important that these differences are maintained in the test. In one example, 8 workloads were created for different levels of LUN activity and types of access, versus 1 workload that treated all LUNs equally. With the workload where the ratios of LUN activity were maintained, the array could run the workload with reasonable latency all day long. The single workload that treated all LUNs the same resulted in 130 ms latencies.
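One way to preserve this skew when building workload models is to bucket LUNs by measured activity and drive each tier as its own workload. The sketch below uses hypothetical per-LUN IOPS figures and illustrative tier thresholds; real figures would come from your monitoring data:

```python
# Hypothetical per-LUN IOPS measurements (names and values are illustrative).
lun_iops = {"lun_a": 18_000, "lun_b": 9_500, "lun_c": 400, "lun_d": 250,
            "lun_e": 90, "lun_f": 60, "lun_g": 30, "lun_h": 20}
total = sum(lun_iops.values())

# Bucket LUNs into activity tiers; each tier becomes its own workload,
# rather than spreading I/O evenly across all LUNs.
tiers = {"hot": [], "warm": [], "cold": []}
for name, iops in lun_iops.items():
    share = iops / total
    if share > 0.20:
        tiers["hot"].append(name)   # a handful of LUNs carry most of the I/O
    elif share > 0.01:
        tiers["warm"].append(name)
    else:
        tiers["cold"].append(name)

hot_share = sum(lun_iops[n] for n in tiers["hot"]) / total
print(f"hot LUNs {tiers['hot']} carry {hot_share:.0%} of all I/O")
```

In this illustrative data, two LUNs carry roughly 97% of the I/O; a test that spreads load evenly across all eight would bear little resemblance to production.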
DEFINING THE TEST PROJECT AND BUILDING YOUR OWN BENCHMARK
Document Subjective Factors
To fairly and accurately perform the evaluation, you should document any potential factors non-essential to the evaluation. Consider the influence (if any) these factors may have on your results. They are valid considerations, but should be factored out when doing a purely performance test. Examples of subjective factors include:
• Existing professional relationships
• Length of relationship with incumbent vendors
• Perceived risks and complexities of changing technologies or migration
• Unsubstantiated third-party reviews
• Unsubstantiated "rules of thumb"
• Internal concerns over change and what this might mean to existing processes
• Benchmarks that profile other applications
• Current knowledge about managing and maintaining particular vendors or products
• Maturity and feature sets of the array and vendor management and monitoring tools
Define Testing Objectives
Common examples:
• Determine the best performing platform for your existing workload at 1x
• Determine the best performing platform for the same workload when scaled (1.25x, 1.5x, 1.75x, 2x, etc.). It's a good idea to scale in small, regular increments. For instance, if you think your workload might grow 4x over the life of the array, consider testing at 2x and 3x as well.
• Determine platforms and configurations that satisfy application performance requirements
• Identify the point at which the workload exceeds the performance requirements of a platform
At this point, it's good to ask yourself if your testing objectives are influenced by subjective factors, and factor them out.
Define Evaluation Criteria for Proof-of-Concept Testing, aka Bake-offs
As with objectives, ask yourself if your evaluation criteria are influenced by subjective factors.
Business
• Long-term viability of the vendor
• Existing relationships (quid pro quo)
• Industry reputation
Financial
• Costs (acquisition, maintenance, TCO)
• Price / Performance
Operational
• Reliability
• Support and supportability (consider both the local service and remote support teams, and their ability to coordinate amongst your local and world-wide teams)
Performance
• Identify the highest acceptable average latency
• KPIs: latency, throughput, IOPS
Scalability
• Define the highest IOPS and throughput at acceptable latency for each platform
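As a concrete way to express that scalability criterion, the sketch below (with hypothetical ramp-test numbers and an assumed 3 ms latency ceiling) finds the highest load level a platform sustains while staying within the acceptable average latency:

```python
# Hypothetical (IOPS, average latency in ms) points from a load ramp test.
ramp = [(10_000, 0.8), (20_000, 1.1), (30_000, 1.6),
        (40_000, 2.4), (50_000, 4.9), (60_000, 12.0)]

LATENCY_CEILING_MS = 3.0  # the "highest acceptable average latency" chosen above

# Highest IOPS level that still meets the latency criterion.
sustainable_iops = max(iops for iops, lat in ramp if lat <= LATENCY_CEILING_MS)
print(f"max sustainable load: {sustainable_iops:,} IOPS at <= {LATENCY_CEILING_MS} ms")
```

Recording this figure per platform gives you a single, comparable scalability number for the bake-off.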
Identify Test Environment / Protocol
Be specific, and commit these details in writing. Get specific recommendations from all interested parties, including your DB and apps teams, server and networking teams, your vendors, and of course, your testing vendor. Ask for examples of previous test environments similar to yours. Then specify:
• What tests will be run?
• How will test results be collected? Here's where the LoadDynamiX Enterprise repository will be invaluable.
• How will results be evaluated, and by whom?
• What are the success criteria for each test? Again, get input from other teams.
• What results are you expecting, based on what the vendor has stated/published?
Define Growth Assumptions
What is the anticipated growth of the workload over the expected life of the array?
• Can you correlate your growth in workload to business growth? Barring other input, it's a good place to start. Ask for revenue or number-of-client-user projections.
• Are your workload growth estimates reasonable, given past growth patterns and business growth?
• Are there any ongoing initiatives that will impact these forecasts (i.e. cloud migrations or M&A activity)?
Consider the possibility the array remains in service longer than expected.
• If the array remains in service longer than expected, what is a reasonable timeline?
• What (if any) growth would be expected during this extended time period?
• What are your expectations for performance and stability?
• How does this compare to the vendor's expected EOL date?
• At what point will you stop adding to the array? (servers or VMs, storage capacity, workload, etc.)
Is consolidation required at a future date?
• Are the storage array ports sized correctly to accommodate consolidation?
• How will LUN consolidations change the characteristics of the workloads?
Develop Testing Schedule
As careful as you are, the schedule can change for a number of reasons. Try to build in flexibility; there's nothing more frustrating than finding some question you did not anticipate and not having the time to fully explore your options.
• What is the deadline for testing completion? Is it a soft deadline, or will you lose access to the test target?
• What is the sequence of testing that will occur in the timeline? Are all the resources lined up?
• Will the vendor or component supplier be required to perform the setup activities? Reconfigurations?
• How long is each test? Include set-up time, time to build the models, time to execute, and time to create custom reports, if needed. Ask your testing vendor for estimates if you are new at this. Virtana has years of relevant experience to draw from.
• What time buffer is available if tests run longer than expected?
Candidate Selection
• What were the selection criteria for defining the candidates to include in the test?
• Why were these criteria important?
• What products will be included in the test?
Notify Suppliers
Although storage vendors or component suppliers will likely not be involved in the actual testing, vendors will size and configure their recommended optimal configuration, and will be responsible for the setup and configuration. And you WANT them to do the setup and configuration.
Also, consider that the selected platform may remain after testing and become the production system, so implementation of the array should follow all best practices.
When engaging your supplier, be sure to clearly review these points. It will enable them to do a better job of proposing the right array and configuring it properly. So, review:
• Testing objectives
• Duration of the entire testing process
• Evaluation criteria
• Description of the testing environment
• Equipment they will be expected to provide
• Services they are expected to provide
• Expected schedule of testing for their components
TESTING
Optimal Test Environment
To some, when a new box arrives on the loading dock, it's an event not unlike a birthday or Christmas morning. Without delay, and frequently without a proper inventory control process, cartons full of shiny new technology are unboxed, racked or not, and connected to whatever server happens to be available. Storage devices are configured and manuals consulted only in the unfortunate event that the user interfaces aren't properly intuitive. What happens next may be performance testing, but it may also be an exercise in building storage performance anecdotes.
Why do we say this? Absent a defined test process and controlled testing environment, you will be able to declare only that "I did X and Y was the result". This is a subjective statement, and it may contain valuable, experiential data. It is not, however, an objective analysis. The accuracy and interpretation of such results are not definitive of the subject under test. That is not to say that this sort of experimentation has no value; it should demonstrate out-of-the-box functionality (or lack thereof) and may provide valuable insight into ease-of-use. What it does not do is provide evidence of a rigorous testing methodology, including controlled inputs and outputs, repeatability, and error analysis.
Testing can, and does, imply a means by which the quality and function of a subject is evaluated. In the strictest sense, the "kick the tires" test process described above fits that definition. However, truly objective testing requires a controlled test process and a controlled test environment, or testbed, if you prefer.
A testbed provides a means of isolating the system under test from external influences and protecting non-test resources from any fallout from the testing process. These are important factors for the generation of reliable test data and for maintaining business continuity. A testbed must be defined and documented for repeatability and to simplify the analysis of results.
Storage performance testing is disruptive and should never be performed in compute environments that support any kind of production or revenue-generating computing. Nor should performance testing coincide with other testing, such as user validation or ease-of-use testing. Meaningful storage performance data can only be generated in a controlled testbed, and the most rigorous testing will adhere to well-recognized standards and practices. These requirements typically call for a dedicated test environment to be set aside for the purpose of validating storage performance.
A well-designed testbed will incorporate conditions similar to those expected in the environments considered for eventual deployment of the devices under test. Conditions designed to maximize some metric in a way that is not reproducible outside the test lab create results that may look good on a marketing brochure, but which hold little value to you, the customer.
A comprehensive test environment will include the following elements and attributes:
• A space for the testing to occur, separate from conflicting activities
• A mechanism or engine to drive workloads (LoadDynamiX appliance)
• Infrastructure to support applicable connection protocols
• A mechanism to record and analyze test data (LoadDynamiX Enterprise)
• A library or toolbox of workloads to generate desired I/O patterns (LoadDynamiX Enterprise)
• The number of storage ports involved in testing should be comparable to the target production environment
• Switches and network gear to emulate production
• Multipath access to the LUNs (Active/Active)
• The number of LUNs should be of the same order as in production. For instance, if the production environment is expected to have 2,000 LUNs, don't test with 2 LUNs; shoot for ~200.
• Replication or other internal processes (deduplication, compression, garbage collection, etc.) enabled in production should be enabled during the test runs
And, optionally:
• A source for the modeling data (VirtualWisdom or existing production storage array)
• A controlled power source
• Power measuring devices
• A climate-controlled environment
The test environment should be representative of the way the proposed storage will be used in production, and should be centralized and easy to access and use. Here's a simple example of a test topology.
Importing and Exporting Workloads
One of the first issues to understand is the workload sample you're starting with. Often, testers start out thinking they should grab a week's worth of data and run that. For most use cases, that is a waste of time. It is best to pick a few key time periods (maximum IOPS, maximum throughput, and maximum latency) and grab 1 to 4 hours of this data to run against the array. This will provide the most important results in the least amount of time.
The days of spending dozens of hours parsing log files, in the often-futile attempt to create realistic workload models, are over. The Workload Data Importer module of Enterprise is designed to import and analyze workload data from storage infrastructures, automatically, to create extremely realistic workload models. The workload data can be exported from VirtualWisdom, storage vendor monitoring tools, or server utilities such as iostat, as Comma Separated Values (CSV) text files. The data is analyzed to detect workload I/O profiles to understand the workload characteristics in terms of IOPS, throughput, read/write ratio, etc. The WDI GUI is here (right):
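If you want to sanity-check an export before importing it, the headline characteristics WDI derives are straightforward to compute yourself. Here is a minimal sketch over a hypothetical iostat-style CSV layout; the column names and the 10-second sample interval are illustrative assumptions, not the documented WDI input format:

```python
import csv
import io

# A tiny stand-in for an exported CSV; real files come from VirtualWisdom,
# vendor monitoring tools, or iostat, and their columns may differ.
raw = """timestamp,read_ops,write_ops,read_kb,write_kb
09:00:00,12000,4000,96000,64000
09:00:10,15000,5000,120000,80000
09:00:20,9000,3000,72000,48000
"""

INTERVAL_S = 10  # assumed sample interval of this hypothetical export
reads = writes = kb = rows = 0
for row in csv.DictReader(io.StringIO(raw)):
    reads += int(row["read_ops"])
    writes += int(row["write_ops"])
    kb += int(row["read_kb"]) + int(row["write_kb"])
    rows += 1

avg_iops = (reads + writes) / (rows * INTERVAL_S)
avg_mb_s = kb / 1024 / (rows * INTERVAL_S)
rw_ratio = reads / writes
print(f"avg IOPS: {avg_iops:.0f}, throughput: {avg_mb_s:.1f} MB/s, "
      f"R/W ratio: {rw_ratio:.1f}:1")
```

A quick pass like this makes it easy to spot a truncated export or a mislabeled column before you build a workload model from it.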
Storage Testing Workflow
Here’saworkflowillustration.We’lldiscusseachstepinorder,below.
The ideal data source, to create the most hyper-realistic model, is VirtualWisdom. It can obtain up to 30 different metrics.
Prepare and Configure Storage Target – Precondition
Storage platforms under test should be configured according to the vendor's best practices.
SAN storage arrays should be "preconditioned" prior to the testing:
• Data must be written to an array prior to reading it
• AFA solid-state storage should be broken in to bring it to the state it is in during production
• Utilize the purpose-built pre-conditioning workload in WorkloadWisdom-E 5.3 and later.
• Depending on the configuration and the vendor, the storage array might be left for a period of time to process the data offline. The vendor will offer advice.
Execute Tests – Current Production Workload
Perform a baseline test of workload across all systems using the current production workload at 100%. Precision should be to the highest possible extent to produce useful results. This includes:
• Read and write rates and their variations in time and across the LUNs
• Request sizes and their variations in time and across the LUNs
• The amount of capacity (logical address space) read from and written to during the test
Workload scaling tests the storage performance at higher loads using the characteristics of the current production workload.
• Consider iteratively performing the workload tests, gradually increasing scale to reflect typical growth.
• Example: If planning to test up to 3x growth in workload, perform tests using increments of load scale of 0.5x
• The "sweet spot" of a storage array varies, and you may find the array that performs well at 1x or 1.5x may degrade at 2x, while other arrays are just warming up. This granularity will help identify this, and define the optimal workload ranges for a product.
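The sweet-spot effect is easy to see once per-scale latency is tabulated. A sketch with hypothetical measurements for two arrays and an assumed 5 ms acceptability ceiling:

```python
# Hypothetical average latency (ms) measured at each workload scale factor.
latency_ms = {
    "array_1": {1.0: 1.2, 1.5: 1.6, 2.0: 5.8, 2.5: 14.0, 3.0: 40.0},
    "array_2": {1.0: 1.8, 1.5: 2.0, 2.0: 2.3, 2.5: 3.1, 3.0: 4.6},
}
CEILING_MS = 5.0  # assumed acceptable-latency ceiling

# For each array, find the highest scale factor that stays under the ceiling.
sweet_spot = {}
for name, curve in latency_ms.items():
    acceptable = [scale for scale, lat in sorted(curve.items()) if lat <= CEILING_MS]
    sweet_spot[name] = max(acceptable)
    print(f"{name}: acceptable up to {sweet_spot[name]}x scale")
```

With these illustrative numbers, array_1 falls over past 1.5x while array_2 is still comfortable at 3x: exactly the kind of difference coarse-grained scaling would miss.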
The test management environment should be centralized and easy to use. The screenshots below demonstrate how simply a complex, composite workload may be illustrated. The graph on the left lists multiple workloads, and the graph on the right shows the relationships between clients and targets, in a single test environment.
Execute Tests with TDE for models that exceed WorkloadWisdom-E functionality
Here’saTDEProjectcorrespondingtotheWorkloadWisdom-Ecasepresentedabove,justasanexample.
Review Results
Generate easily readable reports, typically involving latency, throughput, and IOPS. Here, in a multi-step test, you may determine if some products simply shouldn't move on to the next round of testing because of their shortcomings.
Below, average response time and IOPS for a baseline and scaled-up test in a 2-vendor bake-off. In the first graph below, it's easy for the reader to see the effect on latency of increasing the workload (by increasing IOPS). In this case, vendor 2 scales more effectively. At a 3x workload, vendor 1's combined response time is 4 ms, while vendor 2's is approximately 2.3 ms (exact figures are available).
COMMON TESTING PITFALLS
#1 Skipping Conditioning
The preconditioning of the storage is a mandatory prerequisite of the testing, for the following reasons:
• Storage arrays typically "know" whether a certain address has been previously written to or not. In case the test workload reads from an address that has no data, the array returns zeros without consuming internal resources. The response time is unrealistically low, and the test results are skewed towards low response times.
• Fresh arrays have empty metadata tables. Access to them consumes resources and might affect response times. Having them empty or containing few entries (by writing zeros, for example) will lead to unrealistically low response times in the test.
• AFAs, to be tested as in working conditions, need to have their solid-state storage initialized and written to so it performs as it does in your real-life production environment.
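The core of the first bullet is simply that every address a read test may touch must first be written. A toy Python sketch of that idea follows; the file name and sizes are stand-ins, since real preconditioning targets the array's LUNs at full capacity, ideally with the purpose-built workload mentioned above:

```python
import os

PATH = "precondition.dat"       # stand-in for a LUN / device under test
CAPACITY = 4 * 1024 * 1024      # 4 MiB stand-in for the device capacity
BLOCK = 128 * 1024              # 128 KiB write size

# Write random (incompressible, dedupe-resistant) data across the ENTIRE
# logical address space, so later reads never hit an unwritten address
# and receive an unrealistically fast all-zero response.
with open(PATH, "wb") as f:
    for _ in range(CAPACITY // BLOCK):
        f.write(os.urandom(BLOCK))

assert os.path.getsize(PATH) == CAPACITY  # full address space covered
os.remove(PATH)  # clean up the demo file
```

Random data is used deliberately: zero-fill would let dedupe- or compression-capable arrays short-circuit the work, reproducing the very skew preconditioning is meant to remove.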
#2 Using Bad Workload Models
Misconception
• Production workload can be simulated using scripts and/or tools like Iometer or Vdbench in a test lab. We often talk about an early user of LoadDynamiX, GoDaddy. GoDaddy tried to use Iometer and got artificially high performance results on the target array. Sadly, when deployed into production, actual user experience was poor. This was mostly due to Iometer's inability to recognize the huge impact metadata I/O had on the workload. The same tests run using LoadDynamiX turned out to be extremely realistic and gave GoDaddy results that led to good deployment decisions, and happy GoDaddy customers.
Reality
• There is no substitute for actual production workload
• Building a test environment is resource intensive
• True production conditions may not be accurately captured in the test environment
• Incorrect assumptions and deviations in testing steps in the simulation can lead to the wrong decision.
High-level steps of building the test environment (the hard way)
• Identify physical infrastructure to be used (server, SAN, storage)
• Consider make, model, age, OS version/patch levels, firmware levels, driver levels, etc.
• Build/update the test environment to reflect production
• Ensure components are at supported levels for the testing platform
• Configure SAN connectivity
• Configure matching number of ports, port speeds, cable types, patch panels, etc.
• Create scripts to simulate workload
• Distribute scripts to systems that will run the scripts and drive the workload
Setup (the hard way)
• Operating System Updates
• HBA Drivers & Firmware
• Load Balancing Configuration
• IP Addresses, DNS Records
• Testing Scripts
• Zoning & Masking
Execute (the hard way)
• Execute scripts on servers/VMs
• Monitor performance
• Collect & carefully aggregate test results from the various servers
Why this approach fails:
• Workload testing is partially an art and partially a science. With the DIY approach, you have only your own experience to depend on, and perhaps the kindness of forum participants. When, not if, questions come up, it's Best Practice to have a team with decades of experience to rely on.
• Conventional tools and scripts attempt to approximate and simulate workloads; and for all but the smallest, simplest workloads, they fail.
• Setup effort increases exponentially the more accurately you try to simulate workloads.
• Testing is not centralized; results need to be collected and correlated to minimize mistakes and make your efforts repeatable. A test that's not repeatable is, by definition, not a good test.
• Due to the complexity of setup, execution, collection, and analysis, customers generally perform fewer testing iterations, resulting in fewer useful data points.
• Tests are based on assumed characteristics (aka guesses), which are then scaled; the effect of multiple inaccurate assumptions is compounded and multiplied.
• You won't know if your assumptions are truly correct until you are in production and driving workload.
Why spend significantly more effort, time, and money to create an approximation, when you can use your production workload?
#3 Forcing Unnatural Array Configurations
"We are going to configure all arrays exactly the same, test the same workload, and compare the results" is not a Best Practice, though it seems fair. It's not. Imposing a like-for-like configuration across different products:
• Often deviates from the best practices for one or more products in the test group
• Would not match the solution/configuration you purchase or put into production
• May reflect a technically UNSUPPORTED configuration
• Produces sub-optimal performance metrics
The target testing environment should reflect the configuration you would put in production, even if the configuration is not a direct comparison to the other configurations in the test. By sharing your objectives with your vendor, you will get their years of experience in system configuration.
#4 Not Agreeing on Terminology
There are some common and not-so-common terms that get thrown around; without a common definition, they can cause barriers between vendors, managers, engineers, and other involved parties. Here are some terms we should all be familiar with:
AllFlashArray(AFA) An all-flash array is a solid-state storage disk system that contains multiple flash memory drives instead of spinning hard disk drives.
Array metadata table A table with properties that store metadata such as variable names, row names, descriptions, and variable units.
Data compression Encoding information using fewer bits than the original representation.
Deduplication A specialized data compression technique for eliminating duplicate copies of repeating data
Garbage Collection Solid state storage garbage collection (GC) is the process by which a solid-state drive (SSD) improves write performance. Garbage collection, like TRIM, pro-actively eliminates the need for whole block erasures prior to every write operation. When a file is deleted from a computer, most operating systems (OSs) delete the table of contents entry, but do not delete the actual data blocks from the storage media. Hard disk drives will simply overwrite the unneeded data blocks. Flash SSDs, however must erase the unneeded data blocks before new data can be written. Working in the background, garbage collection systematically identifies which memory cells contain unneeded data and clears the blocks of unneeded data during off-peak times to maintain optimal write speeds during normal operations.
Multi-pathing Multipath I/O is a fault-tolerance and performance-enhancement technique that defines more than one physical path between the CPU in a computer system and its mass-storage devices through the buses, controllers, switches, and bridge devices connecting them.
Pre-conditioning Preconditioning is the writing of data (random pattern) to the entire flash device to “break it in”. Data is written to the flash memory in units called pages (made up of multiple cells). However, the memory can only be erased in larger units called blocks (made up of multiple pages). If the data in some of the pages of the block are no longer needed (also called stale pages), only the pages with good data in that block are read and re-written into another previously erased empty block. Then the free pages left by not moving the stale data are available for new data.
Sequentiality Degree to which data I/O stays in sequence
Working Sets Defines the amount of memory that a process requires in a given time interval
©10/2019 Virtana. All rights reserved. WorkloadWisdom and VirtualWisdom are trademarks or registered trademarks in the United States and/or in other countries. All other trademarks and trade names are the property of their respective holders.
Virtana, 2331 Zanker Road, San Jose, CA 95131. Phone: +1.408.579.4000
CONCLUSIONS
By following the outlined Best Practices, you can realize the following benefits:
• Performance assurance: Ensure storage solutions will meet performance SLAs under specific workloads, and confidently choose the optimal solution for those workloads.
• Reduced storage costs: Reduce over-provisioning by choosing the lowest cost systems for specific workloads; quantify the benefit and effects of every system.
• Increased uptime: Identify problems in the development lab prior to production deployment; validate all infrastructure changes against workload requirements and troubleshoot more effectively by re-creating failure-inducing workload conditions in the test and development lab.
• Accelerated new application deployments: Accelerate time to market by validating new applications on new systems, making deployment decisions faster and more confidently.
CONTRIBUTORS:
Michael Bello
John Cryan
Carmine DellAquila
Craig Foster
Yang Gao
Peter Murray
Scott Newkirk
Elizaveta Tavastcherna
Jim Bahn