Stream Processing Engine (王王, 2011-12-8)

1. Stream Processing Engine (2011-12-8)

2. Agenda
- Architecture
- Multi-threading
- I/O
- Further work

3. Purpose
- Users can query provenance.
- Why query provenance? When a user gets a strange result from the engine, he may be interested in where it came from.
- Provenance: the data in the original incoming streams (stream1, stream2, ...) from which a result is generated.
- Requirement: if the provenance of a tuple has not yet been saved to disk, the tuple must not be sent to the user.

4. Architecture
- Layered architecture:
- SPE layer: the stream processing engine.
- Buffer layer: the in-memory buffer for provenance.
- File layer: disk I/O.
- Each layer provides services to the layer above it, and each layer invokes only the interface provided by the layer below it.

5. SPE layer (component view)
- Metadata: stores the metadata of streams, including CQL statements and data types.
- CQL parser: parses CQL statements and generates the query plan tree.
- Query plan processor: processes tuples along the query plan tree.
- Utility: provides common services.

6. Query plan tree (entity)
- A tree of operators: a root operator at the top, join and select operators in the middle, and leaf operators at the bottom.

7. Operator class diagram
- An operator base class (with an id) is specialized into leaf, select, join, and root operators.

8. Query plan tree with queues
- Queues connect the operators: common queues between operators, storage queues for provenance, and a transportation queue at the root.

9. Queue class diagram
- Data flows through the queue; an example tuple schema: attribute1: integer, attribute2: integer, attribute3: string.

10. Queue (entity): memory management
- A contiguous block of memory is used as the queue's buffer. We do not allocate memory for each tuple; we allocate it once for the queue, and tuples are saved in the queue's buffer.
- Head: the first of the tuples in the buffer. Tail: the end of the tuples in the buffer.
- When the queue is initialized, head and tail both point at the beginning address of the buffer.
- When a tuple arrives, the tail moves forward by the length of the tuple; when a tuple leaves, the head moves forward by the length of the tuple.
- When there is no space for a new tuple, an exception is thrown: a load-shedding algorithm is needed.
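The head/tail bookkeeping of slide 10 can be sketched as a minimal C++ class. This is an illustrative sketch, not the original implementation: the class name `TupleQueue`, the member names, and the use of std::vector as the backing buffer are all assumptions.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Sketch of the slide-10 queue: one contiguous buffer per queue; tuples
// are packed between `head` and `tail` instead of being allocated one
// by one.
class TupleQueue {
public:
    explicit TupleQueue(std::size_t capacity) : buf_(capacity) {}

    // A tuple arrives: copy it in at `tail`, then move `tail` forward
    // by the tuple length.
    void push(const void* tuple, std::size_t len) {
        if (tail_ + len > buf_.size())               // no space left:
            throw std::runtime_error("queue full");  // load shedding needed
        std::memcpy(buf_.data() + tail_, tuple, len);
        tail_ += len;
    }

    // A tuple leaves: hand back its address and move `head` forward
    // by the tuple length.
    const void* pop(std::size_t len) {
        const void* t = buf_.data() + head_;
        head_ += len;
        return t;
    }

private:
    std::vector<unsigned char> buf_;
    std::size_t head_ = 0;  // both start at the beginning of the buffer
    std::size_t tail_ = 0;
};
```

As on the slide, head and tail only move forward and a full buffer raises an exception rather than reallocating.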
11. Tuple (entity)
- If the tuple is in a queue, it uses the buffer of the queue; if not, it creates its own buffer.
- Fields: the beginning address of its buffer, its offset in the buffer, the tuple length, the timestamp of the tuple, and the relation schema associated with the tuple.
- A map saves the provenance of the tuple, e.g. Map[s1] = list{id1, id2}; Map[s2] = list{id4}.

12. Buffer layer
- Facade and Singleton design patterns: the BufferControl class provides the single interface of the layer.
- The upper layer need not interact with any other class in the buffer layer, so if we change the implementation of the buffer layer, the code of the layer above need not change as long as we maintain the same interface.

13. Provenance life cycle (reconstructed from the slide's callouts)
- When a tuple arrives from a stream, the insert function is called to store its provenance into buffer memory.
- The tuple is processed along the query plan tree, carrying its provenance identifiers in its map; at some point a copy of the tuple may be pushed into a storage queue.
- When an output tuple reaches the root operator and its transportation queue, the toBeStored function is called to mark which provenance should be stored; another thread calls the storing function to save that provenance to the file layer.
- Before the result is sent to the client, the isStored function is called to check that its provenance has been saved; the delete function then removes provenance that is no longer needed from memory.

14. Page
- A page is a contiguous block of memory; it may be 4 KB, 16 KB, ..., 56 KB.
- In this system, pages are used to save two kinds of objects: tuples (pages for tuples) and bitmaps (pages for bitmaps).
- Why use a bitmap?
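The slide-11 tuple entity can be sketched as a plain struct. Field names are illustrative assumptions; only the set of fields and the Map[s1] = list{id1, id2} shape of the provenance map come from the slide.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <map>
#include <string>

// Sketch of the slide-11 tuple entity.
struct Tuple {
    unsigned char* buffer = nullptr;  // the queue's buffer, or its own
    std::size_t offset = 0;           // offset inside that buffer
    std::size_t length = 0;           // tuple length in bytes
    std::int64_t timestamp = 0;       // timestamp of the tuple
    std::string schema;               // relation schema of the tuple
    // stream name -> identifiers of the source tuples it derives from,
    // mirroring the slide's Map[s1] = list{id1, id2}.
    std::map<std::string, std::list<std::uint64_t>> provenance;
};
```

For example, `t.provenance["s1"] = {1, 2};` records that the tuple was derived from tuples 1 and 2 of stream s1.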
- Because we must record a saved/not-saved state for each tuple, and a bitmap lets us mark that state with just 1 bit per tuple (0: not saved, 1: saved).
- Example: for a stream of about 10 KB/s with 8-byte tuples, the bitmap saves 10 * 1024 / 8 * 0.875 = 1120 B/s (about 1.12 KB/s) compared with spending one byte of state per tuple.

15. Architecture of the buffer layer
- A global buffer holds two page lists: an unused-page list and a used-page list. All pages that the streams need are allocated from here.
- A hash table maps each stream name to that stream's buffer; each buffer saves the data of one stream.
- Inside a stream buffer, vectors of page pointers hold the stream's tuple pages and bitmap pages.

16. Inserting a tuple: O(1)
- Example: suppose one page is 100 bytes and a stream1 tuple is 10 bytes, so a page can store 10 tuples. A stream1 tuple with id 21 arrives.
- First, look up the hash table with the stream name to find stream1's buffer and its page vector.
- Then check the last page of the vector: if it still has space, the tuple is inserted there; if it is full (it already holds 10 tuples), a page is moved from the global unused list to the used list, appended to the vector, and the tuple with id 21 is inserted into this new page.
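The 1-bit-per-tuple bitmap of slide 14 can be sketched in a few lines. The class name `BitmapPage` and its methods are illustrative assumptions; the encoding (0: not saved, 1: saved) is the slide's.

```cpp
#include <cstddef>
#include <vector>

// Sketch of a slide-14 bitmap page: one bit of state per tuple,
// 0 = provenance not yet saved, 1 = saved.
class BitmapPage {
public:
    explicit BitmapPage(std::size_t tuples) : bits_((tuples + 7) / 8, 0) {}

    void markSaved(std::size_t i)     { bits_[i / 8] |= (1u << (i % 8)); }
    bool isSaved(std::size_t i) const { return bits_[i / 8] & (1u << (i % 8)); }

private:
    std::vector<unsigned char> bits_;  // 8 tuple states per byte
};
```

The slide's arithmetic follows from this layout: at ~10 KB/s with 8-byte tuples there are 10 * 1024 / 8 = 1280 tuples/s; a byte of state per tuple would cost 1280 B/s, a bit costs 160 B/s, so the bitmap saves 7/8 of it, i.e. 1280 * 0.875 = 1120 B/s.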
17. Buffer layer: sequence diagram for inserting a tuple
- The participants are BufferControl, PageHashTable, PageVector, ProvenanceBuffer, and StreamBuffer.
- The calls shown are ifStreamExist(streamName) (returning true), getPage(int id), getInsertablePage(int id), getMorePage(), and getOnePageToUse(); a page is returned up the chain, and push(data) finally writes the tuple into it.

18. Finding a tuple: O(1)
- Suppose one page is 100 bytes, a stream1 tuple is 10 bytes (10 tuples per page), the first id of the vector is 31, and we want to find the tuple with identifier 45.
- Page index: (45 - 31) / 10 = 1, so the tuple is in the page at vector index 1.
- Offset: 45 - 31 - 10 * 1 = 4, the tuple's offset within that page.
- The bit of the tuple in the bitmap is found the same way.

19. Releasing memory
- If we never released the memory used for saving provenance, the memory would run out quickly.
- We do not release memory one tuple at a time; we release one page at a time.
- We look into every provenance identifier in the query plan tree; those identifiers are considered useful, and all others useless. A page that contains no useful tuples is deleted.

20. Deleting tuples: O(n*m*p)
- To release the memory, first scan along the query plan tree to collect the useful identifiers of each stream; any page whose tuples include none of them can be released.
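The O(1) lookup of slide 18 is pure arithmetic once tuples have a fixed size. A sketch, with the function and struct names as assumptions:

```cpp
#include <cstddef>
#include <cstdint>

// Sketch of the slide-18 O(1) lookup: with fixed-size tuples, a tuple's
// page and its slot inside that page follow from its identifier alone.
struct TupleLocation {
    std::size_t pageIndex;  // index into the stream's page vector
    std::size_t slot;       // position (in tuples) inside that page
};

TupleLocation locate(std::uint64_t id, std::uint64_t firstId,
                     std::size_t tuplesPerPage) {
    std::uint64_t rel = id - firstId;  // distance from the vector's first id
    return { static_cast<std::size_t>(rel / tuplesPerPage),
             static_cast<std::size_t>(rel % tuplesPerPage) };
}
```

With the slide's numbers, `locate(45, 31, 10)` yields page index (45 - 31) / 10 = 1 and slot 45 - 31 - 10 * 1 = 4; the tuple's bitmap bit is located the same way.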
- Example: suppose one page is 100 bytes, a stream1 tuple is 10 bytes, the first id of the vector is 1, and the useful identifiers found for stream1 are 13 and 16. A page whose bitmap shows no useful tuples is released: it is moved from the used page list back to the unused list, deleted from the stream's vector, and the vector is updated.

21. User queries for provenance (data flow diagram)
- Three cases: the user does not query provenance; the user queries provenance and it is in the pages in memory; the user queries provenance and it is not in the pages in memory.

22. Querying provenance
- If the requested provenance is in the buffer, we read it there; if not, we must read the page in from disk. We can read one page at a time.
- If the buffer is full, one page must be flushed out first; the replacement strategy may be LFU: evict the least frequently used page.
- Example: to query the provenance of the stream1 tuple with identifier 5 when the pages in the buffer begin at identifier 31, the data is not in memory, so one page is read from disk into the buffer and the query is answered from it.

23. Buffer layer: abstract factory design pattern
- The client code need not know the implementation details of the tuple, bitmap, and query objects.

24. Multi-threading
- Main thread: does most of the work, including receiving data from the streams.
- Storing thread: saves provenance.
- I/O thread: deals with I/O with clients, including registering streams, registering CQLs, and querying provenance.

25. Read-write locks: insert (thread 1)
- (Lock table, loosely recoverable from the slide: for each data structure in the layer - the provenance map, the hash table, the page vector, the buffer list, the global buffer list, and the raw page (unsigned char[], thread-unsafe by itself) - it shows which read and write locks the insert path acquires and releases (~read, ~write) at each step.)
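The per-page read-write locking of slides 25-32 can be sketched with C++17 std::shared_mutex: several readers (e.g. the storing and query threads on a tuple page) may hold the lock together, while a writer (the main thread) holds it alone. The class name is an assumption; only the lock-per-page granularity comes from the slides.

```cpp
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Sketch of per-page read-write locking: the page is the smallest
// locking granularity (slide 32).
class LockedPage {
public:
    explicit LockedPage(std::size_t size) : data_(size, 0) {}

    void write(std::size_t i, unsigned char v) {
        std::unique_lock<std::shared_mutex> w(lock_);  // exclusive lock
        data_[i] = v;
    }

    unsigned char read(std::size_t i) const {
        std::shared_lock<std::shared_mutex> r(lock_);  // shared lock
        return data_[i];
    }

private:
    mutable std::shared_mutex lock_;   // one lock per page
    std::vector<unsigned char> data_;
};
```

One caveat: the C++ standard leaves reader/writer preference of std::shared_mutex unspecified, whereas slide 32 calls for a write-preferring lock, so a faithful implementation might instead use a POSIX pthread_rwlock configured for writer preference or a hand-rolled lock.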
26. Read-write locks: to-be-stored (thread 1)
- (Lock table: marking a tuple's provenance as to-be-stored takes a write lock on the structure it updates and releases it afterwards.)

27. Read-write locks: is-tuple-stored (thread 1)
- (Lock table: the check takes only read locks, including a read lock on the bitmap page, each released after the check.)

28. Read-write locks: delete (thread 1)
- (Lock table: deletion reads the tuple pages and takes write locks on the structures from which the released page is removed.)

29. Read-write locks: storing (thread 2)
- (Lock table: the storing thread mostly reads the tuple pages and uses a try-write ("trywrite") when updating the bitmap page.)

30. Read-write locks: query (thread 3)
- (Lock table: the query thread reads the tuple pages, uses a try-read ("tryread") on some structures, and takes a write lock on the query structures it fills in.)

31. Lock optimization
- We should reduce the cost of lock management while increasing concurrency.
- The lock on the per-stream buffer is useless because threads never conflict on it; we can get rid of it.
- The lock on the global buffer can be changed to a mutex.
- Some unimportant operations can just do trylock and trywrite instead of blocking.

32. Lock performance analysis
- The read-write lock we use allows concurrent access by multiple threads for reading, restricts access to a single thread for writes, and is write-preferring.
- The smallest granularity is the page, so performance is lost only when threads operate on the same page:
- Page for tuples: readers are the storing thread and the query thread; the writer is the main thread.
- Page for bitmaps: the reader is the main thread; the writer is the storing thread.
- Page for queries: everything is done in the I/O thread.
- Conclusion: likely to improve performance, but this needs experiments.
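Slide 31's "trywrite" idea can be sketched as follows: a low-priority operation (here, hypothetically, the storing thread flipping a bitmap bit) attempts the write lock without blocking and simply retries later if the page is busy. The function name and the bool-flag stand-in for a bitmap bit are assumptions for the demo.

```cpp
#include <shared_mutex>

// Sketch of slide 31's trylock/trywrite optimization: give up instead
// of waiting, so an unimportant writer never stalls behind readers.
bool tryMarkSaved(std::shared_mutex& pageLock, bool& savedBit) {
    if (!pageLock.try_lock())  // "trywrite": don't block on the lock
        return false;          // page busy; the caller retries later
    savedBit = true;
    pageLock.unlock();
    return true;
}
```

When the page is uncontended the write succeeds immediately; when a reader holds the page, `try_lock` fails and control returns at once instead of blocking the thread.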
33. Concurrency control
- Still under study.

34. File layer: writing a tuple
- When writing a tuple into the file: get the offset of the tail of the file, append the tuple at the tail, flush the buffer, and add the (offset, tuple identifier) pair to the index.
- A partitioned hash is used to implement the two-dimensional index.

35. I/O
- The system handles three kinds of client I/O: registering streams, registering CQLs, and querying provenance. We do not use one thread per connection; we implement them all in one I/O thread, which may block when there is nothing to read or write.
- Receiving stream data (stream1-stream4) is implemented in the main thread and must be non-blocking I/O.
- We will use I/O multiplexing here.

36. What is I/O multiplexing?
- It is used when an application needs to handle multiple I/O descriptors at the same time and I/O on any one descriptor can result in blocking.
- The application blocks until any of the registered I/O descriptors becomes able to read, write, or raise an exception.

37. epoll
- epoll is a scalable I/O event notification mechanism, meant to replace the older POSIX select and poll system calls.
- With select, readiness is tracked as bitmasks over file descriptors, e.g. fd = 0..4 with write mask 00011 and read mask 00110.

38. Further work
- Implement the multi-threading design: use a dedicated thread to save the provenance.
- Implement the file layer design: add an index to the provenance saved in the file.
- Implement the I/O design.
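The epoll-based I/O thread of slides 35-37 can be sketched as a minimal event loop (Linux-only). A pipe stands in for a client socket here; that substitution, and the function name, are assumptions for the demo.

```cpp
#include <sys/epoll.h>
#include <unistd.h>

// Sketch of the I/O thread: one epoll instance watches the client
// descriptors, and epoll_wait blocks until any of them becomes ready.
int demoEpollOnce() {
    int epfd = epoll_create1(0);
    int fds[2];
    if (epfd < 0 || pipe(fds) != 0) return -1;

    epoll_event ev{};
    ev.events = EPOLLIN;          // interested in "able to read"
    ev.data.fd = fds[0];
    epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev);

    write(fds[1], "x", 1);        // the "client" sends one byte

    epoll_event ready[8];
    int n = epoll_wait(epfd, ready, 8, 1000);  // wakes: fds[0] is readable

    char c;
    read(fds[0], &c, 1);
    close(fds[0]); close(fds[1]); close(epfd);
    return n;                     // number of ready descriptors
}
```

Unlike select, which rescans fixed-size read/write bitmasks on every call (the fd = 0..4 masks on slide 37), epoll registers interest once with epoll_ctl and then returns only the descriptors that are actually ready, which is what makes it scale.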