MongoDB for a High Volume Logistics Application · 2018. 4. 26.
![Page 1: Volume Logistics Application MongoDB for a High · 2018. 4. 26. · MongoDB for a High Volume Logistics Application. ... Eric Potvin Software Engineer in the performance team at Shipwire,](https://reader034.vdocument.in/reader034/viewer/2022051605/6013a8deaea9d15c607853b4/html5/thumbnails/1.jpg)
Santa Clara, California | April 23rd – 25th, 2018
MongoDB for a High Volume Logistics Application
about me ...
Eric Potvin
Software Engineer in the performance team
at Shipwire, an Ingram Micro company, in Sunnyvale, California
… A little background
who are we?
We offer a cloud-based fulfillment software platform
This platform enables thousands of brands and online retailers to manage their order fulfillment operations
We support 20+ warehouses in multiple countries, including the USA, Canada, Australia, Italy, Germany and China
warehouses are … old-fashioned
Some warehouses are unable to easily adapt their systems to new technologies
Warehouses use old infrastructure, such as legacy servers (AS/400) or service providers
Warehouses understand files
… and FTP
what we have to deal with
Millions of files received monthly
Gigabytes of various document file types (XML, TXT/CSV, PDF)
Limitations on files received (raw zip files vs zip files)
Limitations of FTP connections
lots of data to maintain
8 processing servers
Ingesting millions of files per month
Thousands of log files
100+ GB of monthly logs / 250+ GB of data files
server resources & limitations
Manipulating so many files causes high server resource consumption
- Lots of processes with constant high CPU usage
- Each process has high RAM usage
- And high network usage: GBs of data transferred hourly
searching for information can be tedious
Often, we need to look for data after an error or the common “we didn’t receive these files” complaint
Data and logs are not available to users
Finding information requires an engineer to connect to each server
what about...
NFS? This would eliminate the lookup across servers but still has some issues:
- Still a large number of files
- Network overhead for large files
- And … -bash: /bin/ls: Argument list too long
MySQL?
- Changing the data structure requires maintenance
… so why did we choose MongoDB?
get all data at no cost?
Analytics software is great and allows any user to see data
But they can be costly and limited
MongoDB gives the flexibility to save what we need
With no monthly or setup fee
better integrations
All data is now visible to all users
Can be integrated with our in-house applications
Self-service tools allow users to take action immediately in case of issues
Accurate real-time tracking of documents
Real-time monitoring of documents and server resources
no more frequent reads/writes
No more slow CRUD operations on an XML file on disk
Avoid millions of disk and memory operations
It also makes our code healthier …
simplified code
From:
    Document doc = db.parse(<my_file>);
    Element elem = doc.getDocumentElement();
    NodeList nl = elem.getElementsByTagName(<child>);
    for (int i = 0; i < nl.getLength(); i++) {
        // cast the node before calling getElementsByTagName on it
        NodeList nodes = ((Element) nl.item(i)).getElementsByTagName(<tag>);
        for (int j = 0; j < nodes.getLength(); j++) {
            // fetch data for what I need
        }
    }
    // and update later
To:
    MongoCollection<Document> collection = mongoClient.getDatabase(myDatabase)
        .getCollection(myCollection);
    collection.find(search)
        .projection(whatINeed);
    // and update later
    collection.updateOne(search, dataToUpdate);
available for everyone and instantly
Now all our apps can access MongoDB
Microservices can access the same data without delay
Data is available instantly, even after multiple manipulations
another ALTER? seriously? ...
No more “system under maintenance” because we need to alter a big table
No need to worry about schema updates when a warehouse changes its file
And no need to store the entire content in a blob and try to search within it
where is my data?
Data can be accessed through a “single point of access” (it all depends on which secondary I am reading from)
Faster data access with multiple secondaries
No more “file locked” … and waiting for unlock ...
server goes down, no big deal
The election process is fantastic!
No more “down time” due to single points of failure
Easy to expand and/or upgrade
How did we reduce server resource usage?
example of manipulating a single order
1 order from Chicago, USA to Québec City, Canada using an international carrier,
1 product ordered.
This requires at least 7 XML files and 3 PDF files to be created
shipping confirmation example
This file contains multiple nodes giving shipping details
- Tracking numbers
- Number of boxes shipped
- Carrier, including details
- etc...
File size can be up to a few megabytes
nested loops of … O(n*r)?
Looping through a file of a few megabytes is slow
- Each iteration calls an API and updates database records
What if the process crashes? Where do we restart from?
- Manual recovery
Constant monitoring of server resources
iterations (what we used to have)
Open the entire file in memory
Loop through each record,
For each record loop through each box shipped
For each box shipped,
Loop through each product (quantity shipped, reason if not shipped)
Enough !
let’s keep this simple: O(1)
no more loops ...
Save only the data we care about
- Our own standard format using kilobytes of data
More efficient document searches
- One simple document, one single query
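As a rough illustration of saving only the data we care about: the nested record, box and product walk can be done once at ingestion time, producing a small flat summary that later lookups read directly. This is only a sketch under stated assumptions. The class and field names (`Item`, `Box`, `ShipmentSummary`, `trackingNumber`, `shippedBySku`) are hypothetical, and the talk does not show the actual stored format.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a parsed shipping confirmation.
final class Item {
    final String sku;
    final int quantityShipped;

    Item(String sku, int quantityShipped) {
        this.sku = sku;
        this.quantityShipped = quantityShipped;
    }
}

final class Box {
    final String trackingNumber;
    final List<Item> items;

    Box(String trackingNumber, List<Item> items) {
        this.trackingNumber = trackingNumber;
        this.items = items;
    }
}

final class ShipmentSummary {
    // Walk the nested structure once and keep only the fields we care about.
    // The returned map has the shape of a small document that could be stored as-is.
    static Map<String, Object> summarize(String orderId, List<Box> boxes) {
        List<String> trackingNumbers = new ArrayList<>();
        Map<String, Integer> shippedBySku = new LinkedHashMap<>();
        for (Box box : boxes) {
            trackingNumbers.add(box.trackingNumber);
            for (Item item : box.items) {
                // Sum quantities per SKU across all boxes.
                shippedBySku.merge(item.sku, item.quantityShipped, Integer::sum);
            }
        }
        Map<String, Object> summary = new LinkedHashMap<>();
        summary.put("orderId", orderId);
        summary.put("boxCount", boxes.size());
        summary.put("trackingNumbers", trackingNumbers);
        summary.put("shippedBySku", shippedBySku);
        return summary;
    }
}
```

The loops still run, but only once per incoming file; afterwards a reader asks for one small summary instead of re-parsing megabytes of XML.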
“Stateful” resource
Keep track of data changes inside the document
No more intensive memory and disk usage due to multiple file manipulations
Real-time manual changes from a UI by any user
Fault tolerant
MongoDB gives us persistent data that survives a server reboot, segmentation fault, etc.
Eliminates memory issues when reading multiple large text files into memory
Frees up resources for other applications running on the same server
server resources
This results in processes with very low resource usage
CPU percentage and load went down drastically
Network usage dropped considerably
disk utilization
No more -bash: /bin/ls: Argument list too long
Lots of free space reused for something else
No more frequent “cleanup” or disk maintenance
No more file archiving/maintenance to a backup server
No more “disk at 95% utilization” alerts
Let’s see a simple example
Application logs
application logs (what we used to have)
Each application logs its data to its own specific files
Each log entry uses a different log level based on what is executed
CRIT (0), ERR (1), WARN (2), INFO (3), DEBUG (4)
Logs are saved with the following format in /var/log/my_application/my_app.log
2017-11-12T03:50:02-08:00 [ INFO / 3 ] (PID: 12345): My message
application log (search)
To search, we simply need to run:
    for x in $(seq 1 8); do
        # single quotes so the search string and the glob reach the remote shell intact
        ssh "p$x.myserver" 'grep -r "my search" /logs/app/*'
    done
… wait … and … wait
no more !
let’s fix this
logging in MongoDB
Each application logs its data to its own specific namespace
Database used: <application_name>
Collection used: <application_specific>
Example: warehouse.sending_files
logging in MongoDB (example)
{ "datetime": ISODate(),
  "level": "INFO", "code": 3,
  "pid": 12345,
  "message": "file orders_1234.zip sent to /inbound/" }
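To make the mapping from the old file format to this document concrete, here is a minimal sketch that splits one line of the file-based log format (timestamp, level and code, PID, message) into the same fields. It uses only the Java standard library and stops short of any database call. `LogLineParser` and its regular expression are assumptions for illustration, not the code used in production.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLineParser {
    // Matches: 2017-11-12T03:50:02-08:00 [ INFO / 3 ] (PID: 12345): My message
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) \\[ (\\w+) / (\\d+) \\] \\(PID: (\\d+)\\): (.*)$");

    // Returns the fields of one log line, or null if the line does not match.
    static Map<String, Object> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("datetime", m.group(1)); // kept as an ISO-8601 string in this sketch
        doc.put("level", m.group(2));
        doc.put("code", Integer.parseInt(m.group(3)));
        doc.put("pid", Integer.parseInt(m.group(4)));
        doc.put("message", m.group(5));
        return doc;
    }
}
```

Storing `level`, `code` and `pid` as separate fields is what lets a query filter on them directly instead of running a text search over whole lines.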
MongoDB log (search)
use logs;
db.my_app.find();
db.my_app.find({level: "INFO"});
db.my_app.find({message: /some specific data/});
archiving logs
Archiving data can be done by using a TTL index
● Warning: the TTL background task runs every 60 seconds to find the records that need to be removed. This can slow down data access.
Another way is to create a daemon that generates “yearly or monthly” collections.
Then, use mongodump to archive the records.
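One possible shape for the naming logic such a daemon would need is sketched below, assuming monthly collections named `<base>_<yyyy>_<MM>` and UTC month boundaries. Both choices, and the `MonthlyCollections` class itself, are hypothetical; the talk does not specify a naming scheme.

```java
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;

final class MonthlyCollections {
    // Hypothetical naming scheme: <base>_<yyyy>_<MM>, e.g. my_app_2017_11.
    static String nameFor(String base, Instant when) {
        YearMonth ym = YearMonth.from(when.atZone(ZoneOffset.UTC));
        return String.format("%s_%04d_%02d", base, ym.getYear(), ym.getMonthValue());
    }

    // A collection is ready to archive once its month is strictly before the current month.
    static boolean isArchivable(YearMonth collectionMonth, Instant now) {
        return collectionMonth.isBefore(YearMonth.from(now.atZone(ZoneOffset.UTC)));
    }
}
```

Writers would call `nameFor` with the current time to pick the target collection, and the archiving job would dump any collection for which `isArchivable` is true, for example by passing its name to mongodump's `--db` and `--collection` options.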
So …
What can MongoDB do for you?
Q+A ?
Thank You!