Why EDP Chose MongoDB
Artyom Diky
William Biesty
Mark Velez
Agenda
• Who are we?
• Evolution of Document Management
• File system to relational DB
• Relational to document-oriented DB
• Paper to electronic
• Advantages and Challenges
• Questions?
Who Are We?
• New York City Department of Health and Mental Hygiene
• Environmental Health Services (EHS)
• Environmental Disease Prevention (EDP)
• Lead Poisoning Prevention Program (LPPP)
• MIS Unit (we are here)
• We support many programs within EDP
• Who are our stakeholders?
o Inspectors
o Researchers
o Clinical Staff
o Lawyers (FOIL)
Evolution of Document Management: Paper
• A lot of legal documents on paper
o Historic: from the 1970s onward
o Current (ongoing)
• Problems with paper
o Time and labor intensive: locate, copy, redact, copy, mail (repeat...)
o Storage space
o Disaster recovery
Evolution of Document Management: eFiles
• VB6
• Scanning utilities
• File-system based storage
• Millions of files
• Identifiers based on child ID
Evolution of Document Management: eFiles Issues
• Technical
o VB6 phased out
o Outdated 3rd party tools changed API
o License expired
• Security
o Documents have been redacted permanently
o No access control to private information
• Scalability
o New document types
o New indexing (tagging) mechanisms for search
Evolution of Document Management
• Need for better document management
• Paperless offices mandate
• Expand searchable attributes and document text
• Update technology
• Improved security
• HIPAA compliance
• Platform for future applications
File System to Relational DB
• Challenges:
• 1M+ historical documents as image files
• Need for document metadata
• Various and evolving schemas
• Security
• Updates and migration
• Fail-safe storage
Technologies
• We use Microsoft technologies
• SQL Server
• .NET
• We are a small team that develops and supports dozens of data collection apps (forms)
• Risk assessments
• Inspection Reports
• Research
• Case Management
Example Documents
[Figure: example documents labeled with the fields event_date, child ID, document_type, and me_num]
File System to Relational DB
• MSSQL 2008
o Data storage with FileStream
o Metadata with Entity-Attribute-Value (sql_variant)
o Data-driven application design
• Rich service-oriented API through WCF
• Search engine
• Added features
o Versioning (change and revert)
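The Entity-Attribute-Value metadata layout listed above can be sketched roughly as follows. This is an illustration only, not the DocSpace schema: it uses Python's built-in sqlite3 in place of SQL Server, the table and column names (`document`, `document_attr`) and the sample values are hypothetical, and SQLite's untyped column merely stands in for `sql_variant`.

```python
import sqlite3


def build_store():
    """Create an in-memory store: one table for content, one for EAV metadata."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE document (id INTEGER PRIMARY KEY, content BLOB);
        CREATE TABLE document_attr (
            doc_id INTEGER REFERENCES document(id),
            name   TEXT NOT NULL,
            value,  -- untyped column, playing the role of sql_variant
            PRIMARY KEY (doc_id, name)
        );
        """
    )
    return conn


def add_document(conn, content, attrs):
    """Store the file bytes, then one metadata row per attribute."""
    cur = conn.execute("INSERT INTO document (content) VALUES (?)", (content,))
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO document_attr (doc_id, name, value) VALUES (?, ?, ?)",
        [(doc_id, name, value) for name, value in attrs.items()],
    )
    return doc_id


def load_attrs(conn, doc_id):
    """Reassemble the metadata rows of one document into a dict."""
    rows = conn.execute(
        "SELECT name, value FROM document_attr WHERE doc_id = ?", (doc_id,)
    )
    return dict(rows.fetchall())


# Hypothetical sample values, for illustration only.
conn = build_store()
doc_id = add_document(
    conn,
    b"scanned page bytes",
    {"document_type": "inspection_report", "child_id": 12345},
)
metadata = load_attrs(conn, doc_id)
```

Every new attribute becomes a row rather than a column, which keeps the schema stable but pushes all structure and type checking into application code.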
DocSpace SQL Architecture
Limitations of Relational Model
• Need faster development cycle
• Double effort for development and maintenance
• On application and database level
• Document definition (metadata) first, content later
• Changing schema
• Rigid document structure
o Not amenable to change
• No support for non-primitive values
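To illustrate the last point, here is a small sketch of the translation a nested document needs before it fits into attribute/value rows that hold only primitives. The `flatten` helper and the sample document are hypothetical and are not taken from DocSpace.

```python
def flatten(value, prefix=""):
    """Yield (path, primitive) pairs for a nested structure of dicts and lists."""
    if isinstance(value, dict):
        for key, item in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from flatten(item, path)
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from flatten(item, f"{prefix}[{index}]")
    else:
        yield prefix, value


# Hypothetical document with a list of nested findings.
report = {
    "document_type": "inspection_report",
    "findings": [{"room": "kitchen", "hazard": True}],
}
rows = list(flatten(report))
```

Each nested value turns into its own row keyed by a path string, and the application has to rebuild the structure on every read. That is the double effort on the application and database level noted above.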
Effects on Development Cycle
• SQL: Waterfall-like approach
o Fully develop requirements before implementation
o Have to get the schema right to avoid hassle
o Change discouraged
• MongoDB: Rapid Application Development
o Prototyping
o Change accommodated
Document Management System Done Right
• Faster development cycles
• No translation of complex document structure into relational model
• Application driven schema
• Document content first, metadata later
• Flexible document structure driven by user requirements
• GridFS for large documents
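As background on the GridFS bullet: GridFS stores a large file as one metadata document plus a series of fixed-size chunk documents (255 KiB each by default), so files can exceed MongoDB's 16 MB per-document limit. The sketch below only mimics that storage layout in plain Python. It is not the driver's GridFS API, and the helper names are hypothetical.

```python
CHUNK_SIZE = 255 * 1024  # GridFS default chunk size, in bytes


def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Return a files-style document and the chunk documents for `data`."""
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def reassemble(chunks):
    """Join chunk payloads in sequence order to recover the original bytes."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


# A hypothetical 600,000-byte scanned image.
scan = bytes(600_000)
files_doc, chunks = split_into_chunks("scan-1", scan)
restored = reassemble(chunks)
```

In a real application the MongoDB driver performs this splitting and reassembly; the sketch is only meant to show why large scans do not have to fit in a single document.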
DocSpace MongoDB Architecture
Case Study - Traffic Fatalities
• A study of traffic-related fatalities in NYC
• Injury Surveillance and Prevention
• Offline data collection
• 330+ data points
• Multiple weekly changes to schema
o Add/remove fields
o Value types
• Developed in 500 hrs (3 months)
• 1 intermediate developer, 1 novice
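One way an application can cope with weekly schema changes is to normalize older records when they are read, instead of migrating stored data. The sketch below assumes that approach; the slides do not say how the study's application handled it, and the field names are invented for illustration.

```python
def normalize(record, defaults, retired, converters):
    """Bring a record saved under an older form version up to the current shape.

    defaults   -- fields added later, with the value to use when absent
    retired    -- names of fields that were removed from the form
    converters -- per-field functions for fields whose value type changed
    """
    current = dict(defaults)
    for key, value in record.items():
        if key in retired:
            continue  # field no longer exists in the form
        convert = converters.get(key)
        current[key] = convert(value) if convert else value
    return current


# Hypothetical record saved before three schema changes.
old_record = {"victim_age": "34", "weather": "rain"}
upgraded = normalize(
    old_record,
    defaults={"helmet_worn": None},   # field added later
    retired={"weather"},              # field removed later
    converters={"victim_age": int},   # value type changed from text to number
)
```

Because stored documents do not have to share one shape, each schema change touches only this read-side mapping and the form itself.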
Evolving Use of MongoDB
• Single Node with Database Security
• Nightly Dump for Backup Archiving
• Master – Slave Nodes
• Replica Sets – 3 Nodes
• Distributed across Metropolitan Area Network
• Bare Iron Primary, VMware ESX and Hyper-V VM Secondaries
• Hurricane Sandy – No downtime, one node failed
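The arithmetic behind the three-node result is simple: a replica set can elect or keep a primary only while a majority of its voting members can reach each other. A minimal sketch follows; the function names are hypothetical.

```python
def majority(voting_members):
    """Smallest number of votes that is more than half of the members."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable_members):
    """True if the reachable members still form a majority."""
    return reachable_members >= majority(voting_members)


# Three-node set with one node lost: the two survivors are still a majority.
survives_one_failure = can_elect_primary(3, 2)
# With two nodes lost, the last node cannot stay primary on its own.
survives_two_failures = can_elect_primary(3, 1)
```

This is why a three-node set tolerates exactly one failed node, while a two-node set tolerates none.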
Thank you!
Questions
Contact Us
William Biesty, Database Administrator
Art Diky, Software Engineer
Mark Velez, Software Engineer
nyc.gov/health