dbfs - database filesystem
Concept and problem case Software design Implementation Benchmarks Conclusion and future work Literature
Dbfs - Database filesystem
Timo Minartz
Software project WS 2008/09
April 6, 2009
Supervised by Julian Kunkel
Contents
1 Concept and problem case
2 Software design
3 Implementation
4 Benchmarks
5 Conclusion and future work
6 Literature
Project goal
Specific to the problem case
• map filesystem sources and database tables in one namespace
• implement a lightweight filesystem with FUSE [Sou]
• easy-to-maintain database design
• minimize database overhead
General
• reusable software
• well documented
• usability
Problem case
Initial situation
• a microscope generates large amounts of data in a specific folder hierarchy
• in particular, it creates TIFF files of a few MByte each
• each TIFF file is identified by collaboration, project, plate, replicate, well and file name
• there are multiple collaborations, projects, etc., so many TIFF files are created
Further situation
• the TIFF files are to be evaluated by different applications
• these applications store their results in simple files
• these files should be easy to manage (e.g. by a database system)
Problem case (2)
Initial file structure (base filesystem)
/collaboration/project/plate/replicate/well-file.tiff
Resulting file structure (FUSE filesystem, dbfs)
/collaboration/project/application/plate/replicate/well/file.tiff
Example
Base filesystem structure
/collab0/project0/plate0/replicate0/000-file1.tiff
/collab0/project0/plate0/replicate0/000-file2.tiff
/collab0/project0/plate0/replicate0/001-file3.tiff
/collab0/project0/plate0/replicate0/metadata
Dbfs file structure
/collab0/project0/application0/plate0/replicate0/000/file1.tiff
/collab0/project0/application0/plate0/replicate0/000/file2.tiff
/collab0/project0/application0/plate0/replicate0/001/file3.tiff
/collab0/project0/application0/plate0/replicate0/metadata
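The mapping between the two structures can be sketched as a path translation. This is a minimal illustration, not the project's actual code: the function names `splitPath` and `toBasePath` are hypothetical, and it assumes that the application is always the third path component and that well and file name are joined by a single dash in the base filesystem.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split "/a/b/c" into {"a", "b", "c"}, skipping empty components.
std::vector<std::string> splitPath(const std::string& path) {
    std::vector<std::string> parts;
    std::istringstream in(path);
    std::string part;
    while (std::getline(in, part, '/')) {
        if (!part.empty()) parts.push_back(part);
    }
    return parts;
}

// Translate a dbfs path into the corresponding base filesystem path
// (hypothetical helper; assumes the application is the third component).
std::string toBasePath(const std::string& dbfsPath) {
    std::vector<std::string> p = splitPath(dbfsPath);
    if (p.size() >= 3) {
        p.erase(p.begin() + 2);  // drop the application level
    }
    std::string out;
    for (std::size_t i = 0; i < p.size(); ++i) {
        // With the application removed, a full file path has six components:
        // index 4 is the well, index 5 the file name; they are joined by '-'.
        const bool joinWellAndFile = (p.size() == 6 && i == 5);
        out += joinWellAndFile ? "-" : "/";
        out += p[i];
    }
    return out.empty() ? "/" : out;
}
```

Paths that end at the well level (for example `.../replicate0/000`) have no counterpart in the base filesystem, so a real implementation would have to treat them as virtual directories rather than translate them.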
Virtual file examples
Dbfs filesystem
/collaboration0/project0/application0/plate0/replicate0/000/ergs
/collaboration0/project0/application0/plate0/replicate0/001/ergs
• virtual files are stored in the database
• virtual files are identified by collaboration, project, plate, replicate, well, file name AND application
Further constraints
Virtualization layers
• one for the application and
• one for the well
Permissions
• read-only access to TIFF files
• permissions for metadata files are inherited from the base filesystem
• read and write permissions for virtual files at application level
• no structural changes allowed (chmod, mkdir, ...)
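The permission rules above could be expressed as a small function that derives the mode reported for a file from its kind. This is only a sketch under assumptions: `FileKind` and `effectiveMode` are hypothetical names, and the concrete mode chosen for virtual files (owner read/write) is an assumption, since the slides only state that virtual files are readable and writable at application level.

```cpp
#include <sys/stat.h>  // mode_t and the S_I* permission bits (POSIX)

// The three kinds of files visible through dbfs.
enum class FileKind { Tiff, Metadata, Virtual };

// Derive the permission bits to report for a file.
// baseMode is the mode of the file in the base filesystem (unused for virtual files).
mode_t effectiveMode(FileKind kind, mode_t baseMode) {
    const mode_t writeBits = S_IWUSR | S_IWGRP | S_IWOTH;
    switch (kind) {
        case FileKind::Tiff:
            return baseMode & ~writeBits;  // read-only: strip all write bits
        case FileKind::Metadata:
            return baseMode;               // inherited unchanged
        case FileKind::Virtual:
            return S_IRUSR | S_IWUSR;      // assumption: owner may read and write
    }
    return 0;
}
```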
Virtual files model
• one table for every application
• each table has one column for every subfolder and one for every virtual file
Table: Example database table collaboration0 project0 application0
plate    replicate    well   ergs
plate0   replicate0   000    “ergs for well 000”
plate0   replicate0   001    “ergs for well 001”
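Reading a virtual file then amounts to selecting one column from one row of the application's table. The sketch below builds such a query as a string. Several points are assumptions and not taken from the slides: the table name is formed by joining collaboration, project and application with underscores, and `VirtualFileKey`, `isSafeIdentifier` and `buildSelect` are hypothetical names. Because the values are pasted into the SQL text, every part is restricted to letters, digits and underscores; prepared statements would be the more usual way to avoid SQL injection.

```cpp
#include <algorithm>
#include <cctype>
#include <initializer_list>
#include <stdexcept>
#include <string>

// Everything that identifies one virtual file.
struct VirtualFileKey {
    std::string collaboration, project, application;
    std::string plate, replicate, well;
    std::string file;  // name of the virtual file, used as column name
};

// Accept only letters, digits and underscores.
bool isSafeIdentifier(const std::string& s) {
    return !s.empty() && std::all_of(s.begin(), s.end(), [](unsigned char c) {
        return std::isalnum(c) || c == '_';
    });
}

// Build the SELECT statement for one virtual file.
// Throws std::invalid_argument if any part contains unexpected characters.
std::string buildSelect(const VirtualFileKey& k) {
    for (const std::string* part : {&k.collaboration, &k.project, &k.application,
                                    &k.plate, &k.replicate, &k.well, &k.file}) {
        if (!isSafeIdentifier(*part)) {
            throw std::invalid_argument("unsafe path component: " + *part);
        }
    }
    // Assumption: the table name joins the three levels with underscores.
    const std::string table = k.collaboration + "_" + k.project + "_" + k.application;
    return "SELECT " + k.file + " FROM " + table +
           " WHERE plate = '" + k.plate +
           "' AND replicate = '" + k.replicate +
           "' AND well = '" + k.well + "'";
}
```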
Permissions model
• permissions at project level
• second table for permissions
• containing one column for the application and one for the owner (user id from the operating system)
Table: Example permission table permissions collaboration0 project0
name           owner
application0   1000
application1   1001
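A lookup against such a permission table might look like the following sketch. The in-memory `PermissionTable` stands in for the database table, and the rule that only the recorded owner may write is an assumption; the slides do not spell out what exactly the owner column grants.

```cpp
#include <map>
#include <string>
#include <sys/types.h>  // uid_t

// In-memory stand-in for the permission table: application name -> owner uid.
using PermissionTable = std::map<std::string, uid_t>;

// Assumption: only the owner recorded for an application may write its virtual files.
bool mayWrite(const PermissionTable& table, const std::string& application, uid_t uid) {
    const auto it = table.find(application);
    return it != table.end() && it->second == uid;
}
```

With the example table, user 1000 may write below application0 but not below application1.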
Managing the directory structure
General
• changes are made in the base filesystem
• and in the database tables (e.g. new virtual files)
How to
• “by hand”, see the documentation and/or the README file
• using a simple GUI
Managing the directory structure (2)
Figure: Graphical user interface to manage the directory structure
Optimizations and restrictions
Database overhead
• multiple users, each of whom needs a separate database connection
• many queries are generated for a simple command (such as ls)
Optimization
• thread-safe database connection pooling
• simple caching of query results
• both can be enabled in the source code
Restrictions
• cache consistency problem
• arises if the underlying base filesystem changes (creation of new (sub-)folders etc.)
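A simple query-result cache of the kind mentioned above could be sketched as follows. The class name `QueryCache` and its interface are hypothetical; results are modelled as plain strings, and a mutex makes the cache usable from several FUSE worker threads.

```cpp
#include <map>
#include <mutex>
#include <string>

// Thread-safe cache that maps a query string to its (stringified) result.
class QueryCache {
public:
    // Returns true and fills `result` if the query has been cached.
    bool lookup(const std::string& query, std::string& result) const {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto it = entries_.find(query);
        if (it == entries_.end()) return false;
        result = it->second;
        return true;
    }

    // Remember the result of a query, replacing any older entry.
    void store(const std::string& query, const std::string& result) {
        std::lock_guard<std::mutex> lock(mutex_);
        entries_[query] = result;
    }

    // Drop all entries, e.g. after the base filesystem or the tables changed.
    void clear() {
        std::lock_guard<std::mutex> lock(mutex_);
        entries_.clear();
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, std::string> entries_;
};
```

The sketch also shows where the consistency problem comes from: nothing calls `clear()` automatically, so entries stay stale until someone invalidates them after a structural change.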
Implementation in C++
The implemented classes are grouped into four modules
• filesystem handling
• database access
• GUI and
• helper classes and functions
Implemented filesystem operations
• getattr
• readdir
• read and
• write
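For readdir, the well virtualization layer has to present the flat file names of a base replicate directory as well subdirectories. The following sketch shows that grouping step on its own, without the FUSE API. The function names are hypothetical, and it assumes that every base name containing a dash has the form well-file, so that names without a dash (such as metadata) are not wells.

```cpp
#include <set>
#include <string>
#include <vector>

// Derive the well directories from base names such as "000-file1.tiff".
std::set<std::string> listWells(const std::vector<std::string>& baseEntries) {
    std::set<std::string> wells;
    for (const std::string& name : baseEntries) {
        const std::string::size_type dash = name.find('-');
        if (dash != std::string::npos && dash > 0) {
            wells.insert(name.substr(0, dash));
        }
    }
    return wells;
}

// List the file names that appear inside one well directory.
std::vector<std::string> listWellFiles(const std::vector<std::string>& baseEntries,
                                       const std::string& well) {
    std::vector<std::string> files;
    const std::string prefix = well + "-";
    for (const std::string& name : baseEntries) {
        if (name.size() > prefix.size() && name.compare(0, prefix.size(), prefix) == 0) {
            files.push_back(name.substr(prefix.size()));
        }
    }
    return files;
}
```

In a real readdir implementation, the names of the virtual files stored in the database would be added to the entries of each well directory.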
Implementation in C++ (2)
Further implementation details
• documentation (PDF)
• in-line documentation (doxygen)
• type make doc in the software project root
FUSE stumbling blocks
Mounting FUSE without administrative privileges
• mount: ./dbfs mountpoint [args]
• unmount: fusermount -u mountpoint
Logging
• FUSE forks a new process, so logging to stdout is not possible
• the parameter -f prevents FUSE from forking
• alternative: logging to a file (implemented)
Debugging with valgrind
• problem with older kernel versions: fusermount is not traceable
• workaround available: see README in project root
• works out of the box with kernel 2.6.27-11-generic
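Logging to a file, the alternative named above, can be as small as the following sketch. The function `logLine` is a hypothetical helper and not the project's logger; it reopens the file for every message, which is slow but keeps the example free of shared state.

```cpp
#include <fstream>
#include <string>

// Append one line to the log file.
// Returns false if the file cannot be opened or written.
bool logLine(const std::string& logPath, const std::string& message) {
    std::ofstream out(logPath, std::ios::app);
    if (!out) return false;
    out << message << '\n';
    return static_cast<bool>(out.flush());
}
```

A filesystem that serves requests from several threads would additionally guard the call with a mutex so that lines from different threads are not interleaved.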
The benchmark process
Test sets
• comparison of Dbfs and tmpfs
• evaluation of Dbfs
  • tmpfs as base filesystem
  • ext3 / tmpfs filesystem for the MySQL database
  • clean / dirty database
Different use cases
• reading filesystem attributes
• reading metadata files and TIFF files
• reading virtual files
• writing virtual files
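A read test of this kind can be sketched as a function that reads a file with a given block size and records the elapsed time. This is not the benchmark code used in the project; `ReadResult` and `timedRead` are hypothetical names, and the sketch measures a single pass without controlling for caching effects.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

struct ReadResult {
    std::size_t bytes;  // total number of bytes read
    double seconds;     // elapsed wall-clock time
};

// Read the whole file in chunks of blockSize bytes and time the reads.
ReadResult timedRead(const std::string& path, std::size_t blockSize) {
    if (blockSize == 0) return {0, 0.0};  // avoid an endless loop of empty reads
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buffer(blockSize);
    std::size_t total = 0;
    const auto start = std::chrono::steady_clock::now();
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        total += static_cast<std::size_t>(in.gcount());  // counts partial last block too
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {total, elapsed.count()};
}
```

Dividing `bytes` by `seconds` gives the throughput for one block size; repeating the call for several block sizes on the same file, once below the dbfs mount point and once on the base filesystem, gives comparable curves.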
Metadata
Figure: Reading filesystem attributes
Physical files
Figure: Read test for the physical files depending on block size
Physical files (2)
Figure: Read test for the physical files, time for reading one byte
Physical files (3)
Figure: Read test for the physical files, bytes per sec
Virtual files
Figure: Read test for virtual files
Virtual files (2)
Figure: Write test for virtual files
Virtual files (3)
Figure: Read and write for virtual files
Conclusion
Software project goal
• mapping filesystem and database sources in one namespace can be solved by a FUSE implementation
• good performance for physical files (stored on the underlying filesystem)
• the bottleneck for virtual files is not the database access itself
• whether to use this implementation has to be decided for each concrete use case
Future work
• implementation issues (SQL injection, dynamic virtualization layers)
ROFS, the Read-Only Filesystem for FUSE. http://mattwork.potsdam.edu/projects/wiki/index.php/Rofs
IEEE, The ; Group, The O.: The Open Group Base Specifications Issue 6. http://www.opengroup.org/onlinepubs/009695399/functions/contents.html
Microsystems, Sun: MySQL 6.0 Reference Manual. http://dev.mysql.com/doc/refman/6.0/en/index.html
Sourceforge.net: Main Page - fuse. http://apps.sourceforge.net/mediawiki/fuse/index.php?title=Main_Page