A columnar architecture for modern risk management systems
Romulo Goncalves, Sisi Zlatanova, Kostis Kyzirakos, Pirouz Nourian, Foteini Alvanaki,
and Willem van Hage
Risk Management
3D city models
Data sources
Architecture of a service
• Spatial Database Management System
– Columnar architecture
• Spatial analysis tailored for different use case scenarios
Point Cloud Data
Vector Data
TOP25RASTER
Semantic data
Storage
Looking to the future
• Each use case might require a different resolution
– Level of definition on the semantic data
– Remote sensing data
New conceptual scheme
• Voxel-based 3D city model
– Voxels are the volumetric representation of pixels.
Voxel-based 3D city model
• Storing volumetric spaces such as air, water and underground is possible.
• Every object is defined by a set of voxels
– The voxel size depends on the level of detail (LOD)
– Voxel characteristics, e.g. type (wall, glass, roof, door, etc.), color, density
• Simplification to speed up computations
– Simplifies a range of geometric operations: volumes and areas
– Real-world objects are represented by a single geometry type (3D cube) instead of a collection of polygons/polyhedra
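The simplification of volume and area computations can be illustrated with a minimal sketch. It assumes an object is held as a set of integer (i, j, k) voxel indices with one uniform edge length; the function names and this representation are illustrative assumptions, not the data model used in the talk.

```python
def voxel_volume(voxels, size):
    """Volume of an object given as a set of (i, j, k) voxel indices."""
    return len(voxels) * size ** 3


def voxel_surface_area(voxels, size):
    """Exposed area: count cube faces that are not shared with another voxel."""
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    exposed_faces = 0
    for i, j, k in voxels:
        for di, dj, dk in neighbours:
            if (i + di, j + dj, k + dk) not in voxels:
                exposed_faces += 1
    return exposed_faces * size ** 2


# A 2 x 2 x 2 block of 0.5 m voxels, i.e. a 1 m cube.
block = {(i, j, k) for i in range(2) for j in range(2) for k in range(2)}
volume = voxel_volume(block, 0.5)
area = voxel_surface_area(block, 0.5)
```

Both operations reduce to counting, which is the kind of simplification a single cube geometry type allows compared with polygon or polyhedron collections.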
Voxelization
• MethodsX paper:
– Voxelization Algorithms for Geospatial Applications
• https://github.com/NLeSC/geospatial-voxels/software/voxelGen
• Storage challenges [1]:
– Different semantic levels of detail and coverage of inside and outside empty spaces
– Grid with homogeneous cells or heterogeneous cells
• An entire city will generate a massive 3D grid of voxels at different resolutions with a large number of semantic attributes attached
[1] Towards 3d raster GIS: on developing a raster engine for spatial DBMS
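To make the resolution trade-off concrete, here is a minimal point-binning sketch. It is not one of the algorithms from the MethodsX paper or the voxelGen tool; the function name, the origin parameter, and the sample points are assumptions made for illustration.

```python
import math


def voxelize_points(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Map (x, y, z) points to the set of integer voxel indices they fall in."""
    occupied = set()
    for x, y, z in points:
        occupied.add((
            math.floor((x - origin[0]) / voxel_size),
            math.floor((y - origin[1]) / voxel_size),
            math.floor((z - origin[2]) / voxel_size),
        ))
    return occupied


# Hypothetical sample points (metres).
points = [(0.1, 0.2, 0.3), (0.7, 0.4, 0.4), (1.2, 0.1, 0.0)]
coarse = voxelize_points(points, 1.0)  # 1 m voxels
fine = voxelize_points(points, 0.5)    # 0.5 m voxels
```

The same points occupy more voxels at the finer resolution, which hints at how quickly a city-wide grid grows when several resolutions are stored side by side.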
New storage scheme
• Nested column-oriented storage for 3D city models.
• To store nested data structures in flat columnar format [2]
– The record-wise versus columnar representation
• In the columnar representation all the values of a nested field are stored contiguously.
– The schema is mapped to a list of columns
• A.B.C can be retrieved without reading A.E or A.B.D.
[2] Dremel: Interactive analysis of web-scale datasets.
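The mapping from a nested schema to a list of columns can be sketched as follows. This is a simplified illustration that assumes every record has the same fields and no repeated or missing values (the cases that repetition and definition levels exist to handle); the helper names and sample records are hypothetical.

```python
def stripe(record, prefix=""):
    """Yield (column_path, value) pairs for every leaf of a nested dict."""
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from stripe(value, path)
        else:
            yield path, value


def to_columns(records):
    """Store the values of each leaf path contiguously in its own list."""
    columns = {}
    for record in records:
        for path, value in stripe(record):
            columns.setdefault(path, []).append(value)
    return columns


records = [
    {"A": {"B": {"C": 1, "D": "x"}, "E": 10.0}},
    {"A": {"B": {"C": 2, "D": "y"}, "E": 20.0}},
]
columns = to_columns(records)
c_values = columns["A.B.C"]  # read without touching A.B.D or A.E
```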
Columnar format for nested structures
• The record structure is defined through two integers called repetition level and definition level.
• Definition level
– The definition level records at which level of nesting a field started being null.
• Repetition level
– It is used to define when a new list starts in a column of values.
• Parquet
– https://blog.twitter.com/2013/dremel-made-simple-with-parquet
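A small worked example may help with the two levels. The sketch below follows the Dremel-style encoding for one column path, `parts.material`, under an assumed schema in which `parts` is a repeated field and `material` is optional, so the maximum repetition level is 1 and the maximum definition level is 2. The schema, field names, and function are hypothetical.

```python
def encode_parts_material(records):
    """Encode the path parts.material as (value, repetition, definition) triples."""
    column = []
    for record in records:
        parts = record.get("parts", [])
        if not parts:
            # Nothing on the path is defined: null starts at level 0.
            column.append((None, 0, 0))
            continue
        for index, part in enumerate(parts):
            # 0 marks the start of a new record, 1 a further item in the same list.
            repetition = 0 if index == 0 else 1
            material = part.get("material")
            # 2: value present; 1: the part exists but material is null.
            definition = 2 if material is not None else 1
            column.append((material, repetition, definition))
    return column


records = [
    {"parts": [{"material": "brick"}, {"material": "glass"}]},
    {"parts": []},
    {"parts": [{}, {"material": "steel"}]},
]
encoded = encode_parts_material(records)
```

A repetition level of 0 tells a reader where each record begins, and a definition level below the maximum tells it how much of the path exists when the value is null.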
Storage scheme
• The LOD is used for the definition level.
• All the voxels inherit the semantics from the parent
– If an object has LOD2 semantics, it also has LOD1 semantics
• The repetition level is the number of sub-divisions a parent voxel has.
– An object is semantically identified as a building in LOD1
– In LOD2 it might be composed of a set of sub-voxels that define walls, floor surfaces, etc.
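The inheritance rule can be sketched with a tiny parent-linked hierarchy. The `Voxel` class, its fields, and the labels are assumptions for illustration only; the talk does not specify the in-memory representation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Voxel:
    lod: int                           # level of detail at which the label applies
    label: str                         # semantic label at that LOD
    parent: Optional["Voxel"] = None   # coarser voxel this one subdivides


def semantics(voxel):
    """Return {lod: label}, including labels inherited from parent voxels."""
    labels = {}
    node = voxel
    while node is not None:
        labels[node.lod] = node.label
        node = node.parent
    return dict(sorted(labels.items()))


building = Voxel(lod=1, label="building")
wall = Voxel(lod=2, label="wall", parent=building)

wall_semantics = semantics(wall)
deepest_lod = max(wall_semantics)  # plays the role of the definition level
```

A sub-voxel labelled at LOD2 still answers LOD1 queries through its parent, and the deepest LOD with a label corresponds to the definition level described above.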
Architecture of a service
• Exploit late materialization
– Low memory footprint
• Repetition level
– Scan aggregation operators
• Definition level
– Projections
• Un-nest the data
• Tuple reconstruction
– Blocking operator
– Flat-data integration
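Late materialization can be sketched as follows: a predicate is evaluated on a single column to produce row positions, and tuples are reconstructed only for those positions and only from the requested columns. The column names, values, and helper functions are hypothetical; a real column store would operate on compressed vectors rather than Python lists.

```python
def select_positions(column, predicate):
    """Scan one column and return the positions of qualifying values."""
    return [pos for pos, value in enumerate(column) if predicate(value)]


def materialize(columns, names, positions):
    """Tuple reconstruction: build rows only for the selected positions."""
    return [tuple(columns[name][pos] for name in names) for pos in positions]


# Hypothetical voxel attribute columns.
columns = {
    "type": ["wall", "roof", "wall", "door"],
    "lod": [2, 2, 2, 3],
    "density": [1.8, 0.9, 2.1, 0.6],
}

positions = select_positions(columns["type"], lambda t: t == "wall")
rows = materialize(columns, ["type", "density"], positions)
```

Only a list of positions is carried between operators, and the `lod` column is never read, which is where the low memory footprint comes from.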
Voxel Data
TOP25RASTER
Point Cloud Data
Semantic data
Vector Data
Storage
3D city models
“Minecrafted” city
Routes
• The available space to define escape trajectory routes
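One way to use the available space for routing is a breadth-first search over free voxels. This is a minimal sketch under the assumption that free space is a set of (i, j, k) indices and movement is allowed between face-adjacent voxels; it is not presented as the routing method used in the talk, and the names are illustrative.

```python
from collections import deque


def escape_route(free, start, exits):
    """Shortest 6-connected path through free voxels from start to any exit."""
    if start not in free:
        return None
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current in exits:
            # Walk the predecessor links back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for step in steps:
            nxt = tuple(c + d for c, d in zip(current, step))
            if nxt in free and nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None


# Hypothetical L-shaped corridor of free voxels.
corridor = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)}
route = escape_route(corridor, (0, 0, 0), {(2, 1, 0)})
```

Because air and other empty space are stored as voxels too, the search runs directly on the model without first deriving a navigation graph from surfaces.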
Indoor models
Summary & future
• Simplification to speed up computations
• The uniqueness of our solution
– A voxel-based 3D city model,
– a nested column-oriented format to explore the 3D city model at different levels of detail,
– topological and geometric functionality for 3D raster manipulation that is part of the relational kernel and not an add-on.
• Spatial analysis tailored to different use case scenarios
• Future
– Geo-Spark
Questions & Ideas?
Image from: http://aboutinterviews.com/5-questions-to-not-ask-in-a-job-interview/
http://www.atlantainjurylawblog.com/files/2016/08/bright-idea.jpg