The power of Datomic
Konrad Szydlo, @ryujinkony
Independent components
● Storage
● Reading
● Writes
● Query
Main features
● No update in place
● Only assertions and retractions
● Retraction != removal
● Immutable data
● Database as a value
● Flexible schema
● ACID transactions
● Declarative and logic data programming
Datomic's Reality model
● Based on Clojure's reality model
● Value
● Identity
● State
● Time
Value
● Never changing property
● Stable, immutable
● 44
● “John”
● 44 != 61
● Same meaning regardless of context
Identity
● Named entity in a system
● May have different values
● At different points in time
● “John”
● “Johnny”
● “ElfenLied”
State
● Value of an identity
● At a point in time
Time
● By-product of ordering states within an entity
● Linear progression
● Morning => ElfenLied
● Afternoon => John
● Evening => ElfenLied
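The four terms above map onto plain Clojure. The sketch below is only an illustration using an atom as the identity; the name nickname and the three def'd snapshots are hypothetical and not part of the talk.

```clojure
;; Identity: a stable reference whose state changes over time.
(def nickname (atom "ElfenLied"))   ; morning

;; State: the value of the identity at a point in time.
;; Once captured, the value itself never changes.
(def morning @nickname)

(reset! nickname "John")            ; afternoon
(def afternoon @nickname)

(reset! nickname "ElfenLied")       ; evening
(def evening @nickname)

;; Time: nothing more than the ordering of the states.
(def timeline [morning afternoon evening])
```

Changing the atom later does not alter morning or afternoon, which is the property Datomic extends to the whole database.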
Datom
● Basic unit of operation in Datomic
● Also called a fact
● Made up of five parts
Datom example
Entity  Attribute  Value      Tx       Operation
2439    name       John       at work  add
2439    name       John       playing  retract
2439    name       ElfenLied  playing  add
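To make "retraction != removal" concrete, here is a minimal in-memory sketch of the three datoms above. It is a hypothetical model, not Datomic's own representation: the map keys and the numeric :tx values (1 for "at work", 2 for "playing") are assumptions made for the example.

```clojure
;; Hypothetical model: each datom is a map with the five parts.
(def datoms
  [{:e 2439 :a :name :v "John"      :tx 1 :added true}
   {:e 2439 :a :name :v "John"      :tx 2 :added false}
   {:e 2439 :a :name :v "ElfenLied" :tx 2 :added true}])

(defn current-values
  "Replays datoms in transaction order and returns the set of
  [e a v] facts that are still asserted at the end."
  [datoms]
  (reduce (fn [facts {:keys [e a v added]}]
            (if added
              (conj facts [e a v])     ; assertion adds the fact
              (disj facts [e a v])))   ; retraction hides it from "now"
          #{}
          (sort-by :tx datoms)))
```

The current view contains only ElfenLied, yet all three datoms are still in the log, so the earlier name remains available to queries about the past.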
Not only “now”
● Easy access to past and future states
● What was the value of x one week ago?
● What was the state of DB before app crashed?
● Not so easy in RDBMS
Everything is data
● Datomic operates on data
● Lists, maps, vectors, keywords, strings
● Even functions and schema are data
● Possible to run queries on all data
Transactor
● Single instance
● Handles all writes
● Second instance on stand-by
● Ensures ACID properties
● Notifies transaction submitter and all peers (other applications) when transaction is persisted
Storage
● Not implemented directly in Datomic
● Leverages storage services:
– DynamoDB
– Riak
– Infinispan
– SQL storages
● Storing segments of datoms, not individual ones
● Log storage – chronological
● Indices storage – various orders of datoms
Indices
● Entity / Attribute / Value / Transaction
● Sorted order
● EAVT – all datoms (SQL “row-like” view)
● AEVT – all datoms (SQL “column-like” view)
● AVET – datoms of attributes marked as indexed or unique
● VAET – reference attributes
EAVT
● EAVT – all datoms (SQL “row-like” view)
E    A       V     Tx    Op
134  name    Tim   4592  add
586  city    32    1975  add
586  gender  male  4592  add
586  name    John  4592  add
976  name    Rob   4938  add
AEVT
● AEVT – all datoms (SQL “column-like” view)
A      E    V         Tx    Op
city   986  32        1975  add
name   134  Tim       4592  add
name   576  Rob       4938  add
name   986  John      4592  add
title  367  Lord Jim  4592  add
AVET
● AVET – datoms of attributes marked as indexed or unique
A      V         E    Tx    Op
city   32        986  1975  add
name   John      986  4592  add
name   Rob       576  4938  add
name   Tim       134  4592  add
title  Lord Jim  367  4592  add
VAET
● VAET – reference attributes
V     A       E    Tx    Op
15    author  841  1975  add
32    city    134  4592  add
269   city    576  4938  add
517   city    986  4592  add
male  gender  986  4592  add
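The index orders above are the same datoms sorted by different component orders. The sketch below shows that idea in plain Clojure. It is an illustration only: index-by is a hypothetical helper, the datoms are made-up [e a v tx] vectors, and the city value is a string here (rather than a reference) so that all values can be compared.

```clojure
;; Hypothetical datoms as [e a v tx] vectors.
(def datoms
  [[586 :name "John"   4592]
   [134 :name "Tim"    4592]
   [586 :city "Berlin" 1975]
   [976 :name "Rob"    4938]])

(defn index-by
  "Sorts datoms by the given component order, e.g. [:a :e :v :tx]."
  [order datoms]
  (let [position {:e 0 :a 1 :v 2 :tx 3}
        ;; build one sort key per datom, e.g. [a e v tx] for AEVT
        key-fn   (apply juxt (map (fn [c] #(nth % (position c))) order))]
    (sort-by key-fn datoms)))

(def eavt (index-by [:e :a :v :tx] datoms))  ; row-like: grouped by entity
(def aevt (index-by [:a :e :v :tx] datoms))  ; column-like: grouped by attribute
(def avet (index-by [:a :v :e :tx] datoms))  ; lookup by attribute and value
```

Each index keeps the datoms in their original shape; only the sort key differs, which is what makes a range scan on a leading component cheap.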
Value of DB
● View of db is a value
● Immutable
● Graph of entities and their attributes
● Direct iterable access to indices
● DB value is a param to queries and functions
Datalog
● Default query language
● Declarative – WHAT not HOW
● Logic – pattern matching
● Logical variables
Query structure
● :find – projection clause, similar to SELECT
● :in – binding arguments
– Implicit for some queries
● :where – restriction clause
● Most of the logic goes in the :where clause
Simple schema
person
  name  Joseph Conrad
  name  George Orwell
genre
  type  novel
  type  fiction
Schema
book
  :book/author  Joseph Conrad
  :book/title   Lord Jim
  :book/genre   novel
Simple Query
● Find titles for all books
[:find ?title
:where [?book :book/title ?title]]
;; clause shape: [Entity Attribute Value]
● Binding on ?title and ?book
● Find all book titles for Joseph Conrad
[:find ?title
:where [?book :book/author ?author]
[?author :person/name "Joseph Conrad"]
[?book :book/title ?title]]
● ?book resolves to the same entity
● ?author resolves to the same entity
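The way ?book and ?author tie the three :where clauses together can be sketched as a join in plain Clojure. This is only an illustration of the unification idea, not how Datomic evaluates queries; the facts set, the entity ids and titles-by are hypothetical.

```clojure
;; Hypothetical [e a v] facts standing in for a database value.
(def facts
  #{[1 :person/name "Joseph Conrad"]
    [2 :person/name "George Orwell"]
    [10 :book/title "Lord Jim"]
    [10 :book/author 1]
    [11 :book/title "Animal Farm"]
    [11 :book/author 2]})

(defn titles-by
  "Plain-Clojure equivalent of the three :where clauses above."
  [facts author-name]
  (set
   (for [[book a1 author] facts
         :when (= a1 :book/author)
         ;; same ?author: the person entity must match the book's author
         [author' a2 nm] facts
         :when (and (= a2 :person/name)
                    (= author' author)
                    (= nm author-name))
         ;; same ?book: the title must belong to the same book entity
         [book' a3 title] facts
         :when (and (= a3 :book/title)
                    (= book' book))]
     title)))
```

In Datalog the equality checks are implicit: using the same variable name in two clauses is what forces both to resolve to the same entity.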
Order in query
[:find ?title
:where
[?author :person/name "Joseph Conrad"]
[?book :book/author ?author]
[?book :book/title ?title]]
● Both clause orders return the same results
● Marginal performance difference
● Put the most specific clauses first
Even more attributes in query
[:find ?title ?author ?year
:where
[?b :book/genre :genre.type/novel]
[?b :book/year ?year]
[?b :book/author ?p]
[?p :person/name ?author]
[?b :book/title ?title]]
Pull API
● New addition to Datomic
● Allows easy access to an entity's attributes
● Provides a number of patterns for querying
Wildcard pattern
● Return all attributes with *
(pull
databasevalue
'[*]
entityid)
Specific attributes
● Use pattern to return specific attributes
(pull
databasevalue
'[:book/title :book/genre]
lordjim)
Schema with reference
{:db/id #db/id[:db.part/db]
:db/ident :book/genre
:db/valueType :db.type/ref
:db/cardinality :db.cardinality/one
:db/doc "Book's genre"
:db.install/_attribute :db.part/db}
Lookup reverse
● Navigate backwards from declared reference
● Get fiction books
(pull
databasevalue
'[:book/_genre]
:genre.type/fiction)
German philosophers
● Kepler (1571 - 1630)
● Leibniz (1646 - 1716)
● Kant (1724 - 1804)
● Hegel (1770 - 1831)
● Heidegger (1889 - 1976)
● attribute :person/influenced-by
Recursion
● Who influenced Kant?
(pull
databasevalue
'[:person/name {:person/influenced-by 5}]
kant)
● Gottfried Leibniz <= Kepler
Recursion + reverse lookup
● Who was influenced by Kant?
(pull
databasevalue
'[:person/name {:person/_influenced-by 5}]
kant)
● Hegel => Heidegger
Java methods
[:find ?name
:where
[?p :person/name ?name]
[(.startsWith ?name "M")]]
Functions
● Atomic transformations
● Integrity checks
● Constraints
DB function components
● Function is a data structure
● Declared with :db/fn
● Optional docs
● Clojure or Java
● Parameters
● Require (Clojure) or import (Java) block
● Your code
DB function
{:db/ident :getkvolume
:db/doc "Gets number of book copies in 1000s"
:db/fn #db/fn
{:lang "clojure"
:params [noofcopies]
:requires [[datomicbook.utils :refer [formatk]]]
:code [(formatk (/ noofcopies 1000) )]}}
Using db functions
(invoke
databasevalue
:getkvolume
4000)
=> ["4 K"]
Validation function
{:db/ident :validatebook
:db/fn #db/fn
{:lang "clojure"
:params [book]
:requires [[clojure.string :as str]]
:code (let [required #{:book/title :book/author}
missing (remove book required)]
(if-let [missing (seq missing)]
(throw (RuntimeException.
(str "Missing attributes: "
(str/join ", " missing))))
book))}}
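The body of the validation function can be tried outside the database as an ordinary function. The sketch below is a minimal stand-alone version; validate-book is a hypothetical name, and the only assumption beyond the slide is the ": " separator in the error message.

```clojure
(require '[clojure.string :as str])

(defn validate-book
  "Returns book unchanged, or throws if a required attribute is absent."
  [book]
  (let [required #{:book/title :book/author}
        ;; A map is a function of its keys, so (remove book required)
        ;; keeps the required keys for which book has no truthy value.
        missing  (remove book required)]
    (if-let [missing (seq missing)]
      (throw (RuntimeException.
              (str "Missing attributes: " (str/join ", " missing))))
      book)))
```

Calling (validate-book {:book/title "Victory"}) throws, because :book/author is absent. A complete map is returned as-is, so the result can be passed straight into a transaction.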
Construction function
{:db/ident :validateandconstructbook
:db/fn #db/fn
{:lang "clojure"
:params [db m]
:code (let [book (merge {:db/id
(tempid :db.part/user)}
m)
validate (-> (entity db :validatebook)
:db/fn)]
[(validate book)])}}
(transact
connection
[[:validateandconstructbook
{:book/title "Victory"
:book/author josephconrad}]])
Database filters
● Filter DB value based on some predicate
● Keep only relevant datoms
● Built-in filters and custom filters
● Filters let one set of queries operate on different db values
As-of
● Returns DB value “as of” particular point in time
● Ignores any transactions after that point
● Point-in-time could be:
– Transaction id
– java.util.Date instance
– Time-basis (t) of database
● What was the DB last week, month etc?
(findpopulation
(as-of database berlin1850txdate)
berlin)
Since
● Opposite of as-of
● Returns value of database that includes only datoms added after certain point in time
● What were the transactions after X point in time?
(findpopulation
(since database berlin1850txdate)
berlin)
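The difference between as-of and since can be sketched as two filters over the same datoms. This is a simplified model, not the Datomic API: as-of* and since* are hypothetical helpers, t is a plain counter, and retractions are ignored. Which population belongs to which t is also an assumption made for the example.

```clojure
;; Hypothetical population datoms as [e a v t].
(def datoms
  [[:berlin :city/population 39000  1]
   [:berlin :city/population 383000 2]
   [:berlin :city/population 686000 3]])

(defn as-of*
  "Keeps datoms transacted at or before t."
  [datoms t]
  (filter (fn [[_ _ _ tx]] (<= tx t)) datoms))

(defn since*
  "Keeps datoms transacted strictly after t."
  [datoms t]
  (filter (fn [[_ _ _ tx]] (> tx t)) datoms))
```

The same query function can then run unchanged against either view, which is why findpopulation above takes the database value as an argument.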
Latest changes
(def txtime #inst "2014-12-25T10:33:16.436-00:00")
(map :e (datoms (since berlin1850 txtime) :eavt))
● Get datoms from filtered db using since
● Get datoms using :eavt index
● Get entity ids from datoms
history
● Present and all of the past unfiltered
● Complete history of entity
● Or group of entities
(->> (d/q '[:find ?aname ?v ?inst
            :in $ ?e
            :where [?e ?a ?v ?tx true]
                   [?tx :db/txInstant ?inst]
                   [?a :db/ident ?aname]]
          (history database) berlin)
     (filter
      #(some #{:city/population} %))
     (sort-by #(nth % 2)))
([:city/population 260000 Wed Dec 31 08:57:09 GMT 2014]
[:city/population 383000 Thu Jan 01 06:00:55 GMT 2015])
Filtering errors
● Custom filter for:
– Incorrect datoms
– Not applicable data at some point in time
– Security reasons
● Add “wrong” population
(def berlinerror
@(transact
connection
[{:db/id berlin
:city/population 999999}]))
(def errortxid
(:tx (first (:tx-data berlinerror))))
(def errortxes
"Known bad transactions"
#{errortxid})
(defn correct?
[_ datom]
(not (contains? errortxes (:tx datom))))
(def corrected (filter (berlinhistory) correct?))
(getpopulationhistory corrected)
; id population tx
=> #{[17592186045421 383000 13194139534329]
     [17592186045421 139700 13194139534327]
     [17592186045421 163600 13194139534328]
     [17592186045421 686000 13194139534331]
     [17592186045421 63400 13194139534324]
     [17592186045421 39000 13194139534313]}
● Still more from Datomic:
– Negation in query
– Retraction
– Excision – true removal
– Partitioning
– Transactions
– with – state with proposed additions (what would happen if we did x?)
– More in depth on covered topics
Resources
● http://docs.datomic.com
● http://www.datomic.com/training.html
● http://www.datomic.com/videos.html
● http://www.learndatalogtoday.org
Thank you
● Datomic is awesome
● @ryujinkony