"building data streams" Константин Евтеев (avito)
TRANSCRIPT
![Page 2: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/2.jpg)
Strictly Confidential
CRUD
Diagram: a single DB handles create, update and delete, and also serves every read.
![Page 3: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/3.jpg)
Optimization steps
1 Optimize the read (select) procedures
2 Vertical and functional scaling
3 Put binary-replication standbys into production
4 Denormalize data for reading
5 Sharding
6 Move the denormalized data to separate machines
![Page 4: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/4.jpg)
C(R)UD & R
Diagram: create, update and delete go to DB; reads are served with the help of the standby DB-SB.
![Page 5: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/5.jpg)
Denormalizing data for reading
movies: id, name, length_minutes, rating_id
movie_showtimes: id, movie_id, theatre_id, room, start_time
theatres: id, name, phone_number
purchased_tickets: id, confirmation_code, purchase_price_cents
auditoriums: theatre_id, room, seats_available
orders: confirmation_code, movie_showtime_id, movie_id, theatre_id, auditorium_id, room, start_time
mat_view: movie_name, movie_rating_id, movie_length_minutes, movie_id, theatre_id, room, start_time, theatre_name, seats_available ...
http://www.slideshare.net/pavlushko/sphinx-10460333
http://www.pgcon.org/2008/schedule/attachments/64_BSDCan2008-MaterializedViews-paper.pdf
http://www.pgcon.org/2008/schedule/attachments/63_BSDCan2008-MaterializedViews.pdf
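The idea of a denormalized read table can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the column types, the sample rows and the join conditions are invented here, and the talk itself targets PostgreSQL rather than SQLite.

```python
import sqlite3

# In-memory database standing in for the normalized source schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT, length_minutes INTEGER, rating_id INTEGER);
CREATE TABLE theatres (id INTEGER PRIMARY KEY, name TEXT, phone_number TEXT);
CREATE TABLE auditoriums (theatre_id INTEGER, room TEXT, seats_available INTEGER);
CREATE TABLE movie_showtimes (id INTEGER PRIMARY KEY, movie_id INTEGER, theatre_id INTEGER,
                              room TEXT, start_time TEXT);

-- Sample data (assumed, for illustration only).
INSERT INTO movies VALUES (1, 'Solaris', 167, 3);
INSERT INTO theatres VALUES (10, 'Central', '555-0100');
INSERT INTO auditoriums VALUES (10, 'A', 120);
INSERT INTO movie_showtimes VALUES (100, 1, 10, 'A', '2016-07-01 19:00');

-- The read-side table: one wide row per showtime, so reads need no joins.
CREATE TABLE mat_view AS
SELECT m.name AS movie_name, m.rating_id AS movie_rating_id,
       m.length_minutes AS movie_length_minutes,
       s.movie_id, s.theatre_id, s.room, s.start_time,
       t.name AS theatre_name, a.seats_available
FROM movie_showtimes s
JOIN movies m ON m.id = s.movie_id
JOIN theatres t ON t.id = s.theatre_id
JOIN auditoriums a ON a.theatre_id = s.theatre_id AND a.room = s.room;
""")

rows = conn.execute(
    "SELECT movie_name, theatre_name, room, seats_available FROM mat_view"
).fetchall()
print(rows)
```

The price of the join-free read is that mat_view has to be kept up to date whenever any of the source tables change, which is what the rest of the talk deals with.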
![Page 6: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/6.jpg)
It looks like
1 Command and Query Responsibility Segregation (CQRS) or Command-query separation (CQS)
2 Event Sourcing
3 Eventual consistency
4 Lambda Architecture
5 Kappa Architecture
http://lambda-architecture.net/
http://milinda.pathirage.org/kappa-architecture.com/
![Page 7: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/7.jpg)
Data streams
Diagram: masters DB ... DBN take CUD (CRUD) traffic, each with standbys (DB-SBR, DBN-SBR) for reads. Materialized views mv1 ... mvN (Field1 - fieldN) are streamed to the read servers DB-RE-01*2 ... DB-RE-N*2, to a PL/Proxy cluster (node1, node2, node3, ...) and to the dwh.
![Page 8: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/8.jpg)
Export by time
+ it works
− there are open questions
![Page 9: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/9.jpg)
Exporting data in batches
Source → transport → Buffer → transport → Receiver
# pg_current_xlog_insert_location(): get the current transaction log insert location
master_pos=$(psql_ping_db -c "select force_wal(pg_current_xlog_insert_location())")
# pg_xlog_location_diff(location text, location text) numeric: calculate the difference between two transaction log locations
psql -c "select pg_xlog_location_diff(pg_last_xlog_replay_location(), '${master_pos}')"
https://github.com/eshkinkot/pgday2016/tree/master/ping_DB
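What the location diff measures can be sketched in Python. A transaction log location is printed as two hexadecimal numbers separated by a slash (high and low 32 bits of a byte position), so the difference is plain integer arithmetic. The helper names and sample positions below are hypothetical and not taken from the ping_DB repository.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a log location such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def location_diff(a: str, b: str) -> int:
    """Byte distance a - b, mirroring the argument order of pg_xlog_location_diff."""
    return lsn_to_bytes(a) - lsn_to_bytes(b)


# Assumed sample values: the position written on the master and the
# position replayed so far on the standby.
master_pos = "16/B374D848"
replay_pos = "16/B374D000"

lag = location_diff(replay_pos, master_pos)
caught_up = lag >= 0  # the standby has replayed at least up to master_pos
print(lag, caught_up)
```

With the same argument order as on the slide, a negative result means the standby has not yet replayed the marked position, so a batch export from it should wait.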
![Page 10: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/10.jpg)
Ticker & pgq
The year 2008! https://www.pgcon.org/2008/schedule/attachments/55_pgq.pdf
https://www.postgresql.org/docs/9.4/static/functions-info.html#FUNCTIONS-TXID-SNAPSHOT
https://github.com/markokr/skytools/tree/skytools_2_1_stable/python
https://github.com/markokr/skytools/blob/skytools_2_1_stable/sql/pgq/functions/pgq.ticker.sql
http://www.pgcon.org/2009/schedule/attachments/91_pgq.pdf
![Page 11: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/11.jpg)
MVCC (https://www.postgresql.org/docs/8.3/static/release-8-3.html)
Add several txid_*() functions to query active transaction IDs (Jan)
This is useful for various replication solutions.
![Page 12: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/12.jpg)
How to select txids that are between snapshots
Diagram: two snapshots on the txid axis, the earlier one (xmin1, xmax1) and the later one (xmin2, xmax2).
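One way to read the diagram is as a visibility test: a transaction belongs to the batch between two snapshots if the later snapshot sees it and the earlier one does not. The Python sketch below models a snapshot as xmin, xmax and a set of in-progress txids, in the spirit of the txid_* functions. The class and function names are hypothetical, and every candidate txid is assumed to have committed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmin: int                    # every txid below this is finished
    xmax: int                    # every txid at or above this is not yet visible
    xip: frozenset = frozenset() # txids in progress when the snapshot was taken

    def sees(self, txid: int) -> bool:
        """True if txid had finished by the time the snapshot was taken."""
        if txid < self.xmin:
            return True
        if txid >= self.xmax:
            return False
        return txid not in self.xip


def txids_between(prev: Snapshot, cur: Snapshot, candidates):
    """Txids that became visible after prev and no later than cur."""
    return [t for t in candidates if cur.sees(t) and not prev.sees(t)]


prev = Snapshot(xmin=100, xmax=105, xip=frozenset({101, 103}))
cur = Snapshot(xmin=103, xmax=110, xip=frozenset({103, 108}))
batch = txids_between(prev, cur, range(98, 112))
print(batch)
```

Note that txid 101 lands in the batch even though it is lower than xmax1: it was still running at the first snapshot, which is why a simple range check on txids is not enough.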
![Page 13: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/13.jpg)
All data changes are made under a single object lock (row level)
Items: Category, params values, moderation, doublets, Fee packages, …
Users: Contact information, shops, company, Afraud, payments, …
Locks: User lock, Item lock
Data: Payment data, Items data, Users data
By a lock we mean a chain of locks: select item_id from items for update;
![Page 14: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/14.jpg)
Session variables as a "signal"
Advantages:
1. Fairly cheap in resources
2. No need to untangle the chain of procedure calls and add input/output parameters
![Page 15: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/15.jpg)
Session variables, group operations, pitfalls …
Several objects can change within one transaction, while the event applies to only some of them.
Solution: use an array of unique keys, array '{}'
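The array-of-keys idea can be shown with a small Python analogy. A plain dict stands in for the session variables, and a function stands in for the row-level trigger. All names here (the variable name, the event name, the functions) are hypothetical; in PostgreSQL the same role could presumably be played by custom settings read and written with current_setting and set_config.

```python
session = {}   # stands in for per-session variables
events = []    # stands in for the event queue


def items_trigger(session, item_id):
    """Trigger analog: emit an event only for keys listed in the signal array."""
    if item_id in session.get("signal.item_closed", ()):
        events.append(("item_closed", item_id))


def bulk_update(session, changed_ids, closed_ids):
    """One 'transaction' that changes several rows but signals only some of them."""
    session["signal.item_closed"] = list(closed_ids)  # the array of unique keys
    try:
        for item_id in changed_ids:
            items_trigger(session, item_id)  # fires once per changed row
    finally:
        session["signal.item_closed"] = []   # reset so later statements are unaffected


bulk_update(session, changed_ids=[1, 2, 3], closed_ids=[2])
print(events)
```

A single boolean flag would have produced an event for all three rows; the array restricts it to the one object the event is about.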
![Page 16: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/16.jpg)
Init remote mv or counter
https://www.depesz.com/2016/06/14/incrementing-counters-in-database/
1) Create a temporary table on the subscriber side
2) Create the accept procedure
● a temporary one for initialization
● the real one
3) Start sending events
4) Run the initialization from a standby (wait for the arrival of the event from which the temporary table started to be filled)
5) Switch to the real accept procedure
6) Recount under a lock through the queue
![Page 17: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/17.jpg)
Init remote mv or counter
Provider: items (item_id, user_id, category_id, last_update_txtime) with CONSTRAINT TRIGGER items_delta AFTER INSERT OR UPDATE OR DELETE
pgq carries the events from the provider to the subscriber
Subscriber: user_items_cnt (user_id, category_id, cnt, date) and tmp_user_cnt (user_id)
1 save data for recount after init
2 copy
3 switch accept to real table
https://github.com/eshkinkot/pgday2016/tree/master/remote_cnt
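The three numbered steps can be simulated in a few lines of Python. This is a simplified sketch under assumptions, not the code from the remote_cnt repository: category_id and date are left out, the provider is a dict, and the recount reads the provider directly instead of going through the queue under a lock.

```python
from collections import Counter


class Subscriber:
    def __init__(self):
        self.user_items_cnt = Counter()  # the real counter table
        self.tmp_user_cnt = set()        # users touched while init is running
        self.accept = self.accept_tmp    # events are accepted from the start

    def accept_tmp(self, user_id, delta):
        """Step 1: only remember who changed, for a recount after init."""
        self.tmp_user_cnt.add(user_id)

    def accept_real(self, user_id, delta):
        """After the switch: apply deltas to the real table."""
        self.user_items_cnt[user_id] += delta


items = {1: "u1", 2: "u1", 3: "u2"}  # provider side: item_id -> user_id
sub = Subscriber()

snapshot = Counter(items.values())   # initial counts, as read from a standby

items[4] = "u2"                      # a change that arrives during init
sub.accept("u2", +1)

sub.user_items_cnt.update(snapshot)  # step 2: copy the snapshot
sub.accept = sub.accept_real         # step 3: switch accept to the real table

live = Counter(items.values())       # recount only the users touched during init
for user_id in sub.tmp_user_cnt:
    sub.user_items_cnt[user_id] = live[user_id]
sub.tmp_user_cnt.clear()

del items[1]                         # a change after init goes the normal way
sub.accept("u1", -1)
print(dict(sub.user_items_cnt))
```

The recount makes the result independent of whether a change that raced with the copy was already included in the snapshot or not.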
![Page 18: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/18.jpg)
Table and trigger function on provider's side
![Page 19: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/19.jpg)
Tables on subscriber's side
Accept function
![Page 20: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/20.jpg)
Provider init function
![Page 21: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/21.jpg)
Subscriber's init functions
![Page 22: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/22.jpg)
Maintenance
![Page 23: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/23.jpg)
Real time sphinx index
Main DB: mat_view_index (id, field1 … fieldN)
londiste: Main DB → DB-RE-IND
DB-RE-IND: mat_view_index (id, field1 … fieldN)
sphinx: select for the full reindex; pgq and a pgq consumer
![Page 24: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/24.jpg)
Persistent queue
Diagram: changes to table1 (id, field1 … fieldN) pass through pgq into queue tables Table1 1 … Table1 n (id, field1 … fieldN, tick_id, call_id) with lock-free rotation.
Parts of a complex object (Complex object part 1 …, id, field1 …) are collected by deferred triggers into complex object tbl 1 … n (id, field1 … fieldN, tick_id, call_id), also with lock-free rotation.
![Page 25: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/25.jpg)
Data reconciliation specifics
Eventual consistency
Diagram: Client, master (Avito_delta), reserv (Avito_delta), pgq with 2 consumers
1 Read with ver num
2 write with check version
3 (if 2) write due to delay possible in pgq
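Steps 1 and 2 resemble an optimistic, version-checked write, which can be sketched in Python as follows. The Store class, the exception and the sample values are hypothetical; the sketch only shows why a writer that read stale data is rejected, not how Avito_delta is implemented.

```python
class VersionConflict(Exception):
    """Raised when the row changed after the caller read it."""


class Store:
    def __init__(self):
        self.rows = {}  # key -> (version, value)

    def read(self, key):
        """Step 1: return the value together with its version number."""
        return self.rows[key]

    def write(self, key, value, expected_version):
        """Step 2: write only if nobody has changed the row since the read."""
        current_version, _ = self.rows[key]
        if current_version != expected_version:
            raise VersionConflict(f"{key}: expected {expected_version}, found {current_version}")
        self.rows[key] = (current_version + 1, value)


store = Store()
store.rows["item:1"] = (1, "draft")

version, _ = store.read("item:1")
store.write("item:1", "published", expected_version=version)  # succeeds, version becomes 2

try:
    # A second writer still holding version 1 (for example because its
    # data arrived late through the queue) is rejected.
    store.write("item:1", "stale", expected_version=version)
    conflict = False
except VersionConflict:
    conflict = True

print(store.read("item:1"), conflict)
```

On a conflict the caller is expected to read the row again and retry with the fresh version.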
![Page 26: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/26.jpg)
Disaster recovery
Provider crash (provider, provider sb, undo, subscriber):
1 crash
2 promote
3 apply undo
4 start consumer
Subscriber crash (provider, subscribers):
1 crash
2 stop consumer
3 copy
4 Add new consumer
5 start consumer
![Page 27: "Building data streams" Константин Евтеев (Avito)](https://reader031.vdocument.in/reader031/viewer/2022022200/58a808fc1a28ab3d6e8b5429/html5/thumbnails/27.jpg)
White Gardens Business Center, 7 Lesnaya street, Moscow, 125047, www.avito.ru
Thank you for your attention!
Константин Евтеев [email protected]
https://hh.ru/vacancy/10795267
https://hh.ru/vacancy/11463461