walb: block-level wal. concept

23
WalB: Block-level WAL for Efficient Incremental Backup Dec 2, 2010 Takashi HOSHINO Cybozu Labs, Inc.

Upload: takashi-hoshino

Post on 04-Dec-2014

666 views

Category:

Technology


1 download

DESCRIPTION

 

TRANSCRIPT

Page 1: WalB: Block-level WAL. Concept

WalB: Block-level WALfor Efficient Incremental Backup

Dec 2, 2010

Takashi HOSHINO

Cybozu Labs, Inc.

Page 2: WalB: Block-level WAL. Concept

Contents

• Motivation

• WalB

– Architecture

– Main Algorithm

– Data Format

– Pros and Cons

• Current Progress

• Summary

Page 3: WalB: Block-level WAL. Concept

Motivation

• There is no good backup solution

– Online

– Small performance overhead

– Supports various applications

– Cost-effective

• We need

– Consistent full backup

– Incremental backup

– Block-level backup

– To use commodity hardware and free software only

Page 4: WalB: Block-level WAL. Concept

Requirements inside Cybozu

• Guarantee backup interval

– within 5-10min

• Keep multiple backup archives

– also in remote site

• Keep multiple snapshots

– per 1day for 1week

Page 5: WalB: Block-level WAL. Concept

WalB

• A wrapper block device driver

– Data device to store data

– Log device to store WAL (write-ahead log)

• Related user-land tools

– Device controller

– Log extractor

• Target OS and architecture

– Latest Linux kernel (2.6.36)

– x86_64 host

Page 6: WalB: Block-level WAL. Concept

Applications

System Architecture with WalB

File System

WalB

Software RAID1

Volume0

Database System

Volume1

OS

Storage

App

For Backup

For Mirroring

Page 7: WalB: Block-level WAL. Concept

Backup vs Mirroring

Typical methods

Recoverable failures

Can keep latest data?

Backup Mirroring

Making snapshot, Logging writes

Operation miss,Application bug

No

RAID1, Replication

Failure of facilities

Yes

We need both functionalities to save data from lost.

Page 8: WalB: Block-level WAL. Concept

WalB Architecture

WrapperBlock Device(WalB Dev)

Any Block Devicefor Data (Data device)

Any Block Devicefor Log (Log device)

Read Write Log

Not special format An original format

Any Application(File System, DBMS, etc)

WalB LogExtractor

WalB DevController

Control

Page 9: WalB: Block-level WAL. Concept

WalB Functionalities

• Online incremental backup

• Online consistent full backup

• Snapshot creation/deletion

– not accessible due to no index

• Volume resize

– not resize of log capacity

Page 10: WalB: Block-level WAL. Concept

Incremental Backup with WalB

LV1

LV2

… …

LV1 bkpLV1 bkpLV1 bkp1

LV1 bkpLV1 bkpLV2 bkp1

LV1 bkpLV1 bkpLV1 bkp2

LV1 bkpLV1 bkpLV2 bkp2

Primary Server Backup Server 1 Backup Server 2

Page 11: WalB: Block-level WAL. Concept

Log Transfer

• Each log is compressed with lzop and transferred via ssh/rsh.

• Proxy may be useful for pipelined transfer to multiple sites.

• Delete original log in primary server after all backup servers have a replica.

Primary Server Backup Server 1 Backup Server 2

Log 2

Log 3

Log 2 (lzoped)

Log 1 (lzoped) Log 1 (lzoped)Log 1

Log 4

Page 12: WalB: Block-level WAL. Concept

Consistent Full Backup with WalB

Primary Server Backup Server

WalB Device(online)

Log

Full Archive(inconsistent)

Log

Apply

(A)

(B)

Start full backup

t1

Get consistent Image at t1

(C)

(A) (B) (C)

Timet0 t2

Get log fromt0 to t1

Page 13: WalB: Block-level WAL. Concept

Wait completion

Read/Write Algorithm (1)RequestQueue

•Generate logpack header•Generate request for log device•Issue the request

•Generate request for data device•Issue the request

•Send completion for upper layer•Update written_lsid•Free logpack header buffer

Wait completion

Wait completion

Write

•Generate and issue to the data device

•Send completion for upper layer

Read

Simple Algorithm

Page 14: WalB: Block-level WAL. Concept

Wait completion

Read/Write Algorithm (2)RequestQueue

•Generate logpack header•Allocate data buffer and copy data for data device write•Generate request for log device•Issue the request

•Insert to PendingTree•Send completion for upper layer•Generate request for data device•Issue the request

•Delete from PendingTree•Free data buffer for data device write•Update written_lsid•Free logpack header buffer

Wait completion

Wait completion

Write

•Search PendingTreeIf all data are in the treeThen copy data and send completion

for upper layerElse generate request for lack data and

issue the request to the data device

•If all requests have finishedThen send completion for upper layer

Read

Faster but more complex than (1)

Page 15: WalB: Block-level WAL. Concept

Parallel Task Processing

Generateread request

Generatelogpack

RequestQueue

Selectread/write

Writelogpack

Submit read request,Wait completion,Send completion

ReadQueue

LogpackQueue

LogpackCompletion

Queue

DatapackCompletion

Queue

Wait completion,Generate datapack

Wait completion,Update written_lsid,

Send completion

Writelogpack

Submitlogpack

Writelogpack

Writelogpack

Writedatapack

DatapackQueue

This is of algorithm (1)

• Data write must start after log write completion• Send completion for write must be serialized

Page 16: WalB: Block-level WAL. Concept

WalB Data Format

• Data device

– The same image as wrapping block device

• Log device

– Overview

– Snapshot metadata

– Ring buffer

– Logpack

Page 17: WalB: Block-level WAL. Concept

Log Device Format

• Snapshot metadata– The size is determined at device creation.

• Ring buffer– Stores write-ahead log.

– The size is determined at device creation.

– The size can not be changed.

Ring bufferSnapshot metadata

Address

Superblock (SECTOR_SIZE)

Reserved (PAGE_SIZE)

PAGE_SIZE = 4096 bytesSECTOR_SIZE = 512 or 4096 bytes

Page 18: WalB: Block-level WAL. Concept

Snapshot Metadata

• Snapshot header contains checksum and allocation bitmap

• Lsid: Log sequence id.

• Snapshot record size is 80 bytes. (6 records in 512 byte sector)

u32checksum

u8[64]name

SnapshotMetadata

SnapshotSector

SnapshotRecord

(fixed size)

u64 timestamp

u64lsid

u32bitmap

Snapshot Header

Page 19: WalB: Block-level WAL. Concept

Ring Buffer

start_offset

2nd

written data

Ring buffer

Log pack LogpackHeader

1st

written data …

Log packwith oldest_lsid

SECTOR_SIZE

Page 20: WalB: Block-level WAL. Concept

Logpack Header

1st 2nd 3rd …Logpack Header

u64 lsid; /* Log sequence id */u64 offset; /* IO offset by the sector. */u16 lsid_local; /* local sequence id

as the data offset in the log record. */u16 size; /* IO size by the sector. */u16 is_exist; /* 0 if this record does not exist. */u16 reserved1;

Log Record

Log Record

u32checksum

u16# of IO

u16Total IO size

Page 21: WalB: Block-level WAL. Concept

Write twice (1)

Pros and Cons

Read

Write

WalB(redo log)

Snapshot with redo log

No overhead Index search

Index modification

Snapshot with undo log

Bitmap search

Bitmap search/modification

and old data copyNo overhead (2)

WalB + Index ~= Block device with snapshot management

TypicalSoft.

--- ZFS, BtrFS LVM

Page 22: WalB: Block-level WAL. Concept

Current Progress

• Survey and study linux kernel programming

• Design of rough architecture

• Prototype implementation

• Basic evaluation and redesign if required

• Implementation of full functionalities and test

• Operation inside Cybozu

• Publication as GPLv2

• Merging to device-mapper if required

• Merging to main repository (hopefully)

Page 23: WalB: Block-level WAL. Concept

Summary

• Motivaion

– No good backup solutioncovering various applications

• WalB

– Is a block device driver with WAL

– Provides efficient incremental backup