Storage Architectures and Options
Alan McSweeney
April 12, 2023 2
Objectives
• To provide high-level information on storage options and architectures for storing and managing digital camera data
• To provide indicative sample solutions
• To initiate discussions on storage configurations and options
Agenda
• Confirmation of Storage Requirements
• Data Flows and Processes
• Storage Management Architectures and Options
• Storage Management Operation, Management and Use
• Sample Solutions
Understanding of Requirements
• Storage solution to manage raw and processed map image data
• Store raw and processed data
  − No requirement to store intermediate pre-processed data
• Keep 6 months' raw and processed data on primary storage
• Keep online copy of additional data
• Keep all raw and processed data indefinitely
• Size for at least 5 years
• Deliverables
  − Draft data management/storage policy
  − SLA options on data retrieval from non-primary storage
  − Set of practical options
  − Storage management policy document
Objectives of Storage Management
• Data availability to meet service level commitments even during failures, disasters, or other forms of primary data loss
• Data protection against loss and to prevent unauthorised access
• Data retention that is compliant with regulations and standards in an unalterable state, fully audited for long periods of time
• Cost-effective storage management infrastructure
Backup and Data Archival
• Backup
  − Ensure efficient recoverability of data
  − Does not make backup data directly available
  − Optimised to bring large amounts of data back online quickly for system recovery
  − Retention management at the volume level
  − Not oriented to long-term management beyond life of current environment and media
• Archiving
  − Copy from online environment to separately managed (secure) storage to reduce cost of storage and enforce retention
  − Provides easy (ideally transparent) access for retrieval
  − Optimised to write and retrieve data at file granularity
  − File-level retention management
  − Designed to manage data over long-term, through media migration and with access auditing and controls
  − Designed to manage multiple copies of data on different media types
High Level Storage Management Architectures
• Multi-tier data storage architectures
  − Primary/Secondary
  − Primary/Secondary/Tertiary
  − Primary/Secondary and Tertiary in parallel
  − Secondary disk storage layer is purely for convenience to allow recall of data
• Advantages and disadvantages in terms of cost and service
Hierarchical Storage Management (HSM)
• HSM is a key requirement of effective (and cost-effective) storage management
• Data is migrated (moved / copied) from one storage layer to another, usually less expensive, form of storage
• A stub is created for and replaces each migrated file
  − On the local system, a stub file looks and acts like a regular file
• When user action restores a file but the user does not change the file, that file is ″re-stubbed″ during the next migration process
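The migrate, stub, recall and re-stub cycle described above can be sketched as a toy two-tier store. This is only an illustration under stated assumptions: the `HsmStore` class, its field names and the day-based clock are hypothetical and do not correspond to any vendor's HSM product.

```python
from dataclasses import dataclass, field


@dataclass
class HsmStore:
    """Toy two-tier store: primary holds file bodies or stubs, secondary holds migrated copies."""
    migrate_after_days: int
    primary: dict = field(default_factory=dict)    # name -> {"data", "last_access", "dirty"}
    secondary: dict = field(default_factory=dict)  # name -> bytes

    def write(self, name: str, data: bytes, day: int) -> None:
        # New or changed content is "dirty": secondary has no current copy of it.
        self.primary[name] = {"data": data, "last_access": day, "dirty": True}

    def is_stub(self, name: str) -> bool:
        return self.primary[name]["data"] is None

    def read(self, name: str, day: int) -> bytes:
        entry = self.primary[name]
        if entry["data"] is None:              # stub: recall transparently from secondary
            entry["data"] = self.secondary[name]
            entry["dirty"] = False             # recalled copy is unchanged
        entry["last_access"] = day
        return entry["data"]

    def migrate(self, day: int) -> list:
        """Run one migration pass and return the names that were stubbed."""
        stubbed = []
        for name, entry in self.primary.items():
            if entry["data"] is None:
                continue                       # already a stub
            idle = day - entry["last_access"]
            if entry["dirty"] and idle < self.migrate_after_days:
                continue                       # still within the primary retention interval
            if entry["dirty"]:
                self.secondary[name] = entry["data"]   # copy before stubbing
            # Recalled but unchanged files are re-stubbed without another copy.
            entry["data"], entry["dirty"] = None, False
            stubbed.append(name)
        return stubbed


store = HsmStore(migrate_after_days=180)
store.write("tile_001.tif", b"raw image", day=0)
store.migrate(day=200)                 # file is copied to secondary and stubbed
store.read("tile_001.tif", day=210)    # recall on access
store.migrate(day=211)                 # unchanged file is re-stubbed
```

The `dirty` flag is the design point: it lets the next migration pass distinguish a recalled, unchanged file (re-stub only) from a changed one (copy again after the interval).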
Primary/Secondary
• Primary Storage
  − High speed fibre-channel disk
  − Data is directly accessible
• Secondary Storage
  − Offline/nearline storage
  − Retain data indefinitely
  − Tape/optical media
• Migrate after defined interval
Primary/Secondary
• Primary Storage
• Secondary Storage
• Migrate after defined interval
• Retrieve from Secondary to Primary
Primary/Secondary/Tertiary
• Primary Storage
  − High speed fibre-channel disk
  − Data is directly accessible
• Secondary Storage
  − High capacity ATA (SATA/FATA) disk
  − Data is directly accessible
  − Data resides
• Tertiary Storage
  − Offline/nearline storage
  − Retain data indefinitely
  − Tape/optical media
• Migrate after defined interval (primary to secondary)
• Migrate after defined interval (secondary to tertiary)
Primary/Secondary/Tertiary
• Primary Storage
• Secondary Storage
• Tertiary Storage
• Migrate after defined interval (primary to secondary)
• Migrate after defined interval (secondary to tertiary)
• Retrieve from Secondary/Tertiary to Primary
Primary/Secondary and Tertiary in Parallel
• Primary Storage
• Secondary Storage
• Tertiary Storage
• Migrate after defined interval
• Take copy immediately
Hardware Options
• Disk Storage
• Tape Storage – Manual or Automated
• Optical Storage – Manual or Automated
• Hybrid devices
  − VTL (Virtual Tape Library)
  − EMC Centera
  − IBM DR550
  − Storage gateways
Hardware Options - Disk
Disk – Advantages
• Speed - FC and SATA disk technologies allow the data to be housed on the appropriate disks
• SATA drive technology has matured and can lead to decreased acquisition costs
• FC and SATA can be used within the same storage system for primary and secondary data
• Storage Virtualisation
  − Virtualise disk arrays within a storage system
  − Virtualise storage systems within a fabric
  − Thin provisioning allows over commitment of disk – reducing acquisition costs
  − Single Instance Storage (Deduplication) can be used but its effectiveness depends on the nature of the data
Hardware Options - Disk
Disk – Disadvantages
• Acquisition cost
• Disk systems do not interoperate well
• Management - multiple skill sets may be required even if all storage systems are from the same vendor
• Most hardware vendors focus on ensuring hardware resilience; data resilience is not their concern
• Operating costs – power, air conditioning, maintenance
Hardware Options – Removable Media
• Advantages
  − Control of costs
  − Keep fixed number of media within automated library unit (could keep none)
• Disadvantages
  − External media needs media management and control
    • Media management is greater for smaller capacity optical disks
  − Manual costs of media management
Hardware Options – Optical Storage
Optical Storage
• UDO (Ultra Density Optical)
  − 60 GB media capacity
• UDO media have a 50+ year life
• UDO technology roadmap - 120 GB and 240 GB media capacities
• Main vendor – Plasmon
• Resold by other vendors: HP and IBM
• WORM media option
Model | Gx24 | Gx32 | Gx80 | Gx174 | G238 | G438 | G638
Maximum Media Slots | 24 | 32 | 80 | 174 | 238 | 438 | 638
Maximum Raw Capacity (TB) – UDO2 | 1.4 | 1.9 | 4.8 | 10.4 | 14.3 | 26.3 | 38.3
Max/Min Drives | 2 / 1 | 2 / 1 | 4 / 2 | 6 / 2 | 12 / 2 | 12 / 2 | 12 / 2
Robotics Access Time (secs) | 7 | 7 | 7.3 | 8.3 | 6.2 | 6.3 | 6.4
Library Reliability (Mean Swap Between Failure) | 2,000,000 | 2,000,000 | 3,800,000
Redundant Power | NA | NA | Optional
Import/Export Slot | Single | Single | Single
Bulk Load | NA | NA | 10 disk
Optical Library and Drive Performance
• Poor performance relative to tape
• Direct access medium
• Use depends on data read (retrieval) and write volumes
Media Load Time | 5 sec
Media Unload Time | 3 sec
Average Seek Time | 35 msec
Buffer Memory | 32 MB
Max Sustained Transfer Rate - Read | 12 MB/s
Max Sustained Transfer Rate - Write | 6 MB/s (with verification)
MSBF - Mean Swap Between Failure | > 750,000 load/unload cycles
MTBF - Mean Time Between Failure | > 100,000 hours
Interface | Wide Ultra 2 LVD SCSI or USB 2.0
Single Drive/Path Tape and Optical Read and Write Performance
GB | Tape Read Time (Hours) | Tape Write Time (Hours) | Optical Read Time (Hours) | Optical Write Time (Hours)
100 | 0.2 | 0.2 | 4.6 | 2.3
200 | 0.5 | 0.5 | 9.3 | 4.6
300 | 0.7 | 0.7 | 13.9 | 6.9
400 | 0.9 | 0.9 | 18.5 | 9.3
500 | 1.2 | 1.2 | 23.1 | 11.6
600 | 1.4 | 1.4 | 27.8 | 13.9
700 | 1.6 | 1.6 | 32.4 | 16.2
800 | 1.9 | 1.9 | 37.0 | 18.5
900 | 2.1 | 2.1 | 41.7 | 20.8
1,000 | 2.3 | 2.3 | 46.3 | 23.1
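Single-drive transfer times of this kind follow directly from the sustained transfer rate. The sketch below is a rough check rather than the author's calculation: the optical rates are the 12 MB/s read and 6 MB/s write (with verification) figures from the drive specification, while the 120 MB/s tape rate is an assumption (the uncompressed rate implied by LTO4's 240 MB/second at 2:1 compression), and 1 GB is taken as 1,000 MB.

```python
def transfer_hours(gigabytes: float, rate_mb_per_s: float) -> float:
    """Hours needed to move `gigabytes` at a sustained rate, taking 1 GB = 1,000 MB."""
    seconds = gigabytes * 1000 / rate_mb_per_s
    return seconds / 3600


OPTICAL_READ_MB_S = 12   # UDO sustained read rate from the drive specification
OPTICAL_WRITE_MB_S = 6   # UDO sustained write rate, with verification
TAPE_MB_S = 120          # assumption: LTO4 uncompressed rate (240 MB/s at 2:1)

for gb in (100, 500, 1000):
    print(
        gb,
        round(transfer_hours(gb, TAPE_MB_S), 1),
        round(transfer_hours(gb, OPTICAL_READ_MB_S), 1),
        round(transfer_hours(gb, OPTICAL_WRITE_MB_S), 1),
    )
```

On these assumptions 100 GB takes about 2.3 hours at 12 MB/s and about 4.6 hours at 6 MB/s, so the 4.6-hour figure corresponds to the slower, verified-write rate.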
Hardware Options – Optical Storage
Optical – Advantages
• Reduced cost over disk
• Larger capacity media planned for the future
• Can have embedded encryption
• Long media shelf life before refresh is required
• Very reliable medium
• True WORM option
Hardware Options – Optical Storage
Optical – Disadvantages
• Low capacity
• Media must be managed offline unless multiple libraries are bought
• Low data access speed – not suited to large data volume restores
Hardware Options – Optical Storage
Optical Storage Issues
• Low medium capacity
  − UDO – 60 GB currently, 120 GB and 240 GB planned
• Tape
  − LTO-4 Ultrium 1840 – 800 GB uncompressed
  − LTO-3 Ultrium 960 – 400 GB uncompressed
Tape and Optical Media Capacities
• Optical media capacity cumulative annual increase of c. 31%
• Tape media capacity cumulative annual increase of c. 64%
Chart: Optical Media Capacity and Tape Media Capacity by year, 1995 to 2013, with Future Optical Media Capacity and Future Tape Media Capacity (left axis: Capacity GB - Past and Current, 0 to 900; right axis: Capacity GB - Future, 0 to 10,000)
Hardware Options – Tape
Tape – Advantages
• Cost
• Very well defined road map for LTO
  − LTO4 (Dec 2006) - 1.6 TB (2:1 compression) and data transfer rates of up to 240 MB/second (2:1 compression)
  − LTO5 (Planned) - 3.2 TB (2:1 compression) and data transfer rates of up to 360 MB/second (assuming a 2:1 compression)
  − LTO6 (Planned) - 6.4 TB (2:1 compression) and data transfer rates of up to 540 MB/second (assuming a 2:1 compression)
• High capacity media
• Designed for large data volume restore
• Multiple media can be streamed to aggregate capacity and speed
• Can have embedded encryption
Hardware Options – Tape
Tape – Disadvantages
• Media shelf life – medium
• Media long-term reliability
• Cumbersome single file restores
• Sequential access medium
Hardware Options – Tape Library
• Widely available from large number of vendors: Dell, HP, IBM, Quantum
  − IBM System Storage TS3500 Tape Library
  − One base frame, and up to 15 expansion frames
  − Up to 12 drives per frame (up to 192 per library)
  − Up to 5.5 PB with LTO 4 cartridges
  − LTO Fibre Channel interface for server attachment
• Very high capacity automated data management
• Long-term data storage
VTL (Virtual Tape Library)
• Hybrid units that emulate tape libraries
• Use low cost disk (and possibly tape)
• Works with existing tape backup software
• Improved backup speeds
• No removable medium backup
• Sample products
  − IBM
    • IBM Virtualization Engine TS7510
    • IBM Virtualization Engine TS7520
  − HP
    • StorageWorks Virtual Library System (VLS)
    • VLS1000i
    • VLS6000
IBM Virtualization Engine TS75x0
• TS7510
  − 96 TB capacity at 2:1 compression
  − Maximum number of virtual libraries – 128
  − Maximum number of virtual drives – 1,024
  − Maximum number of virtual cartridges – 8,192
  − Maximum number of concurrent backups – 32
• TS7520
  − 2.6 PB capacity at 2:1 compression
  − Maximum number of virtual libraries – 512
  − Maximum number of virtual drives – 4,096
  − Maximum number of virtual cartridges – 64,000
  − Maximum number of concurrent backups – 32
HP StorageWorks Virtual Library System (VLS)
• VLS1000i
  − 3 TB capacity at 2:1 compression
  − Maximum number of virtual libraries – 6
  − Maximum number of virtual drives – 12
• VLS6000
  − 105 TB capacity at 2:1 compression
  − Maximum number of virtual libraries – 16
  − Maximum number of virtual drives – 128
IBM DR550
• Uses multiple storage tiers (disk, tape, optical) within an archive
• Software - System Storage Archive Manager
• Two models
  − DR1 - 36.88 TB raw
  − DR2 - 168 TB raw
• Attached devices – support for PB capacities
  − Tape systems
  − Optical systems
• Awards
  − Data Protection Summit, Information Lifecycle Management (ILM): Best of Show, 2007
  − AIIM (The Enterprise Content Management Association): Best in Show, 2005, 2006
Software Options
HSM
• HSM is a principle; most products offer the same basic functionality
  − Automatic migration and management of data from one medium to another
  − Stubs or pointers are left in place of migrated files
  − Speed of retrieval depends on the speed of the hardware to which the files have been migrated; this gives online, near-line and off-line options
Software Options
Bridgehead Software
• Small company, employee owned
  − Can they offer the level of service and support required when really needed?
  − Are they a possible acquisition target?
• Ideal for mid-sized to large customers
  − Can it handle the levels of data over time?
CaminoSoft
• Major corporation – publicly listed and governed by SEC rules and regulations
• Primary focus is on managing file server type data
• Repackaged by vendors such as CA
Software Options
Symantec
• Major corporation
• Two products:
  − NetBackup
  − Enterprise Vault
• NetBackup
  − HSM does not support Windows
• Enterprise Vault
  − KVS staff still provide support, separate entity within Symantec
  − Focus is largely on email and compliance
  − Some integration with NetBackup
  − Files to be migrated are collected into CAB files
  − Entire CAB file recalled
  − Poor support for tape as archival medium
    • Recommended that you only use tape for data that is seldom or never accessed
Software Options
IBM – Tivoli
• Major corporation
• Vast knowledge within the company
• Extensive R&D budgets
• Agents and options from most major software and hardware vendors
Software Options
HP – File Archiver
• Major corporation
• Vast knowledge within the company
• Extensive R&D budgets
• “Simple Lightweight Solution” according to HP
Software Options
HSM Product
What is required from the chosen vendor / application?
• Stable and functionally bulletproof solution
• Easy to use
• Capable of handling files
• Capable of handling data volumes
• Must integrate with backup application (so that NetBackup does not initiate a restore when backing up or restoring stubs)
• Expert support knowledge
• Expert integration knowledge
  − These products are dependent on hardware vendors' solutions
Data Deduplication
• Store only one copy of data
• The deduplication process should be granular
  − The smaller the data block examined, the more likely it is that duplicate data will be found
• The deduplication process should be designed with minimal overhead when deduplicating (storing) and un-deduplicating (retrieving) data
  − Hardware better than software
• The deduplication process should provide resiliency to ensure that all data can be reliably stored and retrieved, even in the event of system failure
Data Deduplication
• Available for range of storage – hardware and software
  − Symantec Enterprise Vault creates an MD5 fingerprint for every file that is archived
    • If multiple files have the same hash code, only one copy of the file is physically stored
  − IBM N Series has Advanced Single Instance Storage (ASIS)
    • Hardware and block-based deduplication
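File-level single-instance storage based on a fingerprint can be sketched in a few lines. This is a simplified illustration, not Enterprise Vault's implementation: the `SingleInstanceStore` class and its method names are hypothetical, and only the idea of keying stored content on an MD5 fingerprint comes from the slide. MD5 is not collision resistant, so a real system might also compare content or use a stronger hash.

```python
import hashlib


class SingleInstanceStore:
    """Toy archive that physically stores each distinct file content once."""

    def __init__(self) -> None:
        self.blobs = {}     # fingerprint -> content, stored once
        self.catalog = {}   # file name -> fingerprint

    def archive(self, name: str, content: bytes) -> bool:
        """Archive a file; return True only if new content was physically stored."""
        fingerprint = hashlib.md5(content).hexdigest()
        is_new = fingerprint not in self.blobs
        if is_new:
            self.blobs[fingerprint] = content
        self.catalog[name] = fingerprint
        return is_new

    def retrieve(self, name: str) -> bytes:
        return self.blobs[self.catalog[name]]


archive = SingleInstanceStore()
archive.archive("Sales ed.ppt", b"slide deck")
archive.archive("Client.ppt", b"slide deck")      # identical content, not stored again
archive.archive("White paper.doc", b"white paper")
print(len(archive.catalog), "files,", len(archive.blobs), "stored copies")
```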
Deduplication in Action
• Client.ppt – identical file - 20 blocks
• Sales ed.ppt – 20 x 4K blocks
• White paper.doc – different file - 10 blocks
• Sales ed v2.ppt – edited file - 24 blocks
• Identical blocks are stored once
• With ASIS - 38 total blocks
• Without ASIS – 74 total blocks
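The block counts in this example can be reproduced with a small block-level deduplication sketch. It is illustrative only and does not describe how ASIS works internally: the SHA-256 block hash, the `make_file` helper and the synthetic file contents are assumptions, and the edited file is assumed to share 16 blocks with the original and add 8 new ones, a split chosen because it yields the 38-block total.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K blocks, as in the example


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut data into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_counts(files: dict) -> tuple:
    """Return (total blocks, unique blocks) across all files."""
    total = 0
    unique = set()
    for data in files.values():
        blocks = split_blocks(data)
        total += len(blocks)
        unique.update(hashlib.sha256(block).digest() for block in blocks)
    return total, len(unique)


def make_file(block_ids) -> bytes:
    """Hypothetical helper: build content with one distinct 4K block per id."""
    return b"".join(str(i).encode().ljust(BLOCK_SIZE, b".") for i in block_ids)


original = make_file(range(20))
files = {
    "Sales ed.ppt": original,                        # 20 blocks
    "Client.ppt": original,                          # identical file, 20 blocks
    "White paper.doc": make_file(range(100, 110)),   # different file, 10 blocks
    # Assumption: 16 blocks unchanged from the original plus 8 new blocks.
    "Sales ed v2.ppt": make_file(list(range(16)) + list(range(200, 208))),
}
total, unique = block_counts(files)
print(f"without deduplication: {total} blocks, with deduplication: {unique} blocks")
```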
Potential Deduplication Savings – Dependent on Data Types
Chart: potential savings (axis 0% to 80%) for Technical Pubs Archive, Engineering Home Directories, Medical Imaging, Database Backup, Software Archive, and Web & Microsoft Office Data
Software and Solution Design Constraints and Issues
Bottom Line
• Produce a realistic design before implementation and validate design
• Solutions must be fully tested to ensure they work as expected
• Decisions can then easily be made on the basis of the tests
• NetBackup integration must be thoroughly tested with any solution
• Primary to secondary to tertiary migration and retrievals must be tested and documented
• Misconfiguration or lack of understanding can lead to data loss or primary production system failure
• Need to look at the total cost of ownership – maintenance, power, manual effort – put a cost on all elements and activities to ensure fair comparison
• Reduced complexity – fewer components, vendors – means long-term ease of operation and use and has a genuine value
Sample Storage Capacity Planning
• Sizing issues and assumptions
  − Annual growth rate
  − Overhead for determination of actual disk storage requirements (RAID overhead, etc.)
  − Archival storage medium utilisation overhead (allowance for unfilled tapes, optical platters, RAID for VTL, etc.)
  − Storage lifecycle
  − Number of storage layers – 2 or 3
• Sample storage capacity planning scenarios
  − Annual growth rates – 0%, 10%, 20%, 30%
  − Translated into monthly growth rates for calculations - 20% annual growth = 1.531% monthly
  − Three tiers
  − Migrate from Tier 1 to Tier 2 after 6 months
  − Migrate from Tier 2 to Tier 3 after further 6 months
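The annual-to-monthly conversion is the twelfth root of the annual growth factor, which is what gives 1.531% per month for 20% per year. A minimal sketch, with `monthly_rate` as an illustrative function name:

```python
def monthly_rate(annual_rate: float) -> float:
    """Monthly compound growth rate equivalent to a given annual rate."""
    return (1 + annual_rate) ** (1 / 12) - 1


for annual in (0.0, 0.10, 0.20, 0.30):
    print(f"{annual:.0%} annual = {monthly_rate(annual):.3%} monthly")
```

Compounding the monthly rate twelve times returns the annual rate, which is why simply dividing the annual figure by 12 would overstate growth.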
Disk Space Calculations
• Storage estimates expressed as raw capacities required to accommodate data
• Includes overhead for effective usability, RAID, snapshots, online spare, less than 100% utilisation, etc.
• Primary storage after 5 years with 10% annual growth = 25,580 GB
• Equates to at least 34,533 GB of raw disk capacity
Sample Storage Capacity Planning – 0% Annual Growth Rate
Annual Growth Rate | 0%
Disk Storage Contingency, Allowance for Less Than 100% Utilisation, RAID, Other Overhead | 35%
Tape Storage Contingency, Allowance for Less Than 100% Utilisation, Other Overhead | 25%
Number of Years to Cater For in Initial Storage Solution | 5
Raw Data per Month GB | 700
Pre-processed Data Per Month GB | 2,000
Processed Data Per Month GB | 2,000
Primary Data Storage Retention Months | 6
Secondary Data Storage Retention Months | 6
Tertiary Data Copy Months | 12
Tertiary Data Storage Retention Months | 9999
Primary
Total Primary Data Per Month GB | 2,700
Total Primary Data Per Month Including Contingency and Growth GB | 3,645
Primary Storage Including Contingency GB | 21,870
Primary Storage Including Contingency and Growth GB | 21,870
Secondary
Total Secondary Data Per Month GB | 2,700
Total Secondary Data Per Month Including Contingency and Growth GB | 3,645
Secondary Storage Including Contingency GB | 21,870
Secondary Storage Including Contingency and Growth GB | 21,870
UDO Medium Capacity GB | 60
LTO4 Medium Capacity Compressed | 1600
Capacities - Annual Growth Rate – 0%
Month | Primary GB | Total Primary GB | Secondary GB | Total Secondary GB | Tertiary GB | Total Tertiary GB | UDO Medium Slots | LTO4 Media
Month 6 | 3,645 | 21,870 | 0 | 0 | 0 | 0 | 0 | 0
Month 12 | 3,645 | 21,870 | 3,645 | 21,870 | 0 | 0 | 0 | 0
Month 18 | 3,645 | 21,870 | 3,645 | 21,870 | 3,375 | 20,250 | 338 | 13
Month 24 | 3,645 | 21,870 | 3,645 | 21,870 | 3,375 | 40,500 | 675 | 25
Month 30 | 3,645 | 21,870 | 3,645 | 21,870 | 3,375 | 60,750 | 1,013 | 38
Month 36 | 3,645 | 21,870 | 3,645 | 21,870 | 3,375 | 81,000 | 1,350 | 51
Month 42 | 3,645 | 21,870 | 3,645 | 21,870 | 3,375 | 101,250 | 1,688 | 63
Month 48 | 3,645 | 21,870 | 3,645 | 21,870 | 3,375 | 121,500 | 2,025 | 76
Month 54 | 3,645 | 21,870 | 3,645 | 21,870 | 3,375 | 141,750 | 2,363 | 89
Month 60 | 3,645 | 21,870 | 3,645 | 21,870 | 3,375 | 162,000 | 2,700 | 101
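The 0% growth figures can be derived from the planning parameters with a few lines of arithmetic. The sketch below is one reading of the model rather than the author's spreadsheet: the function and constant names are illustrative, only raw and processed data are stored (700 + 2,000 GB per month), disk tiers carry 35% overhead and the tertiary tier 25%, and media counts are rounded half up because that reproduces the table (an actual order would round up).

```python
import math

RAW_GB = 700              # raw data per month
PROCESSED_GB = 2000       # processed data per month (pre-processed data is not stored)
DISK_OVERHEAD_PCT = 35    # disk contingency, RAID and utilisation allowance
TAPE_OVERHEAD_PCT = 25    # removable-media contingency and utilisation allowance
PRIMARY_MONTHS = 6        # retention on primary before migration
SECONDARY_MONTHS = 6      # retention on secondary before migration
UDO_GB = 60
LTO4_GB = 1600


def media_count(total_gb: float, medium_gb: int) -> int:
    """Number of media, rounded half up to match the table."""
    return math.floor(total_gb / medium_gb + 0.5)


def plan(month: int) -> dict:
    """Capacity held on each tier at the end of `month`, assuming 0% growth."""
    stored = RAW_GB + PROCESSED_GB
    disk_per_month = stored * (100 + DISK_OVERHEAD_PCT) / 100   # 3,645 GB
    tape_per_month = stored * (100 + TAPE_OVERHEAD_PCT) / 100   # 3,375 GB
    months_on_primary = min(month, PRIMARY_MONTHS)
    months_on_secondary = min(max(month - PRIMARY_MONTHS, 0), SECONDARY_MONTHS)
    months_on_tertiary = max(month - PRIMARY_MONTHS - SECONDARY_MONTHS, 0)
    total_tertiary = tape_per_month * months_on_tertiary
    return {
        "total_primary_gb": round(disk_per_month * months_on_primary),
        "total_secondary_gb": round(disk_per_month * months_on_secondary),
        "total_tertiary_gb": round(total_tertiary),
        "udo_slots": media_count(total_tertiary, UDO_GB),
        "lto4_media": media_count(total_tertiary, LTO4_GB),
    }


for month in range(6, 61, 6):
    print(month, plan(month))
```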
Storage Capacities - 0% Annual Growth Rate
Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB by month, Month 1 to Month 58 (GB axis 0 to 180,000)
Media Requirements - 0% Annual Growth Rate
Chart: number of media by month, Month 1 to Month 57, for UDO Medium Slots, LTO4 Media and LTO3 Media (axis 0 to 3,000)
Sample Storage Capacity Planning – 10% Annual Growth Rate
Annual Growth Rate | 10%
Disk Storage Contingency, Allowance for Less Than 100% Utilisation, RAID, Other Overhead | 35%
Tape Storage Contingency, Allowance for Less Than 100% Utilisation, Other Overhead | 25%
Number of Years to Cater For in Initial Storage Solution | 5
Raw Data per Month GB | 700
Pre-processed Data Per Month GB | 2,000
Processed Data Per Month GB | 2,000
Primary Data Storage Retention Months | 6
Secondary Data Storage Retention Months | 6
Tertiary Data Copy Months | 12
Tertiary Data Storage Retention Months | 9999
Primary
Total Primary Data Per Month GB | 2,700
Total Primary Data Per Month Including Contingency and Growth GB | 3,645
Primary Storage Including Contingency GB | 21,870
Primary Storage Including Contingency and Growth GB | 32,020
Secondary
Total Secondary Data Per Month GB | 2,700
Total Secondary Data Per Month Including Contingency and Growth GB | 3,645
Secondary Storage Including Contingency GB | 21,870
Secondary Storage Including Contingency and Growth GB | 32,020
UDO Medium Capacity GB | 60
LTO4 Medium Capacity Compressed | 1600
Capacities - Annual Growth Rate – 10%
Month | Primary GB | Total Primary GB | Secondary GB | Total Secondary GB | Tertiary GB | Total Tertiary GB | UDO Medium Slots | LTO4 Media
Month 6 | 3,823 | 22,459 | 0 | 0 | 0 | 0 | 0 | 0
Month 12 | 4,010 | 23,586 | 3,823 | 22,459 | 0 | 0 | 0 | 0
Month 18 | 4,205 | 24,737 | 4,010 | 23,586 | 3,713 | 21,723 | 362 | 14
Month 24 | 4,410 | 25,945 | 4,205 | 24,737 | 3,894 | 44,447 | 741 | 28
Month 30 | 4,626 | 27,211 | 4,410 | 25,945 | 4,084 | 68,280 | 1,138 | 43
Month 36 | 4,851 | 28,539 | 4,626 | 27,211 | 4,283 | 93,276 | 1,555 | 58
Month 42 | 5,088 | 29,932 | 4,851 | 28,539 | 4,492 | 119,492 | 1,992 | 75
Month 48 | 5,337 | 31,393 | 5,088 | 29,932 | 4,711 | 146,988 | 2,450 | 92
Month 54 | 5,597 | 32,925 | 5,337 | 31,393 | 4,941 | 175,826 | 2,930 | 110
Month 60 | 5,870 | 34,533 | 5,597 | 32,925 | 5,183 | 206,071 | 3,435 | 129
Storage Capacities - 10% Annual Growth Rate
Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB by month, Month 1 to Month 58 (GB axis 0 to 250,000)
Media Requirements - 10% Annual Growth Rate
Chart: number of media by month, Month 1 to Month 57, for UDO Medium Slots, LTO4 Media and LTO3 Media (axis 0 to 3,500)
Sample Storage Capacity Planning – 20% Annual Growth Rate
Annual Growth Rate | 20%
Disk Storage Contingency, Allowance for Less Than 100% Utilisation, RAID, Other Overhead | 35%
Tape Storage Contingency, Allowance for Less Than 100% Utilisation, Other Overhead | 25%
Number of Years to Cater For in Initial Storage Solution | 5
Raw Data per Month GB | 700
Pre-processed Data Per Month GB | 2,000
Processed Data Per Month GB | 2,000
Primary Data Storage Retention Months | 6
Secondary Data Storage Retention Months | 6
Tertiary Data Copy Months | 12
Tertiary Data Storage Retention Months | 9999
Primary
Total Primary Data Per Month GB | 2,700
Total Primary Data Per Month Including Contingency and Growth GB | 3,645
Primary Storage Including Contingency GB | 21,870
Primary Storage Including Contingency and Growth GB | 45,350
Secondary
Total Secondary Data Per Month GB | 2,700
Total Secondary Data Per Month Including Contingency and Growth GB | 3,645
Secondary Storage Including Contingency GB | 21,870
Secondary Storage Including Contingency and Growth GB | 45,350
UDO Medium Capacity GB | 60
LTO4 Medium Capacity Compressed | 1600
Capacities - Annual Growth Rate – 20%
Month | Primary GB | Total Primary GB | Secondary GB | Total Secondary GB | Tertiary GB | Total Tertiary GB | UDO Medium Slots | LTO4 Media
Month 6 | 3,993 | 23,016 | 0 | 0 | 0 | 0 | 0 | 0
Month 12 | 4,374 | 25,274 | 3,993 | 23,016 | 0 | 0 | 0 | 0
Month 18 | 4,791 | 27,687 | 4,374 | 25,274 | 4,050 | 23,163 | 386 | 14
Month 24 | 5,249 | 30,329 | 4,791 | 27,687 | 4,437 | 48,413 | 807 | 30
Month 30 | 5,750 | 33,224 | 5,249 | 30,329 | 4,860 | 76,072 | 1,268 | 48
Month 36 | 6,299 | 36,395 | 5,750 | 33,224 | 5,324 | 106,371 | 1,773 | 66
Month 42 | 6,900 | 39,869 | 6,299 | 36,395 | 5,832 | 139,562 | 2,326 | 87
Month 48 | 7,558 | 43,674 | 6,900 | 39,869 | 6,389 | 175,921 | 2,932 | 110
Month 54 | 8,280 | 47,843 | 7,558 | 43,674 | 6,998 | 215,750 | 3,596 | 135
Month 60 | 9,070 | 52,409 | 8,280 | 47,843 | 7,666 | 259,381 | 4,323 | 162
Storage Capacities - 20% Annual Growth Rate
Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB by month, Month 1 to Month 58 (GB axis 0 to 250,000)
Media Requirements - 20% Annual Growth Rate
Chart: number of media by month, Month 1 to Month 57, for UDO Medium Slots, LTO4 Media and LTO3 Media (axis 0 to 4,500)
Sample Storage Capacity Planning – 30% Annual Growth Rate
Annual Growth Rate | 30%
Disk Storage Contingency, Allowance for Less Than 100% Utilisation, RAID, Other Overhead | 35%
Tape Storage Contingency, Allowance for Less Than 100% Utilisation, Other Overhead | 25%
Number of Years to Cater For in Initial Storage Solution | 5
Raw Data per Month GB | 700
Pre-processed Data Per Month GB | 2,000
Processed Data Per Month GB | 2,000
Primary Data Storage Retention Months | 6
Secondary Data Storage Retention Months | 6
Tertiary Data Copy Months | 12
Tertiary Data Storage Retention Months | 9999
Primary
Total Primary Data Per Month GB | 2,700
Total Primary Data Per Month Including Contingency and Growth GB | 3,645
Primary Storage Including Contingency GB | 21,870
Primary Storage Including Contingency and Growth GB | 62,463
Secondary
Total Secondary Data Per Month GB | 2,700
Total Secondary Data Per Month Including Contingency and Growth GB | 3,645
Secondary Storage Including Contingency GB | 21,870
Secondary Storage Including Contingency and Growth GB | 62,463
UDO Medium Capacity GB | 60
LTO4 Medium Capacity Compressed | 1600
Capacities - Annual Growth Rate – 30%
Month | Primary GB | Total Primary GB | Secondary GB | Total Secondary GB | Tertiary GB | Total Tertiary GB | UDO Medium Slots | LTO4 Media
Month 6 | 4,156 | 23,545 | 0 | 0 | 0 | 0 | 0 | 0
Month 12 | 4,739 | 26,937 | 4,156 | 23,545 | 0 | 0 | 0 | 0
Month 18 | 5,403 | 30,713 | 4,739 | 26,937 | 4,388 | 24,575 | 410 | 15
Month 24 | 6,160 | 35,019 | 5,403 | 30,713 | 5,003 | 52,398 | 873 | 33
Month 30 | 7,024 | 39,927 | 6,160 | 35,019 | 5,704 | 84,122 | 1,402 | 53
Month 36 | 8,008 | 45,524 | 7,024 | 39,927 | 6,503 | 120,292 | 2,005 | 75
Month 42 | 9,131 | 51,906 | 8,008 | 45,524 | 7,415 | 161,532 | 2,692 | 101
Month 48 | 10,410 | 59,182 | 9,131 | 51,906 | 8,454 | 208,554 | 3,476 | 130
Month 54 | 11,870 | 67,477 | 10,410 | 59,182 | 9,639 | 262,167 | 4,369 | 164
Month 60 | 13,534 | 76,936 | 11,870 | 67,477 | 10,991 | 323,294 | 5,388 | 202
Storage Capacities - 30% Annual Growth Rate
Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB by month, Month 1 to Month 58 (GB axis 0 to 250,000)
Media Requirements - 30% Annual Growth Rate
Chart: number of media by month, Month 1 to Month 57, for UDO Medium Slots, LTO4 Media and LTO3 Media (axis 0 to 5,000)
10 Year Data Storage Capacities – Different Growth Rates
Chart: Total Primary GB, Total Secondary GB and Total Tertiary GB at 10%, 20% and 30% annual growth, Month 6 to Month 120 (GB axis 0 to 1,800,000)
Single Drive/Path Tertiary Layer Data Write Times – Tape and Optical
Chart: Tape Write Time Hours and Optical Write Time Hours at 10%, 20% and 30% growth, Month 1 to Month 117 (Hours axis 0 to 2,000)
Implementation Options
• Factors:
  − 2 or 3 tiers
  − Optical, tape or VTL as the last tier
  − Use of existing storage (HP/Dell) or new storage
  − DR or no DR
    • Offsite manual copy or replication
  − Software HSM – use existing NetBackup or other: HT FileStore, CaminoSoft, IBM Tivoli
Spectrum of Options
All disk
DR option with replicated data
Primary disk
Secondary - tape
Mixed disk/tape/optical/VTL/manual/automated
Data Retrieval Operation
• Secondary disk
  − Data is retrieved to primary immediately – available within seconds/minutes
• Secondary/tertiary VTL
  − Data is retrieved to primary immediately – available within minutes
• Secondary/tertiary tape library
  − Data is retrieved to primary immediately – available within minutes
• Secondary/tertiary optical library
  − Data is retrieved to primary immediately – available within hours
• Manual media retrieval
  − Retrieval time depends on media location and staff allocated to media handling
Sample Options
• Three tiers – optical or tape library as third tier
• All disk
• Reuse/expand existing hardware
• Low cost ATA disks for secondary storage
• Not all available options – presented for review and feedback
Physical Option 1 – Three Tiers – Optical or Tape
Physical Option 1 - Components
• Primary storage – SAN with fibre disk
• Secondary storage – SAN with ATA disk
• Tertiary storage – optical library
• Software
  − HT FileStore
  − CaminoSoft
  − NetBackup Storage Migrator
  − Tivoli Storage Manager
Resilience
• Primary storage mirrored for resilience
Operation and Service Level Agreement
Physical Option 2 – All Disk Configuration
• All disk storage option
• Two mirrored sites with real-time replication
• Multiple replicated components for resilience
• Sample configuration
  − Primary Storage
    • Clustered SAN Controllers with 594 x 300 GB Fibre Channel Drives = 151 TB Raw Storage
  − Secondary Storage
    • Clustered SAN Controllers with 336 x 750 GB SATA Drives = 252 TB Raw Storage
  − Total 403 TB of Raw Storage capacity (doubled for DR)
All Disk Configuration
Resilience – Multiple Points of Redundancy
Resilience
• SAN switches
• SAN controllers
• Two disks per shelf
• Entire site
All Disk Configuration
• Indicative hardware and software (replication, snapshot) cost
  − €1.8 million
  − €4,460 per TB (doubled for DR)
• 5 standard racks in each location
• Does not include
  − HSM software
  − Installation and commissioning
• Represents high water mark in terms of costs and functionality
All Disk Configuration
Advantages
• High performance
• Low manual intervention
• Highly resilient
Disadvantages
• High cost of acquisition and operation
• Growth in data volumes means additional expense
• No upper limit on cost
Physical Option 3 – Existing Hardware
• Raw, pre-processed and processed data resides on HP EVA
• Replicated continuously to second EVA
• Dell CX disk array used as secondary location
• Existing ADIC LTO drives used for tertiary and long term offsite storage
Existing Hardware
Advantages
• Cost
• Some skill sets already in organisation
Disadvantages
• Investment in old technology
• Software based HSM product skills required
Introduction of Tertiary Device
• Existing HP and Dell storage still employed
• UDO or LTO device used as final destination before removal to offsite archive
Introduction of Tertiary Device
Advantages
• Cost – use of existing hardware
• Some skill sets already in organisation
• Media life is increased with UDO
Disadvantages
• Cost – UDO or new tape library
• Management of archived media – especially UDO as they are low capacity
• Investment in old technology
• Software based HSM product skills required
• UDO retrieval speeds
Virtual Tape Library
• VTL device will act as a tape library
• VTL will be secondary location
• HSM product skills may not be required
• NetBackup could manage this process
• VTL data will ultimately be archived to tape via ADIC tape library
Virtual Tape Library
Advantages
• Some skill sets already in organisation
• No new third party migration tool absolutely necessary
• Extension of NetBackup system using NetBackup Storage Migrator
Disadvantages
• Cost – VTL with required capacity can be expensive
• Cannot take VTL backups offsite – tertiary solution still required
• Lack of vendor implementation experience
Physical Option 4 – Disk Based Secondary Information Store
• Single storage device with multiple PB of data scalability
• Data can be retained on information store for 15+ years
• 1 TB disks make this possible
• Data can be moved to storage attached tape
• Internal backup features of information store can aid NetBackup routine (SnapShots, Vaulting)
Disk Based Information Store
Advantages
• Speed of retrieval
• No new third party migration tool absolutely necessary
• Simplicity
• Integration with NetBackup – no effect on daily backup routines
• Information store can be split across multiple information stores to give multiple PB capacity if required
Disadvantages
• Cost – may be expensive initially but storage can be added over time as needed
Central Management – Storage Virtualisation
• Controller sits above storage systems
• Handles day to day management of storage across all platforms
Advantages
• Skill set consolidation
• Costs
Disadvantages
• Vendor-based skills are still ultimately required
Key Questions
• Number of storage tiers and preferred configuration
• Use of tape/optical/VTL
• Software HSM option
• Disaster recovery/business continuity requirements and options
• Capacity planning constraints and assumptions
• New hardware or reuse of existing hardware
• Level of automation required for archival level
• Financial constraints and budget available
• Implementation schedule