VMworld 2013: Architecting the Software-Defined Data Center

Architecting the Software-Defined Data Center Aidan Dalgleish, VMware David Hill, VMware Kamau Wanguhu, VMware VSVC7371 #VSVC7371

Upload: vmworld

Post on 22-Jan-2015


Category:

Technology



DESCRIPTION

VMworld 2013 Aidan Dalgleish, VMware David Hill, VMware Kamau Wanguhu, VMware Learn more about VMworld and register at http://www.vmworld.com/index.jspa?src=socmed-vmworld-slideshare

TRANSCRIPT

  • 1. Architecting the Software-Defined Data Center Aidan Dalgleish, VMware David Hill, VMware Kamau Wanguhu, VMware VSVC7371 #VSVC7371

2. Introduction
   • David Hill: Senior Solutions Architect, Global Technology Solutions; VCAP; Blog: Virtual-Blog.com; Twitter: @davehill99
   • Aidan Dalgleish: Staff Solutions Architect, Global CoE; VCDX #010; Blog: vCloudscape.com; Twitter: @aidersd
   • Kamau Wanguhu: Staff Solutions Architect, Global CoE; VCDX #003; Blog: borgcube.com; Twitter: @borgcube_
3. Agenda
   • Overview
   • Method
   • Design Patterns
   • Q/A
4-5. Definition
   Software-defined data center is an architectural approach to IT infrastructure that extends virtualization concepts such as abstraction, pooling and automation to all of the data center's resources to achieve IT as a service.
6. Three Core Tiers
7. Top 5 Value Drivers: Why Is SDDC Attractive Now?
   • Dramatic Sustainable Cost Reduction: Realize 40-75% cost reduction at full SDDC (~75% CapEx / ~56% OpEx reductions).
   • Organizational Efficiencies: New skill sets and headcount optimizations are required, and they allow much more to be done by far fewer, more cross-functional staff.
   • Datacenter Agility and Efficiency: High levels of automation and self-service drive a much more agile datacenter environment and greatly reduce time to market of services and applications.
   • Operational Efficiencies: Simplified, consolidated, and heterogeneous management and orchestration toolsets allow for a fully realized service-oriented organization.
8. SDDC Quandary: Must introduce complexity to provide simplicity
   • An SDDC is an architecture for your infrastructure.
   • This adds a layer of complexity, in the form of technology and automation, that simplifies the provisioning process for end users.
   • By incurring upfront costs, you reduce long-term operating costs.
   • [Diagram labels: Apps, Compute, Network, Storage]
9.
Software-Defined Data Center
   • Standardized Resources, Availability, Management, Pooled Resources, Automation, Security Policy, Self-Service
10. Software-Defined Data Center
   • Standardized Resources, Availability, Management, Pooled Resources, Automation, Security Policy, Self-Service
   • vSphere (vCenter/ESX), Site Recovery Manager, vCenter Operations Management Suite, vCenter Orchestrator, AMQP (RabbitMQ), vCloud Director and vCloud API, vCloud Connector, Nicira, vCloud Automation Center, vFabric Application Director, vCloud Networking and Security
11. Software-Defined Data Center
   • Standardized Resources, Availability, Management, Pooled Resources, Automation, Security Policy, Self-Service
   • vSphere (vCenter/ESX), Site Recovery Manager, vCenter Operations Management Suite, vCenter Orchestrator, AMQP (RabbitMQ), vCloud Director and vCloud API, vCloud Connector, NSX, vCloud Automation Center, vFabric Application Director, vCloud Networking and Security
   • Virtualized Compute; VDS; Storage Profiles; VMware HA; VADP (backup/restore); vCenter Operations; Configuration Manager; Infrastructure Navigator; Hyperic; Chargeback Manager; Edge, Load Balancing, VPN; App Firewall; VXLAN; Service Insertion Framework; Virtual Datacenters; vApp and Metadata; Catalog; IaaS; PaaS; Workload Mobility; Extended Network; Catalog Sync; Monitoring, Compliance, Remediation, Reporting
12. Agenda
   • Overview
   • Method
   • Design Patterns
   • Q/A
13. Method
   • Phases: Vision, Architecture, Plan, Transition, Manage
   • Across all phases: Governance, Requirements Management, Change Mgmt
   • Activities, in slide order: Articulated vision; Business and Technical Goals; Requirements, Assumptions, Constraints, Scope, Risks; Use Case Definition; Gap Analysis; Architecture Definition (Business, Information Systems, Technology); Roadmap definition; Implementation planning; Iteration planning; Implement solution; Validation; Continuous monitoring; Optimization
14. Phased Approach
   • Phase 1: Virtualize Servers
   • Phase 2: Virtualize Storage
   • Phase 3: Virtualize Network
   • Target State: Fully Automated
   • Phases can be completed in any order.
   • Server virtualization is typically the low-hanging fruit.
15. REQUIREMENTS
16. Identify Design Inputs
   There are a number of application use cases. First and foremost are production applications which must be ported from a legacy environment. Migration of these applications is critical to the success of the project.
   The development teams are interested in being able to quickly spin up self-contained application pods to allow for continuous testing without impacting other resources on the network.
   Currently no provisions have been made for additional hardware for the pilot. The available hardware consists of a number of blade server systems and a few legacy storage array systems.
   Application I/O profiles are not known and they would like a flexible way to control or ensure storage performance. They would like to minimize the capacity required and in turn the costs of the storage.
   The existing networking infrastructure is limited to a range of VLANs.
   The company's current operating model consists of separate IT teams for each major business unit. As part of ongoing research into the cloud computing model, the project sponsor has secured funding to pilot an implementation. The main project initiative is to standardize operations centrally and provide an IaaS offering to supplement existing IT processes.
   After discussing with the current BU stakeholders, they have identified a need for access to separate, dedicated environments to host their primary production applications. In addition, they would like sandbox execution environments that scale as their computing needs grow.
   Metering of the environment is desired in order to provide usage reports back to the end consumers. A later stage will involve integration of the metering system with the system that handles invoice processing for a complete chargeback model.
   The provisioning process in most business units takes anywhere from 3 days to 2 weeks due to the number of approvals needed, application dependencies, and integration with additional systems. Complete automation of the provisioning process is a key driver for the new centralized model.
   While this is a pilot, architecture of the system should follow recommended proven practices within the industry.
17. Identify Design Inputs
   There are a number of application use cases. First and foremost are production applications which must be ported from a legacy environment. Migration of these applications is critical to the success of the project.
   The development teams are interested in being able to quickly spin up self-contained application pods to allow for continuous testing without impacting other resources on the network.
   Currently no provisions have been made for additional hardware for the pilot. The available hardware consists of a number of blade server systems and a few legacy storage array systems.
   Application I/O profiles are not known and they would like a flexible way to control or ensure storage performance. They would like to minimize the capacity required and in turn the costs of the storage.
   The existing networking infrastructure is limited to a range of VLANs.
   The company's current operating model consists of separate IT teams for each major business unit. As part of ongoing research into the cloud computing model, the project sponsor has secured funding to pilot an implementation. The MAIN PROJECT INITIATIVE IS TO STANDARDIZE OPERATIONS CENTRALLY and provide an IaaS offering to supplement existing IT processes.
   After discussing with the current BU stakeholders, they have identified a need for access to separate, dedicated environments to host their primary production applications. In addition, they would like sandbox execution environments that scale as their computing needs grow.
   Metering of the environment is desired in order to provide usage reports back to the end consumers. A later stage will involve integration of the metering system with the system that handles invoice processing for a complete chargeback model.
   The provisioning process in most business units takes anywhere from 3 days to 2 weeks due to the number of approvals needed, application dependencies, and integration with additional systems. Complete AUTOMATION OF THE PROVISIONING PROCESS IS A KEY driver for the new centralized model.
   While this is a pilot, architecture of the system should follow recommended proven practices within the industry.
18. Business Requirements
   IDs: B101-B110
   • System provides separate dedicated environments.
   • Complete automation of the provisioning process.
   • System provides metering capabilities for cost reporting.
   • System leverages shared infrastructure and resource pooling.
   • System supports a catalog of standardized templates.
   • System provides differentiated offerings based on cost.
19. Technical Requirements
   IDs: T101-T110
   • Must integrate with existing ticketing system.
   • Leverage thin provisioning for storage efficiency.
   • Centralized LDAP directory to be used.
   • System supports a catalog of standardized vApp templates.
   • System provides differentiated offerings based on cost.
20. Constraints
   IDs: C101-C110
   • Dell and AMD have been preselected as the platform of choice.
   • Eight 1GbE ports will be used per server.
   • NetApp NAS storage will be used.
   • All Tier 2 NAS volumes are de-duplicated.
   • Physical switches will not be configured for QoS.
   • Existing Cisco TOR environment to be used.
   • Limited VLANs available.
21. Assumptions
   IDs: A101-A110
   • Virtualization environment configured
   • Shared storage configured
   • VLANs and IP addresses reserved
22. Agenda
   • Overview
   • Method
   • Design Patterns
   • Q/A
23.
Design Patterns
   • Server Architecture, Storage Architecture, Network Architecture, Automation Design
24. Design Patterns
   • Server Architecture, Storage Architecture, Network Architecture, Automation Design
25. Design Considerations
   • What does the environment look like today?
      • How many sites?
      • How many potential virtualization candidates?
      • Multiple waves?
   • How will this impact your Design / Project?
      • Different Cluster / Datacenter structure
      • Within the limits?
      • Sizing based on X waves / years?
   • What are the use cases?
      • Server consolidation?
      • IaaS?
      • Service Level Agreements (SLA)?
26. Compute Considerations
   • How many eggs in one basket?
      • Two sockets vs four sockets
   • Optimal memory configurations
      • 8GB DIMMs are cheaper than 2 x 4GB
      • Triple channel configurations
      • Number of DIMM slots might be different per vendor / model
   • AMD vs Intel
      • AMD supports more cores, while Intel generally is faster
      • VMmark can be used to make perf comparisons!
   • TPS vs no TPS
      • Using 64-bit Guest OSes?
      • Performance gain
   • Sweet spot?
      • Still seems to be dual socket, 96GB of memory
27. Design Considerations
   • Vendor
      • AMD vs Intel
   • Blade vs Rack
      • Density increases
      • Hot spots
      • Costs
      • Management
   • Additional considerations
      • Is embedded ESXi available?
      • How much local SSD (capacity and IOPS) can it handle?
      • Does it have built-in 2x 10 GE ports?
      • Does the built-in NIC have hardware iSCSI capability?
      • Management integration
28. Where Do We Start?
   • How many physical Datacenters will there be?
   • Will each physical DC need a vCenter Server?
   • For each vCenter, do we need multiple virtual Datacenters?
   • For each DC, do we need multiple Clusters?
   • For each Cluster, how many hosts?
   • [Diagram: hierarchy of Physical DC, vCenter, Datacenter, Cluster, ESXi host]
29. Compute Recap
   • Think about Performance
   • Think about Sizing
   • Considerations on how to scale in the future
   • SLAs
   • BC/DR
30.
Design Patterns
   • Server Architecture, Network Architecture, Automation Design, Storage Architecture
31. Design Considerations
   • Protocol Wars!
   • Multiple Tiers? Or even Auto-Tiering, what is the impact?
   • vSphere Storage APIs - Array Integration (VAAI): Does it impact sizing?
   • vSphere Storage APIs - Storage Awareness (VASA): Will it impact operations?
   • Thin provisioning?
      • Thin, Thick and Eager Zeroed Thick
      • vSphere vs Storage Array!
   • Virtual SAN
32. Design Considerations
   • Can we use Storage DRS?
      • Impact on storage array features?
      • Impact on sizing?
      • Impact on other VMware products like vCloud Director?
   • Policy-Driven Storage?
      • How does it utilize VASA?
   • Business Continuity Requirements?
      • Or possibly in the future?
   • No more worrying about block sizes with VMFS-5
      • When upgrading VMFS-3 to VMFS-5, block size does not change!
   • Did you know VAAI is T10 compliant?
      • Makes leveraging it easier for lower-end devices
33. Software Defined Storage
   • Software Defined Storage: the one remaining piece of the SDDC story
   • Much like vCNS and NSX have eliminated manual networking processes, vSAN eliminates the manual storage processes
   • Policy driven; defined instead of interpreted
   • Single data store per cluster
   • Manage at the VM layer
34. The Classic Way
   • In 5.1, storage profiles were previously used to interpret different storage capabilities
      • No standards
      • Manually created
      • No service guarantees
   • Storage DRS/SIOC could limit some guests to help
   • Multiple platforms outside the VMware interface scope
      • Limited integration
   • Result: Limited portability, significant differences between platforms, and a lot of continuing manual work on the side of the storage admin
35. The Modern Way
   • Define storage policies
   • Storage platform adapts your workload to match the policy
   • Adjust policy on the fly as needed
   • Guaranteed service levels
   • Portability
   • Once again, we are abstracting storage to be a VM property, much like CPU allocations, RAM capacities, or networking segments
36.
Designing Good Policies
   • vSAN will allow you to set almost any combination of possible attributes for a storage policy
   • What makes a good policy?
      • Redundancy
      • Failure zones
      • Disk striping (RAID)
      • Read Cache
   • What makes a bad policy?
      • No redundancy
      • Avoid waste!
         • Cache reservation on namespace (it's text!)
         • Thickness of namespace (still just text!)
         • Stripe width of namespace (getting the hint?)
      • Same policy for everything
         • A small web server with static content probably doesn't need a cache reservation, or much stripe width
37. Storage Recap
   • Size isn't everything; it's the performance that counts
   • Understand the nature of your workloads
   • Consider implications of what technology allows you to do
   • Don't overlook business continuity requirements
   • Design for failure: assume anything can and will fail
38. Design Patterns
   • Server Architecture, Automation Design, Storage Architecture, Network Architecture
39. Design Considerations
   • Physical Considerations
      • Number of Sites
      • Active/Active Sites
      • Latency between sites
      • Transport Fabric (L2 or Leaf and Spine)
      • Server Cluster locations
   • Security Considerations
      • Isolated zones (Internal, DMZ, External)
      • Virtual machine groups
      • Isolation between virtual machines
      • VPN requirements
40. Design Considerations
   • Connectivity Requirements
      • Site spanning of L2 networks
      • IP addressing preservation
      • Multiple N-S egress locations
      • E-W traffic optimizations
      • Routing requirements
   • Access Requirements
      • Access to physical hosts on same L2
      • Virtual machine IP addressing: IPv4, IPv6
      • Virtual machine reachability: Direct, NAT
41.
Overlapping IP Addressing/Existing IP Addressing
   • Design Constraint
      • Preserve existing network IP addressing scheme and application addressing scheme
      • Re-use IP subnets across tenants or Zones of Control
   • What NSX Enables
      • No changes to physical infrastructure
      • No changes in the configuration of the application
      • Reduced deployment time
      • Progressive migration to new infrastructure without disruption to application users
      • Normalized segmentation and network configuration
      • Troubleshooting benefits/Operational benefits
   • Design Solution
      • Network overlays provisioned across hypervisors on separate network domains
      • Options for unicast, multicast or hybrid physical network configurations
      • Requires only L3 connectivity (no need for trunking or VLAN extensions across network domains)
      • NAT implemented as edge service
42. Workload Mobility
   • Design Constraint
      • Application mobility limited by physical network topology
      • Disjoint L2 domains
      • Datacenter Interconnect (DCI)
      • VLAN extensions
   • What NSX Enables
      • Removes need for trunking across disjoint L2 domains
      • Removes need for presenting two network stacks to the hypervisor
      • Removes need for extending VLANs across L3 boundaries using data plane implementations such as VPLS, OTV, etc.
      • No need for a multicast-capable WAN
      • Allows preservation of current network topology and addressing scheme
      • Single management point/integration point (NSX Controller API), as opposed to having to orchestrate tasks across individual network nodes
   • Design Solution
      • Network overlays provisioned across hypervisors on separate network domains
      • L2-bridge gateway services to provide connectivity for bare-metal applications or VLAN-backed applications
43.
Use Case: Providing Security Services to Applications
   • Design Constraint
      • Application mobility imposes security challenges.
      • Lack of native security capabilities in the application.
   • What NSX Enables
      • East-West firewall services
      • North-South perimeter firewall
      • High-performance, highly scalable in-kernel firewall service
      • Security policies that follow the entire VM lifecycle
      • Centralized management of security policies with distributed enforcement of such policies
      • Policy definition can be based on vCenter objects
      • Activity Monitoring allows for policy enforcement based on user identity
   • Design Solution
      • NSX Manager deployment and administration of security services
44. NSX: How It Works
   • [Diagram labels: IP Transport Network; NSX Controller Cluster; Northbound REST API; Cloud Management Platform; Control Plane; Data Plane; Hypervisor; Gateway Service Appliance/VM; Virtual Network; VM1 to VM5; Corpnet; VLAN 9; Existing DC Network(s); sample addresses 10.1.1.10, 10.2.2.10, 11.1.1.10, 10.97.110.10, 20.1.1.2, 192.168.1.0/24]
45. Network Recap
   • Enhanced virtual machine mobility
   • Security policy decoupled from location
   • Location-independent IP addressing
   • Virtual infrastructure design flexibility
46. Design Patterns
   • Server Architecture, Storage Architecture, Network Architecture, Automation Design
47.
Opportunities for Automation
   Processes in the operating cycle that can be automated (legend: Full automation, Partial automation)
   • Plan and request
      • Self-service portal
      • Billing and charge back
      • Application configuration management
      • User access provisioning
      • Service catalog
   • Operate and monitor
      • Performance dashboards/visualization
      • Incident management
      • Automated scaling
      • Monitoring
   • General infrastructure
      • Business continuity
      • Analytics and capacity management
      • Discovery and asset management
      • Virtualization platform management
      • Physical resource management
      • Application lifecycle management
      • Multi-cloud management
   • Provision and deploy
      • Application provisioning
      • Service provisioning
      • Application orchestration
      • Security and compliance
48. Component Integration and Interoperability
   • [Diagram components: Network Services; External Systems; vSphere (vCenter/ESX); vCenter Orchestrator; vCenter Operations Manager; vCenter Configuration Manager; vCenter Infrastructure Navigator; vCenter Chargeback Manager; AMQP (RabbitMQ); vFabric Application Director; vCloud Automation Center; vCloud Connector; Site Recovery Manager; Backups; vCloud Director; vCNS Manager; Public Cloud]
   • [Diagram interfaces: vCloud API, REST API, VIM API]
   • Diagram annotations:
      • Add more vCenters for scale
      • VMware API for Data Protection (VADP)
      • Some vendors back up vCD vApp
      • No integration with SRM and vCD
      • vApp must be off
      • Network Extension uses Layer 2 VPN
      • Topology Adapter
      • v0.9.1 Compliant
      • Service Insertion Framework
      • Collects from vSphere, vCD, and vCNS Edge
      • Edge integrated w/vCD; App not
      • Customization w/ API/Adapter
      • Integrated SQL
49. Key Takeaways
   • Understanding of the key design considerations for SDDC
   • Understanding of the key components of an SDDC
   • Understanding of the key integration points of SDDC
50. Where Do I Go for More Information?
   • SDDC
      • VAPP4679: Software Defined Datacenter Design Panel for Monster VMs
      • VCM5048: Automating the Software Defined Data Center: How do I get started
      • VSVC7498: Deploying the Software Defined Data Center today, Customer perspectives and tips on optimizing VMware environments with SDDC
   • Software Defined Storage
      • STO5638: Best practices for Software Defined Storage
      • STO5027: VMware Virtual SAN Technical Best Practices
      • STO4798: Software Defined Storage: The VCDX way
   • Software Defined Networking
      • NET5716: Advanced NSX Architecture
      • NET5184: Designing your next generation data center for Network Virtualization
   • Cloud
      • PHC4750: How to build a hybrid cloud in less than a day
      • PHC5640: The story behind designing and building a distributed automation framework for vCloud Hybrid Service
51. VMware Architect Community Launch
   • Join us as we launch the VMware Architect community for the SDDC
   • Enjoy drinks, snacks, and conversation with other champions of SDDC
   • Intercontinental Hotel, The Pacific terrace
   • Wednesday 28th August, 5PM-7PM
52. Questions
53. THANK YOU
54. Architecting the Software-Defined Data Center Aidan Dalgleish, VMware David Hill, VMware Kamau Wanguhu, VMware VSVC7371 #VSVC7371
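
Worked sketch for slide 36 (Designing Good Policies). The good and bad policy traits listed on that slide could be expressed as a small checker along the following lines. This is a minimal illustration under stated assumptions, not vSAN's API: the `StoragePolicy` record, its field names, the object kinds "namespace" and "disk", and both lint functions are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical policy record; field names are illustrative, not vSAN's."""
    name: str
    failures_to_tolerate: int = 1       # redundancy
    stripe_width: int = 1               # disk striping
    cache_reservation_pct: float = 0.0  # read cache reservation
    thick: bool = False                 # thick vs thin provisioning


def lint_policy(policy, object_kind):
    """Return warnings for `policy` applied to a 'namespace' or 'disk' object."""
    warnings = []
    if policy.failures_to_tolerate < 1:
        warnings.append("no redundancy")
    if object_kind == "namespace":
        # The slide's point: the namespace is just text, so these are waste.
        if policy.cache_reservation_pct > 0:
            warnings.append("cache reservation on namespace")
        if policy.thick:
            warnings.append("thick namespace")
        if policy.stripe_width > 1:
            warnings.append("stripe width on namespace")
    return warnings


def lint_assignments(assignments):
    """Check (vm_name, object_kind, policy) tuples; return findings by key."""
    findings = {}
    for vm_name, object_kind, policy in assignments:
        warnings = lint_policy(policy, object_kind)
        if warnings:
            findings[(vm_name, object_kind)] = warnings
    # Reusing one policy for every object is itself listed as a bad practice.
    if len(assignments) > 1 and len({p for _, _, p in assignments}) == 1:
        findings[("*", "*")] = ["same policy for everything"]
    return findings


if __name__ == "__main__":
    gold = StoragePolicy("gold", stripe_width=4, cache_reservation_pct=10.0)
    basic = StoragePolicy("basic")
    report = lint_assignments([
        ("db01", "disk", gold),
        ("db01", "namespace", gold),
        ("web01", "disk", basic),
    ])
    for key, warnings in report.items():
        print(key, warnings)
```

In this sketch the same "gold" policy passes on a data disk but is flagged when attached to the namespace object, which mirrors the slide's advice to avoid spending cache or stripe width on what is only text.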