When Tools Attack

Post on 29-Nov-2014

DESCRIPTION

Even with the best intentions in mind, a tool or an automated process can be created and deployed with devastating results to your Perforce server instance. Learn from someone with years of experience dealing with rogue tools, hearts of gold, and hands of destruction on what to do and what to avoid when automating Perforce.

TRANSCRIPT

#

Chris Makin, IT Infrastructure at Playground Games

When Tools Attack

#

Chris Makin
IT Infrastructure Administrator, Playground Games

Chris Makin is a battle-scarred veteran of Perforce server planning, implementation and administration. With 11 years of IT service to the video games industry he proudly serves at Playground Games as IT Infrastructure Administrator. If not elbow deep in servers he can be found trying to sample every craft beer under the sun.

#

• Sharing real-world examples of custom tools with hearts of gold and hands of destruction!
• The impacts & issues caused
• Fixes, resolutions & preventative actions
• Tips for server setup & tools testing

When Tools Attack

#

• Founded in 2009 with 16 staff
• 20+ years experience in game development
• Turn10 & Microsoft Studios

October 2012 September 30th 2014

#

• Over 150 developers at peak
• Outsource teams across the globe
• 17 Perforce Server Instances
  – 3 core & 4 supporting P4Ds
  – Further 10 replicas for workload balancing, DR & HA
  – Linux & Windows, VMWare & Bare metal
  – Nimble, EqualLogic & DAS

#

Horizon

• Single P4D server
  – 4TB total size
• 100 developers
• 12 build servers
• 13 solid hours lock time per day for automated systems

Horizon 2

• 3 Core P4D servers
  – 10TB total size
  – 1TB at peak change
• 24 build servers
• Builds over 150GB
• 485,000 ops completed
• 340,000 automated
• Lock contention – gone

#

• Deadliest Catch
  – Trawling the depths.
• Tor’s Hammer
  – An internal cyber attack!
• Skynet
  – Ignore it for too long and it’s taken over the world.

The Bad Guys

#

Deadliest Catch: Trawling the depths

#

• Tool built to create a depot heat map
• “p4 files @=clnumber”
• Started at CL 1
  – Worked its way upwards.
  – 9 streams
• High latency connection

Deadliest Catch

#

• Unresponsive server
• Commands queue
• Human element
  – F5, F5, F5, F5…
• Database lock contention

Deadliest Catch Impacts

#

• Lower tool polling frequency
• Lower thread count
• Forward commands to replica, on or offsite
• Trigger or P4Broker limiting number of concurrent operations per user
• Upgrade to 2013.3 or higher for lockless reads

Deadliest Solutions
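The first two fixes (lower polling frequency, lower thread count) can also be enforced inside the tool itself. Below is a minimal sketch, not the tool from the talk: `ThrottledRunner` and its parameters are hypothetical names, and the wrapped function stands in for whatever issues the Perforce command.

```python
import threading
import time


class ThrottledRunner:
    """Caps calls in flight and enforces a minimum gap between call starts.

    Hypothetical helper for illustration; not part of any Perforce API.
    """

    def __init__(self, max_concurrent, min_interval_s,
                 clock=time.monotonic, sleep=time.sleep):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_start = 0.0

    def run(self, func, *args):
        # At most max_concurrent calls may hold a slot at once.
        with self._slots:
            # Reserve the next permitted start time under a lock.
            with self._lock:
                now = self._clock()
                start_at = max(now, self._next_start)
                self._next_start = start_at + self._min_interval_s
            delay = start_at - now
            if delay > 0:
                self._sleep(delay)
            return func(*args)
```

A tool would create one shared runner, for example `ThrottledRunner(max_concurrent=2, min_interval_s=0.5)`, and route every server query through `run`. The limits shown are illustrative values, not recommendations from the talk.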

#

Tor’s Hammer: Internal cyber attack!

#

• Level editor with Perforce integration
  – Have my files been updated?
  – Queries workspace #have against #head
• Query was being carried out on each file individually
  – 25,000 files
  – 150 developers
  – 3,750,000 queries
  – Every second

Tor’s Hammer
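The arithmetic on this slide, and the batched alternative it implies, can be sketched as follows. This is an illustration under assumptions: `stale_files` takes records already parsed from a single batched query, with the `depotFile`, `haveRev` and `headRev` field names that `p4 fstat` reports. How the records are fetched and parsed is left out, and the function names are hypothetical.

```python
def per_file_query_count(files_per_workspace, developers):
    """Queries issued per polling cycle when every file is checked individually."""
    return files_per_workspace * developers


def stale_files(records):
    """Return depot paths whose have revision is behind the head revision.

    `records` is assumed to be an iterable of dicts with string values for
    depotFile, headRev and (if the file is synced) haveRev.
    """
    stale = []
    for rec in records:
        have = int(rec.get("haveRev", 0))  # never synced: treat as revision 0
        head = int(rec["headRev"])
        if have < head:
            stale.append(rec["depotFile"])
    return stale


# 25,000 files checked one by one by 150 developers
print(per_file_query_count(25_000, 150))
```

With one batched query per workspace, the same polling cycle costs 150 requests rather than 3,750,000, and the comparison happens client-side.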

#

• You are now under DDoS attack!
• Perforce server stops responding to all requests
  – P4Auth stops responding
• TCP flood
• OS level/network stack issue
  – TCP/IP port exhaustion

Tor’s Hammer Impacts

#

• Hang, draw and quarter tools programmer
• Local firewall
  – Flood protection rules
• Tune network stack
  – Max available ports
  – Min keep alive time

Tor’s Hammer Solutions
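Why the two network-stack settings matter can be shown with a rough back-of-the-envelope model. It is an approximation, not a measurement from the talk: it assumes each outgoing connection holds one ephemeral port for a fixed time before the port can be reused. The function name and the example numbers are illustrative assumptions.

```python
def max_sustained_connection_rate(ephemeral_ports, hold_seconds):
    """Approximate new connections per second before ports run out.

    Assumes every connection occupies one ephemeral port for hold_seconds
    (connection lifetime plus any wait before the port is reusable).
    """
    if hold_seconds <= 0:
        raise ValueError("hold_seconds must be positive")
    return ephemeral_ports / hold_seconds


# Illustrative values only: 16,384 usable ports, each held for 240 seconds.
rate = max_sustained_connection_rate(16_384, 240)
print(f"{rate:.1f} connections/second")
```

Under this model, raising the number of available ports or shortening the time each port is held both raise the ceiling, which is what the two tuning bullets aim at.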

#

Skynet: Ignore it for too long and it’s taken over the world

#

• World editor
• Designed so every file is self-descriptive with a UID
  – \\game\level1\walls\wall1_walls_level1_123456789.png
• In the backend this turns into
  – D:\p4depot\game\level1\walls\wall1_walls_level1_123456789.png,d\1.12345.gz

Skynet

#

• In practice
  – \\game1\mainline\data\level_data\level1\objects\textures\walls\wall1_walls_textures_objects_level1_level_data_data_mainline_1234567890.png
  – \\game1\mainline\data\level_data\level1\objects\textures\walls\wall1_walls_textures_objects_level1_level_data_data_mainline_mainline_data_level_data_level1_objects_textures_walls_wall1_1234567890.png

• 199 chars compared to original 54

Skynet

#

• Project was heavily branched & continuously integrated with > 100,000 files
• Integrations took longer
• Exponential metadata growth – GB per day
• Higher RAM, swap & CPU utilization
• Windows OS & proxy path length issues

Skynet Impacts

#

• Send a cyborg back in time to stop the tools change submit

• Trigger/P4Broker rule inspecting file & path length for irregularities/max allowed length

Skynet Solutions
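The length check such a trigger or broker rule would perform can be sketched as below. This shows only the check itself: how the script obtains the file list for the submitted change, and how it rejects the submit, is left out. `overlong_paths` and both thresholds are hypothetical and would need tuning per project.

```python
def overlong_paths(paths, max_path_len=200, max_name_len=100):
    """Return the paths whose full length or file-name length exceeds a limit.

    Both limits are illustrative assumptions, not Perforce defaults.
    """
    offenders = []
    for path in paths:
        # Treat backslash and forward-slash separators alike.
        name = path.replace("\\", "/").rsplit("/", 1)[-1]
        if len(path) > max_path_len or len(name) > max_name_len:
            offenders.append(path)
    return offenders


submitted = [
    "//game1/mainline/data/level_data/level1/objects/textures/walls/wall1_1234567890.png",
    "//game1/mainline/data/" + "wall1_walls_textures_objects_" * 5 + "1234567890.png",
]
for path in overlong_paths(submitted):
    print("path or file name too long:", path)
```

A trigger built on this would exit with a non-zero status when the returned list is non-empty, so the runaway naming is stopped at submit time rather than discovered after the metadata has grown.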

#

The Good Guys

#

• Perforce Support
• P4D 2013.3 or higher
  – Lockless reads!
• Replica & Edge servers
  – Offload locks, CPU & I/O intensive tools and workloads
• P4Broker & Triggers
  – Don’t like a command? Block or re-direct it!
• “Side-track” server instance

The Good Guys

#

Server Setup & Tools Testing

#

• Metadata replica
  – Offline checkpoints, additional replicas, no live interruption
• Enable process monitoring
• Monitor server
• Pay attention to your type map

Server Setup

#

• Every tool has an impact – TEST!
• Test against real data
  – Metadata & full replicas
• Set a high level of logging
  – Utilize Perforce Server Log Analyzer
• Monitor system utilization
  – CPU, RAM, disk I/O…

Tools Testing

#

Thank you!
Chris Makin
chris.makin@playground-games.com

#

• http://answers.perforce.com/articles/KB_Article/Setting-Up-a-Side-track-Server

• http://answers.perforce.com/articles/KB_Article/How-to-Monitor-a-Swamped-Perforce-Server

• http://answers.perforce.com/articles/KB_Article/Installing-P4Broker-on-Windows-and-Unix-systems

• http://answers.perforce.com/articles/KB_Article/Using-P4Broker-With-Replica-Servers

• https://kb.perforce.com/psla/

• http://www.perforce.com/blog

Useful Links
