adrian ball cv 2015-07 v0.7
ADRIAN BALL
Senior Linux/Solaris System Administrator
[email protected] 623647
http://uk.linkedin.com/pub/adrian-ball/15/214/842/
SUMMARY
Available for contract work from September 2015.
Enterprise systems design, administration, troubleshooting and support
Scripting and automation
Sun-certified
Security-cleared (SC)
Conversant in the ITIL framework.
KEY SKILLS
Solaris (2.x to 10 and SunOS 3/4); Red Hat Linux; System Administration; Systems Engineering
Unix Shell Scripting (bash / ksh); Sun/Oracle/Dell hardware; Systems and Infrastructure Design; DNS
Disaster Recovery; Backup and Recovery; High Availability; ZFS
Sun Cluster (HA); NFS; Solaris Volume Manager (DiskSuite); CentOS
PERL, TK/TCL; SSH; Veritas VxVM/VxFS (storage foundation); RHN/Kickstart
Ubuntu/Debian/Mint Linux; HPC cluster; Oracle Enterprise Linux; Remedy/ITIL
Jumpstart/JET; SAN / EMC Clariion; NetBackup & Legato Networker; Team Leadership
Solaris LDOMs (OVM) & zones; NTP; Technical Documentation; Production Support
Sendmail; Apache; Troubleshooting/problem solving; Vendor Relationships
Puppet; DRBD/Corosync; LDAP integration; TCP/IP
ABOUT ME
I have over twenty years of experience designing, testing, building, documenting, automating and supporting Enterprise Unix systems. With a background in programming and scripting, I look for ways to make systems work together better, and I design and implement improvements in addition to core activities.
My technical skills are hands-on, and I'm happy dealing with a cross-section of disciplines and making them work cohesively. I have direct experience with storage, databases, applications, monitoring/alerting, packaging, data transfer, networking, web servers and other network services, mail, performance tuning and backup & recovery, in most cases in considerable depth. Where dedicated teams look after these areas, I have often assumed a project technical coordination role.
I'm comfortable dealing with difficult and unfamiliar technical issues, often troubleshooting live production problems. In pressured situations, I deliberately take a careful and precise approach, gathering evidence to correctly identify the root cause of the problem and communicating with stakeholders as necessary until a resolution is reached.
I am an advocate of sharing knowledge effectively, and of clearly communicating and documenting ideas and processes. I will take the time to record useful information for others, and correct out-of-date instructions as they are encountered.
Adrian Ball – Curriculum Vitae – 07/2015 - page 1/8
EXPERIENCE
Ux1 Ltd / UK Met Office (contract, two renewals) 09/2014 – 08/2015
Linux systems support and administration
OVERVIEW
The UK Met Office is a world-renowned weather forecasting centre, with data and science at its core. Its multi-million pound computing facilities include IBM & Cray HPC Supercomputers, IBM mainframe systems (running zOS linux instances), massive data storage and retrieval facilities, large-scale VMware ESX clusters, a large operational Linux server estate, a mixture of Windows and Linux desktop clients and the necessary support systems.
I provided system administration services focussing on the operational Red Hat Linux estate, which comprised several hundred VMware instances, S390 zOS instances and physical servers.
KEY ACHIEVEMENTS
Improved and expanded the Munin performance trend monitoring service:
The existing service was under-used, primarily because the server was installed on a general purpose operational system, which was unable to adequately deal with the volumes of data. I designed and configured
two dedicated servers in different network environments, planned and seamlessly migrated the service. The service now monitors three times the original number of clients and has been an instrumental tool in resolving
performance issues.
Investigated and resolved several long-standing performance issues:
In one example, a satellite data collection/processing system was frequently over-running and missing scheduled targets. Using performance trend monitoring (Munin), more detailed information from sar, /proc/* etc. and research into kernel tuning options, I determined that the system, whilst appearing to have plentiful RAM available to allocate from cache, was not allocating it quickly enough on demand. The root cause was that the VFS (virtual file system) cache was being cleared first, and as this comprised thousands of small files, it was taking too long. Changing the vfs_cache_pressure tuning parameter resolved this problem - the systems have consistently performed within SLAs since this change was made.
Designed, documented and deployed a number of project-related intelligent system builds with Puppet:
Using the existing Puppet system, wrote new manifests with embedded logic to deal correctly with development, test, production, physical and virtual systems, enabling new instances to be added quickly on
demand, without requiring bespoke system-specific manifests.
Worked with the backup and storage team to manage the deployment of hundreds of EMC Networker backup clients to the operational Linux estate:
This required an information-gathering exercise, scripting to deploy to RHN-enabled servers, further scripting to deploy to various unique systems which could not take advantage of any automation options, and the development and testing of a Puppet manifest tailored for the Met Office environment, which is now used to deploy and configure the agent on all new systems.
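As an illustration of the vfs_cache_pressure change mentioned in the performance investigation above, the following is a minimal Python sketch of inspecting the tunable and rendering a persistent setting. The function names and the value 50 are hypothetical; the CV does not state which value or direction of change was used.

```python
from pathlib import Path

# Standard Linux location of the tunable; the kernel default is 100.
VFS_CACHE_PRESSURE = Path("/proc/sys/vm/vfs_cache_pressure")


def read_tunable(path: Path = VFS_CACHE_PRESSURE) -> int:
    """Return the current integer value of a /proc/sys tunable."""
    return int(path.read_text().strip())


def sysctl_conf_line(value: int) -> str:
    """Render the sysctl.conf line that would persist the setting."""
    if value < 0:
        raise ValueError("vfs_cache_pressure must not be negative")
    return f"vm.vfs_cache_pressure = {value}"


if __name__ == "__main__":
    if VFS_CACHE_PRESSURE.exists():
        print("current:", read_tunable())
    # 50 is an illustrative value only, not the one used in the work described.
    print(sysctl_conf_line(50))
```

Values below the default make the kernel more inclined to keep dentry and inode caches, and values above it make reclaim of those caches more aggressive, so any change of this kind would need to be checked against trend data such as the Munin graphs described above.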
TECHNOLOGY AREAS
Red Hat Enterprise Linux (RHEL) versions 4 to 6
Puppet
Remedy/ITIL
VMware ESX/VCenter, Dell servers & disk arrays, IBM S390 Linux instances
DRBD/Corosync HA cluster
Performance monitoring, troubleshooting and tuning.
Bash / ksh scripting
Ux1 Ltd / University of Bristol - Department of Theoretical Chemistry (contract)
07/2014 – 09/2014
HPC and general Linux systems administration
OVERVIEW
The University of Bristol Department of Theoretical Chemistry provides dedicated HPC cluster facilities to academic staff, along with Linux workstations, some equipped with high-end GPU units.
I was contracted for a specific period to provide support and management on these dedicated specialised systems. General support included managing and maintaining the two Rocks HPC cluster systems: hardware, job schedulers and queue management, software and troubleshooting.
KEY ACHIEVEMENTS
Upgraded the Curie HPC cluster, with minimal downtime:
One of the HPC clusters required upgrading for security and maintainability, but the system was complex and
large, with no equivalent hardware to test the process. I therefore recreated a virtual copy (with a representative, but reduced set of compute nodes) on my own equipment, and worked through documenting
and testing the process of upgrading from Rocks Cluster version 5 to version 6. This was a reasonably complex process, involving the reconfiguration of several external systems (including a virtual server running
on the cluster itself) and the upgrade+rebuild of 40+ compute nodes, along with the head node. The outage was expected to take about a week, but I was able to successfully return the cluster to service within the first
day.
Instigated compute node monitoring and alerting:
The HPC cluster compute nodes were configured with a trade-off of high performance versus resilience. As they are hidden from the main infrastructure, they were manually monitored for such issues as mirror devices going offline, filesystems filling, CPU/disk failures etc. - all of which could cause compute jobs to fail or hang silently. I designed and developed a simple monitoring system to report on a number of issues via the head nodes. This immediately picked up several previously unreported problems, which were then resolved quickly, improving the overall service.
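One of the checks named above, filesystems filling, can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the system that was actually deployed (the technology list for this role mentions Bash); the function names, the 90% threshold and the mount points are hypothetical.

```python
import shutil
from typing import Callable, Iterable, List


def percent_used(total: int, used: int) -> float:
    """Return used space as a percentage of total (0.0 for an empty device)."""
    return 100.0 * used / total if total else 0.0


def check_filesystems(
    mounts: Iterable[str],
    threshold: float = 90.0,          # hypothetical alert threshold
    usage: Callable = shutil.disk_usage,
) -> List[str]:
    """Return one alert string per mount that is unreadable or too full."""
    alerts = []
    for mount in mounts:
        try:
            stats = usage(mount)
        except OSError as exc:
            # A mount that cannot be examined is itself worth reporting.
            alerts.append(f"{mount}: cannot stat ({exc})")
            continue
        pct = percent_used(stats.total, stats.used)
        if pct >= threshold:
            alerts.append(f"{mount}: {pct:.0f}% used")
    return alerts


if __name__ == "__main__":
    # Mount points are examples; a real node list would come from site config.
    for alert in check_filesystems(["/"]):
        print(alert)
```

Passing the usage function as a parameter keeps the check testable without touching real disks, which matters when the nodes being monitored are not directly reachable.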
TECHNOLOGY AREAS
Rocks Cluster
Linux: CentOS, Ubuntu, Scientific Linux
FlexLM, MPI compilers
KVM virtual machines
Bash
LDAP integration
HP Enterprise Services 12/2006 – 05/2014
Solaris/Linux systems engineer
OVERVIEW
Solaris, Red Hat Linux, shell scripting, design/build, support and troubleshooting for several clients, notably the Department for Work and Pensions, Aegon, Rolls-Royce Aerospace, Rolls-Royce Marine Power and the UK Ministry of Justice.
KEY ACHIEVEMENTS
Instigated and implemented the redesign of Solaris & Linux build systems for the UK Ministry of
Justice:
Scripted & documented modular & repeatable new builds for all hardware with the ability to retrospectively deploy updates on existing servers (e.g. security fixes) in a controlled manner. This resulted in much improved
(and predictable) upgrade and patching processes amongst other things, saving many hours of valuable administrator time.
Reverse-engineered key elements (NIS/Jumpstart) of the Rolls-Royce Unix/Solaris build system and re-engineered them to enable DHCP-based Jumpstart:
This removed a block in the critical path of the ongoing network infrastructure upgrade programme, enabling it to continue to original timescales, saving time and money.
Identified £140,000 savings in Oracle RDBMS licensing costs by specifying alternative (T-series) hardware
for the Rolls-Royce Strategic Sourcing project.
Designed and implemented a unique ZFS-based backup and disaster-recovery system for a bespoke Ministry of Justice application:
Using Solaris zone restarts, ZFS snapshots and rsync cloning enabled the application to be securely copied and restarted within 3 minutes, well within the allowed downtime of one hour per day. This saved the project some embarrassment, time and money, as without this solution, service level agreements would have needed to be renegotiated with the customer.
Redesigned and implemented the cross-site MoJ Unix NTP infrastructure to eliminate single points of failure and adhere to security policy. The previous implementation had resulted in service outages as database, application and NFS servers became out-of-sync. The new design performed flawlessly, resulting in no further outages since implementation.
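The zone restart, ZFS snapshot and rsync approach described for the Ministry of Justice application above might be sequenced roughly as in the sketch below. The ordering, the zone, dataset and host names, and the rsync options are all assumptions made for illustration; the CV does not describe the actual procedure. The sketch only builds the command lists and does not run them.

```python
from typing import List


def snapshot_clone_plan(
    zone: str,        # hypothetical zone name
    dataset: str,     # hypothetical ZFS dataset holding the application
    snapshot: str,    # hypothetical snapshot name
    mountpoint: str,  # where the dataset is mounted in the global zone
    target: str,      # hypothetical rsync destination, e.g. "drhost:/backup/app/"
) -> List[List[str]]:
    """Return the commands for one stop / snapshot / restart / copy cycle."""
    # ZFS exposes each snapshot read-only under <mountpoint>/.zfs/snapshot/<name>.
    snap_dir = f"{mountpoint}/.zfs/snapshot/{snapshot}/"
    return [
        # Downtime starts here. "halt" bypasses the zone's shutdown scripts,
        # so a real procedure would likely stop the application cleanly first.
        ["zoneadm", "-z", zone, "halt"],
        ["zfs", "snapshot", f"{dataset}@{snapshot}"],
        # Downtime ends once the zone is booted again.
        ["zoneadm", "-z", zone, "boot"],
        # The copy reads from the static snapshot while the application runs.
        ["rsync", "-a", "--delete", snap_dir, target],
    ]


if __name__ == "__main__":
    for cmd in snapshot_clone_plan(
        "appzone", "rpool/app", "nightly", "/zones/app", "drhost:/backup/app/"
    ):
        print(" ".join(cmd))
```

Because the snapshot is taken while the zone is down and the copy runs afterwards from the snapshot, the outage is limited to the stop, snapshot and boot steps, which is consistent with the short restart time quoted above.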
TECHNOLOGY AREAS
Enterprise systems design/build/support: Sun/Oracle M8000/3000, E10K, E25K, V-series, T-series, X-series
(Intel).
System design/build: OVM (LDOMs), zones, ZFS, Veritas (Symantec) Volume Manager/VxFS, Solaris
Volume Manager (Disksuite), Jumpstart/JET, LVM, NFS.
SAN (multi-pathing with EMC Powerpath, MPXIO, Veritas DMP & Linux native multi-pathing).
Solaris Cluster.
NetApp, HP EVA and EMC Clariion connectivity.
Red Hat Enterprise Linux, Oracle Enterprise Linux, Kickstart.
Patching/upgrades planning/deployment.
Monitoring client deployment (Tivoli, Sitescope & Xymon).
Scripting/packaging: ksh, bash, Bourne shell, SED/AWK, PERL, TK/TCL, Solaris packaging, RPM,
application/OS integration (e.g. for automated application zone shutdown/snapshot/restart)
Other: Sendmail configuration, DNS, LDAP proxy, SSH, rack layout/cabling, performance troubleshooting,
Data Protector, data migration, security scanning/remediation, firmware updates, cross-site NTP
design/implementation, Solaris resource management, technical disaster recovery (bare metal restore process planning/testing, offline boot), server consolidation planning/implementation.
Siemens Energy Services 11/2001 – 12/2006
Senior unix systems manager and team leader
OVERVIEW
Responsible for the design, implementation, maintenance and operation of the mainly Sun/Solaris based Unix systems,
storage, and backup infrastructure.
KEY ACHIEVEMENTS
Disaster recovery: Wrote the scripts and processes to recover the entire Unix-based service off-site from bare
metal, onto non-matching hardware (thus significantly reducing contract costs).
Designed an optimal and standardised system disk configuration, and implemented this on all existing (hitherto
individually configured) servers. This provided a building block for a number of subsequent system
improvements, including a better and more predictable performance profile, and improved backup/recovery times.
Designed and implemented a new backup network using existing equipment, improving throughput eightfold. This enabled backups to be completed within the required window each night, improving the service to users online during normal working hours, with no capital outlay.
Directly worked with vendors to gain best-value, particularly on older test-servers where the original
manufacturer would charge inflated prices for both initial purchase and subsequent maintenance of equipment nearing end-of-life. By purchasing legitimate second-user equipment and including spares on-site, this saved
Siemens tens of thousands of pounds, probably a six-figure sum, all without compromising service quality.
TECHNOLOGY AREAS
System performance optimisation: Worked with the DBA team to optimise VxVM disk volume layouts for
improved performance and resilience. Maintained the systems at up-to-date patch and firmware revisions.
Backup and recovery: Ensured that the daily backup procedures functioned correctly, amending as required. Wrote scripts to automate several tasks, including the generation and web-enabling of tape picking-lists (for offsite storage); this enabled the operations staff to generate the lists on demand, saving valuable administrator time.
Developed, tested and amended semi-automated disaster recovery procedures, to ensure smooth operation
during the annual disaster recovery testing, and if required, real disaster recovery.
System design, configuration, procurement and installation.
System monitoring: Ensured that the monitoring system captured all important system issues. As previously
manually reported issues arose, created scripts to check for these unusual events: runaway processes, filesystem & volume size discrepancies, filesystems missing from backup schedules, network interfaces
running at incorrect speeds/mode etc.
Sun Microsystems (EDS/Rolls-Royce contract) 11/2000 – 09/2001
Unix consultancy – NFS server consolidation project
OVERVIEW
Contracted by Sun Microsystems to the EDS/Rolls-Royce Aero account, provided technical consultancy, design and
documentation for a large scale NFS server migration project. The work covered low-level design and implementation up to presentation at board level.
Audited the Sun server estate (400+ systems, 90 of which were NFS servers). This involved gathering information from a variety of disparate databases, spreadsheets, reports and from the servers themselves (systems and performance data).
Using the information gathered, designed and documented the new system enabling the NFS server population to be reduced from 90 to 23, whilst increasing capacity, performance, resilience and manageability.
KEY ACHIEVEMENTS
Developed a database and a suite of PERL scripts to analyse existing server data and made it available by
setting up a project web server on my own equipment. This enabled others on the project to extract relevant information as required, speeding up project progress.
Produced the detailed design & test documents with cost justifications for submission to the technical review
board, allowing the board to make an informed decision to proceed.
Wrote performance monitoring & data-collection scripts (as no suitable performance monitoring system was in
place at the time), and wrote a web-based interface allowing graphs to be displayed of the various data over
specified periods of time. This was used to compare the existing system performance against the pilot and initial implementations, proving that the new system was viable and working correctly.
Wrote a set of quota-management/reporting scripts in order that the pilot could commence prior to the delayed
arrival of the commercially-provided quota-management interface, removing a blockage in the critical path of
the project.
Barron McCann Ltd 07/1997 – 11/2000
Unix/technical disaster recovery consultant
OVERVIEW
Unix consultancy, with specific focus in technical disaster recovery of Sun/Solaris systems.
Major clients included: IBM Business Continuity and Recovery Services, Powergen, Nortel Networks, M&G, Del Monte International, Toshiba UK, AstraZeneca and numerous local and county authorities.
KEY ACHIEVEMENTS
Devised the techniques and processes required to recover bare-metal Solaris systems on different hardware to
the original. This enabled us to offer lower-priced contracts, which brought in a lot of new business, and also saved us money through smaller capital outlay.
As part of the disaster-recovery service, delivered, installed and configured a replacement for a customer's database & application server which was stolen one evening. The customer had no technical recovery procedure in place, so I worked through the night to create and implement one, and fully recovered the system by lunchtime on the following day. This saved the customer, whose business was in shipping perishable goods, tens of thousands of pounds in lost revenue. The customer had previously given notice that the existing contract would not be renewed, not seeing the value in a disaster-recovery service; after these events, a new three-year contract was soon agreed.
Devised and presented training materials to teach our hardware engineers the basics of Unix system
administration, particularly how to identify and troubleshoot attached storage systems. This enabled them to provide a more effective service, with fewer calls back to support staff being required.
Adept Scientific PLC 01/1996 – 07/1997
Systems administrator
OVERVIEW
Reporting directly to the MD, I was responsible for the IT and communications infrastructure of the company.
Responsible for the technical management and development of the company's internal systems and connectivity.
KEY ACHIEVEMENTS
Redesigned and implemented a new structure for the company website, including a backend database (Postgres (SQL)) and many automated features pulling in external data sources (wrote the interfaces where required), for example:
Automatically updated the company public training calendar from the internal Apple Mac calendar
application, enabling easy access to up-to-date information for customers.
Automatically generated an online telephone database from the SDX phone system. This ensured that
all phone information was consistent throughout the company, reducing communication errors.
Automatically locally mirrored selected areas of US business partners' websites (transatlantic transfer
rates being noticeably poor during this era). This allowed customers to research product information which was previously difficult to access.
Implemented Lotus Notes to replace the existing CC:Mail system.
Designed/implemented the company IT infrastructure database to aid management and reporting.
Managed the implementation of the WAN link to the newly opened US office.
Rothamsted Research - Biotechnology & Biological Sciences Research Council (BBSRC)
09/1987 – 01/1996
Analyst/programmer, systems administration and team leader
OVERVIEW
Part of the IT services team, I was responsible for the design and implementation of systems providing services to around 1000 desktops over three sites. Prior to the systems manager/team-leader role, I was a scientific analyst/programmer for four years, writing bespoke data visualisation programs in Fortran & C on DEC Vax and Sun hardware.
KEY ACHIEVEMENTS
Designed and implemented the corporate networked desktop infrastructure, based on Windows 3.11, PC/TCP, Hummingbird Exceed and various GNU tools (MS-DOS versions of make, sed, grep and awk) on the client side, along with NFS and Netware servers. Systems used an NFS shared distribution of Exceed (a PC X server) and NFS shared configuration files, which enabled central upgrades and management. Services were provided by Unix and Vax/VMS servers via X-windows. This configuration enabled the scientific staff to access all available systems from a single desktop, making their work more efficient and reducing support costs.
Provided a bespoke graphics/data visualisation programming service (in FORTRAN and C) for scientific staff,
enabling them to develop insights into their data which were hitherto difficult, if not impossible, to reach.
Designed and implemented the corporate applications access solution. Written in C with Xview libraries, then
an improved and more flexible version in TK/TCL. Once again, this improved access and efficiency, and
reduced support costs.
Set up one of the first 100 web servers in the UK, and collaborated with a plant pathology researcher to make
some of the very first scientific reference images available to researchers and students on the Internet.
AFRC (Agriculture and Food Research Council) 1986
Student Placement
Plant pathology & entomology research projects. Interfaced micro-computers (Apple II / BBC) to lab kit and wrote bespoke programs for data capture and interpretation.
Unilever 1985
Student Placement
Microbiology research project. Computing work included data preparation and modifying proprietary application code used to control transmission electron microscopes.
EDUCATION
University of Hertfordshire 1992
BSc Computer Science (with distinction)
OTHER COURSES
Sun Microsystems, SA-285, Solaris 2 System Administration - Exam 210-004 (Solaris System Administration
II) 85%, leading to Sun accreditation (EL1000)
Sun Microsystems, Solstice DiskSuite
Sun Microsystems, SA-380, Solaris Network Administration - Exam 210-005 (Solaris TCP/IP Network
Administration) 87%, leading to Sun accreditation (EL1500)
Sun Microsystems, SA-340, SunNet Manager
Veritas Volume Manager
Veritas NetBackup
Oracle DBA part 1
Sun Microsystems, SA-400, Solaris system performance management
Sun Microsystems, ES-400, Sun E10000 system management
Sun Microsystems, SA-225-S10, Solaris 10 new features for experienced system administrators
Sun Microsystems, Sun Cluster 3.2
HP-UX System Administration for Experienced UNIX System Administrators
Others: Java programming, Advanced Vax/VMS Fortran programming, Vax/VMS DCL, OSF/1 administration,
image analysis, Uniras graphics library programming, System 1032 database, Datatrieve, Technical writing.