This is a list of projects I’ve completed or am working on. It is by no means exhaustive, but it should give you an idea of the kind of efforts I can help with.

Organisational Transitions

Leadership Transition - Working with a large, busy IT group in a diverse M&A-based environment. Guiding a leadership group through a C-level transition, including planning around communications, driving more strategic thought around service delivery, and direct work on staffing and skills augmentation planning. This required some tricky changes of approach based on new leadership, and resets of several relationships with internal stakeholders as well as MSP partners.

‘Traditional Ops’ to SRE Transition for Network Surveillance Team - I was part of the leadership group that managed the transition of a ‘traditional’ network operations team for Google’s Network to an SRE model. This included everything from relationship management with existing Network Architecture stakeholders, to cross-training on SRE methodology, coding classes, portfolio assembly for job ladder transition, and active coaching and working with other leads in this area. The team of about 150 FTE was spread across timezones in Sunnyvale, Dublin and Sydney. The transition took place over a 12-18 month timeframe, and we vastly exceeded the retention expectations, with over 80% of engineers successfully building their skillset to an SRE standard.

Post-Layoff Remit/Charter Re-rationalisation - Worked with a group that had recently been through a reduction in force to rationalise what they were capable of delivering, coaching key leads through the exercise of determining their team’s capabilities, and helping with stakeholder management as we communicated new expectations.

Sharding Teams - I have done this many times, with teams that are growing a presence outside their primary office. My work as head of engineering at Google Ireland meant I was at the forefront of everything from stakeholder management of existing leadership and partners, to hiring and assembling the right people as the vanguard of a new office. Specific advice and common practices on working across timezones and working asynchronously were also critical here.

Technical Transformations

SLI/SLO and Service Catalogue - Worked with a large, busy IT group on defining and communicating a service catalogue, including introducing SLIs and SLOs and training on them as both a concept and an implementation. Worked directly with Service Owners on the definition and communication of SLOs and their outcomes for service delivery.

Observability Strategy and Technology - Worked with a diverse set of tooling and methodologies to gain a broader view of a large set of services. Brought a large group of service owners together to sponsor a common SLO reporting tool (Nobl9), taking a pragmatic approach to standardisation.

Short-form Interventions

Outage Assessment and Analysis - Taking a raw dataset of outages for a given set of services, and translating it into a broad set of recommendations for real fixes to be worked on by an MSP partner. The view was that only tactical fixes were being made, and that a ‘no-nonsense’ assessment of what was really recurring and breaking in subtly similar ways was needed. The outcome was a direct analysis of the data, as well as several technical recommendations and strategic recommendations on staffing and relationship management.

One-Off Trainings and Analysis

I am happy to come and talk to individual teams in a ‘tech talk’ style with Q&A, to work with small groups for a few days of intensive work onsite, or to use whatever form might make sense.

SLO Primer - Broad outlines of SLOs and their outcomes, specific advice and exercises on SLO definition, and commentary on your particular service.

Oncall Theory and Practice - Triage and analysis of oncall schedules and alert data, leading to specific recommendations on fixes to make, and on approaches and policies to amend, so people are happier.