Senior Director - Operations and Reliability Engineering
Company: Boston Consulting Group
Location: Boston
Posted on: June 1, 2025
Job Description:
Locations: Canary Wharf - Boston
Who We Are
Boston Consulting
Group partners with leaders in business and society to tackle their
most important challenges and capture their greatest opportunities.
BCG was the pioneer in business strategy when it was founded in
1963. Today, we help clients with total transformation: inspiring
complex change, enabling organizations to grow, building
competitive advantage, and driving bottom-line impact.
To succeed, organizations must blend digital and human
capabilities. Our diverse, global teams bring deep industry and
functional expertise and a range of perspectives to spark change.
BCG delivers solutions through leading-edge management consulting
along with technology and design, corporate and digital
ventures, and business purpose. We work in a uniquely collaborative
model across the firm and throughout all levels of the client
organization, generating results that allow our clients to
thrive.
What You'll Do
The Senior Director - Operations and Reliability
Engineering is responsible for blending Site Reliability
Engineering (SRE), DevOps, and traditional operations models to
build a next-generation Reliability Engineering function. This
role ensures end-to-end automation at scale, 24x7 operational
excellence, and high availability across all of BCG,
including BCG Core, BCG X, and Consulting Team (CT) worldwide.
The leader will drive strategic planning, execution, and
optimization of global IT infrastructure, cloud operations, and
service management while ensuring a secure, scalable, and
efficient technology environment. This role is accountable for
embedding and assuring IT Service Management (ITSM)
processes across all teams, ensuring compliance with standardized
frameworks and operational excellence.
Key Responsibilities:
Strategic Leadership & Transformation:
- Define and execute a modern Reliability Engineering strategy,
integrating SRE, DevOps, and automation-first operational
models.
- Drive end-to-end automation to eliminate toil, improve
efficiency, and enhance operational resilience.
- Lead the transition from traditional IT operations to
a proactive, AI-driven, self-healing infrastructure.
- Establish a global observability, telemetry, and predictive
analytics framework for real-time insights.
- Align operational strategies with business goals, ensuring IT
supports digital transformation initiatives across BCG Core, BCG
X, and CT.
Infrastructure & Cloud Operations:
- Oversee global IT infrastructure, cloud platforms, and hybrid
hosting environments across all BCG business units.
- Manage network reliability, compute platforms, and
cloud-native services across AWS, Azure, and GCP.
- Scale Infrastructure as Code (IaC), automated provisioning,
and cloud workload optimization.
- Drive edge computing, containerized workloads, and
high-performance computing strategies.
- Implement AI-driven monitoring, self-healing automation, and
full-stack observability.
IT Service Management & Operational Excellence:
- Mandate and assure the adoption of IT Service Management (ITSM)
processes across all teams, ensuring standardized, efficient, and
effective service delivery.
- Establish SRE-based operational metrics, including SLOs,
SLIs, and error budgets.
- Oversee incident response, problem resolution, and root cause
analysis with AI-driven remediation.
- Ensure high availability, performance, and security
compliance for all enterprise services.
- Develop a follow-the-sun operational support model,
ensuring 24x7 resilience and uptime across all of BCG.
- Optimize incident, change, and capacity management, ensuring
alignment with ITIL best practices and automated
workflows.
- Lead Service Asset and Configuration Management (SACM),
ensuring accurate and real-time management of software and IT
assets within the CMDB.
- Drive continuous enhancements to the CMDB,
improving visibility, compliance, and lifecycle management of
IT assets.
Security, Compliance & Risk Management:
- Embed security and compliance into operational
workflows with automated security controls.
- Ensure adherence to ISO 27001, NIST, SOC 2, GDPR, and cloud
security best practices.
- Collaborate with cybersecurity teams to
integrate zero-trust security models.
- Drive resiliency planning, disaster recovery, and business
continuity initiatives.
Financial & Vendor Management:
- Optimize IT operational budgets with a cost-effective,
cloud-native strategy.
- Negotiate vendor contracts, ensuring alignment with business
needs and service reliability.
- Drive cost efficiency in cloud spending, SaaS platforms, and
infrastructure investments.
Leadership & Talent Development:
- Build and mentor a high-performing Reliability Engineering
team, fostering a culture of automation and innovation.
- Lead a team of SREs, DevOps engineers, and platform
reliability experts across global squads.
- Promote a collaborative, data-driven, and proactive mindset,
ensuring agility and operational resilience.
- Establish workforce development programs for AI-driven
operations, automation, and modern reliability practices.
What You'll Bring
Required Qualifications:
- 15+ years of experience in IT operations, SRE, DevOps, or
platform engineering.
- 5+ years in a senior leadership role, managing large-scale IT
environments.
- Deep technical expertise in cloud computing (AWS, Azure,
GCP), on-prem infrastructure, and hybrid environments.
- Proven track record in end-to-end automation, Infrastructure
as Code (IaC), and large-scale observability.
- Experience in AI-driven IT operations, predictive analytics,
and automated remediation.
- Strong understanding of zero-trust security, regulatory
compliance, and risk management.
- Excellent leadership, communication, and stakeholder management
skills.
Preferred Qualifications:
- Certifications: ITIL, AWS/Azure/GCP Solutions Architect, SRE
Foundation, CISSP, or equivalent.
- Experience with Kubernetes, Terraform, Ansible, and
AI-powered operations tools.
- Strong problem-solving abilities, with a data-driven approach
to operational excellence.
The Senior Director - Operations
Platform Lead is a pivotal leadership role responsible
for shaping the future of IT operations by integrating SRE,
DevOps, and automation-first methodologies. If you are a highly
technical, innovation-driven leader passionate about scaling
operations through automation and AI-driven resilience, we invite
you to apply.
Who You'll Work With
Work Environment & Additional Information:
- Hybrid or on-site work model.
- May require occasional travel for business meetings, data
center visits, or vendor engagements.
- Ability to work in a fast-paced, high-availability IT
environment, with a focus on automation and reliability.
Boston Consulting Group is an Equal Opportunity Employer. All
qualified applicants will receive consideration for employment
without regard to race, color, age, religion, sex, sexual
orientation, gender identity / expression, national origin,
disability, protected veteran status, or any other characteristic
protected under national, provincial, or local law, where
applicable, and those with criminal histories will be considered in
a manner consistent with applicable state and local laws.
BCG is an E-Verify Employer.
Keywords: Boston Consulting Group, Medford, Senior Director - Operations and Reliability Engineering, Executive, Boston, Massachusetts