|
Description
|
% of Time Spent
|
- Operational Oversight and Service Restoration
The Engineer, Intelligent Operations and Observability, provides active operational leadership within the McCormick Technology Operations Center during incidents, service degradations, and other events affecting the company’s technology environment. This role maintains real-time awareness of the operational state across infrastructure, cloud platforms, applications, networks, and end-user services, and leads coordinated response efforts to ensure the appropriate teams and service providers are engaged quickly and working with urgency toward restoration. The role drives incident response by reinforcing priorities, tracking progress, challenging assumptions, escalating when needed, and helping ensure the right level of visibility, accountability, and stakeholder attention is maintained through recovery and stabilization.
|
35%
|
- Observability and Operational Tooling
Implements, administers, and continuously improves the operational tooling environment that supports enterprise monitoring, event management, and service visibility. Serves as a technical expert across application performance monitoring, infrastructure and systems monitoring, event aggregation, alerting, dashboards, and reporting, ensuring the tool suite delivers meaningful and actionable insight. Partners across infrastructure, cloud, application, and service teams to expand monitoring coverage, improve signal quality, reduce alert noise, and strengthen overall observability and operational decision-making.
|
35%
|
- Continuous Improvement and Operational Maturity
Contributes to the ongoing maturity of the Technology Operations Center by identifying and implementing improvements in operational processes, tooling, visibility, and service assurance practices. Supports the development of runbooks, standards, and response procedures, and uses operational data and trends to recommend opportunities for automation, standardization, and stronger preventive controls. Works within established service management and SIAM-aligned practices to improve coordination across internal teams and service providers, while helping advance a more proactive, insight-driven, and resilient operations model.
|
15%
|
- Incident Response Process Contribution
Serves as a key stakeholder in the ongoing development and improvement of incident response practices, standards, and supporting procedures by providing operational insight and practical recommendations based on real-world experience. Helps shape how incident prioritization, escalation, communications, and restoration processes are refined over time, while reinforcing adherence to established practices across service providers and internal teams. Supports governance and operational reviews by identifying execution gaps, highlighting response trends, and recommending improvements that strengthen consistency, discipline, and overall response effectiveness.
|
15%
|
|