AWS Technical Service Management

Own ITIL process improvements for AWS-based cloud services across multiple teams
Mexico City
Senior
8 hours ago
Santander

A global retail and commercial bank providing financial services including personal banking, corporate lending, and wealth management across multiple markets.

To succeed in this role, you will be responsible for:

  1. Own and continuously improve ITIL practices for Incident Management, Change Management, and Problem Management for AWS-based services.
  2. Ensure service stability and adherence to SLAs/OLAs through operational controls, service reviews, and continuous improvement initiatives.
  3. Establish and track service health KPIs (availability, incident volume, MTTR/MTTA, change success rate, problem recurrence).

Incident Management (incl. Major Incidents)

  1. Lead incident triage and coordination across cloud infrastructure, platform, security, and application teams.
  2. Use Dynatrace / CloudWatch insights (alerts, traces, service flow, SLOs) to accelerate identification of impact scope and probable root cause domains (app vs infra vs dependencies).
  3. Coordinate communications and status updates during incidents, ensuring timely escalation, stakeholder alignment, and restoration targets.

Change Management & Governance (CAM / CAB / Committees)

  1. Create, validate, and control change requirements in ServiceNow, ensuring quality of change records (scope, impact, risk, test evidence, implementation plan, backout plan, approvals).
  2. Drive the end-to-end change lifecycle: intake, risk/impact analysis, scheduling, approvals, implementation tracking, post-change validation, and closure.
  3. Prepare and present changes to CAM, CAB, and other change forums, ensuring compliance with governance and regulatory expectations.
  4. Monitor change calendars/pipelines to prevent conflicts and reduce change-related incidents.

Problem Management & Continuous Improvement

  1. Lead or coordinate problem investigations for recurring incidents; ensure strong root cause analysis (RCA) and corrective/preventive action plans (CAPA).
  2. Track action items to closure and measure effectiveness (e.g., recurrence reduction, improved SLO attainment).

Monitoring, Metrics & Reporting (ServiceNow + Dynatrace)

  1. Analyze and interpret data from ServiceNow (tickets, categories, backlog, SLA breaches) and Dynatrace (availability/performance indicators) to detect deviations, risks, and trends.
  2. Produce weekly/monthly operational reports and dashboards: SLA compliance, incident trends, change success rate/failure modes, top recurring issues, operational risk indicators.
  3. Propose mitigation plans and service improvements based on evidence and measurable outcomes.

Process, Documentation, and Automation Enablement

  1. Define and maintain operational processes and standards for cloud service operations.
  2. Identify opportunities for systems automation (auto-remediation, workflow automation, alert tuning) and partner with engineering teams to implement.

Stakeholder Management & Cross-Team Coordination

  1. Act as the operational focal point between cloud teams, application owners, security/risk, and governance stakeholders.
  2. Support decision-making by providing clear risk assessments, impact narratives, and recommended actions.
  3. Negotiate priorities, timelines, maintenance windows, and resource needs across teams.

Technical Support