Educational requirements: Bachelor
English requirements: Competent English
Skilled employment experience required: 1-3 years
Required residence status: Temporary visa, Permanent resident, Citizen
Remote work: Not accepted
WHAT WE ARE LOOKING FOR: We are looking for a highly motivated Senior Site Reliability Engineer who will be based in our Sydney office and work alongside our international teams of cloud and site reliability engineers across Asia/Europe/EMEA. The ideal candidate will combine experience in SaaS architecture, industry-proven ITIL practices and modern agile delivery patterns. You will help Tealium process billions of transactions per day using our modern AWS-based technology stack, which evolves regularly with new infrastructure, tools and services.
YOUR DAY TO DAY:
• Improve the alerting signal-to-noise ratio by focusing on service and business metrics over individual hosts.
• Proactively research potential problems before they arise.
• Work closely with managers and the team on opportunities to improve monitoring, alerting and trending, as well as runbook and troubleshooting documentation.
• Look for opportunities to improve application logging and exception handling.
• Maintain and expand strong cross-functional relationships with the Customer Success and Product Management teams.
• Manage incidents and implement changes following Tealium's Agile/ITIL processes.
WHAT YOU BRING TO TEALIUM:
• You have a Bachelor's degree in Software Engineering, Computer Science, Network Engineering, or equivalent experience.
• You bring at least 4 years of experience with Linux-based AWS environments (more than 10 services).
• You bring at least 3 years of experience in a technical management capacity within a 24x7x365 SaaS environment.
• You have strong time management, organizational and verbal communication skills.
• You are able to articulate technical challenges and proposed solutions in a succinct, clear manner for all organizational levels.
• You are experienced with sophisticated AWS services and systems (EC2/ALB, EKS/Kubernetes, AWS CLI) running at scale (millions to billions of transactions per day).
• You are experienced with scripting, CI/CD and infrastructure-as-code tools (Python, Jenkins, Git, Terraform, Jira/Confluence).
• You are experienced with modern observability, logging, monitoring, alerting, trending and dashboarding methods and tools (Datadog, Sumo Logic, PagerDuty, CloudWatch, Prometheus).
• You are able to collaborate with US counterparts on incidents, changes and troubleshooting.
• You are a lifelong learner with strong attention to detail and an aptitude for data-driven questioning.