Senior Site Reliability Engineer

at Akamai Technologies, Inc.
Published March 2, 2020
Location Cambridge, MA
Job Type Full-time  

Description


Overview
As a Senior Site Reliability Engineer you will be an integral part of a team working to strategically apply blockchain technologies to new and innovative business areas, resulting in high-throughput transaction systems at global scale.

As a Senior SRE you will be responsible for:
* Component and framework designs supporting the virtualization and orchestration of Akamai computing infrastructure, from conception and design through testing, deployment and operation.
* Working on projects that make our network more efficient while sustaining service and component stability, performance and security.
* Working to understand, explain, and improve or adapt complex, stable and legacy software component deployment and integration frameworks.
* Working with our development QA and system QA teams to come up with regression tests and operational monitoring that cover new changes to our software.
* Working with engineering teams across business units to support migration designs and critical network rollouts.

About the Team
Akamai Labs is expanding its team with a mission to strategically apply blockchain technology to business requirements for a highly available, high-throughput, highly secure transaction system.

Qualifications
Required Education and Experience
Applicants must meet one of the following education and experience requirements:
* 5 years of relevant experience and a Bachelor’s degree in computer science/engineering or
* 3 years of relevant experience and a Master’s degree in computer science/engineering or
* Relevant experience and a PhD in computer science/engineering

Required Skills
* 5+ years of experience with one or more of the following languages: C/C++, Golang, Java, Python
* 5+ years of experience working on Linux/Unix platforms

Desired Skills
* Linux host system hypervisors including KVM
* Experience with configuration management / infrastructure as code tools like Ansible, Chef and Puppet
* Experience with distributed storage systems
* Linux kernel development and/or performance tuning
* SQL experience
* Experience with Content Distribution Networks
* Data center utilization monitoring and COGS modeling
* Designing, implementing and deploying continuous build/deployment frameworks 
* Site reliability engineering and/or related work including service performance
* Experience building scalable servers or distributed systems
* Proven track record of delivering large amounts of high-quality, complex code
* Highly responsible, self-disciplined, self-managed, self-motivated, able to work with little or no supervision
* Extensive experience working on multiple projects at a time in a fast-paced, results-oriented environment
* Excellent written and verbal communication skills