Cloud Reliability Engineer (AWS) - Remote
Location
United States
Posted
2 days ago
Salary
Not specified
Job Description
What We’re About
At CentralSquare, we don’t just build software - we power public servants and uplift communities with Hero-Grade Technology. Every line of code, every feature we deliver helps heroes across North America protect, serve, and save lives. When you join us, you become part of a mission-driven team creating technology that makes communities safer and stronger.
Your Growth Matters. We believe heroes deserve opportunities to rise. That’s why we invest in your career with mentorship, learning programs, and clear paths for advancement. If you’re motivated, there’s no limit to how far you can go.
Your Commitment Deserves Reward. We offer competitive compensation and a benefits package designed to support your life inside and outside of work—tuition reimbursement, parental leave, paid volunteer hours, and unlimited PTO. Plus, our flexible work environment gives you the freedom to balance your heroic work with personal well-being, whether you’re in the office or remote.
Join us and help build the tools that power real-life heroes. Together, we make a difference.
The Role
Our Site Reliability Engineer leads the architecture, design, and deployment of network solutions across our client portfolio. They are responsible for developing specifications; implementing and maintaining cloud network security architecture for systems applications; performing and managing network design upgrades and hardware reconfigurations in a hybrid environment; and diagnosing and resolving routing and interconnectivity issues. They also ensure the availability, reliability, integrity, and efficient operation of the systems that support the AWS Cloud and hosted applications.
Job Duties:
- Design, develop, install, and maintain software solutions that provide efficiency in Cloud Operations.
- Work with engineering teams to refine deployment and release processes.
- Collaborate with the engineering team on projects as the expert on reliability, performance, and efficiency.
- Assist product engineers in the development and deployment of backend applications.
- Be prepared to explain your work, decisions, and ideas to your colleagues.
- Participate in 24x7 operational support and on-call rotation shifts.
- Ensure that all system designs and procedures are documented and up to date.
- Consolidate existing documentation where available, and write it where needed, to create a centralized body of knowledge for all team members to use. Contribute to the upkeep of documentation to maintain relevancy and accuracy.
- Provide training and education to Cloud Operations on infrastructure and internal tooling.
- Provide a level of audit and control to security personnel.
- Monitor systems to collect metrics for tuning and capacity planning.
- Work to automate detection and resolution of recurring issues.
- Build the whole stack, from load balancers to databases.
- Ensure safety, predictability, repeatability, and auditability of all build and deploy processes.
- Provide technical leadership to other CentralSquare departments.
- Develop, coach, and mentor individuals and teams, and ensure high performance in a fast-paced environment.
- Build tools and automation that eliminate repetitive tasks and prevent incidents.