Scale the future of Apollo’s cloud infrastructure to support our ever-growing team, codebase, and, of course, major worldwide customers (NYT, Walmart, and Expedia, to name a few). Your contributions will focus primarily on our systems, our operations, and scaling out our support for services in a developer-centric manner. Observability, reliability, and scalability are the name of the game.
Whether your passion is the nitty-gritty of K8s topologies, refactoring yet another Terraform module, or removing even more permissions from accounts, you will feel right at home. Projects within immediate scope will touch our GCP footprint and VPC networking, Kubernetes installation, CI/CD pipelines, and Druid cluster. Your aim will be to scope out projects and problems so that your teammates can hit the ground running, with you running alongside them! Your abilities will be well suited to helping us look ahead to where our next bottlenecks and hot spots will be.
This role's team builds the platform that cradles all of our backend code, services, and data. We aim to make Apollo more dependable by providing the communal, plug-and-play building blocks that make simple services simple to debug and simpler to support.
Our team's success is measured not in our own overall productivity, but in the gains of other teams. We focus on efforts that enable and nurture the growth of other developer teams at Apollo. You’ll be helping your teammates and shaping how we work in a major way every day.
What you’ll do
- Shape our cloud infrastructure and service architecture, building on areas of strength while shoring up weaknesses, so that Apollo can grow as the most trusted leader in GraphQL.
- Lead development on things like our GCP footprint and cloud infrastructure. You’ll consult on our service chassis and reporting/metrics pipeline. You’ll leverage tools such as Terraform, K8s, Datadog, GCP, and CircleCI.
- Participate in our on-call rotation and help make it more manageable.
- Nurture our cloud installation, seeing it scale well beyond its current scope. You’ll have the opportunity to solve some of the gnarlier problems companies see when users, traffic, and employees increase by orders of magnitude.
- Work together with a group of reliable and empathetic people. You’ll be a mentor and be able to grow alongside your teammates.
Who you are
- You enjoy pairing technical solutions with the right level of effort. You have experience across a variety of scales, both in terms of data throughput and number of codebase collaborators.
- You have experience being on call for production services and know how to handle after-hours incidents.
- You can harden distributed software systems, maintain K8s installations, improve CI and CD, and work to make code more observable with a tool like Datadog. Bonus: you’ve got PostgreSQL and/or Druid chops.
- You treat code as a craft. You have a bias for deprecating over refactoring, and refactoring over rebuilding, but you’re not against rebuilding when appropriate (in our opinion, the best kind of refactor is the delete key). You aren’t attached to the code you write, and you appreciate simple, understandable code.
- You’re data-driven. You rely on the insights we have at our disposal to make informed system-level decisions, test hypotheses, and feed into prioritization. And when we’re missing a data point, you work to get the necessary tools.
- A big plus: you’ve got GCP-specific experience, but aren’t biased towards one cloud provider over another.
- Another big plus: you’re excited about working with GraphQL and influencing its technical future. You’re excited to work at a devtools company where the tools you’re building will actually improve your own.
This position can be done from anywhere in US or European time zones.
Apollo is proud to be an equal opportunity workplace dedicated to pursuing and hiring a talented and diverse workforce.