Talent.com
Lead Site Reliability (DevOps) Engineer
Roberts Recruiting, LLC • Boston, MA, United States
5 hours ago
Job type
  • Full-time
Job description

Lead Site Reliability Engineer

We’re looking for a top‑notch, hands‑on SRE to lead our small and talented infrastructure engineering team and help us elevate our game when it comes to designing, building and operating high‑performance and highly‑available systems.

We’re backed by Insight Venture Partners and Iconiq Capital and are on a path to $1B in 2019. We’ll get there, even more surely if you come help us.

Every engineer is responsible for the software they build, and SREs play a critical part in providing the tools, practices, and expertise to help them succeed.

Our production systems are hosted in AWS data centers and run a large Ruby on Rails web application and a handful of smaller services in Ruby, Node.js, and Java. We currently deploy 3–5 times a day. Our systems are stable and fire drills are rare. Technologies we’re currently using include:

  • Amazon Web Services (EC2, ELB, S3, RDS, ElastiCache) and Ubuntu Linux
  • Postgres, Redis, Memcached, Elasticsearch
  • Chef, Serverspec, Terraform, New Relic, Datadog, Sumo Logic and Test Kitchen

In this mission‑critical role, you would:

  • Design, build, and maintain the core infrastructure of our product
  • Actively manage the backlog for our infrastructure team and work closely with other SREs on the team to provide coaching and mentorship
  • Help us increase developer productivity and get to true continuous delivery
  • Develop operational and security standards and champion operational excellence and secure coding practices
  • Partner with engineering teams closely to educate and consult
  • Participate in solution design for new features, products, systems and tooling
  • Debug complex problems across the whole stack
  • Continually monitor application and system performance and costs, generate actionable insights, and either implement or advocate for them
  • Participate in on‑call rotations, along with every member of the engineering team
  • Ruthlessly eliminate repetitive manual tasks and recurring errors
  • Ensure we are always employing best‑of‑breed tooling for all our infrastructure and automation needs
  • Collaboratively plot the course for the maturation and growth of our infrastructure
  • Participate (and sometimes run point) in handling production incidents
  • Work closely with engineering teams to conduct root cause analysis for production incidents, and evolve infrastructure and tooling

This role might be that rare opportunity if you:

  • Thrive in a highly collaborative, no red‑tape, rapid‑growth environment
  • Love building tooling and infrastructure to help developers be more productive
  • Love eliminating repetitive manual tasks through automation
  • Have a healthy appreciation of what it means to work in production
  • Have solid Unix command line and systems chops
  • Have experience with substantial, distributed SaaS or eCommerce systems
  • Can point to a solid track record of success leading small‑to‑medium infrastructure teams
  • Have vision and well‑informed opinions about how to build infrastructure for a high‑growth, technology‑driven company that’s headed towards the $1B mark
