Talent.com
Site Reliability Engineer (SRE) - Engineering Productivity

Arista Networks, Nashua, NH, US
2 days ago
Job type
  • Full-time
Job description

Company Description

Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in an increasingly interconnected world. Our solutions are designed to not only meet the current demands of the digital landscape but to also anticipate and adapt to future challenges.

At Arista, we value the diversity of thought and perspectives that each employee brings to the table. We believe that fostering an inclusive environment, where individuals from various backgrounds and experiences feel welcome, is essential for driving creativity and innovation.

Our commitment to excellence has earned us several prestigious awards, such as Best Engineering Team, Best Company for Diversity, Compensation, and Work-Life Balance. At Arista, we take pride in our track record of success and strive to maintain the highest standards of quality and performance in everything we do.

Job Description

Who You’ll Work With

Arista Networks is looking for world-class Site Reliability Engineers passionate about driving systems reliability and scalability to provide the best possible development experience for our 2000+ person engineering team. You will be part of a fast-paced, high-caliber team building the internal systems and infrastructure used to build the routing and switching products driving the industry's largest data center networks.

Arista’s Software Engineering team runs at a scale rarely found: TBs of source control, 60GB work trees with 1000s of developer branches in flight at any given time, over 400K daily build / test jobs, and over 150 homegrown and cloud-native services running on a 60-node Kubernetes cluster. Operating these systems takes vigilance, responsiveness to alerts, and a steady stream of updates and bug fixes to keep things running smoothly and efficiently, as well as to increase our ability to monitor, understand, and visualize them. The SRE role will cover all aspects of our software development infrastructure, and may include monitoring, responding to, and enhancing alerts, working to unify and standardize our alerts, fine-tuning code for scalability and performance, debugging problems, and adding new features. You will own your projects from definition to deployment and customer interactions, and you will be responsible for the quality of everything you deliver.

Working in Engineering Productivity (EngProd), you will collaborate with other engineers to design, build, scale, and operate the systems that the rest of Arista’s development teams use. The EngProd team uses industry-standard systems like Ansible, Jenkins, Kubernetes, Grafana, Gerrit, MySQL, Elasticsearch, Google Cloud, and Redis, as well as internal systems that we’ve built from the ground up to automate CI / CD, testing, analysis, and visualization.

What You’ll Do

  • Keep the production status green at all times
  • Proactively monitor, respond to, and enhance alerts
  • Build automated responses to the most common alerts or work with the rest of the EngProd team to build them
  • Create and maintain incident response runbooks in collaboration with the service development teams
  • Debug and resolve issues impacting developer user experience and infrastructure stability
  • Develop patterns to support system reliability and socialize them within the EngProd team
  • Review and contribute to the specifications and implementations written by other team members
  • Work with Arista’s software engineers to identify bottlenecks and limitations in our workflows, tooling, and infrastructure and provide fixes for those problems
  • Provide support for our tools and infrastructure to Arista’s development team

Qualifications

  • At least BS Computer Science or Engineering + 5 years’ experience, MS Computer Science or Engineering + 3 years’ experience, or equivalent work experience.
  • Knowledge of one or more of Go, Python, JavaScript, shell scripting.
  • Knowledge of Linux (or UNIX).
  • Experience operating and managing software systems at scale
  • Strong understanding of the fundamentals of storage and networking
  • Comfortable with Ansible and GitOps
  • Applied understanding of software engineering principles.
  • Strong problem solving and software troubleshooting skills.
  • Ability to design a solution and implement features independently. Ability to work in small teams.

Additional Information

Arista Networks is an equal opportunity employer. Arista makes all hiring and employment-related decisions in a non-discriminatory manner without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other factor determined to be unlawful under applicable federal, state, or local law. All your information will be kept confidential according to EEO guidelines.
