Talent.com
Distinguished Software Engineer, Reliability Infra

LinkedIn • Mountain View, CA, United States
1 day ago
Job type
  • Full-time
Job description

LinkedIn is the world's largest professional network, built to create economic opportunity for every member of the global workforce. Our products help people make powerful connections, discover exciting opportunities, build necessary skills, and gain valuable insights every day. We're also committed to providing transformational opportunities for our own employees by investing in their growth. We aspire to create a culture that's built on trust, care, inclusion, and fun where everyone can succeed.

Job Description

At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team.

This role will be based in Sunnyvale, CA or San Francisco, CA.

Responsibilities

  • Serve as a senior technical leader driving the long-term reliability and observability strategy across LinkedIn's infrastructure
  • Re-architect LinkedIn's backend systems to enable granular failure domains and reduce the blast radius of incidents
  • Design and implement next-generation failure mitigation strategies that avoid full-region or full-datacenter failovers
  • Partner closely with engineers across many disciplines to raise the bar for operational excellence and incident response
  • Define and build frameworks to improve monitoring, alerting, and observability across hundreds of services and systems
  • Define and own the roadmap for bringing observability to critical user journeys in LinkedIn's products, to help capture and improve the experience of LinkedIn's members and customers
  • Spearhead a multi-year initiative to transition LinkedIn's infrastructure to a regionalized model with localized failover, enhancing both scalability and availability
  • Lead technical discussions on the future of Engineering at LinkedIn and what the function should evolve into over the next 3-5 years
  • Deliver key insights and executive-level reporting across cross-functional engineering teams to enable the right business decisions about improving the quality and reliability of our services and products
  • Act as a force multiplier by mentoring engineers, influencing technical direction across orgs, and contributing deeply to culture, hiring, and technical excellence
  • Lead incident response and post-incident reviews to identify root causes and implement preventive measures
  • Develop and maintain incident management processes and procedures to ensure timely resolution of issues and minimize impact on customers

Qualifications

Basic Qualifications

  • 15+ years of software engineering experience
  • 8+ years focused on infrastructure, reliability engineering, or distributed systems

Preferred Qualifications

  • Hands-on experience with large-scale incident response, root cause analysis, and resiliency engineering
  • Strong communication and cross-functional collaboration skills, with experience influencing across multiple orgs and leadership levels
  • Proven success designing and leading architectural transformations at internet-scale companies
  • Deep knowledge of systems reliability, observability frameworks, and fault-tolerant architecture design
  • Experience with multi-region architecture, capacity planning, and failover strategies in large-scale cloud or hybrid environments
  • Background in CI/CD, platform reliability, and automation of ops-heavy systems
  • Familiarity with modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana) and service mesh architecture
  • Track record of setting long-term technical strategy and driving systemic improvements in availability and performance
  • Previous experience in a Distinguished Engineer or equivalent role at a high-growth or web-scale technology company

Additional Information

LinkedIn is committed to fair and equitable compensation practices. The pay range for this role is $238,000 to $390,000. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location.

Equal Opportunity Statement

We seek candidates with a wide range of perspectives and backgrounds and we are proud to be an equal opportunity employer. LinkedIn considers qualified applicants without regard to race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other legally protected class.
