Systems Reliability Engineer (SRE) - Edge

Cloudflare Inc • San Francisco, CA, United States
1 day ago
Job type
  • Full-time
Job description

About Us

At Cloudflare, we are on a mission to help build a better Internet. Today the company runs one of the world's largest networks that powers millions of websites and other Internet properties for customers ranging from individual bloggers to SMBs to Fortune 500 companies. Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare all have web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was named to Entrepreneur Magazine's Top Company Cultures list and ranked among the World's Most Innovative Companies by Fast Company.

We realize people do not fit into neat boxes. We are looking for curious and empathetic individuals who are committed to developing themselves and learning new skills, and we are ready to help you do that. We cannot complete our mission without building a diverse and inclusive team. We hire the best people based on an evaluation of their potential and support them throughout their time at Cloudflare. Come join us!

Available Locations: Austin

About the Role

We are looking for talented Systems Reliability Engineers to build and operate our Edge platform running in more than 320 cities in over 120 countries. Our SREs come from diverse technical backgrounds and have built up their knowledge working in different environments, but common factors across all of our reliability-focused engineers include a passion for automation, scalability, and operational excellence. We support our services in a "follow the sun" model with offices in East Asia, Europe and North America.

This is a superb opportunity to join a high-performing team and scale our high-growth network as Cloudflare's business grows. We live at the boundary between systems, network, and software, and love improving the glue that holds them together. Working with us, you will build tools to constantly improve service availability, performance, and operational velocity. You will nurture a passion for an "automate everything" approach that makes systems failure resistant and ready to scale.

SREs focus on the immediate state and functionality of the Cloudflare platform around the world, leveraging an array of monitoring, alerting and diagnostics tools while developing and enhancing the Cloudflare platform and its capabilities. We own a wide portfolio of applications and services, running a tight feedback loop of developer and operator patterns. The ideal SRE candidate has a passionate curiosity about how the Internet fundamentally works and has a strong knowledge of networking, Linux and TLS along with coding ability in Go, Rust, or Python.

Requisite Skills

  • Aptitude for identifying problems, owning them and working with others to solve them
  • Linux systems experience
  • 3 years of experience in an SRE role or a role with similar functions
  • Software development skills in some programming language such as Go, Rust, or Python
  • Understanding of distributed software systems and large scale system design tradeoffs
  • Intermediate knowledge of common network protocols such as DNS and HTTP

Examples of desirable skills, knowledge and experience

  • Experience with the Linux kernel and Linux software packaging
  • Performance analysis and debugging
  • Configuration management systems such as SaltStack, Chef, Puppet or Ansible
  • Workflow automation systems such as Temporal or Apache Airflow
  • Load balancing and reverse proxies such as Nginx, Varnish, HAProxy, Squid or Apache
  • SQL databases
  • Time series databases and visualization tools such as OpenTSDB, Graphite, Prometheus or Grafana
  • Key/value stores
  • Internetworking and BGP

Bonus Points

  • Experience with continuous/rapid release engineering
  • Strong tooling and automation development experience
  • Experience working in a 24/7/365 service environment
  • Experience working with large scale production distributed systems
  • A history of contributing to Open Source Software

Some tools that we use

  • Nginx
  • PostgreSQL
  • Docker
  • Prometheus
  • Grafana
  • Consul
  • Nomad
  • Temporal
  • Salt

What Makes Cloudflare Special?

We're not just a highly ambitious, large-scale technology company. We're a highly ambitious, large-scale technology company with a soul. Fundamental to our mission to help build a better Internet is protecting the free and open Internet.

Project Galileo: Since 2014, we've equipped more than 2,400 journalism and civil society organizations in 111 countries with powerful tools to defend themselves against attacks that would otherwise censor their work. This is technology already used by Cloudflare's enterprise customers, provided at no cost.

Athenian Project: In 2017, we created the Athenian Project to ensure that state and local governments have the highest level of protection and reliability for free, so that their constituents have access to election information and voter registration. Since the project began, we've provided services to more than 425 local government election websites in 33 states.

1.1.1.1: We released 1.1.1.1 to help fix the foundation of the Internet by building a faster, more secure and privacy-centric public DNS resolver. This is available publicly for everyone to use, and it is the first consumer-focused service Cloudflare has ever released. Here's the deal: we never, ever store client IP addresses. We will continue to abide by our privacy commitment and ensure that no user data is sold to advertisers or used to target consumers.

Sound like something you'd like to be a part of? We'd love to hear from you!

This position may require access to information protected under U.S. export control laws, including the U.S. Export Administration Regulations. Please note that any offer of employment may be conditioned on your authorization to receive software or technology controlled under these U.S. export laws without sponsorship for an export license.

Cloudflare is proud to be an equal opportunity employer. We are committed to providing equal employment opportunity for all people and place great value in both diversity and inclusiveness. All qualified applicants will be considered for employment without regard to their, or any other person's, perceived or actual race, color, religion, sex, gender, gender identity, gender expression, sexual orientation, national origin, ancestry, citizenship, age, physical or mental disability, medical condition, family care status, or any other basis protected by law. We are an AA/Veterans/Disabled Employer.

Cloudflare provides reasonable accommodations to qualified individuals with disabilities. Please tell us if you require a reasonable accommodation to apply for a job. Examples of reasonable accommodations include, but are not limited to, changing the application process, providing documents in an alternate format, using a sign language interpreter, or using specialized equipment. If you require a reasonable accommodation to apply for a job, please contact us via e-mail at hr@cloudflare.com or via mail at 101 Townsend St., San Francisco, CA 94107.
