Senior Site Reliability Engineer

McAfee • Frisco, TX, US
2 days ago
Job type
  • Full-time
Job description

Overview

This is a Hybrid position located in Frisco, TX. You will be onsite on an as-needed basis, typically 1 to 6 times a month. We are only considering candidates within a commutable distance to one of the two locations and are not offering relocation assistance at this time.

About The Role

  • As an SRE, you will be accountable for maintaining the appropriate service levels (availability, latency, and reliability) to serve our customers' needs and for reducing the friction of managing change. Your responsibilities include engaging with DevOps, Engineering, and other teams to understand and support business needs and initiatives.
  • Every SRE is responsible for the availability, scalability, security, performance, cost, and compliance requirements of our services.
  • You will ensure applications onboarded to SRE are instrumented for full-stack observability and continuous testing, introduce continuous improvement, integrate into IT Service Operations, and share support responsibilities for critical customer journeys, business flows, and applications.
  • Proactively monitor mission-critical production environments and respond quickly to breaches in trends or issues.
  • Troubleshoot, debug, and escalate issues with proper analysis to concerned teams to ensure maximum availability.
  • Troubleshoot problems in real time, interacting with DevOps/Engineering and internal support representatives to deliver maximum customer satisfaction.
  • Detect and triage all operational incidents and requests.
  • Work to reduce Mean Time to Restore (MTTR) and improve Mean Time To Detect (MTTD).
  • Own availability and performance of mission critical services. Automate to prevent problem recurrence, and respond to non-exceptional service conditions.
  • Help maintain and improve service operations by following established processes and procedures and periodically updating SOPs and documents in Confluence.
  • Create and manage day-to-day processes including Change Management, Incident Management, and Problem Management.
  • Support automation initiatives to improve MTTR and MTTD.
  • Help track KPIs to support operational performance and service reliability.
  • Participate in incident retrospectives and assist in managing the incident lifecycle.
  • Plan and deploy patches and product enhancements to our environments.
  • Engage in readiness reviews before changes or deployments into production environments.
  • Support product engineering teams on SRE-related activities to establish optimal SLAs for predefined activities and provide a high-quality customer experience.
  • Provide detailed summaries of high-priority issues to stakeholders, ensuring data quality.
  • Participate early in the SDLC to ensure reliability is built in from the beginning, and create plans for successful implementations/launches and a smooth transition into the SRE team.
  • Create accurate root cause analyses of production issues and help provide long-term solutions.
  • Continually evaluate and adopt the latest industry technologies to optimize costs and streamline processes.
  • Communicate effectively and present team progress to leadership. Lead by example technically and establish credibility with quality execution. Mentor and coach other SRE team members.

About You

  • 4 to 5+ years of software development and/or technical operations experience, with experience running large-scale applications.
  • Prior experience in SRE/DevOps, Infrastructure Engineering, and Systems Engineering.
  • Experience in defining and monitoring for highly resilient and reliable applications.
  • Experience maintaining and operating production systems (> 99.95% SLA) on Cloud.
  • Able to monitor, debug, and perform root cause analysis (RCA) for service failures.
  • Exceptional communication skills that cross team and geographical boundaries.
  • Advanced knowledge and skills within a specific technical discipline with understanding of the impact of work on other areas of the organization.
  • Enjoy working with a large variety of services and technologies.
  • Experience with monitoring, logging, APM, and other tools: Grafana, CloudWatch, etc.
  • Experience with CI/CD tools: Git, Jenkins, Harness, etc.
  • Experience with container technologies: Kubernetes, Docker.
  • Experience with both Windows and Linux operating systems.
  • Strong knowledge of AWS cloud services covering serverless and containerized workloads.
  • Good to have ITIL, HDI, AWS, or other cloud certifications.
  • Ability to work in a fast-paced, high-growth environment and continuously learn to improve efficiency with technology and tools.
  • Ability to work some non-standard hours to support a global team and initiatives.

Company Overview

McAfee is a leader in personal security for consumers. Focused on protecting people, not just devices, McAfee consumer solutions adapt to users' needs in an always online world, empowering them to live securely through integrated, intuitive solutions that protect their families and communities with the right security at the right moment.

Company Benefits And Perks

We work hard to embrace diversity and inclusion and encourage everyone at McAfee to bring their authentic selves to work every day. We're proud to be Great Place to Work Certified in 10 countries, a reflection of the supportive, empowering environment we've built where people feel seen, valued, and energized to reach their full potential and thrive. We offer a variety of social programs, flexible work hours and family-friendly benefits to all of our employees.

  • Bonus Program
  • Pension and Retirement Plans
  • Medical, Dental and Vision Coverage
  • Paid Time Off
  • Paid Parental Leave
  • Support for Community Involvement

We're serious about our commitment to diversity, which is why McAfee prohibits discrimination based on race, color, religion, gender, national origin, age, disability, veteran status, marital status, pregnancy, gender expression or identity, sexual orientation or any other legally protected status.
