Talent.com
Site Reliability Engineer

The Judge Group, Irving, TX, US
4 days ago
Job type
  • Full-time
Job description

About the Role:

Our client is seeking a Site Reliability Engineer (SRE) with deep expertise in monitoring, debugging, and optimizing Azure App Services. This role is critical in ensuring our platforms remain reliable, performant, and scalable as we continue to grow.

You'll combine hands-on Azure experience with code-level debugging, observability best practices, and automation to prevent issues before they occur, drive down MTTD / MTTR, and deliver an exceptional experience for patients and providers. If you thrive at the intersection of infrastructure, development, and performance, this is the role for you.

What You'll Do:

Monitoring & Debugging

  • Design, implement, and fine-tune monitoring systems for Azure-based applications.
  • Build custom dashboards with Azure Application Insights, Azure Monitor, and related tools.
  • Analyze logs, metrics, and traces to proactively troubleshoot performance and reliability issues.
  • Apply proficiency in C#, .NET, Angular, and SQL for code-level debugging and issue resolution.

Azure App Service Expertise

  • Optimize application performance through a deep understanding of Azure App Service architecture.
  • Configure, manage, and scale App Service environments for multiple applications.

Azure Tooling & Automation

  • Leverage Diagnose and Troubleshoot Tools, Kudu, and PowerShell scripting to resolve application and infrastructure issues.
  • Automate monitoring, alerting, and remediation workflows to improve reliability and reduce toil.

Application Performance Monitoring

  • Use tools like Grafana, Prometheus, or other APM platforms to optimize system health and application performance.
  • Stay adaptable and quickly learn new monitoring tools and frameworks as needed.

Collaboration & Communication

  • Partner closely with developers and operations to design effective monitoring solutions.
  • Document and communicate findings, solutions, and RCA reports with clarity and impact.

What We're Looking For:

  • Bachelor's degree in Computer Science, IT, or a related field.
  • Microsoft Azure Fundamentals (AZ-900) certification required.
  • Proven SRE experience with a focus on monitoring, debugging, and incident response.
  • Extensive hands-on work with Azure App Services, Application Insights, and Azure Monitor.
  • Skilled with Diagnose and Troubleshoot Tools, Kudu, and PowerShell scripting.
  • Strong programming fundamentals with the ability to read and troubleshoot .NET / C# and Angular code.
  • Experience in on-call operations, incident response, and RCA writing.
  • Bonus: Experience with Grafana / Prometheus, Datadog / Dynatrace, Azure Front Door, CDN, Function Apps, WebJobs, Service Bus, or Event Hub.
  • Excellent communication, collaboration, and problem-solving skills.
  • Additional Azure certifications are a strong plus.