Data Center Facility Operations Reliability Engineer

Meta
Boston, MA, United States
6 days ago
Job type
  • Full-time
Job description

Overview

Summary:

Meta was built to help people connect and share, and over the last decade, our tools have played a critical part in changing how people around the world communicate with one another. With over two billion people using the service and hundreds of offices around the globe, a career at Meta offers countless ways to make an impact in a fast-growing organization.

Our Data Centers are the foundation upon which our rapidly scaling infrastructure efficiently operates to deliver our innovative services. Meta is seeking an experienced and self-motivated Reliability Lead to join our Asset Management & Reliability team within Facility Operations. This person will work at the leading edge of Facility Operations to identify and manage asset reliability risks that could adversely affect data center operations. Managing stakeholders spread across time zones is a significant challenge and key to the success of our individual projects and overall asset management, quality and reliability program.

Responsibilities

Support the asset care and maintenance strategies for critical assets based on Meta processes

Support the development of standards, guidelines and processes to execute the reliability program function

Lead and facilitate asset criticality assessments, RCM studies, PM Optimization and other reliability studies

Perform reliability analytics, including Weibull analysis, Monte Carlo simulation and other reliability analyses

Act as liaison between Reliability and other partner teams (AM, Quality, SSU, Retrofits)

Support the development of standardized PM templates to facilitate trending

Work with appropriate technical teams to evaluate the reliability and maintainability of data center equipment and significantly influence improvements in both

Work with Asset Management and Quality teams to evaluate failure data and other information and build it into a global reliability database

Provide input for key documents such as reliability process playbooks, executive briefs, presentations and program metrics

Support the spares development and sustainment program

Support the development and stewardship of maintenance strategies

Support the Master Data and asset onboarding processes

Develop or recommend engineering solutions to repetitive failures and all other problems that adversely affect plant operations

Define, design, develop, monitor, and refine an asset maintenance plan that includes both (a) value-added preventive maintenance tasks and (b) predictive and other non-destructive testing methods designed to identify and isolate inherent reliability issues

Develop Reliability Improvement Process (RIP) reports on critical asset failures

Work with Maintenance to analyze asset characteristics, including asset availability, overall equipment effectiveness and remaining useful life

Provide technical support to Operations, Maintenance management, and technical personnel

Apply value analysis to repair / replace, repair / redesign, and make / buy decisions

Minimum Qualifications

Bachelor’s degree in Mechanical, Electrical or Reliability Engineering, or a similar technical discipline

10+ years of experience in reliability engineering (related to electrical or mechanical cooling equipment)

Experienced in Reliability Centered Maintenance (RCM) and Failure Mode and Effects Analysis (FMEA) activities for maintenance / process / equipment design optimization to meet reliability requirements

Proficient in using EAM solutions to extract data and develop meaningful insights

Certifications in Maintenance & Reliability such as CMRP, CRL, CRE

Knowledgeable of relevant ISO standards (ISO 14224, ISO 17359, ISO 55000)

Preferred Qualifications

Experience with data center equipment such as critical cooling systems, generators, main switchboards, network gear

Proficient in data analysis techniques that can include Process Control, Reliability modeling and prediction, Fault Tree Analysis, Weibull Analysis, Six Sigma (6σ) Methodology

Proficient in developing and executing test plans for assets

Certifications in Maintenance & Reliability such as CMRP, CRL, CRE

Knowledgeable of relevant ISO standards (ISO 14224, ISO 17359, ISO 55000)

Public Compensation

$133,000 / year to $190,000 / year + bonus + equity + benefits

Industry: Internet

Equal Opportunity

Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.

Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.
