Reliability Engineer

Etched, San Jose, CA, US
Posted 29 days ago
Contract type
  • Full-time
Job description

About Etched:

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep chain-of-thought reasoning.

Job Summary:

We are seeking a skilled and detail-oriented Reliability Engineer to join our team. As a Reliability Engineer at Etched, you will play a critical role in ensuring that all components and systems meet our rigorous reliability standards, essential for our datacenter applications. This position requires a deep understanding of reliability engineering principles, as well as experience working with suppliers, ODMs, and JDMs.

Key Responsibilities:

Lead the development, implementation, and management of reliability standards for all suppliers working with Etched. Ensure that all components and systems meet or exceed the required reliability benchmarks.

Review and verify reliability reports from suppliers, ensuring accuracy and adherence to Etched’s standards. Provide guidance and feedback to suppliers to ensure continuous improvement in reliability performance.

Collaborate with cross-functional teams to review and recommend component selection criteria based on reliability performance. Ensure that all selected components are capable of meeting the long-term reliability requirements of our datacenter applications.

Evaluate and approve reliability test plans proposed by external vendors. Ensure that the test methodologies and conditions are sufficient to validate long-term reliability under expected operating conditions.

Conduct in-depth analysis of reliability data provided by suppliers and vendors. Identify trends, potential issues, and areas for improvement to enhance overall reliability.

Work closely with ODMs (Original Design Manufacturers) and JDMs (Joint Design Manufacturers) to ensure that all products meet Etched’s quality and reliability standards. Provide technical guidance and support to maintain maximum operational uptime and long-term reliability.

Review and establish reliability metrics and standards for silicon components, ensuring they meet the stringent requirements for long-term reliability in datacenter environments.

You may be a good fit if you have:

Bachelor’s or Master’s degree in Reliability Engineering, Electrical Engineering, or a related field.

5+ years of experience in reliability engineering, with a focus on datacenter applications preferred.

Strong understanding of reliability standards, testing methodologies, and data analysis techniques. Experience with DFMEA, PFMEA, and SPC engineering analysis is desired.

Experience working with suppliers, ODMs, and JDMs in a high-tech environment.

Excellent communication skills, with the ability to convey complex technical concepts to diverse stakeholders.

Proven ability to manage multiple projects and deliver results in a fast-paced environment.

How we’re different:

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in West San Jose, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Base Salary Compensation:

$175,000 - $225,000

Benefits:

Full medical, dental, and vision packages, with generous premium coverage

Housing subsidy of $2,000 / month for those living within walking distance of the office

Daily lunch and dinner in our office

Relocation support for those moving to West San Jose
