Site Reliability Engineer - USDS

TikTok

Seattle

Full-time

About TikTok U.S. Data Security

TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy.

U.S. Data Security (USDS) is a subsidiary of TikTok in the U.S. This new, security-first division was created to bring heightened focus and governance to our data protection policies and content assurance protocols to keep U.S. users safe. Our focus is on providing oversight and protection of the TikTok platform and U.S. user data, so millions of Americans can continue turning to TikTok to learn something new, earn a living, express themselves creatively, or be entertained.

The teams within USDS that deliver on this commitment daily span across Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions and more.

Why Join Us

Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.

Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day. To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve. Join us.

Site Reliability Engineering (SRE) at TikTok combines software and systems engineering to build and run large-scale, massively distributed, and fault-tolerant systems.

In our team, you’ll have the opportunity to manage the complex challenges of scale, while using expertise in coding, algorithms, complexity analysis, and large-scale system design.

We embrace a culture of diversity, intellectual curiosity, openness, and problem-solving. We encourage close collaboration while promoting self-direction.

To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department.

We regularly review our hybrid work model, and the specific requirements may change at any time.

Responsibilities

- Develop and maintain automation procedures to maximize system efficiency and minimize human intervention.
- Work closely with software engineering teams to design, deploy, and operate elements to ensure that systems are functionally robust.
- Ensure system scalability to handle growth in web traffic and data.
- Implement monitoring tools and set up metrics to keep track of system health and performance.
- Participate in on-call rotations, assist with incident management, and diagnose, resolve, and prevent production issues.
- Conduct performance tests to find and address system bottlenecks.
- Collaborate with teams across the organization to define Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs).
- Practice sustainable user support, incident response, and blameless postmortems.

Qualifications

- Bachelor's degree in Computer Science, Information Technology, or a related field with 3+ years of experience.
- Proven work experience as a Site Reliability Engineer, Systems Engineer, or similar software engineering role.
- Proficient knowledge of high-level programming languages (e.g., Python, Go, Java, and Shell script).
- Experience in network architecture, database modeling, cloud systems, and large-scale distributed systems.
- Strong understanding of Linux operating systems and open-source technologies.
- Preferred: experience with MySQL, Redis, Nginx, Kubernetes, Docker, OpenStack, Hadoop, Spark, etc.
- Preferred: knowledge of monitoring tools and methodologies (such as Prometheus and Grafana).
- Excellent problem-solving skills, strategic thinking, and a strong ability to debug complex systems.
- Exceptional communication skills and the ability to effectively collaborate with cross-functional teams.
