Director of Data
The Wikimedia Foundation, the non-profit that hosts Wikipedia and its sister projects, is seeking a deeply technical, outcome-oriented Director of Data to lead our work across data engineering, search, experimentation, and data-related site reliability engineering (SRE). Wikipedia is a top-ten website that serves audiences in over 300 languages and reaches over a billion people each month. Our data systems power everything from product analytics and machine learning to product features, community tools, and public datasets used by researchers worldwide. In this role, you'll be accountable for shipping reliable, privacy-respecting data products and platforms and for maintaining a high technical bar for how we design, build, operate, and evolve them in the open.
You'll manage managers and principal ICs across multiple teams (Data Engineering, Search, Experimentation, and Data SRE). You'll set clear, realistic roadmaps; focus on production-readiness; and provide strategic technical oversight while being comfortable unpacking and researching deep technical problems and solutions. You have enough experience to know what "good" looks like for petabyte-scale data lakes, event pipelines, and search and experimentation stacks, and to scale the best practices for how we plan, build, and operate production systems.
We'd like you to:
- Ship safely and incrementally. Drive the work across data engineering, search, experimentation, and data SRE in close partnership with product management to align technical execution with user needs and priorities. Balance velocity with reliability, involving stakeholders both inside Wikimedia and the larger volunteer community.
- Develop people and teams. Manage managers, coach senior ICs, scale hiring for a diverse, international, remote-first organization, and foster a collaborative, mission-aligned culture.
- Provide technical strategy and oversight. Set architectural direction grounded in experience with event-based architectures, data ingestion, modeling, freshness / accuracy SLOs, data governance, privacy by design principles, and cost efficiency.
- Partner effectively with Product Management. Collaborate with our Group Product Manager for Data Platform and senior PMs to balance technical implementation with diverse user needs. Navigate priority trade-offs through healthy debate and shared accountability.
- Be an operational multiplier. Identify and share patterns across teams that reduce toil and let ICs focus on what they do best.
You are:
- Deeply technical with strong judgment. You can read design docs and PRs, spot missing edge cases or constraints, and make calls grounded in industry best practice.
- Biased towards shipping regularly and with confidence. You break work into safe, incremental releases with a crisp definition of done, unblock teams quickly, and measure outcomes.
- Operationally focused. You set and hold SLOs / error budgets, collaborate with stakeholders, manage vendor relationships, and treat incidents as opportunities to automate and harden systems.
- Collaborative across functions. You build strong partnerships with product management, research, analytics, and product teams to achieve shared outcomes.
- Mission driven. You balance product impact with privacy and transparency, partnering with volunteers to build in the open for a global, multilingual community.
Skills and experience:
- 8+ years of engineering leadership with 3+ years managing managers across data-heavy backend teams (or an equivalent track record of shipping production data systems at internet scale).
- Track record of shipping production data systems at massive scale.
- Hands-on experience with relevant open source tech stacks (e.g. Kubernetes, Kafka, Spark, Flink, Hadoop, Ceph, Airflow).
- Ability to hire, coach, and lead globally distributed teams.
Additionally, we'd love it if you have:
- A track record of open source participation.
- Fluency or familiarity with languages in addition to English.
- Experience as a member of a volunteer community.