Salesforce’s Office of Ethical and Humane Use (OEHU) is seeking an experienced responsible AI data scientist with an adversarial mindset and hands-on experience in ethical red teaming to contribute to our ethical red teaming practice. In this role, you will help us gain a deep understanding of how our models and products may be leveraged by malign actors, or through unanticipated use, to cause harm. In addition to adversarial testing, you will analyze current safety trends and develop solutions to detect and mitigate risk, while working cross-functionally with security, engineering, data science, and AI Research teams. You will bring technical depth to the assessment of AI products, models, and applications in order to identify the best technical mitigations for the risks identified.
The ideal candidate will have technical experience in generative as well as predictive artificial intelligence and in responsible / ethical AI.
Responsibilities:
Provide technical leadership in designing, prototyping, and implementing comprehensive adversarial testing strategies, including both automated and manual adversarial testing approaches
Mentor and guide collaborator teams on adversarial testing standard processes, helping them develop the skills to conduct their own testing effectively
Collaborate with cross-functional teams to integrate OEHU adversarial testing frameworks into the AI development lifecycle

Safety and Robustness
Contribute to the development of detection models, safety guardrails, and other proactive measures to prevent and mitigate risks posed by bad actors
Research and implement innovative techniques for enhancing AI safety and robustness, drawing from both open-source and internal tools
Collaborate with Salesforce’s AI Research team on novel approaches to model safety

Technical Research and Implementation
Write clean, efficient, and well-documented code (primarily in Python) to support research efforts and facilitate the evaluation of AI systems
Develop and maintain a repository of reusable code modules and libraries to streamline adversarial testing processes

Testing Execution and Collaboration
Participate in scoping, documenting, and implementing tests with partner teams, including the implementation of mitigations identified during testing
Test for technical vulnerabilities, model vulnerabilities, and harm / abuse including but not limited to bias, toxicity, and inaccuracy
Participate in labeling test data in partnership with OEHU and partner teams

Reporting, Documentation, and Continuous Learning
Write reports covering the goals and outcomes of testing operations, including significant observations and recommendations
Continuously monitor and analyze emerging threats and vulnerabilities to inform the development of adaptive safety measures
Continue to grow expertise in model safety by keeping up with research in socio-technical systems, privacy, interpretability / explainability, robustness, alignment, and responsible AI

Qualifications:
3-5 years of industry experience in Software Engineering, AI ethics, AI research, Applied research, ML, DS, or similar roles
Demonstrated ability to think adversarially: to anticipate how malicious actors might misuse AI systems and to develop corresponding test scenarios
Experience creating heuristic-based detection logic and rules for identifying anomalous or suspicious activity in production systems and networks (e.g. log analysis, user behavior analytics)
Experience with problem-solving and troubleshooting sophisticated issues with an emphasis on root-cause analysis
Experience in analyzing sophisticated, large-scale data sets and communicating findings to technical and non-technical audiences
Proven organizational and execution skills within a fast-paced, multi-stakeholder environment
Experience working in a technical environment with a broad, cross-functional team to get results, define requirements, coordinate assets from other groups (design, legal, etc.), and deliver key achievements
Excellent written and oral communication skills, as well as social skills, including the ability to articulate technical concepts to both technical and non-technical audiences
Works well under pressure and is comfortable working in a fast-paced, ever-changing environment
Experience using SQL and relational databases. Ability to use Python, R, or other scripting languages to perform data analysis at scale
A related technical degree is required

Preferred Qualifications:
Industry experience in AI red teaming, including implementing manual, semi-automated, or automated adversarial testing approaches in production systems
Experience with specific AI attack methodologies (prompt injection, model extraction, adversarial examples, etc.)

In-office expectations are 36 days per quarter to support customers and/or collaborate with their teams.