Senior Data Scientist – Generative AI
- Type: Direct Hire
- Location: Houston, TX - Onsite
Position Overview
An established technology-driven organization is seeking a highly experienced Senior Data Scientist to join a collaborative and fast-paced data and analytics team. This role is ideal for a hands-on technical leader with deep expertise in data science, machine learning, and data engineering, with a strong emphasis on designing, building, and deploying Generative AI solutions.
The Senior Data Scientist will partner closely with engineering, product, and business stakeholders to deliver scalable AI/ML solutions that support data-driven decision-making and innovation across the enterprise. The environment emphasizes big data, advanced analytics, and modern ML platforms.
Key Responsibilities
- Design, develop, and deploy advanced machine learning models, with a focus on Generative AI techniques (e.g., GANs, transformers, diffusion models, RAG-based systems).
- Build and train deep learning models using frameworks such as TensorFlow, PyTorch, or JAX for applications in NLP, computer vision, and other generative use cases.
- Collaborate with engineering teams to productionize AI/ML solutions, ensuring performance, scalability, and reliability.
- Partner with technical and business leaders to help shape data strategy, analytics vision, and supporting documentation.
- Apply advanced statistical methods and modeling techniques to analyze large, complex datasets and deliver actionable insights.
- Design and implement robust data pipelines for ingesting, cleaning, and transforming data for machine learning workflows, including lakehouse architectures and medallion-style data governance.
- Architect and maintain scalable data platforms and storage solutions supporting both batch and real-time processing.
- Own end-to-end ML workflows, including feature engineering, model training, evaluation, deployment, and monitoring.
- Continuously evaluate and optimize model performance and contribute to improvements in the broader AI/ML ecosystem.
- Mentor and guide junior data scientists and engineers, promoting best practices and technical growth.
- Stay current with emerging AI/ML research and apply innovative approaches to real-world business challenges.
- Clearly document and communicate technical findings and recommendations to both technical and non-technical audiences.
- Participate in, and at times lead, data and AI technology selection initiatives.
Required Qualifications
- Master’s or PhD in Computer Science, Data Science, Engineering, or a related discipline with a concentration in machine learning, AI, or data engineering.
- Demonstrated hands-on experience with Generative AI, including GANs, VAEs, transformers, RAG pipelines, and other modern deep learning architectures.
- Strong programming skills in Python, R, or Julia, with extensive use of AI/ML libraries and frameworks (e.g., TensorFlow, PyTorch, Scikit-learn).
- Significant experience in data engineering, including ETL pipelines, data wrangling, real-time processing, and data storage solutions.
- Deep understanding of machine learning algorithms, deep learning methods, and statistical modeling.
- Experience working in cloud environments (AWS, Azure, or GCP) and with distributed computing frameworks such as Spark or Hadoop.
- Proven ability to design and optimize data architectures for both batch and streaming data use cases.
- Proficiency with SQL and NoSQL databases and data query languages.
- Strong analytical, problem-solving, and critical-thinking skills, with the ability to work autonomously or collaboratively.
- Excellent written and verbal communication skills, with the ability to translate complex technical concepts for diverse audiences.
Preferred Experience
- Hands-on experience with NLP and computer vision use cases, including large-scale pre-trained models (e.g., GPT, BERT, DALL·E).
- Experience deploying and monitoring ML models in production using tools such as Docker, Kubernetes, MLflow, or similar platforms.
- Familiarity with software engineering best practices, including version control (Git) and collaborative development workflows.
- Exposure to DevOps and CI/CD pipelines supporting machine learning systems.
- Published research or contributions to recognized AI/ML conferences, journals, or open-source projects.