We're looking for a Senior Machine Learning Engineer to join our small, growing AI/ML team. You'll develop and deploy advanced ML solutions (classification, clustering, and regression) that optimize how we match respondents to surveys and that safeguard data quality. Beyond model development, you'll play a key role in shaping the MLOps practices that keep those models reliable in production: pipelines, deployment, monitoring, and scaling on AWS.
This role demands autonomy, deep technical expertise, and a strong ability to mentor others. You'll collaborate with data engineers, software engineers, and product teams while helping design, develop, and scale our ML infrastructure.
Responsibilities
- Develop, train, and optimize machine learning models, primarily classification and regression models built with gradient-boosting frameworks (LightGBM), with the opportunity to expand into more advanced architectures, including reinforcement learning and LLMs, as our capabilities grow.
- Design, build, and maintain ML pipelines on AWS (SageMaker, S3, Fargate).
- Own the full model lifecycle, from experimentation and training through production deployment and ongoing monitoring.
- Implement monitoring, logging, and alerting for deployed models to catch performance degradation early.
- Build and integrate real-time and batch inference systems, APIs, and services for model serving.
- Work with Snowflake and cloud data infrastructure to access, prepare, and query large datasets using SQL.
- Collaborate with engineering, product, and business stakeholders to deliver data-driven solutions.
- Mentor team members, providing guidance on best practices in ML development and MLOps.
- Drive technical discussions on model architecture, tooling choices, and team engineering standards.
Required Skills
- 5+ years of experience building, deploying, and scaling ML models in production.
- Proficiency in Python, with strong experience in ML libraries (LightGBM, scikit-learn, NumPy, pandas).
- Deep understanding of supervised learning; familiarity with reinforcement learning and LLMs is a plus.
- Demonstrated ability to diagnose production model issues, design experiments to validate improvements, and make data-informed tradeoff decisions.
- Strong MLOps expertise: pipelines, CI/CD for ML, model deployment, and monitoring.
- Hands-on experience with AWS services for ML (SageMaker, S3, Fargate, Lambda, ECR).
- SQL proficiency with experience querying large datasets (Snowflake or similar).
- Ability to work autonomously and make key technical decisions related to architecture, standards, and tooling.
- Experience mentoring others and teaching best practices in ML development and MLOps.
Nice to Have
- Java experience for integration with our application layer.
- Understanding of containerization and orchestration (Docker, ECS).
- Experience with infrastructure as code (Terraform, CloudFormation, CDK).
- Experience building A/B testing or experimentation frameworks.
- Experience with experiment tracking tools (MLflow, Weights & Biases).
- Background in data engineering or building data pipelines.
- Experience with model monitoring or observability tools (CloudWatch, Evidently AI, Prometheus).
- Familiarity with transformer architectures, fine-tuning, or RAG patterns.
Why Join Us?
- Ownership & Autonomy: A small team where you'll have the freedom to shape our ML stack and your contributions will have direct, visible impact.
- Variety of Work: No siloed roles. You'll work across the full ML lifecycle: data engineering, modeling, infrastructure, and applications.
- Cutting-Edge Tech: Build scalable MLOps infrastructure, with a roadmap toward RL and LLM capabilities.
- Collaborative Culture: A small, tight-knit ML team where you'll pair on architecture decisions, share code reviews, and have a direct line to stakeholders.
- Impact-Driven: Your models will serve millions of survey respondents, directly shaping how data-driven decisions are made at scale.