The pay range for this role is $160,000 - $190,000/yr USD.
WHO WE ARE:
Headquartered in Southern California, Skechers—the Comfort Technology Company®—has spent over 30 years helping men, women, and kids everywhere look and feel good. Comfort innovation is at the core of everything we do, driving the development of stylish, high-quality products at a great value. From our diverse footwear collections to our expanding range of apparel and accessories, Skechers is a complete lifestyle brand.
ABOUT THE ROLE:
The Sr. Reliability Engineer, Digital Commerce is responsible for ensuring the stability, performance, and operational readiness of the global digital commerce ecosystem. This role owns end-to-end reliability of the customer shopping journey – from storefront experience and product discovery through checkout, order lifecycle, and commerce integrations – with a specific focus on the Salesforce Commerce Cloud (SFCC) ecosystem including B2C Commerce storefronts, integrations, and commerce services.
Working at the intersection of engineering, product, and operations, this engineer drives proactive reliability practices, observability standards, incident management discipline, and automation initiatives that reduce operational risk and strengthen digital commerce resilience at global scale.
WHAT YOU’LL DO:
Commerce Platform Reliability
- Own end-to-end operational reliability across the digital commerce stack, including storefront availability, product catalog and pricing services, search and discovery, checkout and payment processing, order lifecycle, and fulfillment integrations (OMS, WMS, payment gateways, tax, fraud, and shipping).
- Ensure stability and performance of the Salesforce Commerce Cloud (SFCC) ecosystem, including Business Manager configurations, WebDAV operations, replication processes, cartridge-based customization layers, and headless/microservice components integrated with SFCC.
- Establish operational standards and reliability guardrails for commerce services and all dependent systems across varying traffic conditions, including peak demand periods.
- Partner with order management teams to ensure reliability across Manhattan Active Order Management (MAO) order routing, fulfillment execution integrations, and downstream fulfillment event integrity, including BOPIS flows.
Observability & Monitoring
- Design and implement monitoring frameworks across digital commerce services, with proactive detection of conversion-impacting issues before they affect customers.
- Define and manage SLIs, SLOs, and alerting strategies tied to business impact including conversion degradation, checkout failure rates, order placement success, and site performance and latency.
- Build operational dashboards that translate technical signals into revenue and customer experience insights.
- Implement monitoring across SFCC-specific signals including pipeline performance, OCAPI health, SCAPI latency, cache effectiveness, replication health, third-party integration response times, and MAO order orchestration signals such as routing latency, fulfillment status synchronization, and exception queue health.
Incident Management & Operational Readiness
- Lead coordination of high-severity commerce incidents, including triage, root cause analysis, and systemic remediation planning, and improve MTTR through automation, tooling, and process optimization.
- Establish and maintain incident runbooks, operational playbooks, and continuous operational readiness standards across commerce platforms.
- Own operational readiness and release planning for major commerce launches, campaigns, and seasonal peak events, including SFCC traffic scaling strategy validation.
- Partner with Salesforce Commerce Cloud support during platform incidents, managing severity escalation processes and coordinating internal response during platform-level disruptions.
Performance & Scalability Engineering
- Identify and remediate performance bottlenecks impacting site speed, checkout latency, and service responsiveness, including SFCC-specific optimization across page caching, CDN configuration, search indexing, and cartridge execution efficiency.
- Partner with engineering teams to drive performance optimization initiatives, support load testing, and own capacity planning and peak readiness validation.
- Ensure commerce systems scale reliably to support business growth and global expansion.
Automation & Reliability Engineering
- Develop automation to reduce manual operational effort and recurring incident classes, including SFCC deployment validation, replication monitoring, integration failure detection, and release risk scoring.
- Implement reliability engineering patterns such as automated recovery workflows, self-healing service orchestration, reliability validation pipelines, and operational health scoring.
- Drive adoption of reliability engineering best practices across delivery teams.
Cross-Functional Collaboration
- Partner with product, engineering, merchandising, marketing, and operations teams to align reliability priorities with business objectives, serving as a reliability advocate during architecture design and solution reviews.
- Act as the reliability liaison between internal commerce engineering teams and Salesforce Commerce Cloud platform teams, coordinating with external vendors and SaaS providers during incident resolution and performance optimization.
- Translate technical reliability risks into clear business impact narratives for both technical and non-technical stakeholders.
WHAT YOU’LL BRING:
- Hands-on experience supporting Salesforce Commerce Cloud (SFCC) production environments, including composable commerce ecosystems integrating SFCC with CMS, search, personalization, and middleware platforms.
- Experience supporting high-traffic global eCommerce environments with modern commerce architectures including headless, composable, and microservices-based platforms.
- Strong background in incident management, observability, and operational excellence practices, including hands-on experience with observability platforms such as Datadog.
- Familiarity with order management systems, payment platforms (such as Cybersource or Adyen), or commerce SaaS ecosystems; exposure to Manhattan Active Order Management (MAO) is a strong plus.
- Experience with CI/CD pipelines, deployment strategies, release governance, APIs, event-driven systems, and commerce integrations.
- Strong understanding of distributed systems, cloud-native infrastructure, and performance optimization for web applications and backend services.
- Experience leveraging AI-assisted engineering tools to improve operational efficiency and automation.
- Strong analytical mindset with the ability to connect technical reliability to business outcomes and communicate effectively with both technical and non-technical stakeholders.
REQUIREMENTS:
- Bachelor's degree in Computer Science, Engineering, or related field, or equivalent experience.
- 7+ years in Site Reliability Engineering, Production Engineering, or Digital Commerce Platform Operations.
- This is a hybrid role based in Manhattan Beach, CA, requiring a minimum of 3 days onsite per week.
About Skechers
Skechers, a global Fortune 500® company, develops and markets a diverse range of lifestyle and performance footwear, apparel, and accessories. Serving over 180 countries and territories, Skechers connects customers to products through department and specialty stores, e-commerce and digital stores, and through our more than 5,300 Skechers retail locations.
Equal Employment Opportunity
Skechers is committed to providing a safe, inclusive, and respectful work environment. Skechers provides equal employment opportunities for all employees and applicants for employment without regard to race, color, religion, gender, gender identification and expression, national origin, marital status, age, disability, genetic information, military status, sexual orientation, or any other protected characteristic established by local, state or federal law.
Reasonable Accommodation
Applicants for employment who require a reasonable accommodation to apply for a job should request appropriate accommodation by emailing benefits@skechers.com.
To perform this job successfully, an individual must be able to perform each job responsibility satisfactorily. The skills, abilities and physical demands described are representative of those that an employee must meet to successfully perform the essential functions of this job. Reasonable accommodation may be made to enable individuals with disabilities, who are otherwise qualified for the job position, to perform the essential functions.