Job Description
Arc XP (https://www.arcxp.com/) is a cloud-based digital experience platform that empowers enterprise companies, retail brands, and media and entertainment organizations to create and distribute content, drive digital commerce, and deliver compelling multichannel experiences. As a division of The Washington Post, Arc XP has facilitated the digital transformation of customers worldwide, currently serving over 1,900 sites in more than 25 countries, reaching nearly 2 billion unique visitors monthly.
The Arc XP platform runs on AWS and is built on a microservices architecture. Our software teams embrace DevOps principles to deliver and maintain products efficiently. With lightweight processes, our teams innovate rapidly and bring new ideas to market quickly. New features and products are deployed daily to our customer base and delivered as Software as a Service.
We’re looking for a Senior DevOps Engineer to join our Cloud & Delivery Automation team. In this role, you will design, implement, and manage critical infrastructure and deployment pipelines that power Arc XP’s cloud platform. You’ll work at the intersection of development and operations, with a strong focus on enabling development velocity, operational excellence, and infrastructure reliability.
You’ll collaborate closely with software engineers, architects, and product teams to ensure systems are scalable, secure, and resilient. From infrastructure as code to observability, and from CI/CD to incident response, you’ll help define how our platform runs at scale.
This is a high-impact role where your decisions will directly shape how services are deployed, maintained, and monitored across our global customer base.
Motivation
You are passionate about automation, continuous improvement, and reducing manual toil through infrastructure as code and intelligent observability
You thrive on solving complex problems related to scale, resilience, and platform performance
You proactively identify bottlenecks in deployment pipelines or infrastructure and take initiative to address them
You value knowledge sharing and mentoring others, helping to grow a DevOps mindset across your team and the wider org
Responsibilities
Architect and maintain highly available, secure, and scalable infrastructure using AWS services like ECS, Lambda, Step Functions, IAM, DynamoDB, and CloudFormation
Design and improve CI/CD pipelines for application and infrastructure deployments using tools like GitHub Actions or AWS CodePipeline
Enhance monitoring, alerting, and observability using systems such as Datadog, AWS CloudWatch, or Grafana to proactively detect and resolve issues
Implement and manage infrastructure as code (IaC) to enable reproducible, auditable, and scalable cloud environments
Optimize and support containerized services using Docker and Amazon ECS, including environment configuration and orchestration
Drive zero-downtime deployments, rollback strategies, and automation of health checks and release validations
Collaborate with development teams on build tooling, service deployment flows, and performance tuning
Integrate with identity management solutions including SSO, MFA, OAuth, and SAML, particularly with tools like Okta
Contribute to our incident response processes, including participation in on-call rotation and root cause analysis for production issues
Participate in agile ceremonies (planning, refinement, retrospectives) and foster cross-functional collaboration in a distributed team
Qualifications
Minimum Qualifications
BA/BS in Computer Science, Engineering, or equivalent practical experience
3+ years of experience in a DevOps, Site Reliability, or Cloud Engineering role with a strong understanding of production infrastructure
Proficiency with AWS services including Lambda, ECS, IAM, Step Functions, CloudFormation, and DynamoDB
Experience with CI/CD pipelines and release management, particularly around automated testing, builds, and deployments
Proficient in at least one server-side language (e.g., Python, Node.js, Go, or Java) for scripting and tool development
Working knowledge of Docker and container lifecycle management in production
Familiarity with web proxies (e.g., NGINX) and API gateways (e.g., AWS API Gateway)
Experience with observability tools (e.g., Datadog, CloudWatch, Grafana) to ensure system health and performance
Understanding of identity protocols and security practices, including SAML, OAuth, MFA, and Okta integration
Familiarity with agile development workflows and working in distributed, remote-first teams
Preferred Qualifications
Experience building and managing serverless or event-driven applications using AWS Lambda, Step Functions, SNS, or SQS
Proficiency with Infrastructure as Code (IaC) tools such as CloudFormation or CDK
Familiarity with advanced deployment strategies (e.g., blue/green, canary, rolling updates) for high-availability systems
Hands-on experience with end-to-end or integration testing frameworks and integrating them into CI/CD workflows
Exposure to frontend build and deployment pipelines, especially for React-based applications
Experience working with third-party APIs, including authentication flows and monitoring external service health
Knowledge of CDNs like CloudFront or Akamai and strategies for caching and content delivery
Experience participating in on-call rotations, with a strong focus on incident response and root cause analysis to continuously improve systems
Familiarity with DevSecOps practices, such as IAM permissions hygiene, secret management, and automated security scanning
Arc XP’s mission is best served by a diverse, multi-generational workforce with varied life experiences and perspectives. All cultures and backgrounds are welcome.