Required Experience
- 2+ years with Ceph in production (deployment, configuration, cluster management)
- 2+ years in site reliability engineering or similar role
- Linux/Unix systems administration with a focus on automation at scale
- Proficiency in Python or Bash
- Ansible, Terraform, or SaltStack
- Nagios-based monitoring (e.g., Icinga2)
- Observability tooling: Prometheus, Grafana, Mimir, Loki
- Core networking concepts on Linux/Unix systems
Nice to Have
- Docker & Kubernetes (containerization & orchestration)
- Compute platforms: OpenStack, AWS
- CI/CD pipelines and automation workflows
Day-to-Day Responsibilities
- Automate and maintain storage systems to meet application demands
- Develop tools and scripts to streamline storage operations
- Monitor performance, identify issues, and implement solutions
- Participate in agile practices: standups, code reviews, CI/CD, automated testing
- Improve reliability, performance, and capacity proactively
Your Skills Match
STRONG MATCH
Python
Bash
Ansible
Terraform
Linux (RHEL/SUSE)
Kubernetes
Docker
AWS
OpenStack
CI/CD (Jenkins/GitLab)
Automation at Scale
Storage Architecture
REQUIRED — HIGHLIGHT IN INTERVIEW
Ceph (Kubernetes storage backend, alongside Longhorn)
Prometheus / Grafana
Icinga2 / Nagios
Mimir / Loki
SaltStack
You deployed Ceph and Longhorn storage backends at Microland for Kubernetes workloads — lead with that. Tie your Dell EMC/Pure Storage production ops experience to the reliability and monitoring requirements.
Compensation by Location
| Location | Salary Range (USD) |
| --- | --- |
| Bay Area (Santa Clara, San Francisco) & Los Angeles | $128,000 – $192,000 |
| Austin, D.C. Metro, CA (non-Bay Area), HI, IL, MA, NH, OR, VA, WA | $110,500 – $165,500 |
| New York City Metro, Kirkland/Seattle | $117,200 – $175,800 |
| All other US locations | $98,500 – $147,500 |
Compensation may also include a corporate bonus and/or equity awards. This is a remote position.
Benefits
Medical / Dental / Vision
401(k)
Paid Time Off
Paid Sick Time
Paid Holidays
Paid Parental Leave
Paid Wellness Days
Life Insurance
AD&D Insurance
Short & Long-Term Disability
Employee Stock Purchase Plan
Tuition Assistance
Adoption / Surrogacy / Fertility
Dependent Daycare & Backup Care
Employee Assistance Program
Financial Education & Advice
Interview Prep Tips
- Lead with Ceph experience from Microland — Kubernetes storage backends (Ceph, Longhorn) in production
- Tie Dell EMC PowerScale/PowerMax ops to the storage reliability and performance monitoring requirements
- Prepare a Prometheus/Grafana story: even if your experience is indirect, connect it to your CloudWatch and monitoring-dashboard work
- Demonstrate Python/Bash automation at scale — storage provisioning scripts, AWS resource management
- Highlight Ansible/Terraform IaC work — reusable modules, environment consistency, drift reduction
- Be ready to discuss Ceph cluster architecture: OSDs, MONs, MGRs, CRUSH maps, replication
- Show agile/SRE mindset: incident response, post-mortems, SLOs, toil reduction