Position: SRE - Site Reliability Engineer
Location: Remote/PST
Duration: 1 Year
Qualifications
- Advanced Kubernetes: Must have strong skills in Kubernetes at scale using GKE, AKS, EKS or RKE. Experience with kubectl and Helm. Worked on EKS with kubectl.
- Containers: Experience deploying Java (Spring Boot) microservices in Dockerized environments.
- Observability: Experience setting up tools like Prometheus/Grafana, Datadog, AppDynamics, Splunk to give actionable insight into a microservices environment, including synthetics, application performance monitoring, logging and alerting (PagerDuty/OpsGenie integrations). Worked on Elasticsearch and OpsGenie integration.
- CI/CD: Strong expertise with Jenkins, Azure DevOps, GitHub Actions, ArgoCD, Artifactory, Azure Container Registry, Google Container Registry and other similar tooling. Worked on Jenkins, ArgoCD.
- SCM: Experience with tools like GitHub/GitLab for source code management and branching strategies like GitFlow and trunk-based development. GitLab using GitFlow.
- Strong troubleshooting skills: Ability to debug code-level issues and contribute to root cause analysis after problem resolution.
- Good communication skills: Respect, active listening, verbal and non-verbal communication, clarity and concision, confidence, open-mindedness.
- Documentation skills: Effectively document automation and technical efforts for easy adoption.
- Collaboration skills: Work effectively with Scrum/Dev teams with a push/pull philosophy to manage expectations and contribute to platform stability and improvement.
Desirable Skills/Knowledge/Experience
- IAC: Terraform, Pulumi. Preferably developed modules in the past rather than just using them. Terraform
- Security: Worked with encryption-at-rest and encryption-in-transit patterns. Experience with tools like Azure Key Vault, HashiCorp Vault, Google KMS.
- Security: Experience with tools like Veracode and Black Duck for AppSec testing, Qualys scanners for infra testing and Twistlock/Aqua for container scanning. Qualys for OS vulnerability scanning and fixing.
- Automation: Identify toil and opportunities to reduce it within the team.
- Authentication/Authorization: Familiarity with AuthN/AuthZ schemes like OpenID, OAuth 2.0, SAML.
- Scripting and Programming: Python, PowerShell, Go, Java, Node.
- Event Driven/Event Sourcing Patterns: Familiarity with distributed event streaming platforms like Kafka, EventHub, RabbitMQ and patterns like CQRS.
- Advanced Microservice Patterns: Familiarity with Saga, Choreography and Orchestration patterns.

Washington DC, United States of America
JS26489_25303_305784A40126E2E88E9360171B5EEC74
1/27/2026 6:13:20 AM