Site Reliability Engineer
Location: Barcelona, Spain (hybrid, with 8 days a month in the office)
Contract: Permanent, with an end client (a large retail company)
Experience: minimum of 5 years
Your Tasks
- Design, build, and maintain our internal Kubernetes-based developer platform.
- Partner with engineering teams to drive the adoption of Kubernetes and cloud-native technologies.
- Support deployment and operation of services in Azure and Google Cloud Platform.
- Improve observability, alerting, and system reliability through best practices and tooling.
- Automate infrastructure provisioning and CI/CD pipelines using Infrastructure-as-Code (IaC) tools.
- Champion DevOps practices and ensure alignment across teams.
- Continuously enhance the scalability, security, and resilience of our platform.
- Lead and own the design, implementation, and evolution of solutions.
Your Profile
- 4+ years of experience in Site Reliability Engineering or related roles.
- Proven hands-on experience operating and managing Kubernetes clusters.
- Strong working knowledge of both GCP and Azure cloud platforms.
- Solid Linux system and networking fundamentals.
- Experience with scripting or programming languages such as Bash, Python, or Go.
- Observability and monitoring expertise with tools such as the Grafana stack and OpenTelemetry.
- Proficiency with Infrastructure-as-Code tools, especially Terraform.
- Solid understanding of CI/CD workflows and modern DevOps methodologies.
- Familiarity with cloud security and distributed system reliability.
- Understanding of and experience with the GitOps approach.
- Knowledge of non-relational databases, particularly MongoDB.
- Experience working with message brokers (e.g., Kafka, RabbitMQ).
- A hands-on mindset: you take ownership of your solutions from design to deployment.
- Open-minded, proactive, and capable of leading cross-functional initiatives.
- Comfortable working in agile, cross-disciplinary teams.