Senior Site Reliability Engineer
Salary: Up to $2,500
Level: Experienced
Department: Infrastructure & Support
Location: Ha Noi

We are seeking a Senior Site Reliability Engineer (SRE) with deep expertise in bare-metal Linux systems, performance optimization, and large-scale data platforms. In this role, you will be responsible for ensuring the reliability, scalability, and efficiency of our production environment, which underpins mission-critical data services.

You will work at the intersection of systems engineering, performance troubleshooting, and data infrastructure reliability, while partnering closely with engineering teams to embed SRE best practices across the software lifecycle.

RESPONSIBILITIES

System Reliability & Performance  

  • Own the reliability, scalability, and performance of core production systems.  
  • Perform advanced performance troubleshooting and tuning across OS, network, and application layers.  
  • Optimize resource usage on bare-metal Linux servers to maximize efficiency and reliability.

Data Infrastructure Reliability  

  • Operate and scale our Kafka-based enterprise messaging and event streaming system.
  • Ensure high availability and performance of our ClickHouse data warehouse.

Automation & Observability  

  • Enhance system observability through metrics, tracing, and logging (Prometheus, Grafana, CheckMK, OpenTelemetry).  
  • Design and maintain alerting systems that balance coverage with actionable signals.  

Incident Response & Coordination

  • Lead high-severity incident response, acting as the arbiter for cross-team coordination when failures affect multiple teams.
  • Drive blameless postmortems and systemic improvements.  

Reliability Culture & Mentorship  

  • Mentor engineers on performance tuning, deployment safety, and reliability-first design.
  • Promote a culture of automation, ownership, and operational excellence.

REQUIREMENTS

Experience  

  • 5+ years in SRE, systems engineering, or infrastructure-focused roles, including at least 2 years in a senior or lead position.
  • Strong track record managing large-scale production systems on bare-metal Linux.  

Technical Skills  

  • Expert-level skills in Linux internals, system performance troubleshooting, and tuning.  
  • Hands-on experience operating and scaling Kafka or equivalent messaging systems.  
  • Hands-on experience operating and scaling ClickHouse or a similar OLAP database.
  • Solid coding/scripting ability in Python, Go, or Bash.  
  • Proficiency with Infrastructure-as-Code tools (Terraform, Ansible, etc.).  
  • Experience building and operating highly available distributed systems.  

Soft Skills 

  • Analytical problem-solver with a strong performance-first mindset.  
  • Advocates for automation and reducing toil.  
  • Communicates clearly across both technical and non-technical teams.  
  • Thrives in high-accountability, reliability-driven environments.  

Nice-to-Have  

  • Hands-on experience operating Kubernetes clusters at scale.
  • Familiarity with modern data lakehouse architectures.
  • Prior experience with capacity planning and benchmarking at scale.

HIRING PROCESS 

Phone Screening > Onsite Interviews > Offer.

WHY YOU'LL LOVE WORKING AT CỐC CỐC

Few countries have local challengers in the search and browser space. Vietnam is one of those countries thanks to Cốc Cốc. There are a lot of challenges in competing against dominant global players, but also lots of rewards when we succeed.

Competitive benefits:

  • Competitive salary and bonus scheme with a 13th-month salary.
  • Performance reviews twice a year, with opportunities to grow or rotate internally.
  • Special annual leave policy with a minimum of 19 days/year, plus 1 day off on your birthday.
  • Annual WFH policy.  
  • Advanced 24/7 Health Insurance for all employees.
  • Great Trade Union benefits for occasions such as birthdays, marriage, and newborn children.

Professional growth:

  • Opportunities to learn and grow through regular training programs, coaching and internal sharing.
  • Work in a diverse environment with talented colleagues, partners, and customers, both local and expat.

Positive workplace: 

  • A variety of exciting internal events that make you part of the Cốc Cốc family.
  • Cozy pantry with plenty of snacks, juice and coffee/tea every day.
  • Many interesting hobby clubs where you can share your passions, such as English Club, Yoga, Billiards, or Football.