Site Reliability Engineer (SRE)

Your responsibilities will include:
• System design, configuration, integration, deployment, and operations of Observability systems and tools. These systems include collection of metrics/logs/events from gaming services, applications (client, middleware, backend) and infrastructure (GCloud, on-premise).
• Design and deploy our Observability infrastructure and systems, taking them to the next level of availability and scale
• Develop metrics and log ingestion pipelines for high volumes of telemetry
• Create build and deployment pipelines for monitoring tools
• Deploy monitoring solutions
• Develop a set of alerts and metrics to keep your own services alive and performing well
• Collaborate with other SRE team members to improve the efficiency and reliability of monitoring solutions
• Collaborate with our Application Development teams to define the standards/APIs that ensure our Applications are emitting the right telemetry (metrics, logs, traces, events)
• Collect, aggregate, and visualize metrics to provide visibility into, and standards for, the key indicators of the health of our most critical systems
• Evaluate, choose, and implement the next generation of Observability tools

Who are we looking for:
• As a Senior SRE Observability Engineer, you have extensive working experience building, integrating, and administering systems that leverage open-source monitoring tools at scale (e.g., Prometheus, VictoriaMetrics), the Elastic Stack (Elasticsearch, Logstash, Kibana, Beats), and Grafana. We work with Atlassian products (Jira, etc.), so it'll be good if you have used them too.
• We try to follow best practices for IT operations in an always-up, always-available service, but you are welcome to suggest improvements. Our environment is Agile, so it'll be good if you have worked in such teams.
• You are a quick learner who can absorb a lot of information about our in-house framework and systems fast. In this position you will have to show good soft skills. You can work under pressure whilst maintaining accuracy and attention to detail. As a team we are results-oriented and rely on good communication to achieve success.

You have experience in the following technologies:
• 2+ years of experience with Open-Source Monitoring & Observability tooling/integration
• Time Series Databases (TSDB) - InfluxDB, Prometheus, VictoriaMetrics
• Elastic Stack (Elasticsearch, Logstash, Kibana, Beats)
• Grafana
• Full proficiency with Linux command line environment
• Scripting experience in Bash; Node.js is a big plus
• Monitoring protocols/frameworks – Prometheus exposition format, InfluxDB line protocol, SNMP
• Building software using Jenkins, TeamCity, GitLab CI
• Git and version control software
• pfSense
• Virtualization tools (Proxmox, VMware)
• Database experience is a big plus (MariaDB, MySQL)
• Cloud services (Google Cloud, AWS, Azure, etc.)
• Containerisation experience (Docker, Kubernetes)
• Middleware (Kafka)

Our Site Reliability Engineering and DevOps team ensures 24/7 coverage of our systems, and as part of the team you'll have to take part in the on-call shift schedule.


Lacking any of these skills?
We make games
Company trips to Las Vegas, London and more every year
Environment tailored to allow you to realize your full potential
An awesome mix of the benefits of a large company and the advantages of a creative startup culture
Tailor-made career programme and many opportunities to grow and prove yourself
90% of people will receive personal, company-funded training in 2021
Additional medical insurance
Modern office in the city center with an amazing view
Optional indoor parking space
Sounds good?
Apply Below!