Senior Site Reliability Engineer

Deadline: 26 May 2024

Employment term: Permanent

Category: Software development

Job type: Full time

Location: Yerevan

Job description:

OneMarketData is continuously searching for bright talent with the skills to make an impact. From developers to data scientists, at OneTick you will have the opportunity to develop and enhance your problem-solving skills using a combination of analytics, imagination, and talent.

Overview

Our DevOps team develops the infrastructure behind the hosted solutions and our software and data delivery lifecycle.

Prior to advancing with your application, we kindly request that you review the CONSENT NOTICE FOR HR AND RECRUITING provided by OneMarketData. Your attention to this matter is greatly appreciated.

Our stack:

  • AWS (services we use include EKS, EC2, S3, SGW, ASG, ELB, Lambda, etc.);
  • Terraform and Ansible as an IaC approach;
  • GitLab and GitLab CI/CD;
  • Python as the main programming language for automation;
  • Kubernetes (mostly EKS, but GKE and other Kubernetes engines are also used) for orchestration and Helm for its management;
  • Prometheus/VictoriaMetrics, Grafana, Loki, AWS CloudWatch, and CloudTrail for monitoring, logging, and some statistics collection;
  • OneTick (our platform for market data).

We also use other tools for different purposes, e.g., Packer, HashiCorp Vault, OpenVPN, Slack, Confluence, and other popular, well-known tools :)


More information about the projects

In the Cloud Project, we have a multi-account AWS infrastructure managed through AWS Organizations. Separate AWS accounts are necessary to host customer-facing environments, and we provide our customers with different setups of our application. We use most of the common AWS resources, such as EC2, EKS, S3, VPC, and ELB, and the full stack of AWS resources we rely on is considerably broader. Most of our AWS infrastructure is covered by IaC. CI/CD runs on GitLab.

We have more than 4 petabytes of data in S3 and EFS. We expose part of the data in S3 to the file system using Storage Gateways. Currently, we are migrating from an EC2-based setup to Kubernetes, integrating centralized logging and monitoring solutions, migrating data loading processes to Airflow, and optimizing infrastructure costs while aiming to improve performance at the same time.

We are looking for an experienced Site Reliability Engineer (SRE) to join our team. Your primary responsibility will be to guarantee the reliability, scalability, and performance of our applications and systems. Working closely with both our software engineers and product teams, you will dive deep into troubleshooting production issues, ensuring seamless operation. Additionally, you will collaborate on designing and implementing solutions to enhance our monitoring and alerting systems, aiming to optimize our overall efficiency and reliability. Your expertise in automation will play a crucial role in reducing manual toil and streamlining processes, ultimately contributing to the success of our operations.

Job responsibilities

  • Monitor and maintain the health and reliability of our production systems
  • Investigate and resolve production issues and outages
  • Develop and maintain monitoring, alerting, and incident response systems
  • Design and implement automation to reduce manual toil and improve system reliability
  • Collaborate with software engineers to design and implement highly scalable and resilient systems
  • Participate in on-call rotation and respond to incidents promptly
  • Continuously improve our systems and processes to ensure the highest level of reliability and availability
  • Document processes and procedures for maintaining and troubleshooting production systems

Required qualifications

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field
  • 3+ years of experience as a Site Reliability Engineer or related role
  • Strong knowledge of Linux/Unix systems and administration
  • Proficiency in at least one programming language (e.g., Python, Java, C++)
  • Experience with automation and configuration management tools (e.g., Ansible, Terraform)
  • Experience with AWS and Kubernetes

General requirements:

  • English: Upper-Intermediate or higher.
  • Good communication skills and the ability to explain complicated things in simple words.
  • Eagerness to learn new technologies (including domain-specific ones).
  • Strong analytical and problem-solving skills.
  • Attentiveness, a hard-working and goal-oriented mindset (getting tasks done), and the ability to work both in a team and independently.
  • Readiness to explore the product in depth and gain a comprehensive understanding of its functionality, because it is closely connected to how things work.

Required candidate level: Senior

Additional information

Apply online through staff.am and track the entire progress of your application.

Professional skills

Python

DevOps

AWS

Soft skills

Problem solving

Analytical skills

Benefits package

Annual salary review
Foreign language courses
Medical insurance
Corporate events
Free tea, coffee, and refreshments
Flexible working hours
Family medical insurance
Free parking
Employee referral program

Contacts

Website: https://www.onetick.com/

Phone number: +37460460479

Address: Yeraz Business Center, bldg 2 (Adontsi 2), Yerevan, Republic of Armenia