Description:
Responsibilities:
- Lead and develop a team of talented data/software engineers to design, plan, develop, and deploy improvements to back-end platform services related to data ingestion, data processing, and analytics.
- Establish a team culture of working responsibly with big data and sensitive data.
- Design the architecture and then lead the implementation of scalable data processing systems.
- Plan the development of a data platform as a SaaS product.
- Collaborate broadly across the organization and with senior leadership to drive team and individual performance focused on clear outcomes and team OKRs.
- Evaluate resource costs, determine the composition of the required team, define the top-level roadmap, and perform project risk assessments.
- Foster the adoption of best engineering practices across all aspects of software development in order to build, deploy, test, and release large-scale services with quality and agility, while maintaining our current platform to continue meeting customer commitments.
- Help shape the overall technology strategy and quarterly and yearly goals, drive engineering best practices, and take ownership of delivering on core outcomes.
Required Skills & Experience:
- 7+ years of extensive experience with data technologies across streaming and batch-oriented realms, spanning data acquisition, storage, processing, and consumption patterns in both operational and analytical domains, plus expertise in cloud data services (AWS / Azure / GCP).
- 5+ years leading highly technical, high-performance engineering teams, including people management (hiring and layoffs) and performance management (coaching and mentoring). Proven track record of architecting, designing, and delivering complex Big Data and Cloud Data solutions (AWS, Azure, GCP) across multiple projects to solve problems at scale, especially on distributed data platforms (Hadoop/Kafka).
- Expert in distributed data processing frameworks such as Spark, Storm, and Flink, and columnar formats such as Parquet, across batch and streaming realms; strong programming skills, preferably in Scala with Python as a secondary language; expert in distributed messaging/streaming frameworks such as Kafka, Pulsar, Google Pub/Sub, Azure Event Hubs, and AWS Kinesis.
- Experience with NoSQL databases (Cassandra/HBase/MongoDB/Elasticsearch/Neo4j) and scalable analytical data stores like Snowflake, BigQuery, Redshift, and Teradata.
- Professional experience with workflow management (Nextflow, Snakemake, Airflow, etc.).
- Deep knowledge of scalable data models, queries, and operations that address various consumption patterns, including random-access and sequential-access, and necessary optimisations like bucketing, aggregating, and sharding.
- Experience in performance tuning, optimisation, and scaling of solutions from a storage/processing standpoint.
- Experience setting up data engineering practices across architecture, design, coding, quality assurance, and deployment, using industry-standard DevOps practices for CI/CD and leveraging tools like Jenkins/Bamboo, Maven, JUnit, SonarQube, Terraform (one-click infrastructure setup), Kubernetes, and containerisation.
- Solid understanding of Data Governance, Data Security, Data Cataloguing, and Data Lineage concepts (experience with tools like Collibra in these areas is preferred).
- Passion for recruiting, developing, mentoring, and retaining a world-class engineering team.
- Lean-thinking mindset, comfortable with Agile planning and estimation rituals, flexible, and able to thrive in a fast-paced, innovative young company.
- Excellent written and verbal English communication skills, with the ability to adapt the level of detail to different audiences and to explain technical concepts concisely to business stakeholders.