Thales

Data Engineer

Aerospace & Defense | Singapore | Onsite | Posted 4 weeks ago

About the role

Thales is seeking a Data Engineer to design, build, and optimize data pipelines and processing frameworks powering next-generation Data Warehouse and Data Lakehouse solutions. This role involves tackling complex data challenges including streaming & batch processing, data quality assurance, and real-time data integration within a collaborative, high-tech environment.

Key Responsibilities

  • Architect, implement, and maintain robust, scalable, and efficient ETL/ELT pipelines to collect, ingest, transform, and load data from diverse sources.
  • Integrate data from various structured and unstructured sources, including APIs, streaming platforms, databases, and external feeds.
  • Develop and optimize data models (relational and non-relational) to support advanced analytics and operational needs.
  • Implement data validation, sanitization, and monitoring processes to ensure high data integrity, accuracy, and consistency.
  • Develop data generation and tracing capabilities through data models, audit change models, and dashboard visualizations.
  • Participate in the design and evolution of Data Warehouse architecture.
  • Automate recurring data engineering tasks to enhance overall efficiency and reliability.

Requirements

  • Bachelor's degree in Computer Science or Information Technology (a Master's degree in CS/Data Science is also applicable).
  • Proficiency in implementing ETL/ELT pipelines using Apache Kafka, Apache Spark 3.0, and/or Apache Flink 2.0.
  • Proficiency in programming languages such as Java 8+ or Kotlin 2.x.
  • Proficiency in deploying ETL/ELT pipelines to Kubernetes clusters within Azure cloud environments.
  • Proficiency in using monitoring tools including Grafana, Prometheus, Elasticsearch, and Kibana.
  • Proficiency in implementing ETL/ELT pipelines that interact with object stores (e.g., MinIO) and relational databases (e.g., PostgreSQL).
  • Proficiency with distributed source code management using Git-based platforms (e.g., GitLab, Gitea).
  • Proficiency with the Linux command line and an understanding of the Linux filesystem and processes.