A*STAR

Research Officer (Large Language Models & Data Engineering), A*STAR BII

Research | Singapore | Full-time | 3 weeks ago

About the role

The Research Officer will join a project team at A*STAR BII that is developing an LLM-based platform to reduce unnecessary antibiotic prescriptions for lower respiratory tract infections. The role involves integrating and fine-tuning large language models on unstructured clinical text, building data pipelines, and collaborating with researchers.

Research | Full-time | Bioinformatics Institute

Key Responsibilities

  • Contribute to LLM development, focusing on enhancing unstructured data processing for clinical and biomedical applications.
  • Fine-tune and train LLMs (e.g., LLaMA, Mistral, Phi, and the GPT family) using supervised and instruction-based datasets.
  • Design and implement pipelines for data cleaning, preprocessing, and tokenisation of large-scale text corpora.
  • Integrate retrieval-augmented generation (RAG) and knowledge graph components for domain adaptation.
  • Evaluate model performance using BLEU, ROUGE, BERTScore, and factual consistency metrics.
  • Develop optimised parameter-efficient fine-tuning (PEFT) frameworks, such as LoRA and QLoRA, for efficient training on GPU clusters.
  • Collaborate with researchers to design experiments, interpret results, and publish findings.
  • Maintain reproducible codebases, documentation, and experiment logs.

Requirements

  • Degree in Computer Science, Data Science, Artificial Intelligence, or a related field.
  • Experience with large language models and fine-tuning techniques.
  • Proficiency in Python and machine learning frameworks (e.g., PyTorch, TensorFlow).
  • Knowledge of natural language processing and text preprocessing.
  • Familiarity with retrieval-augmented generation and knowledge graphs.
  • Experience with GPU clusters and distributed training.
  • Strong analytical and problem-solving skills.
  • Ability to work collaboratively in a research team.
  • Excellent written and verbal communication skills.