Data Scientist for Big-Data Analytics Platforms, Genomics and Healthcare (RE2-R2)

Context And Mission

The Data-Centric Computing group from the Computer Science department at the Barcelona Supercomputing Center is searching for a Data Scientist to work on High-Performance Data Analytics (HPDA) platforms and Medical Synthetic Data Generation, mainly on time series analytics and generative models.

The main objective is to participate in two European research projects. The first focuses on developing methods to estimate and predict the resources required by HPDA applications, in order to better understand their behaviour and provide information to an orchestrator that manages them. The applications run on Big Data and serverless frameworks such as Apache Spark or Lithops. The methods should use time series information to estimate resource requirements and to predict how they evolve over time. This work is carried out in collaboration with genomics researchers, as the applications focus on genomics analytics. The second project focuses on developing models to generate Medical Synthetic Data (Generative AI): data that resembles the original data provided but cannot be traced back to it. In particular, we focus on models that generate synthetic images, but the position might not be limited to that. This work is carried out in collaboration with hospitals from the European Union.

Both projects involve collaborating with BSC and external researchers and working on real use cases. This includes communicating research results in project meetings with the rest of the partners, both online and in person (which might require travelling outside Spain twice a year).

Key Duties

Development of methods to estimate and predict application resource usage
Development and integration of Synthetic Data Generation models for healthcare
Usage and metric extraction of Cloud/Serverless HPDA/ML platforms and applications (e.g., Apache Spark, Kubernetes, ...)
Contribution to the development of predictive models for HPDA/ML platforms for orchestration of applications
Participation in internal and external meetings
Mutual support for other team members in the projects


Master in Computer Science, Data Science, Artificial Intelligence or similar
A PhD in these fields is valued
Essential Knowledge and Professional Experience
Software Engineering skills
Basic knowledge of infrastructure (cloud/edge computing, serverless computing)
Knowledge of Python
Skills in Deep Learning frameworks such as PyTorch
Additional Knowledge and Professional Experience
Knowledge of time series modeling
Knowledge of Generative AI, especially image generation
Basic system administration skills
Basic skills in using distributed environments and databases


The position will be located at BSC within the Computer Sciences Department
We offer a full-time contract (37.5h/week), a good and highly stimulating working environment with state-of-the-art infrastructure, flexible working hours, an extensive training plan, restaurant tickets, private health insurance, and support with relocation procedures
Duration: Open-ended contract for technical and scientific activities, linked to the project and budget duration
Holidays: 23 paid vacation days plus the 24th and 31st of December, per our collective agreement
Salary: we offer a competitive salary commensurate with the qualifications and experience of the candidate and according to the cost of living in Barcelona
Starting date: 01/09/2024