Research Engineer in Data Management and Processing (RE3) – AI4S
Reference: 537_24_DIR_CSS_RE3
Job title: Research Engineer in Data Management and Processing (RE3) - AI4S
About BSC
The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, was a founding and hosting member of the former European HPC infrastructure PRACE (Partnership for Advanced Computing in Europe), and is now a hosting entity for EuroHPC JU, the Joint Undertaking that leads large-scale investments and HPC provision in Europe. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D in both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 1000 staff from 60 countries.
For this role, we are particularly interested in the strengths and lived experiences of women and underrepresented groups, to help us avoid perpetuating biases and oversights in science and IT research. In instances of equal merit, the incorporation of the under-represented sex will be favoured.
We promote Equity, Diversity and Inclusion, fostering an environment where each and every one of us is appreciated for who we are, regardless of our differences.
If you feel that you do not meet all the requirements, we still encourage you to apply for this job offer. We value diversity of experiences and skills, and you could bring unique perspectives to our team.
Context And Mission
The Computational Social Science and Humanities Program envisions preparing the social sciences and humanities for the era of data, artificial intelligence (AI), and Exascale supercomputing. Our mission is to foster collaboration between social scientists and computer scientists, making high-performance computing (HPC) accessible to all researchers in the field. Through innovative approaches, we aim to apply social science research to contribute valuable insights for informed policymaking.
In our pursuit, we focus on a wide range of key societal research areas such as Population and Household changes, Democratic Quality, Social Media, Public Opinion and Political Communication, Equity and Welfare in Education and Labor Market, Legal Systems and Legislation, Social-Ecology, Science of Science, and History, Archeology and Cultural Heritage. We apply a mix of advanced statistical models, AI/Machine Learning, Large Language Models (LLMs), Agent-Based Modeling (ABM), Social Network Analysis (SNA), and high-performance computational methods to a wide range of large datasets from official statistics, surveys, social media, news, laws, historical archives, current data archives, archaeological data, citizens' volunteered and web-scraped data, and public administration and industry data.
As a research engineer and data scientist, you will play a pivotal role in facilitating the researchers’ work, working closely with the research units at CSSH, particularly the research unit on Science of Science, and reporting to both the head of the Data Scientists group and the projects’ principal investigators (PIs). Science of Science research studies the dynamics of knowledge production on a global scale, examining issues such as the geographical distribution of knowledge production and circulation, the prevalence of various topics, the connectivity (or lack thereof) of scientific communities, the use of artificial intelligence techniques across different fields of study, the development of concepts, and the potential impact of science on policy, among other topics.
The funding for these actions/fellowships and contracts comes from the European Union Recovery and Resilience Facility - Next Generation, within the framework of the General Invitation by the public business entity Red.es to participate in the talent attraction and retention programs within Investment 4 of Component 19 of the Recovery, Transformation, and Resilience Plan.
For more information, please check: https://www.bsc.es/join-us/excellence-career-opportunities/ai4s.
Key Duties
Developing workflows for collecting, processing, and curating bibliometric data and textual data related to scientific production and science policies from sources such as the OpenAlex collection and Semantic Scholar.
Applying Artificial Intelligence, Machine Learning, Network Analysis, and text classification models to pre-processed corpora, including but not limited to bibliometric data.
Developing workflows to ensure the replicability of the group's analyses and data sharing under Findable, Accessible, Interoperable, and Reusable (FAIR) conditions.
Developing research tools to help advance research on computational social sciences and humanities in collaboration with the Computer Science and Operations departments of the BSC-CNS.
Providing support in adapting researchers' code to High Performance Computing environments.
Requirements
Education
Engineering Degree or equivalent
MSc in Statistics, Computer Science or similar is highly desirable
Essential Knowledge and Professional Experience
4 to 10 years of experience curating, managing and/or analyzing large datasets for research purposes. Experience with textual data will be highly desirable.
Extensive hands-on experience with the R statistical environment and/or Python, particularly for data curation and management. Ability to work in both programming languages is desirable.
Experience implementing Machine Learning models and Natural Language Processing technologies.
Experience applying computational and statistical methods to large-scale datasets.
Additional Knowledge and Professional Experience
Fluency in English is essential. Proficiency in Spanish and other European languages would be advantageous.
Experience working with data for social science purposes.
Experience applying, fine-tuning and comparing the performance of Large Language Models in social science applications will be highly desirable.
Knowledge of parallel computing is an asset.
Competences
Interest in and ability for independent learning, and for adapting and developing computational methods to answer research questions in the social sciences.
Good communication skills for effective collaboration with researchers and other research engineers and data scientists.
Ability to work in a team and in a multi-cultural environment.
Interest in and curiosity about artificial intelligence, machine learning, network analysis, and natural language processing applications in the social sciences.
Conditions
The position will be located at BSC within the Computational Social Science Program.
We offer a full-time contract (37.5h/week), a good and highly stimulating working environment with state-of-the-art infrastructure, flexible working hours, an extensive training plan, restaurant tickets, and private health insurance.
Duration: 4 years
Holidays: 23 paid vacation days plus the 24th and 31st of December, per our collective agreement
Salary: 50.000,00€
Additional Expenses Grant: Each fellowship will be associated with a grant for additional expenses, such as IT equipment, travel, training, stays, etc.
Starting date: as soon as possible; the incorporation for this vacancy must take place before the 16th of December 2024
Applications procedure and process
All applications must be submitted via the BSC website and contain:
A full CV in English, including contact details.
A cover/motivation letter with a statement of interest in English, clearly specifying the area and topics for which the applicant wishes to be considered. Additionally, contact details for two references must be included. Applications without this document will not be considered.
Development of the recruitment process
The selection will be carried out through a competitive examination system ("Concurso-Oposición"). The recruitment process consists of two phases:
Curriculum Analysis: Evaluation of previous experience and/or scientific history, degree, training, and other professional information relevant to the position. - 40 points
Interview phase: The highest-rated candidates at the curriculum level will be invited to the interview phase, conducted by the corresponding department and Human Resources. In this phase, technical competencies, knowledge, skills, and professional experience related to the position, as well as the required personal competencies, will be evaluated. - 60 points. A minimum of 30 points out of 60 must be obtained to be eligible for the position.
The recruitment panel will be composed of at least three people, ensuring at least 25% representation of women.
In accordance with OTM-R principles, a gender-balanced recruitment panel is formed for each vacancy at the beginning of the process. After reviewing the content of the applications, the panel will begin the interviews, with at least one technical and one administrative interview. At a minimum, a personality questionnaire as well as a technical exercise will be conducted during the process.
The panel will make a final decision, and all individuals who participated in the interview phase will receive feedback with details on the acceptance or rejection of their profile.
At BSC, we seek continuous improvement in our recruitment processes. For any suggestions or comments/complaints about our recruitment processes, please contact recruitment@bsc.es.