ML - SOMMa

ML adminacc Fri, 11/12/2021 – 10:38

In the machine learning research line, we deal with data problems arising in different domains: industry, biosciences, health, economics, etc. We pursue the development of new machine learning algorithms that can tackle these problems efficiently. In particular, we consider problems involving a variety of data types, from time series to streaming data, images, and speech, and a wide range of modeling techniques and mathematical formalisms, such as probabilistic graphical models, Bayesian approaches, and deep learning.

Our goal is to develop novel and efficient machine learning algorithms able to deal with new data-related practical problems. We also pursue the mathematical modeling of these algorithms in order to provide theoretical guarantees of their performance.


The research carried out in the machine learning line is inspired by problems that appear in other scientific, technological, or economic disciplines. To solve these kinds of problems, we develop new machine learning methods and algorithms for the main data analysis tasks, such as clustering, supervised classification, and feature subset selection. Based on the specific characteristics of the problem at hand, we design tailored but general algorithms that extract as much information as possible from the available data, yielding efficient machine learning models that solve the problem.

We also develop mathematical tools to model the behavior and performance of these algorithms: their convergence, the estimation of their performance, and their computational time and memory requirements.

In recent years, the machine learning line has worked on a range of machine learning problems and algorithms. In particular, we highlight the work done on time series mining and data streams, the adaptation of classical clustering algorithms such as k-means and k-medoids to massive data environments, the probabilistic modeling of permutations and ranked data, developments in anomaly detection, and the analysis of crowd learning environments.

In terms of formalisms, we rely strongly on probabilistic modeling, using tools and techniques such as probabilistic graphical models and Gaussian processes, to name a few, which in most cases are learned from a Bayesian perspective. We also use deep learning when we consider it the most appropriate technique for the problem at hand.
