Aim and Scope

Unsupervised model-based learning approaches aim at automatically acquiring “knowledge” from data, for representation, analysis, interpretation, etc., by learning a probabilistic model. They are well suited to the many applications in which supervision (e.g., expert information) is missing, hidden, or difficult to obtain. In particular, mixture model-based approaches are among the most popular and successful unsupervised learning approaches. They are widely used in cluster analysis for automatically finding clusters, in discriminant analysis when the classes are dispersed, in regression when the predictor is governed by a hidden process, and in generative topographic learning approaches.
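As a simple illustration of this model-based clustering setting, the following minimal sketch fits a Gaussian mixture by EM with scikit-learn; the synthetic data and the choice of three components are illustrative assumptions only, not part of the session description.

    # Minimal sketch: model-based clustering with a Gaussian mixture fitted by EM.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # Three toy Gaussian blobs in 2D (illustrative data).
    X = np.vstack([rng.normal(m, 0.5, size=(100, 2)) for m in (-3.0, 0.0, 3.0)])

    gm = GaussianMixture(n_components=3, random_state=0).fit(X)  # ML fit via EM
    labels = gm.predict(X)  # cluster assignments read off the fitted model
    print(gm.bic(X))        # BIC, usable for selecting the number of components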
In these unsupervised model-based learning approaches, the model parameters are very often learned in a maximum likelihood (ML) estimation framework using the well-known EM algorithm. Selecting the best model can then be performed using information criteria, generally formulated as a penalized maximum likelihood, such as the BIC. A Bayesian regularization is also possible, either to overcome problems that can arise in the ML approach, such as singularities, or to include prior knowledge. The Bayesian formulation leads to a maximum a posteriori (MAP) estimation of the model parameters, which can also be performed using the EM algorithm. Model selection can still be carried out by relying on slightly modified information criteria involving the maximum a posteriori rather than the maximum likelihood function.
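For concreteness, the standard formulations of these criteria read as follows (the notation \theta for the parameters, X for the data, \nu for the number of free parameters, and n for the sample size is ours):

    \hat{\theta}_{ML}  = \arg\max_\theta \, \log p(X \mid \theta)
    \hat{\theta}_{MAP} = \arg\max_\theta \, \log p(X \mid \theta) + \log p(\theta)
    BIC = \log p(X \mid \hat{\theta}_{ML}) - (\nu / 2) \log n

The MAP-based variants of such criteria simply replace the maximized log-likelihood with the maximized log-posterior.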
Furthermore, unsupervised learning approaches can also be used to represent a dataset before running a supervised learning task; the aim is then to learn dictionaries for sparse data representations. Standard sparse coding algorithms (e.g., pursuit algorithms, the LASSO, and related sparse decompositions), which generally optimize a penalized deterministic cost function, can be shown to be a specific constrained case of a general probabilistic Bayesian model optimizing a MAP criterion.
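As an illustration of this correspondence, the LASSO-type sparse coding objective coincides with a MAP criterion under a Gaussian noise model and an i.i.d. Laplace prior on the code (a standard result; the notation y for the signal, D for the dictionary, and x for the code is ours):

    \min_x \, \tfrac{1}{2} \| y - D x \|_2^2 + \lambda \| x \|_1
    \iff
    \hat{x}_{MAP} = \arg\max_x \, \log p(y \mid x) + \log p(x),
    with  y \mid x \sim N(D x, \sigma^2 I)  and  p(x) \propto \exp( -(\lambda / \sigma^2) \| x \|_1 ).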
The aims of this session are as follows. First, we aim at examining how probabilistic (non-Bayesian) model-based approaches can best be regularized from a Bayesian probabilistic perspective, namely through the choice of prior hyperparameters adapted to the task at hand, such as cluster analysis, hidden process regression analysis, or functional data analysis. The second objective is to show how Bayesian approaches provide a more general framework for sparse representations than standard deterministic sparse coding algorithms. Finally, as the two Bayesian approaches stated above are both parametric, in the sense that they rely on a model of fixed complexity, the final objective is to see how this can be extended by further exploring non-parametric Bayesian approaches, in particular the infinite mixture model and its use in both model-based clustering and Bayesian sparse representation.
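As a pointer to this non-parametric direction, the sketch below uses scikit-learn's BayesianGaussianMixture with a (truncated) Dirichlet process prior as a practical stand-in for the infinite mixture model; the data and the truncation level of 10 components are arbitrary illustrative assumptions.

    # Minimal sketch: truncated Dirichlet process Gaussian mixture, a practical
    # approximation to the infinite mixture model.
    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(m, 0.5, size=(100, 2)) for m in (-3.0, 0.0, 3.0)])

    dpgm = BayesianGaussianMixture(
        n_components=10,  # truncation level, not a claim about the true cluster count
        weight_concentration_prior_type="dirichlet_process",
        random_state=0,
    ).fit(X)
    # Components with negligible weight are effectively pruned, so the number
    # of active clusters is inferred from the data rather than fixed a priori.
    print(dpgm.weights_.round(3))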

The topics of the special session include, but are not limited to:

  • Unsupervised Generative Learning
  • Model-based cluster and discriminant analysis
  • Latent data models
  • (Online) EM algorithms
  • Functional Data Analysis
  • Hidden process regression
  • Bayesian regularization
  • Bayesian sparse representation
  • Non-parametric Bayesian models
  • Sparse coding
  • Generative Topographic Learning
  • Temporal Generative Topographic Mapping
  • Applications in signal, speech, and image processing; handwritten text recognition; human activity recognition; diagnosis of complex systems; content-based information retrieval; gene expression data; etc.