Our new paper, Quantile–Quantile Embedding for Distribution Transformation and Manifold Embedding (Ghojogh et al., 2021), has been accepted for publication in the Elsevier journal Machine Learning with Applications (MLWA). It presents a novel method for shaping distributions that puts a new twist on classic quantile-quantile plotting.
This paper comes out of the Manifold Learning research topic in our lab, led by postdoc Benyamin Ghojogh, and was one of the components of his recently completed PhD thesis (Ghojogh, 2021) here in the lab.
References:
Ghojogh et al. (2021). Quantile–Quantile Embedding for distribution transformation and manifold embedding with ability to choose the embedding distribution. Machine Learning with Applications (MLWA), 6.
We propose a new embedding method, named Quantile-Quantile Embedding (QQE), for distribution transformation and manifold embedding with the ability to choose the embedding distribution. QQE, which uses the concept of quantile-quantile plot from visual statistical tests, can transform the distribution of data to any theoretical desired distribution or empirical reference sample. Moreover, QQE gives the user a choice of embedding distribution in embedding the manifold of data into the low dimensional embedding space. It can also be used for modifying the embedding distribution of other dimensionality reduction methods, such as PCA, t-SNE, and deep metric learning, for better representation or visualization of data. We propose QQE in both unsupervised and supervised forms. QQE can also transform a distribution to either an exact reference distribution or its shape. We show that QQE allows for better discrimination of classes in some cases. Our experiments on different synthetic and image datasets show the effectiveness of the proposed embedding method.
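To give a feel for the quantile-quantile idea the abstract refers to, here is a minimal sketch of classic one-dimensional quantile matching. It is not the QQE algorithm from the paper, which handles multivariate data and manifold embedding; it only illustrates how matching empirical quantiles to those of a reference distribution reshapes a sample. The function name `quantile_match` and the toy data are hypothetical, chosen for illustration.

```python
from statistics import NormalDist


def quantile_match(sample, reference_inv_cdf):
    """Map each value to the reference quantile at its empirical rank.

    Illustrative univariate quantile matching only; this is an assumed
    simplification, not the QQE method described in the paper.
    """
    n = len(sample)
    # Indices that would sort the sample in ascending order.
    order = sorted(range(n), key=lambda i: sample[i])
    transformed = [0.0] * n
    for rank, idx in enumerate(order):
        # Plotting position (rank + 0.5) / n stays strictly inside (0, 1),
        # so the inverse CDF is always defined.
        p = (rank + 0.5) / n
        transformed[idx] = reference_inv_cdf(p)
    return transformed


# Toy right-skewed sample (hypothetical data).
skewed = [0.1, 0.2, 0.25, 0.4, 0.9, 3.0, 7.5]

# Reshape it toward a standard normal reference distribution.
gaussianized = quantile_match(skewed, NormalDist(mu=0.0, sigma=1.0).inv_cdf)
```

The transformed values keep the original ordering of the points but follow the quantiles of the chosen reference, so plotting them against the reference quantiles would give a straight line on a quantile-quantile plot.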
Ghojogh, B. (2021). Data Reduction Algorithms in Machine Learning and Data Science. PhD thesis.
Raw data are usually required to be pre-processed for better representation or discrimination of classes. This pre-processing can be done by data reduction, i.e., either reduction in dimensionality or numerosity (cardinality). Dimensionality reduction can be used for feature extraction or data visualization. Numerosity reduction is useful for ranking data points or finding the most and least important data points. This thesis proposes several algorithms for data reduction, known as dimensionality and numerosity reduction, in machine learning and data science. Dimensionality reduction tackles feature extraction and feature selection methods while numerosity reduction includes prototype selection and prototype generation approaches. This thesis focuses on feature extraction and prototype selection for data reduction. Dimensionality reduction methods can be divided into three categories, i.e., spectral, probabilistic, and neural network-based methods.