NMF methods aim to factorize a non-negative observation matrix X as the product X = G F of two non-negative matrices G and F. Although these approaches have been studied with great interest in the scientific community, they often lack robustness to the data and to initial conditions, and they yield multiple solutions. To address these issues and to reduce the space of admissible solutions, we propose to inform NMF, thus placing our work between regression and classic blind factorization. In addition, we use cost functions known as αβ-divergences, so that the resulting NMF methods are robust to outliers in the data.
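For reference, the αβ-divergence family mentioned above is commonly written in the following form. The element-wise notation (p_{it} for the entries of X, q_{it} for the entries of the product G F) is an assumption made here for illustration, and only the generic case is shown.

```latex
\[
D_{AB}^{(\alpha,\beta)}(P \,\|\, Q)
  = -\frac{1}{\alpha\beta} \sum_{i,t}
    \left(
      p_{it}^{\alpha}\, q_{it}^{\beta}
      - \frac{\alpha}{\alpha+\beta}\, p_{it}^{\alpha+\beta}
      - \frac{\beta}{\alpha+\beta}\, q_{it}^{\alpha+\beta}
    \right),
\qquad \alpha,\ \beta,\ \alpha+\beta \neq 0 .
\]
```

As a sanity check, setting α = β = 1 reduces each term to (p_{it} - q_{it})^2 / 2, i.e., half the squared Euclidean distance.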
Three types of constraints are introduced on the matrix F: (i) exact knowledge of some components, (ii) bounded knowledge of some components, and (iii) a sum-to-one constraint on each row of F. We propose update rules that take all of these constraints into account by combining multiplicative updates with projection.
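The idea of combining multiplicative updates with projection can be sketched as follows. This is a minimal illustration under stated assumptions, not the update rules proposed in this work: it uses the plain Frobenius cost instead of an αβ-divergence, and the names `informed_nmf`, `project`, `known_mask`, `known_vals`, `lower` and `upper` are hypothetical. The final rescaling is also a simplification, since it may move a free entry slightly outside its bounds.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def project(F, known_mask, known_vals, lower, upper):
    """Project F onto the three constraint types (simplified)."""
    # (ii) bounded knowledge: clip every entry into [lower, upper]
    F = np.clip(F, lower, upper)
    # (i) exact knowledge: the known entries are separated from the free ones
    fixed = np.where(known_mask, known_vals, 0.0)
    free = np.where(known_mask, 0.0, F)
    # (iii) each row sums to 1: rescale only the free entries of each row
    budget = np.clip(1.0 - fixed.sum(axis=1, keepdims=True), 0.0, None)
    free = free * budget / (free.sum(axis=1, keepdims=True) + EPS)
    return fixed + free


def informed_nmf(X, n_sources, known_mask, known_vals,
                 lower=0.0, upper=1.0, n_iter=200, seed=0):
    """Frobenius-cost multiplicative updates, with F projected each iteration."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    G = rng.random((n, n_sources)) + EPS
    F = project(rng.random((n_sources, m)) + EPS,
                known_mask, known_vals, lower, upper)
    for _ in range(n_iter):
        # multiplicative updates keep G and F non-negative
        G *= (X @ F.T) / (G @ F @ F.T + EPS)
        F *= (G.T @ X) / (G.T @ G @ F + EPS)
        # projection re-imposes the constraints on F
        F = project(F, known_mask, known_vals, lower, upper)
    return G, F
```

Here `known_mask` is a boolean array with the shape of F that marks the entries whose values are known exactly, and `known_vals` holds those values (its other entries are ignored).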
Moreover, we propose to constrain the structure of the matrix G by means of a physical model, in order to identify the sources that actually contribute to the observed data.
The considered application, namely the source identification of particulate matter in the air around an industrial area on the northern coast of France, demonstrates the value of the proposed methods. Through a series of experiments on both synthetic and real data, we show how the different kinds of information contribute to making the factorization results more consistent in terms of physical interpretation and less dependent on the initialization.