Fréchet Regression

As data types become more complex, attention has turned to regression in more abstract settings.^[1]^[2] An increasingly common setting is that of a response variable taking values in a metric space. Examples of such data include covariance matrices, graphs and probability distributions. The presence of a metric provides a natural connection to the work of Fréchet,^[3] who defined the Fréchet mean of a random element of a metric space as a direct generalization of the standard mean. Since regression analysis can be viewed as the modeling of conditional expectations, the key feature of Fréchet regression is the generalization of the classical Fréchet mean to the conditional Fréchet mean. Because regression is valued for its versatility and importance, Fréchet regression extends classical regression to responses that are complex random objects.

Definition

Let <math>(\Omega,d)</math> be a metric space. Consider a random pair <math>(X,Y)\sim F</math>, where <math>X</math> and <math>Y</math> take values in <math>\mathcal{R}^{p}</math> and <math>\Omega</math>, respectively, and <math>F</math> is the joint distribution of <math>(X,Y)</math> on <math>\mathcal{R}^{p}\times \Omega</math>. Let <math>F_{X}</math> and <math>F_{Y}</math> be the marginal distributions of <math>X</math> and <math>Y</math>. It is assumed that <math>\mu=\mathrm{E}(X)</math> and <math>\Sigma=\text{Var}(X)</math> exist, with <math>\Sigma</math> positive definite, and that the conditional distributions <math>F_{X|Y}</math> and <math>F_{Y|X}</math> exist.

Fréchet mean, Fréchet variance

The notions of mean and variance were generalized to random objects of a metric space by Fréchet.^[3]

The Fréchet mean is defined as <math>w_{\oplus} = \text{argmin}_{w\in\Omega}E(d^{2}(Y,w)),</math> and the Fréchet variance is <math> V_{\oplus} = E(d^{2}(Y,w_{\oplus}))</math>.
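As an illustration of these definitions, the following minimal sketch computes an empirical Fréchet mean and Fréchet variance for angles on the unit circle with the arc-length metric. The expectation is replaced by a sample average and the minimization by a search over a finite grid of candidates; both simplifications, and the helper names `frechet_mean_and_variance` and `arc_length`, are assumptions made for this example only.

```python
import numpy as np

def frechet_mean_and_variance(sample, candidates, dist):
    """Empirical Frechet mean over a finite candidate set, plus Frechet variance."""
    # objective[j] = average squared distance from candidates[j] to the sample
    objective = np.array(
        [np.mean([dist(y, w) ** 2 for y in sample]) for w in candidates]
    )
    best = int(np.argmin(objective))
    return candidates[best], objective[best]

def arc_length(a, b):
    """Geodesic distance between two angles (in radians) on the unit circle."""
    diff = np.abs(a - b) % (2 * np.pi)
    return min(diff, 2 * np.pi - diff)

# Angles clustered around 0, on both sides of the 0 / 2*pi cut.
angles = np.array([0.1, 0.3, 2 * np.pi - 0.2, 2 * np.pi - 0.4])
grid = np.linspace(0.0, 2 * np.pi, 3601)[:-1]  # candidate means, step 2*pi/3600

mean_angle, variance = frechet_mean_and_variance(angles, grid, arc_length)
print(mean_angle, variance)
```

The arithmetic mean of the raw angles is close to <math>\pi</math>, on the opposite side of the circle from the data, whereas the Fréchet mean respects the metric and lies just below <math>2\pi</math>.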

Global Fréchet regression

The global Fréchet regression <math>m_{\oplus}(x)</math> on an arbitrary metric space <math>(\Omega,d)</math> is defined as

<math>m_{\oplus}(x) := \text{argmin}_{w \in \Omega}M(w,x),</math> <math> M(\cdot,x) = E[s(X,x)d^{2}(Y,\cdot)],</math> (1)

where the weight function <math> s</math> is

<math> s(z,x) = 1+(z-\mu)^{T}\Sigma^{-1}(x-\mu).</math> (2)
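To make the role of the weight function concrete, the sketch below computes an empirical version of (2), in which <math>\mu</math> and <math>\Sigma</math> are replaced by the sample mean and the sample covariance (with divisor <math>n</math>); this plug-in choice and the function name `global_weights` are assumptions of the example. For a real-valued response with <math>d(a,b)=|a-b|</math>, the minimizer of the weighted criterion is the weighted average of the responses, and the sketch checks numerically that it coincides with the ordinary least squares prediction.

```python
import numpy as np

def global_weights(X, x):
    """Empirical weights 1 + (X_i - Xbar)^T Sigma_hat^{-1} (x - Xbar)."""
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    Xbar = X.mean(axis=0)
    centered = X - Xbar
    Sigma_hat = centered.T @ centered / X.shape[0]  # covariance with divisor n
    return 1.0 + centered @ np.linalg.solve(Sigma_hat, x - Xbar)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y = 1.0 + X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=200)
x0 = np.array([0.5, -0.3])

w = global_weights(X, x0)
# For scalar Y, argmin_y mean(w * (Y - y)^2) = mean(w * Y) because mean(w) = 1.
frechet_fit = np.mean(w * Y)

# Ordinary least squares prediction at x0, for comparison.
design = np.column_stack([np.ones(len(X)), X])
beta = np.linalg.lstsq(design, Y, rcond=None)[0]
ols_fit = np.array([1.0, *x0]) @ beta

print(frechet_fit, ols_fit)
```

The weights average to one but can be negative for individual observations, which is why the global Fréchet fit can extrapolate in the same way a linear regression does.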

Local Fréchet regression

Consider the case of a scalar predictor <math> X \in \mathcal{R}^{p},</math> where <math> p=1</math> for ease of presentation. The method can be generalized to any <math> p>1</math>. Let <math> K</math> be a smoothing kernel and <math> h</math> a bandwidth, with the scaled kernel <math> K_{h}(\cdot)=h^{-1}K(\cdot/h).</math> Let <math> \mu_{j} = E[K_{h}(X-x)(X-x)^{j}], r_{j} = E[K_{h}(X-x)(X-x)^{j}Y]</math> and <math> \sigma_{0}^{2}=\mu_{0}\mu_{2}-\mu_{1}^{2}.</math>

The local Fréchet regression <math>l_{\oplus}(x)</math> is defined as

<math>l_{\oplus}(x) := \text{argmin}_{w \in \Omega}L_{n}(w,x),</math> <math> L_{n}(w,x) = E[s(X,x,h)d^{2}(Y,w)],</math> (3)

where the weight function <math> s</math> is

<math> s(z,x,h) = \sigma_{0}^{-2}\{K_{h}(z-x)[\mu_{2}-\mu_{1}(z-x)]\}</math>. (4)
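The local weights in (4) can be sketched in the same way. The example below replaces the moments <math>\mu_{j}</math> by sample averages and uses a Gaussian kernel; both choices, as well as the names `gaussian_kernel` and `local_weights`, are assumptions made for illustration. For a real-valued response, the weighted average of the responses with these weights equals the intercept of a kernel-weighted local linear fit, which the sketch verifies numerically.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel K."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def local_weights(X, x, h, kernel=gaussian_kernel):
    """Empirical version of (4) for a scalar predictor."""
    X = np.asarray(X, dtype=float)
    diff = X - x
    Kh = kernel(diff / h) / h  # scaled kernel K_h(X_i - x)
    mu0, mu1, mu2 = (np.mean(Kh * diff ** j) for j in range(3))
    sigma0_sq = mu0 * mu2 - mu1 ** 2
    return Kh * (mu2 - mu1 * diff) / sigma0_sq

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=300)
Y = np.sin(2 * np.pi * X) + 0.1 * rng.normal(size=300)
x0, h = 0.4, 0.1

w = local_weights(X, x0, h)
local_frechet_fit = np.mean(w * Y)  # scalar-response minimizer, since mean(w) = 1

# Local linear regression at x0 via kernel-weighted least squares, for comparison.
root_k = np.sqrt(gaussian_kernel((X - x0) / h))
design = np.column_stack([np.ones_like(X), X - x0])
coef = np.linalg.lstsq(design * root_k[:, None], Y * root_k, rcond=None)[0]
local_linear_fit = coef[0]

print(local_frechet_fit, local_linear_fit)
```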

Examples

Regression for probability distributions with the Wasserstein metric

Let the space <math> \Omega</math> be the set of distribution functions equipped with the Wasserstein metric. Consider a sample <math> (X_{i},Y_{i}),i=1,\ldots,n,</math> with responses <math> Y_{i} \in \Omega</math>. Set <math> \hat{g}_{x}=n^{-1}\sum_{i=1}^{n}s_{in}(x)Q(Y_{i}),</math> where the weights <math> s_{in}(x)</math> are the empirical versions of the weights in (2), with <math> \mu</math> and <math> \Sigma</math> replaced by the sample mean and sample covariance of <math> X_{1},\ldots,X_{n},</math> and <math> Q(w)</math> is the quantile function corresponding to <math> w,</math> for any <math> w \in \Omega.</math> Since <math> \hat{g}_{x} \in L^{2}[0,1],</math> the global Fréchet regression estimator is

<math> \hat{m}_{\oplus}(x)=\text{argmin}_{w\in\Omega}d^{2}_{L^{2}}(\hat{g}_{x},Q(w))=Q^{-1}(\text{argmin}_{q\in Q(\Omega)}d^{2}_{L^{2}}(\hat{g}_{x},q)),</math>

where <math> d_{L^{2}}</math> is the standard <math> L^{2}</math> metric.
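The estimator above can be sketched numerically by representing each quantile function on a common equispaced grid in <math>(0,1)</math>. Under that discretization, the projection of <math>\hat{g}_{x}</math> onto nondecreasing functions becomes an isotonic regression, computed here with the pool-adjacent-violators algorithm. The discretization, the plug-in weights for a scalar predictor, the uniform-distribution test data and all function names are assumptions of this example rather than part of the method's definition.

```python
import numpy as np

def global_weights_1d(X, x):
    """Empirical global weights for a scalar predictor."""
    Xbar = X.mean()
    return 1.0 + (X - Xbar) * (x - Xbar) / np.mean((X - Xbar) ** 2)

def project_monotone(g):
    """L2 projection onto nondecreasing sequences (pool-adjacent-violators)."""
    values, sizes = [], []
    for v in g:
        values.append(float(v))
        sizes.append(1)
        # Merge the last two blocks while they violate monotonicity.
        while len(values) > 1 and values[-2] > values[-1]:
            total = sizes[-2] + sizes[-1]
            merged = (values[-2] * sizes[-2] + values[-1] * sizes[-1]) / total
            values[-2:] = [merged]
            sizes[-2:] = [total]
    return np.repeat(values, sizes)

def wasserstein_global_fit(X, Q, x):
    """Q[i] is the quantile function of Y_i evaluated on a common equispaced grid."""
    w = global_weights_1d(X, x)
    g_hat = (w[:, None] * Q).mean(axis=0)  # weighted average of quantile functions
    return project_monotone(g_hat)         # quantile function of the fitted distribution

# Responses: uniform distributions on [2 X_i, 2 X_i + 1 + X_i].
t = (np.arange(50) + 0.5) / 50
X = np.linspace(0.0, 1.0, 21)
Q = 2.0 * X[:, None] + (1.0 + X[:, None]) * t[None, :]

fit_inside = wasserstein_global_fit(X, Q, 0.5)    # g_hat is already monotone
fit_outside = wasserstein_global_fit(X, Q, -2.0)  # g_hat is decreasing, so it is projected
```

At <math>x=0.5</math> the weighted average is already a valid quantile function and the projection leaves it unchanged. At <math>x=-2</math>, far outside the data, the weighted average is decreasing and the projection collapses it to a constant, which corresponds to a point mass.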

Regression for correlation matrices with the Frobenius metric

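Consider a space of random objects <math> \Omega</math> which consists of correlation matrices equipped with the Frobenius metric <math> d_{F}.</math> From a sample <math> (X_{i},Y_{i}),i=1,\ldots,n,</math> the minimization defining the global Fréchet regression estimator is reformulated by setting <math> \hat{B}(x)=n^{-1}\sum_{i=1}^{n}s_{in}(x)Y_{i},</math> where the weights <math> s_{in}(x)</math> are the empirical versions of the weights in (2), with <math> \mu</math> and <math> \Sigma</math> replaced by the sample mean and sample covariance of <math> X_{1},\ldots,X_{n},</math> and computing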

<math> \hat{m}_{\oplus}(x)=\text{argmin}_{w\in\Omega}d_{F}(\hat{B}(x),w).</math>

Therefore, this problem is reduced to finding the correlation matrix which is nearest to the matrix <math> \hat{B}(x),</math> which has been well studied.^[4]^[5]^[6]
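One way to carry out this last step is the alternating projections scheme with Dykstra's correction described by Higham.^[4] The sketch below is a simplified version of that idea and not a reference implementation: it alternates between projecting onto the positive semidefinite cone and resetting the diagonal to one. The function name `nearest_correlation`, the stopping rule and the test matrix are assumptions of this example.

```python
import numpy as np

def nearest_correlation(B, tol=1e-10, max_iter=5000):
    """Approximate the correlation matrix nearest to B in Frobenius norm."""
    Y = (B + B.T) / 2.0   # work with the symmetric part
    dS = np.zeros_like(Y)  # Dykstra correction for the PSD projection
    for _ in range(max_iter):
        R = Y - dS
        # Projection onto the PSD cone: clip negative eigenvalues at zero.
        eigval, eigvec = np.linalg.eigh(R)
        X = (eigvec * np.clip(eigval, 0.0, None)) @ eigvec.T
        dS = X - R
        # Projection onto symmetric matrices with unit diagonal.
        Y_new = X.copy()
        np.fill_diagonal(Y_new, 1.0)
        change = np.linalg.norm(Y_new - Y, "fro")
        Y = Y_new
        if change <= tol * max(1.0, np.linalg.norm(Y, "fro")):
            break
    return Y

# A symmetric, unit-diagonal but indefinite matrix, as can arise for B_hat(x)
# when some regression weights are negative.
B_hat = np.array([[1.0, 0.9, -0.9],
                  [0.9, 1.0, 0.9],
                  [-0.9, 0.9, 1.0]])
fit = nearest_correlation(B_hat)
print(np.round(fit, 4))
print(np.linalg.eigvalsh(fit).min())
```

For this test matrix, a symmetry argument shows that the nearest correlation matrix has off-diagonal entries of magnitude 0.5, which gives a convenient check on the output. The Newton-type methods in the cited literature^[5]^[6] are typically preferred for larger matrices.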

References

[1] Marron, J. Steve; Alonso, Andrés M. (2014-01-13). "Overview of object oriented data analysis". Biometrical Journal. 56 (5): 732–753. doi:10.1002/bimj.201300072. ISSN 0323-3847. PMID 24421177. S2CID 26667554.
[2] Wang, Haonan; Marron, J. S. (2007-10-01). "Object oriented data analysis: Sets of trees". The Annals of Statistics. 35 (5). arXiv:0711.3147. doi:10.1214/009053607000000217. ISSN 0090-5364. S2CID 15386459.
[3] Fréchet, M. (1948). "Les éléments aléatoires de nature quelconque dans un espace distancié". Annales de l'institut Henri Poincaré (in French). 10: 215–310. S2CID 124022420.
[4] Higham, Nicholas J. (2002-07-01). "Computing the nearest correlation matrix—a problem from finance". IMA Journal of Numerical Analysis. 22 (3): 329–343. doi:10.1093/imanum/22.3.329. ISSN 0272-4979.
[5] Qi, Houduo; Sun, Defeng (2006-01-01). "A Quadratically Convergent Newton Method for Computing the Nearest Correlation Matrix". SIAM Journal on Matrix Analysis and Applications. 28 (2): 360–385. doi:10.1137/050624509. ISSN 0895-4798.
[6] Borsdorf, Rüdiger; Higham, Nicholas J. (2010-01-01). "A preconditioned Newton algorithm for the nearest correlation matrix". IMA Journal of Numerical Analysis. 30 (1): 94–107. doi:10.1093/imanum/drn085. ISSN 0272-4979.


This article "Fréchet Regression" is from Wikipedia. The list of its authors can be seen in its historical. Articles taken from Draft Namespace on Wikipedia could be accessed on Wikipedia's Draft Namespace.
