I am trying to get a measure of dissimilarity (reconstruction error) between a single estimated factor from a KPCA and the five variables used as inputs to the KPCA procedure. I used kernlab in R to perform the KPCA.
Below is a reproducible example and my humble attempt at writing a function for the RMSE:
library(dplyr)
library(kernlab)

set.seed(86491)
N = 200
latent = rnorm(N)
item1 = latent + rnorm(N, mean=0, sd=0.2)
item2 = latent + rnorm(N, mean=0, sd=0.3)
item3 = latent + rnorm(N, mean=0, sd=0.25)
item4 = latent + rnorm(N, mean=0, sd=0.19)
item5 = latent + rnorm(N, mean=0, sd=0.9)
y = latent + rnorm(N, mean=500, sd=100)

# blunt method, but it works
df <- as.data.frame(as.matrix(cbind(item1, item2, item3, item4, item5)))
df.matrix <- as.matrix(df)
df.kpca <- kpca(df.matrix, kernel = "rbfdot", kpar=list(sigma=0.2), features = 1)
df.kpca_scores <- df.kpca@pcv

# I am trying to get some measure of reconstruction error:
# a measure of dissimilarity between the single KPCA factor output and the five input variables.
# Basic functional form. I am struggling with the implementation of it.
RMSE = function(kpca, df){
  sqrt(mean((df - kpca)^2))
}
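One possible way to make "reconstruction error" concrete, offered as a sketch rather than a definitive answer: the single KPCA factor lives in the kernel feature space, not in the space of the five items, so subtracting it from the inputs as in RMSE above does not compare like with like. A commonly used alternative is the feature-space reconstruction error, i.e. the squared norm of each centred feature vector minus its squared projection onto the retained component(s). The code below computes this with base R only, so it does not rely on any kernlab internals. It assumes the same kernel parameterisation as rbfdot, k(x, y) = exp(-sigma * ||x - y||^2); the names recon_err, rmse_feature and var_explained are hypothetical and introduced for illustration.

```r
# Sketch: feature-space reconstruction error for kernel PCA (base R only).
set.seed(86491)
N <- 200
latent <- rnorm(N)
X <- cbind(item1 = latent + rnorm(N, sd = 0.2),
           item2 = latent + rnorm(N, sd = 0.3),
           item3 = latent + rnorm(N, sd = 0.25),
           item4 = latent + rnorm(N, sd = 0.19),
           item5 = latent + rnorm(N, sd = 0.9))

sigma <- 0.2  # assumed to match rbfdot: k(x, y) = exp(-sigma * ||x - y||^2)
q <- 1        # number of retained kernel principal components

# RBF kernel matrix from pairwise squared Euclidean distances
K <- exp(-sigma * as.matrix(dist(X))^2)

# Centre the kernel matrix in feature space: Kc = H K H, with H = I - 11'/N
H <- diag(N) - matrix(1 / N, N, N)
Kc <- H %*% K %*% H

# Eigen-decomposition of the centred kernel matrix
eig <- eigen(Kc, symmetric = TRUE)
lambda <- eig$values[seq_len(q)]
alpha <- eig$vectors[, seq_len(q), drop = FALSE]

# Projections of each observation onto unit-norm feature-space components
scores <- sweep(Kc %*% alpha, 2, sqrt(lambda), "/")

# Per-observation squared error: ||phi_c(x_i)||^2 - sum_k score_ik^2
recon_err <- pmax(diag(Kc) - rowSums(scores^2), 0)  # clip round-off negatives

rmse_feature <- sqrt(mean(recon_err))             # RMSE in feature space
var_explained <- sum(lambda) / sum(diag(Kc))      # share of feature-space variance kept
```

The mean of recon_err equals the sum of the discarded eigenvalues divided by N, so rmse_feature and var_explained are two views of the same quantity. An error measured in the original units of the five items would instead require mapping the projection back to input space (the pre-image problem), which this sketch does not attempt.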