| Title: | Distribution Comparison Through Density Ratio Estimation |
|---|---|
| Description: | Fast, flexible and user-friendly tools for distribution comparison through direct density ratio estimation. The estimated density ratio can be used for covariate shift adjustment, outlier detection, change-point detection, classification and evaluation of synthetic data quality. The package implements multiple non-parametric estimation techniques (unconstrained least-squares importance fitting, ulsif(); Kullback-Leibler importance estimation procedure, kliep(); spectral density ratio estimation, spectral(); kernel mean matching, kmm(); and least-squares hetero-distributional subspace search, lhss()), with automatic tuning of hyperparameters. Helper functions are available for two-sample testing and visualizing the density ratios. For a general overview of density ratio estimation, see Sugiyama et al. (2012) <doi:10.1017/CBO9781139035613>, and see the help files for references on the specific estimation techniques. |
| Authors: | Thom Volker [aut, cre] (ORCID: <https://orcid.org/0000-0002-2408-7820>), Carlos Gonzalez Poses [ctb], Erik-Jan van Kesteren [ctb] |
| Maintainer: | Thom Volker <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.2.2.9000 |
| Built: | 2026-05-31 08:15:31 UTC |
| Source: | https://github.com/thomvolker/densityratio |
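
To make the central quantity concrete, the following base R sketch uses two normal distributions whose density ratio is known in closed form and applies that ratio as importance weights, in the spirit of covariate shift adjustment. It does not call any function from this package; the distributions, the sample size and all object names are illustrative assumptions.

```r
set.seed(123)

# Assumed setup: numerator density N(0, 1), denominator density N(1, 1)
x_de <- rnorm(5000, mean = 1, sd = 1)

# True density ratio r(x) = p_nu(x) / p_de(x), evaluated at the denominator sample
r <- dnorm(x_de, mean = 0, sd = 1) / dnorm(x_de, mean = 1, sd = 1)

# The plain mean estimates the denominator mean (about 1) ...
unweighted_mean <- mean(x_de)

# ... while the ratio-weighted mean approximates the numerator mean (about 0)
weighted_mean <- sum(r * x_de) / sum(r)
```

In practice the true ratio is unknown, which is where the estimators in this package come in.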
Colon cancer data set from Princeton, containing 2000 gene expressions from 22 colon tumor tissues and 40 non-tumor tissues. The data were collected by Alon et al. (1999) and can be obtained from here.
A data.frame with 62 rows and 2001 columns (class variable and 2000 gene expressions).
Bivariate plot
create_bivariate_plot(data, ext, vars, logscale, show.sample)
data |
Data frame with the individual values and density ratio estimates |
ext |
Data frame with the density ratio estimates and sample indicator |
vars |
Character vector of variable names to be plotted. |
logscale |
Logical indicating whether the density ratio should be plotted in log scale. Defaults to TRUE. |
show.sample |
Logical indicating whether to give different shapes to observations, depending on the sample they come from (numerator or denominator). Defaults to FALSE. |
Bivariate plot
Scatterplot of individual values and density ratio estimates. Used internally in create_univariate_plot()
create_univariate_plot(data, ext, var, y_lab, sample.facet = TRUE)
data |
Data frame with the individual values and density ratio estimates |
ext |
Data frame with the density ratio estimates and sample indicator |
var |
Name of the variable to be plotted on the x-axis |
y_lab |
Name of the y-axis label, typically "Density Ratio" or "Log Density Ratio" |
sample.facet |
Logical indicating whether to facet the plot by sample. Default is TRUE. |
A scatterplot of variable values and density ratio estimates.
Simulated data set (see data-raw/generate-data-densityratio.R) with five variables that are used in the examples.
A data frame with 1000 rows and 5 columns:
Categorical variable with three categories, 'A', 'B' and 'C'
Categorical variable with two categories, 'G1' and 'G2'
Continuous variable (normally distributed given x1 and x2)
Continuous variable (normally distributed)
Continuous variable (normally distributed)
Subset of the denominator_data with three variables and 100 observations
A data frame with 100 rows and 3 columns:
Continuous variable (normally distributed given x1 and x2)
Continuous variable (normally distributed)
Continuous variable (normally distributed)
Create a Gram matrix with squared Euclidean distances between observations in the input matrix X and the input matrix Y
X |
A numeric input matrix |
Y |
A numeric input matrix with the same variables as X |
intercept |
Logical indicating whether an intercept should be added to the estimation procedure. In this case, the first column is an all-zero column (which will be transformed into an all-ones column in the kernel). |
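The squared Euclidean distances described above can be sketched in base R as follows. The helper name sq_dist is hypothetical and is not the package's internal function; the intercept handling is left out, and the example matrices are made up for illustration.

```r
# Hypothetical helper: squared Euclidean distances between the rows of X
# and the rows of Y (both must have the same number of columns).
sq_dist <- function(X, Y) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  stopifnot(ncol(X) == ncol(Y))
  # Uses ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * x'y
  d <- outer(rowSums(X^2), rowSums(Y^2), "+") - 2 * tcrossprod(X, Y)
  # Guard against tiny negative values caused by floating-point rounding
  pmax(d, 0)
}

X <- matrix(c(0, 0,
              1, 1), ncol = 2, byrow = TRUE)
Y <- matrix(c(0, 1,
              2, 2,
              1, 1), ncol = 2, byrow = TRUE)

# One row per observation in X, one column per observation in Y
D <- sq_dist(X, Y)
```

The result has nrow(X) rows and nrow(Y) columns, which is the shape a kernel Gram matrix between two samples needs.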
Creates a histogram of the density ratio estimates. Useful to understand the distribution of estimated density ratios in each sample, or compare it among samples. It is the default plotting method for density ratio objects.
dr.histogram(x, samples = "both", logscale = TRUE, binwidth = NULL, bins = NULL, tol = 0.01, ...)

## S3 method for class 'ulsif'
plot(x, samples = "both", logscale = TRUE, binwidth = NULL, bins = NULL, tol = 0.01, ...)

## S3 method for class 'kliep'
plot(x, samples = "both", logscale = TRUE, binwidth = NULL, bins = NULL, tol = 0.01, ...)

## S3 method for class 'kmm'
plot(x, samples = "both", logscale = TRUE, binwidth = NULL, bins = NULL, tol = 0.01, ...)

## S3 method for class 'spectral'
plot(x, samples = "both", logscale = TRUE, binwidth = NULL, bins = NULL, tol = 0.01, ...)

## S3 method for class 'lhss'
plot(x, samples = "both", logscale = TRUE, binwidth = NULL, bins = NULL, tol = 0.01, ...)

## S3 method for class 'naivedensityratio'
plot(x, samples = "both", logscale = TRUE, binwidth = NULL, bins = NULL, tol = 0.01, ...)
x |
Density ratio object created with e.g., |
samples |
Character string indicating whether to plot the 'numerator', 'denominator', or 'both' samples. Default is 'both'. |
logscale |
Logical indicating whether to plot the density ratio estimates on a log scale. Default is TRUE. |
binwidth |
Numeric indicating the width of the bins, passed on to
|
bins |
Numeric indicating the number of bins. Overridden by binwidth, and
passed on to |
tol |
Numeric indicating the tolerance: values below this value will be set to the tolerance value, for legibility of the plots |
... |
Additional arguments passed on to |
A histogram of density ratio estimates.
ulsif for example usage
kliep for example usage
kmm for example usage
spectral for example usage
lhss for example usage
naive for example usage
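The effect of the tol and logscale arguments of the histogram method can be sketched in base R. This is a minimal illustration under the assumption that values are first raised to the tolerance and then transformed with the natural logarithm; the helper name clamp_and_log and the example values are hypothetical, not part of the package.

```r
# Hypothetical helper: raise small estimates to the tolerance, then
# optionally move to the log scale so the plot stays legible.
clamp_and_log <- function(ratio, tol = 0.01, logscale = TRUE) {
  ratio <- pmax(ratio, tol)
  if (logscale) log(ratio) else ratio
}

# Made-up density ratio estimates, including a zero and a very small value
r_hat <- c(0, 0.005, 0.5, 2)

on_log_scale <- clamp_and_log(r_hat)
on_raw_scale <- clamp_and_log(r_hat, logscale = FALSE)
```

Without the tolerance, the zero estimate would map to -Inf on the log scale and could not be binned.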
Insurance data that is openly available (e.g., on Kaggle).
A data.frame with 1338 rows and 7 columns:
Age of the insured (continuous)
Sex of the insured (binary)
Body mass index of the insured (continuous)
Number of children/dependents covered by the insurance (integer)
Whether the insured is a smoker (binary)
The region in which the insured lives (categorical)
The medical costs billed by the insurance (continuous)
Create a Gaussian kernel Gram matrix from a distance matrix
dist |
A numeric distance matrix |
sigma |
A scalar with the length-scale parameter |
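A minimal base R sketch of this transformation follows. It assumes that dist holds squared Euclidean distances and that the kernel takes the common form exp(-dist / (2 * sigma^2)); the package's exact parameterization may differ, and the helper name gaussian_gram is hypothetical.

```r
# Hypothetical helper: Gaussian kernel Gram matrix from squared distances.
# Assumed kernel form: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))
gaussian_gram <- function(dist, sigma) {
  stopifnot(is.numeric(dist), length(sigma) == 1, sigma > 0)
  exp(-dist / (2 * sigma^2))
}

# Squared distances between two points that are one unit apart
D <- matrix(c(0, 1,
              1, 0), nrow = 2, byrow = TRUE)

K <- gaussian_gram(D, sigma = 1)
```

Under this form a distance of zero maps to a kernel value of one, which is consistent with an all-zero intercept column in the distance matrix becoming an all-ones column in the kernel.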
The kidiq data stems from the National Longitudinal Survey of Youth and is used in Gelman and Hill (2007). The data set contains 434 observations measured on five variables, and is obtained from https://github.com/jknowles/BDAexampleR.
A data.frame with 434 rows and 5 columns
Child's IQ score (continuous)
Whether the mother obtained a high school degree (binary)
Mother's IQ score (continuous)
Whether the mother worked in the first three years of the child's life (1: not in the first three years; 2: in the second or third year; 3: part-time in the first year; 4: full-time in the first year)
Mother's age (continuous)
Kullback-Leibler importance estimation procedure
kliep(
  df_numerator,
  df_denominator,
  scale = "numerator",
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  ncenters = 200,
  centers = NULL,
  cv = TRUE,
  nfold = 5,
  epsilon = NULL,
  maxit = 5000,
  progressbar = TRUE
)
df_numerator |
|
df_denominator |
|
scale |
|
nsigma |
Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation. |
sigma_quantile |
|
sigma |
|
ncenters |
Maximum number of Gaussian centers in the kernel Gram matrix. Defaults to 200. |
centers |
Option to specify the Gaussian centers manually. |
cv |
Logical indicating whether or not to do cross-validation |
nfold |
Number of cross-validation folds used in order to calculate the
optimal |
epsilon |
Numeric scalar or vector with the learning rate for the
gradient-ascent procedure. If a vector, all values are used as the learning
rate. By default, |
maxit |
Maximum number of iterations for the optimization scheme. |
progressbar |
Logical indicating whether or not to display a progressbar. |
kliep-object, containing all information to calculate the
density ratio using optimal sigma and optimal weights.
Sugiyama, M., Suzuki, T., Nakajima, S., Kashima, H., Von Bünau, P., & Kawanabe, M. (2008). Direct importance estimation for covariate shift adaptation. Annals of the Institute of Statistical Mathematics 60, 699-746. Doi: https://doi.org/10.1007/s10463-008-0197-x.
set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small, nsigma = 1, ncenters = 100, nfold = 10, epsilon = 10^{2:-5}, maxit = 500)
Kernel mean matching approach to density ratio estimation
kmm(
  df_numerator,
  df_denominator,
  scale = "numerator",
  constrained = FALSE,
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  ncenters = 200,
  centers = NULL,
  cv = TRUE,
  nfold = 5,
  parallel = FALSE,
  nthreads = NULL,
  progressbar = TRUE,
  osqp_settings = NULL,
  cluster = NULL
)
df_numerator |
|
df_denominator |
|
scale |
|
constrained |
|
nsigma |
Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation. |
sigma_quantile |
|
sigma |
|
ncenters |
Maximum number of Gaussian centers in the kernel Gram matrix. Defaults to 200. |
centers |
Option to specify the Gaussian centers manually. |
cv |
Logical indicating whether or not to do cross-validation |
nfold |
Number of cross-validation folds used in order to calculate the
optimal |
parallel |
Logical indicating whether to use parallel processing in the cross-validation scheme. |
nthreads |
|
progressbar |
Logical indicating whether or not to display a progressbar. |
osqp_settings |
Optional: settings to pass to the |
cluster |
Optional: a cluster object to use for parallel processing,
see |
kmm-object, containing all information to calculate the
density ratio using optimal sigma and optimal weights.
Huang, J., Smola, A. J., Gretton, A., Borgwardt, K. M., & Schölkopf, B. (2006). Correcting sample selection bias by unlabeled data. In Advances in Neural Information Processing Systems, edited by B. Schölkopf, J. Platt and T. Hoffman. Available from https://proceedings.neurips.cc/paper/2006/hash/a2186aa7c086b46ad4e8bf81e2a3a19b-Abstract.html.
set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small, nsigma = 5, ncenters = 100, nfold = 10, constrained = TRUE)
Least-squares hetero-distributional subspace search
lhss(
  df_numerator,
  df_denominator,
  m = NULL,
  intercept = TRUE,
  scale = "numerator",
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  nlambda = 10,
  lambda = NULL,
  ncenters = 200,
  centers = NULL,
  maxit = 200,
  progressbar = TRUE
)
df_numerator |
|
df_denominator |
|
m |
Scalar indicating the dimensionality of the reduced subspace |
intercept |
|
scale |
|
nsigma |
Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation. |
sigma_quantile |
|
sigma |
|
nlambda |
Integer indicating the number of |
lambda |
|
ncenters |
Maximum number of Gaussian centers in the kernel Gram matrix. Defaults to 200. |
centers |
Numeric matrix with the same variables as |
maxit |
Maximum number of iterations in the updating scheme. |
progressbar |
Logical indicating whether or not to display a progressbar. |
lhss-object, containing all information to calculate the
density ratio using optimal sigma, optimal lambda and optimal weights.
Sugiyama, M., Yamada, M., Von Bünau, P., Suzuki, T., Kanamori, T. & Kawanabe, M. (2011). Direct density-ratio estimation with dimensionality reduction via least-squares hetero-distributional subspace search. Neural Networks, 24, 183-198. doi:10.1016/j.neunet.2010.10.005.
set.seed(123)
# Fit model (minimal example to limit computation time)
dr <- lhss(numerator_small, denominator_small, nsigma = 5, nlambda = 3, ncenters = 50, maxit = 100)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
The naive approach creates separate kernel density estimates for
the numerator and the denominator samples, and then evaluates their
ratio for the denominator samples. For multivariate data, the density ratio
is computed after an orthogonal linear transformation (principal component
analysis, PCA), such that the new variables can be treated as independent.
To reduce the dimensionality of the PCA solution, set the m parameter to an
integer value smaller than the number of variables.
naive(
  df_numerator,
  df_denominator,
  m = NULL,
  bw = "SJ",
  kernel = "gaussian",
  n = 2L^11,
  ...
)
df_numerator |
|
df_denominator |
|
m |
|
bw |
the smoothing bandwidth to be used. See stats::density for more information. |
kernel |
the kernel to be used. See stats::density for more information. |
n |
|
... |
further arguments passed to stats::density |
naivedensityratio object
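For a single variable, the naive approach described above can be sketched with stats::density and stats::approx alone. This is a minimal illustration, not the package's implementation: the helper name naive_ratio_1d, the interpolation step and the simulated data are assumptions, and the multivariate PCA step is omitted.

```r
# Hypothetical helper: ratio of two univariate kernel density estimates
naive_ratio_1d <- function(nu, de, newdata = nu, bw = "SJ", n = 2L^11, tol = 1e-6) {
  # Separate kernel density estimates for the numerator and denominator samples
  d_nu <- stats::density(nu, bw = bw, n = n)
  d_de <- stats::density(de, bw = bw, n = n)
  # Interpolate both densities at the evaluation points; rule = 2 reuses the
  # boundary value for points outside the estimation grid
  f_nu <- stats::approx(d_nu$x, d_nu$y, xout = newdata, rule = 2)$y
  f_de <- stats::approx(d_de$x, d_de$y, xout = newdata, rule = 2)$y
  # Bound both densities from below to avoid division by (almost) zero
  pmax(f_nu, tol) / pmax(f_de, tol)
}

set.seed(123)
nu <- rnorm(500, mean = 0, sd = 1)  # simulated numerator sample
de <- rnorm(500, mean = 1, sd = 1)  # simulated denominator sample

# The true ratio decreases in x here, so the estimates should decrease as well
r_hat <- naive_ratio_1d(nu, de, newdata = c(-1, 0.5, 2))
```

Estimating two densities and dividing them tends to be unstable where the denominator density is small, which is one motivation for the direct estimators elsewhere in this manual.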
set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m = 2, kernel = "epanechnikov")
Simulated data set (see data-raw/generate-data-densityratio.R) with five variables that are used in the examples.
A data frame with 1000 rows and 5 columns:
Categorical variable with three categories, 'A', 'B' and 'C'
Categorical variable with two categories, 'G1' and 'G2'
Continuous variable (normally distributed given x1 and x2)
Continuous variable (normally distributed given x3)
Continuous variable (mixture of two normally distributed variables)
Subset of the numerator_data with three variables and 50 observations
A data frame with 50 rows and 3 columns:
Continuous variable (normally distributed given x1 and x2)
Continuous variable (normally distributed given x3)
Continuous variable (mixture of two normally distributed variables)
Single permutation
Single permutation statistic of ulsif object
Single permutation statistic of kliep object
Single permutation statistic of kmm object
Single permutation statistic of lhss object
Single permutation statistic of spectral object
Single permutation statistic of naivedensityratio object
permute(object, ...)

## S3 method for class 'ulsif'
permute(object, stacked, nnu, nde, ...)

## S3 method for class 'kliep'
permute(object, stacked, nnu, nde, min_pred = sqrt(.Machine$double.eps), ...)

## S3 method for class 'kmm'
permute(object, stacked, nnu, nde, ...)

## S3 method for class 'lhss'
permute(object, stacked, nnu, nde, ...)

## S3 method for class 'spectral'
permute(object, stacked, nnu, nde, ...)

## S3 method for class 'naivedensityratio'
permute(object, stacked, nnu, nde, min_pred, max_pred, ...)
object |
|
... |
Additional arguments to pass through to specific permute functions. |
stacked |
|
nnu |
Scalar with numerator sample size |
nde |
Scalar with denominator sample size |
min_pred |
Minimum value of the predicted density ratio |
max_pred |
Maximum value of the predicted density ratio |
permutation statistic for a single permutation of the data
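The general idea of a permutation statistic can be sketched in base R. The sketch below assumes that stacked is the pooled numerator and denominator data and that each permutation reassigns observations to groups of size nnu and nde; it uses the absolute difference in means as a stand-in statistic, whereas the package's methods refit a density ratio model. The names perm_test and abs_mean_diff are hypothetical.

```r
# Hypothetical two-sample permutation test with a user-supplied statistic
perm_test <- function(x_nu, x_de, statistic, nperm = 200) {
  stacked <- c(x_nu, x_de)  # pooled data (assumed meaning of 'stacked')
  nnu <- length(x_nu)       # numerator sample size
  nde <- length(x_de)       # denominator sample size
  observed <- statistic(x_nu, x_de)
  permuted <- replicate(nperm, {
    # A single permutation: shuffle the pooled data and split it again
    idx <- sample(nnu + nde)
    statistic(stacked[idx[seq_len(nnu)]], stacked[idx[nnu + seq_len(nde)]])
  })
  # Add-one correction keeps the p-value strictly positive
  p_value <- (1 + sum(permuted >= observed)) / (nperm + 1)
  list(observed = observed, permuted = permuted, p_value = p_value)
}

# Stand-in statistic (not the statistic used by the package)
abs_mean_diff <- function(a, b) abs(mean(a) - mean(b))

set.seed(123)
res <- perm_test(rnorm(100, mean = 0), rnorm(100, mean = 1), abs_mean_diff)
```

The reference distribution is built from the statistic computed on each reshuffled split, and the p-value is the share of permuted statistics at least as large as the observed one.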
Plots a scatterplot of two variables, with the density ratio mapped to the colour scale.
plot_bivariate(
  x,
  vars = NULL,
  samples = "both",
  grid = FALSE,
  logscale = TRUE,
  show.sample = FALSE,
  tol = 0.01,
  ...
)
x |
Density ratio object created with e.g., |
vars |
Character vector of variable names for which all pairwise bivariate plots are created |
samples |
Character string indicating whether to plot the 'numerator', 'denominator', or 'both' samples. Default is 'both'. |
grid |
Logical indicating whether the output should be a list of individual plots (FALSE) or one facetted plot with all variables (TRUE). Defaults to FALSE. |
logscale |
Logical indicating whether to plot the density ratio
estimates on a log scale. Default is TRUE. |
show.sample |
Logical indicating whether to give different shapes to
observations, depending on the sample they come from (numerator or
denominator). Defaults to FALSE. |
tol |
Numeric indicating the tolerance: values below this value will be set to the tolerance value, for legibility of the plots |
... |
Additional arguments passed to the predict() function. |
Bivariate scatter plots of all combinations of variables in vars.
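How many plots to expect follows from the number of variable pairs. The base R sketch below enumerates them with utils::combn, assuming that "all combinations" means all unordered pairs; the variable names are made up for illustration.

```r
# Illustrative variable names (assumption, not taken from a fitted object)
vars <- c("x1", "x2", "x3")

# All unordered pairs of variables, one character vector of length 2 per pair
var_pairs <- utils::combn(vars, 2, simplify = FALSE)

# Number of bivariate plots implied by these pairs
n_plots <- length(var_pairs)
```

With p variables this gives p * (p - 1) / 2 pairs, so restricting vars keeps the output manageable for wide data.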
set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)
A scatter plot showing the relationship between estimated density ratios and individual variables.
plot_univariate(
  x,
  vars = NULL,
  samples = "both",
  logscale = TRUE,
  grid = FALSE,
  sample.facet = FALSE,
  nrow.panel = NULL,
  tol = 0.01,
  ...
)
x |
Density ratio object created with e.g., |
vars |
Character vector of variable names to be plotted. |
samples |
Character string indicating whether to plot the 'numerator', 'denominator', or 'both' samples. Default is 'both'. |
logscale |
Logical indicating whether to plot the density ratio estimates on a log scale. Default is TRUE. |
grid |
Logical indicating whether the output should be a list of individual plots (FALSE) or one facetted plot with all variables (TRUE). Defaults to FALSE. |
sample.facet |
Logical indicating whether to facet the plot by sample, i.e., showing a separate plot for each sample, side by side. Defaults to FALSE. |
nrow.panel |
Integer indicating the number of rows in the assembled plot. If NULL, the number of rows is automatically calculated. |
tol |
Numeric indicating the tolerance: values below this value will be set to the tolerance value, for legibility of the plots |
... |
Additional arguments passed to the predict() function. |
Scatter plot of density ratios and individual variables.
set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)
Obtain predicted density ratio values from a kliep object
## S3 method for class 'kliep'
predict(object, newdata = NULL, sigma = c("sigmaopt", "all"), ...)
object |
A |
newdata |
Optional |
sigma |
A scalar with the Gaussian kernel width |
... |
Additional arguments to be passed to the function |
An array with predicted density ratio values for newdata if supplied, and otherwise for the numerator samples.
set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small, nsigma = 1, ncenters = 100, nfold = 10, epsilon = 10^{2:-5}, maxit = 500)
Obtain predicted density ratio values from a kmm object
## S3 method for class 'kmm'
predict(object, newdata = NULL, sigma = c("sigmaopt", "all"), ...)
object |
A |
newdata |
Optional |
sigma |
A scalar with the Gaussian kernel width |
... |
Additional arguments to be passed to the function |
An array with predicted density ratio values for newdata if supplied, and otherwise for the numerator samples.
set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small, nsigma = 5, ncenters = 100, nfold = 10, constrained = TRUE)
Obtain predicted density ratio values from a lhss object
## S3 method for class 'lhss'
predict(
  object,
  newdata = NULL,
  sigma = c("sigmaopt", "all"),
  lambda = c("lambdaopt", "all"),
  ...
)
object |
A |
newdata |
Optional |
sigma |
A scalar with the Gaussian kernel width |
lambda |
A scalar with the regularization parameter |
... |
Additional arguments to be passed to the function |
An array with predicted density ratio values for newdata if supplied, and otherwise for the numerator samples.
set.seed(123)
# Fit model (minimal example to limit computation time)
dr <- lhss(numerator_small, denominator_small, nsigma = 5, nlambda = 3, ncenters = 50, maxit = 100)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
Obtain predicted density ratio values from a naivedensityratio object
## S3 method for class 'naivedensityratio'
predict(object, newdata = NULL, log = FALSE, tol = 1e-06, ...)
object |
A |
newdata |
Optional |
log |
A logical indicating whether to return the log of the density ratio |
tol |
Minimal density value to avoid numerical issues |
... |
Additional arguments to be passed to the function |
An array with predicted density ratio values for newdata if supplied, and otherwise for the numerator samples.
set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m = 2, kernel = "epanechnikov")
Obtain predicted density ratio values from a spectral object
## S3 method for class 'spectral'
predict(
  object,
  newdata = NULL,
  sigma = c("sigmaopt", "all"),
  m = c("opt", "all"),
  ...
)
object |
A |
newdata |
Optional |
sigma |
A scalar with the Gaussian kernel width |
m |
integer indicating the dimension of the eigenvector expansion |
... |
Additional arguments to be passed to the function |
An array with predicted density ratio values for newdata if it is supplied, and for the numerator samples otherwise.
Obtain predicted density ratio values from a ulsif object
## S3 method for class 'ulsif'
predict(
  object,
  newdata = NULL,
  sigma = c("sigmaopt", "all"),
  lambda = c("lambdaopt", "all"),
  ...
)
object |
A |
newdata |
Optional |
sigma |
A scalar with the Gaussian kernel width |
lambda |
A scalar with the regularization parameter |
... |
Additional arguments to be passed to the function |
An array with predicted density ratio values for newdata if it is supplied, and for the numerator samples otherwise.
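To make the roles of sigma and lambda concrete, the following base-R sketch works through the textbook closed-form uLSIF solution for a single (sigma, lambda) pair and then predicts at new points. It is a minimal sketch under stated assumptions: the kernel form exp(-d^2 / (2 * sigma^2)), the helper gauss_kernel(), the choice of centers and the omission of an intercept and of cross-validation are all illustrative and do not reproduce the package's implementation.

```r
# Gaussian kernel between rows of x and rows of centers
# (assumed form: exp(-||x - c||^2 / (2 * sigma^2))).
gauss_kernel <- function(x, centers, sigma) {
  d2 <- outer(rowSums(x^2), rowSums(centers^2), "+") - 2 * x %*% t(centers)
  exp(-pmax(d2, 0) / (2 * sigma^2))
}

set.seed(1)
x_nu <- matrix(rnorm(200, mean = 0), ncol = 1)  # numerator samples
x_de <- matrix(rnorm(200, mean = 1), ncol = 1)  # denominator samples
centers <- x_nu[1:50, , drop = FALSE]           # kernel centers
sigma  <- 1     # Gaussian kernel width
lambda <- 0.1   # ridge-type regularization parameter

Phi_nu <- gauss_kernel(x_nu, centers, sigma)
Phi_de <- gauss_kernel(x_de, centers, sigma)
H <- crossprod(Phi_de) / nrow(Phi_de)  # second moment under the denominator
h <- colMeans(Phi_nu)                  # first moment under the numerator

# Regularized least-squares solution, with negative weights set to zero.
alpha <- pmax(solve(H + lambda * diag(ncol(H)), h), 0)

# Predicted density ratio at new data points.
newdata <- matrix(c(-1, 0, 2), ncol = 1)
pred <- drop(gauss_kernel(newdata, centers, sigma) %*% alpha)
```

A larger lambda shrinks the weights towards zero, while sigma controls how local each kernel is.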
set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)
Print a kliep object
## S3 method for class 'kliep'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted kliep object.
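The print methods in this manual share one signature and return their input invisibly. The sketch below shows that general S3 pattern for a made-up class; the class name toyratio and its sigma field are hypothetical and serve only to illustrate how digits and invisible() fit together.

```r
# Hypothetical class "toyratio", used only to illustrate the pattern.
print.toyratio <- function(x, digits = max(3L, getOption("digits") - 3L), ...) {
  cat("Optimal sigma:", format(x$sigma, digits = digits, ...), "\n")
  invisible(x)  # hand the object back without printing it a second time
}

obj <- structure(list(sigma = 1.23456789), class = "toyratio")
out <- print(obj, digits = 4)  # sigma is shown with 4 significant digits
identical(out, obj)            # the returned value is the original object
```

Returning the object invisibly lets a print call sit inside a pipeline or an assignment without producing duplicate output.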
set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 1, ncenters = 100, nfold = 10,
      epsilon = 10^{2:-5}, maxit = 500)
Print a kmm object
## S3 method for class 'kmm'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted kmm object.
set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 5, ncenters = 100, nfold = 10, constrained = TRUE)
Print a lhss object
## S3 method for class 'lhss'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted lhss object.
set.seed(123)
# Fit model (minimal example to limit computation time)
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 3, ncenters = 50, maxit = 100)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
Print a naivedensityratio object
## S3 method for class 'naivedensityratio'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted naivedensityratio object.
set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m = 2, kernel = "epanechnikov")
Print a spectral object
## S3 method for class 'spectral'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted spectral object.
set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)
Print a summary.kliep object
## S3 method for class 'summary.kliep'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted summary.kliep object.
set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 1, ncenters = 100, nfold = 10,
      epsilon = 10^{2:-5}, maxit = 500)
Print a summary.kmm object
## S3 method for class 'summary.kmm'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted summary.kmm object.
set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 5, ncenters = 100, nfold = 10, constrained = TRUE)
Print a summary.lhss object
## S3 method for class 'summary.lhss'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted summary.lhss object.
set.seed(123)
# Fit model (minimal example to limit computation time)
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 3, ncenters = 50, maxit = 100)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
Print a summary.naivedensityratio object
## S3 method for class 'summary.naivedensityratio'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted summary.naivedensityratio object.
print, summary.naivedensityratio, naive
set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m = 2, kernel = "epanechnikov")
Print a summary.spectral object
## S3 method for class 'summary.spectral'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted summary.spectral object.
print, summary.spectral, spectral
set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)
Print a summary.ulsif object
## S3 method for class 'summary.ulsif'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted summary.ulsif object.
set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)
Print a ulsif object
## S3 method for class 'ulsif'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x |
Object of class |
digits |
Number of digits to use when printing the output. |
... |
further arguments on how to format the number of digits. |
Invisibly returns the inputted ulsif object.
set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)
Spectral series based density ratio estimation
spectral(
  df_numerator,
  df_denominator,
  m = NULL,
  scale = "numerator",
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  ncenters = NULL,
  cv = TRUE,
  nfold = 10,
  parallel = FALSE,
  nthreads = NULL,
  progressbar = TRUE
)
df_numerator |
|
df_denominator |
|
m |
Integer vector indicating the number of eigenvectors to use in the spectral series expansion. Defaults to 50 evenly spaced values between 1 and the number of denominator samples (or the largest number of samples that can be used as centers in the cross-validation scheme). |
scale |
|
nsigma |
Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation. |
sigma_quantile |
|
sigma |
|
ncenters |
integer If smaller than the number of denominator observations,
an approximation to the eigenvector expansion based on only ncenters samples
is performed, instead of the full expansion. This can be useful for large
datasets. Defaults to |
cv |
logical indicating whether to use cross-validation to determine the optimal sigma value and the optimal number of eigenvectors. |
nfold |
Integer indicating the number of folds to use in the
cross-validation scheme. If |
parallel |
logical indicating whether to use parallel processing in the cross-validation scheme. |
nthreads |
|
progressbar |
Logical indicating whether or not to display a progressbar. |
spectral-object, containing all information to calculate the
density ratio using optimal sigma and optimal spectral series expansion.
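The description of m above (a grid of up to 50 evenly spaced expansion sizes) and of the eigenvector expansion can be sketched in base R as follows. This is an illustration under assumptions: the rounding rule for the grid, the Gaussian kernel form and the variable names are chosen for the example and are not taken from the package source.

```r
set.seed(1)
x_de <- matrix(rnorm(300), ncol = 2)  # 150 denominator samples, 2 variables
n <- nrow(x_de)

# Candidate expansion sizes: up to 50 evenly spaced integers in [1, n].
m_grid <- unique(round(seq(1, n, length.out = 50)))

# Gaussian kernel Gram matrix of the denominator samples
# (assumed form: exp(-d^2 / (2 * sigma^2))).
sigma <- 1
K <- exp(-as.matrix(dist(x_de))^2 / (2 * sigma^2))

# Eigendecomposition; the first m eigenvectors form the truncated basis.
eig <- eigen(K, symmetric = TRUE)
m <- m_grid[5]
basis <- eig$vectors[, seq_len(m), drop = FALSE]
dim(basis)  # n rows, m columns
```

Cross-validation would then compare such truncated bases over the grid of m values and the candidate sigma values.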
Izbicki, R., Lee, A. & Schafer, C. (2014). High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation. Proceedings of Machine Learning Research 33, 420-429. Available from https://proceedings.mlr.press/v33/izbicki14.html.
set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)
Extract summary from kliep object, including two-sample significance test for homogeneity of the numerator and denominator samples
## S3 method for class 'kliep'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  min_pred = 1e-06,
  ...
)
object |
Object of class |
test |
logical indicating whether to statistically test for homogeneity of the numerator and denominator samples. |
n_perm |
Scalar indicating number of permutation samples |
parallel |
|
cluster |
|
min_pred |
Scalar indicating the minimum value for the predicted density ratio values (used in the divergence statistic) to avoid negative density ratio values. |
... |
further arguments passed to or from other methods. |
Summary of the fitted density ratio model
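The test, n_perm and min_pred arguments describe a permutation test on a divergence statistic. The base-R sketch below shows the general idea with a deliberately simple stand-in estimator (a ratio of two fitted normal densities) in place of kliep(). The helpers fit_ratio(), kl_stat() and perm_test(), as well as the p-value convention, are assumptions for illustration and do not describe the package's internals.

```r
# Stand-in estimator (NOT the package's): ratio of two fitted normal densities.
fit_ratio <- function(nu, de) {
  function(x) dnorm(x, mean(nu), sd(nu)) / dnorm(x, mean(de), sd(de))
}

# Kullback-Leibler-type statistic, with predictions floored at min_pred
# so that the logarithm is always defined.
kl_stat <- function(nu, de, min_pred = 1e-06) {
  mean(log(pmax(fit_ratio(nu, de)(nu), min_pred)))
}

perm_test <- function(nu, de, n_perm = 100, min_pred = 1e-06) {
  observed <- kl_stat(nu, de, min_pred)
  pooled <- c(nu, de)
  n_nu <- length(nu)
  permuted <- replicate(n_perm, {
    idx <- sample(length(pooled), n_nu)  # reshuffle the group labels
    kl_stat(pooled[idx], pooled[-idx], min_pred)
  })
  # Share of permuted statistics at least as large as the observed one.
  p_value <- (1 + sum(permuted >= observed)) / (1 + n_perm)
  list(statistic = observed, p_value = p_value)
}

set.seed(123)
res <- perm_test(rnorm(100, mean = 0), rnorm(100, mean = 1), n_perm = 100)
```

Under homogeneity the group labels are exchangeable, so the observed statistic should look like a typical draw from the permuted statistics; a small p-value suggests the two samples differ.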
set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 1, ncenters = 100, nfold = 10,
      epsilon = 10^{2:-5}, maxit = 500)
Extract summary from kmm object, including two-sample significance test for homogeneity of the numerator and denominator samples
## S3 method for class 'kmm'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  min_pred = 1e-06,
  ...
)
object |
Object of class |
test |
logical indicating whether to statistically test for homogeneity of the numerator and denominator samples. |
n_perm |
Scalar indicating number of permutation samples |
parallel |
|
cluster |
|
min_pred |
Scalar indicating the minimum value for the predicted density ratio values (used in the divergence statistic) to avoid negative density ratio values. |
... |
further arguments passed to or from other methods. |
Summary of the fitted density ratio model
set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 5, ncenters = 100, nfold = 10, constrained = TRUE)
Extract summary from lhss object, including two-sample significance test for homogeneity of the numerator and denominator samples
## S3 method for class 'lhss'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  ...
)
object |
Object of class |
test |
logical indicating whether to statistically test for homogeneity of the numerator and denominator samples. |
n_perm |
Scalar indicating number of permutation samples |
parallel |
|
cluster |
|
... |
further arguments passed to or from other methods. |
Summary of the fitted density ratio model
set.seed(123)
# Fit model (minimal example to limit computation time)
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 3, ncenters = 50, maxit = 100)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
Extract summary from naivedensityratio object, including two-sample significance test for homogeneity of the numerator and denominator samples
## S3 method for class 'naivedensityratio'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  ...
)
object |
Object of class |
test |
logical indicating whether to statistically test for homogeneity of the numerator and denominator samples. |
n_perm |
Scalar indicating number of permutation samples |
parallel |
|
cluster |
|
... |
further arguments passed to or from other methods. |
Summary of the fitted density ratio model
set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m = 2, kernel = "epanechnikov")
Extract summary from spectral object, including two-sample significance test for homogeneity of the numerator and denominator samples
## S3 method for class 'spectral'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  ...
)
object |
Object of class |
test |
logical indicating whether to statistically test for homogeneity of the numerator and denominator samples. |
n_perm |
Scalar indicating number of permutation samples |
parallel |
|
cluster |
|
... |
further arguments passed to or from other methods. |
Summary of the fitted density ratio model
set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)
Extract summary from ulsif object, including two-sample significance test for homogeneity of the numerator and denominator samples

    ## S3 method for class 'ulsif'
    summary(
      object,
      test = FALSE,
      n_perm = 100,
      parallel = FALSE,
      cluster = NULL,
      ...
    )

| Argument | Description |
|---|---|
| object | Object of class ulsif |
| test | Logical indicating whether to statistically test for homogeneity of the numerator and denominator samples. |
| n_perm | Scalar indicating the number of permutation samples. |
| parallel | |
| cluster | |
| ... | Further arguments passed to or from other methods. |
Summary of the fitted density ratio model

    set.seed(123)
    # Fit model
    dr <- ulsif(numerator_small, denominator_small)
    # Inspect model object
    dr
    # Obtain summary of model object
    summary(dr)
    # Plot model object
    plot(dr)
    # Plot density ratio for each variable individually
    plot_univariate(dr)
    # Plot density ratio for each pair of variables
    plot_bivariate(dr)
    # Predict density ratio and inspect first 6 predictions
    head(predict(dr))
    # Fit model with custom parameters
    ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)

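The examples above call `summary(dr)` with the default `test = FALSE`. A minimal sketch of requesting the homogeneity test is shown below, using only the arguments listed in the usage. It assumes the densityratio package is installed and is skipped otherwise; the variable `has_pkg` is introduced here for illustration.

```r
# Run only when the package is available (assumption: it may not be installed)
has_pkg <- requireNamespace("densityratio", quietly = TRUE)

if (has_pkg) {
  library(densityratio)
  set.seed(123)
  dr <- ulsif(numerator_small, denominator_small)
  # Request the permutation test; n_perm = 100 matches the documented default
  print(summary(dr, test = TRUE, n_perm = 100))
}
```

Setting the seed before fitting keeps the example reproducible across runs, as in the other examples in this manual.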
Unconstrained least-squares importance fitting

    ulsif(
      df_numerator,
      df_denominator,
      intercept = TRUE,
      scale = "numerator",
      nsigma = 10,
      sigma_quantile = NULL,
      sigma = NULL,
      nlambda = 20,
      lambda = NULL,
      ncenters = 200,
      centers = NULL,
      parallel = FALSE,
      nthreads = NULL,
      progressbar = TRUE
    )

| Argument | Description |
|---|---|
| df_numerator | |
| df_denominator | |
| intercept | |
| scale | |
| nsigma | Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation. |
| sigma_quantile | |
| sigma | |
| nlambda | Integer indicating the number of lambda values (regularization parameter) to use in cross-validation. |
| lambda | |
| ncenters | Maximum number of Gaussian centers in the kernel gram matrix. Defaults to 200. |
| centers | |
| parallel | Logical indicating whether to use parallel processing in the cross-validation scheme. |
| nthreads | |
| progressbar | Logical indicating whether or not to display a progress bar. |
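Several arguments refer to a Gaussian kernel gram matrix with bandwidth sigma and a set of centers. The base R sketch below shows one common way to compute such a matrix. The parameterization `exp(-d^2 / (2 * sigma^2))` is an assumption and may differ from the package's internals; `sq_dist()` and `gauss_gram()` are hypothetical helper names.

```r
# Squared Euclidean distances between rows of x and rows of centers
sq_dist <- function(x, centers) {
  x <- as.matrix(x)
  centers <- as.matrix(centers)
  d <- outer(rowSums(x^2), rowSums(centers^2), "+") - 2 * tcrossprod(x, centers)
  pmax(d, 0) # guard against tiny negative values caused by rounding
}

# Gaussian kernel gram matrix with bandwidth sigma (assumed parameterization)
gauss_gram <- function(x, centers, sigma) {
  exp(-sq_dist(x, centers) / (2 * sigma^2))
}

set.seed(123)
x <- matrix(rnorm(20), ncol = 2)  # 10 observations, 2 variables
centers <- x[1:4, , drop = FALSE] # the first 4 rows act as centers
K <- gauss_gram(x, centers, sigma = 2)
dim(K)
```

A small sigma makes each kernel respond only to observations very close to its center, while a large sigma flattens the matrix towards all ones, which is why the bandwidth is typically tuned by cross-validation.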
ulsif-object, containing all information to calculate the
density ratio using optimal sigma and optimal weights.
Kanamori, T., Hido, S., & Sugiyama, M. (2009). A least-squares approach to direct importance estimation. Journal of Machine Learning Research, 10, 1391-1445. Available from https://jmlr.org/papers/v10/kanamori09a.html

    set.seed(123)
    # Fit model
    dr <- ulsif(numerator_small, denominator_small)
    # Inspect model object
    dr
    # Obtain summary of model object
    summary(dr)
    # Plot model object
    plot(dr)
    # Plot density ratio for each variable individually
    plot_univariate(dr)
    # Plot density ratio for each pair of variables
    plot_bivariate(dr)
    # Predict density ratio and inspect first 6 predictions
    head(predict(dr))
    # Fit model with custom parameters
    ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)

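For readers who want to see the estimator itself, the following base R sketch follows the closed-form solution described by Kanamori et al. (2009) for a single, fixed `sigma` and `lambda`. It is a simplified illustration under stated assumptions, not the package's implementation: it uses simulated data, omits the intercept, scaling and cross-validation, and the names `ulsif_sketch()` and `predict_sketch()` are hypothetical.

```r
set.seed(123)

# Simulated data: numerator N(0, 1), denominator N(0, 2^2)
x_nu <- matrix(rnorm(200, mean = 0, sd = 1), ncol = 1)
x_de <- matrix(rnorm(200, mean = 0, sd = 2), ncol = 1)

# Gaussian kernel between rows of x and rows of centers (assumed parameterization)
gauss_gram <- function(x, centers, sigma) {
  d <- outer(rowSums(x^2), rowSums(centers^2), "+") - 2 * tcrossprod(x, centers)
  exp(-pmax(d, 0) / (2 * sigma^2))
}

ulsif_sketch <- function(x_nu, x_de, sigma, lambda, ncenters = 50) {
  # Centers are drawn from the numerator sample
  idx <- sample(nrow(x_nu), min(ncenters, nrow(x_nu)))
  centers <- x_nu[idx, , drop = FALSE]
  phi_nu <- gauss_gram(x_nu, centers, sigma)
  phi_de <- gauss_gram(x_de, centers, sigma)
  H <- crossprod(phi_de) / nrow(x_de) # second moment under the denominator
  h <- colMeans(phi_nu)               # first moment under the numerator
  # Ridge-regularized least-squares solution
  alpha <- solve(H + lambda * diag(ncol(H)), h)
  # Truncate negative coefficients so the estimated ratio stays non-negative
  alpha <- pmax(alpha, 0)
  list(centers = centers, sigma = sigma, alpha = alpha)
}

predict_sketch <- function(fit, newx) {
  drop(gauss_gram(newx, fit$centers, fit$sigma) %*% fit$alpha)
}

fit <- ulsif_sketch(x_nu, x_de, sigma = 1, lambda = 0.1)
r_hat <- predict_sketch(fit, matrix(c(0, 4), ncol = 1))
r_hat
```

Because the solution is available in closed form, each candidate pair of `sigma` and `lambda` only requires solving one linear system, which is what makes a grid search over both hyperparameters practical.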