Package 'densityratio' reference manual

Title:	Distribution Comparison Through Density Ratio Estimation
Description:	Fast, flexible and user-friendly functionality to directly estimate the ratio of two probability distributions from samples from these distributions without estimating the densities separately. Estimated density ratios can, among other things, be used for prediction, outlier detection, change-point detection in time-series, importance weighting under domain adaptation (i.e., sample selection bias) and evaluation of synthetic data utility. The rationale behind these use-cases is that differences between two data distributions can be captured in their density ratio, which is estimated over the entire multivariate space of the data. Computationally intensive code is executed in `C++` using `Rcpp` and `RcppArmadillo`. The package provides good default hyperparameters that can be optimized in cross-validation (we do recommend understanding those parameters before using `densityratio` in practice). Multiple density ratio estimation methods are implemented, such as unconstrained least-squares importance fitting (`ulsif()`), Kullback-Leibler importance estimation procedure (`kliep()`), spectral density ratio estimation (`spectral()`), and least-squares heterodistributional subspace search (`lhss()`).
Authors:	Thom Volker [aut, cre, cph], Carlos Gonzalez Poses [ctb], Erik-Jan van Kesteren [ctb]
Maintainer:	Thom Volker <[email protected]>
License:	GPL (>= 3)
Version:	0.1.1
Built:	2025-03-30 06:57:15 UTC
Source:	https://github.com/thomvolker/densityratio

Individual univariate plot

Description

Scatterplot of individual values and density ratio estimates. Used internally in plot_univariate().

Usage

create_univariate_plot(data, ext, var, y_lab, sample.facet = TRUE)

Arguments

`data`	Data frame with the individual values and density ratio estimates
`ext`	Data frame with the density ratio estimates and sample indicator
`var`	Name of the variable to be plotted on the x-axis
`y_lab`	Label of the y-axis, typically "Density Ratio" or "Log Density Ratio"
`sample.facet`	Logical indicating whether to facet the plot by sample. Default is TRUE.

Value

A scatterplot of variable values and density ratio estimates.

denominator_data

Description

Simulated data set (see data-raw/generate-data-densityratio.R) with five variables that are used in the examples.

Format

A data frame with 1000 rows and 5 columns:

x1: Categorical variable with three categories, 'A', 'B' and 'C'
x2: Categorical variable with two categories, 'G1' and 'G2'
x3: Continuous variable (normally distributed given x1 and x2)
x4: Continuous variable (normally distributed)
x5: Continuous variable (normally distributed)

denominator_small

Description

Subset of the denominator_data with three variables and 100 observations

Format

A data frame with 100 rows and 3 columns:

x1: Continuous variable (x3 in denominator_data; normally distributed given x1 and x2 of denominator_data)
x2: Continuous variable (x4 in denominator_data; normally distributed)
x3: Continuous variable (x5 in denominator_data; normally distributed)

Create a Gram matrix with squared Euclidean distances between observations in the input matrix `X` and the input matrix `Y`

Description

Create a Gram matrix with squared Euclidean distances between observations in the input matrix X and the input matrix Y

Arguments

`X`	A numeric input matrix
`Y`	A numeric input matrix with the same variables as `X`
`intercept`	Logical indicating whether an intercept should be added to the estimation procedure. In this case, the first column is an all-zero column (which will be transformed into an all-ones column in the kernel).
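
As a rough illustration of what such a matrix contains, the sketch below computes squared Euclidean distances between the rows of two small matrices in base R. The helper name `sq_dist` is hypothetical and is not the package's internal routine; the `intercept` column is left out.

```r
# Hypothetical helper (not the package's internal C++ routine): squared
# Euclidean distances between the rows of X and the rows of Y.
sq_dist <- function(X, Y) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  stopifnot(ncol(X) == ncol(Y))
  # Uses ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * x'y
  D <- outer(rowSums(X^2), rowSums(Y^2), "+") - 2 * tcrossprod(X, Y)
  # Guard against tiny negative values caused by floating-point error
  pmax(D, 0)
}

X <- matrix(c(0, 0,
              1, 1), ncol = 2, byrow = TRUE)
Y <- matrix(c(0, 1,
              3, 4), ncol = 2, byrow = TRUE)
D <- sq_dist(X, Y)
D
```

Entry `D[i, j]` holds the squared distance between row `i` of `X` and row `j` of `Y`.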

A histogram of density ratio estimates

Description

Creates a histogram of the density ratio estimates. Useful to understand the distribution of estimated density ratios in each sample, or compare it among samples. It is the default plotting method for density ratio objects.

Usage

dr.histogram(
  x,
  samples = "both",
  logscale = TRUE,
  binwidth = NULL,
  bins = NULL,
  tol = 0.01,
  ...
)

## S3 method for class 'ulsif'
plot(
  x,
  samples = "both",
  logscale = TRUE,
  binwidth = NULL,
  bins = NULL,
  tol = 0.01,
  ...
)

## S3 method for class 'kliep'
plot(
  x,
  samples = "both",
  logscale = TRUE,
  binwidth = NULL,
  bins = NULL,
  tol = 0.01,
  ...
)

## S3 method for class 'kmm'
plot(
  x,
  samples = "both",
  logscale = TRUE,
  binwidth = NULL,
  bins = NULL,
  tol = 0.01,
  ...
)

## S3 method for class 'spectral'
plot(
  x,
  samples = "both",
  logscale = TRUE,
  binwidth = NULL,
  bins = NULL,
  tol = 0.01,
  ...
)

## S3 method for class 'lhss'
plot(
  x,
  samples = "both",
  logscale = TRUE,
  binwidth = NULL,
  bins = NULL,
  tol = 0.01,
  ...
)

## S3 method for class 'naivedensityratio'
plot(
  x,
  samples = "both",
  logscale = TRUE,
  binwidth = NULL,
  bins = NULL,
  tol = 0.01,
  ...
)

Arguments

`x`	Density ratio object created with e.g., `kliep()`, `ulsif()`, or `naive()`
`samples`	Character string indicating whether to plot the 'numerator', 'denominator', or 'both' samples. Default is 'both'.
`logscale`	Logical indicating whether to plot the density ratio estimates on a log scale. Default is TRUE.
`binwidth`	Numeric indicating the width of the bins, passed on to `ggplot2`.
`bins`	Numeric indicating the number of bins. Overridden by `binwidth`, and passed on to `ggplot2`.
`tol`	Numeric indicating the tolerance: values below this value will be set to the tolerance value, for legibility of the plots
`...`	Additional arguments passed on to `predict()`.
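
The interplay of `tol` and `logscale` can be pictured with a few made-up ratios. This is a minimal base R sketch of the clamping described above, not the plotting code itself; the use of the natural logarithm is an assumption.

```r
# Made-up density ratio estimates, for illustration only
ratio <- c(0.0001, 0.005, 0.5, 1, 4)
tol <- 0.01

# Values below the tolerance are raised to the tolerance ...
clamped <- pmax(ratio, tol)
# ... so that the log-scale values stay bounded from below
log_ratio <- log(clamped)

data.frame(ratio, clamped, log_ratio)
```

Without the clamp, ratios close to zero would stretch the log-scale axis far to the left and compress the rest of the histogram.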

Value

A histogram of density ratio estimates.

Examples

set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)
set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 20, ncenters = 100, nfold = 20,
      epsilon = 10^{3:-5}, maxit = 1000)
set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 20, ncenters = 100, nfold = 20,
    constrained = TRUE, parallel = TRUE, nthreads = 2)
set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)
set.seed(123)
# Fit model
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 5, ncenters = 50)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m=2, kernel="epanechnikov")

Create Gaussian kernel Gram matrix from distance matrix

Description

Create Gaussian kernel Gram matrix from distance matrix

Arguments

`dist`	A numeric distance matrix
`sigma`	A scalar with the length-scale parameter
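
A minimal base R sketch of this transformation follows. It assumes that `dist` holds squared Euclidean distances and uses the common parameterization exp(-dist / (2 * sigma^2)); the exact scaling used inside the package is not stated here, and the helper name `gaussian_kernel` is hypothetical.

```r
# Hypothetical helper: turn a matrix of squared distances into a
# Gaussian kernel Gram matrix with length-scale sigma.
gaussian_kernel <- function(dist, sigma) {
  stopifnot(is.matrix(dist), length(sigma) == 1, sigma > 0)
  exp(-dist / (2 * sigma^2))  # assumed parameterization
}

# Squared distances between two points that are sqrt(2) apart
dist <- matrix(c(0, 2,
                 2, 0), nrow = 2, byrow = TRUE)
K <- gaussian_kernel(dist, sigma = 1)
K
```

A zero distance maps to a kernel value of one, and larger values of `sigma` make the kernel decay more slowly with distance.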

Kullback-Leibler importance estimation procedure

Description

Kullback-Leibler importance estimation procedure

Usage

kliep(
  df_numerator,
  df_denominator,
  scale = "numerator",
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  ncenters = 200,
  centers = NULL,
  cv = TRUE,
  nfold = 5,
  epsilon = NULL,
  maxit = 5000,
  progressbar = TRUE
)

Arguments

`df_numerator`	`data.frame` with exclusively numeric variables with the numerator samples
`df_denominator`	`data.frame` with exclusively numeric variables with the denominator samples (must have the same variables as `df_numerator`)
`scale`	`"numerator"`, `"denominator"`, or `NULL`, indicating whether to standardize each numeric variable according to the numerator means and standard deviations, the denominator means and standard deviations, or apply no standardization at all.
`nsigma`	Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation.
`sigma_quantile`	`NULL` or numeric vector with probabilities used to calculate quantiles of the distance matrix, which serve as sigma values. If `NULL`, `nsigma` probabilities between `0.25` and `0.75` are used.
`sigma`	`NULL` or a scalar value to determine the bandwidth of the Gaussian kernel Gram matrix. If `NULL`, sigma values are taken as quantiles of the distance matrix (by default, `nsigma` quantiles with probabilities between `0.25` and `0.75`).
`ncenters`	Maximum number of Gaussian centers in the kernel Gram matrix. Defaults to `200`.
`centers`	Option to specify the Gaussian centers manually.
`cv`	Logical indicating whether or not to do cross-validation
`nfold`	Number of cross-validation folds used in order to calculate the optimal `sigma` value (default is 5-fold cv).
`epsilon`	Numeric scalar or vector with the learning rate for the gradient-ascent procedure. If a vector, all values are used as the learning rate. By default, `10^{1:-5}` is used.
`maxit`	Maximum number of iterations for the optimization scheme.
`progressbar`	Logical indicating whether or not to display a progressbar.
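
One way to read the `sigma_quantile` and `nsigma` arguments is that candidate bandwidths are quantiles of the pairwise distances. The sketch below assumes equally spaced probabilities and plain Euclidean distances on made-up data; the package's exact choices may differ.

```r
set.seed(1)
X <- matrix(rnorm(40), ncol = 2)  # 20 made-up observations, 2 variables

# Pairwise Euclidean distances between all observations
d <- as.vector(dist(X))

nsigma <- 10
# Assumption: probabilities are equally spaced between 0.25 and 0.75
probs <- seq(0.25, 0.75, length.out = nsigma)
sigma_candidates <- quantile(d, probs = probs, names = FALSE)
sigma_candidates
```

Cross-validation would then pick the candidate that performs best on held-out folds.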

Value

kliep-object, containing all information to calculate the density ratio using optimal sigma and optimal weights.

Examples

set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 20, ncenters = 100, nfold = 20,
      epsilon = 10^{3:-5}, maxit = 1000)

Kernel mean matching approach to density ratio estimation

Description

Kernel mean matching approach to density ratio estimation

Usage

kmm(
  df_numerator,
  df_denominator,
  scale = "numerator",
  constrained = FALSE,
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  ncenters = 200,
  centers = NULL,
  cv = TRUE,
  nfold = 5,
  parallel = FALSE,
  nthreads = NULL,
  progressbar = TRUE,
  osqp_settings = NULL,
  cluster = NULL
)

Arguments

`df_numerator`	`data.frame` with exclusively numeric variables with the numerator samples
`df_denominator`	`data.frame` with exclusively numeric variables with the denominator samples (must have the same variables as `df_numerator`)
`scale`	`"numerator"`, `"denominator"`, or `NULL`, indicating whether to standardize each numeric variable according to the numerator means and standard deviations, the denominator means and standard deviations, or apply no standardization at all.
`constrained`	Logical: `FALSE` uses unconstrained optimization, `TRUE` uses constrained optimization. Defaults to `FALSE`.
`nsigma`	Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation.
`sigma_quantile`	`NULL` or numeric vector with probabilities used to calculate quantiles of the distance matrix, which serve as sigma values. If `NULL`, `nsigma` probabilities between `0.25` and `0.75` are used.
`sigma`	`NULL` or a scalar value to determine the bandwidth of the Gaussian kernel Gram matrix. If `NULL`, sigma values are taken as quantiles of the distance matrix (by default, `nsigma` quantiles with probabilities between `0.25` and `0.75`).
`ncenters`	Maximum number of Gaussian centers in the kernel Gram matrix. Defaults to `200`.
`centers`	Option to specify the Gaussian centers manually.
`cv`	Logical indicating whether or not to do cross-validation
`nfold`	Number of cross-validation folds used in order to calculate the optimal `sigma` value (default is 5-fold cv).
`parallel`	logical indicating whether to use parallel processing in the cross-validation scheme.
`nthreads`	`NULL` or integer indicating the number of threads to use for parallel processing. If parallel processing is enabled, it defaults to the number of available threads minus one.
`progressbar`	Logical indicating whether or not to display a progressbar.
`osqp_settings`	Optional: settings to pass to the `osqp` solver for constrained optimization.
`cluster`	Optional: a cluster object to use for parallel processing, see `parallel::makeCluster`.
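
The `scale = "numerator"` option can be pictured as standardizing both samples with the numerator's means and standard deviations. The following base R sketch uses made-up data and hypothetical object names (`nu`, `de`); it illustrates the idea and is not the package's own preprocessing code.

```r
set.seed(1)
# Made-up numerator and denominator samples with the same variables
nu <- data.frame(x1 = rnorm(50, mean = 1, sd = 2), x2 = rnorm(50))
de <- data.frame(x1 = rnorm(100), x2 = rnorm(100, mean = 0, sd = 3))

# Means and standard deviations are taken from the numerator sample only
centre <- colMeans(nu)
spread <- apply(nu, 2, sd)

# Both samples are transformed with the same centre and spread
nu_scaled <- scale(nu, center = centre, scale = spread)
de_scaled <- scale(de, center = centre, scale = spread)

round(colMeans(nu_scaled), 10)
round(colMeans(de_scaled), 2)
```

Using one set of means and standard deviations for both samples keeps their relative location and spread intact, which is what the density ratio is meant to capture.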

Value

kmm-object, containing all information to calculate the density ratio using optimal sigma and optimal weights.

Examples

set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 20, ncenters = 100, nfold = 20,
    constrained = TRUE, parallel = TRUE, nthreads = 2)

Least-squares heterodistributional subspace search

Description

Least-squares heterodistributional subspace search

Usage

lhss(
  df_numerator,
  df_denominator,
  m = NULL,
  intercept = TRUE,
  scale = "numerator",
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  nlambda = 10,
  lambda = NULL,
  ncenters = 200,
  centers = NULL,
  maxit = 200,
  progressbar = TRUE
)

Arguments

`df_numerator`	`data.frame` with exclusively numeric variables with the numerator samples
`df_denominator`	`data.frame` with exclusively numeric variables with the denominator samples (must have the same variables as `df_numerator`)
`m`	Scalar indicating the dimensionality of the reduced subspace
`intercept`	`logical` Indicating whether to include an intercept term in the model. Defaults to `TRUE`.
`scale`	`"numerator"`, `"denominator"`, or `NULL`, indicating whether to standardize each numeric variable according to the numerator means and standard deviations, the denominator means and standard deviations, or apply no standardization at all.
`nsigma`	Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation.
`sigma_quantile`	`NULL` or numeric vector with probabilities used to calculate quantiles of the distance matrix, which serve as sigma values. If `NULL`, `nsigma` probabilities between `0.05` and `0.95` are used.
`sigma`	`NULL` or a scalar value to determine the bandwidth of the Gaussian kernel Gram matrix. If `NULL`, sigma values are taken as quantiles of the distance matrix (by default, `nsigma` quantiles with probabilities between `0.05` and `0.95`).
`nlambda`	Integer indicating the number of `lambda` values (regularization parameter). By default, `lambda` is set to `10^seq(3, -3, length.out = nlambda)`.
`lambda`	`NULL` or numeric vector indicating the lambda values to use in cross-validation
`ncenters`	Maximum number of Gaussian centers in the kernel Gram matrix. Defaults to `200`.
`centers`	Numeric matrix with the same variables as `df_numerator` and `df_denominator` that is used as Gaussian centers in the kernel Gram matrix. By default, the numerator samples are used as Gaussian centers.
`maxit`	Maximum number of iterations in the updating scheme.
`progressbar`	Logical indicating whether or not to display a progressbar.
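
The default regularization grid quoted for `nlambda` can be reproduced directly in base R. This short sketch only evaluates the documented expression; nothing else about the fitting procedure is implied.

```r
nlambda <- 10
# Documented default: log-spaced values from 10^3 down to 10^-3
lambda <- 10^seq(3, -3, length.out = nlambda)
signif(lambda, 3)
```

The grid is evenly spaced on the log10 scale, so each value is a constant factor smaller than the previous one.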

Value

lhss-object, containing all information to calculate the density ratio using optimal sigma, optimal lambda and optimal weights.

lhss returns rhat, the estimated density ratio.

Examples

set.seed(123)
# Fit model
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 5, ncenters = 50)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))

Naive density ratio estimation

Description

The naive approach creates separate kernel density estimates for the numerator and the denominator samples, and then evaluates their ratio for the denominator samples. For multivariate data, the density ratio is computed after an orthogonal linear transformation (principal component analysis, PCA), such that the new variables can be treated as independent. To reduce the dimensionality of the PCA solution, one can set the number of components by setting the `m` parameter to an integer value smaller than the number of variables.
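
For a single variable, the idea can be sketched with `stats::density()` and `stats::approx()` alone. The example below uses made-up data and a shared evaluation grid, which is an assumption; it skips the PCA step and is not the package's implementation.

```r
set.seed(123)
# Made-up univariate numerator and denominator samples
nu <- rnorm(200, mean = 0, sd = 1)
de <- rnorm(200, mean = 0.5, sd = 1.5)

# Estimate both densities on a common grid spanning all observations
rng <- range(c(nu, de))
d_nu <- density(nu, bw = "SJ", from = rng[1], to = rng[2], n = 2^11)
d_de <- density(de, bw = "SJ", from = rng[1], to = rng[2], n = 2^11)

# Interpolate each density estimate at the denominator observations
f_nu <- approx(d_nu$x, d_nu$y, xout = de)$y
f_de <- approx(d_de$x, d_de$y, xout = de)$y

# Naive density ratio: numerator density over denominator density
ratio <- f_nu / f_de
summary(ratio)
```

Because two separate density estimates are divided, small errors in the denominator estimate can produce large errors in the ratio, which is the motivation for the direct estimators elsewhere in this manual.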

Usage

naive(
  df_numerator,
  df_denominator,
  m,
  bw = "SJ",
  kernel = "gaussian",
  n = 2L^11,
  ...
)

Arguments

`df_numerator`	`data.frame` with exclusively numeric variables with the numerator samples
`df_denominator`	`data.frame` with exclusively numeric variables with the denominator samples (must have the same variables as `df_numerator`)
`m`	`integer` Optional parameter to reduce the dimensionality of the data in multivariate density ratio estimation problems. If missing, the number of variables in the data is used. If set to an integer value smaller than the number of variables, the first `m` principal components are used to estimate the density ratio. If set to `NULL`, the square root of the number of variables is used (for consistency with other methods).
`bw`	the smoothing bandwidth to be used. See stats::density for more information.
`kernel`	the kernel to be used. See stats::density for more information.
`n`	`integer` the number of equally spaced points at which the density is to be estimated. When n > 512, it is rounded up to a power of 2 during the calculations (as fast Fourier transform is used) and the final result is interpolated by stats::approx. So it makes sense to specify n as a power of two.
`...`	further arguments passed to stats::density

Value

naivedensityratio object

Examples

set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m=2, kernel="epanechnikov")

numerator_data

Description

Simulated data set (see data-raw/generate-data-densityratio.R) with five variables that are used in the examples.

Format

A data frame with 1000 rows and 5 columns:

x1: Categorical variable with three categories, 'A', 'B' and 'C'
x2: Categorical variable with two categories, 'G1' and 'G2'
x3: Continuous variable (normally distributed given x1 and x2)
x4: Continuous variable (normally distributed given x3)
x5: Continuous variable (mixture of two normally distributed variables)

numerator_small

Description

Subset of the numerator_data with three variables and 50 observations

Format

A data frame with 50 rows and 3 columns:

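x1: Continuous variable (x3 in numerator_data; normally distributed given x1 and x2 of numerator_data)
x2: Continuous variable (x4 in numerator_data; normally distributed given x3 of numerator_data, i.e., x1 in this subset)
x3: Continuous variable (x5 in numerator_data; mixture of two normally distributed variables)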

Single permutation

Description

Single permutation

Single permutation statistic of ulsif object

Single permutation statistic of kliep object

Single permutation statistic of kmm object

Single permutation statistic of lhss object

Single permutation statistic of spectral object

Single permutation statistic of naivedensityratio object

Usage

permute(object, ...)

## S3 method for class 'ulsif'
permute(object, stacked, nnu, nde, ...)

## S3 method for class 'kliep'
permute(object, stacked, nnu, nde, min_pred = sqrt(.Machine$double.eps), ...)

## S3 method for class 'kmm'
permute(object, stacked, nnu, nde, ...)

## S3 method for class 'lhss'
permute(object, stacked, nnu, nde, ...)

## S3 method for class 'spectral'
permute(object, stacked, nnu, nde, ...)

## S3 method for class 'naivedensityratio'
permute(object, stacked, nnu, nde, min_pred, max_pred)

Arguments

`object`	Density ratio object of class `ulsif`, `kliep`, `kmm`, `lhss`, `spectral` or `naivedensityratio`
`...`	Additional arguments to pass through to specific permute functions.
`stacked`	`matrix` with stacked numerator and denominator samples
`nnu`	Scalar with numerator sample size
`nde`	Scalar with denominator sample size
`min_pred`	Minimum value of the predicted density ratio
`max_pred`	Maximum value of the predicted density ratio

Value

permutation statistic for a single permutation of the data
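
The roles of `stacked`, `nnu` and `nde` can be illustrated with a single random relabelling of the stacked rows. The sketch below uses made-up data and a placeholder statistic (squared difference in column means); the statistic that each method actually computes is not described in this section.

```r
set.seed(1)
nnu <- 5  # numerator sample size
nde <- 8  # denominator sample size

# Made-up stacked data: numerator rows first, then denominator rows
stacked <- rbind(matrix(rnorm(nnu * 2, mean = 1), ncol = 2),
                 matrix(rnorm(nde * 2), ncol = 2))

# One permutation: shuffle the rows and split them again into two groups
ind <- sample(nnu + nde)
perm_nu <- stacked[ind[seq_len(nnu)], , drop = FALSE]
perm_de <- stacked[ind[nnu + seq_len(nde)], , drop = FALSE]

# Placeholder statistic, for illustration only
stat <- sum((colMeans(perm_nu) - colMeans(perm_de))^2)
stat
```

Repeating this many times gives a reference distribution under the hypothesis that both samples come from the same distribution.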

Density ratio in bidimensional plot

Description

Plots a scatterplot of two variables, with the density ratio mapped to the colour scale.

Usage

plot_bivariate(
  x,
  vars = NULL,
  samples = "both",
  grid = FALSE,
  logscale = TRUE,
  show.sample = FALSE,
  tol = 0.01,
  ...
)

Arguments

`x`	Density ratio object created with e.g., `kliep()`, `ulsif()`, or `naive()`
`vars`	Character vector of variable names for which all pairwise bivariate plots are created
`samples`	Character string indicating whether to plot the 'numerator', 'denominator', or 'both' samples. Default is 'both'.
`grid`	Logical indicating whether the output should be one faceted plot with all variables (`TRUE`) or a list of individual plots (`FALSE`). Defaults to `FALSE`.
`logscale`	Logical indicating whether to plot the density ratio estimates on a log scale. Default is `TRUE`.
`show.sample`	Logical indicating whether to give different shapes to observations, depending on the sample they come from (numerator or denominator). Defaults to `FALSE`.
`tol`	Numeric indicating the tolerance: values below this value will be set to the tolerance value, for legibility of the plots
`...`	Additional arguments passed to the predict() function.

Value

Bivariate scatter plots of all combinations of variables in vars.
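
The set of variable pairs implied by `vars` can be listed with `utils::combn()`. This small sketch uses hypothetical variable names and only enumerates the pairs; it does not draw any plots.

```r
# Hypothetical variable names
vars <- c("x1", "x2", "x3")

# Each row is one unordered pair of variables
pairs <- t(combn(vars, 2))
pairs
```

With three variables this gives three pairs; in general there are `choose(length(vars), 2)` of them.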

Examples

set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)

Scatter plot of density ratios and individual variables

Description

A scatter plot showing the relationship between estimated density ratios and individual variables.

Usage

plot_univariate(
  x,
  vars = NULL,
  samples = "both",
  logscale = TRUE,
  grid = FALSE,
  sample.facet = FALSE,
  nrow.panel = NULL,
  tol = 0.01,
  ...
)

Arguments

`x`	Density ratio object created with e.g., `kliep()`, `ulsif()`, or `naive()`
`vars`	Character vector of variable names to be plotted.
`samples`	Character string indicating whether to plot the 'numerator', 'denominator', or 'both' samples. Default is 'both'.
`logscale`	Logical indicating whether to plot the density ratio estimates on a log scale. Default is TRUE.
`grid`	Logical indicating whether the output should be a list of individual plots (`FALSE`) or one faceted plot with all variables (`TRUE`). Defaults to `FALSE`.
`sample.facet`	Logical indicating whether to facet the plot by sample, i.e., showing separate plots for each sample, side by side. Defaults to `FALSE`.
`nrow.panel`	Integer indicating the number of rows in the assembled plot. If NULL, the number of rows is automatically calculated.
`tol`	Numeric indicating the tolerance: values below this value will be set to the tolerance value, for legibility of the plots
`...`	Additional arguments passed to the predict() function.

Value

Scatter plot of density ratios and individual variables.

Examples

set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)

Obtain predicted density ratio values from a `kliep` object

Description

Obtain predicted density ratio values from a kliep object

Usage

## S3 method for class 'kliep'
predict(object, newdata = NULL, sigma = c("sigmaopt", "all"), ...)

Arguments

`object`	A `kliep` object
`newdata`	Optional `matrix` with new data for which to compute the density ratio
`sigma`	A scalar with the Gaussian kernel width, or one of the character options `"sigmaopt"` and `"all"` shown under Usage
`...`	Additional arguments to be passed to the function
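
By R convention, an argument default written as a character vector, as in `sigma = c("sigmaopt", "all")`, is usually resolved with `match.arg()`, so that the first element acts as the default. The sketch below shows that idiom with a hypothetical helper `resolve_sigma()`; whether the package resolves the argument in exactly this way is an assumption.

```r
# Hypothetical helper, not part of the package: resolves a sigma argument
# that is either a character option or a numeric scalar
resolve_sigma <- function(sigma = c("sigmaopt", "all")) {
  if (is.character(sigma)) {
    # match.arg() returns the first choice when the default vector is passed
    match.arg(sigma)
  } else {
    # Otherwise expect a single numeric kernel width
    stopifnot(is.numeric(sigma), length(sigma) == 1)
    sigma
  }
}

default_choice <- resolve_sigma()       # default vector left untouched
all_choice     <- resolve_sigma("all")  # explicit character option
scalar_choice  <- resolve_sigma(1.5)    # user-supplied kernel width
print(list(default_choice, all_choice, scalar_choice))
```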

Value

An array with predicted density ratio values for `newdata` if supplied, and otherwise for the numerator samples.

Examples

set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 20, ncenters = 100, nfold = 20,
      epsilon = 10^{3:-5}, maxit = 1000)

Obtain predicted density ratio values from a `kmm` object

Description

Obtain predicted density ratio values from a kmm object

Usage

## S3 method for class 'kmm'
predict(object, newdata = NULL, sigma = c("sigmaopt", "all"), ...)

Arguments

`object`	A `kmm` object
`newdata`	Optional `matrix` with new data for which to compute the density ratio
`sigma`	A scalar with the Gaussian kernel width, or one of the character options `"sigmaopt"` and `"all"` shown under Usage
`...`	Additional arguments to be passed to the function

Value

An array with predicted density ratio values for `newdata` if supplied, and otherwise for the numerator samples.

Examples

set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 20, ncenters = 100, nfold = 20,
    constrained = TRUE, parallel = TRUE, nthreads = 2)

Obtain predicted density ratio values from a `lhss` object

Description

Obtain predicted density ratio values from a lhss object

Usage

## S3 method for class 'lhss'
predict(
  object,
  newdata = NULL,
  sigma = c("sigmaopt", "all"),
  lambda = c("lambdaopt", "all"),
  ...
)

Arguments

`object`	A `lhss` object
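`newdata`	Optional `matrix` with new data for which to compute the density ratio
`sigma`	A scalar with the Gaussian kernel width, or one of the character options `"sigmaopt"` and `"all"` shown under Usage
`lambda`	A scalar with the regularization parameter, or one of the character options `"lambdaopt"` and `"all"` shown under Usage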
`...`	Additional arguments to be passed to the function

Value

An array with predicted density ratio values for `newdata` if supplied, and otherwise for the numerator samples.

Examples

set.seed(123)
# Fit model
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 5, ncenters = 50)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))

Obtain predicted density ratio values from a `naivedensityratio` object

Description

Obtain predicted density ratio values from a naivedensityratio object

Usage

## S3 method for class 'naivedensityratio'
predict(object, newdata = NULL, log = FALSE, tol = 1e-06, ...)

Arguments

`object`	A `naivedensityratio` object
`newdata`	Optional `matrix` with new data for which to compute the density ratio
`log`	A logical indicating whether to return the log of the density ratio
`tol`	Minimal density value to avoid numerical issues
`...`	Additional arguments to be passed to the function
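
The roles of `log` and `tol` can be illustrated in one dimension with base R. The sketch below estimates the two densities separately, floors them at `tol`, and then forms the ratio or its logarithm. The simulated data and the use of `density()` with linear interpolation are assumptions made for illustration; this is not the package's implementation.

```r
set.seed(123)
# Simulated one-dimensional samples (hypothetical data, not the package datasets)
x_nu <- rnorm(200, mean = 0)
x_de <- rnorm(200, mean = 0.5)

# Estimate both densities separately with base R kernel density estimation
d_nu <- density(x_nu)
d_de <- density(x_de)

# Evaluate the density estimates at new points by linear interpolation
newx <- c(-1, 0, 1)
f_nu <- approx(d_nu$x, d_nu$y, xout = newx)$y
f_de <- approx(d_de$x, d_de$y, xout = newx)$y

# Floor the densities at tol so the ratio and its log stay finite
tol <- 1e-06
f_nu <- pmax(f_nu, tol)
f_de <- pmax(f_de, tol)

ratio     <- f_nu / f_de            # analogous to log = FALSE
log_ratio <- log(f_nu) - log(f_de)  # analogous to log = TRUE
print(cbind(newx, ratio, log_ratio))
```

Flooring the denominator density matters most in the tails, where a near-zero estimate would otherwise produce an arbitrarily large ratio.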

Value

An array with predicted density ratio values for `newdata` if supplied, and otherwise for the numerator samples.

Examples

set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m=2, kernel="epanechnikov")

Obtain predicted density ratio values from a `spectral` object

Description

Obtain predicted density ratio values from a spectral object

Usage

## S3 method for class 'spectral'
predict(
  object,
  newdata = NULL,
  sigma = c("sigmaopt", "all"),
  J = c("Jopt", "all"),
  ...
)

Arguments

`object`	A `spectral` object
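`newdata`	Optional `matrix` with new data for which to compute the density ratio
`sigma`	A scalar with the Gaussian kernel width, or one of the character options `"sigmaopt"` and `"all"` shown under Usage
`J`	Integer indicating the dimension of the eigenvector expansion, or one of the character options `"Jopt"` and `"all"` shown under Usage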
`...`	Additional arguments to be passed to the function

Value

An array with predicted density ratio values for `newdata` if supplied, and otherwise for the numerator samples.

Obtain predicted density ratio values from a `ulsif` object

Description

Obtain predicted density ratio values from a ulsif object

Usage

## S3 method for class 'ulsif'
predict(
  object,
  newdata = NULL,
  sigma = c("sigmaopt", "all"),
  lambda = c("lambdaopt", "all"),
  ...
)

Arguments

`object`	A `ulsif` object
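`newdata`	Optional `matrix` with new data for which to compute the density ratio
`sigma`	A scalar with the Gaussian kernel width, or one of the character options `"sigmaopt"` and `"all"` shown under Usage
`lambda`	A scalar with the regularization parameter, or one of the character options `"lambdaopt"` and `"all"` shown under Usage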
`...`	Additional arguments to be passed to the function

Value

An array with predicted density ratio values for `newdata` if supplied, and otherwise for the numerator samples.

Examples

set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)

Print a `kliep` object

Description

Print a kliep object

Usage

## S3 method for class 'kliep'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `kliep`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `kliep` object.
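
The pattern shared by the print methods in this manual, a `digits` default derived from `getOption("digits")` and an invisibly returned input, can be sketched with a hypothetical toy class. The class `toy` and its fields are made up for illustration and do not reflect what the package prints.

```r
# Hypothetical toy class, used only to illustrate the print-method pattern
toy <- structure(list(sigma = 1.23456789, lambda = 0.000123456), class = "toy")

print.toy <- function(x, digits = max(3L, getOption("digits") - 3L), ...) {
  cat("Toy object\n")
  cat("  sigma:  ", format(x$sigma, digits = digits, ...), "\n", sep = "")
  cat("  lambda: ", format(x$lambda, digits = digits, ...), "\n", sep = "")
  invisible(x)  # return the input without printing it a second time
}

out <- print(toy)            # prints the formatted fields
same <- identical(out, toy)  # the returned value is the input object
```

Returning the object invisibly lets a print call be used inside a pipeline or assignment without cluttering the console.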

Examples

set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 20, ncenters = 100, nfold = 20,
      epsilon = 10^{3:-5}, maxit = 1000)

Print a `kmm` object

Description

Print a kmm object

Usage

## S3 method for class 'kmm'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `kmm`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `kmm` object.

Examples

set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 20, ncenters = 100, nfold = 20,
    constrained = TRUE, parallel = TRUE, nthreads = 2)

Print a `lhss` object

Description

Print a lhss object

Usage

## S3 method for class 'lhss'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `lhss`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `lhss` object.

Examples

set.seed(123)
# Fit model
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 5, ncenters = 50)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))

Print a `naivedensityratio` object

Description

Print a naivedensityratio object

Usage

## S3 method for class 'naivedensityratio'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `naivedensityratio`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `naivedensityratio` object.

Examples

set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m=2, kernel="epanechnikov")

Print a `spectral` object

Description

Print a spectral object

Usage

## S3 method for class 'spectral'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `spectral`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `spectral` object.

Examples

set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)

Print a `summary.kliep` object

Description

Print a summary.kliep object

Usage

## S3 method for class 'summary.kliep'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `summary.kliep`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `summary.kliep` object.

Examples

set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 20, ncenters = 100, nfold = 20,
      epsilon = 10^{3:-5}, maxit = 1000)

Print a `summary.kmm` object

Description

Print a summary.kmm object

Usage

## S3 method for class 'summary.kmm'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `summary.kmm`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `summary.kmm` object.

Examples

set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 20, ncenters = 100, nfold = 20,
    constrained = TRUE, parallel = TRUE, nthreads = 2)

Print a `summary.lhss` object

Description

Print a summary.lhss object

Usage

## S3 method for class 'summary.lhss'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `summary.lhss`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `summary.lhss` object.

Examples

set.seed(123)
# Fit model
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 5, ncenters = 50)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))

Print a `summary.naivedensityratio` object

Description

Print a summary.naivedensityratio object

Usage

## S3 method for class 'summary.naivedensityratio'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `summary.naivedensityratio`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `summary.naivedensityratio` object.

Examples

set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m=2, kernel="epanechnikov")

Print a `summary.spectral` object

Description

Print a summary.spectral object

Usage

## S3 method for class 'summary.spectral'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `summary.spectral`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `summary.spectral` object.

Examples

set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)

Print a `summary.ulsif` object

Description

Print a summary.ulsif object

Usage

## S3 method for class 'summary.ulsif'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `summary.ulsif`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `summary.ulsif` object.

Examples

set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)

Print a `ulsif` object

Description

Print a ulsif object

Usage

## S3 method for class 'ulsif'
print(x, digits = max(3L, getOption("digits") - 3L), ...)

Arguments

`x`	Object of class `ulsif`.
`digits`	Number of digits to use when printing the output.
`...`	Further arguments on how to format the number of digits.

Value

Invisibly returns the input `ulsif` object.

Examples

set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)

Spectral series based density ratio estimation

Description

Spectral series based density ratio estimation

Usage

spectral(
  df_numerator,
  df_denominator,
  J = NULL,
  scale = "numerator",
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  ncenters = NULL,
  cv = TRUE,
  nfold = 10,
  parallel = FALSE,
  nthreads = NULL,
  progressbar = TRUE
)

Arguments

`df_numerator`	`data.frame` with exclusively numeric variables with the numerator samples
`df_denominator`	`data.frame` with exclusively numeric variables with the denominator samples (must have the same variables as `df_numerator`)
`J`	Integer vector indicating the number of eigenvectors to use in the spectral series expansion. Defaults to 50 evenly spaced values between 1 and the number of denominator samples (or the largest number of samples that can be used as centers in the cross-validation scheme).
`scale`	`"numerator"`, `"denominator"`, or `NULL`, indicating whether to standardize each numeric variable according to the numerator means and standard deviations, the denominator means and standard deviations, or apply no standardization at all.
`nsigma`	Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation.
`sigma_quantile`	`NULL` or numeric vector with probabilities used to calculate quantiles of the distance matrix, which serve as sigma values. If `NULL`, `nsigma` probabilities between `0.05` and `0.95` are used.
`sigma`	`NULL` or a scalar value to determine the bandwidth of the Gaussian kernel gram matrix. If `NULL`, the sigma values are obtained from quantiles of the distance matrix, as determined by `sigma_quantile` and `nsigma`.
`ncenters`	Integer. If smaller than the number of denominator observations, an approximation to the eigenvector expansion based on only `ncenters` samples is performed instead of the full expansion. This can be useful for large datasets. Defaults to `NULL`, such that all denominator samples are used.
`cv`	Logical indicating whether to use cross-validation to determine the optimal sigma value and the optimal number of eigenvectors.
`nfold`	Integer indicating the number of folds to use in the cross-validation scheme. If `cv` is `FALSE`, this parameter is ignored.
`parallel`	logical indicating whether to use parallel processing in the cross-validation scheme.
`nthreads`	`NULL` or integer indicating the number of threads to use for parallel processing. If parallel processing is enabled, it defaults to the number of available threads minus one.
`progressbar`	Logical indicating whether to display a progress bar.
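
The defaults described for `sigma_quantile` and `J` can be illustrated with a short base-R sketch. It does not call the package: the toy matrices `nu` and `de`, the use of Euclidean distances between numerator and denominator rows, and the rounding of the `J` grid are all assumptions made for illustration, and the package internals may differ.

```r
# Toy data standing in for numerator and denominator samples (hypothetical)
set.seed(1)
nu <- matrix(rnorm(40), ncol = 2)             # 20 numerator observations
de <- matrix(rnorm(60, mean = 0.5), ncol = 2) # 30 denominator observations

# Pairwise Euclidean distances between numerator rows and denominator rows
# (assumption: the package may build its distance matrix differently)
full_dist <- as.matrix(dist(rbind(nu, de)))
dist_mat <- full_dist[seq_len(nrow(nu)), nrow(nu) + seq_len(nrow(de))]

# nsigma candidate bandwidths taken at quantiles with probabilities
# between 0.05 and 0.95
nsigma <- 10
sigma_quantile <- seq(0.05, 0.95, length.out = nsigma)
sigma_grid <- unname(quantile(dist_mat, probs = sigma_quantile))

# Candidate numbers of eigenvectors: at most 50 evenly spaced values
# between 1 and the number of denominator samples
J_grid <- unique(round(seq(1, nrow(de), length.out = 50)))
```

With only 30 denominator rows, the 50 requested values collapse to the distinct integers between 1 and 30, which is why `unique()` is applied in this sketch.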

Value

spectral-object, containing all information to calculate the density ratio using optimal sigma and optimal spectral series expansion.

References

Izbicki, R., Lee, A. & Schafer, C. (2014). High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation. Proceedings of Machine Learning Research 33:420-429. Available from https://proceedings.mlr.press/v33/izbicki14.html.

Examples

set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)

Extract summary from `kliep` object, including two-sample significance test for homogeneity of the numerator and denominator samples

Description

Extract summary from kliep object, including two-sample significance test for homogeneity of the numerator and denominator samples

Usage

## S3 method for class 'kliep'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  min_pred = 1e-06,
  ...
)

Arguments

`object`	Object of class `kliep`
`test`	logical indicating whether to statistically test for homogeneity of the numerator and denominator samples.
`n_perm`	Scalar indicating the number of permutation samples
`parallel`	`logical` indicating whether to run the permutation test in parallel
`cluster`	`NULL` or a cluster object created by `makeCluster`. If `NULL` and `parallel = TRUE`, it uses the number of available cores minus 1.
`min_pred`	Scalar indicating the minimum value for the predicted density ratio values (used in the divergence statistic) to avoid negative density ratio values.
`...`	further arguments passed to or from other methods.
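
How `test`, `n_perm` and `min_pred` interact can be sketched with a generic two-sample permutation test in base R. This is not the package's implementation: the statistic below (an absolute difference in means) is a stand-in for the divergence statistic mentioned above, the add-one p-value correction is a choice made here, and all object names are hypothetical.

```r
# Hypothetical univariate numerator and denominator samples
set.seed(123)
x_nu <- rnorm(50, mean = 0)
x_de <- rnorm(50, mean = 1)

# Stand-in test statistic: absolute difference in sample means
# (the summary methods refer to a divergence statistic instead)
stat_fun <- function(a, b) abs(mean(a) - mean(b))

permutation_test <- function(a, b, n_perm = 100) {
  observed <- stat_fun(a, b)
  pooled <- c(a, b)
  n_a <- length(a)
  perm_stats <- replicate(n_perm, {
    idx <- sample(length(pooled), n_a)  # randomly relabel the pooled data
    stat_fun(pooled[idx], pooled[-idx])
  })
  # Add-one correction keeps the p-value strictly positive
  p_value <- (1 + sum(perm_stats >= observed)) / (n_perm + 1)
  list(statistic = observed, p_value = p_value)
}

res <- permutation_test(x_nu, x_de, n_perm = 100)

# Clamping predicted ratios at min_pred before taking logarithms,
# so that non-positive predictions do not produce NaN or -Inf
pred <- c(-0.2, 0, 0.5, 2)  # made-up predictions
min_pred <- 1e-06
log_pred <- log(pmax(pred, min_pred))
```

A larger `n_perm` gives a finer-grained p-value at the cost of refitting the statistic more often, which is where running the permutations in parallel becomes attractive.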

Value

Summary of the fitted density ratio model

Examples

set.seed(123)
# Fit model
dr <- kliep(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kliep(numerator_small, denominator_small,
      nsigma = 20, ncenters = 100, nfold = 20,
      epsilon = 10^{3:-5}, maxit = 1000)

Extract summary from `kmm` object, including two-sample significance test for homogeneity of the numerator and denominator samples

Description

Extract summary from kmm object, including two-sample significance test for homogeneity of the numerator and denominator samples

Usage

## S3 method for class 'kmm'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  min_pred = 1e-06,
  ...
)

Arguments

`object`	Object of class `kmm`
`test`	logical indicating whether to statistically test for homogeneity of the numerator and denominator samples.
`n_perm`	Scalar indicating the number of permutation samples
`parallel`	`logical` indicating whether to run the permutation test in parallel
`cluster`	`NULL` or a cluster object created by `makeCluster`. If `NULL` and `parallel = TRUE`, it uses the number of available cores minus 1.
`min_pred`	Scalar indicating the minimum value for the predicted density ratio values (used in the divergence statistic) to avoid negative density ratio values.
`...`	further arguments passed to or from other methods.

Value

Summary of the fitted density ratio model

Examples

set.seed(123)
# Fit model
dr <- kmm(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
kmm(numerator_small, denominator_small,
    nsigma = 20, ncenters = 100, nfold = 20,
    constrained = TRUE, parallel = TRUE, nthreads = 2)

Extract summary from `lhss` object, including two-sample significance test for homogeneity of the numerator and denominator samples

Description

Extract summary from lhss object, including two-sample significance test for homogeneity of the numerator and denominator samples

Usage

## S3 method for class 'lhss'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  ...
)

Arguments

`object`	Object of class `lhss`
`test`	logical indicating whether to statistically test for homogeneity of the numerator and denominator samples.
`n_perm`	Scalar indicating the number of permutation samples
`parallel`	`logical` indicating whether to run the permutation test in parallel
`cluster`	`NULL` or a cluster object created by `makeCluster`. If `NULL` and `parallel = TRUE`, it uses the number of available cores minus 1.
`...`	further arguments passed to or from other methods.

Value

Summary of the fitted density ratio model

Examples

set.seed(123)
# Fit model
dr <- lhss(numerator_small, denominator_small,
           nsigma = 5, nlambda = 5, ncenters = 50)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))

Extract summary from `naivedensityratio` object, including two-sample significance test for homogeneity of the numerator and denominator samples

Description

Extract summary from naivedensityratio object, including two-sample significance test for homogeneity of the numerator and denominator samples

Usage

## S3 method for class 'naivedensityratio'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  ...
)

Arguments

`object`	Object of class `naivedensityratio`
`test`	logical indicating whether to statistically test for homogeneity of the numerator and denominator samples.
`n_perm`	Scalar indicating the number of permutation samples
`parallel`	`logical` indicating whether to run the permutation test in parallel
`cluster`	`NULL` or a cluster object created by `makeCluster`. If `NULL` and `parallel = TRUE`, it uses the number of available cores minus 1.
`...`	further arguments passed to or from other methods.

Value

Summary of the fitted density ratio model

Examples

set.seed(123)
# Fit model
dr <- naive(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
naive(numerator_small, denominator_small, m=2, kernel="epanechnikov")

Extract summary from `spectral` object, including two-sample significance test for homogeneity of the numerator and denominator samples

Description

Extract summary from spectral object, including two-sample significance test for homogeneity of the numerator and denominator samples

Usage

## S3 method for class 'spectral'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  ...
)

Arguments

`object`	Object of class `spectral`
`test`	logical indicating whether to statistically test for homogeneity of the numerator and denominator samples.
`n_perm`	Scalar indicating the number of permutation samples
`parallel`	`logical` indicating whether to run the permutation test in parallel
`cluster`	`NULL` or a cluster object created by `makeCluster`. If `NULL` and `parallel = TRUE`, it uses the number of available cores minus 1.
`...`	further arguments passed to or from other methods.

Value

Summary of the fitted density ratio model

Examples

set.seed(123)
# Fit model
dr <- spectral(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
spectral(numerator_small, denominator_small, sigma = 2)

Extract summary from `ulsif` object, including two-sample significance test for homogeneity of the numerator and denominator samples

Description

Extract summary from ulsif object, including two-sample significance test for homogeneity of the numerator and denominator samples

Usage

## S3 method for class 'ulsif'
summary(
  object,
  test = FALSE,
  n_perm = 100,
  parallel = FALSE,
  cluster = NULL,
  ...
)

Arguments

`object`	Object of class `ulsif`
`test`	logical indicating whether to statistically test for homogeneity of the numerator and denominator samples.
`n_perm`	Scalar indicating the number of permutation samples
`parallel`	`logical` indicating whether to run the permutation test in parallel
`cluster`	`NULL` or a cluster object created by `makeCluster`. If `NULL` and `parallel = TRUE`, it uses the number of available cores minus 1.
`...`	further arguments passed to or from other methods.

Value

Summary of the fitted density ratio model

Examples

set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)

Unconstrained least-squares importance fitting

Description

Unconstrained least-squares importance fitting

Usage

ulsif(
  df_numerator,
  df_denominator,
  intercept = TRUE,
  scale = "numerator",
  nsigma = 10,
  sigma_quantile = NULL,
  sigma = NULL,
  nlambda = 20,
  lambda = NULL,
  ncenters = 200,
  centers = NULL,
  parallel = FALSE,
  nthreads = NULL,
  progressbar = TRUE
)

Arguments

`df_numerator`	`data.frame` with exclusively numeric variables with the numerator samples
`df_denominator`	`data.frame` with exclusively numeric variables with the denominator samples (must have the same variables as `df_numerator`)
`intercept`	`logical` Indicating whether to include an intercept term in the model. Defaults to `TRUE`.
`scale`	`"numerator"`, `"denominator"`, or `NULL`, indicating whether to standardize each numeric variable according to the numerator means and standard deviations, the denominator means and standard deviations, or apply no standardization at all.
`nsigma`	Integer indicating the number of sigma values (bandwidth parameter of the Gaussian kernel gram matrix) to use in cross-validation.
`sigma_quantile`	`NULL` or numeric vector with probabilities to calculate the quantiles of the distance matrix to obtain sigma values. If `NULL`, `nsigma` probabilities between `0.05` and `0.95` are used.
`sigma`	`NULL` or a scalar value to determine the bandwidth of the Gaussian kernel gram matrix. If `NULL`, the bandwidths are obtained from `nsigma` quantiles of the distance matrix, with probabilities between `0.05` and `0.95` (see `sigma_quantile`).
`nlambda`	Integer indicating the number of `lambda` values (regularization parameter), by default, `lambda` is set to `10^seq(3, -3, length.out = nlambda)`.
`lambda`	`NULL` or numeric vector indicating the lambda values to use in cross-validation
`ncenters`	Maximum number of Gaussian centers in the kernel gram matrix. Defaults to `200`, as shown in Usage.
`centers`	`NULL` or numeric matrix with the same dimensions as the data, indicating the centers for the Gaussian kernel gram matrix.
`parallel`	logical indicating whether to use parallel processing in the cross-validation scheme.
`nthreads`	`NULL` or integer indicating the number of threads to use for parallel processing. If parallel processing is enabled, it defaults to the number of available threads minus one.
`progressbar`	Logical indicating whether to display a progress bar.
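
The default `lambda` grid and the effect of `scale = "numerator"` can be reproduced in base R without the package. The grid expression is taken from the `nlambda` description above; the toy data frames `nu` and `de` and the use of `scale()` for standardization are assumptions for illustration and need not match the package internals.

```r
# Default regularization grid as stated for `nlambda`
nlambda <- 20
lambda <- 10^seq(3, -3, length.out = nlambda)

# Toy numerator and denominator samples (hypothetical)
set.seed(1)
nu <- data.frame(x1 = rnorm(30, 2, 3), x2 = rnorm(30, -1, 0.5))
de <- data.frame(x1 = rnorm(40, 0, 1), x2 = rnorm(40, 0, 1))

# Standardize both samples with the numerator means and standard deviations,
# which is how scale = "numerator" is described
nu_means <- colMeans(nu)
nu_sds <- apply(nu, 2, sd)
nu_scaled <- scale(nu, center = nu_means, scale = nu_sds)
de_scaled <- scale(de, center = nu_means, scale = nu_sds)
```

After this transformation the numerator columns have mean 0 and standard deviation 1, while the denominator columns are expressed on the same scale, so a single kernel bandwidth is comparable across variables.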

Value

ulsif-object, containing all information to calculate the density ratio using optimal sigma and optimal weights.

Examples

set.seed(123)
# Fit model
dr <- ulsif(numerator_small, denominator_small)
# Inspect model object
dr
# Obtain summary of model object
summary(dr)
# Plot model object
plot(dr)
# Plot density ratio for each variable individually
plot_univariate(dr)
# Plot density ratio for each pair of variables
plot_bivariate(dr)
# Predict density ratio and inspect first 6 predictions
head(predict(dr))
# Fit model with custom parameters
ulsif(numerator_small, denominator_small, sigma = 2, lambda = 2)
