Package 'bosfr'

Title: Computes Exact Bounds of Spearman's Footrule with Missing Data
Description: Computes exact bounds of Spearman's footrule in the presence of missing data, and performs an independence test based on the bounds with controlled Type I error regardless of the values of the missing data. Suitable only for distinct, univariate data where no ties are allowed.
Authors: Yijin Zeng [aut, cre, cph]
Maintainer: Yijin Zeng <[email protected]>
License: GPL-3
Version: 0.1.0
Built: 2025-03-01 05:52:24 UTC
Source: https://github.com/cran/bosfr

Bounds of Kendall's tau in the Presence of Missing Data

Description

Computes bounds of Kendall's tau in the presence of missing data. Suitable only for univariate, distinct data where no ties are allowed.

Usage

boundsKendall(X, Y)

Arguments

X, Y

Numeric vectors of data values, possibly containing missing values. Ties in the data are not allowed. Inf and -Inf values will be omitted.

Details

boundsKendall() computes bounds of Kendall's tau for partially observed univariate, distinct data. The bounds are obtained by first computing the bounds of Spearman's footrule (Zeng et al., 2025) and then applying the combinatorial inequality between Kendall's tau and Spearman's footrule (Kendall, 1948). See Zeng et al. (2025) for more details.

Let $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$ be two vectors of univariate, distinct data. Kendall's tau is defined as the number of discordant pairs between $X$ and $Y$:

$$\tau(X,Y) = \sum\limits_{i < j} \{I(x_i < x_j)I(y_i > y_j) + I(x_i > x_j)I(y_i < y_j)\}.$$

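Scaled Kendall's tau $\tau_{Scale}(X,Y) \in [-1,1]$ is defined as (Kendall, 1948):

$$\tau_{Scale}(X,Y) = 1 - 4\tau(X,Y)/(n(n-1)).$$

Since $\tau(X,Y)$ ranges from $0$ (no discordant pairs) to $n(n-1)/2$ (all pairs discordant), the scaled value ranges from $1$ down to $-1$.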
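
As a concrete check of the definitions above, the following base R sketch counts discordant pairs for complete, tie-free vectors and rescales the count. The helper names kendall_discordant(), kendall_scaled() and footrule_distance() are illustrative and are not part of bosfr. The final check uses the inequality D/2 <= tau <= D between the discordant-pair count and Spearman's footrule D as stated by Diaconis and Graham (1977); whether boundsKendall() applies the inequality in exactly this form is an assumption.

```r
# Illustrative helpers (NOT part of bosfr), for complete vectors without ties.

# Kendall's tau as the number of discordant pairs.
kendall_discordant <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2, !anyNA(x), !anyNA(y))
  idx <- utils::combn(length(x), 2)  # all index pairs (i, j) with i < j
  i <- idx[1, ]
  j <- idx[2, ]
  # A pair is discordant when x and y order it in opposite directions.
  sum((x[i] - x[j]) * (y[i] - y[j]) < 0)
}

# Scaled version: 1 - 4 * tau / (n * (n - 1)).
kendall_scaled <- function(x, y) {
  n <- length(x)
  1 - 4 * kendall_discordant(x, y) / (n * (n - 1))
}

# Spearman's footrule: sum of absolute rank differences.
footrule_distance <- function(x, y) sum(abs(rank(x) - rank(y)))

x <- c(1, 2, 5, 4, 3)
y <- c(3, 5, 4, 2, 1)
tau <- kendall_discordant(x, y)
d   <- footrule_distance(x, y)
print(c(tau = tau, scaled = kendall_scaled(x, y), footrule = d))

# Assumed form of the combinatorial inequality: D / 2 <= tau <= D.
stopifnot(d / 2 <= tau, tau <= d)

# Fully reversed order: every pair is discordant.
print(kendall_scaled(1:5, 5:1))
```

Under the assumed inequality, footrule bounds [D_l, D_u] would translate into the Kendall's tau bounds [D_l / 2, D_u]; the package may use a sharper form.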

Value

bounds

bounds of Kendall's tau.

bounds.scaled

bounds of scaled Kendall's tau.

References

  • Zeng, Y., Adams, N.M. and Bodenham, D.A. (2025) Exact Bounds of Spearman's Footrule in the Presence of Missing Data with Applications to Independence Testing. arXiv preprint arXiv:2501.11696, 20 January 2025.

  • Kendall, M.G. (1948) Rank Correlation Methods. Charles Griffin, London.

  • Diaconis, P. and Graham, R.L. (1977) Spearman's footrule as a measure of disarray. Journal of the Royal Statistical Society Series B: Statistical Methodology, 39(2), pp. 262-268.

Examples

### compute bounds of Kendall's tau between incomplete ranked lists
X <- c(1, 2, NA, 4, 3)
Y <- c(3, NA, 4, 2, 1)
boundsKendall(X, Y)

### compute bounds of Kendall's tau between incomplete vectors of distinct data
X <- c(1.3, 2.6, NA, 4.2, 3.5)
Y <- c(5.5, NA, 6.5, 2.6, 1.1)
boundsKendall(X, Y)

Exact Bounds of Spearman's Footrule in the Presence of Missing Data

Description

Computes exact bounds of Spearman's footrule in the presence of missing data, and performs an independence test based on the bounds with controlled Type I error regardless of the values of the missing data. Suitable only for univariate, distinct data where no ties are allowed.

Usage

boundsSFR(X, Y, pval = TRUE)

Arguments

X

Numeric vector of data values, possibly containing missing values. Ties in the data are not allowed. Inf and -Inf values will be omitted.

Y

Numeric vector of data values, possibly containing missing values. Ties in the data are not allowed. Inf and -Inf values will be omitted.

pval

Logical; whether to compute bounds of the p-value. Defaults to TRUE.

Details

boundsSFR() computes exact bounds of Spearman's footrule for partially observed univariate, distinct data, using the results and algorithms of Zeng et al. (2025).

Let $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$ be two vectors of univariate, distinct data, and denote the rank of $x_i$ in $X$ as $R(x_i, X)$ and the rank of $y_i$ in $Y$ as $R(y_i, Y)$. Spearman's footrule is defined as the absolute distance between the ranked values of $X$ and $Y$:

$$D(X,Y) = \sum_{i=1}^{n} |R(x_i, X) - R(y_i, Y)|.$$

Scaled Spearman's footrule is defined as:

$$D_{Scale}(X,Y) = 1 - 3D(X,Y)/(n^2-1).$$

When $n$ is odd, $D_{Scale}(X,Y) \in [-0.5, 1]$, but when $n$ is even, $D_{Scale}(X,Y) \in [-0.5\{1+3/(n^2-1)\}, 1]$ (Kendall, 1948).
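
To make the definitions concrete, the base R sketch below computes the footrule and its scaled version for complete data, and then finds the minimum and maximum footrule over all possible values of the missing data by brute force. This is only a sanity check for very small n: it enumerates every rank assignment compatible with the observed ordering, which is not the algorithm of Zeng et al. (2025) and not how boundsSFR() works. All function names are illustrative and are not part of bosfr.

```r
# Illustrative helpers (NOT part of bosfr).

# Spearman's footrule and its scaled version for complete, tie-free data.
footrule <- function(x, y) sum(abs(rank(x) - rank(y)))
footrule_scaled <- function(x, y) {
  n <- length(x)
  1 - 3 * footrule(x, y) / (n^2 - 1)
}

# All permutations of 1..n as rows of a matrix. Built by filtering an
# n^n grid, so this is only practical for small n.
all_perms <- function(n) {
  grid <- unname(as.matrix(expand.grid(rep(list(seq_len(n)), n))))
  grid[apply(grid, 1, function(r) !anyDuplicated(r)), , drop = FALSE]
}

# Rank vectors compatible with x: the observed entries must keep the same
# relative order, while missing entries may take any remaining rank.
feasible_ranks <- function(x, perms) {
  obs <- !is.na(x)
  target <- rank(x[obs])
  keep <- apply(perms, 1, function(r) all(rank(r[obs]) == target))
  perms[keep, , drop = FALSE]
}

# Minimum and maximum footrule over all compatible completions of x and y.
brute_force_bounds <- function(x, y) {
  stopifnot(length(x) == length(y))
  perms <- all_perms(length(x))
  rx <- feasible_ranks(x, perms)
  ry <- feasible_ranks(y, perms)
  d <- apply(rx, 1, function(a) apply(ry, 1, function(b) sum(abs(a - b))))
  range(d)
}

# Complete data.
print(footrule(c(1, 2, 5, 4, 3), c(3, 5, 4, 2, 1)))
print(footrule_scaled(c(1, 2, 5, 4, 3), c(3, 5, 4, 2, 1)))

# Incomplete data: brute-force lower and upper bounds of the footrule.
X <- c(1, 2, NA, 4, 3)
Y <- c(3, NA, 4, 2, 1)
print(brute_force_bounds(X, Y))
```

The enumeration grows as n^n, so it is useful only for checking small examples against the package output.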

The p-value of the independence test based on Spearman's footrule, denoted $p$, is computed using the normal approximation of Diaconis and Graham (1977). If pval = TRUE, bounds $p_{l}$ and $p_{u}$ of the p-value are computed in the presence of missing data, such that $p \in [p_{l}, p_{u}]$. The independence test proposed in Zeng et al. (2025) returns $p_{u}$ as its p-value. This method controls the Type I error regardless of the values of the missing data. See Zeng et al. (2025) for details.
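
The normal approximation can be sketched as follows. Under independence, Diaconis and Graham (1977) give the mean (n^2 - 1)/3 and variance (n + 1)(2n^2 + 7)/45 for Spearman's footrule. The tail direction and the absence of a continuity correction below are assumptions made for illustration; boundsSFR() may differ in both respects, and the function names and the example bounds are hypothetical.

```r
# Illustrative sketch (NOT bosfr's implementation).

# Null mean and variance of Spearman's footrule (Diaconis and Graham, 1977).
footrule_null_moments <- function(n) {
  c(mean = (n^2 - 1) / 3,
    var  = (n + 1) * (2 * n^2 + 7) / 45)
}

# ASSUMPTION: left-tailed test (a small footrule indicates association),
# with no continuity correction.
footrule_pvalue_normal <- function(d, n) {
  m <- footrule_null_moments(n)
  stats::pnorm((d - m[["mean"]]) / sqrt(m[["var"]]))
}

# Under the left-tail assumption the p-value is increasing in D, so bounds
# on D translate directly into bounds on p.
d_bounds <- c(6, 12)  # hypothetical footrule bounds for n = 5
p_bounds <- footrule_pvalue_normal(d_bounds, n = 5)
print(p_bounds)       # c(p_l, p_u); the conservative test would report p_u
```

Reporting the upper bound is what makes the test conservative: whatever the missing values are, the true p-value cannot exceed it.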

Value

bounds

exact bounds of Spearman's footrule.

bounds.scaled

exact bounds of scaled Spearman's footrule.

pvalue

the p-value for the test. (Only present if argument pval = TRUE.)

bounds.pvalue

bounds of the p-value of the independence test using Spearman's footrule. (Only present if argument pval = TRUE.)

References

  • Zeng, Y., Adams, N.M. and Bodenham, D.A. (2025) Exact Bounds of Spearman's Footrule in the Presence of Missing Data with Applications to Independence Testing. arXiv preprint arXiv:2501.11696, 20 January 2025.

  • Kendall, M.G. (1948) Rank Correlation Methods. Charles Griffin, London.

  • Diaconis, P. and Graham, R.L. (1977) Spearman's footrule as a measure of disarray. Journal of the Royal Statistical Society Series B: Statistical Methodology, 39(2), pp. 262-268.

Examples

### compute exact bounds of Spearman's footrule between incomplete ranked lists
X <- c(1, 2, NA, 4, 3)
Y <- c(3, NA, 4, 2, 1)
boundsSFR(X, Y, pval=FALSE)

### compute exact bounds of Spearman's footrule between incomplete vectors of distinct data,
### and perform independence test
X <- c(1.3, 2.6, NA, 4.2, 3.5)
Y <- c(5.5, NA, 6.5, 2.6, 1.1)
boundsSFR(X, Y, pval=TRUE)