Informative Range
We define the informative range as a way of understanding the amount of information conveyed about one variable through a non-linear relationship with another.
Definition
Consider any relationship between two variables \(x\) and \(y\) of the form \[y = f(x) + \epsilon,\] where \(\epsilon\) represents noise, and let \(y\) be the variable we want to predict using information about \(x.\) The informative range for prediction is a subset of the full range, \(y \in (y_l, y_u),\) with corresponding values in the informative domain \(x \in (x_l, x_u).\) We define the informative range and domain by setting a cutoff on the slope, \(f'(x) > \phi,\) so that around these values of \(x\) we get enough information about \(y\) to trust the resulting predictions.
Figure 1: An illustration of the informative range and domain for a sigmoidal relationship with no noise.
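To make the definition concrete, here is a minimal R sketch that recovers the informative domain and range numerically for a noiseless sigmoid like the one in the figure; the particular \(f\), the cutoff \(\phi = 0.05\), and the grid are assumptions chosen for illustration.

# Informative domain {x : f'(x) > phi} for an example noiseless sigmoid
f   <- function(x) 1 / (1 + exp(-0.5 * (x - 10)))  # example f with S = 0.5, d = 10
phi <- 0.05                                         # assumed slope cutoff

x  <- seq(0, 20, by = 0.01)
fp <- 0.5 * f(x) * (1 - f(x))                       # f'(x) for this sigmoid

x_bounds <- range(x[fp > phi])   # informative domain (x_l, x_u)
y_bounds <- f(x_bounds)          # corresponding informative range (y_l, y_u)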
Lines
If the relationship between two variables is linear, then one variable is equally informative about the other across its entire range. For example, if we have the fitted relationship for a set of points
\[y = f(x) + \epsilon = mx + b + \epsilon\]
then \(f'(x)=m,\) so the relative amount of information about \(y\) is constant across the entire range of \(x.\) We can thus use \(x\) to predict \(y\) and vice versa.
In the following, if we accept that the fitted relationship is the true relationship (a reasonable but risky assumption; in this case, the true value was \(m=2\)), we know that a one-unit change in \(x\) will result in an expected \(7/4\) unit change in \(y\). We can, moreover, use \(y\) to predict \(x\):
\[x = \frac{y - b - \epsilon}{m} \approx \frac{y - b}{m},\]
so a one-unit change in \(y\) will result in an expected \(4/7\) unit change in \(x\). Notably, the error now scales with \(1/m\).
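A small R sketch of this two-way prediction follows; the sample size, noise level, and the value \(b = 1\) are assumptions, and the \(7/4\) slope above came from the document's own simulated data, so the fitted values here will differ.

set.seed(1)
m <- 2; b <- 1                          # true slope (per the text) and an assumed intercept
x <- runif(50, 0, 10)
y <- m * x + b + rnorm(50, sd = 1)      # y = mx + b + noise

fit   <- lm(y ~ x)
m_hat <- coef(fit)["x"]
b_hat <- coef(fit)["(Intercept)"]

y_from_x <- m_hat * 5 + b_hat           # predict y at x = 5
x_from_y <- (11 - b_hat) / m_hat        # invert the fitted line to predict x from y = 11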
Figure 2: For linear relationships, the amount of information about \(y\) does not change across values of \(x.\)
Sigmoidal Relationships
If the relationship between two variables is sigmoidal, then the amount of information about the variable \(y\) depends on the value of \(x\):
\[y = \frac{1}{1 + e^{-S(x-d)}} + \epsilon\] In the following example, we let \(d=10\) and \(S=0.5.\)
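The sigmoidF helper used in the snippets below is presumably defined earlier in the document; a minimal version consistent with this formula, assuming the argument order (x, S, d), would be:

# Sigmoid with slope parameter S and midpoint d
sigmoidF <- function(x, S, d) {
  1 / (1 + exp(-S * (x - d)))
}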
We note that the greatest change in \(y\) occurs when \(x\) is around \(d=10\). Since the range of \(y\) is \((0, 1)\), about 90% of the total change in \(y\) occurs for \(x \in (4.11, 15.89).\)
c(sigmoidF(10+ll, .5, 10), sigmoidF(10-ll, .5, 10))
[1] 0.9500029 0.0499971
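The half-width ll is not shown in this section; it can be recovered by solving for the 5% and 95% points of the sigmoid (a sketch, assuming the sigmoidF form above):

S <- 0.5; d <- 10
# solve sigmoidF(d + ll, S, d) = 0.95  =>  S * ll = log(0.95 / 0.05)
ll <- log(0.95 / 0.05) / S
c(d - ll, d + ll)   # ~ (4.11, 15.89): the x values bounding 90% of the change in y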
In this case, we can write down a closed-form expression for the change in \(y\) by looking at the derivative of the sigmoidal function:
\[ \frac{dy}{dx} = \frac{S e^{-S(x-d)}}{\left(1+e^{-S(x-d)}\right)^2}.\]
This derivative is maximized at \(x = d\), where it equals \(S/4 = 0.125.\) Around \(x=10\), then, a one-unit change in \(x\) will result in at most about a 12% change in \(y\), but as we move away from this peak, the amount of information about \(y\) conveyed by \(x\) declines.
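A quick numeric check of this (a sketch, assuming \(S = 0.5\) and the sigmoid above; dsigmoidF is a hypothetical helper, not one from the document):

# derivative of the sigmoid: S * p * (1 - p), where p is the sigmoid itself
dsigmoidF <- function(x, S, d) {
  p <- 1 / (1 + exp(-S * (x - d)))
  S * p * (1 - p)
}
dsigmoidF(10, 0.5, 10)   # 0.125 = S/4: the maximum slope, at x = d
dsigmoidF(15, 0.5, 10)   # ~0.035: far less information per unit change in x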
The right way to do prediction here is probably to fit a sigmoidal curve to the data.
We note that the fitted line provides a poor fit on visual inspection. Since the residuals should scatter evenly around zero with no systematic pattern, we can also see how badly we've done by plotting them:
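A sketch of this check; the simulated data, noise level, and grid here are assumptions chosen to mirror the example, not the document's own simulation.

sigmoidF <- function(x, S, d) 1 / (1 + exp(-S * (x - d)))  # as above
set.seed(2)
x <- seq(0, 20, by = 0.5)
y <- sigmoidF(x, 0.5, 10) + rnorm(length(x), sd = 0.05)

fit_all <- lm(y ~ x)                      # straight line over the full range
plot(x, resid(fit_all), xlab = "x", ylab = "residual")
abline(h = 0, lty = 2)                    # residuals show a clear systematic wave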
If, however, we include only the values of \(x\) that contain 90% of the range of \(y\), we get a reasonably good linear fit to the true relationship. The 9% slope from the fitted line is about 3/4 of the maximum 12% slope predicted above.
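A sketch of how such a restricted fit could be set up; the noise level, grid, and exact cutoffs are assumptions, so the coefficients will differ somewhat from the summary that follows.

sigmoidF <- function(x, S, d) 1 / (1 + exp(-S * (x - d)))  # as above
set.seed(3)
x <- seq(0, 20, by = 0.5)
y <- sigmoidF(x, 0.5, 10) + rnorm(length(x), sd = 0.05)

keep <- x > 4.11 & x < 15.89     # keep only the informative domain
xsub <- x[keep]; ysub <- y[keep]

fit_sub <- lm(ysub ~ xsub)
summary(fit_sub)                 # slope near 0.09, roughly 3/4 of the 0.125 maximum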
Call:
lm(formula = ysub ~ xsub)
Residuals:
Min 1Q Median 3Q Max
-0.141548 -0.034737 0.003852 0.033085 0.137366
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.396674 0.037822 -10.49 1.71e-12 ***
xsub 0.090297 0.003258 27.71 < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.0634 on 36 degrees of freedom
Multiple R-squared: 0.9552, Adjusted R-squared: 0.954
F-statistic: 768.1 on 1 and 36 DF, p-value: < 2.2e-16
If we plot the residuals, it's hard to tell that the relationship within the subsetted data is not linear.
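Continuing the subset-fit sketch above (xsub and fit_sub are the hypothetical names introduced there):

# residuals of the subset fit: the remaining curvature is modest relative to the noise
plot(xsub, resid(fit_sub), xlab = "x (informative domain)", ylab = "residual")
abline(h = 0, lty = 2)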