http://www.zorba-xquery.com/modules/excel/statistical
Description
Before using any of the functions below please remember to import the module namespace:
import module namespace excel = "http://www.zorba-xquery.com/modules/excel/statistical";
This is a library module offering a part of the set of statistical functions defined by Microsoft Excel 2003.
See also
Author
Daniel Turcanu
XQuery version and encoding
xquery version "3.0" encoding "utf-8";
Namespaces
| excel | http://www.zorba-xquery.com/modules/excel/statistical |
| excel-err | http://www.zorba-xquery.com/modules/excel/errors |
| excel-math | http://www.zorba-xquery.com/modules/excel/math |
| ver | http://www.zorba-xquery.com/options/versioning |
Function Summary
avedev($numbers as xs:anyAtomicType+) as xs:anyAtomicTypeReturns the average of the absolute deviations of data points from their mean. | |
average($numbers as xs:anyAtomicType*) as xs:anyAtomicTypeReturns the average (arithmetic mean) of the arguments. | |
averagea($numbers as xs:anyAtomicType+) as xs:anyAtomicTypeCalculates the average (arithmetic mean) of the values in the sequence of arguments. | |
count($numbers as xs:anyAtomicType*) as xs:integerCounts the number of cells that contain numbers or values castable to numeric. | |
counta($numbers as xs:anyAtomicType*) as xs:integerCounts the number of values that are not empty. | |
countblank($cells as xs:anyAtomicType*) as xs:integerCounts the empty values in a sequence. | |
large($numbers as xs:anyAtomicType*, $k as xs:integer) as xs:anyAtomicTypeReturns the k-th largest value in a data set. | |
max($numbers as xs:anyAtomicType*) as xs:anyAtomicTypeReturns the largest number in a sequence. | |
maxa($numbers as xs:anyAtomicType*) as xs:anyAtomicTypeReturns the largest value in a list of arguments. | |
median($numbers as xs:anyAtomicType*) as xs:anyAtomicTypeReturns the median of the given numbers. | |
min($numbers as xs:anyAtomicType*) as xs:anyAtomicTypeReturns the smallest number in a sequence. | |
mina($numbers as xs:anyAtomicType*) as xs:anyAtomicTypeReturns the smallest value in a list of arguments. | |
mode($numbers as xs:anyAtomicType*) as xs:anyAtomicTypeReturns the most frequently occurring, or repetitive, value in a sequence. | |
percentile($numbers as xs:anyAtomicType*, $k_at as xs:anyAtomicType) as xs:anyAtomicTypeReturns the k-th percentile of values in a sequence. | |
percentrank($numbers as xs:anyAtomicType*, $x as xs:anyAtomicType) as xs:decimalReturns the rank of a value in a data set as a percentage of the data set. | |
prob($x_range as xs:anyAtomicType+, $prob_range as xs:anyAtomicType+, $range_lower_limit as xs:anyAtomicType) as xs:anyAtomicTypeThis is the same as above, only that upper_limit is not specified. | |
prob($x_range as xs:anyAtomicType+, $prob_range as xs:anyAtomicType+, $range_lower_limit as xs:anyAtomicType, $upper_limit as xs:anyAtomicType) as xs:anyAtomicTypeReturns the probability that values in a range are between two limits. | |
quartile($numbers as xs:anyAtomicType*, $quart as xs:integer) as xs:anyAtomicTypeReturns the quartile of a data set. | |
rank($x as xs:anyAtomicType, $numbers as xs:anyAtomicType*) as xs:decimalThis RANK function is same as the above, only that $order_ascending is set by default to false. | |
rank($x as xs:anyAtomicType, $numbers as xs:anyAtomicType*, $order_ascending as xs:boolean) as xs:decimalReturns the rank of a number in a list of numbers. | |
slope($known_y as xs:anyAtomicType+, $known_x as xs:anyAtomicType+) as xs:anyAtomicTypeReturns the slope of the linear regression line through data points in known_y's and known_x's. | |
small($numbers as xs:anyAtomicType*, $k as xs:integer) as xs:anyAtomicTypeThis function computes the k-th smallest value in a data set. | |
standardize($x as xs:anyAtomicType, $mean as xs:anyAtomicType, $standard_dev as xs:anyAtomicType) as xs:doubleReturns a normalized value from a distribution characterized by mean and standard_dev. | |
var($numbers as xs:anyAtomicType+) as xs:anyAtomicTypeEstimates variance based on a sample. | |
vara($numbers as xs:anyAtomicType+) as xs:anyAtomicTypeEstimates variance based on a sample. | |
varp($numbers as xs:anyAtomicType+) as xs:anyAtomicTypeCalculates variance based on the entire population. | |
varpa($numbers as xs:anyAtomicType+) as xs:anyAtomicTypeCalculates variance based on the entire population. |
Functions
avedev#1
declare function excel:avedev(
$numbers as xs:anyAtomicType+
) as xs:anyAtomicTypeReturns the average of the absolute deviations of data points from their mean. The formula is sum(abs(x - average_x))/n, where n is the count of x in the sequence.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric. Sequence can be of any length from 1 up.
Returns
xs:anyAtomicTypeThe formula result
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
average#1
declare function excel:average(
$numbers as xs:anyAtomicType*
) as xs:anyAtomicTypeReturns the average (arithmetic mean) of the arguments. Arguments can be empty values, otherwise must be castable to numeric. If sequence is empty then zero is returned. The sequence can be of any length.
Parameters
$numbers as xs:anyAtomicTypeThe sequence of numbers or empty values.
Returns
xs:anyAtomicTypeThe sum of all numbers divided by the number of non-empty values.
Errors
- excel-err:Value if the parameters cannot be casted to numeric type.
Examples
See also
averagea#1
declare function excel:averagea(
$numbers as xs:anyAtomicType+
) as xs:anyAtomicTypeCalculates the average (arithmetic mean) of the values in the sequence of arguments. Arguments can be of any type. The numbers are added, and the sum is divided by the size of entire sequence.
Parameters
$numbers as xs:anyAtomicTypethe sequence of values of any type. The sequence can be of any length, from 1 up.
Returns
xs:anyAtomicTypeThe result
Examples
See also
count#1
declare function excel:count(
$numbers as xs:anyAtomicType*
) as xs:integerCounts the number of cells that contain numbers or values castable to numeric.
Parameters
$numbers as xs:anyAtomicTypeThe sequence of values, of any length.
Returns
xs:integerThe count of numbers.
Examples
See also
counta#1
declare function excel:counta(
$numbers as xs:anyAtomicType*
) as xs:integerCounts the number of values that are not empty. Empty values are the one with string value "".
Parameters
$numbers as xs:anyAtomicTypethe sequence of values of any type, any length
Returns
xs:integerThe count of non-empty values
Examples
See also
countblank#1
declare function excel:countblank(
$cells as xs:anyAtomicType*
) as xs:integerCounts the empty values in a sequence. The empty values are the ones with string value "". The value 0 is not counted.
Parameters
$cells as xs:anyAtomicTypethe sequence of values, of any length
Returns
xs:integerThe count
Examples
See also
large#2
declare function excel:large(
$numbers as xs:anyAtomicType*,
$k as xs:integer
) as xs:anyAtomicTypeReturns the k-th largest value in a data set. If n is the number of data points in a range, then LARGE(array,1) returns the largest value, and LARGE(array,n) returns the smallest value.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.$k as xs:integerthe position of largest value, with value from 1 to count of values
Returns
xs:anyAtomicTypeThe k-th largest value as numeric type
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
- excel-err:Num if the sequence is empty
- excel-err:Num if k is not a value between 1 and the sequence size
Examples
See also
max#1
declare function excel:max(
$numbers as xs:anyAtomicType*
) as xs:anyAtomicTypeReturns the largest number in a sequence.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric. The sequence can be of any length.
Returns
xs:anyAtomicTypeThe max
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
maxa#1
declare function excel:maxa(
$numbers as xs:anyAtomicType*
) as xs:anyAtomicTypeReturns the largest value in a list of arguments. In this implementation there is no difference between MAX and MAXA.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric The sequence can be of any length.
Returns
xs:anyAtomicTypeThe max
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
median#1
declare function excel:median(
$numbers as xs:anyAtomicType*
) as xs:anyAtomicTypeReturns the median of the given numbers. The median is the number in the middle of a set of numbers. Half the numbers have values that are greater than the median, and half the numbers have values that are less than the median.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers, of any length
Returns
xs:anyAtomicTypefor odd count of numbers return the number in the middle of the sorted sequence. For even count of numbers return the average of the two numbers in the middle.
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
min#1
declare function excel:min(
$numbers as xs:anyAtomicType*
) as xs:anyAtomicTypeReturns the smallest number in a sequence.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric. The sequence can be of any length.
Returns
xs:anyAtomicTypeThe min
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
mina#1
declare function excel:mina(
$numbers as xs:anyAtomicType*
) as xs:anyAtomicTypeReturns the smallest value in a list of arguments. In this implementation there is no difference between MAX and MAXA.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric The sequence can be of any length.
Returns
xs:anyAtomicTypeThe min
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
mode#1
declare function excel:mode(
$numbers as xs:anyAtomicType*
) as xs:anyAtomicTypeReturns the most frequently occurring, or repetitive, value in a sequence. Arguments must be castable to numeric.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers, of any length
Returns
xs:anyAtomicTypeThe most occuring number
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
- fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if there are no duplicate numbers
Examples
See also
percentile#2
declare function excel:percentile(
$numbers as xs:anyAtomicType*,
$k_at as xs:anyAtomicType
) as xs:anyAtomicTypeReturns the k-th percentile of values in a sequence. If k is not a multiple of 1/(n - 1), PERCENTILE interpolates to determine the value at the k-th percentile. The function is computed by (max-min)*k + min
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers, of any length$k_at as xs:anyAtomicTypethe percentile, with value between 0 .. 1 inclusive
Returns
xs:anyAtomicTypeThe computed percentile
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
- excel-err:Num if percentile is not between 0 .. 1
Examples
See also
percentrank#2
declare function excel:percentrank(
$numbers as xs:anyAtomicType*,
$x as xs:anyAtomicType
) as xs:decimal Returns the rank of a value in a data set as a percentage of the data set.
If x does not match one of the values in array,
PERCENTRANK interpolates to return the correct percentage rank.
The formula is uses: (RANK - 1) / (size - 1) .
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numbers. The sequence can be of any length, from 1 up.$x as xs:anyAtomicTypeis the value for which you want to know the rank
Returns
xs:decimalThe percentage of rank.
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
- excel-err:Num if the sequence is zero length
Examples
See also
prob#3
declare function excel:prob(
$x_range as xs:anyAtomicType+,
$prob_range as xs:anyAtomicType+,
$range_lower_limit as xs:anyAtomicType
) as xs:anyAtomicTypeThis is the same as above, only that upper_limit is not specified. The probability is computed only for range_lower_limit.
Parameters
$x_range as xs:anyAtomicTypeis the range of numeric values of x with which there are associated probabilities. This does not need to be ordered.$prob_range as xs:anyAtomicTypeis a set of probabilities associated with values in x_range.$range_lower_limit as xs:anyAtomicTypeis the value for which you want a probability.
Returns
xs:anyAtomicTypeThe probability of the range_lower_limit value
Errors
- excel-err:Num if any probability is not between 0 and 1
- excel-err:Num if the sum of probabilities is not equal to 1
- excel-err:Value if any parameter is not castable to numeric
- excel-err:Num if x_range and prob_range do not have the same number of values
Examples
See also
prob#4
declare function excel:prob(
$x_range as xs:anyAtomicType+,
$prob_range as xs:anyAtomicType+,
$range_lower_limit as xs:anyAtomicType,
$upper_limit as xs:anyAtomicType
) as xs:anyAtomicTypeReturns the probability that values in a range are between two limits.
Parameters
$x_range as xs:anyAtomicTypeis the range of numeric values of x with which there are associated probabilities. This does not need to be ordered.$prob_range as xs:anyAtomicTypeis a set of probabilities associated with values in x_range.$range_lower_limit as xs:anyAtomicTypeis the lower bound on the value for which you want a probability.$upper_limit as xs:anyAtomicTypeis the upper bound on the value for which you want a probability.
Returns
xs:anyAtomicTypeThe probability of the entire range
Errors
- excel-err:Num if any probability is not between 0 and 1
- excel-err:Num if the sum of probabilities is not equal to 1
- excel-err:Value if any parameter is not castable to numeric
- excel-err:Num if x_range and prob_range do not have the same number of values
Examples
See also
quartile#2
declare function excel:quartile(
$numbers as xs:anyAtomicType*,
$quart as xs:integer
) as xs:anyAtomicTypeReturns the quartile of a data set.
Parameters
$numbers as xs:anyAtomicTypesequence of numbers or values castable to numbers. The sequence can be of any length, from 1 up.$quart as xs:integer- one of the values 0, 1, 2, 3, 4 with meaning:
- 0
- compute minimum value
- 1
- compute first quartile (25th percentile)
- 2
- compute median value (50th percentile)
- 3
- compute third quartile (75th percentile)
- 4
- compute maximum value
Returns
xs:anyAtomicTypethe computed quartile, as numeric type
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
- excel-err:Num if the sequence is zero length
- excel-err:Num if $quart is not one of the values 0, 1, 2, 3, 4
Examples
See also
rank#2
declare function excel:rank(
$x as xs:anyAtomicType,
$numbers as xs:anyAtomicType*
) as xs:decimalThis RANK function is same as the above, only that $order_ascending is set by default to false.
Parameters
$x as xs:anyAtomicTypeThe number whose rank you want to find.$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numbers. The sequence can be of any length.
Returns
xs:decimalThe rank of $x.
Errors
- excel-err:Value if the parameters cannot be casted to numeric type.
Examples
See also
rank#3
declare function excel:rank(
$x as xs:anyAtomicType,
$numbers as xs:anyAtomicType*,
$order_ascending as xs:boolean
) as xs:decimalReturns the rank of a number in a list of numbers. The rank of a number is its size relative to other values in a list. (If you were to sort the list, the rank of the number would be its position.) RANK gives duplicate numbers the same rank.
Parameters
$x as xs:anyAtomicTypeThe number whose rank you want to find.$numbers as xs:anyAtomicTypeThe sequence of numbers or values castable to numbers. The sequence can be of any length.$order_ascending as xs:boolean- A boolean having the meaning:
- false
- then rank the number as if the sequence was sorted in descending order.
- true
- then rank the number as if the sequence was sorted in ascending order.
Returns
xs:decimalThe rank of $x.
Errors
- excel-err:Value if the parameters cannot be casted to numeric type.
Examples
See also
slope#2
declare function excel:slope(
$known_y as xs:anyAtomicType+,
$known_x as xs:anyAtomicType+
) as xs:anyAtomicType Returns the slope of the linear regression line through data points in known_y's and known_x's.
The slope is the vertical distance divided by the horizontal distance between
any two points on the line, which is the rate of change along the regression line.
It computes the formula:
sum((x - average_x)(y - average_y)) / sum((x - average_x)^2)
where average_x and average_y are computed with AVERAGE function.
Parameters
$known_y as xs:anyAtomicTypethe sequence of y numbers. The sequence can be of any length, from 1 up.$known_x as xs:anyAtomicTypethe sequence of x numbers. The sequence can be of any length, from 1 up.
Returns
xs:anyAtomicTypeThe slope value, as numeric type
Errors
- excel-err:Value if any parameter cannot be casted to numeric
- fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if there are different numbers of x's and y's
- fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if any sequence is empty
- excel-err:Div0 if all x's are equal
Examples
See also
small#2
declare function excel:small(
$numbers as xs:anyAtomicType*,
$k as xs:integer
) as xs:anyAtomicTypeThis function computes the k-th smallest value in a data set. Use this function to return values with a particular relative standing in a data set. If n is the number of data points in array, SMALL(array,1) equals the smallest value, and SMALL(array,n) equals the largest value.
Parameters
$numbers as xs:anyAtomicTypeA sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.$k as xs:integerThe position (from the smallest) in the sequence of data to return. Must have value between 1 and size of sequence.
Returns
xs:anyAtomicTypeThe k-th smallest value of $numbers.
Errors
- excel-err:Value if the parameters cannot be casted to numeric type.
- excel-err:Num if the sequence is zero length.
- excel-err:Num if $k is not a value between 1 and the size of sequence.
Examples
See also
standardize#3
declare function excel:standardize(
$x as xs:anyAtomicType,
$mean as xs:anyAtomicType,
$standard_dev as xs:anyAtomicType
) as xs:double Returns a normalized value from a distribution characterized by mean and standard_dev.
The formula is (x - mean) / standard_dev .
Parameters
$x as xs:anyAtomicTypeis the value you want to normalize$mean as xs:anyAtomicTypeis the arithmetic mean of the distribution.$standard_dev as xs:anyAtomicTypeis the standard deviation of the distribution.
Returns
xs:doubleThe normalized x, as numeric type
Errors
- excel-err:Value if any parameter cannot be casted to numeric
- excel-err:Num if standard_dev is a value smaller than zero or equal
Examples
See also
var#1
declare function excel:var(
$numbers as xs:anyAtomicType+
) as xs:anyAtomicType Estimates variance based on a sample.
The formula is sum(x - average_x)^2 / (n - 1).
average_x is computed with AVERAGE function.
n is the count of numbers from the sequence, excluding empty values.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
Returns
xs:anyAtomicTypeThe variance, as numeric type
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
vara#1
declare function excel:vara(
$numbers as xs:anyAtomicType+
) as xs:anyAtomicType Estimates variance based on a sample.
The formula is sum(x - average_x)^2 / (n - 1).
average_x is computed with AVERAGE function.
n is the size of sequence, including empty values.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
Returns
xs:anyAtomicTypeThe variance, as numeric type
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
varp#1
declare function excel:varp(
$numbers as xs:anyAtomicType+
) as xs:anyAtomicType Calculates variance based on the entire population.
The formula is sum(x - average_x)^2 / n.
average_x is computed with AVERAGE function.
n is the count of numbers from the sequence, excluding empty values.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
Returns
xs:anyAtomicTypeThe variance, as numeric type
Errors
- excel-err:Value if the parameters cannot be casted to numeric type
Examples
See also
varpa#1
declare function excel:varpa(
$numbers as xs:anyAtomicType+
) as xs:anyAtomicType Calculates variance based on the entire population.
The formula is sum(x - average_x)^2 / n.
average_x is computed with AVERAGE function.
n is the size of sequence, including empty values.
Parameters
$numbers as xs:anyAtomicTypethe sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
Returns
xs:anyAtomicTypeThe variance, as numeric type
Errors
- excel-err:Value if the parameters cannot be casted to numeric type