http://www.zorba-xquery.com/modules/excel/statistical ZC

Module Description
Before using any of the functions below please remember to import the module namespace:
import module namespace excel = "http://www.zorba-xquery.com/modules/excel/statistical";

This is a library module offering a part of the set of statistical functions defined by Microsoft Excel 2003.

Author:

Daniel Turcanu

XQuery version and encoding for this module:

xquery version "3.0" encoding "utf-8";

Module Resources
  • the XQuery module can be found here.
Module Dependencies

Imported modules:

Related Documentation

For more details please also see:

Namespaces
excel http://www.zorba-xquery.com/modules/excel/statistical
excel-err http://www.zorba-xquery.com/modules/excel/errors
excel-math http://www.zorba-xquery.com/modules/excel/math
ver http://www.zorba-xquery.com/options/versioning
Function Summary
avedev ( $numbers as xs:anyAtomicType+ ) as xs:anyAtomicType
Returns the average of the absolute deviations of data points from their mean.
average ( $numbers as xs:anyAtomicType* ) as xs:anyAtomicType
Returns the average (arithmetic mean) of the arguments.
averagea ( $numbers as xs:anyAtomicType+ ) as xs:anyAtomicType
Calculates the average (arithmetic mean) of the values in the sequence of arguments.
count ( $numbers as xs:anyAtomicType* ) as xs:integer
Counts the number of cells that contain numbers or values castable to numeric.
counta ( $numbers as xs:anyAtomicType* ) as xs:integer
Counts the number of values that are not empty.
countblank ( $cells as xs:anyAtomicType* ) as xs:integer
Counts the empty values in a sequence.
large ( $numbers as xs:anyAtomicType*, $k as xs:integer ) as xs:anyAtomicType
Returns the k-th largest value in a data set.
max ( $numbers as xs:anyAtomicType* ) as xs:anyAtomicType
Returns the largest number in a sequence.
maxa ( $numbers as xs:anyAtomicType* ) as xs:anyAtomicType
Returns the largest value in a list of arguments.
median ( $numbers as xs:anyAtomicType* ) as xs:anyAtomicType
Returns the median of the given numbers.
min ( $numbers as xs:anyAtomicType* ) as xs:anyAtomicType
Returns the smallest number in a sequence.
mina ( $numbers as xs:anyAtomicType* ) as xs:anyAtomicType
Returns the smallest value in a list of arguments.
mode ( $numbers as xs:anyAtomicType* ) as xs:anyAtomicType
Returns the most frequently occurring, or repetitive, value in a sequence.
percentile ( $numbers as xs:anyAtomicType*, $k_at as xs:anyAtomicType ) as xs:anyAtomicType
Returns the k-th percentile of values in a sequence.
percentrank ( $numbers as xs:anyAtomicType*, $x as xs:anyAtomicType ) as xs:decimal
Returns the rank of a value in a data set as a percentage of the data set.
prob ( $x_range as xs:anyAtomicType+, $prob_range as xs:anyAtomicType+, $range_lower_limit as xs:anyAtomicType ) as xs:anyAtomicType
This is the same as above, only that upper_limit is not specified.
prob ( $x_range as xs:anyAtomicType+, $prob_range as xs:anyAtomicType+, $range_lower_limit as xs:anyAtomicType, $upper_limit as xs:anyAtomicType ) as xs:anyAtomicType
Returns the probability that values in a range are between two limits.
quartile ( $numbers as xs:anyAtomicType*, $quart as xs:integer ) as xs:anyAtomicType
Returns the quartile of a data set.
rank ( $x as xs:anyAtomicType, $numbers as xs:anyAtomicType* ) as xs:decimal
This RANK function is same as the above, only that $order_ascending is set by default to false.
rank ( $x as xs:anyAtomicType, $numbers as xs:anyAtomicType*, $order_ascending as xs:boolean ) as xs:decimal
Returns the rank of a number in a list of numbers.
slope ( $known_y as xs:anyAtomicType+, $known_x as xs:anyAtomicType+ ) as xs:anyAtomicType
Returns the slope of the linear regression line through data points in known_y's and known_x's.
small ( $numbers as xs:anyAtomicType*, $k as xs:integer ) as xs:anyAtomicType
This function computes the k-th smallest value in a data set.
standardize ( $x as xs:anyAtomicType, $mean as xs:anyAtomicType, $standard_dev as xs:anyAtomicType ) as xs:double
Returns a normalized value from a distribution characterized by mean and standard_dev.
var ( $numbers as xs:anyAtomicType+ ) as xs:anyAtomicType
Estimates variance based on a sample.
vara ( $numbers as xs:anyAtomicType+ ) as xs:anyAtomicType
Estimates variance based on a sample.
varp ( $numbers as xs:anyAtomicType+ ) as xs:anyAtomicType
Calculates variance based on the entire population.
varpa ( $numbers as xs:anyAtomicType+ ) as xs:anyAtomicType
Calculates variance based on the entire population.
Functions
avedev back to 'Function Summary'
declare function excel:avedev (
            $numbers as xs:anyAtomicType+ 
 ) as xs:anyAtomicType

Returns the average of the absolute deviations of data points from their mean. The formula is sum(abs(x - average_x))/n, where n is the count of x in the sequence.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric. Sequence can be of any length from 1 up.
Returns:
  • The formula result
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

average back to 'Function Summary'
declare function excel:average (
            $numbers as xs:anyAtomicType* 
 ) as xs:anyAtomicType

Returns the average (arithmetic mean) of the arguments. Arguments can be empty values, otherwise must be castable to numeric. If sequence is empty then zero is returned. The sequence can be of any length.

Parameters:
  • $numbers The sequence of numbers or empty values.
Returns:
  • The sum of all numbers divided by the number of non-empty values.
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type.
See:
Examples:

averagea back to 'Function Summary'
declare function excel:averagea (
            $numbers as xs:anyAtomicType+ 
 ) as xs:anyAtomicType

Calculates the average (arithmetic mean) of the values in the sequence of arguments. Arguments can be of any type. The numbers are added, and the sum is divided by the size of entire sequence.

Parameters:
  • $numbers the sequence of values of any type. The sequence can be of any length, from 1 up.
Returns:
  • The result
See:
Examples:

count back to 'Function Summary'
declare function excel:count (
            $numbers as xs:anyAtomicType* 
 ) as xs:integer

Counts the number of cells that contain numbers or values castable to numeric.

Parameters:
  • $numbers The sequence of values, of any length.
Returns:
  • The count of numbers.
See:
Examples:

counta back to 'Function Summary'
declare function excel:counta (
            $numbers as xs:anyAtomicType* 
 ) as xs:integer

Counts the number of values that are not empty. Empty values are the one with string value "".

Parameters:
  • $numbers the sequence of values of any type, any length
Returns:
  • The count of non-empty values
See:
Examples:

countblank back to 'Function Summary'
declare function excel:countblank (
            $cells as xs:anyAtomicType* 
 ) as xs:integer

Counts the empty values in a sequence. The empty values are the ones with string value "". The value 0 is not counted.

Parameters:
  • $cells the sequence of values, of any length
Returns:
  • The count
See:
Examples:

large back to 'Function Summary'
declare function excel:large (
            $numbers as xs:anyAtomicType*, 
            $k as xs:integer 
 ) as xs:anyAtomicType

Returns the k-th largest value in a data set. If n is the number of data points in a range, then LARGE(array,1) returns the largest value, and LARGE(array,n) returns the smallest value.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
  • $k the position of largest value, with value from 1 to count of values
Returns:
  • The k-th largest value as numeric type
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
  • excel-err:Num if the sequence is empty
  • excel-err:Num if k is not a value between 1 and the sequence size
See:
Examples:

max back to 'Function Summary'
declare function excel:max (
            $numbers as xs:anyAtomicType* 
 ) as xs:anyAtomicType

Returns the largest number in a sequence.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric. The sequence can be of any length.
Returns:
  • The max
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

maxa back to 'Function Summary'
declare function excel:maxa (
            $numbers as xs:anyAtomicType* 
 ) as xs:anyAtomicType

Returns the largest value in a list of arguments. In this implementation there is no difference between MAX and MAXA.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric The sequence can be of any length.
Returns:
  • The max
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

median back to 'Function Summary'
declare function excel:median (
            $numbers as xs:anyAtomicType* 
 ) as xs:anyAtomicType

Returns the median of the given numbers. The median is the number in the middle of a set of numbers. Half the numbers have values that are greater than the median, and half the numbers have values that are less than the median.

Parameters:
  • $numbers the sequence of numbers, of any length
Returns:
  • for odd count of numbers return the number in the middle of the sorted sequence. For even count of numbers return the average of the two numbers in the middle.
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

min back to 'Function Summary'
declare function excel:min (
            $numbers as xs:anyAtomicType* 
 ) as xs:anyAtomicType

Returns the smallest number in a sequence.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric. The sequence can be of any length.
Returns:
  • The min
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

mina back to 'Function Summary'
declare function excel:mina (
            $numbers as xs:anyAtomicType* 
 ) as xs:anyAtomicType

Returns the smallest value in a list of arguments. In this implementation there is no difference between MAX and MAXA.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric The sequence can be of any length.
Returns:
  • The min
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

mode back to 'Function Summary'
declare function excel:mode (
            $numbers as xs:anyAtomicType* 
 ) as xs:anyAtomicType

Returns the most frequently occurring, or repetitive, value in a sequence. Arguments must be castable to numeric.

Parameters:
  • $numbers the sequence of numbers, of any length
Returns:
  • The most occuring number
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
  • fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if there are no duplicate numbers
See:
Examples:

percentile back to 'Function Summary'
declare function excel:percentile (
            $numbers as xs:anyAtomicType*, 
            $k_at as xs:anyAtomicType 
 ) as xs:anyAtomicType

Returns the k-th percentile of values in a sequence. If k is not a multiple of 1/(n - 1), PERCENTILE interpolates to determine the value at the k-th percentile. The function is computed by (max-min)*k + min

Parameters:
  • $numbers the sequence of numbers, of any length
  • $k_at the percentile, with value between 0 .. 1 inclusive
Returns:
  • The computed percentile
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
  • excel-err:Num if percentile is not between 0 .. 1
See:
Examples:

percentrank back to 'Function Summary'
declare function excel:percentrank (
            $numbers as xs:anyAtomicType*, 
            $x as xs:anyAtomicType 
 ) as xs:decimal

Returns the rank of a value in a data set as a percentage of the data set. If x does not match one of the values in array, PERCENTRANK interpolates to return the correct percentage rank.
The formula is uses: (RANK - 1) / (size - 1) .

Parameters:
  • $numbers the sequence of numbers or values castable to numbers. The sequence can be of any length, from 1 up.
  • $x is the value for which you want to know the rank
Returns:
  • The percentage of rank.
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
  • excel-err:Num if the sequence is zero length
See:
Examples:

prob back to 'Function Summary'
declare function excel:prob (
            $x_range as xs:anyAtomicType+, 
            $prob_range as xs:anyAtomicType+, 
            $range_lower_limit as xs:anyAtomicType 
 ) as xs:anyAtomicType

This is the same as above, only that upper_limit is not specified. The probability is computed only for range_lower_limit.

Parameters:
  • $x_range is the range of numeric values of x with which there are associated probabilities. This does not need to be ordered.
  • $prob_range is a set of probabilities associated with values in x_range.
  • $range_lower_limit is the value for which you want a probability.
Returns:
  • The probability of the range_lower_limit value
Errors:
  • excel-err:Num if any probability is not between 0 and 1
  • excel-err:Num if the sum of probabilities is not equal to 1
  • excel-err:Value if any parameter is not castable to numeric
  • excel-err:Num if x_range and prob_range do not have the same number of values
See:
Examples:

prob back to 'Function Summary'
declare function excel:prob (
            $x_range as xs:anyAtomicType+, 
            $prob_range as xs:anyAtomicType+, 
            $range_lower_limit as xs:anyAtomicType, 
            $upper_limit as xs:anyAtomicType 
 ) as xs:anyAtomicType

Returns the probability that values in a range are between two limits.

Parameters:
  • $x_range is the range of numeric values of x with which there are associated probabilities. This does not need to be ordered.
  • $prob_range is a set of probabilities associated with values in x_range.
  • $range_lower_limit is the lower bound on the value for which you want a probability.
  • $upper_limit is the upper bound on the value for which you want a probability.
Returns:
  • The probability of the entire range
Errors:
  • excel-err:Num if any probability is not between 0 and 1
  • excel-err:Num if the sum of probabilities is not equal to 1
  • excel-err:Value if any parameter is not castable to numeric
  • excel-err:Num if x_range and prob_range do not have the same number of values
See:
Examples:

quartile back to 'Function Summary'
declare function excel:quartile (
            $numbers as xs:anyAtomicType*, 
            $quart as xs:integer 
 ) as xs:anyAtomicType

Returns the quartile of a data set.

Parameters:
  • $numbers sequence of numbers or values castable to numbers. The sequence can be of any length, from 1 up.
  • $quart one of the values 0, 1, 2, 3, 4 with meaning: 0 compute minimum value 1 compute first quartile (25th percentile) 2 compute median value (50th percentile) 3 compute third quartile (75th percentile) 4 compute maximum value
Returns:
  • the computed quartile, as numeric type
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
  • excel-err:Num if the sequence is zero length
  • excel-err:Num if $quart is not one of the values 0, 1, 2, 3, 4
See:
Examples:

rank back to 'Function Summary'
declare function excel:rank (
            $x as xs:anyAtomicType, 
            $numbers as xs:anyAtomicType* 
 ) as xs:decimal

This RANK function is same as the above, only that $order_ascending is set by default to false.

Parameters:
  • $x The number whose rank you want to find.
  • $numbers the sequence of numbers or values castable to numbers. The sequence can be of any length.
Returns:
  • The rank of $x.
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type.
See:
Examples:

rank back to 'Function Summary'
declare function excel:rank (
            $x as xs:anyAtomicType, 
            $numbers as xs:anyAtomicType*, 
            $order_ascending as xs:boolean 
 ) as xs:decimal

Returns the rank of a number in a list of numbers. The rank of a number is its size relative to other values in a list. (If you were to sort the list, the rank of the number would be its position.) RANK gives duplicate numbers the same rank.

Parameters:
  • $x The number whose rank you want to find.
  • $numbers The sequence of numbers or values castable to numbers. The sequence can be of any length.
  • $order_ascending A boolean having the meaning: false then rank the number as if the sequence was sorted in descending order. true then rank the number as if the sequence was sorted in ascending order.
Returns:
  • The rank of $x.
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type.
See:
Examples:

slope back to 'Function Summary'
declare function excel:slope (
            $known_y as xs:anyAtomicType+, 
            $known_x as xs:anyAtomicType+ 
 ) as xs:anyAtomicType

Returns the slope of the linear regression line through data points in known_y's and known_x's. The slope is the vertical distance divided by the horizontal distance between any two points on the line, which is the rate of change along the regression line. It computes the formula:
sum((x - average_x)(y - average_y)) / sum((x - average_x)^2)
where average_x and average_y are computed with AVERAGE function.

Parameters:
  • $known_y the sequence of y numbers. The sequence can be of any length, from 1 up.
  • $known_x the sequence of x numbers. The sequence can be of any length, from 1 up.
Returns:
  • The slope value, as numeric type
Errors:
  • excel-err:Value if any parameter cannot be casted to numeric
  • fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if there are different numbers of x's and y's
  • fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if any sequence is empty
  • excel-err:Div0 if all x's are equal
See:
Examples:

small back to 'Function Summary'
declare function excel:small (
            $numbers as xs:anyAtomicType*, 
            $k as xs:integer 
 ) as xs:anyAtomicType

This function computes the k-th smallest value in a data set. Use this function to return values with a particular relative standing in a data set. If n is the number of data points in array, SMALL(array,1) equals the smallest value, and SMALL(array,n) equals the largest value.

Parameters:
  • $numbers A sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
  • $k The position (from the smallest) in the sequence of data to return. Must have value between 1 and size of sequence.
Returns:
  • The k-th smallest value of $numbers.
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type.
  • excel-err:Num if the sequence is zero length.
  • excel-err:Num if $k is not a value between 1 and the size of sequence.
See:
Examples:

standardize back to 'Function Summary'
declare function excel:standardize (
            $x as xs:anyAtomicType, 
            $mean as xs:anyAtomicType, 
            $standard_dev as xs:anyAtomicType 
 ) as xs:double

Returns a normalized value from a distribution characterized by mean and standard_dev.
The formula is (x - mean) / standard_dev .

Parameters:
  • $x is the value you want to normalize
  • $mean is the arithmetic mean of the distribution.
  • $standard_dev is the standard deviation of the distribution.
Returns:
  • The normalized x, as numeric type
Errors:
  • excel-err:Value if any parameter cannot be casted to numeric
  • excel-err:Num if standard_dev is a value smaller than zero or equal
See:
Examples:

var back to 'Function Summary'
declare function excel:var (
            $numbers as xs:anyAtomicType+ 
 ) as xs:anyAtomicType

Estimates variance based on a sample.
The formula is sum(x - average_x)^2 / (n - 1).
average_x is computed with AVERAGE function.
n is the count of numbers from the sequence, excluding empty values.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
Returns:
  • The variance, as numeric type
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

vara back to 'Function Summary'
declare function excel:vara (
            $numbers as xs:anyAtomicType+ 
 ) as xs:anyAtomicType

Estimates variance based on a sample.
The formula is sum(x - average_x)^2 / (n - 1).
average_x is computed with AVERAGE function.
n is the size of sequence, including empty values.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
Returns:
  • The variance, as numeric type
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

varp back to 'Function Summary'
declare function excel:varp (
            $numbers as xs:anyAtomicType+ 
 ) as xs:anyAtomicType

Calculates variance based on the entire population.
The formula is sum(x - average_x)^2 / n.
average_x is computed with AVERAGE function.
n is the count of numbers from the sequence, excluding empty values.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
Returns:
  • The variance, as numeric type
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

varpa back to 'Function Summary'
declare function excel:varpa (
            $numbers as xs:anyAtomicType+ 
 ) as xs:anyAtomicType

Calculates variance based on the entire population.
The formula is sum(x - average_x)^2 / n.
average_x is computed with AVERAGE function.
n is the size of sequence, including empty values.

Parameters:
  • $numbers the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
Returns:
  • The variance, as numeric type
Errors:
  • excel-err:Value if the parameters cannot be casted to numeric type
See:
Examples:

blog comments powered by Disqus