http://www.zorba-xquery.com/modules/excel/statistical

Description

Before using any of the functions below please remember to import the module namespace:

import module namespace excel = "http://www.zorba-xquery.com/modules/excel/statistical";

This is a library module offering a part of the set of statistical functions defined by Microsoft Excel 2003.

See also

Excel 2003 Documentation: Statistical Functions

Author

Daniel Turcanu

XQuery version and encoding

xquery version "3.0" encoding "utf-8";

Namespaces

excelhttp://www.zorba-xquery.com/modules/excel/statistical
excel-errhttp://www.zorba-xquery.com/modules/excel/errors
excel-mathhttp://www.zorba-xquery.com/modules/excel/math
verhttp://www.zorba-xquery.com/options/versioning

Function Summary

avedev($numbers as xs:anyAtomicType+) as xs:anyAtomicType

Returns the average of the absolute deviations of data points from their mean.

average($numbers as xs:anyAtomicType*) as xs:anyAtomicType

Returns the average (arithmetic mean) of the arguments.

averagea($numbers as xs:anyAtomicType+) as xs:anyAtomicType

Calculates the average (arithmetic mean) of the values in the sequence of arguments.

count($numbers as xs:anyAtomicType*) as xs:integer

Counts the number of cells that contain numbers or values castable to numeric.

counta($numbers as xs:anyAtomicType*) as xs:integer

Counts the number of values that are not empty.

countblank($cells as xs:anyAtomicType*) as xs:integer

Counts the empty values in a sequence.

large($numbers as xs:anyAtomicType*, $k as xs:integer) as xs:anyAtomicType

Returns the k-th largest value in a data set.

max($numbers as xs:anyAtomicType*) as xs:anyAtomicType

Returns the largest number in a sequence.

maxa($numbers as xs:anyAtomicType*) as xs:anyAtomicType

Returns the largest value in a list of arguments.

median($numbers as xs:anyAtomicType*) as xs:anyAtomicType

Returns the median of the given numbers.

min($numbers as xs:anyAtomicType*) as xs:anyAtomicType

Returns the smallest number in a sequence.

mina($numbers as xs:anyAtomicType*) as xs:anyAtomicType

Returns the smallest value in a list of arguments.

mode($numbers as xs:anyAtomicType*) as xs:anyAtomicType

Returns the most frequently occurring, or repetitive, value in a sequence.

percentile($numbers as xs:anyAtomicType*, $k_at as xs:anyAtomicType) as xs:anyAtomicType

Returns the k-th percentile of values in a sequence.

percentrank($numbers as xs:anyAtomicType*, $x as xs:anyAtomicType) as xs:decimal

Returns the rank of a value in a data set as a percentage of the data set.

prob($x_range as xs:anyAtomicType+, $prob_range as xs:anyAtomicType+, $range_lower_limit as xs:anyAtomicType) as xs:anyAtomicType

This is the same as above, only that upper_limit is not specified.

prob($x_range as xs:anyAtomicType+, $prob_range as xs:anyAtomicType+, $range_lower_limit as xs:anyAtomicType, $upper_limit as xs:anyAtomicType) as xs:anyAtomicType

Returns the probability that values in a range are between two limits.

quartile($numbers as xs:anyAtomicType*, $quart as xs:integer) as xs:anyAtomicType

Returns the quartile of a data set.

rank($x as xs:anyAtomicType, $numbers as xs:anyAtomicType*) as xs:decimal

This RANK function is same as the above, only that $order_ascending is set by default to false.

rank($x as xs:anyAtomicType, $numbers as xs:anyAtomicType*, $order_ascending as xs:boolean) as xs:decimal

Returns the rank of a number in a list of numbers.

slope($known_y as xs:anyAtomicType+, $known_x as xs:anyAtomicType+) as xs:anyAtomicType

Returns the slope of the linear regression line through data points in known_y's and known_x's.

small($numbers as xs:anyAtomicType*, $k as xs:integer) as xs:anyAtomicType

This function computes the k-th smallest value in a data set.

standardize($x as xs:anyAtomicType, $mean as xs:anyAtomicType, $standard_dev as xs:anyAtomicType) as xs:double

Returns a normalized value from a distribution characterized by mean and standard_dev.

var($numbers as xs:anyAtomicType+) as xs:anyAtomicType

Estimates variance based on a sample.

vara($numbers as xs:anyAtomicType+) as xs:anyAtomicType

Estimates variance based on a sample.

varp($numbers as xs:anyAtomicType+) as xs:anyAtomicType

Calculates variance based on the entire population.

varpa($numbers as xs:anyAtomicType+) as xs:anyAtomicType

Calculates variance based on the entire population.

Functions

avedev#1

declare function excel:avedev(
    $numbers as xs:anyAtomicType+
) as xs:anyAtomicType

Returns the average of the absolute deviations of data points from their mean. The formula is sum(abs(x - average_x))/n, where n is the count of x in the sequence.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric. Sequence can be of any length from 1 up.

Returns

  • xs:anyAtomicType

    The formula result

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052089931033.aspx

average#1

declare function excel:average(
    $numbers as xs:anyAtomicType*
) as xs:anyAtomicType

Returns the average (arithmetic mean) of the arguments. Arguments can be empty values, otherwise must be castable to numeric. If sequence is empty then zero is returned. The sequence can be of any length.

Parameters

  • $numbers as xs:anyAtomicType
    The sequence of numbers or empty values.

Returns

  • xs:anyAtomicType

    The sum of all numbers divided by the number of non-empty values.

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type.

Examples

See also

http://office.microsoft.com/en-us/excel/HP052089941033.aspx

averagea#1

declare function excel:averagea(
    $numbers as xs:anyAtomicType+
) as xs:anyAtomicType

Calculates the average (arithmetic mean) of the values in the sequence of arguments. Arguments can be of any type. The numbers are added, and the sum is divided by the size of entire sequence.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of values of any type. The sequence can be of any length, from 1 up.

Returns

  • xs:anyAtomicType

    The result

Examples

See also

http://office.microsoft.com/en-us/excel/HP052089951033.aspx

count#1

declare function excel:count(
    $numbers as xs:anyAtomicType*
) as xs:integer

Counts the number of cells that contain numbers or values castable to numeric.

Parameters

  • $numbers as xs:anyAtomicType
    The sequence of values, of any length.

Returns

  • xs:integer

    The count of numbers.

Examples

See also

http://office.microsoft.com/en-us/excel/HP052090261033.aspx

counta#1

declare function excel:counta(
    $numbers as xs:anyAtomicType*
) as xs:integer

Counts the number of values that are not empty. Empty values are the one with string value "".

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of values of any type, any length

Returns

  • xs:integer

    The count of non-empty values

Examples

See also

http://office.microsoft.com/en-us/excel/HP052090271033.aspx

countblank#1

declare function excel:countblank(
    $cells as xs:anyAtomicType*
) as xs:integer

Counts the empty values in a sequence. The empty values are the ones with string value "". The value 0 is not counted.

Parameters

  • $cells as xs:anyAtomicType
    the sequence of values, of any length

Returns

  • xs:integer

    The count

Examples

See also

http://office.microsoft.com/en-us/excel/HP052090281033.aspx

large#2

declare function excel:large(
    $numbers as xs:anyAtomicType*,
    $k as xs:integer
) as xs:anyAtomicType

Returns the k-th largest value in a data set. If n is the number of data points in a range, then LARGE(array,1) returns the largest value, and LARGE(array,n) returns the smallest value.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
  • $k as xs:integer
    the position of largest value, with value from 1 to count of values

Returns

  • xs:anyAtomicType

    The k-th largest value as numeric type

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type
  • excel-err:Num if the sequence is empty
  • excel-err:Num if k is not a value between 1 and the sequence size

Examples

See also

http://office.microsoft.com/en-us/excel/HP052091511033.aspx

max#1

declare function excel:max(
    $numbers as xs:anyAtomicType*
) as xs:anyAtomicType

Returns the largest number in a sequence.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric. The sequence can be of any length.

Returns

  • xs:anyAtomicType

    The max

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052091701033.aspx

maxa#1

declare function excel:maxa(
    $numbers as xs:anyAtomicType*
) as xs:anyAtomicType

Returns the largest value in a list of arguments. In this implementation there is no difference between MAX and MAXA.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric The sequence can be of any length.

Returns

  • xs:anyAtomicType

    The max

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052091711033.aspx

median#1

declare function excel:median(
    $numbers as xs:anyAtomicType*
) as xs:anyAtomicType

Returns the median of the given numbers. The median is the number in the middle of a set of numbers. Half the numbers have values that are greater than the median, and half the numbers have values that are less than the median.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers, of any length

Returns

  • xs:anyAtomicType

    for odd count of numbers return the number in the middle of the sorted sequence. For even count of numbers return the average of the two numbers in the middle.

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052091741033.aspx

min#1

declare function excel:min(
    $numbers as xs:anyAtomicType*
) as xs:anyAtomicType

Returns the smallest number in a sequence.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric. The sequence can be of any length.

Returns

  • xs:anyAtomicType

    The min

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052091761033.aspx

mina#1

declare function excel:mina(
    $numbers as xs:anyAtomicType*
) as xs:anyAtomicType

Returns the smallest value in a list of arguments. In this implementation there is no difference between MAX and MAXA.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric The sequence can be of any length.

Returns

  • xs:anyAtomicType

    The min

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052091771033.aspx

mode#1

declare function excel:mode(
    $numbers as xs:anyAtomicType*
) as xs:anyAtomicType

Returns the most frequently occurring, or repetitive, value in a sequence. Arguments must be castable to numeric.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers, of any length

Returns

  • xs:anyAtomicType

    The most occuring number

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type
  • fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if there are no duplicate numbers

Examples

See also

http://office.microsoft.com/en-us/excel/HP052091831033.aspx

percentile#2

declare function excel:percentile(
    $numbers as xs:anyAtomicType*,
    $k_at as xs:anyAtomicType
) as xs:anyAtomicType

Returns the k-th percentile of values in a sequence. If k is not a multiple of 1/(n - 1), PERCENTILE interpolates to determine the value at the k-th percentile. The function is computed by (max-min)*k + min

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers, of any length
  • $k_at as xs:anyAtomicType
    the percentile, with value between 0 .. 1 inclusive

Returns

  • xs:anyAtomicType

    The computed percentile

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type
  • excel-err:Num if percentile is not between 0 .. 1

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092111033.aspx

percentrank#2

declare function excel:percentrank(
    $numbers as xs:anyAtomicType*,
    $x as xs:anyAtomicType
) as xs:decimal

Returns the rank of a value in a data set as a percentage of the data set. If x does not match one of the values in array, PERCENTRANK interpolates to return the correct percentage rank.
The formula is uses: (RANK - 1) / (size - 1) .

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numbers. The sequence can be of any length, from 1 up.
  • $x as xs:anyAtomicType
    is the value for which you want to know the rank

Returns

  • xs:decimal

    The percentage of rank.

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type
  • excel-err:Num if the sequence is zero length

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092121033.aspx

prob#3

declare function excel:prob(
    $x_range as xs:anyAtomicType+,
    $prob_range as xs:anyAtomicType+,
    $range_lower_limit as xs:anyAtomicType
) as xs:anyAtomicType

This is the same as above, only that upper_limit is not specified. The probability is computed only for range_lower_limit.

Parameters

  • $x_range as xs:anyAtomicType
    is the range of numeric values of x with which there are associated probabilities. This does not need to be ordered.
  • $prob_range as xs:anyAtomicType
    is a set of probabilities associated with values in x_range.
  • $range_lower_limit as xs:anyAtomicType
    is the value for which you want a probability.

Returns

  • xs:anyAtomicType

    The probability of the range_lower_limit value

Errors

  • excel-err:Num if any probability is not between 0 and 1
  • excel-err:Num if the sum of probabilities is not equal to 1
  • excel-err:Value if any parameter is not castable to numeric
  • excel-err:Num if x_range and prob_range do not have the same number of values

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092221033.aspx

prob#4

declare function excel:prob(
    $x_range as xs:anyAtomicType+,
    $prob_range as xs:anyAtomicType+,
    $range_lower_limit as xs:anyAtomicType,
    $upper_limit as xs:anyAtomicType
) as xs:anyAtomicType

Returns the probability that values in a range are between two limits.

Parameters

  • $x_range as xs:anyAtomicType
    is the range of numeric values of x with which there are associated probabilities. This does not need to be ordered.
  • $prob_range as xs:anyAtomicType
    is a set of probabilities associated with values in x_range.
  • $range_lower_limit as xs:anyAtomicType
    is the lower bound on the value for which you want a probability.
  • $upper_limit as xs:anyAtomicType
    is the upper bound on the value for which you want a probability.

Returns

  • xs:anyAtomicType

    The probability of the entire range

Errors

  • excel-err:Num if any probability is not between 0 and 1
  • excel-err:Num if the sum of probabilities is not equal to 1
  • excel-err:Value if any parameter is not castable to numeric
  • excel-err:Num if x_range and prob_range do not have the same number of values

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092221033.aspx

quartile#2

declare function excel:quartile(
    $numbers as xs:anyAtomicType*,
    $quart as xs:integer
) as xs:anyAtomicType

Returns the quartile of a data set.

Parameters

  • $numbers as xs:anyAtomicType
    sequence of numbers or values castable to numbers. The sequence can be of any length, from 1 up.
  • $quart as xs:integer
    one of the values 0, 1, 2, 3, 4 with meaning:
    0
    compute minimum value
    1
    compute first quartile (25th percentile)
    2
    compute median value (50th percentile)
    3
    compute third quartile (75th percentile)
    4
    compute maximum value

Returns

  • xs:anyAtomicType

    the computed quartile, as numeric type

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type
  • excel-err:Num if the sequence is zero length
  • excel-err:Num if $quart is not one of the values 0, 1, 2, 3, 4

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092261033.aspx

rank#2

declare function excel:rank(
    $x as xs:anyAtomicType,
    $numbers as xs:anyAtomicType*
) as xs:decimal

This RANK function is same as the above, only that $order_ascending is set by default to false.

Parameters

  • $x as xs:anyAtomicType
    The number whose rank you want to find.
  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numbers. The sequence can be of any length.

Returns

  • xs:decimal

    The rank of $x.

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type.

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092311033.aspx

rank#3

declare function excel:rank(
    $x as xs:anyAtomicType,
    $numbers as xs:anyAtomicType*,
    $order_ascending as xs:boolean
) as xs:decimal

Returns the rank of a number in a list of numbers. The rank of a number is its size relative to other values in a list. (If you were to sort the list, the rank of the number would be its position.) RANK gives duplicate numbers the same rank.

Parameters

  • $x as xs:anyAtomicType
    The number whose rank you want to find.
  • $numbers as xs:anyAtomicType
    The sequence of numbers or values castable to numbers. The sequence can be of any length.
  • $order_ascending as xs:boolean
    A boolean having the meaning:
    false
    then rank the number as if the sequence was sorted in descending order.
    true
    then rank the number as if the sequence was sorted in ascending order.

Returns

  • xs:decimal

    The rank of $x.

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type.

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092311033.aspx

slope#2

declare function excel:slope(
    $known_y as xs:anyAtomicType+,
    $known_x as xs:anyAtomicType+
) as xs:anyAtomicType

Returns the slope of the linear regression line through data points in known_y's and known_x's. The slope is the vertical distance divided by the horizontal distance between any two points on the line, which is the rate of change along the regression line. It computes the formula:
sum((x - average_x)(y - average_y)) / sum((x - average_x)^2)
where average_x and average_y are computed with AVERAGE function.

Parameters

  • $known_y as xs:anyAtomicType
    the sequence of y numbers. The sequence can be of any length, from 1 up.
  • $known_x as xs:anyAtomicType
    the sequence of x numbers. The sequence can be of any length, from 1 up.

Returns

  • xs:anyAtomicType

    The slope value, as numeric type

Errors

  • excel-err:Value if any parameter cannot be casted to numeric
  • fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if there are different numbers of x's and y's
  • fn:QName("http://www.zorba-xquery.com/modules/excel/errors", "excel-err:NA") if any sequence is empty
  • excel-err:Div0 if all x's are equal

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092641033.aspx

small#2

declare function excel:small(
    $numbers as xs:anyAtomicType*,
    $k as xs:integer
) as xs:anyAtomicType

This function computes the k-th smallest value in a data set. Use this function to return values with a particular relative standing in a data set. If n is the number of data points in array, SMALL(array,1) equals the smallest value, and SMALL(array,n) equals the largest value.

Parameters

  • $numbers as xs:anyAtomicType
    A sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.
  • $k as xs:integer
    The position (from the smallest) in the sequence of data to return. Must have value between 1 and size of sequence.

Returns

  • xs:anyAtomicType

    The k-th smallest value of $numbers.

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type.
  • excel-err:Num if the sequence is zero length.
  • excel-err:Num if $k is not a value between 1 and the size of sequence.

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092661033.aspx

standardize#3

declare function excel:standardize(
    $x as xs:anyAtomicType,
    $mean as xs:anyAtomicType,
    $standard_dev as xs:anyAtomicType
) as xs:double

Returns a normalized value from a distribution characterized by mean and standard_dev.
The formula is (x - mean) / standard_dev .

Parameters

  • $x as xs:anyAtomicType
    is the value you want to normalize
  • $mean as xs:anyAtomicType
    is the arithmetic mean of the distribution.
  • $standard_dev as xs:anyAtomicType
    is the standard deviation of the distribution.

Returns

  • xs:double

    The normalized x, as numeric type

Errors

  • excel-err:Value if any parameter cannot be casted to numeric
  • excel-err:Num if standard_dev is a value smaller than zero or equal

Examples

See also

http://office.microsoft.com/en-us/excel/HP052092731033.aspx

var#1

declare function excel:var(
    $numbers as xs:anyAtomicType+
) as xs:anyAtomicType

Estimates variance based on a sample.
The formula is sum(x - average_x)^2 / (n - 1).
average_x is computed with AVERAGE function.
n is the count of numbers from the sequence, excluding empty values.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.

Returns

  • xs:anyAtomicType

    The variance, as numeric type

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052093301033.aspx

vara#1

declare function excel:vara(
    $numbers as xs:anyAtomicType+
) as xs:anyAtomicType

Estimates variance based on a sample.
The formula is sum(x - average_x)^2 / (n - 1).
average_x is computed with AVERAGE function.
n is the size of sequence, including empty values.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.

Returns

  • xs:anyAtomicType

    The variance, as numeric type

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052093311033.aspx

varp#1

declare function excel:varp(
    $numbers as xs:anyAtomicType+
) as xs:anyAtomicType

Calculates variance based on the entire population.
The formula is sum(x - average_x)^2 / n.
average_x is computed with AVERAGE function.
n is the count of numbers from the sequence, excluding empty values.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.

Returns

  • xs:anyAtomicType

    The variance, as numeric type

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052093321033.aspx

varpa#1

declare function excel:varpa(
    $numbers as xs:anyAtomicType+
) as xs:anyAtomicType

Calculates variance based on the entire population.
The formula is sum(x - average_x)^2 / n.
average_x is computed with AVERAGE function.
n is the size of sequence, including empty values.

Parameters

  • $numbers as xs:anyAtomicType
    the sequence of numbers or values castable to numeric. The sequence can be of any length, from 1 up.

Returns

  • xs:anyAtomicType

    The variance, as numeric type

Errors

  • excel-err:Value if the parameters cannot be casted to numeric type

Examples

See also

http://office.microsoft.com/en-us/excel/HP052093321033.aspx

blog comments powered by Disqus