Difference between revisions of "Functions for descriptive statistics"

From Free Pascal wiki

Revision as of 19:22, 13 March 2017

Descriptive statistics aim at characterising empirical data by summary parameters (and also by tables and plots).

Standard functions defined in math unit

The unit math of the RTL provides a plethora of routines for descriptive statistics.

  • mean: Returns the mean value of an array.
  • stddev: Returns the (sample) standard deviation of an array.
  • popnstddev: Returns the (population) standard deviation of an array.
  • meanandstddev: Returns mean and standard deviation of an array.
  • momentskewkurtosis: Returns the first four moments, the skewness and the kurtosis of an array.
  • variance: Returns the (sample) variance of an array.
  • popnvariance: Returns the (population) variance of an array.
  • totalvariance: Returns the total variance of an array.
  • sumofsquares: Returns the sum of squares of an array.
  • sum: Returns the sum of values of an array.
  • sumsandsquares: Returns sum and sum of squares of the values in an array.

These functions accept either a static array of predefined length (e.g. array[1..100] of float) or a 0-based dynamic array (e.g. array of extended) whose length has been set beforehand with the SetLength procedure.
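
As a minimal sketch of both calling styles, the following program passes a static array and a dynamic array to a few of the routines listed above. The program name, the variable names and the sample values 1..5 are illustrative assumptions and not part of the RTL.

```pascal
program MathStatsDemo;

{$mode objfpc}

uses
  Math;

var
  fixedData: array[1..5] of Float;   // static array of predefined length
  dynData: array of Extended;        // dynamic array, sized with SetLength
  i: Integer;
begin
  // Fill the static array with the illustrative values 1..5.
  for i := 1 to 5 do
    fixedData[i] := i;

  // A dynamic array is 0-based and must be sized before use.
  SetLength(dynData, 5);
  for i := 0 to High(dynData) do
    dynData[i] := i + 1;

  WriteLn('mean:       ', Mean(fixedData):0:4);      // 3.0000
  WriteLn('stddev:     ', StdDev(fixedData):0:4);    // 1.5811 (sample, n - 1)
  WriteLn('popnstddev: ', PopnStdDev(dynData):0:4);  // 1.4142 (population, n)
  WriteLn('sum:        ', Sum(dynData):0:4);         // 15.0000
end.
```

Both arrays hold the same values, so the difference between the sample and the population standard deviation comes only from the divisor (n - 1 versus n).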

Standard functions defined in other units

  • length: Returns the number of elements (n) of an array (defined in the system unit).

Custom functions

Some functions aren't defined in the RTL. The subsequent section lists source code of some commonly used measures for centrality and dispersion. Unless otherwise specified, the code is provided under a BSD license.

Expanded and thoroughly tested versions of these functions, together with other useful statistical code, are also available in the QUANTUM SALIS project (http://quantum-salis.sf.net/).

Standard error of the mean

The standard error of the mean (SEM) is a measure that estimates how precisely the true mean of the population can be known.

Calculation of SEM is simple:

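  function sem(const data: array of Extended): real;
  { calculates the standard error of the mean (SEM) of a vector of extended;
    requires the math unit for stddev }
  begin
    sem := stddev(data) / sqrt(length(data));
  end;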
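
To show how the function might be called, here is a small self-contained sketch. The program name, the constant sample and its eight values are made up for illustration. The function body follows the definition above, i.e. the sample standard deviation divided by the square root of n.

```pascal
program SemDemo;

{$mode objfpc}

uses
  Math;

const
  // Hypothetical sample data, chosen only for illustration.
  sample: array[0..7] of Extended = (2, 4, 4, 4, 5, 5, 7, 9);

function sem(const data: array of Extended): real;
{ standard error of the mean: sample standard deviation / sqrt(n) }
begin
  sem := stddev(data) / sqrt(length(data));
end;

begin
  WriteLn('n:    ', Length(sample));
  WriteLn('mean: ', Mean(sample):0:4);   // 5.0000
  WriteLn('SEM:  ', sem(sample):0:4);    // 0.7559
end.
```

The sketch does not guard against an empty array, for which length(data) is 0 and the division is undefined. A production version would probably check for that case first.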

Coefficient of variation

The coefficient of variation (CoV or CV), also known as relative standard deviation (RSD), is a measure of dispersion that is standardised with respect to the data's mean.

It can be calculated with:

  function cv(const data: array of Extended): real;
  { calculates the coefficient of variation (CV or CoV) of a vector of extended;
    requires the math unit for stddev and mean }
  begin
    cv := stddev(data) / mean(data);
  end;
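
A possible usage sketch follows. It declares the parameter as an open array of Extended so that the program compiles on its own; this is an assumption, as are the program name and the illustrative sample values.

```pascal
program CvDemo;

{$mode objfpc}

uses
  Math;

const
  // Hypothetical sample data, chosen only for illustration.
  sample: array[0..7] of Extended = (2, 4, 4, 4, 5, 5, 7, 9);

function cv(const data: array of Extended): real;
{ coefficient of variation: sample standard deviation / mean }
begin
  Result := stddev(data) / mean(data);
end;

begin
  WriteLn('CV:     ', cv(sample):0:4);        // 0.4276
  WriteLn('CV (%): ', (100 * cv(sample)):0:2); // 42.76
end.
```

Because the mean is the divisor, the result is undefined for a mean of zero and becomes unstable when the mean is close to zero, so the measure is mainly meaningful for data on a ratio scale with a clearly positive mean.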