UAHDataScienceSF

The UAHDataScienceSF package provides statistical functions that can be used in three different ways:

As calculation functions that simply return the result.
As explanatory functions that show the calculation process step by step.
As interactive functions that allow users to practice calculations with feedback.

These three modes are integrated into each function through the learn and interactive parameters. When both are FALSE (the default), the function performs a simple calculation. When learn = TRUE, the function shows a detailed step-by-step explanation. When interactive = TRUE, the function enters interactive mode.

Usage Examples:

To demonstrate the use of the functions, we will work with the following datasets:

data <- c(1,1,2,3,4,7,8,8,8,10,10,11,12,15,20,22,25)
plot(data)

data2 <- c(1,1,4,5,5,5,7,8,10,10,10,11,20,22,22,24,25)
plot(data2)


#Binomial variables
n <- 3
x <- 2
p <- 0.7

#Poisson variables
lam <- 2
k <- 3

#Normal variables
nor <- 0.1

#T-Student variables
xt <- 290
ut <- 310
st <- 50
nt <- 16

The arithmetic mean calculation function:

# Simple calculation
mean_(data)
#> [1] 9.823529

# Learning mode with step-by-step explanation
mean_(data, learn = TRUE)
#> 
#> __MEAN CALCULUS__
#> 
#> The mean of a dataset is calculated by the sum of the values divided by the number of values.
#> 
#> Formula -> (x1 + x2 +..+xn) / num_elements
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> Now we need to add each element of the vector/dataset
#> The sum of the elements is: 167
#> 
#> Next step, get the number of elements that we've examined
#> 
#> The length of the vector is 17 elements
#> 
#> Formula applied -> 167/17 = 9.82352941176471
#> [1] 9.823529

# Interactive mode would be called like this (cannot be run in a vignette):
# mean_(interactive = TRUE)
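
As a cross-check, this matches base R's built-in mean(), which uses the same sum-divided-by-n formula:

# Cross-check with base R
mean(data)
#> [1] 9.823529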

The geometric mean calculation function:

# Simple calculation
geometric_mean(data)
#> [1] 6.911414

# Learning mode with step-by-step explanation
geometric_mean(data, learn = TRUE)
#> 
#> __GEOMETRIC MEAN CALCULUS__
#> 
#> The geometric mean of a dataset is calculated by multiplying each element of the dataset and raising the result to 1 divided by the number of elements in the dataset (the nth root).
#> We'll give the user an example for better comprehension.
#> 
#> Formula -> (x1 * x2 *..* xn)^( 1 / num_elements)
#> xn: value of the nth element of the dataset
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> Now we need to multiply each element of the vector/dataset
#> The product of the elements is: 1.87342848e+14
#> 
#> Next step, get the number of elements that we've examined
#> 
#> The length of the vector is 17 elements
#> 
#> Formula applied -> (1.87342848e+14) ^ ( 1 /17) = 6.91141369632174
#> 
#> Now try it on your own! :D
#> 
#> Use the geometric_mean(interactive = TRUE) function to practice.
#> [1] 6.911414

# Interactive mode would be called like this (cannot be run in a vignette):
# geometric_mean(interactive = TRUE)
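
Base R has no geometric mean function, but the same value can be reproduced with logarithms, which avoids the very large intermediate product:

# Geometric mean as the exponential of the mean of the logs
exp(mean(log(data)))
#> [1] 6.911414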

The mode calculation function:

# Simple calculation
mode_(data)
#> Factor 8 appears 3 times in the vector.
#> Unique mode
#> [1] 8

# Learning mode with step-by-step explanation
mode_(data, learn = TRUE)
#> 
#> __MODE CALCULUS__
#> 
#> The mode of a dataset is calculated by looking for the most repeated value in the dataset. If in a group there are two or several scores with the same frequency and that frequency is the maximum, the distribution is bimodal or multimodal, that is, it has several modes.
#> 
#> Formula -> Most repeated value of [Data]
#> 
#> __Use Example__
#> 
#> First step : search the most repeated value
#> 
#> The content of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> Factor 8 appears 3 times in the vector.
#> 
#> Second step : check the dataset looking for a value with the same maximum frequency
#> 
#> If there is only one value with the maximum frequency, it is the mode.
#> If two values share the maximum frequency, each of them is a mode: a bimodal dataset.
#> If more than two values share the maximum frequency, it is a multimodal dataset.
#> 
#> Now try it on your own! :D
#> 
#> Use the mode_(interactive = TRUE) function to practice.
#> [1] 8

# Interactive mode would be called like this (cannot be run in a vignette):
# mode_(interactive = TRUE)
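
Base R has no built-in mode function, but the frequency count above can be reproduced with table(). Note that which.max() returns only the first maximum, so a bimodal or multimodal dataset would need names(freqs)[freqs == max(freqs)] instead:

# Frequency table of the data; the mode is the most frequent value
freqs <- table(data)
as.numeric(names(which.max(freqs)))
#> [1] 8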

The median calculation function:

# Simple calculation
median_(data)
#> 
#> Sorted vector:
#> 1 1 2 3 4 7 8 8 8 10 10 11 12 15 20 22 25
#> 
#> [1] 8

# Learning mode with step-by-step explanation
median_(data, learn = TRUE)
#> 
#> __MEDIAN CALCULUS__
#> 
#> The median of a dataset is the value in the middle of the sorted data. It's important to know that the data must be sorted. If the dataset has an even number of elements, we take the two middle values, add them together, and divide by two. If the dataset has an odd number of elements, we take the single middle value.
#> 
#> Formula -> 1/2(n+1) where n -> vector size
#> 
#> __Use Example__
#> 
#> First step : identify if the vector has a pair number of elements
#> 
#> The content of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Second step: proceed depending on the number of elements
#> 
#> It has an ODD number of elements (17)
#> 
#> We take the element at position 'n/2', rounding up
#> 
#> The result is : 8
#> 
#> Now try it on your own! :D
#> 
#> Use the median_(interactive = TRUE) function to practice.
#> [1] 8

# Interactive mode would be called like this (cannot be run in a vignette):
# median_(interactive = TRUE)
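
As a cross-check, base R's median() agrees for this odd-length vector:

# Cross-check with base R
median(data)
#> [1] 8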

The standard deviation calculation function:

# Simple calculation
standard_deviation(data)
#> [1] 6.989364

# Learning mode with step-by-step explanation
standard_deviation(data, learn = TRUE)
#> 
#> __STANDARD DEVIATION CALCULUS__
#> 
#> The standard deviation of a dataset is calculated by summing the squared differences between each element and the mean of the dataset, dividing this sum by the number of elements, and finally taking the square root of the result. We'll give the user an example for better comprehension.
#> 
#> Formula ->  square_root ((Summation(each_element - Mean)^2) / num_elements)
#> 
#> Mean -> (x1 + x2 +..+xn) / n
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> The mean of dataset is...9.82352941176471
#> 
#> The square of the difference between each number and the mean of the dataset is:
#> 77.8546712802768,
#> 77.8546712802768,
#> 61.2076124567474,
#> 46.560553633218,
#> 33.9134948096886,
#> 7.97231833910035,
#> 3.32525951557094,
#> 3.32525951557094,
#> 3.32525951557094,
#> 0.0311418685121105,
#> 0.0311418685121105,
#> 1.3840830449827,
#> 4.73702422145328,
#> 26.795847750865,
#> 103.560553633218,
#> 148.266435986159,
#> 230.325259515571
#> 
#> Now we need to add each element of the vector/dataset
#> The sum of the squares is: 830.470588235294
#> 
#> Next step, get the number of elements that we've examined
#> 
#> The length of the vector is 17 elements
#> 
#> Formula applied -> (830.470588235294/17) ^ (1/2) = 6.98936413936664
#> 
#> Now try it on your own! :D
#> 
#> Use the standard_deviation(interactive = TRUE) function to practice.
#> [1] 6.989364

# Interactive mode would be called like this (cannot be run in a vignette):
# standard_deviation(interactive = TRUE)
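
Note that standard_deviation() divides by n (the population formula), while base R's sd() divides by n - 1 (the sample formula); rescaling one gives the other:

n <- length(data)
# sd() divides by n - 1; rescale it to the population formula used above
sd(data) * sqrt((n - 1)/n)
#> [1] 6.989364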

The average absolute deviation calculation function:

# Simple calculation
average_deviation(data)
#> [1] 5.460208

# Learning mode with step-by-step explanation
average_deviation(data, learn = TRUE)
#> 
#> __AVERAGE DEVIATION CALCULUS__
#> 
#> The average deviation of a dataset is calculated by summing the absolute differences between each element and the mean of the dataset, then dividing this sum by the number of elements. We'll give the user an example for better comprehension.
#> 
#> Formula ->  (Summation(abs(each_element - mean))) / num_elements
#> 
#> Mean -> (x1 + x2 +..+xn) / num_elements
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> The mean of dataset is...9.82352941176471
#> 
#> The absolute value of the difference between each number and the mean of the dataset is:
#> 8.82352941176471,
#> 8.82352941176471,
#> 7.82352941176471,
#> 6.82352941176471,
#> 5.82352941176471,
#> 2.82352941176471,
#> 1.82352941176471,
#> 1.82352941176471,
#> 1.82352941176471,
#> 0.176470588235293,
#> 0.176470588235293,
#> 1.17647058823529,
#> 2.17647058823529,
#> 5.17647058823529,
#> 10.1764705882353,
#> 12.1764705882353,
#> 15.1764705882353
#> 
#> Now we need to add each element of the vector/dataset
#> The sum of the absolute differences is: 92.8235294117647
#> 
#> Next step, get the number of elements that we've examined
#> 
#> The length of the vector is 17 elements
#> 
#> Formula applied -> 92.8235294117647/17 = 5.46020761245675
#> 
#> Now try it on your own! :D
#> 
#> Use the average_deviation(interactive = TRUE) function to practice.
#> [1] 5.460208

# Interactive mode would be called like this (cannot be run in a vignette):
# average_deviation(interactive = TRUE)
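
The same calculation is a one-liner in base R:

# Mean of the absolute deviations from the mean
mean(abs(data - mean(data)))
#> [1] 5.460208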

The variance calculation function:

# Simple calculation
variance(data)
#> [1] 51.90441

# Learning mode with step-by-step explanation
variance(data, learn = TRUE)
#> 
#> __VARIANCE CALCULUS__
#> 
#> The variance of a dataset is calculated by summing the squared differences between each element and the mean of the dataset, then dividing this sum by the number of elements minus one (the sample variance). We'll give the user an example for better comprehension.
#> 
#> Formula ->  (Summation(each_element - Mean)^2) / (num_elements - 1)
#> 
#> Mean -> (x1 + x2 +..+xn) / n
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> The mean of dataset is...9.82352941176471
#> 
#> The square of the difference between each number and the mean of the dataset is:
#> 77.8546712802768,
#> 77.8546712802768,
#> 61.2076124567474,
#> 46.560553633218,
#> 33.9134948096886,
#> 7.97231833910035,
#> 3.32525951557094,
#> 3.32525951557094,
#> 3.32525951557094,
#> 0.0311418685121105,
#> 0.0311418685121105,
#> 1.3840830449827,
#> 4.73702422145328,
#> 26.795847750865,
#> 103.560553633218,
#> 148.266435986159,
#> 230.325259515571
#> 
#> Now we need to add each element of the vector/dataset
#> The sum of the squares is: 830.470588235294
#> 
#> Next step, get the number of elements that we've examined
#> 
#> The length of the vector is 17 elements
#> 
#> Formula applied -> 830.470588235294/16 = 51.9044117647059
#> 
#> Now try it on your own! :D
#> 
#> Use the variance(interactive = TRUE) function to practice.
#> [1] 51.90441

# Interactive mode would be called like this (cannot be run in a vignette):
# variance(interactive = TRUE)
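
The formula applied above divides by n - 1 (the sample variance), which is exactly what base R's var() computes:

# Cross-check with base R
var(data)
#> [1] 51.90441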

The quartiles calculation function:

# Simple calculation
quartile(data)
#> 
#> Sorted vector:
#> 1 1 2 3 4 7 8 8 8
#> 
#> 
#> Sorted vector:
#> 1 1 2 3 4 7 8 8 8 10 10 11 12 15 20 22 25
#> 
#> 
#> Sorted vector:
#> 8 10 10 11 12 15 20 22 25
#> 
#> Q0 Q1 Q2 Q3 Q4 
#>  1  4  8 12 25

# Learning mode with step-by-step explanation
quartile(data, learn = TRUE)
#> 
#> __QUARTILES CALCULUS__
#> 
#> The quartile divides the dataset in 4 parts as equal as possible.
#> 
#> Formula -> First quartile (Q1) as the median of the first half of values.
#>              Second quartile (Q2) as the median of the series itself.
#>              Third quartile (Q3) as the median of the second half of values.
#> 
#> __Use Example__
#> 
#> Step 1: The vector must be sorted.
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Step 2: Calculate the quartiles
#> 
#> Sorted vector:
#> 1 1 2 3 4 7 8 8 8
#> 
#> 
#> Q1 -> (median 1 1 2 3 4 7 8 8 8)  = 4
#> 
#> Sorted vector:
#> 1 1 2 3 4 7 8 8 8 10 10 11 12 15 20 22 25
#> 
#> 
#> Q2 -> (median 1 1 2 3 4 7 8 8 8 10 10 11 12 15 20 22 25)  = 8
#> 
#> Sorted vector:
#> 8 10 10 11 12 15 20 22 25
#> 
#> 
#> Q3 -> (median 8 10 10 11 12 15 20 22 25)  = 12
#> 
#> 
#> Visualization with colors:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7 ,
#> 8 ,
#> 8 ,
#> 8 ,
#> 10 ,
#> 10 ,
#> 11 ,
#> 12 ,
#> 15 ,
#> 20 ,
#> 22 ,
#> 25
#> 
#> Q1 -> 4
#>  || Q2 ->  8
#>  || Q3 ->  12
#>  || Q4 -> onwards
#> 
#> Now try it on your own! :D
#> 
#> Use the quartile(interactive = TRUE) function to practice.
#> Q1 Q2 Q3 
#>  4  8 12

# Interactive mode would be called like this (cannot be run in a vignette):
# quartile(interactive = TRUE)
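
For this dataset, the quartiles computed as medians of the halves coincide with base R's type 1 (inverse ECDF) quantiles:

# Cross-check with base R
quantile(data, probs = c(0.25, 0.5, 0.75), type = 1)
#> 25% 50% 75% 
#>   4   8  12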

The percentile calculation function:

# Simple calculation
percentile(data, 0.3)
#> Percentile 30% = 7
#> [1] 7

# Learning mode with step-by-step explanation
percentile(data, 0.3, learn = TRUE)
#> 
#> __PERCENTILES CALCULUS__
#> 
#> The percentile divides the dataset into 100 parts.
#> The percentile indicates, once the data is ordered from least to greatest, the value of the variable below which a given percentage of the data is located
#> 
#> Formula x -> (k * N) / 100 where k -> [1-100] and N -> vector size
#> 
#> If the remainder of x is different from 0, the percentile is the element at the position obtained by rounding the quotient up.
#> 
#> Otherwise, when the remainder is 0, it is the average of the element at that position and the following one, except for the 100% percentile, which is the last element.
#> 
#> __Use Example__
#> 
#> Step 1: The vector must be sorted.
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Step 2: Apply the formula (k * N) / 100 where 'k' is [1-100]
#> 
#> We will calculate the percentiles 1,25,37,50,92 in this example
#> 
#> Percentile 1 -> (1 * 17) / 100 = 0.17
#>  .Round up the value to locate it in the vector -> 0.17 ~ 1
#>  ..In our data, the value is =
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Percentile 25 -> (25 * 17) / 100 = 4.25
#>  .Round up the value to locate it in the vector -> 4.25 ~ 5
#>  ..In our data, the value is =
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Percentile 37 -> (37 * 17) / 100 = 6.29
#>  .Round up the value to locate it in the vector -> 6.29 ~ 7
#>  ..In our data, the value is =
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Percentile 50 -> (50 * 17) / 100 = 8.5
#>  .Round up the value to locate it in the vector -> 8.5 ~ 9
#>  ..In our data, the value is =
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Percentile 92 -> (92 * 17) / 100 = 15.64
#>  .Round up the value to locate it in the vector -> 15.64 ~ 16
#>  ..In our data, the value is =
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Now try it on your own! :D
#> 
#> Use the percentile(interactive = TRUE) function to practice.
#> [1] 7

# Interactive mode would be called like this (cannot be run in a vignette):
# percentile(interactive = TRUE)
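
The rounding-up rule above corresponds to base R's type 1 (inverse ECDF) quantile:

# ceiling(0.3 * 17) = 6, so the 30th percentile is the 6th sorted element
quantile(data, probs = 0.3, type = 1)
#> 30% 
#>   7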

The absolute frequency calculation function:

# Simple calculation
absolute_frequency(data, 1)
#> [1] 2

# Learning mode with step-by-step explanation
absolute_frequency(data, 1, learn = TRUE)
#> 
#> __ABSOLUTE FREQUENCY CALCULUS__
#> 
#> The absolute frequency (Ni) of a value Xi is the number of times the value is in the set (X1, X2, ..., XN)
#> 
#> Formula -> Ni(X) = number of times that the element 'X' appears in (X1, X2, ..., XN) (Where 'X' is the element we want to examine)
#> 
#> __Use Example__
#> 
#> All we need to do is count the number of times that the element 1 appears in our data set
#> 
#> Our data set:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Now count the number of times that the element 1 appears: 2
#> 1 ,
#> 1 ,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> Now try it on your own! :D
#> 
#> Use the absolute_frequency(interactive = TRUE) function to practice.
#> [1] 2

# Interactive mode would be called like this (cannot be run in a vignette):
# absolute_frequency(interactive = TRUE)

The relative frequency calculation function:

# Simple calculation
relative_frequency(data, 20)
#> [1] 0.05882353

# Learning mode with step-by-step explanation
relative_frequency(data, 20, learn = TRUE)
#> 
#> __RELATIVE FREQUENCY CALCULUS__
#> 
#> The relative frequency is the quotient between the absolute frequency of a certain value and the total number of data
#> 
#> Formula -> (Abs_frec(X) / N ) -> Where 'X' is the element we want to examine
#> 
#> __Use Example__
#> 
#> Step 1: count the number of times that the element 20 appears in our data set
#> 
#> Our data set:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Now count the number of times that the element 20 appears: 1
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20 ,
#> 22,
#> 25
#> 
#> Step 2: divide it by the length of the data set
#> 
#> Solution --> relative_frequency = (absolute_frequency(x) / length(data)) = 1 / 17 = 0.0588235294117647.
#> 
#> Now try it on your own! :D
#> 
#> Use the relative_frequency(interactive = TRUE) function to practice.
#> [1] 0.05882353

# Interactive mode would be called like this (cannot be run in a vignette):
# relative_frequency(interactive = TRUE)

The absolute accumulated frequency calculation function:

# Simple calculation
absolute_acum_frequency(data, 1)
#> [1] 2

# Learning mode with step-by-step explanation
absolute_acum_frequency(data, 1, learn = TRUE)
#> 
#> __ABSOLUTE ACCUMULATED FREQUENCY CALCULUS__
#> 
#> The absolute accumulated frequency is the sum of the absolute frequencies of the values less than or equal to the value we want to examine
#> 
#> Formula -> Summation(abs_frequency <= X) -> Where 'X' is the element we want to examine
#> 
#> __Use Example__
#> 
#> Step 1: count the elements less than or equal to 1 in our data set
#> 
#> Our data set:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Number of elements less than or equal to 1 = 2
#> 1 ,
#> 1 ,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> Solution --> absolute_acum_frequency = Summation(abs_frequency <= X)  = 2.
#> 
#> Now try it on your own! :D
#> 
#> Use the absolute_acum_frequency(interactive = TRUE) function to practice.
#> [1] 2

# Interactive mode would be called like this (cannot be run in a vignette):
# absolute_acum_frequency(interactive = TRUE)

The relative accumulated frequency calculation function:

# Simple calculation
relative_acum_frequency(data, 20)
#> [1] 0.8823529

# Learning mode with step-by-step explanation
relative_acum_frequency(data, 20, learn = TRUE)
#> 
#> __RELATIVE ACCUMULATED FREQUENCY CALCULUS__
#> 
#> The relative accumulated frequency is the quotient between the sum of the absolute frequencies of the values less than or equal to the value we want to examine, and the total number of data
#> 
#> Formula -> (Summation(abs_frequency <= X) / N) -> Where 'X' is the element we want to examine
#> 
#> __Use Example__
#> 
#> Step 1: count the elements less than or equal to 20 in our data set
#> 
#> Our data set:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> 
#> Number of elements less than or equal to 20 = 15
#> 1 ,
#> 1 ,
#> 2 ,
#> 3 ,
#> 4 ,
#> 7 ,
#> 8 ,
#> 8 ,
#> 8 ,
#> 10 ,
#> 10 ,
#> 11 ,
#> 12 ,
#> 15 ,
#> 20 ,
#> 22,
#> 25
#> 
#> Step 2: divide it by the length of the data set
#> 
#> Solution --> relative_acum_frequency = (Summation(abs_frequency <= X) / length(data)) = 15 / 17 = 0.882352941176471.
#> 
#> Now try it on your own! :D
#> 
#> Use the relative_acum_frequency(interactive = TRUE) function to practice.
#> [1] 0.8823529

# Interactive mode would be called like this (cannot be run in a vignette):
# relative_acum_frequency(interactive = TRUE)
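
All four frequency functions above can be cross-checked with base R one-liners that count with logical vectors:

sum(data == 1)     # absolute frequency of 1
#> [1] 2
mean(data == 20)   # relative frequency of 20
#> [1] 0.05882353
sum(data <= 1)     # absolute accumulated frequency of 1
#> [1] 2
mean(data <= 20)   # relative accumulated frequency of 20
#> [1] 0.8823529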

The covariance calculation function:

# Simple calculation
covariance(data, data2)
#> [1] 52.79585

# Learning mode with step-by-step explanation
covariance(data, data2, learn = TRUE)
#> 
#> __COVARIANCE CALCULUS__
#> 
#> The covariance of two datasets is calculated by multiplying the differences between each element and its mean, summing these products, and dividing by the number of elements.
#> 
#> Formula ->  Summation((x - mean_x)*(y - mean_y)) / n
#> 
#> Mean -> (x1 + x2 +..+xn) / n
#> 
#> __Use Example__
#> 
#> First of all, we need to know the contents of the datasets/vectors of numbers
#> 
#> The contents of the vectors are:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 1,
#> 1,
#> 4,
#> 5,
#> 5,
#> 5,
#> 7,
#> 8,
#> 10,
#> 10,
#> 10,
#> 11,
#> 20,
#> 22,
#> 22,
#> 24,
#> 25
#> 
#> The mean of x dataset is...9.82352941176471
#> 
#> The mean of y dataset is...11.1764705882353
#> 
#> The products of differences from means:
#> 89.7923875432526,
#> 89.7923875432526,
#> 56.1453287197232,
#> 42.1453287197232,
#> 35.9688581314879,
#> 17.439446366782,
#> 7.6159169550173,
#> 5.7923875432526,
#> 2.14532871972318,
#> -0.207612456747404,
#> -0.207612456747404,
#> -0.207612456747404,
#> 19.2041522491349,
#> 56.0276816608997,
#> 110.145328719723,
#> 156.145328719723,
#> 209.792387543253
#> 
#> Now we need to add all these products
#> The sum of the products is: 897.529411764706
#> 
#> Next step, get the number of elements that we've examined
#> 
#> The length of the vectors is 17 elements
#> 
#> Formula applied -> 897.529411764706/17 = 52.7958477508651
#> 
#> Now try it on your own! :D
#> 
#> Use the covariance(interactive = TRUE) function to practice.
#> [1] 52.79585

# Interactive mode would be called like this (cannot be run in a vignette):
# covariance(interactive = TRUE)
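
Note that covariance() divides by n, while base R's cov() divides by n - 1; rescaling one gives the other:

n <- length(data)
# Rescale the sample covariance to the population formula used above
cov(data, data2) * (n - 1)/n
#> [1] 52.79585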

The harmonic mean calculation function:

# Simple calculation
harmonic_mean(data)
#> 
#> Sorted vector:
#> 1 1 2 3 4 7 8 8 8 10 10 11 12 15 20 22 25
#> 
#> [1] 4.069367

# Learning mode with step-by-step explanation
harmonic_mean(data, learn = TRUE)
#> 
#> __HARMONIC MEAN CALCULUS__
#> 
#> The harmonic mean of a dataset is calculated by dividing the number of values by the sum of the reciprocals of the values. We'll give the user an example for better comprehension.
#> 
#> Formula -> num_elements/ (1/x1 + 1/x2 +..+ 1/xn)
#> xn: value of the nth element of the dataset
#> 
#> __Use Example__
#> 
#> First of all, we need to know the content of the dataset/vector of numbers
#> 
#> The content of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> The invert sum of the elements is: 4.17755411255411
#> 
#> Next step, get the number of elements that we've examined
#> 
#> The length of the vector is 17 elements
#> 
#> Formula applied -> 17/4.17755411255411 = 4.06936679740729
#> 
#> Now try it on your own! :D
#> 
#> Use the harmonic_mean(interactive = TRUE) function to practice.
#> [1] 4.069367

# Interactive mode would be called like this (cannot be run in a vignette):
# harmonic_mean(interactive = TRUE)
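
The same calculation is a one-liner in base R:

# Number of values divided by the sum of the reciprocals
length(data)/sum(1/data)
#> [1] 4.069367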

The Pearson correlation calculation function:

# Simple calculation
pearson(data, data2)
#> [1] 0.9510292

# Learning mode with step-by-step explanation
pearson(data, data2, learn = TRUE)
#> 
#> __PEARSON CORRELATION COEFFICIENT__
#> 
#> Pearson's correlation coefficient is the covariance of the two variables divided by the product of their standard deviations. It has a value between +1 and -1. A value of +1 is total positive linear correlation, 0 is no linear correlation, and -1 is total negative linear correlation.
#> 
#> Formula ->  (covariance(x,y) / (standardDeviation(x) * standardDeviation(y))
#> 
#> __Use Example__
#> 
#> First of all, we need to know the contents of the datasets/vectors of numbers
#> 
#> The contents of the vectors are:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 1,
#> 1,
#> 4,
#> 5,
#> 5,
#> 5,
#> 7,
#> 8,
#> 10,
#> 10,
#> 10,
#> 11,
#> 20,
#> 22,
#> 22,
#> 24,
#> 25
#> The value of covariance: 52.7958477508651
#> The standard deviation of the elements of x is: 6.98936413936664
#> The standard deviation of the elements of y is: 7.94270137864388
#> 
#> Formula applied -> (52.7958477508651 / (6.98936413936664 * 7.94270137864388)) = 0.951029231000006
#> 
#> Now try it on your own! :D
#> 
#> Use the pearson(interactive = TRUE) function to practice.
#> [1] 0.9510292

# Interactive mode would be called like this (cannot be run in a vignette):
# pearson(interactive = TRUE)
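
The n versus n - 1 divisors cancel in the quotient, so base R's cor() gives the same value:

# Cross-check with base R
cor(data, data2)
#> [1] 0.9510292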

The coefficient of variation calculation function:

# Simple calculation
cv(data)
#> [1] 0.7114922

# Learning mode with step-by-step explanation
cv(data, learn = TRUE)
#> 
#> __COEFFICIENT OF VARIATION__
#> 
#> The coefficient of variation (CV) is defined as the ratio of the standard deviation to the mean.
#> 
#> Formula ->  (standardDeviation(x) / mean(x))
#> 
#> __Use Example__
#> 
#> First of all, we need to know the contents of the datasets/vectors of numbers
#> 
#> The contents of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> The standard deviation of the elements of x is: 6.98936413936664
#> The value of mean: 9.82352941176471
#> 
#> Formula applied -> (6.98936413936664 / 9.82352941176471) = 0.711492157899598
#> 
#> Now try it on your own! :D
#> 
#> Use the cv(interactive = TRUE) function to practice.
#> [1] 0.7114922

# Interactive mode would be called like this (cannot be run in a vignette):
# cv(interactive = TRUE)
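
Because the package uses the population standard deviation, the same value follows from base R with a rescaled sd():

n <- length(data)
(sd(data) * sqrt((n - 1)/n))/mean(data)
#> [1] 0.7114922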

The Laplace rule calculation function:

# Simple calculation
laplace(data, data2)
#> [1] 1

# Learning mode with step-by-step explanation
laplace(data, data2, learn = TRUE)
#> 
#> __LAPLACE'S RULE__
#> 
#> Laplace's rule defines the probability of an event A as the quotient between the number of cases favorable to A and the number of all possible results of the experiment.
#> 
#> Formula ->  (Cases favorable to A / All possible results)
#> 
#> __Use Example__
#> 
#> First of all, we need to know the contents of the datasets/vectors of numbers
#> 
#> The contents of the vector is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> Favorable cases: 17
#> All possible results: 17
#> 
#> Formula applied -> (17 / 17) = 1
#> 
#> Now try it on your own! :D
#> 
#> Use the laplace(interactive = TRUE) function to practice.
#> [1] 1

# Interactive mode would be called like this (cannot be run in a vignette):
# laplace(interactive = TRUE)

The binomial distribution calculation function:

# Simple calculation
binomial_(n, x, p)
#> [1] 0.441

# Learning mode with step-by-step explanation
binomial_(n, x, p, learn = TRUE)
#> 
#> __BINOMIAL DISTRIBUTION__
#> 
#> The binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes-or-no question, and each with its own Boolean-valued outcome: success (with probability p) or failure (with probability q = 1 - p)
#> 
#> Formula ->  ((factorial(n) / (factorial(x) * factorial(n-x))) * (p ^ x) * (1 - p)^(n - x))
#> 
#> __Use Example__
#> 
#> First of all, we need to know the n, the number of trials
#> In this case n=3
#> 
#> Second, we need to know the p, probability of success.
#> In this case p=0.7
#> 
#> Finally, we need to know the x, binomial random variable
#> In this case x=2
#> 
#> Formula applied -> (factorial(3) / (factorial(2) * factorial(3-2))) * (0.7 ^ 2) * (1 - 0.7)^(3 - 2) = 0.441
#> 
#> Now try it on your own! :D
#> 
#> Use the binomial_(interactive = TRUE) function to practice.
#> [1] 0.441

# Interactive mode would be called like this (cannot be run in a vignette):
# binomial_(interactive = TRUE)
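
The result matches base R's binomial density, evaluated with the variables defined at the top:

# P(X = 2) for X ~ Binomial(size = 3, prob = 0.7)
dbinom(x, size = n, prob = p)
#> [1] 0.441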

The Poisson distribution calculation function:

# Simple calculation
poisson_(k, lam)
#> [1] 0.180447

# Learning mode with step-by-step explanation
poisson_(k, lam, learn = TRUE)
#> 
#> __POISSON DISTRIBUTION__
#> 
#> The Poisson distribution expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event
#> 
#> Formula ->  ((e ^ (- lam)) * (lam ^ k)) / factorial(k)
#> 
#> __Use Example__
#> 
#> First of all, we need to know e, Euler's number
#> In this case e=2.71828182845905
#> 
#> Second, we need to know lam, a positive parameter that represents the number of times the phenomenon is expected to occur during a given interval.
#> In this case lam=2
#> 
#> Finally, we need to know k, the number of occurrences.
#> In this case k=3
#> 
#> Formula applied -> ((2.71828182845905  ^ (- 2)) * (2 ^ 3)) / factorial(3) = 0.180447044315484
#> 
#> Now try it on your own! :D
#> 
#> Use the poisson_(interactive = TRUE) function to practice.
#> [1] 0.180447

# Interactive mode would be called like this (cannot be run in a vignette):
# poisson_(interactive = TRUE)
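
The result matches base R's Poisson density:

# P(X = 3) for X ~ Poisson(lambda = 2)
dpois(k, lambda = lam)
#> [1] 0.180447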

The normal distribution calculation function:

# Simple calculation
normal(nor)
#> [1] 0.3969525

# Learning mode with step-by-step explanation
normal(nor, learn = TRUE)
#> 
#> __NORMAL DISTRIBUTION__
#> 
#> The standard normal distribution is the one with mean zero, M = 0, and standard deviation one, Sigma = 1.
#> Its density function is:
#> 
#> Formula ->  (1/(2pi)^(1/2)) * (e)^((-x^2)/2)
#> 
#> __Use Example__
#> 
#> First of all, we need to know e, Euler's number
#> In this case e=2.71828182845905
#> 
#> Finally, we need to know pi, the number pi.
#> In this case pi=3.14159265358979
#> 
#> Formula applied -> (1/(2*3.14159265358979)^(1/2)) * (2.71828182845905)^((-0.1^2)/2) = 0.396952547477012
#> 
#> Now try it on your own! :D
#> 
#> Use the normal(interactive = TRUE) function to practice.
#> [1] 0.3969525

# Interactive mode would be called like this (cannot be run in a vignette):
# normal(interactive = TRUE)
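
The result matches base R's standard normal density:

# Standard normal density at x = 0.1 (mean 0 and sd 1 are the defaults)
dnorm(nor)
#> [1] 0.3969525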

The tstudent distribution calculation function:

# Simple calculation
tstudent(xt, ut, st, nt)
#> [1] -1.6

# Learning mode with step-by-step explanation
tstudent(xt, ut, st, nt, learn = TRUE)
#> 
#> __T-STUDENT DISTRIBUTION__
#> 
#> The T-student distribution is a probability distribution that arises from the problem of estimating the mean of a normally distributed population when the sample size is small.
#> 
#> Formula ->  (x-u)/(s/(n)^(1/2))
#> 
#> __Use Example__
#> 
#> First of all, we need to know x, the sample mean
#> In this case x=290
#> 
#> Second, we need to know u, the population mean
#> In this case u=310
#> 
#> Next, we need to know s, the population standard deviation
#> In this case s=50
#> 
#> Finally, we need to know n, the sample size.
#> In this case n=16
#> 
#> Formula applied -> (290 - 310)/(50/(16)^(1/2)) = -1.6
#> 
#> Now try it on your own! :D
#> 
#> Use the tstudent(interactive = TRUE) function to practice.
#> [1] -1.6

# Interactive mode would be called like this (cannot be run in a vignette):
# tstudent(interactive = TRUE)
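
The t statistic above is just the formula applied to the variables defined at the top:

# (sample mean - population mean) / (s / sqrt(n))
(xt - ut)/(st/sqrt(nt))
#> [1] -1.6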

The chisquared distribution calculation function:

# Simple calculation
chisquared(data, data2)
#> [1] 9.118615

# Learning mode with step-by-step explanation
chisquared(data, data2, learn = TRUE)
#> 
#> __CALCULATED CHI-SQUARED DISTRIBUTION__
#> 
#> The calculated chi-squared is a statistic used in hypothesis tests on frequencies; the test compares observed frequencies with expected frequencies.
#> 
#> Formula ->  ((x[1]-y[1])^2)/y[1] + ((x[2]-y[2])^2)/y[2] + ... + ((x[n]-y[n])^2)/y[n]
#> 
#> __Use Example__
#> 
#> First of all, we need to know the contents of the datasets/vectors of numbers
#> 
#> The contents of the vector x is:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#> The contents of the vector y is:
#> 1,
#> 1,
#> 4,
#> 5,
#> 5,
#> 5,
#> 7,
#> 8,
#> 10,
#> 10,
#> 10,
#> 11,
#> 20,
#> 22,
#> 22,
#> 24,
#> 25
#> 
#> 
#> Formula applied ->
#>   0 +
#>   0 +
#>   1 +
#>   0.8 +
#>   0.2 +
#>   0.8 +
#>   0.142857142857143 +
#>   0 +
#>   0.4 +
#>   0 +
#>   0 +
#>   0 +
#>   3.2 +
#>   2.22727272727273 +
#>   0.181818181818182 +
#>   0.166666666666667 +
#> 0
#>  =
#> 9.11861471861472
#> 
#> Now try it on your own! :D
#> 
#> Use the chisquared(interactive = TRUE) function to practice.
#> [1] 9.118615

# Interactive mode would be called like this (cannot be run in a vignette):
# chisquared(interactive = TRUE)
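
With data as the observed frequencies and data2 as the expected ones, the same statistic is a vectorized one-liner in base R:

# Summation of (observed - expected)^2 / expected
sum((data - data2)^2/data2)
#> [1] 9.118615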

The fisher distribution calculation function:

# Simple calculation
fisher(data, data2)
#> [1] 0.03078098

# Learning mode with step-by-step explanation
fisher(data, data2, learn = TRUE)
#> 
#> __ F FISHER DISTRIBUTION__
#> 
#> F-Fisher distribution is a continuous probability distribution that arises frequently as the null distribution of a test statistic.
#> 
#> Formula -> sx2/sw2
#> 
#> sx2 <- 2 * (((mean_(x)-meant)^2) + ((mean_(y)-meant)^2))
#> 
#> sw2 <- (variance_(x) + variance_(y)) / 2
#> 
#> __Use Example__
#> 
#> First of all, we need two datasets.
#> 
#>  Dataset x:
#> 1,
#> 1,
#> 2,
#> 3,
#> 4,
#> 7,
#> 8,
#> 8,
#> 8,
#> 10,
#> 10,
#> 11,
#> 12,
#> 15,
#> 20,
#> 22,
#> 25
#> 
#>  Dataset y:
#> 1,
#> 1,
#> 4,
#> 5,
#> 5,
#> 5,
#> 7,
#> 8,
#> 10,
#> 10,
#> 10,
#> 11,
#> 20,
#> 22,
#> 22,
#> 24,
#> 25
#> 
#> Formula applied -> (1.83044982698962/59.4669117647059) = 0.030780980089099
#> 
#> Now try it on your own! :D
#> 
#> Use the fisher(interactive = TRUE) function to practice.
#> [1] 0.03078098

# Interactive mode would be called like this (cannot be run in a vignette):
# fisher(interactive = TRUE)