geodezyx.stats package

Submodules

geodezyx.stats.least_squares module

@author: psakic

This sub-module of geodezyx.stats contains functions for least-squares processing.

It can be imported directly with: from geodezyx import stats

The GeodeZYX Toolbox is a software package providing simple but useful functions for Geodesy and Geophysics, released under the GNU LGPL v3 License

Copyright (C) 2019 Pierre Sakic et al. (IPGP, sakic@ipgp.fr) GitHub repository : https://github.com/GeodeZYX/geodezyx-toolbox

geodezyx.stats.least_squares.bins_middle(bin_edges)
geodezyx.stats.least_squares.chi2_test_frontend(dist_inp, nbins=10, ddof=2, debug=0, mode2=False, aaa=1)

mode1 (default) : the theoretical distribution is normalized, not the observed one

mode2 : the observed distribution is normalized, not the theoretical one. INCONSISTENT WITH MATLAB => TO BE AVOIDED

That said, the way the theoretical values are built is still somewhat shaky … consider porting the MATLAB code chi2gof.m l.185, and also understand why it uses a ddof of 2 (which is added here too, by blind copying) …

in debug mode : bin_edges, bin_edges2, hist, gauss, chi2

geodezyx.stats.least_squares.chi2_test_lsq(V, A, P=None, fuvin=None, risk=0.05, cleaning_std=False, cleaning_normalized=False, koefP=1)

P is only the diagonal of the weight matrix

The cleaning options are tricks to get closer to 1 (by removing the worst values); cleaning_normalized is to be preferred (and overrides cleaning_std)

koefP is a coefficient applied to P in order to find a viable solution

If koefP != 1, the new P is given as the second-to-last argument

geodezyx.stats.least_squares.clean_nan(A, L)

DISCONTINUED

A is a two-dimensional array and L is a one-dimensional array. Returns A and L, mutually cleaned of their respective NaNs.

return np.sqrt(a + a.T - np.diag(a.diagonal()))

geodezyx.stats.least_squares.constraint_improve_N(N, C, trans=False, outsparsetype='csc')

Given the normal matrix N and the constraints matrix C, returns N combined with C. trans is a (dirty) way to transpose C if it was built with the wrong shape

convention Ghilani 2011 p424 :

[ N   C.T ]
[ C   0   ]
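As a rough illustration of this bordered layout, the sketch below assembles the block matrix from plain nested lists. The helper name `border_normal_matrix` is hypothetical; it is not the toolbox's implementation, which also handles sparse output types.

```python
def border_normal_matrix(N, C):
    """Return [[N, C.T], [C, 0]] as nested lists (hypothetical helper).

    N is an (m x m) normal matrix, C is a (c x m) constraints matrix.
    """
    m = len(N)
    c = len(C)
    if any(len(row) != m for row in N) or any(len(row) != m for row in C):
        raise ValueError("N must be m x m and C must be c x m")
    # Upper block rows: a row of N followed by the matching column of C (i.e. C.T).
    upper = [list(N[i]) + [C[k][i] for k in range(c)] for i in range(m)]
    # Lower block rows: a row of C followed by a block of zeros.
    lower = [list(C[k]) + [0.0] * c for k in range(c)]
    return upper + lower


N = [[2.0, 1.0], [1.0, 3.0]]
C = [[1.0, -1.0]]  # a single constraint row on the two parameters
M = border_normal_matrix(N, C)
```

The result is an (m + c) x (m + c) symmetric matrix whenever N is symmetric.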

geodezyx.stats.least_squares.ellipse_angle_of_rotation(a, outdeg=True)

core fct for ellipse_fit http://nicky.vanforeest.com/misc/fitEllipse/fitEllipse.html

geodezyx.stats.least_squares.ellipse_axis_length(a)

core fct for ellipse_fit http://nicky.vanforeest.com/misc/fitEllipse/fitEllipse.html

geodezyx.stats.least_squares.ellipse_center(a)

http://nicky.vanforeest.com/misc/fitEllipse/fitEllipse.html

geodezyx.stats.least_squares.ellipse_fit(x, y)

find the parameters a,b,phi,x0,y0 of an ellipse

from http://nicky.vanforeest.com/misc/fitEllipse/fitEllipse.html

geodezyx.stats.least_squares.ellipse_get_coords(a=0.0, b=0.0, x=0.0, y=0.0, angle=0.0, k=2, out_separate_X_Y=True, trigo=True)

Draws an ellipse using (360*k + 1) discrete points; based on pseudo code given at http://en.wikipedia.org/wiki/Ellipse

k = 1 means 361 points (degree by degree)

a = major axis distance

b = minor axis distance

x = offset along the x-axis

y = offset along the y-axis

angle = trigo/clockwise rotation [in degrees] of the ellipse:

  • angle=0 : the ellipse is aligned with the positive x-axis

  • angle=30 : rotated 30 degrees trigo/clockwise from positive x-axis

The trigonometric (counter-clockwise) sense is the standard convention

NB: clockwise is the internal convention, but the trigonometric convention is preferred for the Ghilani ellipses made by error_ellipse_parameters

source : scipy-central.org/item/23/1/plot-an-ellipse
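The sampling described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `ellipse_points` is a hypothetical name, a and b are taken as semi-axis lengths, and the rotation is applied in the trigonometric (counter-clockwise) sense.

```python
import math


def ellipse_points(a, b, x0=0.0, y0=0.0, angle_deg=0.0, k=1):
    """Sample an ellipse with 360*k + 1 points (hypothetical helper).

    The rotation is counter-clockwise from the positive x-axis.
    """
    n = 360 * k
    rot = math.radians(angle_deg)
    xs, ys = [], []
    for i in range(n + 1):
        t = 2.0 * math.pi * i / n
        # Point on the axis-aligned ellipse ...
        ex, ey = a * math.cos(t), b * math.sin(t)
        # ... rotated counter-clockwise, then shifted to the centre.
        xs.append(x0 + ex * math.cos(rot) - ey * math.sin(rot))
        ys.append(y0 + ex * math.sin(rot) + ey * math.cos(rot))
    return xs, ys


X, Y = ellipse_points(a=2.0, b=1.0, angle_deg=90.0, k=1)
```

With angle_deg=90 the first sampled point lies on the positive y-axis, at distance a from the centre, and the last point closes the curve on the first one.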

geodezyx.stats.least_squares.error_ellipse(xm, ym, sigx, sigy, sigxy, nsig=1, ne=100, scale=1)

From the MATLAB function http://kom.aau.dk/~borre/matlab/geodesy/errell.m It works but don’t ask why …

(X,Y) orientation convention is inverted => (Y,X) … so in practice you must swap X and Y (this does not matter for the axes, but it does for the orientation) AND sigx, sigy, sigxy must first be normalized with the fuv

sigx, sigy, sigxy :

such that a covariance matrix can be generated: cov = np.array([[sigx ** 2,sigxy],[sigxy,sigy ** 2]])

ne :

number of segments for the ellipse

RETURNS :

xe,ye,dx2,dy2

DEBUG:

If we have

xe1,ye1,_,_ = stats.error_ellipse(pxp[0],pxp[1], sigxB , sigyB , sigxyB, scale= 10000)

xe2,ye2,_,_ = stats.error_ellipse(pxp[0],pxp[1], sigyB , sigxB , sigxyB, scale= 10000)

WITHOUT the minus signs on D and dxy0 => we get 2 different ellipses

If we have

xe1,ye1,_,_ = stats.error_ellipse(pxp[0],pxp[1], sigxB , sigyB , sigxyB, scale= 10000)

xe2,ye2,_,_ = stats.error_ellipse(pxp[0],pxp[1], sigyB , sigxB , sigxyB, scale= 10000)

WITH the minus signs on D and dxy0 => we get 2 different ellipses, and at least one of them coincides with Ghilani's

To be investigated; in the meantime, to be avoided

geodezyx.stats.least_squares.error_ellipse_parameters(qxx, qyy, qxy, fuv, out_t=False)
INPUT :

qxx, qyy, qxy : factors as in the var-covar matrix (no normalisation with the fuv or other)

fuv : unitary variance factor

OUTPUT :

Su/a Sb/b : semi-major and semi-minor axes

t : angle that the u/a axis makes with the y axis in the clockwise direction

OR

phi : angle that the u/a axis makes with the x axis in the trigonometric direction

(this one is the natural way, perfect to test those of the Strang & Borre)
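For orientation, here is a minimal sketch of how semi-axes and an orientation angle can be derived from a 2x2 cofactor matrix through its eigenvalues. It is a textbook computation, not necessarily the formulas coded in the toolbox; the name `ellipse_from_cofactors` is hypothetical, and phi is measured counter-clockwise from the x axis.

```python
import math


def ellipse_from_cofactors(qxx, qyy, qxy, fuv=1.0):
    """Semi-axes and orientation from a 2x2 cofactor matrix (sketch).

    Returns (a, b, phi_deg), with phi counter-clockwise from the x-axis.
    """
    mean = 0.5 * (qxx + qyy)
    radius = math.hypot(0.5 * (qxx - qyy), qxy)
    # Eigenvalues of [[qxx, qxy], [qxy, qyy]], scaled by the variance factor.
    a = math.sqrt(fuv * (mean + radius))
    b = math.sqrt(fuv * max(mean - radius, 0.0))
    # Direction of the eigenvector attached to the largest eigenvalue.
    phi = 0.5 * math.atan2(2.0 * qxy, qxx - qyy)
    return a, b, math.degrees(phi)


a, b, phi = ellipse_from_cofactors(qxx=2.0, qyy=2.0, qxy=1.0, fuv=4.0)
```

In this example the eigenvalues of the cofactor matrix are 3 and 1, so the semi-axes are sqrt(4 * 3) and sqrt(4 * 1), and the major axis lies along the first diagonal.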

geodezyx.stats.least_squares.error_ellipse_parameters_2(sigx, sigy, sigxy, out_deg=True)

ref : Strang & Borre p 337

(X,Y) orientation convention is inverted => (Y,X) … so in practice you must swap X and Y (this does not matter for the axes, but it does for the orientation) AND sigx, sigy, sigxy must be normalized with the fuv

geodezyx.stats.least_squares.fitEllipse_core(x, y)

core fct for ellipse_fit http://nicky.vanforeest.com/misc/fitEllipse/fitEllipse.html

geodezyx.stats.least_squares.fuv_calc(V, A, P=1, normafuv=1)
Args :

V : residuals vector

A : Jacobian matrix

P : weight matrix

Can manage both standard arrays and sparse array

Returns :

fuv : unitary variance factor (facteur unitaire de variance)

Notes :

The fuv depends on the weight matrix, but the sigmas do not, e.g. :

weights of 10**-6 : fuv : 439828.260843 sigma : [ 5.21009306 5.09591568 0.04098106]

weights of 1 : fuv : 4.39828260843e-07 sigma : [ 5.21009306 5.09591568 0.04098106]
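To make the scale dependence concrete, the sketch below uses the usual definition fuv = VᵀPV / (n_obs − n_params) with a diagonal weight matrix. Both the formula and the name `unit_variance_factor` are assumptions for illustration; the toolbox function takes the Jacobian A and may differ in detail.

```python
def unit_variance_factor(V, P_diag, n_params):
    """fuv = V^T P V / (n_obs - n_params), for a diagonal weight matrix."""
    dof = len(V) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    return sum(p * v * v for v, p in zip(V, P_diag)) / dof


V = [0.5, -0.5, 1.0, -1.0]  # residuals of 4 observations
fuv_unit = unit_variance_factor(V, [1.0] * 4, n_params=2)
fuv_scaled = unit_variance_factor(V, [100.0] * 4, n_params=2)
```

Multiplying all weights by a constant multiplies this fuv by the same constant. The formal sigmas, which combine the fuv with the inverse of the normal matrix AᵀPA, are unchanged because the two scale factors cancel.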

geodezyx.stats.least_squares.fuv_calc_OLD(V, A)
geodezyx.stats.least_squares.fuv_calc_OLD2(V, A, P=None)
geodezyx.stats.least_squares.get_accur_coeff(i)

accuracy coefficients given by https://en.wikipedia.org/wiki/Finite_difference_coefficient

geodezyx.stats.least_squares.jacobian(f, var_in_list, var_out, kwargs_f_list=[], h=1e-06, nproc=4)

Only the kwargs are handled

geodezyx.stats.least_squares.jacobian_line(f, var_in_list, var_out=0, kwargs_f={}, args_f=[], h=0, aray=True)

Same arguments as partial_derive, except that var_in becomes var_in_list: an iterable of all the variables with respect to which the derivation must be performed

geodezyx.stats.least_squares.kwargs_for_jacobian(kwdic_generik, kwdic_variables)

Builds a list of kwargs for the jacobian function

kwdic_generik : parameters which will not change

kwdic_variables : parameters which will change, and so must be associated with an iterable

geodezyx.stats.least_squares.nan_cleaner(Ain, Bin)

Removes from A & B their respective NaN

Args :

Ain , Bin : lists/arrays

Returns:

A & B without NaN

Return type:

Aout , Bout

geodezyx.stats.least_squares.partial_derive(f, var_in, var_out=0, kwargs_f={}, args_f=[], h=0, accur=-1)

This function computes the partial derivatives of a python function

Parameters:
  • f (Python function) – the Python function to be differentiated. This function must return a scalar, and the parameters of f liable to be differentiated must be scalars; i.e. if, for instance, you want to differentiate with respect to a position vector X = [x,y,z], f must take f(x,y,z) as arguments and not f(X)

  • var_in (int or string) – the variable with respect to which the derivative is taken. Can be an int (starting at 0) or a string giving the name of the variable in the arguments of f.

  • var_out (int) – the output of f which is to be considered as the output ** must be an int ** The default is 0.

  • kwargs_f (dict, optional) – dictionary of the keyword arguments of f. The default is {}.

  • args_f (iterable, optional) – tuple/list of the positional arguments of f. The default is [].

  • h (float, optional) – derivation step; if h == 0, x * sqrt(epsilon) is used (source : http://en.wikipedia.org/wiki/Numerical_differentiation).

  • accur (int, optional) – accuracy coefficient index. -1 is the best but the slowest. The default is -1. https://en.wikipedia.org/wiki/Finite_difference_coefficient

Returns:

dout – the derivative of f w.r.t. var_in.

Return type:

float
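A minimal sketch of the idea, assuming the simplest two-point central stencil (the toolbox selects higher-order stencils through accur). The name `central_derivative` is hypothetical; the default step follows the x * sqrt(epsilon) rule quoted above.

```python
import math
import sys


def central_derivative(f, args, var_in, h=0.0):
    """Central-difference derivative of f w.r.t. args[var_in] (sketch)."""
    x = args[var_in]
    if h == 0:
        # Step scaled with x; fall back to sqrt(eps) when x is zero.
        h = (abs(x) if x != 0 else 1.0) * math.sqrt(sys.float_info.epsilon)
    plus = list(args)
    minus = list(args)
    plus[var_in] = x + h
    minus[var_in] = x - h
    return (f(*plus) - f(*minus)) / (2.0 * h)


def f(x, y, z):
    # Scalar function of scalar arguments, as required above.
    return x ** 2 * y + z


dfdx = central_derivative(f, [3.0, 2.0, 1.0], var_in=0)
```

Here the analytical derivative is 2 * x * y, so dfdx should be close to 12 up to the rounding error of the finite difference.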

geodezyx.stats.least_squares.partial_derive_old(f, var_in, var_out=0, kwargs_f={}, args_f=[], h=0)
var_in :

the variable with respect to which the derivative is taken; can be an int (starting at 0) or a string giving the name of the variable in f

var_out :

the output of f which is to be considered as the output ** must be an int **

args_f & kwargs_f :

tuple/list & dict describing the arguments of f

h :

derivation step; if h == 0, x * sqrt(epsilon) is used (source : http://en.wikipedia.org/wiki/Numerical_differentiation)

geodezyx.stats.least_squares.sigmas_formal_calc(N, V, A, fuv=None, P=None)
geodezyx.stats.least_squares.smart_i_giver(subgrp_len_list, i_in_sublis, sublis_id, advanced=False, sublis_id_list=[])

e.g.

subgrp_len_list = [4201, 4186, 4157, 4041, 4058, 4204, 4204, 4204, 4204]

i_in_sublis = 2

sublis_id = 3

returns 4201 + 4186 + 4157 + 2

advanced = True : sublis_id is not an int but a generic identifier (str, int, set …) present in sublis_id_list; otherwise it must be an int
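The non-advanced case of the example can be reproduced in a few lines. `flat_index` is a hypothetical stand-in for illustration, not the toolbox function itself.

```python
def flat_index(subgrp_len_list, i_in_sublis, sublis_id):
    """Index, in the concatenated list, of item i_in_sublis of sublist sublis_id."""
    # Skip the full length of every sublist located before sublis_id.
    return sum(subgrp_len_list[:sublis_id]) + i_in_sublis


lengths = [4201, 4186, 4157, 4041, 4058, 4204, 4204, 4204, 4204]
idx = flat_index(lengths, i_in_sublis=2, sublis_id=3)
```

The three sublists before index 3 are skipped entirely, which gives 4201 + 4186 + 4157 + 2.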

geodezyx.stats.least_squares.triangle_arr2vect(triarrin, k=1)
geodezyx.stats.least_squares.weight_mat(Sinp, Ninp=[], fuvinp=1, sparsediag=False)
Args :

Sinp : list of the sigmas, sig = sqrt(var)

Ninp : list of the size of each block (obs)

fuvinp = 1 : unitary variance factor

Inspired by mat_poids, a function written in the resolution lib of GPShApy

Returns :

K : var-covar matrix

Q : cofactor matrix

P : weight matrix, inv(Q)

geodezyx.stats.least_squares.weight_mat_simple(Pinp, Ninp=[], sparsediag=False, return_digaonal_only=False)

Simple version of weight_mat : takes directly the weights (Pinp) and the size of each block of weights (Ninp)

Pinp and Ninp must have the same length

Args :

Pinp : list of weights

Ninp : list of the size of each block (obs number)

fuvinp = 1 : unitary variance factor

Returns :

P : weight matrix
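The block expansion can be pictured as follows. This is a sketch with hypothetical helper names (`diagonal_weights`, `to_dense`) and plain lists; the toolbox returns array or sparse matrices instead.

```python
def diagonal_weights(Pinp, Ninp):
    """Repeat each weight Pinp[i] Ninp[i] times (diagonal of P, sketch)."""
    if len(Pinp) != len(Ninp):
        raise ValueError("Pinp and Ninp must have the same length")
    diag = []
    for p, n in zip(Pinp, Ninp):
        diag.extend([p] * n)
    return diag


def to_dense(diag):
    """Expand a diagonal into a dense square matrix of nested lists."""
    size = len(diag)
    return [[diag[i] if i == j else 0.0 for j in range(size)] for i in range(size)]


diag = diagonal_weights([4.0, 0.25], [2, 1])
P = to_dense(diag)
```

Keeping only the diagonal list, rather than the dense matrix, is the cheaper option when there are many observations.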

geodezyx.stats.stats module

@author: psakic

This sub-module of geodezyx.stats contains functions for low-level statistics.

It can be imported directly with: from geodezyx import stats

The GeodeZYX Toolbox is a software package providing simple but useful functions for Geodesy and Geophysics, released under the GNU LGPL v3 License

Copyright (C) 2019 Pierre Sakic et al. (IPGP, sakic@ipgp.fr) GitHub repository : https://github.com/GeodeZYX/geodezyx-toolbox

geodezyx.stats.stats.RMSmean(indata)

returns RMS mean of a list/array

Redundant with rms_mean; use of this function should be avoided

geodezyx.stats.stats.butter_lowpass(cutoff, fs, order=5)
geodezyx.stats.stats.butter_lowpass_filter(data, cutoff, fs, order=5)
geodezyx.stats.stats.color_of_season(datein)
geodezyx.stats.stats.confid_interval_slope(x, y, alpha=0.95)

Computes a confidence interval on a trend

Input:

x = the independent variable

y = the dependent variable

alpha = the tolerated error probability

Output:

mi = the lower bound of the interval

ma = the upper bound of the interval

Source (???? => Actually not …) http://www.i4.auc.dk/borre/matlab http://kom.aau.dk/~borre/matlab/

geodezyx.stats.stats.dates_middle(start, end)
geodezyx.stats.stats.detrend_timeseries(X, Y)

Detrend, i.e. remove the linear trend of a time series Y(X)

Parameters:

X & Y – Values

Returns:

X & Yout – Detrended Y

Return type:

list or numpy.array

geodezyx.stats.stats.find_intersection(x1, y1, x2, y2)
geodezyx.stats.stats.gaussian_filter_GFZ_style_smoother(tim_ref, dat_ref, width=7)

Gaussian filter to smooth data, based on GFZ’s GMT_plus.pm/gaussian_kernel

Args :

tim_ref : the X/T component of the time series (in decimal days !)

dat_ref : the Y component (the data)

width : size of the window (odd number is best ?)

Returns :

dat_smt : smoothed Y

NB :

Some other nice ideas here http://scipy-cookbook.readthedocs.io/items/SignalSmooth.html https://stackoverflow.com/questions/20618804/how-to-smooth-a-curve-in-the-right-way https://stackoverflow.com/questions/32900854/how-to-smooth-a-line-using-gaussian-kde-kernel-in-python-setting-a-bandwidth

NB2 : THIS VERSION IS VERY SLOW (DIRTY CONVERSION OF A PERL FCT)

THE PYTHONIC VERSION gaussian_filter_GFZ_style_smoother_improved BELOW SHOULD BE USED !!!

geodezyx.stats.stats.gaussian_filter_GFZ_style_smoother_improved(tim_ref, dat_ref, width=7)

Gaussian filter to smooth data, based on GFZ’s GMT_plus.pm/gaussian_kernel

Parameters:
  • tim_ref (iterable (list or array)) – the X/T component of the time series (in decimal days!)

  • dat_ref (iterable (list or array)) – the Y component (the data).

  • width (int, optional) – size of the window (odd number is best ?). The default is 7.

Returns:

dat_smt2 – smoothed Y.

Return type:

array

geodezyx.stats.stats.get_season(now)
geodezyx.stats.stats.harmonic_mean(A)

harmonic mean of a list/array A

geodezyx.stats.stats.lagrange1(points)

Low-level function to determine a Lagrange polynomial

Replaces scipy.interpolate.lagrange, which is HIGHLY unstable

Parameters:

points (list of n-iterable) – point list.

Returns:

P (function) – function representing the polynomial.

Source

from https://gist.github.com/melpomene/2482930
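For readers unfamiliar with the construction, a Lagrange polynomial through a list of points can be sketched as below. `lagrange_poly` is a hypothetical name and this is the textbook formula, not the code of the gist cited above.

```python
def lagrange_poly(points):
    """Return P(x), the Lagrange polynomial through (x_i, y_i) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if len(set(xs)) != len(xs):
        raise ValueError("x values must be distinct")

    def P(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            # Basis polynomial L_i(x): 1 at x_i, 0 at every other node.
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total

    return P


P = lagrange_poly([(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)])
```

The three nodes lie on y = x², so P reproduces that parabola: it passes through the nodes and extrapolates to P(3) = 9.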

geodezyx.stats.stats.lagrange2(X, Y)

Low-level function to determine a Lagrange polynomial

Replaces scipy.interpolate.lagrange, which is HIGHLY unstable

This function is more pythonic, but slower than lagrange1 …

Parameters:

points (list of n-iterable) – point list.

Returns:

P (function) – function representing the polynomial.

Source

from https://gist.github.com/melpomene/2482930

geodezyx.stats.stats.lagrange_interpolate(Tdata, Ydata, Titrp, n=10)

Performs a temporal Lagrange interpolation; the X-component is a time

Parameters:
  • Tdata (iterable of datetime) – X/T component of the known points.

  • Ydata (iterable of floats) – Y component of the known points.

  • Titrp (iterable of datetime) – Epochs of the wished points.

  • n (int, optional) – degree of the polynomial. Better if even. The default is 10.

Returns:

Yintrp (float array) – output interpolated data.

Tips

Use conv.dt_range to generate the wished epochs range

geodezyx.stats.stats.linear_coef_a_b(x1, y1, x2, y2)

Gives coefficients of the line between two points (x1,y1) & (x2,y2) x1,y1,x2,y2 can be iterables

Parameters:
  • x1 (float or list or numpy.array) – Coordinates of the 1st and the 2nd point

  • y1 (float or list or numpy.array) – Coordinates of the 1st and the 2nd point

  • x2 (float or list or numpy.array) – Coordinates of the 1st and the 2nd point

  • y2 (float or list or numpy.array) – Coordinates of the 1st and the 2nd point

Returns:

  • a (float) – regression coefficient

  • b1 & b2 (float) – regression offsets coefficient (b1 must be equal to b2)
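The two-point computation amounts to the following sketch for scalar inputs. `line_through` is a hypothetical name; the toolbox version also accepts iterables.

```python
def line_through(x1, y1, x2, y2):
    """Slope a and offsets b1, b2 of the line through two points."""
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    a = (y2 - y1) / (x2 - x1)
    b1 = y1 - a * x1  # offset computed from the first point
    b2 = y2 - a * x2  # offset computed from the second point
    return a, b1, b2


a, b1, b2 = line_through(1.0, 3.0, 3.0, 7.0)
```

Both offsets describe the same line, so comparing b1 and b2 is a cheap consistency check on the result.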

geodezyx.stats.stats.linear_reg_getvalue(X, a, b, full=True)

From a vector X and coefficients a & b, get Y = a*X + b

Parameters:
  • X (list or numpy.array) – Values

  • a & b – Linear regression coefficients

  • full (bool) – True : return X , Y = aX + b , False : return Y = aX + b

Returns:

  • Y (numpy.array) – if full == False

  • OR

  • X , Y (numpy.array) – if full == True

Note

Unstable when working with POSIX time as X-data (values too high? …); decimal years are recommended

geodezyx.stats.stats.linear_regression(x, y, fulloutput=False, simple_lsq=False, alpha=0.95)

Performs linear regression on two vectors, X and Y, and returns the coefficients a (slope) and b (intercept).

Parameters:
  • x (list or numpy.array) – The X values.

  • y (list or numpy.array) – The Y values.

  • simple_lsq (bool, optional) – If True, performs a basic, low-level least square inversion (faster, but less outputs). If False, calls scipy’s linregress. Default is False.

  • fulloutput (bool, optional) – If True, returns additional outputs (confidence interval for the slope and standard deviation). Default is False.

  • alpha (float, optional) – The alpha value for the confidence interval. Default is .95.

Returns:

  • slope (float) – The slope (a) of the linear regression.

  • intercept (float) – The intercept (b) of the linear regression.

  • confid_interval_slope (tuple of float, optional) – The confidence interval for the slope. Only returned if fulloutput is True.

  • std_err (float, optional) – The standard deviation. Only returned if fulloutput is True.

Notes

This function performs a similar job to scipy.stats.linregress.

Regarding computation speed: low-level least square inversion is faster for small datasets. For larger datasets, scipy’s linregress is faster (n points > ~13000).
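As an illustration of what a basic least-squares line fit involves, here is a closed-form sketch in pure Python. It is neither the simple_lsq code path nor scipy's linregress, and `simple_linear_lsq` is a hypothetical name.

```python
def simple_linear_lsq(x, y):
    """Closed-form least-squares slope and intercept (sketch)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length (at least 2)")
    mx = sum(x) / n
    my = sum(y) / n
    # Centred sums of squares and cross-products.
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
slope, intercept = simple_linear_lsq(x, y)
```

The data above lie exactly on y = 2x + 1, so the fit recovers that slope and intercept.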

geodezyx.stats.stats.mad(data, mode='median')

Returns the Median Absolute Deviation (MAD) of a list/array
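The median-based definition can be written with the standard library as follows. This sketch covers the default 'median' mode only, and `median_abs_deviation` is a hypothetical name.

```python
from statistics import median


def median_abs_deviation(data):
    """Median of the absolute deviations from the median."""
    med = median(data)
    return median(abs(v - med) for v in data)


m = median_abs_deviation([1.0, 2.0, 3.0, 4.0, 100.0])
```

The single large value barely affects the result, which is why the MAD is a robust measure of spread.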

geodezyx.stats.stats.movingaverage(values, window)

Including 'valid' will REQUIRE there to be enough data points. For example, if you take out 'valid', it will start at point one, not having any prior points, so it will be (1 + 0 + 0) / 3 = 0.3333

http://sentdex.com/sentiment-analysisbig-data-and-python-tutorials-algorithmic-trading/how-to-chart-stocks-and-forex-doing-your-own-financial-charting/calculate-simple-moving-average-sma-python/

INTERNAL_ID_1

geodezyx.stats.stats.movingaverage_bis(interval, window_size, convolve_mode='same')

Moving average; slower, but gives an output of the same size as the input

https://stackoverflow.com/questions/11352047/finding-moving-average-from-data-points-in-python

INTERNAL_ID_4

geodezyx.stats.stats.movingaverage_ter(data, window_width)

https://stackoverflow.com/questions/11352047/finding-moving-average-from-data-points-in-python (Roman Kh's answer)

INTERNAL_ID_5

geodezyx.stats.stats.outlier_above_below(X, threshold_values, reference=<function nanmean>, theshold_absolute=True, return_booleans=True, theshold_relative_value='reference', verbose=False)

Gives values of X which are between threshold values

Parameters:
  • threshold_values (single value (float) or a 2-tuple) –

    (lower bound threshold , upper bound threshold)

    WARN : those value(s) have to be positive. The minus sign for the lower bound and the plus sign for the upper one are applied internally

  • reference (float or callable) – the central reference value; can be an absolute fixed value (float) or a function (e.g. np.mean or np.median)

  • theshold_absolute (bool) –

    if True, threshold_values are absolute values
    >>> low = reference - threshold_values[0]
    >>> upp = reference + threshold_values[1]
    
    if False they are fractions of theshold_relative_value
    >>> low = reference - threshold_values[0] * theshold_relative_value
    >>> upp = reference + threshold_values[1] * theshold_relative_value
    

    (see also below)

  • theshold_relative_value (str or function) – if the string “reference” or None is given, the reference value is used; if it is a function (e.g. np.std), the value returned by this function is used. Only useful when theshold_absolute = False

  • return_booleans (bool) – return booleans or not

  • verbose (bool)

Returns:

  • Xout (numpy array) – X between low_bound & upp_bound

  • bbool (numpy array) – X-sized array of booleans

geodezyx.stats.stats.outlier_above_below_binom(Y, X, threshold_values, reference=<function nanmean>, theshold_absolute=True, theshold_relative_value='reference', return_booleans=False, detrend_first=True, verbose=False)

Gives the values of Y which are between the threshold values, and corrects the associated X accordingly, so that X => Y(X)

Parameters:
  • threshold_values (single value (float) or a 2-tuple) –

    (lower bound threshold , upper bound threshold)

    WARN : those value(s) have to be positive. The minus sign for the lower bound and the plus sign for the upper one are applied internally

  • reference (float or callable) – the central reference value; can be an absolute fixed value (float) or a function (e.g. np.mean or np.median)

  • theshold_absolute (bool) –

    if True, threshold_values are absolute values
    >>> low = reference - threshold_values[0]
    >>> upp = reference + threshold_values[1]
    
    if False they are fractions of theshold_relative_value
    >>> low = reference - threshold_values[0] * theshold_relative_value
    >>> upp = reference + threshold_values[1] * theshold_relative_value
    

    (see also below)

  • theshold_relative_value (str or function) – if the string “reference” or None is given, the reference value is used; if it is a function (e.g. np.std), the value returned by this function is used. Only useful when theshold_absolute = False

  • detrend_first (bool) – remove the linear trend of Y(X) first. Recommended

  • return_booleans (bool) – return booleans or not

  • verbose (bool)

Returns:

  • Xout (numpy array) – X between low_bound & upp_bound

  • bbool (numpy array) – X-sized array of booleans

geodezyx.stats.stats.outlier_above_below_simple(X, low_bound, upp_bound, return_booleans=True)

Gives values of X which are between low_bound & upp_bound

Parameters:
  • X (list or numpy.array) – Values

  • low_bound & upp_bound – lower and upper bounds of the wished X values

  • return_booleans (bool) – return booleans or not

Returns:

  • Xout (numpy.array) – X between low_bound & upp_bound

  • bbool (bool) – X-sized array of booleans

geodezyx.stats.stats.outlier_mad(data, threshold=3.5, verbose=False, convert_to_np_array=True, mad_mode='median', seuil=None)

Cleans the outliers of a dataset using the MAD approach

Parameters:
  • data (list or numpy.array) – Values

  • threshold (float) – MAD threshold

  • verbose (bool)

  • convert_to_np_array (bool) – if True returns output as an array, if False as a regular list

  • mad_mode (str) – ‘median’ or ‘mean’ : MAD can also be based on mean (for experimental purposes)

  • seuil (float, optional) – legacy name of ‘threshold’ argument. will override threshold value if given

Returns:

  • dataout (numpy.array) – Values cleaned of outliers

  • boolbad (numpy.array) – Y-sized booleans

Source

Use of the MAD to detect outliers

http://www.itl.nist.gov/div898/handbook/eda/section3/eda35h.htm

http://web.ipac.caltech.edu/staff/fmasci/home/statistics_refs/BetterThanMAD.pdf
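A MAD-based rejection rule is commonly written with the modified z-score, 0.6745 * (x − median) / MAD, compared with a threshold of 3.5. The sketch below assumes that formulation; whether the toolbox applies exactly this scaling is not stated here, and `mad_outlier_flags` is a hypothetical name.

```python
from statistics import median


def mad_outlier_flags(data, threshold=3.5):
    """Flag values whose modified z-score exceeds threshold (sketch)."""
    med = median(data)
    mad = median(abs(v - med) for v in data)
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return [False] * len(data)
    # 0.6745 rescales the MAD to a standard deviation for normal data.
    return [abs(0.6745 * (v - med) / mad) > threshold for v in data]


data = [10.1, 9.9, 10.0, 10.2, 9.8, 25.0]
bad = mad_outlier_flags(data)
clean = [v for v, is_bad in zip(data, bad) if not is_bad]
```

Only the last value is far enough from the median to be flagged; the five others are kept.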

geodezyx.stats.stats.outlier_mad_binom(Y, X, threshold=3.5, verbose=False, detrend_first=False, return_booleans=False)

Cleans the outliers of Y using the MAD approach and cleans the corresponding values in X, assuming that we have the function X => Y(X) (be careful, Y is the first argument)

Parameters:
  • Y (list or numpy.array) – Values

  • X (list or numpy.array) – X Values so as X => Y(X)

  • threshold (float) – MAD threshold

  • verbose (bool)

  • detrend_first (bool) – detrend linear behavior of Y(X) first

  • return_booleans (bool) – return good and bad values of Y and X as booleans

Returns:

  • Yclean & Xclean (numpy.array)

  • bb (numpy.array (if return_booleans == True)) – Y-sized booleans

geodezyx.stats.stats.outlier_mad_binom_legacy(X, Y, threshold=3.5, verbose=False, detrend_first=False, return_booleans=False)

Cleans the outliers of X and cleans the corresponding values in Y

legacy : the order of X and Y differs from the main version, and the detrend might be unstable here

geodezyx.stats.stats.outlier_overmean(Xin, Yin, marge=0.1)

really old and discontinued, use outlier_above_below instead

geodezyx.stats.stats.outlier_sigma(datasigmain, threshold=3)

If a point has a sigma > threshold * mean(sigmas), it is discarded

really old and discontinued, and not really efficient

geodezyx.stats.stats.plot_vertical_bar(xlis, color='r', linewidth=1)
geodezyx.stats.stats.plot_vertical_bar_ax(xlis, ax_in, color='r', linewidth=1)
geodezyx.stats.stats.rms_mean(A)

returns RMS mean of a list/array

geodezyx.stats.stats.rms_mean_alternativ(A)

Returns the “GRGS style” RMS of a list/array: the arithmetic mean of the values is subtracted from the values

NB 1808 : it is basically the standard deviation …

i.e. √< (A - Ā)^2 > instead of √< (A)^2 >
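The difference between the two quantities is easy to check numerically. A minimal sketch, with hypothetical function names:

```python
import math


def rms(A):
    """Plain RMS: sqrt(<A^2>)."""
    return math.sqrt(sum(v * v for v in A) / len(A))


def rms_centered(A):
    """Mean-removed RMS: sqrt(<(A - mean(A))^2>)."""
    mean = sum(A) / len(A)
    return math.sqrt(sum((v - mean) ** 2 for v in A) / len(A))


A = [1.0, 2.0, 3.0, 4.0]
plain = rms(A)
centered = rms_centered(A)
```

The mean-removed version equals the population standard deviation of A, which is consistent with the remark above.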

geodezyx.stats.stats.rms_mean_kouba(A, multipl_coef=3, deg_of_freedom=7)

returns RMS mean of a list/array

geodezyx.stats.stats.runningMean(x, N)

http://stackoverflow.com/questions/13728392/moving-average-or-running-mean

INTERNAL_ID_2

geodezyx.stats.stats.running_mean(data_in, window, convolve_mode='same')

Gives running mean / moving average of data

Parameters:
  • data_in (list or numpy.array) – Values

  • window (float or int) – Size of the window for the running mean

  • convolve_mode (str) – (expert) mode for the underlying convolution; should stay “same”

Returns:

data_run – running mean of data_in (same size as data_in)

Return type:

numpy.array

Note

After a stress test, this is the only function that provides an output with the same size as the input AND not shifted. This function is slow but at least it does the job

See running_mean_help for more details

convolve_mode should stay fixed as “same”

Note 2 (for developers) :

Wrapper based on the function movingaverage_bis

The subtraction of the mean is an empirical trick
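One plausible reading of this trick is sketched below: a zero-padded, same-size moving average pulls the edges towards zero, and removing the series mean before averaging (then adding it back) pulls them towards the mean instead. This is an interpretation, not the toolbox code; `running_mean_same` is a hypothetical name and an odd window is assumed.

```python
def running_mean_same(data, window):
    """Centred moving average with an output as long as the input (sketch).

    Assumes an odd window; samples outside the series count as zero, as in a
    zero-padded convolution, after the series mean has been removed.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    n = len(data)
    mean = sum(data) / n
    centred = [v - mean for v in data]
    half = window // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        # Missing neighbours at the edges contribute 0, i.e. the series mean.
        out.append(mean + sum(centred[lo:hi]) / window)
    return out


smoothed = running_mean_same([5.0, 5.0, 5.0, 5.0, 5.0], window=3)
ramp = running_mean_same([0.0, 1.0, 2.0, 3.0, 4.0], window=3)
```

A constant series stays constant up to its edges, whereas plain zero padding would lower the first and last samples.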

geodezyx.stats.stats.running_mean_core(x, N)

Moving average https://stackoverflow.com/questions/13728392/moving-average-or-running-mean (Alleo's answer)

INTERNAL_ID_3

geodezyx.stats.stats.running_mean_help()
geodezyx.stats.stats.sinusoide(T, A, omega, phi=0, f=None)

produce a sinusoidal waveform

Parameters:
  • T (float) – time variable.

  • A (float) – amplitude, the peak deviation of the function from zero.

  • omega (float, optional) – ω = 2πf, angular frequency, the rate of change of the function argument in units of radians per second.

  • phi (float, optional) – phase, specifies (in radians) where in its cycle the oscillation is at t = 0. The default is 0.

  • f (float) – ordinary frequency, the number of oscillations (cycles) that occur each second of time. If given, it overrides the angular frequency omega; thus, to use it, also declare omega = 0. The default is None.

Returns:

a sinusoidal waveform.

Return type:

float

Notes

https://en.wikipedia.org/wiki/Sine_wave
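The parameters above correspond to the standard sine-wave expression A * sin(ωT + φ). The sketch below mirrors the documented override of omega by f; `sinusoid` is a hypothetical stand-in, not the toolbox function.

```python
import math


def sinusoid(T, A, omega, phi=0.0, f=None):
    """A * sin(omega * T + phi); f, if given, replaces omega by 2*pi*f."""
    if f is not None:
        omega = 2.0 * math.pi * f
    return A * math.sin(omega * T + phi)


# A quarter of a period at 1 Hz, given once through f and once through omega.
y_from_f = sinusoid(T=0.25, A=2.0, omega=0, f=1.0)
y_from_omega = sinusoid(T=0.25, A=2.0, omega=2.0 * math.pi)
```

Both calls describe the same wave, so they return the same value: the peak amplitude A.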

geodezyx.stats.stats.smooth(x, window_len=11, window='hanning')

smooth the data using a window with requested size.

This method is based on the convolution of a scaled window with the signal. The signal is prepared by introducing reflected copies of the signal (with the window size) at both ends, so that transient parts are minimized at the beginning and end of the output signal.

NOTE (personal) : works only for equally spaced data …

input:

x : the input signal

window_len : the dimension of the smoothing window; should be an odd integer

window : the type of window from ‘flat’, ‘hanning’, ‘hamming’, ‘bartlett’, ‘blackman’

flat window will produce a moving average smoothing.

output:

the smoothed signal

example:

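import numpy as np
from geodezyx import stats
t = np.arange(-2, 2, 0.1)  # step of 0.1 (linspace would expect a number of points)
x = np.sin(t) + np.random.randn(len(t)) * 0.1
y = stats.smooth(x)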

see also:

numpy.hanning, numpy.hamming, numpy.bartlett, numpy.blackman, numpy.convolve scipy.signal.lfilter

TODO: the window parameter could be the window itself if an array instead of a string

NOTE: length(output) != length(input); to correct this, return y[(window_len/2-1):-(window_len/2)] instead of just y.

SOURCE : http://scipy-cookbook.readthedocs.io/items/SignalSmooth.html

geodezyx.stats.stats.time_win_basic(start, end, Tlisin, Datalisin, outposix=True, invert=False, out_array=False, out_boolis=False, only_boolis=False)

Internally, the function works in POSIX time

only_boolis : to gain speed, no operation is done on Tlisin & Datalisin; None is output for Tlisout and Datalisout

Outputs :
If out_boolis == True:

Tlisout , Datalisout , boolis

If out_boolis == False:

Tlisout , Datalisout
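The windowing logic can be sketched as follows. Inclusive bounds and the name `time_window` are assumptions made for illustration; the toolbox function offers more output options.

```python
from datetime import datetime


def time_window(start, end, times, data, invert=False):
    """Keep the (time, data) pairs with start <= time <= end (sketch)."""
    # Compare in POSIX seconds (naive datetimes are read as local time).
    t0, t1 = start.timestamp(), end.timestamp()
    flags = [(t0 <= t.timestamp() <= t1) != invert for t in times]
    times_out = [t for t, keep in zip(times, flags) if keep]
    data_out = [d for d, keep in zip(data, flags) if keep]
    return times_out, data_out, flags


times = [datetime(2020, 1, day) for day in range(1, 6)]
data = [10, 20, 30, 40, 50]
start, end = datetime(2020, 1, 2), datetime(2020, 1, 4)
t_in, d_in, flags = time_window(start, end, times, data)
t_out, d_out, _ = time_window(start, end, times, data, invert=True)
```

The boolean list plays the role of boolis: it can be reused to apply the same window to other data series sharing the same epochs.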

geodezyx.stats.stats.time_win_multi(start, end, Tlist, Datalislis, outposix=True, invert=False, out_array=False)
geodezyx.stats.stats.time_win_multi_start_end(Start_list_in, End_list_in, Tlisin, Datalisin, outposix=True, invert=False, out_array=False, out_boolis=False)

Internally, the function works in POSIX time

Outputs :
If out_boolis == True:

Tlisout , Datalisout , boolis_opera , boolis_stk (4 values !!)

If out_boolis == False:

Tlisout , Datalisout

geodezyx.stats.stats.wrapTo180(lon)
geodezyx.stats.stats.wrapTo360(lon)