geodezyx.utils_xtra package

Submodules

geodezyx.utils_xtra.pandas_utils module

@author: psakic

This sub-module of geodezyx.utils contains functions for operations related to Python’s Pandas object manipulations.

It can be imported directly with: from geodezyx import utils

The geodezyx toolbox is a collection of simple but useful functions for Geodesy and Geophysics, released under the GNU LGPL v3 License.

Copyright (C) 2019 Pierre Sakic et al. (IPGP, sakic@ipgp.fr) GitHub repository : https://github.com/IPGP/geodezyx

geodezyx.utils_xtra.pandas_utils.diff_pandas(DF, col_name, use_np_diff=False)

Differentiate a column of a Pandas DataFrame whose index is time.

This function calculates the difference between consecutive elements in a specified column of a DataFrame. The difference is divided by the difference in time (seconds) between the corresponding indices. This is essentially a derivative operation, assuming the index represents time.

Parameters:
  • DF (pandas.DataFrame) – The input DataFrame. The index should represent time.

  • col_name (str) – The name of the column in the DataFrame that you want to differentiate.

  • use_np_diff (bool, optional) – If True, use Numpy’s diff, which executes much faster. The default is False.

Returns:

The differentiated column of the input DataFrame, as a pandas.DataFrame or a numpy array. The docstring ties this choice to a ‘return_array’ parameter (False, the default, for a DataFrame; True for a numpy array), but that parameter is not part of the signature shown above.

Return type:

pandas.DataFrame or numpy.array
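
As a rough illustration of the operation described above, the following sketch divides the consecutive differences of a column by the elapsed seconds between index entries. The helper name time_derivative_sketch and the sample data are hypothetical, and this is not the geodezyx implementation.

```python
import pandas as pd


def time_derivative_sketch(df, col_name):
    """Hypothetical helper: finite-difference derivative of one column."""
    # Elapsed time between consecutive index entries, in seconds
    dt_seconds = df.index.to_series().diff().dt.total_seconds()
    # Difference between consecutive values, divided by the elapsed time
    return df[col_name].diff() / dt_seconds


index = pd.to_datetime(
    ["2020-01-01 00:00:00", "2020-01-01 00:00:10", "2020-01-01 00:00:30"]
)
positions = pd.DataFrame({"pos": [0.0, 5.0, 25.0]}, index=index)
rate = time_derivative_sketch(positions, "pos")
```

The first element of the result is NaN, because the first sample has no predecessor to difference against.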

geodezyx.utils_xtra.pandas_utils.pandas_DF_2_tuple_serie(DFin, columns_name_list, reset_index_first=False)

Convert the selected columns of a DataFrame into a Series of tuples, to solve the multiple-column selection problem.

Parameters:
  • DFin (pandas.DataFrame) – Input DataFrame

  • columns_name_list (list) – List of column names to select

  • reset_index_first (bool, optional) – Reset index before conversion (default False)

Returns:

Series of tuples

Return type:

pandas.Series

Notes

The idea is:

S1 = pandas_DF_2_tuple_serie(DF1, columns_name_list)
S2 = pandas_DF_2_tuple_serie(DF2, columns_name_list)
BOOL = S1.isin(S2)
DF1[BOOL]

References

https://stackoverflow.com/questions/53432043/pandas-dataframe-selection-of-multiple-elements-in-several-columns
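
The selection idea from the Notes can be sketched with plain pandas as follows. The helper tuple_series_sketch and the two small DataFrames are hypothetical stand-ins, and the library's own conversion may differ in detail.

```python
import pandas as pd


def tuple_series_sketch(df, columns):
    """Hypothetical helper: one tuple per row, built from the given columns."""
    rows = df[columns].itertuples(index=False, name=None)
    return pd.Series(list(rows), index=df.index)


df1 = pd.DataFrame({"id": [1, 2, 3], "code": ["a", "b", "c"]})
df2 = pd.DataFrame({"id": [2, 3], "code": ["b", "x"]})

s1 = tuple_series_sketch(df1, ["id", "code"])
s2 = tuple_series_sketch(df2, ["id", "code"])

# Keep the rows of df1 whose (id, code) pair also occurs in df2
mask = s1.isin(s2)
selected = df1[mask]
```

Comparing whole tuples avoids the false matches that two independent per-column isin tests would produce: here the row (3, "c") is rejected even though 3 occurs in df2.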

geodezyx.utils_xtra.pandas_utils.pandas_DF_print(DFin)

geodezyx.utils_xtra.pandas_utils.pandas_column_rename_dic(*inpnames)

Wrapper of renamedic_fast_4_pandas for Pandas.

Parameters:

*inpnames (str) – Column names to rename

Returns:

Dictionary mapping column indices to new names

Return type:

dict

Notes

Example:

rnamedic = utils.renamedic_fast_4_pandas(*["zmax","ang","zsmooth","smoothtype","xgrad","ygrad",
                                   'r_eiko','z_eiko','pt_eiko_x','pt_eiko_y',"t_eiko",
                                   'r_sd',  'z_sd',  'pt_sd_x'  ,'pt_sd_y'  ,'t_sd',
                                   'diff_x','diff_y','diff','diff_t'])

pda = pda.rename(columns = rnamedic)

geodezyx.utils_xtra.pandas_utils.renamedic_fast_4_pandas(*inpnames)

Create rename dictionary for Pandas columns.

Parameters:

*inpnames (str) – Column names

Returns:

Dictionary mapping column indices to new names

Return type:

dict
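
A minimal sketch of such a dictionary, assuming the keys are positional column indices starting at 0 (the documentation does not state the starting index). The helper rename_map_sketch is hypothetical and only mimics the documented return value.

```python
import pandas as pd


def rename_map_sketch(*names):
    """Hypothetical helper: map positions 0, 1, 2, ... to the given names."""
    return dict(enumerate(names))


rename_map = rename_map_sketch("zmax", "ang", "zsmooth")

# A DataFrame built without column names gets integer labels 0, 1, 2
table = pd.DataFrame([[1.0, 30.0, 0.5]])
table = table.rename(columns=rename_map)
```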

Notes

Example:

rnamedic = utils.renamedic_fast_4_pandas(*["zmax","ang","zsmooth","smoothtype","xgrad","ygrad",
                                   'r_eiko','z_eiko','pt_eiko_x','pt_eiko_y',"t_eiko",
                                   'r_sd',  'z_sd',  'pt_sd_x'  ,'pt_sd_y'  ,'t_sd',
                                   'diff_x','diff_y','diff','diff_t'])

pda = pda.rename(columns = rnamedic)

geodezyx.utils_xtra.pandas_utils.weighted_average(df, data_col, weight_col, by_col)

Source

https://stackoverflow.com/questions/31521027/groupby-weighted-average-and-sum-in-pandas-dataframe
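
The function has no description here, so the sketch below only illustrates the technique named in the linked question: a weighted average computed per group. The helper weighted_average_sketch, its return type, and the sample data are assumptions, not the geodezyx code.

```python
import pandas as pd


def weighted_average_sketch(df, data_col, weight_col, by_col):
    """Hypothetical helper: sum(data * weight) / sum(weight) for each group."""
    weighted = df[data_col] * df[weight_col]
    weighted_sum = weighted.groupby(df[by_col]).sum()
    weight_sum = df[weight_col].groupby(df[by_col]).sum()
    return weighted_sum / weight_sum


obs = pd.DataFrame(
    {
        "station": ["a", "a", "b"],
        "value": [1.0, 3.0, 10.0],
        "weight": [1.0, 3.0, 2.0],
    }
)
averages = weighted_average_sketch(obs, "value", "weight", "station")
```

For station "a" this gives (1*1 + 3*3) / (1 + 3) = 2.5, and for station "b" the single value 10.0.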

geodezyx.utils_xtra.plot_utils module

@author: psakic

This sub-module of geodezyx.utils contains functions for operations related to Python’s plot operations.

It can be imported directly with: from geodezyx import utils

geodezyx.utils_xtra.plot_utils.axis_data_coords_sys_transform(axis_obj_in, xin, yin, inverse=False)

Transform coordinates between the axis and data coordinate systems.

inverse = False : Axis => Data
inverse = True : Data => Axis
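
The conversion between the two coordinate systems can be sketched with matplotlib's standard transforms (ax.transAxes and ax.transData). The helper axes_to_data_sketch is hypothetical and assumes that "Axis" means axes-fraction coordinates (0 to 1 across the plot area); the library function may behave differently.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt


def axes_to_data_sketch(ax, x, y, inverse=False):
    """Hypothetical helper: axes fraction -> data, or data -> axes fraction."""
    if inverse:
        # data -> display pixels -> axes fraction
        trans = ax.transData + ax.transAxes.inverted()
    else:
        # axes fraction -> display pixels -> data
        trans = ax.transAxes + ax.transData.inverted()
    xout, yout = trans.transform((x, y))
    return float(xout), float(yout)


fig, ax = plt.subplots()
ax.set_xlim(0, 10)
ax.set_ylim(0, 100)

x_data, y_data = axes_to_data_sketch(ax, 0.5, 0.5)  # centre of the axes
x_axes, y_axes = axes_to_data_sketch(ax, 5, 50, inverse=True)
```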

geodezyx.utils_xtra.plot_utils.color_list(l, colormap='jet')

Generate a list of colors from a colormap for each unique value in l.

Parameters:
  • l (array-like) – Input list or array of values. The number of unique values determines the number of colors.

  • colormap (str, optional) – Name of the matplotlib colormap to use. The default is ‘jet’.

Returns:

colist – List of RGBA color tuples from the specified colormap.

Return type:

list of tuple

Notes

The colors are evenly distributed across the colormap based on the number of unique values in l.

See also

matplotlib.pyplot.get_cmap

Get a colormap by name.

colors_from_colormap_getter

Alternative function to get colors from a colormap.

geodezyx.utils_xtra.plot_utils.colors_from_colormap_getter(ncolors, colormap='viridis')

Get a list of colors from a matplotlib colormap.

Parameters:
  • ncolors (int) – Number of colors to generate.

  • colormap (str, optional) – Name of the matplotlib colormap to use. The default is ‘viridis’.

Returns:

List of RGBA color tuples from the specified colormap.

Return type:

list of tuple

See also

color_list

Generate colors based on unique values in a list.
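
Both colormap helpers rely on sampling a matplotlib colormap at evenly spaced positions. A minimal sketch of that technique follows; the two helper names are hypothetical, and returning a dictionary keyed by unique value is a choice made for this illustration (color_list itself is documented as returning a list).

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
import numpy as np


def sample_colormap_sketch(ncolors, colormap="viridis"):
    """Hypothetical helper: ncolors RGBA tuples spread evenly over a colormap."""
    cmap = plt.get_cmap(colormap)
    positions = np.linspace(0.0, 1.0, ncolors)
    return [tuple(float(c) for c in cmap(p)) for p in positions]


def colors_per_unique_value_sketch(values, colormap="jet"):
    """Hypothetical helper: one color for each unique value."""
    unique_values = sorted(set(values))
    colors = sample_colormap_sketch(len(unique_values), colormap)
    return dict(zip(unique_values, colors))


palette = sample_colormap_sketch(3)
by_station = colors_per_unique_value_sketch(["A", "B", "A", "C"])
```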

geodezyx.utils_xtra.plot_utils.figure_saver(figobjt_in, outdir, outname, outtype=('.png', '.pdf', '.figpik'), formt=None, dpi=200, transparent=False)

This function provides a front end to export plots to one or more output formats.

Parameters:
  • figobjt_in (matplotlib Figure object) – Input matplotlib Figure object. Use, for instance, plt.gcf() to get it.

  • outdir (str) – the output directory.

  • outname (str) – output prefix filename.

  • outtype (tuple, optional) – the output formats. The default is (‘.png’,’.pdf’,’.figpik’).

  • formt (2-tuple or str, optional) – The format (size) of the plot. If a string: an ISO A-series paper format (A4, A3, etc.). If a tuple: the size of the plot in inches. The default is None.

  • dpi (int, optional) – DPI of the figure. The default is 200.

  • transparent (bool, optional) – make the plot transparent. The default is False.

Returns:

outpath_stk – output paths of the plots.

Return type:

string or list of string

geodezyx.utils_xtra.plot_utils.gaussian_for_plot(d, density=False, nbins=500, nsigma=3.5)

Generate a Gaussian curve for histogram overlay plots.

Parameters:
  • d (array-like) – Data vector to fit a Gaussian distribution to.

  • density (bool, optional) – If True, returns the PDF (normalized). If False, scales the PDF to match histogram area. The default is False.

  • nbins (int, optional) – Number of bins (points) to generate for the curve. The default is 500.

  • nsigma (float, optional) – Number of standard deviations to span for the x-axis (μ ± nsigma*σ). The default is 3.5.

Returns:

  • xpdf (numpy.ndarray) – X coordinates of the Gaussian curve.

  • ypdf_out (numpy.ndarray) – Y coordinates of the Gaussian curve (PDF or histogram-scaled).

Notes

Useful for overlaying a fitted Gaussian curve on a histogram. When density=False, the curve is scaled to match the histogram’s area. When density=True, the curve is the probability density function.

Examples

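>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from geodezyx.utils_xtra.plot_utils import gaussian_for_plot
>>> data = np.random.randn(1000)
>>> # density=True on both calls so the curve and the histogram share a scale
>>> x_curve, y_curve = gaussian_for_plot(data, density=True)
>>> plt.hist(data, bins=50, density=True)
>>> plt.plot(x_curve, y_curve, 'r-', label='Gaussian fit')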

See also

scipy.stats.norm.pdf

Probability density function for normal distribution.
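
For the density=True case, the curve described above can be sketched with numpy alone: estimate the mean and standard deviation, then evaluate the normal PDF over μ ± nsigma*σ. The helper gaussian_curve_sketch is hypothetical; the histogram scaling used when density=False is not specified in this documentation and is left out.

```python
import numpy as np


def gaussian_curve_sketch(d, nbins=500, nsigma=3.5):
    """Hypothetical helper: normal PDF fitted to d, sampled at nbins points."""
    d = np.asarray(d, dtype=float)
    mu = d.mean()
    sigma = d.std()
    x = np.linspace(mu - nsigma * sigma, mu + nsigma * sigma, nbins)
    y = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return x, y


rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=1000)
x_curve, y_curve = gaussian_curve_sketch(data)

# Trapezoidal area under the curve; close to 1 because ±3.5σ covers
# nearly all of the distribution
area = float(np.sum((y_curve[1:] + y_curve[:-1]) / 2.0 * np.diff(x_curve)))
```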

geodezyx.utils_xtra.plot_utils.get_figure(figin=0)

Get or create a matplotlib Figure object.

Parameters:

figin (int or matplotlib.figure.Figure, optional) – Figure specification. If 0, creates a new figure. If an integer, returns figure with that number. If a Figure object, returns that figure. The default is 0.

Returns:

figout – The requested or created figure object with at least one axes.

Return type:

matplotlib.figure.Figure

Notes

Ensures the returned figure has at least one axes (subplot).

geodezyx.utils_xtra.plot_utils.id2val(value_lis, id_lis, idin)

From a value list and an ID pointer list, return the value that matches the given ID. This replaces a dictionary, because a set is not supported as a dictionary key.
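
A minimal sketch of this lookup, assuming the two lists are parallel (value_lis[i] belongs to id_lis[i]). The helper id2val_sketch and the sample data are hypothetical.

```python
def id2val_sketch(value_lis, id_lis, idin):
    """Hypothetical helper: value stored at the position where idin occurs."""
    # list.index compares with ==, so unhashable IDs such as sets work
    return value_lis[id_lis.index(idin)]


station_sets = [{"A", "B"}, {"C"}]  # sets cannot be used as dictionary keys
labels = ["pair", "single"]

found = id2val_sketch(labels, station_sets, {"C"})
```

In this sketch an unknown ID raises ValueError, the standard behavior of list.index.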

geodezyx.utils_xtra.plot_utils.set_size_for_pub(width=418.25368, fraction=1, subplot=None)

Set aesthetic figure dimensions to avoid scaling in LaTeX.

Parameters:
  • width (float, optional) – Width of the figure in points (pt). The default is 418.25368 (approximately 147 mm).

  • fraction (float, optional) – Fraction of the width that the figure should occupy. The default is 1.

  • subplot (list of int, optional) – Subplot grid dimensions as [nrows, ncols]. The default is None, which corresponds to [1, 1].

Returns:

fig_dim – Dimensions of the figure as (width_inches, height_inches).

Return type:

tuple of float

Notes

Uses the inverse of the golden ratio, (√5 - 1) / 2 ≈ 0.618, as the height-to-width ratio to set an aesthetic figure height. Useful for creating publication-ready plots that fit nicely in LaTeX documents.
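
The computation can be sketched as follows, assuming TeX points (72.27 pt per inch) and a height scaled by nrows / ncols. Both are assumptions made for illustration, and figure_size_sketch is a hypothetical name, not the library function.

```python
def figure_size_sketch(width_pt=418.25368, fraction=1, subplot=None):
    """Hypothetical helper: (width, height) in inches for a LaTeX figure."""
    nrows, ncols = subplot if subplot is not None else (1, 1)
    inches_per_pt = 1.0 / 72.27  # assumption: TeX points, not PostScript points
    height_ratio = (5 ** 0.5 - 1) / 2  # inverse golden ratio, about 0.618
    width_in = width_pt * fraction * inches_per_pt
    height_in = width_in * height_ratio * (nrows / ncols)
    return width_in, height_in


full_width, full_height = figure_size_sketch()
half_width, half_height = figure_size_sketch(fraction=0.5, subplot=[2, 2])
```

With the default width this gives a figure about 5.79 inches wide and 3.58 inches high.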

Examples

>>> import matplotlib.pyplot as plt
>>> width = 418.25368  # Standard LaTeX column width
>>> fig_dim = set_size_for_pub(width, fraction=0.5, subplot=[2, 2])
>>> fig = plt.figure(figsize=fig_dim)

geodezyx.utils_xtra.plot_utils.symbols_list(l=None)

geodezyx.utils_xtra.plot_utils.ylim_easy(lin, delta=0.1, min_null_if_neg=False)

Calculate convenient axis limits for a data array.

Parameters:
  • lin (array-like) – Input data.

  • delta (float, optional) – Fraction of the data range to add as padding. The default is 0.1.

  • min_null_if_neg (bool, optional) – If True, set the lower limit to 0 if it would be negative. The default is False.

Returns:

(lower_limit, upper_limit) for the y-axis.

Return type:

tuple of float

Notes

Useful for automatic axis limit calculation in plots.
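
One plausible reading of the parameters is sketched below: pad both ends of the data range by delta times the range, then optionally clamp a negative lower limit to 0. The helper padded_limits_sketch is hypothetical, and the exact padding rule of ylim_easy may differ.

```python
def padded_limits_sketch(lin, delta=0.1, min_null_if_neg=False):
    """Hypothetical helper: data range widened by delta * range on each side."""
    low, high = min(lin), max(lin)
    padding = delta * (high - low)
    lower_limit = low - padding
    upper_limit = high + padding
    if min_null_if_neg and lower_limit < 0:
        lower_limit = 0.0  # avoid showing a negative axis region
    return lower_limit, upper_limit


limits = padded_limits_sketch([0, 4, 10])
clamped = padded_limits_sketch([0, 4, 10], min_null_if_neg=True)
```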