geodezyx.utils_xtra package
Submodules
geodezyx.utils_xtra.pandas_utils module
@author: psakic
This sub-module of geodezyx.utils contains functions for manipulating Python’s Pandas objects.
It can be imported directly with: from geodezyx import utils
The geodezyx toolbox is a collection of simple but useful functions for Geodesy and Geophysics, released under the GNU LGPL v3 License.
Copyright (C) 2019 Pierre Sakic et al. (IPGP, sakic@ipgp.fr) GitHub repository : https://github.com/IPGP/geodezyx
- geodezyx.utils_xtra.pandas_utils.diff_pandas(DF, col_name, use_np_diff=False)
Differentiate a column of a Pandas DataFrame whose index is time.
This function calculates the difference between consecutive elements in a specified column of a DataFrame. The difference is divided by the difference in time (seconds) between the corresponding indices. This is essentially a derivative operation, assuming the index represents time.
- Parameters:
DF (pandas.DataFrame) – The input DataFrame. The index should represent time.
col_name (str) – The name of the column in the DataFrame that you want to differentiate.
use_np_diff (bool, optional) – If True, use NumPy’s diff, which executes (much) faster. The default is False.
- Returns:
The differentiated column of the input DataFrame. The type of the return value depends on the ‘return_array’ parameter. If ‘return_array’ is False (default), a DataFrame is returned. If ‘return_array’ is True, a numpy array is returned.
- Return type:
pandas.DataFrame or numpy.array
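The derivative described above can be sketched with plain pandas and NumPy. This is a minimal illustration and not the geodezyx implementation: the helper name `time_derivative` and the sample data are hypothetical, and the index is assumed to be a `DatetimeIndex`.

```python
import numpy as np
import pandas as pd


def time_derivative(df, col_name):
    """Finite-difference derivative of df[col_name] per second (hypothetical helper)."""
    dvalue = df[col_name].diff()                          # value differences
    dt = df.index.to_series().diff().dt.total_seconds()   # time steps in seconds
    return dvalue / dt


idx = pd.to_datetime(
    ["2020-01-01 00:00:00", "2020-01-01 00:00:10", "2020-01-01 00:00:30"]
)
df = pd.DataFrame({"height": [0.0, 5.0, 25.0]}, index=idx)

# pandas variant: the first element is NaN because it has no predecessor
rate = time_derivative(df, "height")

# NumPy variant: one element shorter, no leading NaN
dt_s = np.diff(df.index.to_numpy()) / np.timedelta64(1, "s")
rate_np = np.diff(df["height"].to_numpy()) / dt_s
```

The pandas variant keeps the original index, while the NumPy variant returns a bare array that is one element shorter.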
- geodezyx.utils_xtra.pandas_utils.pandas_DF_2_tuple_serie(DFin, columns_name_list, reset_index_first=False)
Convert several columns of a DataFrame into a single Series of tuples, which solves the problem of selecting rows on multiple columns at once.
- Parameters:
DFin (pandas.DataFrame) – Input DataFrame
columns_name_list (list) – List of column names to select
reset_index_first (bool, optional) – Reset index before conversion (default False)
- Returns:
Series of tuples
- Return type:
pandas.Series
Notes
The idea is:
S1 = pandas_DF_2_tuple_serie(DF1, columns_name_list)
S2 = pandas_DF_2_tuple_serie(DF2, columns_name_list)
BOOL = S1.isin(S2)
DF1[BOOL]
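The selection idea in the Notes can be reproduced with plain pandas. The sketch below is an assumption about how such a tuple Series might be built; `rows_as_tuples` and the sample DataFrames are hypothetical and stand in for the geodezyx function.

```python
import pandas as pd


def rows_as_tuples(df, columns):
    """Return a Series holding one tuple per row, built from `columns` (hypothetical helper)."""
    return pd.Series(list(zip(*(df[c] for c in columns))), index=df.index)


df1 = pd.DataFrame(
    {"site": ["A", "B", "C"], "year": [2019, 2020, 2021], "v": [1.0, 2.0, 3.0]}
)
df2 = pd.DataFrame({"site": ["A", "C", "C"], "year": [2019, 2020, 2021]})

cols = ["site", "year"]
s1 = rows_as_tuples(df1, cols)
s2 = rows_as_tuples(df2, cols)

mask = s1.isin(s2)   # True where the (site, year) pair also exists in df2
common = df1[mask]   # rows of df1 whose key pair is present in df2
```

Comparing tuples rather than single columns means a row is kept only when all the selected columns match together.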
- geodezyx.utils_xtra.pandas_utils.pandas_DF_print(DFin)
- geodezyx.utils_xtra.pandas_utils.pandas_column_rename_dic(*inpnames)
Wrapper of renamedic_fast_4_pandas for Pandas.
- Parameters:
*inpnames (str) – Column names to rename
- Returns:
Dictionary mapping column indices to new names
- Return type:
dict
Notes
Example:
rnamedic = utils.renamedic_fast_4_pandas(*["zmax", "ang", "zsmooth", "smoothtype", "xgrad", "ygrad",
    'r_eiko', 'z_eiko', 'pt_eiko_x', 'pt_eiko_y', "t_eiko",
    'r_sd', 'z_sd', 'pt_sd_x', 'pt_sd_y', 't_sd',
    'diff_x', 'diff_y', 'diff', 'diff_t'])
pda = pda.rename(columns=rnamedic)
- geodezyx.utils_xtra.pandas_utils.renamedic_fast_4_pandas(*inpnames)
Create rename dictionary for Pandas columns.
- Parameters:
*inpnames (str) – Column names
- Returns:
Dictionary mapping column indices to new names
- Return type:
dict
Notes
Example:
rnamedic = utils.renamedic_fast_4_pandas(*["zmax", "ang", "zsmooth", "smoothtype", "xgrad", "ygrad",
    'r_eiko', 'z_eiko', 'pt_eiko_x', 'pt_eiko_y', "t_eiko",
    'r_sd', 'z_sd', 'pt_sd_x', 'pt_sd_y', 't_sd',
    'diff_x', 'diff_y', 'diff', 'diff_t'])
pda = pda.rename(columns=rnamedic)
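To make the returned dictionary concrete, here is a minimal sketch, assuming that "column indices" means the integer positions 0, 1, 2, ... in the order the names are given. The helper `rename_dict_from_names` is hypothetical and only mimics the documented behaviour.

```python
import pandas as pd


def rename_dict_from_names(*names):
    """Map integer column positions to the given names (hypothetical helper)."""
    return dict(enumerate(names))


# A DataFrame built without column names gets integer labels 0, 1, 2
raw = pd.DataFrame([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

rnamedic = rename_dict_from_names("zmax", "ang", "zsmooth")
named = raw.rename(columns=rnamedic)
```

This pattern is handy after reading a headerless file, where pandas assigns integer column labels by default.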
- geodezyx.utils_xtra.pandas_utils.weighted_average(df, data_col, weight_col, by_col)
Source
https://stackoverflow.com/questions/31521027/groupby-weighted-average-and-sum-in-pandas-dataframe
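The signature and the linked question suggest a group-wise weighted average. The sketch below shows one common way to compute it with pandas; it is an assumption about the intent, not the geodezyx code, and `grouped_weighted_average` and the sample data are hypothetical.

```python
import pandas as pd


def grouped_weighted_average(df, data_col, weight_col, by_col):
    """Weighted mean of data_col, weighted by weight_col, per group of by_col (hypothetical helper)."""
    weighted = df[data_col] * df[weight_col]
    numerator = weighted.groupby(df[by_col]).sum()
    denominator = df[weight_col].groupby(df[by_col]).sum()
    return numerator / denominator


df = pd.DataFrame(
    {"g": ["a", "a", "b"], "x": [1.0, 3.0, 10.0], "w": [1.0, 3.0, 2.0]}
)
wavg = grouped_weighted_average(df, "x", "w", "g")
# group "a": (1*1 + 3*3) / (1 + 3) = 2.5 ; group "b": (10*2) / 2 = 10.0
```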
geodezyx.utils_xtra.plot_utils module
@author: psakic
This sub-module of geodezyx.utils contains functions related to plotting in Python.
It can be imported directly with: from geodezyx import utils
- geodezyx.utils_xtra.plot_utils.axis_data_coords_sys_transform(axis_obj_in, xin, yin, inverse=False)
Transform coordinates between the axis and the data coordinate systems.
inverse = False : Axis => Data
inverse = True : Data => Axis
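The two directions can be illustrated with matplotlib's own transforms. This is a sketch of the general technique and not necessarily how geodezyx implements it; `axes_to_data` and `data_to_axes` are hypothetical helper names. Axis coordinates run from 0 to 1 across the axes, data coordinates follow the axis limits.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlim(0, 10)
ax.set_ylim(0, 100)


def axes_to_data(ax, x, y):
    """Axis (0..1) coordinates -> data coordinates (hypothetical helper)."""
    display = ax.transAxes.transform((x, y))           # axis -> display pixels
    return ax.transData.inverted().transform(display)  # display pixels -> data


def data_to_axes(ax, x, y):
    """Data coordinates -> axis (0..1) coordinates (hypothetical helper)."""
    display = ax.transData.transform((x, y))           # data -> display pixels
    return ax.transAxes.inverted().transform(display)  # display pixels -> axis


xd, yd = axes_to_data(ax, 0.5, 0.5)    # centre of the axes, in data units
xa, ya = data_to_axes(ax, 5.0, 50.0)   # the same point, back in axis units
```

Both conversions pass through display (pixel) coordinates, which is the common space shared by all matplotlib transforms.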
- geodezyx.utils_xtra.plot_utils.color_list(l, colormap='jet')
Generate a list of colors from a colormap for each unique value in l.
- Parameters:
l (array-like) – Input list or array of values. The number of unique values determines the number of colors.
colormap (str, optional) – Name of the matplotlib colormap to use. The default is ‘jet’.
- Returns:
colist – List of RGBA color tuples from the specified colormap.
- Return type:
list of tuple
Notes
The colors are evenly distributed across the colormap based on the number of unique values in l.
See also
matplotlib.pyplot.get_cmap : Get a colormap by name.
colors_from_colormap_getter : Alternative function to get colors from a colormap.
- geodezyx.utils_xtra.plot_utils.colors_from_colormap_getter(ncolors, colormap='viridis')
Get a list of colors from a matplotlib colormap.
- Parameters:
ncolors (int) – Number of colors to generate.
colormap (str, optional) – Name of the matplotlib colormap to use. The default is ‘viridis’.
- Returns:
List of RGBA color tuples from the specified colormap.
- Return type:
list of tuple
See also
color_list : Generate colors based on unique values in a list.
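Both colormap helpers rest on sampling a colormap at evenly spaced positions. The sketch below shows that idea with matplotlib only; `evenly_spaced_colors` and `colors_for_unique_values` are hypothetical names, and returning a value-to-color dictionary is an illustrative choice rather than the documented return type of color_list.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt


def evenly_spaced_colors(ncolors, colormap="viridis"):
    """Sample `ncolors` RGBA tuples evenly across a colormap (hypothetical helper)."""
    cmap = plt.get_cmap(colormap)
    return [cmap(float(x)) for x in np.linspace(0.0, 1.0, ncolors)]


def colors_for_unique_values(values, colormap="jet"):
    """Assign one color to each unique value (hypothetical helper)."""
    uniques = sorted(set(values))
    palette = evenly_spaced_colors(len(uniques), colormap)
    return dict(zip(uniques, palette))


palette = evenly_spaced_colors(4)
by_value = colors_for_unique_values(["G", "R", "G", "E"])
```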
- geodezyx.utils_xtra.plot_utils.figure_saver(figobjt_in, outdir, outname, outtype=('.png', '.pdf', '.figpik'), formt=None, dpi=200, transparent=False)
This function provides a front end for exporting plots.
- Parameters:
figobjt_in (matplotlib Figure object) – Input matplotlib Figure object. Use, for instance, plt.gcf() to get it.
outdir (str) – The output directory.
outname (str) – Output filename prefix.
outtype (tuple, optional) – The output formats. The default is ('.png', '.pdf', '.figpik').
formt (2-tuple or string, optional) – The format (size) of the plot. If a string: an Ax paper format (A4, A3, etc.). If a tuple: the size of the plot in inches. The default is None.
dpi (int, optional) – DPI of the figure. The default is 200.
transparent (bool, optional) – Make the plot transparent. The default is False.
- Returns:
outpath_stk – output paths of the plots.
- Return type:
string or list of string
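Exporting one figure to several formats can be sketched as a loop over extensions. This is an illustration under assumptions, not the geodezyx implementation: `save_figure_multi` is a hypothetical helper, and treating '.figpik' as a pickled Figure object is a guess based on the extension name.

```python
import os
import pickle
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt


def save_figure_multi(fig, outdir, outname, outtype=(".png", ".pdf"),
                      dpi=200, transparent=False, pickle_ext=".figpik"):
    """Save `fig` once per extension and return the output paths (hypothetical helper)."""
    paths = []
    for ext in outtype:
        path = os.path.join(outdir, outname + ext)
        if ext == pickle_ext:
            # Assumption: this extension stores the Figure object itself
            with open(path, "wb") as fh:
                pickle.dump(fig, fh)
        else:
            # savefig infers the file format from the extension
            fig.savefig(path, dpi=dpi, transparent=transparent)
        paths.append(path)
    return paths


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

outdir = tempfile.mkdtemp()
paths = save_figure_multi(fig, outdir, "demo", outtype=(".png", ".pdf", ".figpik"))
```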
- geodezyx.utils_xtra.plot_utils.gaussian_for_plot(d, density=False, nbins=500, nsigma=3.5)
Generate a Gaussian curve for histogram overlay plots.
- Parameters:
d (array-like) – Data vector to fit a Gaussian distribution to.
density (bool, optional) – If True, returns the PDF (normalized). If False, scales the PDF to match histogram area. The default is False.
nbins (int, optional) – Number of bins (points) to generate for the curve. The default is 500.
nsigma (float, optional) – Number of standard deviations to span for the x-axis (μ ± nsigma*σ). The default is 3.5.
- Returns:
xpdf (numpy.ndarray) – X coordinates of the Gaussian curve.
ypdf_out (numpy.ndarray) – Y coordinates of the Gaussian curve (PDF or histogram-scaled).
Notes
Useful for overlaying a fitted Gaussian curve on a histogram. When density=False, the curve is scaled to match the histogram’s area. When density=True, the curve is the probability density function.
Examples
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> data = np.random.randn(1000)
>>> x_curve, y_curve = gaussian_for_plot(data, density=True)
>>> plt.hist(data, bins=50, density=True)
>>> plt.plot(x_curve, y_curve, 'r-', label='Gaussian fit')
See also
scipy.stats.norm.pdf : Probability density function for normal distribution.
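The curve construction can be sketched with NumPy alone. This is an illustration, not the geodezyx code: `gaussian_curve` is a hypothetical helper, and the count scaling (number of samples times an explicit `hist_bin_width`) is an assumption about what "match histogram area" means.

```python
import numpy as np


def gaussian_curve(d, density=True, nbins=500, nsigma=3.5, hist_bin_width=None):
    """Gaussian fitted to `d`, sampled on mu +/- nsigma*sigma (hypothetical helper)."""
    d = np.asarray(d, dtype=float)
    mu, sigma = d.mean(), d.std()
    x = np.linspace(mu - nsigma * sigma, mu + nsigma * sigma, nbins)
    pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    if density:
        return x, pdf
    if hist_bin_width is None:
        raise ValueError("hist_bin_width is required when density=False")
    # Assumption: a count histogram has area N * bin width
    return x, pdf * d.size * hist_bin_width


rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=0.5, size=5000)

x_pdf, y_pdf = gaussian_curve(data)                                    # normalized PDF
x_cnt, y_cnt = gaussian_curve(data, density=False, hist_bin_width=0.1)  # count-scaled

# Riemann-sum check: the PDF integrates to about 1 over +/- 3.5 sigma
area = float(np.sum(y_pdf) * (x_pdf[1] - x_pdf[0]))
```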
- geodezyx.utils_xtra.plot_utils.get_figure(figin=0)
Get or create a matplotlib Figure object.
- Parameters:
figin (int or matplotlib.figure.Figure, optional) – Figure specification. If 0, creates a new figure. If an integer, returns figure with that number. If a Figure object, returns that figure. The default is 0.
- Returns:
figout – The requested or created figure object with at least one axes.
- Return type:
matplotlib.figure.Figure
Notes
Ensures the returned figure has at least one axes (subplot).
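The dispatch on the type of figin can be sketched as follows. This only mirrors the behaviour described above and is not the geodezyx source; `resolve_figure` is a hypothetical name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.figure import Figure


def resolve_figure(figin=0):
    """Return a Figure for `figin`, making sure it has at least one axes (hypothetical helper)."""
    if isinstance(figin, Figure):
        fig = figin                # already a Figure: use it as is
    elif figin == 0:
        fig = plt.figure()         # 0 means "create a new figure"
    else:
        fig = plt.figure(figin)    # integer: get (or create) the figure with that number
    if not fig.axes:
        fig.add_subplot(111)       # guarantee one axes
    return fig


new_fig = resolve_figure()
same_fig = resolve_figure(new_fig)
numbered = resolve_figure(7)
```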
- geodezyx.utils_xtra.plot_utils.id2val(value_lis, id_lis, idin)
From a value list and an ID pointer list, return the value that matches the given ID. This replaces a dictionary, because a set is not supported as a key.
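A parallel-list lookup of this kind can be sketched in a few lines. This is an interpretation of the description, not the geodezyx source; `value_from_id` and the sample lists are hypothetical. Because `list.index` compares by equality rather than by hash, the IDs may be unhashable objects such as sets.

```python
def value_from_id(value_lis, id_lis, idin):
    """Return the value stored at the position where `idin` appears in `id_lis` (hypothetical helper)."""
    return value_lis[id_lis.index(idin)]


ids = [{"a", "b"}, {"c"}]   # sets cannot be dictionary keys
values = ["red", "blue"]

picked = value_from_id(values, ids, {"c"})
```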
- geodezyx.utils_xtra.plot_utils.set_size_for_pub(width=418.25368, fraction=1, subplot=None)
Set aesthetic figure dimensions to avoid scaling in LaTeX.
- Parameters:
width (float, optional) – Width of the figure in points (pt). The default is 418.25368 (approximately 147 mm).
fraction (float, optional) – Fraction of the width that the figure should occupy. The default is 1.
subplot (list of int, optional) – Subplot grid dimensions as [nrows, ncols]. The default is [1, 1].
- Returns:
fig_dim – Dimensions of the figure as (width_inches, height_inches).
- Return type:
tuple of float
Notes
Uses the inverse of the golden ratio (1/φ = (√5 - 1) / 2 ≈ 0.618) as the height-to-width ratio to set an aesthetic figure height. Useful for creating publication-ready plots that fit nicely in LaTeX documents.
Examples
>>> import matplotlib.pyplot as plt
>>> width = 418.25368  # Standard LaTeX column width
>>> fig_dim = set_size_for_pub(width, fraction=0.5, subplot=[2, 2])
>>> fig = plt.figure(figsize=fig_dim)
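The size computation can be sketched as plain arithmetic. The version below is an assumption, not the geodezyx source: `figure_size_inches` is a hypothetical name, the conversion uses 72.27 TeX points per inch, and the height is scaled by nrows/ncols.

```python
def figure_size_inches(width_pt=418.25368, fraction=1.0, subplot=(1, 1)):
    """Return (width, height) in inches for a figure of `width_pt` points (hypothetical helper)."""
    inches_per_pt = 1.0 / 72.27            # TeX points per inch (assumption)
    height_ratio = (5 ** 0.5 - 1) / 2      # about 0.618
    nrows, ncols = subplot
    width_in = width_pt * fraction * inches_per_pt
    height_in = width_in * height_ratio * (nrows / ncols)
    return width_in, height_in


full = figure_size_inches()                             # full text width, single panel
half = figure_size_inches(fraction=0.5, subplot=(2, 2))  # half width, 2 x 2 grid
tall = figure_size_inches(subplot=(2, 1))                # two stacked panels
```

Passing the resulting tuple to `figsize` lets the figure be included in LaTeX at its natural size, so fonts are not rescaled.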
- geodezyx.utils_xtra.plot_utils.symbols_list(l=None)
- geodezyx.utils_xtra.plot_utils.ylim_easy(lin, delta=0.1, min_null_if_neg=False)
Calculate convenient axis limits for a data array.
- Parameters:
lin (array-like) – Input data.
delta (float, optional) – Fraction of the data range to add as padding. The default is 0.1.
min_null_if_neg (bool, optional) – If True, set the lower limit to 0 if it would be negative. The default is False.
- Returns:
(lower_limit, upper_limit) for the y-axis.
- Return type:
tuple of float
Notes
Useful for automatic axis limit calculation in plots.
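The padding rule can be sketched as follows. This is an assumption about the computation, not the geodezyx source: `padded_limits` is a hypothetical name, and the padding is taken as delta times the data range on each side.

```python
def padded_limits(lin, delta=0.1, min_null_if_neg=False):
    """Return (lower, upper) limits padded by `delta` times the data range (hypothetical helper)."""
    lo, hi = min(lin), max(lin)
    pad = delta * (hi - lo)
    lower, upper = lo - pad, hi + pad
    if min_null_if_neg and lower < 0:
        lower = 0.0   # clamp a negative lower limit to zero
    return lower, upper


limits = padded_limits([2.0, 4.0, 12.0])                          # range 10, pad 1
clamped = padded_limits([0.5, 10.5], min_null_if_neg=True)        # lower would be -0.5
```

The returned tuple can be unpacked directly into a matplotlib call such as `ax.set_ylim(*limits)`.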