geodezyx.utils package

Submodules

geodezyx.utils.dict_utils module

@author: psakic

This sub-module of geodezyx.utils contains functions for manipulating Python dictionaries.

It can be imported directly with: from geodezyx import utils

The geodezyx toolbox is a software package providing simple but useful functions for Geodesy and Geophysics, under the GNU LGPL v3 License.

Copyright (C) 2019 Pierre Sakic et al. (IPGP, sakic@ipgp.fr)

GitHub repository: https://github.com/IPGP/geodezyx

geodezyx.utils.dict_utils.dic_key_for_vals_list_finder(dic_in, value_in)

Find the key in a dictionary of lists that contains a given value.

Parameters:
  • dic_in (dict) –

    Dictionary with lists as values, e.g.:

    dic_in[key1] = [val1a, val1b]
    dic_in[key2] = [val2a, val2b, val2c]

  • value_in (object) – Value to search for in the lists.

Returns:

The key associated with the list containing value_in, or None if not found.

Return type:

key or None

Warning

This function returns the first key found. The input dictionary should be injective (no duplicate values across lists) for predictable behavior.

Notes

Example: if value_in = val2b, the function returns key2.

Uses log.warning() to report when no key is found for the given value.
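
The lookup described above can be sketched as follows. This is a minimal illustration, not the geodezyx implementation: the helper name key_for_value_sketch is hypothetical, and the logging done by the real function when nothing is found is left out.

```python
def key_for_value_sketch(dic_in, value_in):
    """Return the first key whose list contains value_in, else None (hypothetical helper)."""
    for key, values in dic_in.items():
        if value_in in values:
            return key
    return None


dic_in = {"key1": ["val1a", "val1b"], "key2": ["val2a", "val2b", "val2c"]}
print(key_for_value_sketch(dic_in, "val2b"))   # key2
print(key_for_value_sketch(dic_in, "absent"))  # None
```

If the same value appeared in several lists, the sketch would return whichever key comes first in iteration order, which is why an injective dictionary is recommended.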

geodezyx.utils.dict_utils.dicts_merge(*dict_args)

Merge multiple dictionaries into a single dictionary.

Performs a shallow copy and merge of any number of dictionaries with precedence going to key-value pairs in later dictionaries.

Parameters:

*dict_args (dict) – Variable number of dictionaries to merge.

Returns:

A new merged dictionary.

Return type:

dict

Warning

If the same key is present in several dictionaries, the value from the later dictionary overrides the value from the earlier one.

Notes

See https://stackoverflow.com/questions/38987/how-can-i-merge-two-python-dictionaries-in-a-single-expression
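
A minimal sketch of the merge semantics described above, assuming plain dict inputs. The helper name dicts_merge_sketch is hypothetical and only illustrates the precedence rule; it is not the geodezyx code.

```python
def dicts_merge_sketch(*dict_args):
    """Shallow-merge dictionaries; later ones win on key conflicts (hypothetical helper)."""
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result


a = {"x": 1, "y": 2}
b = {"y": 20, "z": 30}
merged = dicts_merge_sketch(a, b)
print(merged)  # {'x': 1, 'y': 20, 'z': 30}
print(a)       # {'x': 1, 'y': 2}  (inputs are left untouched)
```

Because the copy is shallow, mutable values such as lists are shared between the inputs and the merged result.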

geodezyx.utils.dict_utils.dicts_of_list_merge(*dict_args)

Merge multiple dictionaries of lists into a single dictionary.

Parameters:

*dict_args (dict) – Variable number of dictionaries with lists as values.

Returns:

Merged dictionary where lists from all input dictionaries are combined.

Return type:

dict

See also

dicts_of_list_merge_mono

Merge two dictionaries of lists.

geodezyx.utils.dict_utils.dicts_of_list_merge_mono(dol1, dol2)

Merge two dictionaries of lists by combining list values for common keys.

Parameters:
  • dol1 (dict) – First dictionary with lists as values.

  • dol2 (dict) – Second dictionary with lists as values.

Returns:

Merged dictionary where lists from both input dictionaries are combined.

Return type:

dict

Notes

See https://stackoverflow.com/questions/1495510/combining-dictionaries-of-lists-in-python
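
The combination of two dictionaries of lists can be sketched as below. This is an illustration under assumptions: the helper name is hypothetical, lists are simply concatenated, and duplicate elements are kept. Whether geodezyx removes duplicates is not stated in this documentation.

```python
def merge_dicts_of_lists_sketch(dol1, dol2):
    """Concatenate the lists stored under each key of two dicts (hypothetical helper)."""
    merged = {}
    # Keep the keys of dol1 first, then the keys that only exist in dol2.
    for key in list(dol1) + [k for k in dol2 if k not in dol1]:
        merged[key] = dol1.get(key, []) + dol2.get(key, [])
    return merged


dol1 = {"a": [1, 2], "b": [3]}
dol2 = {"b": [4, 5], "c": [6]}
print(merge_dicts_of_lists_sketch(dol1, dol2))
# {'a': [1, 2], 'b': [3, 4, 5], 'c': [6]}
```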

geodezyx.utils.list_utils module

@author: psakic

This sub-module of geodezyx.utils contains functions for manipulating Python lists.

It can be imported directly with: from geodezyx import utils

geodezyx.utils.list_utils.chunkIt(seq, num)

Divide a list into num sublists of approximately equal size.

Parameters:
  • seq (list or array-like) – Input sequence to divide.

  • num (int) – Desired number of sublists.

Returns:

List of sublists, each roughly equal in size.

Return type:

list

Notes

The sublists may vary in size by 1 element if the sequence length is not evenly divisible by num.

Source: http://stackoverflow.com/questions/2130016/splitting-a-list-of-arbitrary-size-into-only-roughly-n-equal-parts
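
One way to obtain sublists whose sizes differ by at most one element is integer division with a remainder. The sketch below uses that technique; it is not necessarily the algorithm used by chunkIt, which may distribute the extra elements differently. The helper name chunk_sketch is hypothetical.

```python
def chunk_sketch(seq, num):
    """Split seq into num contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(len(seq), num)
    chunks = []
    start = 0
    for i in range(num):
        # The first `extra` chunks receive one additional element.
        size = base + (1 if i < extra else 0)
        chunks.append(seq[start:start + size])
        start += size
    return chunks


print(chunk_sketch(list(range(10)), 3))  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```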

geodezyx.utils.list_utils.consecutive_groupIt(data, only_start_end=False)

Identify groups of continuous numbers in a list.

Parameters:
  • data (list or array-like) – Input sequence of numbers.

  • only_start_end (bool, optional) – If True, return only (start, end) tuples for each group. If False, return full lists of elements in each group. Default is False.

Returns:

List of groups. Each group is either a list of consecutive elements (if only_start_end=False) or a tuple (start, end) (if only_start_end=True).

Return type:

list

Notes

Useful for grouping time periods, after a prior conversion to MJD (Modified Julian Date).

Source : https://stackoverflow.com/questions/2154249/identify-groups-of-continuous-numbers-in-a-list
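
The grouping can be sketched with itertools.groupby, assuming integer inputs that increase in steps of 1 within a group. The helper name is hypothetical and the sketch only mirrors the documented behaviour of the only_start_end option.

```python
from itertools import groupby


def consecutive_groups_sketch(data, only_start_end=False):
    """Group runs of consecutive integers (hypothetical helper)."""
    groups = []
    # Within a run of consecutive integers, (value - position) is constant.
    for _, run in groupby(enumerate(data), key=lambda pair: pair[1] - pair[0]):
        values = [value for _, value in run]
        groups.append((values[0], values[-1]) if only_start_end else values)
    return groups


data = [1, 2, 3, 5, 6, 8]
print(consecutive_groups_sketch(data))        # [[1, 2, 3], [5, 6], [8]]
print(consecutive_groups_sketch(data, True))  # [(1, 3), (5, 6), (8, 8)]
```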

geodezyx.utils.list_utils.decimateIt(listinp, n)

Decimate a list by selecting every n-th element.

Parameters:
  • listinp (list or array-like) – Input sequence to decimate.

  • n (int) – Decimation factor. Elements at indices where i % n == 0 are selected.

Returns:

Decimated list containing every n-th element.

Return type:

list

geodezyx.utils.list_utils.df_sel_val_in_col(df, col_name, col_val)

Select rows from a DataFrame where a column matches a specific value.

Parameters:
  • df (pandas.DataFrame) – Input DataFrame.

  • col_name (str) – Name of the column to filter by.

  • col_val (scalar) – Value to match in the specified column.

Returns:

Filtered DataFrame containing only rows where col_name == col_val.

Return type:

pandas.DataFrame

geodezyx.utils.list_utils.dicofdic(mat, names)

Create a 2D dictionary from a matrix and corresponding names.

Parameters:
  • mat (array-like) – N x N matrix of values.

  • names (list) – List of N names to use as keys for both dimensions.

Returns:

A nested dictionary where dic[name1][name2] = mat[i, j] where i and j are the indices corresponding to name1 and name2.

Return type:

dict

Notes

Source: http://stackoverflow.com/questions/13326042/2d-dictionary-with-multiple-keys-per-value
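
A minimal sketch of the nested-dictionary construction, assuming the matrix is given as a list of lists (so elements are read with mat[i][j] rather than mat[i, j]). The helper name dicofdic_sketch is hypothetical.

```python
def dicofdic_sketch(mat, names):
    """Build dic[name_i][name_j] = mat[i][j] from a square matrix (hypothetical helper)."""
    return {
        row_name: {col_name: mat[i][j] for j, col_name in enumerate(names)}
        for i, row_name in enumerate(names)
    }


mat = [[0, 5], [7, 0]]
dic = dicofdic_sketch(mat, ["A", "B"])
print(dic["A"]["B"])  # 5
print(dic["B"]["A"])  # 7
```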

geodezyx.utils.list_utils.duplicates_finder(seq)

Find all duplicate elements in a sequence.

Parameters:

seq (iterable) – Input sequence to search for duplicates.

Returns:

List of elements that appear more than once in the input sequence.

Return type:

list

Notes

Source: http://stackoverflow.com/questions/9835762/find-and-list-duplicates-in-python-list
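
Duplicate detection can be sketched with collections.Counter. This is an illustration only: the helper name is hypothetical, each duplicated element is reported once, and the output follows first-appearance order, which the geodezyx function does not necessarily guarantee.

```python
from collections import Counter


def duplicates_sketch(seq):
    """Return the elements that occur more than once in seq (hypothetical helper)."""
    counts = Counter(seq)
    return [element for element, count in counts.items() if count > 1]


print(duplicates_sketch([1, 2, 2, 3, 3, 3, 4]))  # [2, 3]
```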

geodezyx.utils.list_utils.find_common_elts(*lists)

Find common elements across multiple lists.

Parameters:

*lists (list) – Variable number of input lists.

Returns:

Sorted array of elements common to all input lists.

Return type:

numpy.ndarray

geodezyx.utils.list_utils.find_index_multi_occurences(l, elt)

Find all indices where an element occurs in a list.

Parameters:
  • l (list) – Input list to search.

  • elt (scalar) – Element to find.

Returns:

List of indices where elt appears in l.

Return type:

list

geodezyx.utils.list_utils.find_interval_bound(listin, val, outindexes=True)

Find the bounding values/indices of an interval around a target value.

Parameters:
  • listin (list or array-like) – Input list/array (assumed to be sorted).

  • val (scalar) – Target value to find bounds for.

  • outindexes (bool, optional) – If True, return indices of bounds. If False, return the bounding values. Default is True.

Returns:

If outindexes is True, returns (lower_index, upper_index). If outindexes is False, returns (lower_value, upper_value).

Return type:

tuple

geodezyx.utils.list_utils.find_nearest(listin, value)

Find the nearest value in a list to a target value.

Parameters:
  • listin (list or array-like) – Input list/array to search.

  • value (scalar) – Target value to find nearest element to.

Returns:

(nearest_value, index_of_nearest) where nearest_value is the element in listin closest to value, and index_of_nearest is its index.

Return type:

tuple
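
The nearest-value search can be sketched in pure Python as follows. The helper name is hypothetical, and the tie-breaking rule (the first closest element wins) is an assumption of this sketch, not documented behaviour.

```python
def find_nearest_sketch(listin, value):
    """Return (nearest_value, index_of_nearest) for a target value (hypothetical helper)."""
    index = min(range(len(listin)), key=lambda i: abs(listin[i] - value))
    return listin[index], index


print(find_nearest_sketch([10.0, 12.5, 15.0, 20.0], 13.0))  # (12.5, 1)
```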

geodezyx.utils.list_utils.find_regex_in_list(regex, L, only_first_occurence=False, line_number=False)

Find elements in a list matching a regular expression pattern.

Parameters:
  • regex (str) – Regular expression pattern to search for.

  • L (list) – List of strings to search.

  • only_first_occurence (bool, optional) – If True, return only the first match. If False, return all matches. Default is False.

  • line_number (bool, optional) – If True, return tuples of (index, element). If False, return just elements. Default is False.

Returns:

If only_first_occurence=True, returns a single match (element or tuple). If only_first_occurence=False, returns a list of matches. Format depends on line_number parameter.

Return type:

list or scalar
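
The interplay of the two options can be sketched as below, using re.search on each element. The helper name is hypothetical, and returning None when only_first_occurence is set but nothing matches is an assumption of this sketch.

```python
import re


def find_regex_in_list_sketch(regex, lines, only_first_occurence=False, line_number=False):
    """Collect the elements of lines that match regex (hypothetical helper)."""
    matches = []
    for index, element in enumerate(lines):
        if re.search(regex, element):
            matches.append((index, element) if line_number else element)
    if only_first_occurence:
        return matches[0] if matches else None
    return matches


lines = ["site ABCD", "epoch 2019", "site EFGH"]
print(find_regex_in_list_sketch(r"^site", lines))
# ['site ABCD', 'site EFGH']
print(find_regex_in_list_sketch(r"^site", lines, only_first_occurence=True, line_number=True))
# (0, 'site ABCD')
```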

geodezyx.utils.list_utils.find_surrounding(L, v)

Find the two nearest values surrounding a target value.

Parameters:
  • L (iterable) – Input list/array to search.

  • v (scalar) – Target value to find surrounding values for.

Returns:

(surrounding_values, surrounding_indices) where:

  • surrounding_values is a tuple of the two nearest values

  • surrounding_indices is a tuple of their indices in L

Return type:

tuple

geodezyx.utils.list_utils.get_interval(start, end, delta)

Generate a list of values at regular intervals between start and end.

Parameters:
  • start (numeric) – Starting value (inclusive).

  • end (numeric) – Ending value (exclusive).

  • delta (numeric) – Step size between consecutive values.

Returns:

List of values from start to end with step delta.

Return type:

list

Notes

Source: http://stackoverflow.com/questions/10688006/generate-a-list-of-datetimes-between-an-interval-in-python

geodezyx.utils.list_utils.groups_near_central_values(a, tol, b=None)

Group elements of an array by proximity to unique central values.

Parameters:
  • a (array-like) – Input array to group.

  • tol (float) – Absolute tolerance for grouping elements near central values.

  • b (array-like, optional) – Auxiliary array corresponding to elements in a. Default is None.

Returns:

If b is None, returns a list of lists where each sublist contains elements from a grouped around a central value. If b is provided, returns a tuple (groups_a, groups_b) containing grouped elements from both arrays.

Return type:

list or tuple

Notes

This function is in beta status and may have bugs if tolerance is poorly chosen.

geodezyx.utils.list_utils.identical_consecutive_eltsIt(linp)

Group consecutive identical elements together.

Parameters:

linp (list or iterable) – Input sequence with potentially repeated consecutive elements.

Returns:

List of lists, where each inner list contains consecutive identical elements.

Return type:

list

geodezyx.utils.list_utils.identical_groupIt(data)

Group consecutive identical elements together.

Parameters:

data (list or iterable) – Input sequence with potentially repeated consecutive elements.

Returns:

List of lists, where each inner list contains consecutive identical elements.

Return type:

list

Notes

Source : https://stackoverflow.com/questions/30293071/python-find-same-values-in-a-list-and-group-together-a-new-list

geodezyx.utils.list_utils.is_listoflist(inp)

Check if inp is a list of lists.

Parameters:

inp (iterable) – Input object to check.

Returns:

True if inp contains at least one list or numpy array element, False otherwise.

Return type:

bool

Examples

>>> is_listoflist([[1, 2], [3, 4]])
True
>>> is_listoflist([1, 2, 3])
False

geodezyx.utils.list_utils.median_improved(l)

Calculate the median of a list, handling even-length lists differently.

Parameters:

l (list or array-like) – Input sequence.

Returns:

The median value. For even-length lists, returns the nearest value in the list to the actual median instead of interpolating.

Return type:

scalar

Notes

For even-length lists, does not return the mean of the two middle values but instead returns the nearest value from the input list.

geodezyx.utils.list_utils.middle(linp)

Calculate the midpoints between consecutive elements of a list.

Parameters:

linp (list or array-like) – Input sequence with at least 2 elements.

Returns:

List of midpoint values between consecutive elements.

Return type:

list
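
A minimal sketch of the midpoint computation, assuming a sliceable sequence of numbers. The helper name middle_sketch is hypothetical.

```python
def middle_sketch(linp):
    """Return the midpoints between consecutive elements (hypothetical helper)."""
    return [(a + b) / 2 for a, b in zip(linp, linp[1:])]


print(middle_sketch([0, 2, 6, 7]))  # [1.0, 4.0, 6.5]
```

The result always has one element fewer than the input.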

geodezyx.utils.list_utils.minmax(l)

Find the minimum and maximum values in a list.

Parameters:

l (list or array-like) – Input sequence.

Returns:

(min_value, max_value) of the input sequence.

Return type:

tuple

geodezyx.utils.list_utils.most_common(lst)

Find the most frequently occurring element in a list.

Parameters:

lst (list or iterable) – Input sequence to analyze.

Returns:

The element with the highest frequency in the list.

Return type:

scalar

Notes

Source: http://stackoverflow.com/questions/1518522/python-most-common-element-in-a-list

geodezyx.utils.list_utils.occurence(l, tolerence=None, pretty_output=False)

Count occurrences of elements in a list.

Parameters:
  • l (list) – Input list

  • tolerence (float, optional) – Tolerance used to find close elements of l. If no tolerance is given, a set() is used.

  • pretty_output (bool) –

    if False, return a list of 2-tuples:

    (element of the list, number of occurrence of this element in the list)

    if True, return tuple with sorted occurrences and values

Returns:

output – See pretty_output parameter

Return type:

list or tuple

Notes

pretty_output was implemented because the first mode is not really useful (180612). The equality test is also replaced by an “is close” test.

geodezyx.utils.list_utils.second_smallest(numbers)

Find the second smallest value in a sequence.

Parameters:

numbers (iterable) – Input sequence to analyze.

Returns:

The second smallest element in the sequence.

Return type:

scalar

Notes

Returns infinity if there are fewer than 2 elements.
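
The infinity fallback follows naturally from a single-pass search that starts with both candidates set to infinity. The sketch below illustrates this; the helper name is hypothetical, and treating repeated minima as distinct entries (so [1, 1] gives 1) is an assumption.

```python
def second_smallest_sketch(numbers):
    """Return the second smallest element, or inf if there are fewer than 2 (hypothetical helper)."""
    smallest = second = float("inf")
    for x in numbers:
        if x <= smallest:
            # x becomes the new minimum; the old minimum becomes the runner-up.
            smallest, second = x, smallest
        elif x < second:
            second = x
    return second


print(second_smallest_sketch([4, 1, 3, 2]))  # 2
print(second_smallest_sketch([7]))           # inf
```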

geodezyx.utils.list_utils.shrink_listoflist(lin)

Shrink a list of lists if it contains only one sublist.

If lin is a list of lists and contains only one element, returns the inner sublist, e.g. [[a, b, c]] => [a, b, c].

Parameters:

lin (list) – Input list, potentially a list of lists.

Returns:

The single sublist if lin is a one-element list of lists, otherwise lin unchanged.

Return type:

list

Examples

>>> shrink_listoflist([[1, 2, 3]])
[1, 2, 3]
>>> shrink_listoflist([[1, 2], [3, 4]])
[[1, 2], [3, 4]]

geodezyx.utils.list_utils.sliceIt(seq, num)

Divide a list into sublists of fixed size.

Parameters:
  • seq (list or array-like) – Input sequence to divide.

  • num (int) – Size of each sublist.

Returns:

List of sublists, each containing num elements (last sublist may be shorter).

Return type:

list

Notes

Source: http://stackoverflow.com/questions/4501636/creating-sublists
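
Fixed-size slicing can be sketched with a stepped range. Unlike chunkIt, where num is the number of sublists, here num is the size of each sublist. The helper name slice_sketch is hypothetical.

```python
def slice_sketch(seq, num):
    """Cut seq into consecutive pieces of num elements (hypothetical helper)."""
    return [seq[i:i + num] for i in range(0, len(seq), num)]


print(slice_sketch([1, 2, 3, 4, 5, 6, 7], 3))  # [[1, 2, 3], [4, 5, 6], [7]]
```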

geodezyx.utils.list_utils.sort_basename(file_paths)

Sort a list of file paths by their basenames.

Parameters:

file_paths (list) – List of file paths to be sorted.

Returns:

Sorted list of file paths by their basenames.

Return type:

list

geodezyx.utils.list_utils.sort_binom_list(x, y, array_out=False)

Sort y according to x, and sort x itself.

Parameters:
  • x (list or array-like) – Reference values to sort by.

  • y (list or array-like) – Values to sort according to the ordering of x.

  • array_out (bool, optional) – If True, return numpy arrays. If False, return lists. Default is False.

Returns:

(xnew, ynew) where both are sorted according to x. Type depends on array_out parameter.

Return type:

tuple

Raises:

Warning – If len(x) != len(y), a warning is logged.
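
Sorting one sequence according to another can be sketched by computing the sorting permutation of the reference values once and applying it to both sequences. The helper name is hypothetical, and the sketch always returns lists (the array_out option is not reproduced).

```python
def sort_pair_sketch(x, y):
    """Sort x, and reorder y with the same permutation (hypothetical helper)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    return [x[i] for i in order], [y[i] for i in order]


xnew, ynew = sort_pair_sketch([3, 1, 2], ["c", "a", "b"])
print(xnew, ynew)  # [1, 2, 3] ['a', 'b', 'c']
```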

geodezyx.utils.list_utils.sort_multinom_list(x, *y)

Sort multiple y sequences according to x, and sort x itself.

Parameters:
  • x (list or array-like) – Reference values to sort by.

  • *y (list or array-like) – Variable number of sequences to sort according to the ordering of x.

Returns:

(xnew, ynew_1, ynew_2, …) where all sequences are sorted according to x. x is returned as a numpy array, while the y sequences are returned as lists.

Return type:

tuple

geodezyx.utils.list_utils.sort_table(table, col)

Sort a table by a given column.

Parameters:
  • table (list of lists or tuple of tuples) – Table to sort, where each inner list (or tuple) represents a row.

  • col (int) – Column number to sort by.

Returns:

outtable – sorted table

Return type:

list

geodezyx.utils.list_utils.sublistsIt(seq, lenofsublis_lis, output_array=False)

Divide a sequence into sublists of specified sizes.

Parameters:
  • seq (list or array-like) – Input sequence to divide.

  • lenofsublis_lis (list) – List of integers specifying the size of each sublist. Example: [2, 3, 4, 2] creates 4 sublists of sizes 2, 3, 4, and 2.

  • output_array (bool, optional) – If True, return list of numpy arrays. If False, return list of lists. Default is False.

Returns:

List of sublists (or arrays if output_array=True).

Return type:

list

Raises:

Exception – If sum(lenofsublis_lis) != len(seq).
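
The splitting logic and the length check can be sketched with a running offset. The helper name is hypothetical; the sketch raises ValueError (a subclass of Exception) and does not reproduce the output_array option.

```python
def sublists_sketch(seq, lengths):
    """Split seq into consecutive sublists of the given sizes (hypothetical helper)."""
    if sum(lengths) != len(seq):
        raise ValueError("sum of sublist sizes must equal len(seq)")
    sublists, start = [], 0
    for length in lengths:
        sublists.append(seq[start:start + length])
        start += length
    return sublists


parts = sublists_sketch(list("abcdefghijk"), [2, 3, 4, 2])
print(parts)
# [['a', 'b'], ['c', 'd', 'e'], ['f', 'g', 'h', 'i'], ['j', 'k']]
```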

geodezyx.utils.list_utils.trio_lists_2_tab(xlis, ylis, vlis)

Convert three lists into a 2D table structure.

Parameters:
  • xlis (list) – List of X values that will become column headers.

  • ylis (list) – List of Y values that will become row headers.

  • vlis (list) – List of data values corresponding to (X, Y) pairs.

Returns:

A 2D table structure where the first row contains X values, and subsequent rows contain Y value and corresponding V values. Compatible with the tabulate module.

Return type:

list

Notes

The data lookup is performed with a brute-force approach (nested loops), which is not optimized for large datasets.

Examples

>>> trio_lists_2_tab([1, 2, 1, 2], [1, 1, 2, 2], [10, 20, 30, 40])
[[1, 2], [1, 10, 20], [2, 30, 40]]

geodezyx.utils.list_utils.uniq_and_sort(l, natural_sort=True)

Remove duplicates from a list and sort it.

Parameters:
  • l (list) – Input list to deduplicate and sort.

  • natural_sort (bool, optional) – If True, use natural sorting (default). If False, use standard sorting.

Returns:

Sorted list with duplicate elements removed.

Return type:

list

geodezyx.utils.list_utils.uniq_set_list(setlis, frozen=True)

Remove duplicate sets from a list of sets.

Parameters:
  • setlis (list of sets) – Input list containing sets or set-like iterables.

  • frozen (bool, optional) – If True, returns frozensets (hashable and immutable). Default is True.

Returns:

List of unique sets (either sets or frozensets depending on frozen parameter).

Return type:

list

geodezyx.utils.list_utils.uniqify_list(seq, idfun=None)

Remove duplicate elements from a sequence while preserving order.

Parameters:
  • seq (iterable) – Input sequence to deduplicate.

  • idfun (callable, optional) – Function to extract the identifier for uniqueness comparison. If None, the elements themselves are used as identifiers.

Returns:

Deduplicated sequence with order preserved.

Return type:

list

Notes

Based on: https://www.peterbe.com/plog/uniqifiers-benchmark
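
Order-preserving deduplication with an optional identifier function can be sketched as follows. The helper name is hypothetical, and the sketch assumes the identifiers are hashable.

```python
def uniqify_sketch(seq, idfun=None):
    """Drop repeated elements, keeping the first occurrence of each (hypothetical helper)."""
    seen = set()
    result = []
    for item in seq:
        # Without idfun, the element itself is the identifier.
        marker = item if idfun is None else idfun(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result


print(uniqify_sketch([3, 1, 3, 2, 1]))                        # [3, 1, 2]
print(uniqify_sketch(["a", "A", "b", "B"], idfun=str.lower))  # ['a', 'b']
```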

geodezyx.utils.list_utils.uniqify_list_of_lists(l)

Remove duplicate sublists from a list of lists.

Parameters:

l (list of lists) – Input list containing sublists.

Returns:

List of unique sublists.

Return type:

list

Notes

Source: http://stackoverflow.com/questions/3724551/python-uniqueness-for-list-of-lists

geodezyx.utils.list_utils.uniquetol(a, tol)

Find unique elements in an array within a tolerance threshold.

Parameters:
  • a (array-like) – Input array.

  • tol (float) – Absolute tolerance for uniqueness comparison.

Returns:

Array of unique elements within the specified tolerance.

Return type:

numpy.ndarray

Notes

Source: http://stackoverflow.com/questions/37847053/uniquify-an-array-list-with-a-tolerance-in-python-uniquetol-equivalent

geodezyx.utils.list_utils.uniquetol2(a, tol=1e-06)

Find unique elements in an array within a tolerance threshold (optimized version).

Parameters:
  • a (array-like) – Input array.

  • tol (float, optional) – Tolerance for rounding before finding unique elements. Default is 10**-6.

Returns:

Array of unique elements.

Return type:

numpy.ndarray

Notes

This is a faster alternative to uniquetol. Source: https://stackoverflow.com/questions/5426908/find-unique-elements-of-floating-point-array-in-numpy-with-comparison-using-a-d

geodezyx.utils.shell_like module

@author: psakic

This sub-module of geodezyx.utils contains functions for shell-like operations in Python.

It can be imported directly with: from geodezyx import utils

geodezyx.utils.shell_like.cat(outfilename, *infilenames)

Concatenate files.

Parameters:
  • outfilename (str) – Output filename

  • infilenames (str) – Input filenames to concatenate

Returns:

outfilename – The output filename

Return type:

str

Notes

To simply print the contents of a file, use cat_print instead.

References

http://stackoverflow.com/questions/11532980/reproduce-the-unix-cat-command-in-python (answer by kindall)

geodezyx.utils.shell_like.cat_print(inpfile)

Print file contents to stdout, line by line.

Parameters:

inpfile (str) – Path to the file to print.

Return type:

None

geodezyx.utils.shell_like.cat_remove_header(infilepath, outfilepath, header='', header_included=False)

Concatenate file content starting from a specified header line.

Parameters:
  • infilepath (str) – Path to the input file.

  • outfilepath (str) – Path to the output file.

  • header (str, optional) – The header pattern to search for. Default is empty string.

  • header_included (bool, optional) – If True, include the header line in the output. Default is False.

Returns:

Path to the output file.

Return type:

str

geodezyx.utils.shell_like.check_regex(filein, regex)

Check if a file contains a regex pattern.

Parameters:
  • filein (str) – Path to the file to search.

  • regex (str) – Regular expression pattern to search for.

Returns:

True if the pattern is found in the file, False otherwise.

Return type:

bool

geodezyx.utils.shell_like.copy_recursive(src, dst, force=False)

Copy a directory recursively from src to dst with an option to force overwrite.

Parameters:
  • src (str) – The source directory path.

  • dst (str) – The destination directory path.

  • force (bool, optional) – If True, overwrite the destination directory if it exists. Default is False.

Return type:

None

geodezyx.utils.shell_like.create_dir(directory)

Create a directory if it does not already exist.

Parameters:

directory (str) – Path to the directory to create.

Returns:

The directory path.

Return type:

str

geodezyx.utils.shell_like.egrep_big_string(regex, bigstring, only_first_occur=False)

Perform a regex grep on a big string separated with newlines.

Deprecated: use grep() instead, which handles this functionality (as of 260121).

Parameters:
  • regex (str) – Regular expression pattern to search for.

  • bigstring (str) – The large string (separated by newlines) to search in.

  • only_first_occur (bool, optional) – If True, return only the first matching line. Default is False.

Returns:

Matching line(s) as str if single result, list if multiple results, or empty string if no matches.

Return type:

str or list

Notes

This function must be improved with regular pattern matching, without relying on regex.

geodezyx.utils.shell_like.empty_file_check(fpath)

Check if a file is empty or does not exist.

Parameters:

fpath (str) – The file path to check.

Returns:

True if the file is empty or does not exist, False otherwise.

Return type:

bool

See also

http://stackoverflow.com/questions/2507808/python-how-to-check-file-empty-or-not
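
The check can be sketched with os.path, treating a missing file and a zero-byte file the same way. The helper name is hypothetical and the temporary files exist only for the demonstration.

```python
import os
import tempfile


def empty_file_check_sketch(fpath):
    """Return True if fpath is missing or has zero size (hypothetical helper)."""
    return not (os.path.isfile(fpath) and os.path.getsize(fpath) > 0)


with tempfile.TemporaryDirectory() as tmpdir:
    empty_path = os.path.join(tmpdir, "empty.txt")
    open(empty_path, "w").close()
    filled_path = os.path.join(tmpdir, "filled.txt")
    with open(filled_path, "w") as f:
        f.write("data\n")
    results = (
        empty_file_check_sketch(empty_path),
        empty_file_check_sketch(filled_path),
        empty_file_check_sketch(os.path.join(tmpdir, "missing.txt")),
    )

print(results)  # (True, False, True)
```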

geodezyx.utils.shell_like.fileprint(output, outfile)

Log output to console and append to a file.

Parameters:
  • output (str) – The output message to log and write.

  • outfile (str) – Path to the output file.

Return type:

None

geodezyx.utils.shell_like.find_recursive(parent_folder, pattern, sort_results=True, case_sensitive=True, extended_file_stats=False, warn_if_empty=True, regex=False)

Recursively find files in a folder and its sub-folders.

Parameters:
  • parent_folder (str) – the parent folder path

  • pattern (str) – The file name pattern to search for. It is interpreted as a wildcard pattern (only * and ? are supported) if case_sensitive = True, or as a regex if case_sensitive = False.

  • sort_results (bool) – Sort results

  • case_sensitive (bool) – Case sensitive or not. If False, the pattern must be a regex. Deprecated since 2025-01; use regex instead.

  • extended_file_stats (bool) –

    If True, also return the stats of the files. The output matches list is then a list of tuples (file_path, stat_object), where stat_object has the following attributes:

    • st_mode - protection bits,

    • st_ino - inode number,

    • st_dev - device,

    • st_nlink - number of hard links,

    • st_uid - user id of owner,

    • st_gid - group id of owner,

    • st_size - size of file, in bytes,

    • st_atime - time of most recent access,

    • st_mtime - time of most recent content modification,

    • st_ctime - platform dependent; time of most recent metadata change on Unix, or the time of creation on Windows.

  • warn_if_empty (bool) – print a debug warning if no files are found

  • regex (bool) – If True, the pattern is a regular expression. Default is False.

Returns:

matches – Found files

Return type:

list

Notes

Source: https://stackoverflow.com/questions/2186525/use-a-glob-to-find-files-recursively-in-python

https://stackoverflow.com/questions/15652594/how-to-find-files-with-specific-case-insensitive-extension-names-in-python (for the case-insensitive case)
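
A recursive wildcard search can be sketched with os.walk and fnmatch. This is only an illustration of the general technique: the helper name is hypothetical, the regex, case_sensitive and extended_file_stats options are not reproduced, and fnmatch also accepts character classes beyond * and ?.

```python
import fnmatch
import os
import tempfile


def find_recursive_sketch(parent_folder, pattern, sort_results=True):
    """Walk parent_folder and collect the files matching a wildcard (hypothetical helper)."""
    matches = []
    for root, _dirs, files in os.walk(parent_folder):
        for name in fnmatch.filter(files, pattern):
            matches.append(os.path.join(root, name))
    return sorted(matches) if sort_results else matches


with tempfile.TemporaryDirectory() as tmpdir:
    os.makedirs(os.path.join(tmpdir, "sub"))
    for rel in ("a.rnx", "b.txt", os.path.join("sub", "c.rnx")):
        open(os.path.join(tmpdir, rel), "w").close()
    found = [os.path.relpath(p, tmpdir) for p in find_recursive_sketch(tmpdir, "*.rnx")]

print(found)  # the two .rnx files, relative to the temporary folder
```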

geodezyx.utils.shell_like.glob_smart(dir_path, file_pattern=None, verbose=True)

Find files in a directory using glob pattern with optional logging.

Parameters:
  • dir_path (str) – The directory path to search in.

  • file_pattern (str, optional) – File pattern to match. Default is None (search all files).

  • verbose (bool, optional) – If True, log warnings/info about search results. Default is True.

Returns:

List of file paths matching the pattern.

Return type:

list

geodezyx.utils.shell_like.grep(file_in, search_string, only_first_occur=False, invert=False, regex=False, line_number=False, col=(None, None), force_list_output=False)

Search for lines matching a pattern in a file.

Returns an empty string if nothing is found, not a singleton list with an empty string inside.

Parameters:
  • file_in (str or file-like object) – Path to the file or a file-like object to search.

  • search_string (str or list) – String(s) to search for.

  • only_first_occur (bool, optional) – Return only the first occurrence. Default is False.

  • invert (bool, optional) – If True, return lines that do NOT match the search string. Default is False.

  • regex (bool, optional) – If True, treat search_string as a regular expression. Default is False.

  • line_number (bool, optional) – If True, also return the line numbers. Default is False.

  • col (tuple of int, optional) – Column range (start, end) where the grep is executed. Use None for unbounded indices. Default is (None, None).

  • force_list_output (bool, optional) – If True, always return a list even for single elements. Default is False.

Returns:

  • If line_number is True and single result: (line_number, line)

  • If line_number is True and multiple results: (line_numbers_list, lines_list)

  • Otherwise returns matching line(s) as str or list, or empty string if no matches.

Return type:

str, list, or tuple

Notes

  • If nothing is found returns an empty string (not a singleton list)

  • search_string can be a list of patterns to match any
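
The return convention (empty string, single line, or list) can be sketched on a file-like object as follows. The helper name is hypothetical; only plain substring matching is shown, and stripping the trailing newline is an assumption of this sketch.

```python
import io


def grep_sketch(file_obj, search_string, only_first_occur=False):
    """Return '' for no match, a str for one match, a list otherwise (hypothetical helper)."""
    matches = [line.rstrip("\n") for line in file_obj if search_string in line]
    if not matches:
        return ""
    if only_first_occur or len(matches) == 1:
        return matches[0]
    return matches


text = "alpha 1\nbeta 2\nalpha 3\n"
print(repr(grep_sketch(io.StringIO(text), "gamma")))  # ''
print(grep_sketch(io.StringIO(text), "beta"))         # beta 2
print(grep_sketch(io.StringIO(text), "alpha"))        # ['alpha 1', 'alpha 3']
```

Callers that always need a list have to handle the scalar cases themselves, which is what the force_list_output option is documented to avoid.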

geodezyx.utils.shell_like.grep_boolean(file_in, search_string)

Check if a string exists in a file.

Parameters:
  • file_in (str) – Path to the file to search.

  • search_string (str) – String to search for.

Returns:

True if search_string is found in the file, False otherwise.

Return type:

bool

geodezyx.utils.shell_like.gzip_compress(inp_path, out_dir=None, out_fname=None, rm_inp=False)

Compress a file using gzip.

Parameters:
  • inp_path (str) – Path to the input file to compress.

  • out_dir (str, optional) – Output directory. Default is None (same as input file).

  • out_fname (str, optional) – Output filename. Default is None (input filename + “.gz”).

  • rm_inp (bool, optional) – If True, remove the input file after compression. Default is False.

Returns:

Path to the compressed output file.

Return type:

str

geodezyx.utils.shell_like.head(filename, count=1)

Get the first few lines of a file.

Parameters:
  • filename (str or file-like object) – Path to the file or a file-like object (StringIO, BytesIO, etc.)

  • count (int, optional) – Number of lines to return from the beginning of the file. Default is 1.

Returns:

List of lines from the beginning of the file.

Return type:

list
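
Reading only the first lines of a file-like object can be sketched with itertools.islice, which stops after count lines without loading the rest. The helper name is hypothetical, and keeping the trailing newline characters is an assumption of this sketch.

```python
import io
from itertools import islice


def head_sketch(file_obj, count=1):
    """Return the first count lines of an open file-like object (hypothetical helper)."""
    return list(islice(file_obj, count))


first_lines = head_sketch(io.StringIO("line 1\nline 2\nline 3\n"), count=2)
print(first_lines)  # ['line 1\n', 'line 2\n']
```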

geodezyx.utils.shell_like.insert_lines_in_file(file_path, text_values, lines_ids)

Insert text lines at specified positions in a file.

Parameters:
  • file_path (str) – Path to the file to modify.

  • text_values (str or list) – Text string(s) to insert.

  • lines_ids (int or list) – Line number(s) where text should be inserted.

Returns:

Path to the modified file.

Return type:

str

geodezyx.utils.shell_like.insert_str_in_file_if_line_contains(file_path, str_to_insert, line_pattern_tup, position=None, only_first_occur=False)

Insert a string at lines matching a pattern.

Parameters:
  • file_path (str) – Path to the file to modify.

  • str_to_insert (str) – String to insert before matching lines.

  • line_pattern_tup (tuple) – Tuple of patterns to search for.

  • position (int, optional) – Position for insertion (not implemented). Default is None.

  • only_first_occur (bool, optional) – If True, only process the first occurrence. Default is False.

Returns:

Path to the modified file.

Return type:

str

Notes

The position parameter is not currently implemented.

geodezyx.utils.shell_like.is_exe(fpath)

Check if a file is executable.

Parameters:

fpath (str) – File path.

Returns:

True if the file is executable, False otherwise.

Return type:

bool

geodezyx.utils.shell_like.regex2filelist(dossier, regex, outtype='file')

Get files in a directory matching a regex pattern.

Parameters:
  • dossier (str) – Path to the directory.

  • regex (str) – Regular expression pattern to match filenames.

  • outtype (str, optional) – Type of output. ‘file’ returns only files, other values return all matches. Default is ‘file’.

Returns:

Sorted list of file paths matching the pattern.

Return type:

list

geodezyx.utils.shell_like.regex_or_from_list(listin)

Create a regex OR pattern from a list of strings.

Parameters:

listin (list) – List of strings to convert to regex OR pattern.

Returns:

Regex pattern matching any of the input strings, e.g., “(pattern1|pattern2)”.

Return type:

str
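
The pattern construction can be sketched with a simple join. The helper name is hypothetical, and the sketch does not escape regex metacharacters in the input strings (re.escape would be needed for that); whether the geodezyx function escapes them is not stated here.

```python
import re


def regex_or_sketch(listin):
    """Build '(a|b|c)' from a list of strings (hypothetical helper)."""
    return "(" + "|".join(listin) + ")"


pattern = regex_or_sketch(["GPS", "GLONASS"])
print(pattern)                                        # (GPS|GLONASS)
print(bool(re.search(pattern, "GLONASS week 2000")))  # True
```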

geodezyx.utils.shell_like.remove_dir(directory)

Remove a directory and all its contents.

Parameters:

directory (str) – Path to the directory to remove.

Return type:

None

Warning

Logs a warning if the directory does not exist.

geodezyx.utils.shell_like.replace(file_path, pattern, subst)

Replace a string in a file with a substitute.

Parameters:
  • file_path (str) – path of the file.

  • pattern (str) – string to be replaced.

  • subst (str) – string which will be substituted.

Return type:

None

geodezyx.utils.shell_like.subprocess_frontend(cmd_in, save_log=False, log_dir=None, log_name_out='out.log', log_name_err='err.log', logname_timestamp=False)

Run a shell command and optionally save output to log files.

Parameters:
  • cmd_in (str) – Command to execute via shell.

  • save_log (bool, optional) – If True, save stdout and stderr to log files. Default is False.

  • log_dir (str, optional) – Directory where log files will be saved. Default is current working directory.

  • log_name_out (str, optional) – Name of the stdout log file. Default is “out.log”.

  • log_name_err (str, optional) – Name of the stderr log file. Default is “err.log”.

  • logname_timestamp (bool, optional) – If True, prepend timestamp to log filenames. Default is False.

Returns:

  • process1 (subprocess.CompletedProcess) – The subprocess return object.

  • process1_stdout (str) – Standard output from the command.

  • process1_stderr (str) – Standard error from the command.

Return type:

tuple
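The three returned values can be obtained with subprocess.run, as in the following sketch. The function name run_command_sketch and its simplified log handling are assumptions for illustration; the real function offers more options (log names, timestamps).

```python
import os
import subprocess
import tempfile

def run_command_sketch(cmd, log_dir=None):
    """Run cmd through the shell, capture its output, optionally log it."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    if log_dir is not None:
        with open(os.path.join(log_dir, "out.log"), "w") as f:
            f.write(proc.stdout)
        with open(os.path.join(log_dir, "err.log"), "w") as f:
            f.write(proc.stderr)
    return proc, proc.stdout, proc.stderr

log_dir = tempfile.mkdtemp()
proc, out, err = run_command_sketch("echo hello", log_dir=log_dir)
print(proc.returncode, out.strip())
```

Because the command goes through the shell, it should only be built from trusted input.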

geodezyx.utils.shell_like.tail(filename, count=1, offset=1024)

Get the last few lines of a file efficiently.

Depending on the length of your lines, you will want to modify offset to get better performance.

Parameters:
  • filename (str or file-like object) – Path to the file or a file-like object (StringIO, BytesIO, etc.)

  • count (int, optional) – Number of lines to return from the end of the file. Default is 1.

  • offset (int, optional) – Number of bytes to read from the end of the file. Default is 1024.

Returns:

List of lines from the end of the file.

Return type:

list
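The role of offset can be illustrated as follows: only the last offset bytes are read, and the chunk is enlarged when it does not contain enough lines. This is a sketch under assumptions (tail_sketch is a hypothetical name, it only accepts a path, and it returns lines without their line endings), not the library code.

```python
import os
import tempfile

def tail_sketch(path, count=1, offset=1024):
    """Return the last `count` lines of a file, reading from its end."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        while True:
            start = max(size - offset, 0)
            f.seek(start)
            lines = f.read().splitlines()
            # The first line of the chunk may be partial, hence ">" and not ">="
            if len(lines) > count or start == 0:
                return [ln.decode() for ln in lines[-count:]]
            offset *= 2  # not enough lines yet: read a larger chunk

# Usage on a temporary 100-line file
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.writelines("line %d\n" % i for i in range(100))
last = tail_sketch(path, count=3, offset=8)
os.remove(path)
```

A larger offset avoids repeated reads when lines are long; a smaller one limits the data read when lines are short.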

geodezyx.utils.shell_like.uncompress(pathin, dirout='', opts='-f')

Uncompress a file using the uncompress command.

Deprecated: use geodezyx.files_rw.unzip_gz_z() instead.

Parameters:
  • pathin (str) – Path to the file to uncompress.

  • dirout (str, optional) – Output directory. Default is ‘’ (current directory).

  • opts (str, optional) – Options for the uncompress command. Default is ‘-f’.

Returns:

Path to the uncompressed file, or None if input file does not exist.

Return type:

str or None

geodezyx.utils.shell_like.walk_dir(parent_dir)

Starting from a main parent_dir, return files_list and dirs_list, which contain all the files and all the directories found in parent_dir. Wildcards are supported in the parent_dir path.

Parameters:

parent_dir (str) – The parent directory path, which can include wildcards.

Returns:

  • files_list (list) – List of all file paths.

  • dirs_list (list) – List of all directory paths.
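A possible way to combine wildcard expansion with a recursive walk is sketched below, using glob and os.walk. The name walk_dir_sketch is hypothetical, and the ordering of the returned lists in the real function may differ.

```python
import glob
import os
import tempfile

def walk_dir_sketch(parent_dir):
    """Collect every file and directory below parent_dir (wildcards allowed)."""
    files_list, dirs_list = [], []
    for parent in glob.glob(parent_dir):  # expand wildcards first
        for root, dirs, files in os.walk(parent):
            files_list.extend(os.path.join(root, f) for f in files)
            dirs_list.extend(os.path.join(root, d) for d in dirs)
    return files_list, dirs_list

# Usage on a small temporary tree
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "run_a", "sub"))
open(os.path.join(base, "run_a", "sub", "data.txt"), "w").close()
files, dirs = walk_dir_sketch(os.path.join(base, "run_*"))
```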

geodezyx.utils.shell_like.write_in_file(string_to_write, outdir_or_outpath, outname='', ext='.txt', encoding='utf8', append=False)

Write a string to a file with support for bytes or text encoding.

Parameters:
  • string_to_write (str or bytes) – The content to write to the file.

  • outdir_or_outpath (str) – Output directory path or full file path.

  • outname (str, optional) – Output filename (without extension). Default is “”.

  • ext (str, optional) – File extension. Default is ‘.txt’.

  • encoding (str, optional) – Text encoding to use. Default is ‘utf8’. See https://docs.python.org/3/library/codecs.html#standard-encodings

  • append (bool, optional) – If True, append to existing file. If False, overwrite. Default is False.

Returns:

Path to the output file.

Return type:

str

Notes

Supported encodings: utf8, latin_1, etc. See https://docs.python.org/3/library/codecs.html#standard-encodings
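The distinction between text and bytes input can be sketched as follows. This is an illustration under assumptions (write_sketch is a hypothetical name, and only the directory plus filename form is handled), not the actual implementation.

```python
import os
import tempfile

def write_sketch(content, outdir, outname, ext=".txt", encoding="utf8", append=False):
    """Write str or bytes content to outdir/outname+ext and return the path."""
    outpath = os.path.join(outdir, outname + ext)
    if isinstance(content, bytes):
        mode, kwargs = ("ab" if append else "wb"), {}
    else:
        mode, kwargs = ("a" if append else "w"), {"encoding": encoding}
    with open(outpath, mode, **kwargs) as f:
        f.write(content)
    return outpath

# Usage: write text, then append bytes to the same file
outdir = tempfile.mkdtemp()
outpath = write_sketch("abc", outdir, "demo")
write_sketch(b"def", outdir, "demo", append=True)
with open(outpath, "rb") as f:
    raw = f.read()
```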

geodezyx.utils.utils_core module

@author: psakic

This sub-module of geodezyx.utils contains miscellaneous low-level functions.

It can be imported directly with: from geodezyx import utils

The geodezyx toolbox is a software for simple but useful functions for Geodesy and Geophysics under the GNU LGPL v3 License

Copyright (C) 2019 Pierre Sakic et al. (IPGP, sakic@ipgp.fr) GitHub repository : https://github.com/IPGP/geodezyx

geodezyx.utils.utils_core.Aformat(A, landscape=True)
class geodezyx.utils.utils_core.Tee(*files)

Bases: object

Internal class for Tee_frontend

Source

flush()
pause()
restart()
stop()
write(obj)
geodezyx.utils.utils_core.Tee_frontend(dir_in, logname_in, suffix='', ext='log', print_timestamp=True)

Write the console output to a file.

Parameters:
  • dir_in (str) – directory path.

  • logname_in (str) – logfile name.

  • suffix (str, optional) – An optional suffix. The default is ‘’.

  • ext (str, optional) – file extension. The default is ‘log’.

  • print_timestamp (bool, optional) – print a timestamp in the filename. The default is True.

Returns:

F_tee – Object controlling the output

Return type:

F_tee object

Note

It is recommended to stop the writing at the end of the script with F_tee.stop()
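The principle of duplicating console output can be sketched with a small class exposing write() and flush(). TeeSketch is a hypothetical, simplified stand-in for the Tee class above (no pause, restart or stop methods), shown here with an in-memory buffer instead of a log file.

```python
import io
import sys

class TeeSketch:
    """Duplicate everything written to it onto several streams."""

    def __init__(self, *files):
        self.files = files

    def write(self, obj):
        for f in self.files:
            f.write(obj)

    def flush(self):
        for f in self.files:
            f.flush()

# Usage: mirror print() output into an in-memory buffer
buffer = io.StringIO()
original_stdout = sys.stdout
sys.stdout = TeeSketch(original_stdout, buffer)
try:
    print("processing started")
finally:
    sys.stdout = original_stdout  # always restore the console
captured = buffer.getvalue()
```

Restoring sys.stdout in a finally clause plays the same role as calling F_tee.stop() at the end of a script.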

geodezyx.utils.utils_core.add_symbol_to_new_lines(s, symbol='·')

Adds a specified symbol to the beginning of each new line in a multi-line string.

Parameters:
  • s (str) – The input multi-line string.

  • symbol (str, optional) – The symbol to add to each new line. Default is ‘·’.

Returns:

The modified string with the symbol added to each new line.

Return type:

str
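A minimal sketch, assuming that "each new line" means every line following a line break, so that the first line is left untouched. The name add_symbol_sketch is hypothetical and the real function may treat the first line differently.

```python
def add_symbol_sketch(s, symbol="·"):
    """Prefix every line after a line break with symbol."""
    return s.replace("\n", "\n" + symbol)

text = add_symbol_sketch("first\nsecond\nthird")
print(text)
```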

geodezyx.utils.utils_core.alphabet(num=None)
geodezyx.utils.utils_core.alphabet_reverse(letter=None)
geodezyx.utils.utils_core.array_from_lists(*listsin)

Function to stop struggling with list-to-matrix conversions.

geodezyx.utils.utils_core.boolean_dict(list_of_keywords)
geodezyx.utils.utils_core.clear_all()

Clears all the variables from the workspace of the spyder application.

geodezyx.utils.utils_core.dday()

Give the time span between present and toolbox author’s PhD defense date

(tests also the console messages)

Returns:

D – elapsed time.

Return type:

datetime

geodezyx.utils.utils_core.detect_encoding(file_path)

Detect the encoding of a text file.

This function uses the chardet library to detect the encoding of a given text file. It reads the file line by line and feeds each line to a chardet UniversalDetector. When the detector has made a determination, it stops reading the file and returns the detected encoding.

Parameters:

file_path (str) – The path to the text file for which to detect the encoding.

Returns:

The detected encoding of the text file.

Return type:

str

Source

https://www.geeksforgeeks.org/detect-encoding-of-a-text-file-with-python/

geodezyx.utils.utils_core.diagonalize(x, n=10)
geodezyx.utils.utils_core.docstring_generic()

Prints and returns a generic prototype docstring, based on the NumPy docstring convention.

Source

https://numpydoc.readthedocs.io/en/latest/format.html

geodezyx.utils.utils_core.eval_a_dict(dictin, where, verbose=True)

Evaluate dictionary values in a given namespace.

Parameters:
  • dictin (dict) – Dictionary to evaluate

  • where (dict) – Namespace where to evaluate (usually globals() or locals())

  • verbose (bool) – Print verbose output (default True)

Return type:

None

Notes

WARNING: doesn’t work in a function! Use instead:

for k, v in booldic.items():
    globals()[k] = v
    locals()[k] = v
geodezyx.utils.utils_core.extract_text_between_elements(file_path, elt_start, elt_end)
Source:

https://stackoverflow.com/questions/9222106/how-to-extract-information-between-two-unique-words-in-a-large-text-file

geodezyx.utils.utils_core.extract_text_between_elements_2(file_path, elt_start, elt_end, return_string=False, nth_occur_elt_start=0, nth_occur_elt_end=0, invert=False, verbose=False)

This function is based on regular expressions (elt_start and elt_end are regex patterns) and can handle several blocks in the same file.

  • return_string = True: returns a string of the matched lines.

  • return_string = False: returns a list of the matched lines.

  • invert: excludes the text between the patterns.

NB: in a SINEX context, with “+MARKER”, escape the plus sign with a backslash, i.e. “\+MARKER”.

NB2: consider StringIO for handling the output as a Pandas DataFrame: https://docs.python.org/2/library/stringio.html
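The block extraction logic can be sketched as a small state machine over the lines. This is an assumption-based illustration: extract_blocks_sketch is a hypothetical name, it works on a list of lines rather than a file path, and it assumes the delimiter lines belong to the block, which may not match the library. It also shows why the "+" of a SINEX marker must be escaped in the regex.

```python
import re

def extract_blocks_sketch(lines, elt_start, elt_end, invert=False):
    """Keep the lines between regex delimiters (or outside them if invert)."""
    inside = False
    out = []
    for line in lines:
        in_block = inside
        if not inside and re.search(elt_start, line):
            inside = in_block = True
        elif inside and re.search(elt_end, line):
            inside = False  # the end line still belongs to the block
        if in_block != invert:
            out.append(line)
    return out

lines = ["header", "+MARKER", "data 1", "-MARKER", "footer"]
block = extract_blocks_sketch(lines, r"\+MARKER", r"-MARKER")
rest = extract_blocks_sketch(lines, r"\+MARKER", r"-MARKER", invert=True)
```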

geodezyx.utils.utils_core.get_computer_name()
geodezyx.utils.utils_core.get_function_name()
geodezyx.utils.utils_core.get_specific_locals(prefix)

Get the local parameters with ‘prefix’ in their name. The ‘prefix’ can actually be a suffix.

geodezyx.utils.utils_core.get_timestamp(outstring=True, separator='T', utc=False)

Frontend to easily get a timestamp.

geodezyx.utils.utils_core.get_type_smart(obj_in)

Get the type of an object, in order to easily convert another object to this type. For instance, type(np.array(A)) doesn’t return a constructor.

geodezyx.utils.utils_core.get_username()
geodezyx.utils.utils_core.globals_filtered()

Filter globals() variables with only compatible variables for pickle.

https://stackoverflow.com/questions/2960864/how-to-save-all-the-variables-in-the-current-python-session

Returns:

data_out – filtered globals() variables.

Return type:

dict

geodezyx.utils.utils_core.greek_alphabet(num=None, maj=False)
geodezyx.utils.utils_core.indice_printer(i, print_every=10, text_before='')

Print an index every N iterations.

geodezyx.utils.utils_core.is_in_str(string, *patterns)

Recipe for the well-known problem of finding patterns in a string, from http://stackoverflow.com/questions/3389574/check-if-multiple-strings-exist-in-another-string

geodezyx.utils.utils_core.is_iterable(inp, consider_str_as_iterable=False, consider_dict_as_iterable=False)

Test whether the input is an iterable, such as a list or a numpy array.

Parameters:
  • inp (list, numpy.array, ...)

  • consider_str_as_iterable (bool) – Python treats strings as iterable by default. When this boolean is False, the function does not return True for a string.

Returns:

out – True if inp is iterable, False otherwise

Return type:

bool
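The special handling of strings and dictionaries can be sketched as below. The name is_iterable_sketch is hypothetical and the logic is an assumption based on the parameter names; the actual function may rely on different type checks.

```python
def is_iterable_sketch(inp, consider_str_as_iterable=False,
                       consider_dict_as_iterable=False):
    """Return True for list-like inputs, with opt-in for str and dict."""
    if isinstance(inp, str):
        return consider_str_as_iterable
    if isinstance(inp, dict):
        return consider_dict_as_iterable
    try:
        iter(inp)  # raises TypeError for non-iterable objects
    except TypeError:
        return False
    return True

print(is_iterable_sketch([1, 2, 3]), is_iterable_sketch("abc"), is_iterable_sketch(3.14))
```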

geodezyx.utils.utils_core.is_lambda(v)

Check if v is a lambda function.

Source

https://stackoverflow.com/questions/3655842/how-can-i-test-whether-a-variable-holds-a-lambda

geodezyx.utils.utils_core.is_not_iterable(inp, consider_str_as_iterable=False)

Simple negation of is_iterable()

geodezyx.utils.utils_core.join_improved(strseparat, *varsin)
geodezyx.utils.utils_core.line_count(filein)
geodezyx.utils.utils_core.line_in_file_checker(file_path, string)
geodezyx.utils.utils_core.listify(inp)

Convert the input into a list.

Parameters:

inp (any) – The input to be converted into a list.

Returns:

A list containing the input elements if the input is iterable, otherwise a list with the input as its single element.

Return type:

list
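A minimal sketch of this conversion, assuming that strings and bytes are wrapped as a single element rather than split into characters. The name listify_sketch is hypothetical and this choice is not confirmed by the description above.

```python
def listify_sketch(inp):
    """Return inp as a list, wrapping scalars and strings in a new list."""
    if isinstance(inp, (str, bytes)):
        return [inp]  # assumption: a string counts as a single element
    try:
        return list(inp)
    except TypeError:  # not iterable
        return [inp]

print(listify_sketch(5), listify_sketch((1, 2)), listify_sketch("ABMF"))
```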

geodezyx.utils.utils_core.mdot(*args)
geodezyx.utils.utils_core.mdotr(*args)
geodezyx.utils.utils_core.memmap_from_array(arrin)
geodezyx.utils.utils_core.mmpa(arrin)
geodezyx.utils.utils_core.multidot(tupin)
geodezyx.utils.utils_core.open_readlines_smart(file_in, decode_type='iso-8859-1', verbose=False)

This function takes an input object, opens it, and reads its lines. The input file can be the path of a file as a string or as a Path object, or the file content as a string, bytes, StringIO object, or a list of lines.

Parameters:
  • file_in (various) – An input object. This can be a string representing a file path, a Path object, a string representing file content, bytes, a StringIO object, or a list of lines.

  • decode_type (str, optional) – The decode standard. Default is “iso-8859-1”.

  • verbose (bool, optional) – If set to True, the function will print the type of the input file. Default is False.

Returns:

lines – A list of the lines in the input file.

Return type:

list

Raises:

FileNotFoundError – If the file specified by file_in does not exist.
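The dispatch over input types can be sketched as follows. This is a simplified illustration, not the library code: read_lines_sketch is a hypothetical name, and treating any string that contains a line break as content rather than a path is an assumed heuristic.

```python
import io
from pathlib import Path

def read_lines_sketch(file_in, decode_type="iso-8859-1"):
    """Return a list of lines from several kinds of input."""
    if isinstance(file_in, list):
        return file_in  # already a list of lines
    if isinstance(file_in, io.StringIO):
        return file_in.getvalue().splitlines(keepends=True)
    if isinstance(file_in, bytes):
        return file_in.decode(decode_type).splitlines(keepends=True)
    if isinstance(file_in, str) and "\n" in file_in:
        # Assumed heuristic: a multi-line string is content, not a path
        return file_in.splitlines(keepends=True)
    # Remaining case: a path (str or Path); open() raises FileNotFoundError
    with open(Path(file_in), encoding=decode_type) as f:
        return f.readlines()

from_buffer = read_lines_sketch(io.StringIO("a\nb\n"))
from_bytes = read_lines_sketch(b"a\nb")
from_text = read_lines_sketch("x\ny\n")
```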

geodezyx.utils.utils_core.pickle_loader(pathin)

Load a Python object saved as a Pickle file.

Wrapper of pickle.load

Parameters:

pathin (str) – the input pickle file path.

Returns:

outdata – Data loaded from the pickle.

Return type:

generic

geodezyx.utils.utils_core.pickle_saver(datain, outdir=None, outname=None, ext='.pik', timestamp=False, full_path=None)

Save a Python object in a Pickle file.

Wrapper of pickle.dump

Parameters:
  • datain (generic) – Data which will be saved as a pickle.

  • outdir (str, optional) – output directory. The default is None.

  • outname (str, optional) – pickle output name. The default is None.

  • ext (str, optional) – pickle file extension. The default is ‘.pik’.

  • timestamp (bool, optional) – add the timestamp in the pickle’s filename. The default is False.

  • full_path (str, optional) – full path where the pickle is saved. If full_path is given, it overrides outdir and outname. The default is None.

Returns:

outpath – the output pickle file path.

Return type:

str
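Since pickle_saver and pickle_loader are described as wrappers of pickle.dump and pickle.load, a save and load round trip can be sketched as below. The two function names are hypothetical and the path construction (outdir, outname and ext simply concatenated) is an assumption; the timestamp and full_path options are left out.

```python
import os
import pickle
import tempfile

def pickle_saver_sketch(datain, outdir, outname, ext=".pik"):
    """Dump datain to outdir/outname+ext and return the path."""
    outpath = os.path.join(outdir, outname + ext)
    with open(outpath, "wb") as f:
        pickle.dump(datain, f)
    return outpath

def pickle_loader_sketch(pathin):
    """Load the object stored in a pickle file."""
    with open(pathin, "rb") as f:
        return pickle.load(f)

# Round trip with dummy data
data = {"site": "SITE", "xyz": [1.0, 2.0, 3.0]}
path = pickle_saver_sketch(data, tempfile.mkdtemp(), "coords")
restored = pickle_loader_sketch(path)
```

Pickle files can execute arbitrary code when loaded, so only files from a trusted origin should be opened this way.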

geodezyx.utils.utils_core.read_comments(filein, comment='#')
geodezyx.utils.utils_core.read_mat_file(pathin, full=False)

Low-level reader for a MATLAB .mat file.

geodezyx.utils.utils_core.replace_in_file(file_in, str_before, str_after)

https://stackoverflow.com/questions/17140886/how-to-search-and-replace-text-in-a-file

geodezyx.utils.utils_core.save_array_fast(arrin, outname='', outdir='/home/psakicki/aaa_FOURBI', txt=True)
geodezyx.utils.utils_core.save_obj_as_file(objin, pathin, prefix, ext='.exp', suffix='')

Old proto-version of the pickle saver. DISCONTINUED.

geodezyx.utils.utils_core.split_improved(strin, sep_left, sep_right)
geodezyx.utils.utils_core.split_string_after_n_chars_at_space(s, n)

Splits a string into substrings of a maximum length of n characters.

This function splits a string into substrings of a maximum length of n characters, but only splits at spaces and inserts a newline (\n) after each substring.

Parameters:
  • s (str) – The input string to be split.

  • n (int) – The maximum length of each substring.

Returns:

The modified string with newlines inserted.

Return type:

str
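A comparable result can be obtained with the standard textwrap module, as in this sketch. The name split_at_space_sketch is hypothetical, and the real function may handle whitespace or over-long words differently.

```python
import textwrap

def split_at_space_sketch(s, n):
    """Insert newlines at spaces so that no line exceeds n characters."""
    # Words longer than n are kept whole instead of being cut
    return textwrap.fill(s, width=n, break_long_words=False,
                         break_on_hyphens=False)

wrapped = split_at_space_sketch(
    "The geodezyx toolbox is a software for simple but useful functions", 20
)
print(wrapped)
```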

geodezyx.utils.utils_core.spyder_run_check()

Check if the code is run inside Spyder IDE

geodezyx.utils.utils_core.str2float_smart(str_in)
geodezyx.utils.utils_core.str2int_float_autodetect(str_list_in)
geodezyx.utils.utils_core.str2int_smart(str_in)
geodezyx.utils.utils_core.str_2_float_line(line, sep=' ', out_type=<class 'float'>)

Convert a line of numbers (as a str) to a list of floats (or of another out_type).

geodezyx.utils.utils_core.stringizer(tupin, separ=' ', eol=True)

Transform the elements of a tuple into a string line, ready to be written to a file.

geodezyx.utils.utils_core.timeout(func, args=(), kwargs={}, timeout_duration=1, default=None)

This function spawns a thread, runs the given function with args and kwargs, and returns the given default value if timeout_duration is exceeded.

http://stackoverflow.com/questions/366682/how-to-limit-execution-time-of-a-function-call-in-python
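The thread-based approach can be sketched as below. This is a minimal illustration (timeout_sketch is a hypothetical name), assuming that the worker thread is simply abandoned when the time limit is reached.

```python
import threading
import time

def timeout_sketch(func, args=(), kwargs=None, timeout_duration=1, default=None):
    """Run func in a thread; return default if it exceeds timeout_duration."""
    kwargs = kwargs or {}
    result = [default]

    def target():
        result[0] = func(*args, **kwargs)

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_duration)
    if worker.is_alive():
        return default  # still running: give up and return the default
    return result[0]

fast = timeout_sketch(pow, args=(2, 10))
slow = timeout_sketch(time.sleep, args=(0.5,), timeout_duration=0.05,
                      default="timed out")
```

A Python thread cannot be killed from outside, so a function that times out keeps running in the background until it finishes or the interpreter exits.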

geodezyx.utils.utils_core.transpose_vector_array(X)

Transpose an Nx3 array to a 3xN array if necessary (required for some usages).

Parameters:

X (iterable)

Returns:

X – X transposed if necessary.

Return type:

iterable

geodezyx.utils.utils_core.trunc(f, n)

Truncates/pads a float f to n decimal places without rounding
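Truncation without rounding can be sketched with the decimal module. The name trunc_sketch is hypothetical, and returning a string is an assumption, since the return type is not documented above.

```python
from decimal import Decimal, ROUND_DOWN

def trunc_sketch(f, n):
    """Truncate or zero-pad f to n decimal places, returned as a string."""
    quantum = Decimal(1).scaleb(-n)  # 10**-n, i.e. 0.01 for n = 2
    truncated = Decimal(repr(f)).quantize(quantum, rounding=ROUND_DOWN)
    return format(truncated, "f")

print(trunc_sketch(1.23789, 2), trunc_sketch(1.5, 3))
```

Unlike round(1.23789, 2), which gives 1.24, the digits beyond the requested precision are simply dropped.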

geodezyx.utils.utils_core.vectorialize(array_in)

Redundant with .flat ???