geodezyx.utils package
Submodules
geodezyx.utils.dict_utils module
@author: psakic
This sub-module of geodezyx.utils contains functions for manipulating Python dictionaries.
It can be imported directly with: from geodezyx import utils
The geodezyx toolbox is a software for simple but useful functions for Geodesy and Geophysics under the GNU LGPL v3 License
Copyright (C) 2019 Pierre Sakic et al. (IPGP, sakic@ipgp.fr) GitHub repository : https://github.com/IPGP/geodezyx
- geodezyx.utils.dict_utils.dic_key_for_vals_list_finder(dic_in, value_in)
Find the key in a dictionary of lists that contains a given value.
- Parameters:
dic_in (dict) –
Dictionary with lists as values, e.g.:
dic_in[key1] = [val1a, val1b] dic_in[key2] = [val2a, val2b, val2c]
value_in (object) – Value to search for in the lists.
- Returns:
The key associated with the list containing value_in, or None if not found.
- Return type:
key or None
Warning
This function returns the first key found. The input dictionary should be injective (no duplicate values across lists) for predictable behavior.
Notes
Example: if value_in = val2b, the function returns key2.
Uses log.warning() to report when no key is found for the given value.
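The lookup described above can be sketched in plain Python. This is an illustration of the documented behavior, not the geodezyx implementation; the helper name key_for_value and the sample station codes are hypothetical, and the log.warning() call is omitted.

```python
def key_for_value(dic_in, value_in):
    """Return the first key whose list contains value_in, or None."""
    for key, values in dic_in.items():
        if value_in in values:
            return key
    return None


# Hypothetical sample data: region names mapped to lists of station codes
stations = {"europe": ["BRUX", "ZIMM"], "americas": ["ALGO", "BRAZ", "SANT"]}
print(key_for_value(stations, "BRAZ"))  # americas
print(key_for_value(stations, "XXXX"))  # None
```

If a value appeared in several lists, only the first matching key (in dictionary order) would be returned, which is why the warning above recommends an injective dictionary.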
- geodezyx.utils.dict_utils.dicts_merge(*dict_args)
Merge multiple dictionaries into a single dictionary.
Performs a shallow copy and merge of any number of dictionaries with precedence going to key-value pairs in later dictionaries.
- Parameters:
*dict_args (dict) – Variable number of dictionaries to merge.
- Returns:
A new merged dictionary.
- Return type:
dict
Warning
First values will be erased if the same key is present in following dictionaries. Later dictionaries override earlier ones.
Notes
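A minimal sketch of the precedence rule described above, assuming a plain left-to-right update. The function name merge_dicts and the sample keys are hypothetical; the actual geodezyx code may differ.

```python
def merge_dicts(*dict_args):
    """Shallow-merge dictionaries; later dictionaries take precedence."""
    merged = {}
    for d in dict_args:
        merged.update(d)  # existing keys are overwritten by later values
    return merged


defaults = {"sampling": 30, "system": "G"}
overrides = {"sampling": 1}
print(merge_dicts(defaults, overrides))  # {'sampling': 1, 'system': 'G'}
```

The inputs are left untouched because a new dictionary is built, but nested objects are shared between inputs and output since the copy is shallow.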
- geodezyx.utils.dict_utils.dicts_of_list_merge(*dict_args)
Merge multiple dictionaries of lists into a single dictionary.
- Parameters:
*dict_args (dict) – Variable number of dictionaries with lists as values.
- Returns:
Merged dictionary where lists from all input dictionaries are combined.
- Return type:
dict
See also
dicts_of_list_merge_mono: Merge two dictionaries of lists.
- geodezyx.utils.dict_utils.dicts_of_list_merge_mono(dol1, dol2)
Merge two dictionaries of lists by combining list values for common keys.
- Parameters:
dol1 (dict) – First dictionary with lists as values.
dol2 (dict) – Second dictionary with lists as values.
- Returns:
Merged dictionary where lists from both input dictionaries are combined.
- Return type:
dict
Notes
See https://stackoverflow.com/questions/1495510/combining-dictionaries-of-lists-in-python
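The idea of combining dictionaries of lists can be sketched as follows. This assumes that lists sharing a key are simply concatenated in argument order; the name merge_dicts_of_lists and the sample data are hypothetical, and this is not the library's own code.

```python
def merge_dicts_of_lists(*dict_args):
    """Concatenate the lists stored under each key across all inputs."""
    merged = {}
    for d in dict_args:
        for key, values in d.items():
            # setdefault creates a fresh list, so the inputs are not mutated
            merged.setdefault(key, []).extend(values)
    return merged


a = {"G": [1, 2], "R": [5]}
b = {"G": [3], "E": [11]}
print(merge_dicts_of_lists(a, b))  # {'G': [1, 2, 3], 'R': [5], 'E': [11]}
```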
geodezyx.utils.list_utils module
@author: psakic
This sub-module of geodezyx.utils contains functions for manipulating Python lists.
It can be imported directly with: from geodezyx import utils
- geodezyx.utils.list_utils.chunkIt(seq, num)
Divide a list into num sublists of approximately equal size.
- Parameters:
seq (list or array-like) – Input sequence to divide.
num (int) – Desired number of sublists.
- Returns:
List of sublists, each roughly equal in size.
- Return type:
list
Notes
The sublists may vary in size by 1 element if the sequence length is not evenly divisible by num.
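One way to obtain sublists whose sizes differ by at most one element is shown below. This is a sketch under the assumption that the extra elements go to the first sublists; chunkIt itself may distribute them differently. The name chunk_evenly is hypothetical.

```python
def chunk_evenly(seq, num):
    """Split seq into num sublists whose lengths differ by at most 1."""
    size, extra = divmod(len(seq), num)
    chunks = []
    start = 0
    for i in range(num):
        # the first `extra` chunks receive one additional element
        stop = start + size + (1 if i < extra else 0)
        chunks.append(list(seq[start:stop]))
        start = stop
    return chunks


print(chunk_evenly(list(range(7)), 3))  # [[0, 1, 2], [3, 4], [5, 6]]
```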
- geodezyx.utils.list_utils.consecutive_groupIt(data, only_start_end=False)
Identify groups of continuous numbers in a list.
- Parameters:
data (list or array-like) – Input sequence of numbers.
only_start_end (bool, optional) – If True, return only (start, end) tuples for each group. If False, return full lists of elements in each group. Default is False.
- Returns:
List of groups. Each group is either a list of consecutive elements (if only_start_end=False) or a tuple (start, end) (if only_start_end=True).
- Return type:
list
Notes
Useful for grouping time periods, after first converting the dates to MJD.
Source: https://stackoverflow.com/questions/2154249/identify-groups-of-continuous-numbers-in-a-list
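The grouping can be sketched with itertools.groupby, using the fact that index minus value stays constant inside a run of consecutive integers. The function name consecutive_groups and the sample MJD values are hypothetical; this illustrates the technique rather than reproducing the geodezyx code.

```python
from itertools import groupby


def consecutive_groups(data, only_start_end=False):
    """Group runs of consecutive integers found in data."""
    groups = []
    # index - value is constant within a run of consecutive integers
    for _, run in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1]):
        values = [value for _, value in run]
        groups.append((values[0], values[-1]) if only_start_end else values)
    return groups


mjd = [58000, 58001, 58002, 58010, 58011, 58020]
print(consecutive_groups(mjd))
# [[58000, 58001, 58002], [58010, 58011], [58020]]
print(consecutive_groups(mjd, only_start_end=True))
# [(58000, 58002), (58010, 58011), (58020, 58020)]
```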
- geodezyx.utils.list_utils.decimateIt(listinp, n)
Decimate a list by selecting every n-th element.
- Parameters:
listinp (list or array-like) – Input sequence to decimate.
n (int) – Decimation factor. Elements at indices where i % n == 0 are selected.
- Returns:
Decimated list containing every n-th element.
- Return type:
list
- geodezyx.utils.list_utils.df_sel_val_in_col(df, col_name, col_val)
Select rows from a DataFrame where a column matches a specific value.
- Parameters:
df (pandas.DataFrame) – Input DataFrame.
col_name (str) – Name of the column to filter by.
col_val (scalar) – Value to match in the specified column.
- Returns:
Filtered DataFrame containing only rows where col_name == col_val.
- Return type:
pandas.DataFrame
- geodezyx.utils.list_utils.dicofdic(mat, names)
Create a 2D dictionary from a matrix and corresponding names.
- Parameters:
mat (array-like) – N x N matrix of values.
names (list) – List of N names to use as keys for both dimensions.
- Returns:
A nested dictionary where dic[name1][name2] = mat[i, j] where i and j are the indices corresponding to name1 and name2.
- Return type:
dict
Notes
Source: http://stackoverflow.com/questions/13326042/2d-dictionary-with-multiple-keys-per-value
- geodezyx.utils.list_utils.duplicates_finder(seq)
Find all duplicate elements in a sequence.
- Parameters:
seq (iterable) – Input sequence to search for duplicates.
- Returns:
List of elements that appear more than once in the input sequence.
- Return type:
list
Notes
Source: http://stackoverflow.com/questions/9835762/find-and-list-duplicates-in-python-list
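A short sketch of duplicate detection with collections.Counter, assuming each duplicated element should be reported once. The name find_duplicates is hypothetical, and the order of the result in the library function may differ.

```python
from collections import Counter


def find_duplicates(seq):
    """Return the elements that occur more than once, in first-seen order."""
    counts = Counter(seq)
    return [item for item, count in counts.items() if count > 1]


print(find_duplicates(["a", "b", "a", "c", "b", "a"]))  # ['a', 'b']
```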
- geodezyx.utils.list_utils.find_common_elts(*lists)
Find common elements across multiple lists.
- Parameters:
*lists (list) – Variable number of input lists.
- Returns:
Sorted array of elements common to all input lists.
- Return type:
numpy.ndarray
- geodezyx.utils.list_utils.find_index_multi_occurences(l, elt)
Find all indices where an element occurs in a list.
- Parameters:
l (list) – Input list to search.
elt (scalar) – Element to find.
- Returns:
List of indices where elt appears in l.
- Return type:
list
- geodezyx.utils.list_utils.find_interval_bound(listin, val, outindexes=True)
Find the bounding values/indices of an interval around a target value.
- Parameters:
listin (list or array-like) – Input list/array (assumed to be sorted).
val (scalar) – Target value to find bounds for.
outindexes (bool, optional) – If True, return indices of bounds. If False, return the bounding values. Default is True.
- Returns:
If outindexes is True, returns (lower_index, upper_index). If outindexes is False, returns (lower_value, upper_value).
- Return type:
tuple
- geodezyx.utils.list_utils.find_nearest(listin, value)
Find the nearest value in a list to a target value.
- Parameters:
listin (list or array-like) – Input list/array to search.
value (scalar) – Target value to find nearest element to.
- Returns:
(nearest_value, index_of_nearest) where nearest_value is the element in listin closest to value, and index_of_nearest is its index.
- Return type:
tuple
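The nearest-value search can be sketched in pure Python by minimizing the absolute difference. The name nearest is hypothetical; the library function may rely on NumPy and may break ties differently.

```python
def nearest(listin, value):
    """Return (nearest_value, index_of_nearest) for value in listin."""
    index = min(range(len(listin)), key=lambda i: abs(listin[i] - value))
    return listin[index], index


print(nearest([0.0, 2.5, 5.0, 7.5], 5.9))  # (5.0, 2)
```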
- geodezyx.utils.list_utils.find_regex_in_list(regex, L, only_first_occurence=False, line_number=False)
Find elements in a list matching a regular expression pattern.
- Parameters:
regex (str) – Regular expression pattern to search for.
L (list) – List of strings to search.
only_first_occurence (bool, optional) – If True, return only the first match. If False, return all matches. Default is False.
line_number (bool, optional) – If True, return tuples of (index, element). If False, return just elements. Default is False.
- Returns:
If only_first_occurence=True, returns a single match (element or tuple). If only_first_occurence=False, returns a list of matches. Format depends on line_number parameter.
- Return type:
list or scalar
- geodezyx.utils.list_utils.find_surrounding(L, v)
Find the two nearest values surrounding a target value.
- Parameters:
L (iterable) – Input list/array to search.
v (scalar) – Target value to find surrounding values for.
- Returns:
(surrounding_values, surrounding_indices) where:
surrounding_values is a tuple of the two nearest values;
surrounding_indices is a tuple of their indices in L.
- Return type:
tuple
- geodezyx.utils.list_utils.get_interval(start, end, delta)
Generate a list of values at regular intervals between start and end.
- Parameters:
start (numeric) – Starting value (inclusive).
end (numeric) – Ending value (exclusive).
delta (numeric) – Step size between consecutive values.
- Returns:
List of values from start to end with step delta.
- Return type:
list
Notes
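A minimal sketch of the interval generation described above, with an inclusive start and an exclusive end. The name interval is hypothetical, and the loop assumes a positive delta; it is not the geodezyx implementation.

```python
def interval(start, end, delta):
    """List the values start, start + delta, ... that are strictly below end."""
    values = []
    current = start
    while current < end:  # end is exclusive
        values.append(current)
        current += delta
    return values


print(interval(0, 10, 3))  # [0, 3, 6, 9]
```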
- geodezyx.utils.list_utils.groups_near_central_values(a, tol, b=None)
Group elements of an array by proximity to unique central values.
- Parameters:
a (array-like) – Input array to group.
tol (float) – Absolute tolerance for grouping elements near central values.
b (array-like, optional) – Auxiliary array corresponding to elements in a. Default is None.
- Returns:
If b is None, returns a list of lists where each sublist contains elements from a grouped around a central value. If b is provided, returns a tuple (groups_A, groups_B) containing grouped elements from both arrays.
- Return type:
list or tuple
Notes
This function is in beta status and may have bugs if tolerance is poorly chosen.
- geodezyx.utils.list_utils.identical_consecutive_eltsIt(linp)
Group consecutive identical elements together.
- Parameters:
linp (list or iterable) – Input sequence with potentially repeated consecutive elements.
- Returns:
List of lists, where each inner list contains consecutive identical elements.
- Return type:
list
- geodezyx.utils.list_utils.identical_groupIt(data)
Group consecutive identical elements together.
- Parameters:
data (list or iterable) – Input sequence with potentially repeated consecutive elements.
- Returns:
List of lists, where each inner list contains consecutive identical elements.
- Return type:
list
Notes
- geodezyx.utils.list_utils.is_listoflist(inp)
Check if inp is a list of lists.
- Parameters:
inp (iterable) – Input object to check.
- Returns:
True if inp contains at least one list or numpy array element, False otherwise.
- Return type:
bool
Examples
>>> is_listoflist([[1, 2], [3, 4]])
True
>>> is_listoflist([1, 2, 3])
False
- geodezyx.utils.list_utils.median_improved(l)
Calculate the median of a list, handling even-length lists differently.
- Parameters:
l (list or array-like) – Input sequence.
- Returns:
The median value. For even-length lists, returns the nearest value in the list to the actual median instead of interpolating.
- Return type:
scalar
Notes
For even-length lists, does not return the mean of the two middle values but instead returns the nearest value from the input list.
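The even-length behavior can be illustrated as follows. Note that the two middle values are always equidistant from the interpolated median, so the result depends on tie-breaking; this sketch keeps whichever comes first in list order, which is an assumption and may not match the library. The name median_from_list is hypothetical.

```python
import statistics


def median_from_list(values):
    """Median that is always an element of values."""
    med = statistics.median(values)
    if len(values) % 2 == 1:
        return med  # odd length: the median is already a list element
    # even length: pick the list element closest to the interpolated median
    return min(values, key=lambda v: abs(v - med))


print(median_from_list([3, 1, 2]))      # 2
print(median_from_list([1, 2, 4, 10]))  # 2 (the interpolated median would be 3.0)
```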
- geodezyx.utils.list_utils.middle(linp)
Calculate the midpoints between consecutive elements of a list.
- Parameters:
linp (list or array-like) – Input sequence with at least 2 elements.
- Returns:
List of midpoint values between consecutive elements.
- Return type:
list
- geodezyx.utils.list_utils.minmax(l)
Find the minimum and maximum values in a list.
- Parameters:
l (list or array-like) – Input sequence.
- Returns:
(min_value, max_value) of the input sequence.
- Return type:
tuple
- geodezyx.utils.list_utils.most_common(lst)
Find the most frequently occurring element in a list.
- Parameters:
lst (list or iterable) – Input sequence to analyze.
- Returns:
The element with the highest frequency in the list.
- Return type:
scalar
Notes
Source: http://stackoverflow.com/questions/1518522/python-most-common-element-in-a-list
- geodezyx.utils.list_utils.occurence(l, tolerence=None, pretty_output=False)
Count occurrences of elements in a list.
- Parameters:
l (list) – Input list
tolerence (float, optional) – Tolerance used to find close elements of l. If no tolerance is given, a set() is used.
pretty_output (bool) –
If False, return a list of 2-tuples:
(element of the list, number of occurrences of this element in the list)
If True, return a tuple with sorted occurrences and values.
- Returns:
output – See pretty_output parameter
- Return type:
list or tuple
Notes
pretty_output was implemented because the first mode is not really useful (180612). The equality test was also replaced by an is-close test.
- geodezyx.utils.list_utils.second_smallest(numbers)
Find the second smallest value in a sequence.
- Parameters:
numbers (iterable) – Input sequence to analyze.
- Returns:
The second smallest element in the sequence.
- Return type:
scalar
Notes
Returns infinity if there are fewer than 2 elements.
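A single-pass sketch that is consistent with the note above: both trackers start at infinity, so a sequence with fewer than two elements yields infinity. The name second_smallest_sketch is hypothetical, and how the library treats repeated minima is not confirmed by this page.

```python
def second_smallest_sketch(numbers):
    """Return the second smallest element, or infinity if there is none."""
    smallest = second = float("inf")
    for x in numbers:
        if x <= smallest:
            # new minimum: the previous minimum becomes the runner-up
            smallest, second = x, smallest
        elif x < second:
            second = x
    return second


print(second_smallest_sketch([4, 1, 3, 2]))  # 2
print(second_smallest_sketch([7]))           # inf
```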
- geodezyx.utils.list_utils.shrink_listoflist(lin)
Shrink a list of lists if it contains only one sublist.
If lin is a list of lists and contains only one element, returns the inner sublist, e.g. [[a, b, c]] => [a, b, c].
- Parameters:
lin (list) – Input list, potentially a list of lists.
- Returns:
The single sublist if lin is a one-element list of lists, otherwise lin unchanged.
- Return type:
list
Examples
>>> shrink_listoflist([[1, 2, 3]])
[1, 2, 3]
>>> shrink_listoflist([[1, 2], [3, 4]])
[[1, 2], [3, 4]]
- geodezyx.utils.list_utils.sliceIt(seq, num)
Divide a list into sublists of fixed size.
- Parameters:
seq (list or array-like) – Input sequence to divide.
num (int) – Size of each sublist.
- Returns:
List of sublists, each containing num elements (last sublist may be shorter).
- Return type:
list
Notes
Source: http://stackoverflow.com/questions/4501636/creating-sublists
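Fixed-size slicing can be sketched with a stepped range. In contrast to chunkIt, the second argument here is the size of each sublist rather than the number of sublists. The name slice_fixed is hypothetical.

```python
def slice_fixed(seq, num):
    """Split seq into sublists of num elements; the last one may be shorter."""
    return [seq[i:i + num] for i in range(0, len(seq), num)]


print(slice_fixed(list(range(7)), 3))  # [[0, 1, 2], [3, 4, 5], [6]]
```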
- geodezyx.utils.list_utils.sort_basename(file_paths)
Sort a list of file paths by their basenames.
- Parameters:
file_paths (list) – List of file paths to be sorted.
- Returns:
Sorted list of file paths by their basenames.
- Return type:
list
- geodezyx.utils.list_utils.sort_binom_list(x, y, array_out=False)
Sort y according to x, and sort x itself.
- Parameters:
x (list or array-like) – Reference values to sort by.
y (list or array-like) – Values to sort according to the ordering of x.
array_out (bool, optional) – If True, return numpy arrays. If False, return lists. Default is False.
- Returns:
(xnew, ynew) where both are sorted according to x. Type depends on array_out parameter.
- Return type:
tuple
- Raises:
Warning – If len(x) != len(y), a warning is logged.
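The paired sort can be sketched by zipping the two sequences, sorting on the reference values, and unzipping. The name sort_pair is hypothetical; the array_out option and the length warning are left out of this illustration.

```python
def sort_pair(x, y):
    """Sort x, and reorder y so that each y value follows its x value."""
    pairs = sorted(zip(x, y), key=lambda pair: pair[0])
    xnew = [p[0] for p in pairs]
    ynew = [p[1] for p in pairs]
    return xnew, ynew


print(sort_pair([3, 1, 2], ["c", "a", "b"]))  # ([1, 2, 3], ['a', 'b', 'c'])
```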
- geodezyx.utils.list_utils.sort_multinom_list(x, *y)
Sort multiple y sequences according to x, and sort x itself.
- Parameters:
x (list or array-like) – Reference values to sort by.
*y (list or array-like) – Variable number of sequences to sort according to the ordering of x.
- Returns:
(xnew, Ynew_1, Ynew_2, …) where all sequences are sorted according to x. x is returned as a numpy array, while the y sequences are returned as lists.
- Return type:
tuple
- geodezyx.utils.list_utils.sort_table(table, col)
Sort a table by a given column.
- Parameters:
table (list of lists or tuple of tuples) – Table to sort, where each inner sequence represents a row.
col (int) – Column number to sort by.
- Returns:
outtable – Sorted table.
- Return type:
list
- geodezyx.utils.list_utils.sublistsIt(seq, lenofsublis_lis, output_array=False)
Divide a sequence into sublists of specified sizes.
- Parameters:
seq (list or array-like) – Input sequence to divide.
lenofsublis_lis (list) – List of integers specifying the size of each sublist. Example: [2, 3, 4, 2] creates 4 sublists of sizes 2, 3, 4, and 2.
output_array (bool, optional) – If True, return list of numpy arrays. If False, return list of lists. Default is False.
- Returns:
List of sublists (or arrays if output_array=True).
- Return type:
list
- Raises:
Exception – If sum(lenofsublis_lis) != len(seq).
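Splitting by a list of sizes can be sketched with a running start index. The name split_by_sizes is hypothetical; the sketch raises ValueError (a subclass of Exception) on a size mismatch, which is an assumption about the exact exception type, and it omits the output_array option.

```python
def split_by_sizes(seq, sizes):
    """Split seq into consecutive sublists whose lengths are given by sizes."""
    if sum(sizes) != len(seq):
        raise ValueError("sum of sizes must equal len(seq)")
    out = []
    start = 0
    for size in sizes:
        out.append(list(seq[start:start + size]))
        start += size
    return out


print(split_by_sizes(list(range(6)), [1, 2, 3]))  # [[0], [1, 2], [3, 4, 5]]
```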
- geodezyx.utils.list_utils.trio_lists_2_tab(xlis, ylis, vlis)
Convert three lists into a 2D table structure.
- Parameters:
xlis (list) – List of X values that will become column headers.
ylis (list) – List of Y values that will become row headers.
vlis (list) – List of data values corresponding to (X, Y) pairs.
- Returns:
A 2D table structure where the first row contains X values, and subsequent rows contain Y value and corresponding V values. Compatible with the tabulate module.
- Return type:
list
Notes
The data lookup is performed with a brute-force approach (nested loops), which is not optimized for large datasets.
Examples
>>> trio_lists_2_tab([1, 2, 1, 2], [1, 1, 2, 2], [10, 20, 30, 40])
[[1, 2], [1, 10, 20], [2, 30, 40]]
- geodezyx.utils.list_utils.uniq_and_sort(l, natural_sort=True)
Remove duplicates from a list and sort it.
- Parameters:
l (list) – Input list to deduplicate and sort.
natural_sort (bool, optional) – If True, use natural sorting (default). If False, use standard sorting.
- Returns:
Sorted list with duplicate elements removed.
- Return type:
list
- geodezyx.utils.list_utils.uniq_set_list(setlis, frozen=True)
Remove duplicate sets from a list of sets.
- Parameters:
setlis (list of sets) – Input list containing sets or set-like iterables.
frozen (bool, optional) – If True, returns frozensets (hashable and immutable). Default is True.
- Returns:
List of unique sets (either sets or frozensets depending on frozen parameter).
- Return type:
list
- geodezyx.utils.list_utils.uniqify_list(seq, idfun=None)
Remove duplicate elements from a sequence while preserving order.
- Parameters:
seq (iterable) – Input sequence to deduplicate.
idfun (callable, optional) – Function to extract the identifier for uniqueness comparison. If None, the elements themselves are used as identifiers.
- Returns:
Deduplicated sequence with order preserved.
- Return type:
list
Notes
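Order-preserving deduplication with an optional identifier function can be sketched with a set of already seen identifiers. The name uniqify and the sample values are hypothetical; the sketch assumes the identifiers are hashable.

```python
def uniqify(seq, idfun=None):
    """Drop later duplicates from seq, keeping first occurrences in order."""
    if idfun is None:
        idfun = lambda item: item  # elements identify themselves
    seen = set()
    result = []
    for item in seq:
        marker = idfun(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result


print(uniqify(["b", "a", "b", "c", "a"]))            # ['b', 'a', 'c']
print(uniqify(["Brux", "BRUX", "zimm"], str.upper))  # ['Brux', 'zimm']
```

In the second call, idfun makes the comparison case-insensitive while the first spelling encountered is the one kept.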
- geodezyx.utils.list_utils.uniqify_list_of_lists(l)
Remove duplicate sublists from a list of lists.
- Parameters:
l (list of lists) – Input list containing sublists.
- Returns:
List of unique sublists.
- Return type:
list
Notes
Source: http://stackoverflow.com/questions/3724551/python-uniqueness-for-list-of-lists
- geodezyx.utils.list_utils.uniquetol(a, tol)
Find unique elements in an array within a tolerance threshold.
- Parameters:
a (array-like) – Input array.
tol (float) – Absolute tolerance for uniqueness comparison.
- Returns:
Array of unique elements within the specified tolerance.
- Return type:
numpy.ndarray
Notes
- geodezyx.utils.list_utils.uniquetol2(a, tol=1e-06)
Find unique elements in an array within a tolerance threshold (optimized version).
- Parameters:
a (array-like) – Input array.
tol (float, optional) – Tolerance for rounding before finding unique elements. Default is 10**-6.
- Returns:
Array of unique elements.
- Return type:
numpy.ndarray
Notes
This is a faster alternative to uniquetol. Source: https://stackoverflow.com/questions/5426908/find-unique-elements-of-floating-point-array-in-numpy-with-comparison-using-a-d
geodezyx.utils.shell_like module
@author: psakic
This sub-module of geodezyx.utils contains functions for shell-like operations in Python.
It can be imported directly with: from geodezyx import utils
- geodezyx.utils.shell_like.cat(outfilename, *infilenames)
Concatenate files.
- Parameters:
outfilename (str) – Output filename
infilenames (str) – Input filenames to concatenate
- Returns:
outfilename – The output filename
- Return type:
str
Notes
To simply print a file's contents, use cat_print.
References
http://stackoverflow.com/questions/11532980/reproduce-the-unix-cat-command-in-python (answer by kindall)
- geodezyx.utils.shell_like.cat_print(inpfile)
Print file contents to stdout, line by line.
- Parameters:
inpfile (str) – Path to the file to print.
- Return type:
None
- geodezyx.utils.shell_like.cat_remove_header(infilepath, outfilepath, header='', header_included=False)
Concatenate file content starting from a specified header line.
- Parameters:
infilepath (str) – Path to the input file.
outfilepath (str) – Path to the output file.
header (str, optional) – The header pattern to search for. Default is empty string.
header_included (bool, optional) – If True, include the header line in the output. Default is False.
- Returns:
Path to the output file.
- Return type:
str
- geodezyx.utils.shell_like.check_regex(filein, regex)
Check if a file contains a regex pattern.
- Parameters:
filein (str) – Path to the file to search.
regex (str) – Regular expression pattern to search for.
- Returns:
True if the pattern is found in the file, False otherwise.
- Return type:
bool
- geodezyx.utils.shell_like.copy_recursive(src, dst, force=False)
Copy a directory recursively from src to dst with an option to force overwrite.
- Parameters:
src (str) – The source directory path.
dst (str) – The destination directory path.
force (bool, optional) – If True, overwrite the destination directory if it exists. Default is False.
- Return type:
None
- geodezyx.utils.shell_like.create_dir(directory)
Create a directory if it does not already exist.
- Parameters:
directory (str) – Path to the directory to create.
- Returns:
The directory path.
- Return type:
str
- geodezyx.utils.shell_like.egrep_big_string(regex, bigstring, only_first_occur=False)
Perform a regex grep on a big string separated with newlines.
Deprecated: use grep() instead, which handles this functionality (as of 260121).
- Parameters:
regex (str) – Regular expression pattern to search for.
bigstring (str) – The large string (separated by newlines) to search in.
only_first_occur (bool, optional) – If True, return only the first matching line. Default is False.
- Returns:
Matching line(s) as str if single result, list if multiple results, or empty string if no matches.
- Return type:
str or list
Notes
This function should be improved to support plain pattern matching, without relying on regex.
- geodezyx.utils.shell_like.empty_file_check(fpath)
Check if a file is empty or does not exist.
- Parameters:
fpath (str) – The file path to check.
- Returns:
True if the file is empty or does not exist, False otherwise.
- Return type:
bool
See also
http://stackoverflow.com/questions/2507808/python-how-to-check-file-empty-or-not
- geodezyx.utils.shell_like.fileprint(output, outfile)
Log output to console and append to a file.
- Parameters:
output (str) – The output message to log and write.
outfile (str) – Path to the output file.
- Return type:
None
- geodezyx.utils.shell_like.find_recursive(parent_folder, pattern, sort_results=True, case_sensitive=True, extended_file_stats=False, warn_if_empty=True, regex=False)
Find files in a folder and its sub-folders recursively.
- Parameters:
parent_folder (str) – The parent folder path.
pattern (str) – The file name pattern to search for: a wildcard pattern (only * and ? are supported) when case_sensitive = True, or a regex when case_sensitive = False.
sort_results (bool) – Sort the results.
case_sensitive (bool) – Case sensitive or not. If False, the pattern must be a regex. Deprecated since 2025-01; use regex instead.
extended_file_stats (bool) –
If True, also return the stats of the files. The output matches list is then a list of tuples (file_path, stat_object), where stat_object has the following attributes:
st_mode - protection bits,
st_ino - inode number,
st_dev - device,
st_nlink - number of hard links,
st_uid - user id of owner,
st_gid - group id of owner,
st_size - size of file, in bytes,
st_atime - time of most recent access,
st_mtime - time of most recent content modification,
st_ctime - platform dependent; time of most recent metadata change on Unix, or the time of creation on Windows.
warn_if_empty (bool) – Print a debug warning if no files are found.
regex (bool) – If True, the pattern is a regular expression. Default is False.
- Returns:
matches (list) – Found files.
Source
https://stackoverflow.com/questions/2186525/use-a-glob-to-find-files-recursively-in-python
https://stackoverflow.com/questions/15652594/how-to-find-files-with-specific-case-insensitive-extension-names-in-python (for the case-insensitive case)
- geodezyx.utils.shell_like.glob_smart(dir_path, file_pattern=None, verbose=True)
Find files in a directory using glob pattern with optional logging.
- Parameters:
dir_path (str) – The directory path to search in.
file_pattern (str, optional) – File pattern to match. Default is None (search all files).
verbose (bool, optional) – If True, log warnings/info about search results. Default is True.
- Returns:
List of file paths matching the pattern.
- Return type:
list
- geodezyx.utils.shell_like.grep(file_in, search_string, only_first_occur=False, invert=False, regex=False, line_number=False, col=(None, None), force_list_output=False)
Search for lines matching a pattern in a file.
Returns an empty string if nothing is found, not a singleton list with an empty string inside.
- Parameters:
file_in (str or file-like object) – Path to the file or a file-like object to search.
search_string (str or list) – String(s) to search for.
only_first_occur (bool, optional) – Return only the first occurrence. Default is False.
invert (bool, optional) – If True, return lines that do NOT match the search string. Default is False.
regex (bool, optional) – If True, treat search_string as a regular expression. Default is False.
line_number (bool, optional) – If True, also return the line numbers. Default is False.
col (tuple of int, optional) – Column range (start, end) where the grep is executed. Use None for unbounded indices. Default is (None, None).
force_list_output (bool, optional) – If True, always return a list even for single elements. Default is False.
- Returns:
If line_number is True and single result: (line_number, line)
If line_number is True and multiple results: (line_numbers_list, lines_list)
Otherwise returns matching line(s) as str or list, or empty string if no matches.
- Return type:
str, list, or tuple
Notes
If nothing is found, an empty string is returned (not a singleton list).
search_string can be a list of patterns, in which case a line matching any of them is returned.
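The return-shape convention described above (empty string, single line, or list of lines) can be illustrated with a small sketch that works on an in-memory list of lines. The name grep_lines and the sample lines are hypothetical, and only the regex and invert options are covered; this is not the geodezyx implementation.

```python
import re


def grep_lines(lines, search_string, regex=False, invert=False):
    """Return matching lines as '', a single str, or a list of str."""
    matches = []
    for line in lines:
        if regex:
            found = bool(re.search(search_string, line))
        else:
            found = search_string in line
        if found != invert:  # invert=True keeps the non-matching lines
            matches.append(line)
    if not matches:
        return ""
    if len(matches) == 1:
        return matches[0]
    return matches


lines = ["G01 ok", "R05 bad", "G02 ok"]
print(grep_lines(lines, "G0"))   # ['G01 ok', 'G02 ok']
print(grep_lines(lines, "bad"))  # R05 bad
print(repr(grep_lines(lines, "E11")))  # ''
print(grep_lines(lines, r"^G\d+", regex=True, invert=True))  # R05 bad
```

Because the return type varies with the number of matches, callers that always need a list would use force_list_output=True in the documented function.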
- geodezyx.utils.shell_like.grep_boolean(file_in, search_string)
Check if a string exists in a file.
- Parameters:
file_in (str) – Path to the file to search.
search_string (str) – String to search for.
- Returns:
True if search_string is found in the file, False otherwise.
- Return type:
bool
- geodezyx.utils.shell_like.gzip_compress(inp_path, out_dir=None, out_fname=None, rm_inp=False)
Compress a file using gzip.
- Parameters:
inp_path (str) – Path to the input file to compress.
out_dir (str, optional) – Output directory. Default is None (same as input file).
out_fname (str, optional) – Output filename. Default is None (input filename + “.gz”).
rm_inp (bool, optional) – If True, remove the input file after compression. Default is False.
- Returns:
Path to the compressed output file.
- Return type:
str
- geodezyx.utils.shell_like.head(filename, count=1)
Get the first few lines of a file.
- Parameters:
filename (str or file-like object) – Path to the file or a file-like object (StringIO, BytesIO, etc.)
count (int, optional) – Number of lines to return from the beginning of the file. Default is 1.
- Returns:
List of lines from the beginning of the file.
- Return type:
list
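Reading only the first lines of a file-like object can be sketched with itertools.islice, which stops iterating after count lines instead of loading the whole file. The name head_lines is hypothetical; whether the library strips trailing newlines is not stated here, so this sketch leaves them in place.

```python
import io
from itertools import islice


def head_lines(fileobj, count=1):
    """Return the first count lines of an open file-like object."""
    return list(islice(fileobj, count))


buf = io.StringIO("line1\nline2\nline3\n")
print(head_lines(buf, 2))  # ['line1\n', 'line2\n']
```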
- geodezyx.utils.shell_like.insert_lines_in_file(file_path, text_values, lines_ids)
Insert text lines at specified positions in a file.
- Parameters:
file_path (str) – Path to the file to modify.
text_values (str or list) – Text string(s) to insert.
lines_ids (int or list) – Line number(s) where text should be inserted.
- Returns:
Path to the modified file.
- Return type:
str
- geodezyx.utils.shell_like.insert_str_in_file_if_line_contains(file_path, str_to_insert, line_pattern_tup, position=None, only_first_occur=False)
Insert a string at lines matching a pattern.
- Parameters:
file_path (str) – Path to the file to modify.
str_to_insert (str) – String to insert before matching lines.
line_pattern_tup (tuple) – Tuple of patterns to search for.
position (int, optional) – Position for insertion (not implemented). Default is None.
only_first_occur (bool, optional) – If True, only process the first occurrence. Default is False.
- Returns:
Path to the modified file.
- Return type:
str
Notes
The position parameter is not currently implemented.
- geodezyx.utils.shell_like.is_exe(fpath)
Check if a file is executable.
- Parameters:
fpath (str) – File path.
- Returns:
True if the file is executable, False otherwise.
- Return type:
bool
- geodezyx.utils.shell_like.regex2filelist(dossier, regex, outtype='file')
Get files in a directory matching a regex pattern.
- Parameters:
dossier (str) – Path to the directory.
regex (str) – Regular expression pattern to match filenames.
outtype (str, optional) – Type of output. ‘file’ returns only files, other values return all matches. Default is ‘file’.
- Returns:
Sorted list of file paths matching the pattern.
- Return type:
list
- geodezyx.utils.shell_like.regex_or_from_list(listin)
Create a regex OR pattern from a list of strings.
- Parameters:
listin (list) – List of strings to convert to regex OR pattern.
- Returns:
Regex pattern matching any of the input strings, e.g., “(pattern1|pattern2)”.
- Return type:
str
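Building such an alternation can be sketched by joining the strings with "|" inside a group. The name regex_or is hypothetical, and the sketch assumes the inputs are inserted as-is; strings containing regex metacharacters would need re.escape to be matched literally.

```python
import re


def regex_or(listin):
    """Build a regex group matching any of the given patterns."""
    return "(" + "|".join(listin) + ")"


pattern = regex_or(["BRUX", "ZIMM"])
print(pattern)                                      # (BRUX|ZIMM)
print(bool(re.search(pattern, "station ZIMM ok")))  # True
```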
- geodezyx.utils.shell_like.remove_dir(directory)
Remove a directory and all its contents.
- Parameters:
directory (str) – Path to the directory to remove.
- Return type:
None
Warning
Logs a warning if the directory does not exist.
- geodezyx.utils.shell_like.replace(file_path, pattern, subst)
Replace a string in a file with a substitute.
- Parameters:
file_path (str) – Path of the file.
pattern (str) – String to be replaced.
subst (str) – String that will be substituted.
- Return type:
None
- geodezyx.utils.shell_like.subprocess_frontend(cmd_in, save_log=False, log_dir=None, log_name_out='out.log', log_name_err='err.log', logname_timestamp=False)
Run a shell command and optionally save output to log files.
- Parameters:
cmd_in (str) – Command to execute via shell.
save_log (bool, optional) – If True, save stdout and stderr to log files. Default is False.
log_dir (str, optional) – Directory where log files will be saved. Default is current working directory.
log_name_out (str, optional) – Name of the stdout log file. Default is “out.log”.
log_name_err (str, optional) – Name of the stderr log file. Default is “err.log”.
logname_timestamp (bool, optional) – If True, prepend timestamp to log filenames. Default is False.
- Returns:
process1 (subprocess.CompletedProcess) – The subprocess return object.
process1_stdout (str) – Standard output from the command.
process1_stderr (str) – Standard error from the command.
- Return type:
tuple
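The three-element return value can be illustrated with subprocess.run from the standard library. This is a sketch only: the name run_command is hypothetical, the logging options are left out, and the actual function may call subprocess differently.

```python
import subprocess


def run_command(cmd_in):
    """Run cmd_in through the shell and return (process, stdout, stderr)."""
    process = subprocess.run(cmd_in, shell=True, capture_output=True, text=True)
    return process, process.stdout, process.stderr


proc, out, err = run_command("echo hello")
print(proc.returncode)  # 0
print(out.strip())      # hello
```

Since the command goes through the shell, it should never be assembled from untrusted input.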
- geodezyx.utils.shell_like.tail(filename, count=1, offset=1024)
Get the last few lines of a file efficiently.
Depending on the length of your lines, you will want to modify offset to get better performance.
- Parameters:
filename (str or file-like object) – Path to the file or a file-like object (StringIO, BytesIO, etc.)
count (int, optional) – Number of lines to return from the end of the file. Default is 1.
offset (int, optional) – Number of bytes to read from the end of the file. Default is 1024.
- Returns:
List of lines from the end of the file.
- Return type:
list
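The role of offset can be sketched as follows: only the last offset bytes are read, and the window is doubled until enough lines are found. This is an assumption about the general technique, not the library's code; the name tail_lines is hypothetical and the sketch takes a path only (no file-like objects).

```python
import os
import tempfile


def tail_lines(path, count=1, offset=1024):
    """Return the last count lines of a file, reading from the end."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        read = min(size, offset)
        while True:
            f.seek(size - read)
            lines = f.read().splitlines()
            # More than count lines guarantees that the last count lines are
            # complete (only the first line of the window can be cut).
            if len(lines) > count or read == size:
                break
            read = min(size, read * 2)
    return [line.decode() for line in lines[-count:]]


# Demonstration on a temporary 100-line file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("\n".join(f"line {i}" for i in range(100)) + "\n")
    demo_path = tmp.name

last_two = tail_lines(demo_path, count=2, offset=16)
os.remove(demo_path)
print(last_two)  # ['line 98', 'line 99']
```

A larger offset means fewer read attempts for files with long lines, which is the tuning the description above refers to.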
- geodezyx.utils.shell_like.uncompress(pathin, dirout='', opts='-f')
Uncompress a file using the uncompress command.
Deprecated: use geodezyx.files_rw.unzip_gz_z() instead.
- Parameters:
pathin (str) – Path to the file to uncompress.
dirout (str, optional) – Output directory. Default is ‘’ (current directory).
opts (str, optional) – Options for the uncompress command. Default is ‘-f’.
- Returns:
Path to the uncompressed file, or None if input file does not exist.
- Return type:
str or None
- geodezyx.utils.shell_like.walk_dir(parent_dir)
Starting from parent_dir, return files_list and dirs_list, containing all the files and all the directories found in parent_dir. Wildcards are supported in the parent_dir path.
- Parameters:
parent_dir (str) – The parent directory path, which can include wildcards.
- Returns:
files_list (list) – List of all file paths.
dirs_list (list) – List of all directory paths.
- geodezyx.utils.shell_like.write_in_file(string_to_write, outdir_or_outpath, outname='', ext='.txt', encoding='utf8', append=False)
Write a string to a file with support for bytes or text encoding.
- Parameters:
string_to_write (str or bytes) – The content to write to the file.
outdir_or_outpath (str) – Output directory path or full file path.
outname (str, optional) – Output filename (without extension). Default is “”.
ext (str, optional) – File extension. Default is ‘.txt’.
encoding (str, optional) – Text encoding to use. Default is ‘utf8’. See https://docs.python.org/3/library/codecs.html#standard-encodings
append (bool, optional) – If True, append to existing file. If False, overwrite. Default is False.
- Returns:
Path to the output file.
- Return type:
str
Notes
Supported encodings: utf8, latin_1, etc. See https://docs.python.org/3/library/codecs.html#standard-encodings
geodezyx.utils.utils_core module
@author: psakic
This sub-module of geodezyx.utils contains miscellaneous low-level functions.
It can be imported directly with: from geodezyx import utils
The geodezyx toolbox is a software for simple but useful functions for Geodesy and Geophysics under the GNU LGPL v3 License
Copyright (C) 2019 Pierre Sakic et al. (IPGP, sakic@ipgp.fr) GitHub repository : https://github.com/IPGP/geodezyx
- geodezyx.utils.utils_core.Aformat(A, landscape=True)
- class geodezyx.utils.utils_core.Tee(*files)
Bases: object
Internal class for Tee_frontend
Source
Based on http://stackoverflow.com/questions/11325019/output-on-the-console-and-file-using-python
Secondary links:
http://stackoverflow.com/questions/616645/how-do-i-duplicate-sys-stdout-to-a-log-file-in-python
http://stackoverflow.com/questions/2996887/how-to-replicate-tee-behavior-in-python-when-using-subprocess
- flush()
- pause()
- restart()
- stop()
- write(obj)
- geodezyx.utils.utils_core.Tee_frontend(dir_in, logname_in, suffix='', ext='log', print_timestamp=True)
Write the console output to a file.
- Parameters:
dir_in (str) – directory path.
logname_in (str) – logfile name.
suffix (str, optional) – An optional suffix. The default is ‘’.
ext (str, optional) – file extension. The default is ‘log’.
print_timestamp (bool, optional) – print a timestamp in the filename. The default is True.
- Returns:
F_tee – Object controlling the output
- Return type:
F_tee object
Note
It is recommended to stop the writing at the end of the script with F_tee.stop()
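The tee principle behind this frontend can be sketched with the standard library alone. `TeeSketch` is a hypothetical, stripped-down class (no pause/restart, no timestamped filename) and an in-memory buffer stands in for the log file; it is not the geodezyx Tee class.

```python
import io
import sys


class TeeSketch:
    """Hypothetical minimal tee: forward every write to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


log_buffer = io.StringIO()  # stands in for an opened log file
original_stdout = sys.stdout
sys.stdout = TeeSketch(original_stdout, log_buffer)
try:
    print("processing started")  # goes to the console and to the buffer
finally:
    sys.stdout = original_stdout  # always restore the original stream
```

Restoring `sys.stdout` in a `finally` clause plays the same role as the recommended final call to F_tee.stop(): the redirection does not outlive the script.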
- geodezyx.utils.utils_core.add_symbol_to_new_lines(s, symbol='·')
Adds a specified symbol to the beginning of each new line in a multi-line string.
- Parameters:
s (str) – The input multi-line string.
symbol (str, optional) – The symbol to add to each new line. Default is ‘·’.
- Returns:
The modified string with the symbol added to each new line.
- Return type:
str
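A one-line sketch of the idea, assuming that only the lines following a line break receive the symbol and that the first line is left unchanged (the actual behaviour of the library function may differ; `add_symbol_sketch` is a hypothetical name):

```python
def add_symbol_sketch(s, symbol="·"):
    """Insert `symbol` right after every line break of `s`."""
    return s.replace("\n", "\n" + symbol)


marked = add_symbol_sketch("first\nsecond\nthird", symbol="> ")
```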
- geodezyx.utils.utils_core.alphabet(num=None)
- geodezyx.utils.utils_core.alphabet_reverse(letter=None)
- geodezyx.utils.utils_core.array_from_lists(*listsin)
Function to stop struggling with list => matrix conversions
- geodezyx.utils.utils_core.boolean_dict(list_of_keywords)
- geodezyx.utils.utils_core.clear_all()
Clears all the variables from the workspace of the spyder application.
- geodezyx.utils.utils_core.dday()
Give the time span between present and toolbox author’s PhD defense date
(tests also the console messages)
- Returns:
D – elapsed time.
- Return type:
datetime
- geodezyx.utils.utils_core.detect_encoding(file_path)
Detect the encoding of a text file.
This function uses the chardet library to detect the encoding of a given text file. It reads the file line by line and feeds each line to a chardet UniversalDetector. When the detector has made a determination, it stops reading the file and returns the detected encoding.
- Parameters:
file_path (str) – The path to the text file for which to detect the encoding.
- Returns:
str – The detected encoding of the text file.
Source
https://www.geeksforgeeks.org/detect-encoding-of-a-text-file-with-python/
- geodezyx.utils.utils_core.diagonalize(x, n=10)
- geodezyx.utils.utils_core.docstring_generic()
Prints and returns a prototype generic docstring, based on the NumPy docstring convention.
Source
- geodezyx.utils.utils_core.eval_a_dict(dictin, where, verbose=True)
Evaluate dictionary values in a given namespace.
- Parameters:
dictin (dict) – Dictionary to evaluate
where (dict) – Namespace where to evaluate (usually globals() or locals())
verbose (bool) – Print verbose output (default True)
- Return type:
None
Notes
WARNING: doesn’t work in a function! Use instead:
for k, v in booldic.items():
    globals()[k] = v
    locals()[k] = v
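The reason behind this warning is a general Python behaviour: inside a function, writing into the dictionary returned by locals() does not create local variables, whereas writing into globals() does define module-level names. The sketch below demonstrates this with hypothetical names (`inject_with_locals`, `inject_with_globals`, `plot_enabled`); it does not reproduce eval_a_dict itself.

```python
booldic = {"plot_enabled": True, "verbose_mode": False}


def inject_with_locals(dic):
    """Writing into locals() inside a function creates no local variable."""
    for k, v in dic.items():
        locals()[k] = v
    try:
        return plot_enabled  # looked up as a global name, not yet defined
    except NameError:
        return "not defined"


def inject_with_globals(dic):
    """Writing into globals() defines module-level names."""
    for k, v in dic.items():
        globals()[k] = v


result_locals = inject_with_locals(booldic)  # the name is not created
inject_with_globals(booldic)
result_globals = globals()["plot_enabled"]  # now available module-wide
```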
- geodezyx.utils.utils_core.extract_text_between_elements(file_path, elt_start, elt_end)
- geodezyx.utils.utils_core.extract_text_between_elements_2(file_path, elt_start, elt_end, return_string=False, nth_occur_elt_start=0, nth_occur_elt_end=0, invert=False, verbose=False)
This function is based on REGEX (elt_start and elt_end are REGEX) and can manage several blocks in the same file.
return_string = True : returns a string of the matched lines
return_string = False : returns a list of the matched lines
invert : exclude the text between the patterns
NB : in a SINEX context, with “+MARKER”, use a backslash, i.e. “\+MARKER”
NB2 : think about StringIO for handling the result as a Pandas DataFrame https://docs.python.org/2/library/stringio.html
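The need to escape the leading "+" comes from the regular-expression syntax, where "+" is a quantifier; an unescaped "+MARKER" pattern is rejected by Python's `re` module. A minimal sketch of a regex-delimited extraction follows. It is a simplified stand-in with a hypothetical name, works on a list of lines instead of a file path, returns only the first block, and assumes the delimiter lines are kept; the sample lines are made up.

```python
import re


def extract_between_sketch(lines, elt_start, elt_end):
    """Return the lines of the first block delimited by two regex patterns."""
    block, inside = [], False
    for line in lines:
        if not inside:
            if re.search(elt_start, line):
                inside = True
                block.append(line)  # keep the opening delimiter
        else:
            block.append(line)
            if re.search(elt_end, line):
                break  # closing delimiter reached
    return block


sample_lines = ["header", "+MARKER", "value 1", "value 2", "-MARKER", "footer"]

# The "+" must be escaped, otherwise the pattern is not a valid regex
block = extract_between_sketch(sample_lines, r"\+MARKER", r"-MARKER")
```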
- geodezyx.utils.utils_core.get_computer_name()
- geodezyx.utils.utils_core.get_function_name()
- geodezyx.utils.utils_core.get_specific_locals(prefix)
Get the locals() parameters with ‘prefix’ in their name (the prefix can actually be a suffix)
- geodezyx.utils.utils_core.get_timestamp(outstring=True, separator='T', utc=False)
Frontend to easily get a timestamp
- geodezyx.utils.utils_core.get_type_smart(obj_in)
Get the type of an object, in order to easily convert another object to this type. For instance, type(np.array(A)) doesn’t return a constructor.
- geodezyx.utils.utils_core.get_username()
- geodezyx.utils.utils_core.globals_filtered()
Filter globals() variables with only compatible variables for pickle.
- Returns:
data_out – filtered globals() variables.
- Return type:
dict
- geodezyx.utils.utils_core.greek_alphabet(num=None, maj=False)
- geodezyx.utils.utils_core.indice_printer(i, print_every=10, text_before='')
Print an index every N iterations
- geodezyx.utils.utils_core.is_in_str(string, *patterns)
Recipe for the well-known problem of searching for several patterns in a string, from http://stackoverflow.com/questions/3389574/check-if-multiple-strings-exist-in-another-string
- geodezyx.utils.utils_core.is_iterable(inp, consider_str_as_iterable=False, consider_dict_as_iterable=False)
Test whether the input is an iterable, such as a list or a numpy array
- Parameters:
inp (list, numpy.array, ...)
consider_str_as_iterable (bool) – Python treats strings as iterable by default; when this boolean is False, testing a string returns False instead of True
- Returns:
out – True if inp is iterable, False otherwise
- Return type:
bool
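The effect of the two flags can be sketched as below. This is an assumed equivalent built on `collections.abc.Iterable` (with the hypothetical name `is_iterable_sketch`), not the library's implementation.

```python
from collections.abc import Iterable


def is_iterable_sketch(inp, consider_str_as_iterable=False,
                       consider_dict_as_iterable=False):
    """True for list-like inputs; str and dict only count if explicitly allowed."""
    if isinstance(inp, str) and not consider_str_as_iterable:
        return False
    if isinstance(inp, dict) and not consider_dict_as_iterable:
        return False
    return isinstance(inp, Iterable)


checks = (
    is_iterable_sketch([1, 2, 3]),
    is_iterable_sketch("abc"),
    is_iterable_sketch("abc", consider_str_as_iterable=True),
    is_iterable_sketch(3.14),
)
```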
- geodezyx.utils.utils_core.is_lambda(v)
Check if v is lambda
Source
https://stackoverflow.com/questions/3655842/how-can-i-test-whether-a-variable-holds-a-lambda
- geodezyx.utils.utils_core.is_not_iterable(inp, consider_str_as_iterable=False)
Simple negation of is_iterable()
- geodezyx.utils.utils_core.join_improved(strseparat, *varsin)
- geodezyx.utils.utils_core.line_count(filein)
- geodezyx.utils.utils_core.line_in_file_checker(file_path, string)
- geodezyx.utils.utils_core.listify(inp)
Convert the input into a list.
- Parameters:
inp (any) – The input to be converted into a list.
- Returns:
A list containing the input elements if the input is iterable, otherwise a list with the input as its single element.
- Return type:
list
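A possible sketch of this conversion, assuming that strings and bytes are wrapped as single elements and not split into characters (an assumption; `listify_sketch` is a hypothetical name):

```python
def listify_sketch(inp):
    """Return list(inp) for iterables, [inp] for scalars and strings."""
    if isinstance(inp, (str, bytes)):
        return [inp]  # keep text as one element
    try:
        return list(inp)
    except TypeError:  # not iterable
        return [inp]


from_scalar = listify_sketch(5)
from_tuple = listify_sketch((1, 2))
from_string = listify_sketch("abc")
```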
- geodezyx.utils.utils_core.mdot(*args)
- geodezyx.utils.utils_core.mdotr(*args)
- geodezyx.utils.utils_core.memmap_from_array(arrin)
- geodezyx.utils.utils_core.mmpa(arrin)
- geodezyx.utils.utils_core.multidot(tupin)
- geodezyx.utils.utils_core.open_readlines_smart(file_in, decode_type='iso-8859-1', verbose=False)
This function takes an input object, opens it, and reads its lines. The input file can be the path of a file as a string or as a Path object, or the file content as a string, bytes, StringIO object, or a list of lines.
- Parameters:
file_in (various) – An input object. This can be a string representing a file path, a Path object, a string representing file content, bytes, a StringIO object, or a list of lines.
decode_type (str, optional) – The decode standard. Default is “iso-8859-1”.
verbose (bool, optional) – If set to True, the function will print the type of the input file. Default is False.
- Returns:
lines – A list of the lines in the input file.
- Return type:
list
- Raises:
FileNotFoundError – If the file specified by file_in does not exist.
- geodezyx.utils.utils_core.pickle_loader(pathin)
Load a Python object saved as a Pickle file.
Wrapper of pickle.load
- Parameters:
pathin (str) – the input pickle file path.
- Returns:
outdata – Data loaded from the pickle.
- Return type:
generic
- geodezyx.utils.utils_core.pickle_saver(datain, outdir=None, outname=None, ext='.pik', timestamp=False, full_path=None)
Save a Python object in a Pickle file.
Wrapper of pickle.dump
- Parameters:
datain (generic) – Data which will be saved as a pickle.
outdir (str, optional) – output directory. The default is None.
outname (str, optional) – pickle output name. The default is None.
ext (str, optional) – pickle file extension. The default is ‘.pik’.
timestamp (bool, optional) – add the timestamp in the pickle’s filename. The default is False.
full_path (str, optional) – Full path where the pickle is saved. If full_path is given, it overrides outdir and outname. The default is None.
- Returns:
outpath – the output pickle file path.
- Return type:
str
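Since both functions are described as wrappers of pickle.dump and pickle.load, the underlying round trip can be shown with the standard library directly. The file name and the sample data below are made up for illustration.

```python
import os
import pickle
import tempfile

data = {"station": "STA1", "coords": [1.0, 2.0, 3.0]}  # made-up sample

# Save: output directory + name + '.pik' extension
outpath = os.path.join(tempfile.mkdtemp(), "results" + ".pik")
with open(outpath, "wb") as f:
    pickle.dump(data, f)

# Load the object back from the file
with open(outpath, "rb") as f:
    loaded = pickle.load(f)
```

Only unpickle files from a trusted origin, because loading a pickle can execute arbitrary code.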
- geodezyx.utils.utils_core.read_comments(filein, comment='#')
- geodezyx.utils.utils_core.read_mat_file(pathin, full=False)
low level reader of a MATLAB mat file
- geodezyx.utils.utils_core.replace_in_file(file_in, str_before, str_after)
https://stackoverflow.com/questions/17140886/how-to-search-and-replace-text-in-a-file
- geodezyx.utils.utils_core.save_array_fast(arrin, outname='', outdir='/home/psakicki/aaa_FOURBI', txt=True)
- geodezyx.utils.utils_core.save_obj_as_file(objin, pathin, prefix, ext='.exp', suffix='')
OLD proto-version of pickle saver DISCONTINUED
- geodezyx.utils.utils_core.split_improved(strin, sep_left, sep_right)
- geodezyx.utils.utils_core.split_string_after_n_chars_at_space(s, n)
Splits a string into substrings of a maximum length of n characters.
This function splits a string into substrings of a maximum length of n characters, but only splits at spaces and inserts a newline (\n) after each substring.
- Parameters:
s (str) – The input string to be split.
n (int) – The maximum length of each substring.
- Returns:
The modified string with newlines inserted.
- Return type:
str
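The standard library offers a closely related tool, `textwrap.fill`, which can serve as a point of comparison. It is a swapped-in technique, not the function documented here, and details may differ (for instance, `textwrap` breaks words longer than the width by default and adds no trailing newline).

```python
import textwrap

sentence = "the quick brown fox jumps over the lazy dog"

# Break only at spaces so that no line exceeds 15 characters
wrapped = textwrap.fill(sentence, width=15)
wrapped_lines = wrapped.split("\n")
```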
- geodezyx.utils.utils_core.spyder_run_check()
Check if the code is run inside Spyder IDE
- geodezyx.utils.utils_core.str2float_smart(str_in)
- geodezyx.utils.utils_core.str2int_float_autodetect(str_list_in)
- geodezyx.utils.utils_core.str2int_smart(str_in)
- geodezyx.utils.utils_core.str_2_float_line(line, sep=' ', out_type=<class 'float'>)
Convert a line of numbers (as a str) to a list of floats (or of another out_type)
- geodezyx.utils.utils_core.stringizer(tupin, separ=' ', eol=True)
Transform the elements of a tuple into a string line, ready to be written to a file
- geodezyx.utils.utils_core.timeout(func, args=(), kwargs={}, timeout_duration=1, default=None)
This function will spawn a thread and run the given function using the args and kwargs, and return the given default value if timeout_duration is exceeded.
http://stackoverflow.com/questions/366682/how-to-limit-execution-time-of-a-function-call-in-python
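The thread-based approach can be sketched as follows. This is an illustrative version with a hypothetical name, written under the assumption that the worker thread is simply abandoned on timeout; Python threads cannot be killed from outside, so the function keeps running in the background.

```python
import threading
import time


def timeout_sketch(func, args=(), kwargs=None, timeout_duration=1, default=None):
    """Run func in a thread; return default if it exceeds timeout_duration."""
    result = [default]

    def target():
        result[0] = func(*args, **(kwargs or {}))

    # Daemon thread: it does not block the interpreter exit if still running
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_duration)  # wait at most timeout_duration seconds
    return default if worker.is_alive() else result[0]


fast = timeout_sketch(sum, args=([1, 2, 3],), timeout_duration=1)
slow = timeout_sketch(time.sleep, args=(1,), timeout_duration=0.1,
                      default="timed out")
```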
- geodezyx.utils.utils_core.transpose_vector_array(X)
transpose a Nx3 array to a 3xN array if necessary (necessary for some usages)
- Parameters:
X (iterable)
- Returns:
X – X transposed if necessary.
- Return type:
iterable
- geodezyx.utils.utils_core.trunc(f, n)
Truncates/pads a float f to n decimal places without rounding
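Truncation without rounding can be sketched with the `decimal` module and its ROUND_DOWN mode. The sketch assumes the result is returned as a string (so that zero padding is visible); the return type of the library function is not stated above, and `trunc_sketch` is a hypothetical name.

```python
from decimal import Decimal, ROUND_DOWN


def trunc_sketch(f, n):
    """Truncate (or zero-pad) f to n decimal places, returned as a string."""
    quantum = Decimal(1).scaleb(-n)  # e.g. n=2 gives a step of 0.01
    value = Decimal(repr(f)).quantize(quantum, rounding=ROUND_DOWN)
    return format(value, "f")


truncated = trunc_sketch(1.23789, 2)  # round() would give 1.24 here
padded = trunc_sketch(1.5, 3)
```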
- geodezyx.utils.utils_core.vectorialize(array_in)
Redundant with .flat ???