hazimp.misc

Functions that haven’t found a proper module.

Module Contents

Functions

csv2dict(filename, add_ids=False)

Read a csv file in and return the information as a dictionary

instanciate_classes(module)

Create a dictionary of calc names (key) and the calc instance (value).

mod_file_list(file_list, variable)

Modify the filename list for working with netcdf format files.

get_required_args(func)

Get the arguments required by a function, from the function itself.

squash_narray(ary)

Reduce an array to 1 dimension. First, try to average the values.

add(var, var2)

Add the values of two numpy arrays together.

weighted_values(values, probabilities, size, forced_random=None)

Return values weighted by the probabilities.

sorted_dict_values(adict)

Given a dictionary, return the sorted keys and values, sorting with respect to the keys.

permutate_att_values(dframe, fields, groupby=None)

Given a dataframe, return the dataframe with the values in fields permutated.

get_file_mtime(file)

Retrieve the modified time of a file

get_git_commit()

Return the git commit hash, branch and datetime of the commit, as well as the URL of the remote repo.

get_s3_client(**kwargs)

Returns a service client for S3. It avoids initialising the service client if an AWS path is not used.

get_temporary_directory()

Returns a temporary directory to store files from and to S3 for local processing.

s3_path_segments_from_vsis3(s3_path)

Function to extract bucket name, key and filename from a path specified using GDAL Virtual File Systems conventions.

download_from_s3(s3_source_path, destination_directory, ignore_exception=False)

Function to download an S3 file into a local directory.

download_file_from_s3_if_needed(s3_source_path, default_ext='.shp', destination_directory=None)

This function checks if the path is pointing to S3. If so, it downloads the file to a temporary directory and returns the local file path.

create_temp_file_path_for_s3(destination_path)

This function checks if the path is pointing to S3. If yes, it changes the file path to a file in a temporary directory.

upload_to_s3_if_applicable(local_path, bucket_name, bucket_key, ignore_exception=False)

Function to upload files from a local directory to S3.

check_data_type(data)

Function to check the data type of a given attribute

Attributes

LOGGER

TEMP_DIR

S3_CLIENT

ROOT_DIR

RESOURCE_DIR

EXAMPLE_DIR

INTID

DRIVERS

DATEFMT

hazimp.misc.LOGGER
hazimp.misc.TEMP_DIR
hazimp.misc.S3_CLIENT
hazimp.misc.ROOT_DIR
hazimp.misc.RESOURCE_DIR
hazimp.misc.EXAMPLE_DIR
hazimp.misc.INTID = 'internal_id'
hazimp.misc.DRIVERS
hazimp.misc.DATEFMT = '%Y-%m-%d %H:%M:%S %Z'
hazimp.misc.csv2dict(filename, add_ids=False)

Read a csv file in and return the information as a dictionary where the keys are the column names and the values are the column arrays.

Parameters:
  • filename – The csv file path string.

  • add_ids – If True, add a key/value pair of internal IDs, numbered from 0 to n.
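
A minimal usage sketch (the file name and column names are hypothetical):

    from hazimp.misc import csv2dict

    # 'exposure.csv' is a hypothetical file with LATITUDE and LONGITUDE columns.
    data = csv2dict('exposure.csv', add_ids=True)
    lats = data['LATITUDE']  # array of the LATITUDE column values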

hazimp.misc.instanciate_classes(module)

Create a dictionary of calc names (key) and the calc instance (value).

Parameters:

module – The module to search for calc classes.

hazimp.misc.mod_file_list(file_list, variable)

Modify the filename list for working with netcdf format files.

For netcdf files, GDAL expects the filename to be of the form 'NETCDF:"<filename>":<variable>', where <variable> is a valid variable in the netcdf file.

Parameters:
  • file_list – List of files or a single file to be processed

  • variable – Variable name

Returns:

list of filenames, modified to the above format
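
For example (file and variable names are hypothetical):

    from hazimp.misc import mod_file_list

    mod_file_list(['gust.nc'], 'wind_speed')
    # expected: ['NETCDF:"gust.nc":wind_speed']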

hazimp.misc.get_required_args(func)

Get the arguments required by a function, from the function itself.

Parameters:

func – The function that you need to know about.
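
The idea can be sketched with the standard inspect module (an illustration of the technique, not the actual implementation):

    import inspect

    def required_args(func):
        """Names of the parameters of func that have no default value."""
        sig = inspect.signature(func)
        return [name for name, param in sig.parameters.items()
                if param.default is inspect.Parameter.empty
                and param.kind is param.POSITIONAL_OR_KEYWORD]

    def damage(exposure, hazard, scale=1.0):
        pass

    required_args(damage)  # -> ['exposure', 'hazard']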

hazimp.misc.squash_narray(ary)

Reduce an array to 1 dimension. First, try to average the values. If that doesn't work, take only the first dimension.

Parameters:

ary – the numpy array to be squashed.

Returns:

The ary array, averaged down to 1D.
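
For example, assuming the first axis is preserved and the remaining axes are averaged:

    import numpy as np
    from hazimp.misc import squash_narray

    ary = np.array([[1.0, 3.0],
                    [2.0, 4.0]])
    squash_narray(ary)  # expected: array([2., 3.]) – one averaged value per row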

hazimp.misc.add(var, var2)

Add the values of two numpy arrays together. If the values are strings, concatenate them.

Parameters:
  • var – The values in this array are added.

  • var2 – The values in this array are added.

Returns:

A new array with the values of var + var2.
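
For example (expected results follow from the description above):

    import numpy as np
    from hazimp.misc import add

    add(np.array([1, 2]), np.array([10, 20]))  # expected: array([11, 22])
    add(np.array(['a', 'b']), np.array(['x', 'y']))  # expected: ['ax', 'by']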

hazimp.misc.weighted_values(values, probabilities, size, forced_random=None)

Return values weighted by the probabilities.

Precondition: the probabilities must sum to 1.

Code from: goo.gl/oBo2zz

Parameters:
  • values – The values to go into the final array

  • probabilities – The probabilities of the values

  • size – The array size/shape. Must be 1D.

Returns:

The array of values, made using the probabilities
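
A usage sketch:

    import numpy as np
    from hazimp.misc import weighted_values

    values = np.array([0.1, 0.5, 0.9])
    probabilities = np.array([0.2, 0.3, 0.5])  # must sum to 1
    sample = weighted_values(values, probabilities, size=1000)
    # roughly 20% of sample is 0.1, 30% is 0.5 and 50% is 0.9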

hazimp.misc.sorted_dict_values(adict)

Given a dictionary, return the sorted keys and values, sorting with respect to the keys.

Code from: goo.gl/Sb7Czw

Parameters:

adict – A dictionary.

Returns:

The sorted keys and the corresponding values, as two lists.
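
For example:

    from hazimp.misc import sorted_dict_values

    keys, values = sorted_dict_values({'b': 2, 'a': 1, 'c': 3})
    # keys -> ['a', 'b', 'c'], values -> [1, 2, 3]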

hazimp.misc.permutate_att_values(dframe, fields, groupby=None)

Given a dataframe, return the dataframe with the values in fields permutated. If the groupby arg is given, then permutate the values of fields within each grouping of groupby.

Parameters:
  • dframe (pandas.DataFrame) – A dataframe.

• fields (str or list) – Name of a field to permutate, or a list of fields.

  • groupby (str) – Name of the field to group values by.

Returns:

The same pandas.DataFrame, with the values of fields permutated.
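
A usage sketch with a hypothetical exposure table:

    import pandas as pd
    from hazimp.misc import permutate_att_values

    df = pd.DataFrame({'suburb': ['A', 'A', 'B', 'B'],
                       'value':  [1, 2, 3, 4]})
    shuffled = permutate_att_values(df, 'value', groupby='suburb')
    # 'value' is shuffled within each suburb: {1, 2} stay in A, {3, 4} in B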

hazimp.misc.get_file_mtime(file)

Retrieve the modified time of a file

Parameters:

file (str) – Full path to a valid file

Returns:

ISO-format of the modification time of the file
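
For example (the timestamp shown is illustrative):

    from hazimp.misc import get_file_mtime

    get_file_mtime('hazimp/main.py')  # e.g. '2023-08-14T09:30:00'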

hazimp.misc.get_git_commit()

Return the git commit hash, branch and datetime of the commit, as well as the URL of the remote repo.

Returns:

The commit hash, branch and commit datetime if the code is maintained in a git repo. If not, the commit is “unknown”, the branch is empty and the datetime is set to the modified time of the called Python script (usually hazimp/main.py).

hazimp.misc.get_s3_client(**kwargs)

Returns a service client for S3. It avoids initialising the service client if an AWS path is not used.

hazimp.misc.get_temporary_directory()

Returns a temporary directory to store files from and to S3 for local processing.

hazimp.misc.s3_path_segments_from_vsis3(s3_path)

Function to extract the bucket name, key and filename from a path specified using GDAL Virtual File Systems conventions.

Parameters:

s3_path (str) – Path to an S3 location in /vsis3/bucket/key format.

Returns:

Bucket name, bucket key and file name.
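
A usage sketch (the exact split between key and filename is an assumption based on the description):

    from hazimp.misc import s3_path_segments_from_vsis3

    bucket, key, filename = s3_path_segments_from_vsis3(
        '/vsis3/my-bucket/exposure/buildings.shp')
    # bucket -> 'my-bucket', filename -> 'buildings.shp'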

hazimp.misc.download_from_s3(s3_source_path, destination_directory, ignore_exception=False)

Function to download an S3 file into a local directory.

Parameters:
  • s3_source_path (str) – S3 path of the file.

  • destination_directory (str) – Local directory to download the file into.

  • ignore_exception (bool) – Ignore any exception related to the download. Set to True for optional files.

hazimp.misc.download_file_from_s3_if_needed(s3_source_path, default_ext='.shp', destination_directory=None)

This function checks if the path is pointing to S3. If an S3 path is specified, this function downloads the file to a temporary directory and returns the local file path. In the case of a shapefile, four other files (with extensions .shx, .dbf, .prj and .shp.xml) are downloaded from S3.

If a zip file path is provided, the zip file is extracted and the .shp file path is returned.

Parameters:
  • s3_source_path (str) – S3 path of the file.

• default_ext (str) – If a zipped file is provided, this extension shall be used to find the target file

• destination_directory (str) – Local directory to download the file into.

Returns:

downloaded file path in local file system.
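
A usage sketch (the bucket and key are hypothetical):

    from hazimp.misc import download_file_from_s3_if_needed

    local_path = download_file_from_s3_if_needed(
        '/vsis3/my-bucket/exposure/buildings.shp')
    # local_path points at the downloaded copy in a temporary directory;
    # a non-S3 path is assumed to be returned unchanged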

hazimp.misc.create_temp_file_path_for_s3(destination_path)

This function checks if the path is pointing to S3. If yes, it changes the file path to a file in a temporary directory, which will be uploaded later.

Parameters:

destination_path (str) – S3 path of the file.

Returns:

Local file path, bucket name and bucket key.

hazimp.misc.upload_to_s3_if_applicable(local_path, bucket_name, bucket_key, ignore_exception=False)

Function to upload files from a local directory to S3.

Parameters:
  • local_path (str) – Local directory path containing files to upload.

  • bucket_name (str) – Destination S3 bucket name

  • bucket_key (str) – Destination S3 bucket key for the file

• ignore_exception (bool) – Ignore any exception related to file upload. Set to True for optional files.
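
A usage sketch (the bucket name and key are hypothetical):

    from hazimp.misc import upload_to_s3_if_applicable

    upload_to_s3_if_applicable('/tmp/impact.csv', 'my-bucket',
                               'outputs/impact.csv', ignore_exception=True)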

hazimp.misc.check_data_type(data)

Function to check the data type of a given attribute

Parameters:

data (pd.Series or pd.DataFrame) – Sample of the data