:py:mod:`hazimp.misc`
=====================

.. py:module:: hazimp.misc

.. autoapi-nested-parse::

   Functions that haven't found a proper module.


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   hazimp.misc.csv2dict
   hazimp.misc.instanciate_classes
   hazimp.misc.mod_file_list
   hazimp.misc.get_required_args
   hazimp.misc.squash_narray
   hazimp.misc.add
   hazimp.misc.weighted_values
   hazimp.misc.sorted_dict_values
   hazimp.misc.permutate_att_values
   hazimp.misc.get_file_mtime
   hazimp.misc.get_git_commit
   hazimp.misc.get_s3_client
   hazimp.misc.get_temporary_directory
   hazimp.misc.s3_path_segments_from_vsis3
   hazimp.misc.download_from_s3
   hazimp.misc.download_file_from_s3_if_needed
   hazimp.misc.create_temp_file_path_for_s3
   hazimp.misc.upload_to_s3_if_applicable
   hazimp.misc.check_data_type


Attributes
~~~~~~~~~~

.. autoapisummary::

   hazimp.misc.LOGGER
   hazimp.misc.TEMP_DIR
   hazimp.misc.S3_CLIENT
   hazimp.misc.ROOT_DIR
   hazimp.misc.RESOURCE_DIR
   hazimp.misc.EXAMPLE_DIR
   hazimp.misc.INTID
   hazimp.misc.DRIVERS
   hazimp.misc.DATEFMT


.. py:data:: LOGGER
   

.. py:data:: TEMP_DIR
   

.. py:data:: S3_CLIENT
   

.. py:data:: ROOT_DIR
   

.. py:data:: RESOURCE_DIR
   

.. py:data:: EXAMPLE_DIR
   

.. py:data:: INTID
   :annotation: = internal_id

   
.. py:data:: DRIVERS
   

.. py:data:: DATEFMT
   :annotation: = %Y-%m-%d %H:%M:%S %Z

   
.. py:function:: csv2dict(filename, add_ids=False)

   Read a CSV file and return its contents as a dictionary whose keys
   are the column names and whose values are the column arrays.

   :param filename: Path to the CSV file, as a string.
   :param add_ids: If True, add an extra key whose value is the ids, from 0 to n.
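
The column-oriented layout described for ``csv2dict`` can be illustrated with the standard library alone. This is a minimal sketch, not the module's implementation: ``csv_columns`` is a hypothetical name, the values are kept as strings in plain lists rather than arrays, and using ``internal_id`` (the documented value of ``INTID``) as the id key is an assumption.

```python
import csv
import os
import tempfile


def csv_columns(filename, add_ids=False, id_key="internal_id"):
    """Hypothetical stand-in: map each column name to a list of its values."""
    with open(filename, newline="") as handle:
        reader = csv.DictReader(handle)
        columns = {name: [] for name in (reader.fieldnames or [])}
        n_rows = 0
        for row in reader:
            for name in columns:
                columns[name].append(row[name])
            n_rows += 1
    if add_ids:
        # Assumption: ids are consecutive integers, one per data row.
        columns[id_key] = list(range(n_rows))
    return columns


# Write a small CSV to a temporary file and read it back.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("lat,lon\n-35.3,149.1\n-33.9,151.2\n")
    path = tmp.name
result = csv_columns(path, add_ids=True)
os.remove(path)
```

Here ``result`` holds one list per column, plus the extra id column because ``add_ids`` is True.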


.. py:function:: instanciate_classes(module)

   Create a dictionary of calc names (key) and the calc instance (value).

   :param module: The module containing the calc classes to instantiate.


.. py:function:: mod_file_list(file_list, variable)

   Modify the filename list for working with NetCDF format files.

   For NetCDF files, GDAL expects the filename to be of the form
   ``NETCDF:"<filename>":<variable>``, where ``<variable>`` is a valid
   variable in the NetCDF file.

   :param file_list: List of files or a single file to be processed
   :param variable: Variable name

   :returns: list of filenames, modified to the above format
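
The filename form that ``mod_file_list`` targets can be sketched as follows. ``to_netcdf_names`` is a hypothetical helper written for illustration, and the file and variable names are made up; it only shows the string format described above, under the assumption that a single filename is treated as a one-element list.

```python
def to_netcdf_names(file_list, variable):
    """Hypothetical helper: wrap each filename in GDAL's NETCDF:"file":var form."""
    if isinstance(file_list, str):
        # Assumption: a single filename is handled like a one-element list.
        file_list = [file_list]
    return ['NETCDF:"{}":{}'.format(name, variable) for name in file_list]


names = to_netcdf_names(["gust01.nc", "gust02.nc"], "vmax")
```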


.. py:function:: get_required_args(func)

   Get the arguments that a function requires, by inspecting the function.

   :param func: The function that you need to know about.
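
One way to find a function's required arguments is ``inspect.signature`` from the standard library. The sketch below is an illustration of that technique, not necessarily how ``get_required_args`` does it; ``required_parameter_names`` and ``example`` are hypothetical names, and treating "required" as "has no default value" is an assumption.

```python
import inspect


def required_parameter_names(func):
    """Return the names of named parameters that have no default value."""
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # *args and **kwargs never have to be supplied, so skip them.
        if param.kind in (inspect.Parameter.VAR_POSITIONAL,
                          inspect.Parameter.VAR_KEYWORD):
            continue
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return required


def example(a, b, c=3, *args, **kwargs):
    return a + b + c


names = required_parameter_names(example)
```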


.. py:function:: squash_narray(ary)

   Reduce an array to one dimension. First try to average the values;
   if that fails, take only the first dimension.

   :param ary: The numpy array to be squashed.
   :returns: The ``ary`` array, reduced to 1D.


.. py:function:: add(var, var2)

   Add the values of two numpy arrays together.
   If the values are strings concatenate them.

   :param var: The values in this array are added.
   :param var2: The values in this array are added.
   :returns: The new column name, with the values of ``var + var2``.


.. py:function:: weighted_values(values, probabilities, size, forced_random=None)

   Return values weighted by the probabilities.

   Precondition: the probabilities should sum to 1.

   Code from: goo.gl/oBo2zz

   :param values: The values to go into the final array
   :param probabilities: The probabilities of the values
   :param size: The array size/shape. Must be 1D.
   :return: The array of values, made using the probabilities
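
Probability-weighted sampling of the kind ``weighted_values`` describes can be sketched with ``random.choices`` from the standard library. This swaps in a different technique from whatever the module uses and returns a plain list rather than an array; ``sample_weighted`` and its ``seed`` parameter are hypothetical.

```python
import random


def sample_weighted(values, probabilities, size, seed=None):
    """Hypothetical helper: draw `size` items, each chosen by its probability."""
    rng = random.Random(seed)  # a seed makes the draws repeatable
    return rng.choices(values, weights=probabilities, k=size)


draws = sample_weighted(["low", "high"], [0.25, 0.75], 1000, seed=42)
```

With these probabilities, roughly three quarters of the draws are expected to be ``"high"``, though the exact count depends on the random sequence.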


.. py:function:: sorted_dict_values(adict)

   Given a dictionary, return the sorted keys and values,
   sorting with respect to the keys.

   Code from: goo.gl/Sb7Czw

   :param adict: A dictionary.
   :return: The sorted keys and the corresponding values,
            as two lists.
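
The behaviour described for ``sorted_dict_values`` amounts to sorting the keys and then looking up each value in that order. A minimal sketch, with ``sorted_keys_and_values`` as a hypothetical stand-in name:

```python
def sorted_keys_and_values(adict):
    """Return (keys, values) as two lists, ordered by key."""
    keys = sorted(adict)
    values = [adict[key] for key in keys]
    return keys, values


keys, values = sorted_keys_and_values({"b": 2, "c": 3, "a": 1})
```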


.. py:function:: permutate_att_values(dframe, fields, groupby=None)

   Given a dataframe, return the dataframe with the values in
   ``fields`` permutated. If the ``groupby`` arg is given,
   then permutate the values of ``fields`` within each grouping of
   ``groupby``.

   :param dframe: A dataframe.
   :type dframe: ``pandas.DataFrame``
   :param fields: Name of a field to permutate, or a list of fields.
   :type fields: str or list.
   :param str groupby: Name of the field to group values by.

   :return: The same ``pandas.DataFrame``, with the values of ``fields``
            permutated.


.. py:function:: get_file_mtime(file)

   Retrieve the modification time of a file.

   :param str file: Full path to a valid file

   :returns: The modification time of the file, in ISO format
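
Reading and formatting a file's modification time can be sketched with ``os.path.getmtime`` and ``datetime``. ``file_mtime_string`` is a hypothetical helper; formatting with the documented ``DATEFMT`` value and reporting local time are assumptions, and the output of ``get_file_mtime`` itself may differ.

```python
import os
import tempfile
from datetime import datetime

# Copied from the DATEFMT attribute documented on this page.
DATEFMT = "%Y-%m-%d %H:%M:%S %Z"


def file_mtime_string(path, fmt=DATEFMT):
    """Hypothetical helper: format a file's modification time in local time."""
    modified = datetime.fromtimestamp(os.path.getmtime(path))
    # astimezone() attaches the local timezone so that %Z is not empty.
    return modified.astimezone().strftime(fmt)


# Create a file, pin its modification time, and format it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
os.utime(path, (1600000000, 1600000000))
stamp = file_mtime_string(path)
os.remove(path)
```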


.. py:function:: get_git_commit()

   Return the git commit hash, branch and datetime of the commit, as well as
   the URL of the remote repo.

   :returns: the commit hash and current branch if the code is maintained in a
             git repo. If not, the commit is "unknown", the branch is empty and
             the datetime is set to the modified time of the called Python
             script (usually hazimp/main.py)


.. py:function:: get_s3_client(**kwargs)

   Return the service client for S3. Obtaining the client through this
   function avoids initialising it when no AWS path is used.


.. py:function:: get_temporary_directory()

   Return the temporary directory used to store files downloaded from,
   or to be uploaded to, S3 for local processing.


.. py:function:: s3_path_segments_from_vsis3(s3_path)

   Extract the bucket name, key and file name from a path specified
   using GDAL Virtual File Systems conventions.

   :param str s3_path: Path to S3 location in ``/vsis3/bucket/key`` format.
   :returns: bucket name, bucket key and file name
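
Splitting a ``/vsis3/bucket/key`` path into its parts can be sketched with plain string operations. ``split_vsis3_path`` is a hypothetical stand-in for ``s3_path_segments_from_vsis3``, the bucket and key in the example are made up, and raising ``ValueError`` for other paths is an assumption rather than documented behaviour.

```python
def split_vsis3_path(s3_path):
    """Hypothetical helper: return (bucket, key, filename) for a /vsis3/ path."""
    prefix = "/vsis3/"
    if not s3_path.startswith(prefix):
        # Assumption: anything else is rejected outright.
        raise ValueError("expected a path starting with " + prefix)
    # The first segment after the prefix is the bucket, the rest is the key.
    bucket, _, key = s3_path[len(prefix):].partition("/")
    filename = key.rsplit("/", 1)[-1]
    return bucket, key, filename


bucket, key, filename = split_vsis3_path("/vsis3/my-bucket/exposure/data.csv")
```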


.. py:function:: download_from_s3(s3_source_path, destination_directory, ignore_exception=False)

   Download an S3 file into a local directory.

   :param str s3_source_path: S3 path of the file.
   :param str destination_directory: Local directory location to


.. py:function:: download_file_from_s3_if_needed(s3_source_path, default_ext='.shp', destination_directory=None)

   This function checks whether the path points to S3. If an S3 path is
   specified, it downloads the file to a temporary directory and
   returns the local file path. For a shapefile, 4 other files (with
   extensions .shx, .dbf, .prj and .shp.xml) are also downloaded from S3.

   If a zip file path is provided, the zip file is extracted and the .shp
   file path is returned.

   :param str s3_source_path: S3 path of the file.
   :param str default_ext: If a zipped file is
              provided, this extension shall be used to find the
              target file
   :param str destination_directory: Local directory location to
   :returns: downloaded file path in local file system.


.. py:function:: create_temp_file_path_for_s3(destination_path)

   This function checks whether the path points to S3. If so, it changes the
   file path to a file in the temporary directory, which will be uploaded later.

   :param str destination_path: S3 path of the file.
   :returns: local file path, bucket name and bucket key.
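
The redirect described for ``create_temp_file_path_for_s3`` can be sketched as below. ``temp_path_for_s3`` is a hypothetical name, the temporary directory is passed in explicitly instead of coming from the module, and returning the unchanged path with ``None`` for bucket and key when the path is not on S3 is an assumption.

```python
import os


def temp_path_for_s3(destination_path, temp_dir):
    """Hypothetical helper: map a /vsis3/ path to a local file in temp_dir."""
    prefix = "/vsis3/"
    if not destination_path.startswith(prefix):
        # Assumption: local paths pass through untouched.
        return destination_path, None, None
    bucket, _, key = destination_path[len(prefix):].partition("/")
    local_path = os.path.join(temp_dir, os.path.basename(key))
    return local_path, bucket, key


local_path, bucket_name, bucket_key = temp_path_for_s3(
    "/vsis3/my-bucket/output/result.csv", "scratch")
```

The caller would write its output to ``local_path`` and later upload it using the returned bucket name and key.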


.. py:function:: upload_to_s3_if_applicable(local_path, bucket_name, bucket_key, ignore_exception=False)

   Upload files from a local directory to S3.

   :param str local_path: Local directory path containing files to upload.
   :param str bucket_name: Destination S3 bucket name
   :param str bucket_key: Destination S3 bucket key for the file
   :param bool ignore_exception: Ignore any exception related to file upload.
           Set to True for optional files.


.. py:function:: check_data_type(data)

   Check the data type of a given attribute.

   :param data: Sample of the data
   :type data: ``pd.Series`` or ``pd.DataFrame``