disdrodb.l0 package

Subpackages

Submodules

disdrodb.l0.check_configs module

disdrodb.l0.check_metadata module

disdrodb.l0.check_readers module

disdrodb.l0.check_standards module

disdrodb.l0.io module

disdrodb.l0.io.check_glob_pattern(pattern: str) → None

Check that the input parameter is a string and that it can be used as a glob pattern.

Parameters

pattern (str) – String to be checked.

Raises
  • TypeError – The input parameter is not a string.

  • ValueError – The input parameter cannot be used as a pattern.

disdrodb.l0.io.check_glob_patterns(patterns: Union[str, list]) → list

Check that the glob patterns are valid.
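As a minimal sketch of how these two checks could fit together (not the package's actual implementation): the helper names `validate_glob_pattern` and `validate_glob_patterns` are hypothetical, and rejecting empty strings is an assumed rule, since the conditions that make a pattern unusable are not specified here.

```python
from typing import List, Union


def validate_glob_pattern(pattern: str) -> None:
    """Raise if `pattern` is not a usable glob pattern (hypothetical helper)."""
    if not isinstance(pattern, str):
        raise TypeError("Expected a string as glob pattern.")
    # Assumption: an empty or whitespace-only pattern is treated as unusable.
    if not pattern.strip():
        raise ValueError("The glob pattern must not be empty.")


def validate_glob_patterns(patterns: Union[str, List[str]]) -> List[str]:
    """Return the patterns as a list after validating each one."""
    # A single string is wrapped so that callers always receive a list.
    if isinstance(patterns, str):
        patterns = [patterns]
    if not isinstance(patterns, list):
        raise TypeError("Expected a string or a list of strings.")
    for pattern in patterns:
        validate_glob_pattern(pattern)
    return patterns
```

Normalizing a single string into a one-element list would explain why the second function accepts `Union[str, list]` but returns a `list`.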

disdrodb.l0.io.check_processed_dir(processed_dir)

Check input, format and validity of the directory path

Parameters

processed_dir (str) – Path of the processed directory

Returns

Path of the processed directory

Return type

str

disdrodb.l0.io.check_raw_dir(raw_dir: str, verbose: bool = False) → None

Check validity of raw_dir.

Steps:
  1. Check that ‘raw_dir’ is a valid directory path.
  2. Check that ‘raw_dir’ follows the expected directory structure.
  3. Check that each station_name directory contains data.
  4. Check that for each station_name the mandatory metadata.yml is specified.
  5. Check that for each station_name the mandatory issue.yml is specified.

Parameters
  • raw_dir (str) – Input raw directory

  • verbose (bool, optional) – Whether to print detailed processing information. The default is False.
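The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's code: the function name `check_raw_dir_sketch` is hypothetical, and the layout (`data/<station_name>/`, `metadata/<station_name>.yml`, `issue/<station_name>.yml` under ‘raw_dir’) is assumed, because the expected directory structure is not spelled out in this reference.

```python
from pathlib import Path


def check_raw_dir_sketch(raw_dir: str) -> None:
    """Run the listed checks against an assumed directory layout."""
    root = Path(raw_dir)
    # Step 1: the path must be an existing directory.
    if not root.is_dir():
        raise ValueError(f"'{raw_dir}' is not a directory.")
    # Step 2: assumed layout with 'data', 'metadata' and 'issue' subdirectories.
    for name in ("data", "metadata", "issue"):
        if not (root / name).is_dir():
            raise ValueError(f"Missing '{name}' directory in '{raw_dir}'.")
    stations = sorted(p for p in (root / "data").iterdir() if p.is_dir())
    if not stations:
        raise ValueError("No station directory found.")
    for station_dir in stations:
        station_name = station_dir.name
        # Step 3: each station directory must contain data.
        if not any(station_dir.iterdir()):
            raise ValueError(f"No data for station '{station_name}'.")
        # Steps 4 and 5: one metadata and one issue YAML file per station.
        for name in ("metadata", "issue"):
            if not (root / name / f"{station_name}.yml").is_file():
                raise ValueError(f"Missing {name} YAML for station '{station_name}'.")
```

Failing on the first problem keeps the sketch short; a real check might instead collect all problems before reporting them.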

disdrodb.l0.io.create_directory_structure(processed_dir, product_level, station_name, force, verbose=False)

Create directory structure for L0B and higher DISDRODB products.

disdrodb.l0.io.create_initial_directory_structure(raw_dir, processed_dir, station_name, force, verbose=False, product_level='L0A')

Create directory structure for the first L0 DISDRODB product.

  • If the input data are raw text files: product_level = “L0A” (run_l0a).

  • If the input data are raw netCDF files: product_level = “L0B” (run_l0b_nc).

disdrodb.l0.io.get_L0A_dir(processed_dir: str, station_name: str) → str

Define L0A directory.

Parameters
  • processed_dir (str) – Path of the processed directory

  • station_name (str) – Name of the station

Returns

L0A directory path.

Return type

str

disdrodb.l0.io.get_L0A_fname(df, processed_dir, station_name: str) → str

Define L0A file name.

Parameters
  • df (pd.DataFrame) – L0A DataFrame

  • processed_dir (str) – Path of the processed directory

  • station_name (str) – Name of the station

Returns

L0A file name.

Return type

str

disdrodb.l0.io.get_L0A_fpath(df: DataFrame, processed_dir: str, station_name: str) → str

Define L0A file path.

Parameters
  • df (pd.DataFrame) – L0A DataFrame.

  • processed_dir (str) – Path of the processed directory.

  • station_name (str) – Name of the station.

Returns

L0A file path.

Return type

str

disdrodb.l0.io.get_L0B_dir(processed_dir: str, station_name: str) → str

Define L0B directory.

Parameters
  • processed_dir (str) – Path of the processed directory

  • station_name (str) – Name of the station

Returns

Path of the L0B directory

Return type

str

disdrodb.l0.io.get_L0B_fname(ds, processed_dir, station_name: str) → str

Define L0B file name.

Parameters
  • ds (xr.Dataset) – L0B xarray Dataset

  • processed_dir (str) – Path of the processed directory

  • station_name (str) – Name of the station

Returns

L0B file name.

Return type

str

disdrodb.l0.io.get_L0B_fpath(ds: Dataset, processed_dir: str, station_name: str, l0b_concat=False) → str

Define L0B file path.

Parameters
  • ds (xr.Dataset) – L0B xarray Dataset.

  • processed_dir (str) – Path of the processed directory.

  • station_name (str) – ID of the station

  • l0b_concat (bool) – If False, the file is specified inside the station directory. If True, the file is specified outside the station directory.

Returns

L0B file path.

Return type

str

disdrodb.l0.io.get_campaign_name(path: str) → str

Return the campaign name from a file or directory path.

Current assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.

Parameters

path (str) – Either a campaign directory (‘raw_dir’ or ‘processed_dir’) or a DISDRODB file path.

Returns

Name of the campaign.

Return type

str

disdrodb.l0.io.get_data_source(path: str) → str

Return the data_source from a file or directory path.

Current assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.

Parameters

path (str) – Either a campaign directory (‘raw_dir’ or ‘processed_dir’) or a DISDRODB file path.

Returns

Name of the data source.

Return type

str

disdrodb.l0.io.get_dataframe_min_max_time(df: DataFrame)

Retrieves dataframe starting and ending time.

Parameters

df (pd.DataFrame) – Input dataframe

Returns

(starting_time, ending_time)

Return type

tuple

disdrodb.l0.io.get_dataset_min_max_time(ds: Dataset)

Retrieves dataset starting and ending time.

Parameters

ds (xr.Dataset) – Input dataset

Returns

(starting_time, ending_time)

Return type

tuple

disdrodb.l0.io.get_disdrodb_dir(path: str) → str

Return the DISDRODB base directory from a file or directory path.

Current assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.

Parameters

path (str) – Either a campaign directory (‘raw_dir’ or ‘processed_dir’) or a DISDRODB file path.

Returns

Path of the DISDRODB directory.

Return type

str

disdrodb.l0.io.get_disdrodb_path(path: str) → str

Return the path from the disdrodb_dir directory.

Current assumption: no data_source, campaign_name, station_name or file name contains the word DISDRODB.

Parameters

path (str) – Either a campaign directory (‘raw_dir’ or ‘processed_dir’) or a DISDRODB file path.

Returns

Path inside the DISDRODB archive. Format: DISDRODB/<Raw or Processed>/<data_source>/…

Return type

str
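The path helpers in this module (get_disdrodb_dir, get_disdrodb_path, get_data_source, get_campaign_name) all rely on the documented layout `<..>/DISDRODB/<Raw or Processed>/<data_source>/<campaign_name>`. The following is a minimal sketch of how such parsing could work, assuming POSIX-style paths; `split_disdrodb_path` and `get_path_component` are hypothetical names, not functions of the package.

```python
from pathlib import PurePosixPath


def split_disdrodb_path(path: str):
    """Split a path at its 'DISDRODB' component (hypothetical helper)."""
    parts = PurePosixPath(path).parts
    if "DISDRODB" not in parts:
        raise ValueError("The path does not contain a 'DISDRODB' directory.")
    # Relies on the stated assumption that 'DISDRODB' occurs only once.
    idx = parts.index("DISDRODB")
    base_dir = str(PurePosixPath(*parts[: idx + 1]))
    inner_path = str(PurePosixPath(*parts[idx:]))
    return base_dir, inner_path


def get_path_component(path: str, offset: int) -> str:
    """Return the component `offset` levels below 'DISDRODB'."""
    _, inner_path = split_disdrodb_path(path)
    parts = PurePosixPath(inner_path).parts
    if len(parts) <= offset:
        raise ValueError("The path is too short to contain this component.")
    return parts[offset]


path = "/tmp/DISDRODB/Raw/DATA_SOURCE/CAMPAIGN_NAME"
base_dir, inner_path = split_disdrodb_path(path)
data_source = get_path_component(path, 2)    # DISDRODB/<Raw or Processed>/<data_source>
campaign_name = get_path_component(path, 3)  # .../<data_source>/<campaign_name>
```

The sketch also shows why the "current assumption" matters: if any other path component were named DISDRODB, locating the archive root by name would become ambiguous.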

disdrodb.l0.io.get_l0a_file_list(processed_dir, station_name, debugging_mode)

Retrieve L0A files for a given station.

Parameters
  • processed_dir (str) – Directory of the campaign where to search for the L0A files. Format <..>/DISDRODB/Processed/<data_source>/<campaign_name>

  • station_name (str) – ID of the station

  • debugging_mode (bool, optional) – If True, it selects at most 3 files for debugging purposes. The default is False.

Returns

list_fpaths – List of L0A file paths.

Return type

list

disdrodb.l0.io.get_raw_file_list(raw_dir, station_name, glob_patterns, verbose=False, debugging_mode=False)

Get the list of files from a directory based on input parameters.

Currently concatenates all files provided by the glob patterns. In future, this might be modified to enable DISDRODB processing when raw data are separated in multiple files.

Parameters
  • raw_dir (str) – Directory of the campaign where to search for files. Format <..>/DISDRODB/Raw/<data_source>/<campaign_name>

  • station_name (str) – ID of the station

  • verbose (bool, optional) – Whether to print detailed processing information. The default is False.

  • debugging_mode (bool, optional) – If True, it selects at most 3 files for debugging purposes. The default is False.

Returns

list_fpaths – List of file paths.

Return type

list
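The behaviour described for get_raw_file_list (gather the matches of every glob pattern, optionally keep only a few files for debugging) can be sketched with the standard library. This is an illustration only: `list_raw_files` is a hypothetical name, and the location of station files under `<raw_dir>/data/<station_name>/` is an assumption not stated in this reference.

```python
import glob
import os


def list_raw_files(raw_dir, station_name, glob_patterns, debugging_mode=False):
    """Collect the files matching any of the glob patterns (hypothetical helper)."""
    if isinstance(glob_patterns, str):
        glob_patterns = [glob_patterns]
    # Assumption: station files live under <raw_dir>/data/<station_name>/.
    station_dir = os.path.join(raw_dir, "data", station_name)
    fpaths = []
    for pattern in glob_patterns:
        fpaths.extend(glob.glob(os.path.join(station_dir, pattern)))
    # Drop duplicates matched by several patterns and sort for a stable order.
    fpaths = sorted(set(fpaths))
    if not fpaths:
        raise ValueError(f"No file found for station '{station_name}'.")
    # In debugging mode, keep at most 3 files.
    if debugging_mode:
        fpaths = fpaths[:3]
    return fpaths
```

Deduplicating and sorting are design choices of this sketch; whether the package does the same is not documented here.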

disdrodb.l0.io.read_L0A_dataframe(fpaths: Union[str, list], verbose: bool = False, debugging_mode: bool = False) → DataFrame

Read DISDRODB L0A Apache Parquet file(s).

Parameters
  • fpaths (str or list) – Either a single file path or a list of file paths.

  • verbose (bool) – Whether to print detailed processing information into terminal. The default is False.

  • debugging_mode (bool) – If True, it reduces the amount of data to process. If fpaths is a list, it reads only the first 3 files. For each file, it selects only the first 100 rows. The default is False.

Returns

L0A Dataframe.

Return type

pd.DataFrame

disdrodb.l0.issue module

class disdrodb.l0.issue.NoDatesSafeLoader(stream)

Bases: SafeLoader

classmethod remove_implicit_resolver(tag_to_remove)

Remove implicit resolvers for a particular tag

Takes care not to modify resolvers in super classes.

We want to load datetimes as strings, not dates, because we go on to serialise as json which doesn’t have the advanced types of yaml, and leads to incompatibilities down the track.

disdrodb.l0.issue.check_issue_dict(issue_dict)

Check validity of the issue dictionary

disdrodb.l0.issue.check_issue_file(fpath: str) → None

Check issue YAML file validity.

Parameters

fpath (str) – Issue YAML file path.

disdrodb.l0.issue.check_time_periods(time_periods)

Check time_periods validity.

disdrodb.l0.issue.check_timesteps(timesteps)

Check timesteps validity.

It expects timesteps string in YYYY-mm-dd HH:MM:SS format with second accuracy. If timesteps is None, return None.
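A format check of this kind can be sketched with `datetime.strptime`. This is an illustration, not the package's implementation: `validate_timesteps` is a hypothetical name, and accepting either a single string or a list of strings is an assumption.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def validate_timesteps(timesteps):
    """Return the timesteps as a list of strings, or None if None is given."""
    if timesteps is None:
        return None
    # Assumption: a single string is accepted and wrapped in a list.
    if isinstance(timesteps, str):
        timesteps = [timesteps]
    for timestep in timesteps:
        if not isinstance(timestep, str):
            raise TypeError("Each timestep must be a string.")
        # strptime raises ValueError when the string does not match the format.
        datetime.strptime(timestep, TIME_FORMAT)
    return list(timesteps)
```

For instance, `validate_timesteps("2018-12-07 14:15:00")` passes, while a string without seconds such as `"2018-12-07 14:15"` raises a `ValueError`.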

disdrodb.l0.issue.is_numpy_array_datetime(arr)

Check if the numpy array contains datetime64

Parameters

arr (numpy array) – Numpy array to check.

Returns

Numpy array checked.

Return type

numpy array

disdrodb.l0.issue.is_numpy_array_string(arr)

Check if the numpy array contains strings

Parameters

arr (numpy array) – Numpy array to check.

disdrodb.l0.issue.load_yaml_without_date_parsing(filepath)

Read a YAML file without converting automatically date string to datetime.
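The approach described for NoDatesSafeLoader is a common PyYAML pattern: subclass `yaml.SafeLoader` and drop the implicit resolver for the timestamp tag, so that date-like scalars load as plain strings. The sketch below assumes PyYAML is installed and is not copied from the package source; the `timesteps` key in the sample text is only illustrative.

```python
import yaml


class NoDatesSafeLoader(yaml.SafeLoader):
    @classmethod
    def remove_implicit_resolver(cls, tag_to_remove):
        # Copy the resolver table first so that the parent class is not modified.
        if "yaml_implicit_resolvers" not in cls.__dict__:
            cls.yaml_implicit_resolvers = cls.yaml_implicit_resolvers.copy()
        # Rebuild each resolver list without the entries for the given tag.
        for first_letter, mappings in cls.yaml_implicit_resolvers.items():
            cls.yaml_implicit_resolvers[first_letter] = [
                (tag, regexp) for tag, regexp in mappings if tag != tag_to_remove
            ]


# Timestamps are no longer resolved implicitly, so they load as plain strings.
NoDatesSafeLoader.remove_implicit_resolver("tag:yaml.org,2002:timestamp")

text = "timesteps:\n  - 2018-08-01 12:00:00\n"
issue_dict = yaml.load(text, Loader=NoDatesSafeLoader)
```

With the stock `yaml.safe_load`, the same scalar would be converted to a `datetime` object; with this loader it stays a string that can be validated or serialised as-is.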

disdrodb.l0.issue.read_issue(raw_dir: str, station_name: str) → dict

Read YAML issue file.

Parameters
  • raw_dir (str) – Path of the campaign raw directory.

  • station_name (str) – Station name.

Returns

Issue dictionary.

Return type

dict

disdrodb.l0.issue.read_issue_file(fpath: str) → dict

Read YAML issue file.

Parameters

fpath (str) – Filepath of the issue YAML.

Returns

Issue dictionary.

Return type

dict

disdrodb.l0.issue.write_default_issue(fpath: str) → None

Write an empty issue YAML file.

Parameters

fpath (str) – Filepath of the issue YAML to write.

disdrodb.l0.issue.write_issue_dict(fpath: str, issue_dict: dict) → None

Write the issue YAML file.

Parameters
  • fpath (str) – Filepath of the issue YAML to write.

  • issue_dict (dict) – Issue dictionary

disdrodb.l0.l0_processing module

disdrodb.l0.l0_reader module

disdrodb.l0.l0a_processing module

disdrodb.l0.l0b_concat module

disdrodb.l0.l0b_processing module

disdrodb.l0.metadata module

disdrodb.l0.metadata.add_missing_metadata_keys(metadata)

Add missing keys to the metadata dictionary.

disdrodb.l0.metadata.check_metadata_compliance(disdrodb_dir, data_source, campaign_name, station_name)

Check DISDRODB metadata compliance.

disdrodb.l0.metadata.create_campaign_default_metadata(disdrodb_dir, campaign_name, data_source)

Create default YAML metadata files for all stations within a campaign.

Use this function with caution to avoid overwriting existing YAML files.

disdrodb.l0.metadata.get_default_metadata_dict() → dict

Get DISDRODB metadata default values.

Returns

Dictionary of standard attributes.

Return type

dict

disdrodb.l0.metadata.get_metadata_missing_keys(metadata)

Return the DISDRODB metadata keys which are missing.

disdrodb.l0.metadata.get_metadata_unvalid_keys(metadata)

Return the DISDRODB metadata keys which are not valid.

disdrodb.l0.metadata.get_valid_metadata_keys() → list

Get DISDRODB valid metadata list.

Returns

List of valid metadata keys

Return type

list

disdrodb.l0.metadata.read_metadata(campaign_dir: str, station_name: str) → dict

Read YAML metadata file.

Parameters
  • campaign_dir (str) – Path of the campaign directory (for example, the raw directory).

  • station_name (str) – ID of the station.

Returns

Dictionary of the metadata.

Return type

dict

disdrodb.l0.metadata.remove_unvalid_metadata_keys(metadata)

Remove invalid keys from the metadata dictionary.

disdrodb.l0.metadata.sort_metadata_dictionary(metadata)

Sort the keys of the metadata dictionary by valid_metadata_keys list order.
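The key-handling helpers of this module (finding missing keys, finding and removing invalid keys, adding missing keys, sorting by the valid-key order) can be illustrated with plain dictionaries. This is a sketch under assumptions: `VALID_KEYS` is a made-up, shortened stand-in for the list returned by get_valid_metadata_keys, the function names are hypothetical, and filling missing keys with an empty string is an assumed default.

```python
# Hypothetical, shortened list of valid keys; the real list comes from the package.
VALID_KEYS = ["data_source", "campaign_name", "station_name"]


def missing_keys(metadata):
    """Return the valid keys that are absent from `metadata`."""
    return [key for key in VALID_KEYS if key not in metadata]


def invalid_keys(metadata):
    """Return the keys of `metadata` that are not valid keys."""
    return [key for key in metadata if key not in VALID_KEYS]


def normalize_metadata(metadata):
    """Drop invalid keys, add missing ones, and sort by VALID_KEYS order."""
    cleaned = {k: v for k, v in metadata.items() if k in VALID_KEYS}
    for key in missing_keys(cleaned):
        cleaned[key] = ""  # Assumption: missing keys get an empty string.
    # Dictionaries preserve insertion order, so this fixes the key order.
    return {key: cleaned[key] for key in VALID_KEYS}


metadata = {"station_name": "station_1", "data_source": "SOURCE", "typo_key": 1}
normalized = normalize_metadata(metadata)
```

Sorting by a reference list rather than alphabetically keeps related fields together when the dictionary is written back to YAML.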

disdrodb.l0.metadata.write_default_metadata(fpath: str) → None

Create default YAML metadata file at the specified filepath.

Parameters

fpath (str) – File path

disdrodb.l0.metadata.write_metadata(metadata, fpath)

Write dictionary to YAML file.

disdrodb.l0.standards module

disdrodb.l0.summary module

disdrodb.l0.template_tools module

disdrodb.l0.utils_nc module

Module contents