ada.utils package

Submodules

ada.utils.config_file_generation module

class ada.utils.config_file_generation.ConfigVariants[source]

Bases: object

add(name, **params)[source]
save(filename)[source]
to_dict()[source]
class ada.utils.config_file_generation.Iter(*params)[source]

Bases: object

ada.utils.experimentation module

ada.utils.experimentation.create_timestamp_string(fmt='%Y-%m-%d.%H.%M.%S.%f')[source]
ada.utils.experimentation.load_json_dict(conf_filename)[source]
ada.utils.experimentation.loop_train_test_model(method, results, nseeds, backup_file, test_params, data_factory, gpus, force_run=False, progress_callback=<function <lambda>>, method_name=None, method_params=None, mlflow_uri=None, tensorboard_dir=None, checkpoint_dir=None)[source]
ada.utils.experimentation.param_to_hash(param_dict)[source]
ada.utils.experimentation.param_to_str(param_dict)[source]
ada.utils.experimentation.record_hashes(hash_file, hash_, value)[source]
ada.utils.experimentation.set_all_seeds(seed)[source]

See https://pytorch.org/docs/stable/notes/randomness.html

We activate the PyTorch options for best reproducibility. Note that this may be detrimental to processing speed, as per the above documentation:

…the processing speed (e.g. the number of batches trained per second) may be lower than when the model functions nondeterministically. However, even though single-run speed may be slower, depending on your application determinism may save time by facilitating experimentation, debugging, and regression testing.
Parameters: seed (int) – the seed used for all random generators.
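
As an illustration of what seeding "all random generators" can involve, here is a minimal sketch. It is not the library's implementation: the helper name `seed_everything_sketch` is hypothetical, and NumPy and PyTorch are only seeded if they happen to be installed. The two cuDNN flags are the ones discussed in the PyTorch randomness notes linked above.

```python
import random


def seed_everything_sketch(seed):
    """Hypothetical helper: seed every random generator that is available."""
    random.seed(seed)  # Python's built-in generator

    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's global generator
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; nothing more to seed

    torch.manual_seed(seed)           # default PyTorch generator
    torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
    # Trade speed for determinism, as described in the PyTorch notes.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


seed_everything_sketch(98347)
first = random.random()
seed_everything_sketch(98347)
second = random.random()
print(first == second)  # same seed, same draw
```

Re-seeding before each draw yields the same value, which is the property that makes repeated runs with one seed comparable.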
ada.utils.experimentation.train_model(method, data_factory, train_params=None, archi_params=None, method_name=None, method_params=None, seed=98347, fix_few_seed=0, gpus=None, mlflow_uri=None, tensorboard_dir=None, checkpoint_dir=None, fast=False, try_to_resume=True)[source]

This is the main function where a single model is created and trained, for a single seed value.

Parameters:
  • method (archis.Method) – type of method, used to decide which networks to build and how to use some parameters.
  • data_factory (DataFactory) – dataset description to get dataset loaders, as well as useful information for some networks.
  • train_params (dict, optional) – Hyperparameters for training (see network config). Defaults to None.
  • archi_params (dict, optional) – Parameters of the network (see network config). Defaults to None.
  • method_name (string, optional) – A unique name describing the method, with its parameters. Used for logging results. Defaults to None.
  • method_params (dict, optional) – Parameters to be fed to the model that are specific to method. Defaults to None.
  • seed (int, optional) – Global seed for reproducibility. Defaults to 98347.
  • fix_few_seed (int, optional) – Seed for the semi-supervised setting, fixing which target samples are labeled. Defaults to 0.
  • gpus (list of int, optional) – Which GPU ids to use. Defaults to None.
  • mlflow_uri (int|string, optional) – If a string, it must be formatted like <uri>:<port>. If an int, it is treated as a port and logging goes to an MLFlow server on localhost:port. If None, MLFlow logging is skipped. Defaults to None.
  • fast (bool, optional) – Whether to activate the fast_dev_run option of PyTorch-Lightning, training only on 1 batch per epoch for debugging. Defaults to False.
Returns:

A 2-element tuple containing:

  • pl.Trainer: object containing the resulting metrics, used for evaluation.
  • BaseAdaptTrainer: pl.LightningModule object (derived class depending on method), containing both the dataset & trained networks.

Return type:

tuple
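
The three accepted forms of `mlflow_uri` described in the parameter list can be sketched as a small dispatch. This is an illustrative assumption, not the library's code: the helper name `resolve_mlflow_uri_sketch` is hypothetical, and the `http://` scheme used for the port-only case is a guess at how a localhost address would be built.

```python
def resolve_mlflow_uri_sketch(mlflow_uri):
    """Hypothetical helper mirroring the documented mlflow_uri rules."""
    if mlflow_uri is None:
        return None  # MLFlow logging is skipped
    if isinstance(mlflow_uri, int):
        # A bare port: target a server on localhost. The "http://" scheme
        # is an assumption of this sketch, not stated in the docs.
        return f"http://localhost:{mlflow_uri}"
    if isinstance(mlflow_uri, str):
        # Strings must look like <uri>:<port>; check the trailing port.
        host, sep, port = mlflow_uri.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected '<uri>:<port>', got {mlflow_uri!r}")
        return mlflow_uri
    raise TypeError("mlflow_uri must be an int, a string, or None")


print(resolve_mlflow_uri_sketch(5000))
print(resolve_mlflow_uri_sketch("http://myserver:5000"))
print(resolve_mlflow_uri_sketch(None))
```

Validating the string form early gives a clear error at call time rather than a connection failure partway through training.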

ada.utils.experimentation_results module

class ada.utils.experimentation_results.XpResults(metrics, df=None)[source]

Bases: object

already_computed(method_name, seed)[source]
append_to_markdown(filepath, test_params, nseeds, splits=None)[source]
append_to_txt(filepath, test_params, nseeds, splits=None)[source]
static from_file(metrics, filepath)[source]
get_best_archi_seed()[source]
get_data()[source]
get_last_seed()[source]
get_mean_seed(mean_metric)[source]
print_scores(method_name, split='Validation', stdout=True, fdout=None, print_func=<built-in function print>, file_format='markdown')[source]
remove(method_names)[source]
to_csv(filepath)[source]
update(is_validation, method_name, seed, metric_values)[source]

ada.utils.plotting module

ada.utils.plotting.colored_scattered_plot2x2(X_s, X_t, y_sparse_train_s, y_sparse_train_t, figsize=(12, 6), set_aspect_equal=False)[source]
ada.utils.plotting.plot_archi_data(domain_archi, tag, save_prefix=None, plot_features=None, plot_f_lines=False, do_domain_boundary=False, do_entropy=False, num_samples=600)[source]

This method generates a series of figures depending on the model and on the dataset used.

For toy data, more figures are available:

  • the classifier boundary
  • the domain boundary
  • the entropy values
  • the lines corresponding to the hidden neurons of the first feature layer
  • a PCA or TSNE or UMAP projection of the features
For other datasets with more than 2 dimensions, only the feature projections are available.

Parameters:
  • domain_archi (BaseAdaptTrainer) – the trained architecture.
  • tag (string) – the name of the method used both in the generated image titles and file names.
  • save_prefix (string, optional) – If set, images are saved to “{save_prefix}_{auto-gen-name}.png”. If None, the images are not saved to disk. Defaults to None.
  • plot_features (string or list of strings, optional) – None, a string, or a list of strings from (“pca”, “tsne”, “umap”). Defaults to None.
  • plot_f_lines (bool, optional) – If True, plot the lines corresponding to the hidden neurons of the first feature layer for 2D data. Defaults to False.
  • do_domain_boundary (bool, optional) – If True, plot the domain boundary for 2D toy data. Defaults to False.
  • do_entropy (bool, optional) – If True, plot the level of entropy values between 0 and 1. Defaults to False.
  • num_samples (int, optional) – Number of random samples used for plotting. Defaults to 600.
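
To make the `plot_features` and `save_prefix` conventions concrete, the sketch below normalizes the three accepted `plot_features` forms into a list and builds the output file name. Both helper names are hypothetical, and rejecting unknown projection names with an error is an assumption; the documentation only lists the allowed values.

```python
ALLOWED_PROJECTIONS = ("pca", "tsne", "umap")


def normalize_plot_features_sketch(plot_features):
    """Hypothetical helper: turn None / str / list of str into a list."""
    if plot_features is None:
        return []  # no feature projection requested
    if isinstance(plot_features, str):
        plot_features = [plot_features]
    names = list(plot_features)
    unknown = [name for name in names if name not in ALLOWED_PROJECTIONS]
    if unknown:
        # Assumption: unknown names are rejected rather than ignored.
        raise ValueError(f"unknown projection(s): {unknown}")
    return names


def figure_path_sketch(save_prefix, auto_name):
    """Return the output path, or None when figures are not saved."""
    if save_prefix is None:
        return None
    return f"{save_prefix}_{auto_name}.png"


print(normalize_plot_features_sketch("pca"))
print(normalize_plot_features_sketch(["tsne", "umap"]))
print(figure_path_sketch("out/run1", "pca"))
```

Normalizing to a list up front lets the plotting loop treat a single projection and several projections the same way.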

ada.utils.streamlit_configs module

ada.utils.streamlit_configs.configure_dataset(default_dir, on_sidebar=True)[source]
ada.utils.streamlit_configs.configure_network(default_file, on_sidebar=True)[source]
ada.utils.streamlit_configs.test_view_data(data_params)[source]

Module contents