ada.utils package
Submodules
ada.utils.config_file_generation module
ada.utils.experimentation module
ada.utils.experimentation.loop_train_test_model(method, results, nseeds, backup_file, test_params, data_factory, gpus, force_run=False, progress_callback=<function <lambda>>, method_name=None, method_params=None, mlflow_uri=None, tensorboard_dir=None, checkpoint_dir=None)
ada.utils.experimentation.set_all_seeds(seed)
See https://pytorch.org/docs/stable/notes/randomness.html
We activate the PyTorch options for best reproducibility. Note that this may be detrimental to processing speed, as per the above documentation:
"…the processing speed (e.g. the number of batches trained per second) may be lower than when the model functions nondeterministically. However, even though single-run speed may be slower, depending on your application determinism may save time by facilitating experimentation, debugging, and regression testing."
Parameters: seed (int) – the seed which will be used for all random generators.
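As an illustration of what seeding "all random generators" can involve, here is a minimal sketch based on the PyTorch reproducibility notes linked above. It is not the library's actual implementation: the function name `set_all_seeds_sketch` is hypothetical, and the choice to seed NumPy as well is an assumption. NumPy and PyTorch are only seeded if they are installed.

```python
import random


def set_all_seeds_sketch(seed: int) -> None:
    """Hypothetical stand-in for set_all_seeds; not the library's code."""
    # Seed Python's built-in random generator.
    random.seed(seed)

    # Seed NumPy if it is installed (assumption: the project relies on NumPy).
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass

    # Seed PyTorch and switch on the deterministic cuDNN options described
    # in the reproducibility notes, if PyTorch is installed.
    try:
        import torch

        torch.manual_seed(seed)
        # Ignored silently when CUDA is not available.
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass
```

Calling the function twice with the same seed should make subsequent draws from Python's `random` module repeat, which is a quick way to check that seeding took effect.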
ada.utils.experimentation.train_model(method, data_factory, train_params=None, archi_params=None, method_name=None, method_params=None, seed=98347, fix_few_seed=0, gpus=None, mlflow_uri=None, tensorboard_dir=None, checkpoint_dir=None, fast=False, try_to_resume=True)
This is the main function where a single model is created and trained, for a single seed value.
Parameters: - method (archis.Method) – type of method, used to decide which networks to build and how to use some parameters.
- data_factory (DataFactory) – dataset description to get dataset loaders, as well as useful information for some networks.
- train_params (dict, optional) – Hyperparameters for training (see network config). Defaults to None.
- archi_params (dict, optional) – Parameters of the network (see network config). Defaults to None.
- method_name (string, optional) – A unique name describing the method, with its parameters. Used for logging results. Defaults to None.
- method_params (dict, optional) – Parameters to be fed to the model that are specific to method. Defaults to None.
- seed (int, optional) – Global seed for reproducibility. Defaults to 98347.
- fix_few_seed (int, optional) – Seed for the semi-supervised setting, fixing which target samples are labeled. Defaults to 0.
- gpus (list of int, optional) – Which GPU ids to use. Defaults to None.
- mlflow_uri (int|string, optional) – If a string, must be formatted like <uri>:<port>. If an int, it is interpreted as a port and the function will try to log to an MLflow server on localhost:port. If None, MLflow logging is skipped. Defaults to None.
- fast (bool, optional) – Whether to activate the fast_dev_run option of PyTorch-Lightning, training only on 1 batch per epoch for debugging. Defaults to False.
Returns: a 2-element tuple containing
- pl.Trainer: object containing the resulting metrics, used for evaluation.
- BaseAdaptTrainer: pl.LightningModule object (derived class depending on method), containing both the dataset and the trained networks.
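The three accepted forms of `mlflow_uri` (None, a port number, or a `<uri>:<port>` string) can be illustrated with a small sketch. The helper name `resolve_mlflow_uri` is hypothetical and does not come from the library. The `http://` scheme used for the localhost case is an assumption, as is the validation of the string form.

```python
from typing import Optional, Union


def resolve_mlflow_uri(mlflow_uri: Union[int, str, None]) -> Optional[str]:
    """Hypothetical helper illustrating the documented mlflow_uri convention."""
    if mlflow_uri is None:
        # No MLflow logging requested.
        return None
    if isinstance(mlflow_uri, int):
        # A bare port: target a server on localhost (http scheme is assumed).
        return f"http://localhost:{mlflow_uri}"
    # A string is expected to look like "<uri>:<port>".
    host, sep, port = mlflow_uri.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected '<uri>:<port>', got {mlflow_uri!r}")
    return mlflow_uri
```

With this sketch, `resolve_mlflow_uri(5000)` gives a localhost address, a well-formed string is returned unchanged, and a string without a port raises a `ValueError` instead of failing later at logging time.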
ada.utils.experimentation_results module
ada.utils.plotting module
ada.utils.plotting.colored_scattered_plot2x2(X_s, X_t, y_sparse_train_s, y_sparse_train_t, figsize=(12, 6), set_aspect_equal=False)
ada.utils.plotting.plot_archi_data(domain_archi, tag, save_prefix=None, plot_features=None, plot_f_lines=False, do_domain_boundary=False, do_entropy=False, num_samples=600)
This method generates a series of figures depending on the model and on the dataset used.
For toy data, more figures are available:
- the classifier boundary
- the domain boundary
- the entropy values
- the lines corresponding to the hidden neurons of the first feature layer
- a PCA or TSNE or UMAP projection of the features
For other datasets with more than 2 dimensions, only the feature projections are available.
Parameters: - domain_archi (BaseAdaptTrainer) – the trained architecture.
- tag (string) – the name of the method used both in the generated image titles and file names.
- save_prefix (string, optional) – If set, images are saved to "{save_prefix}_{auto-gen-name}.png". If None, the images are not saved to disk. Defaults to None.
- plot_features (string or list of strings, optional) – None, or one or more of ("pca", "tsne", "umap"). Defaults to None.
- plot_f_lines (bool, optional) – If True, plot the lines corresponding to the first neurons for 2D data. Defaults to False.
- do_domain_boundary (bool, optional) – If True, plot the domain boundary for 2D toy data. Defaults to False.
- do_entropy (bool, optional) – If True, plot the level of entropy values between 0 and 1. Defaults to False.
- num_samples (int, optional) – Number of random samples used for plotting. Defaults to 600.
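The `plot_features` and `save_prefix` conventions described above can be sketched in plain Python. Both helper names are hypothetical and are not part of `ada.utils.plotting`. Treating `plot_features=None` as "no projection requested" is an assumption, and `auto_name` stands in for the auto-generated part of the file name, whose real format is not documented here.

```python
from typing import List, Optional, Sequence, Union

VALID_PROJECTIONS = ("pca", "tsne", "umap")


def normalize_plot_features(
    plot_features: Union[None, str, Sequence[str]]
) -> List[str]:
    """Hypothetical helper: turn None / string / list into a list of names."""
    if plot_features is None:
        # Assumption: None means no feature projection is requested.
        return []
    if isinstance(plot_features, str):
        plot_features = [plot_features]
    names = list(plot_features)
    unknown = [name for name in names if name not in VALID_PROJECTIONS]
    if unknown:
        raise ValueError(f"unknown projection(s): {unknown}")
    return names


def output_path(save_prefix: Optional[str], auto_name: str) -> Optional[str]:
    """Hypothetical helper: build "{save_prefix}_{auto-gen-name}.png"."""
    if save_prefix is None:
        # Nothing is written to disk when no prefix is given.
        return None
    return f"{save_prefix}_{auto_name}.png"
```

Normalizing the argument once up front keeps the plotting loop simple: it can iterate over a list regardless of whether the caller passed a single string or several.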