GammaLearn package API
Subpackages
- gammalearn.tests package
- Submodules
- gammalearn.tests.test_criterions module
- gammalearn.tests.test_datasets module
TestDL1Parameters
TestLSTDataset
TestLSTDataset.setUp()
TestLSTDataset.test_energy_filter_file()
TestLSTDataset.test_energy_filter_memory()
TestLSTDataset.test_intensity_energy_filter_file()
TestLSTDataset.test_intensity_energy_filter_memory()
TestLSTDataset.test_intensity_filter_file()
TestLSTDataset.test_intensity_filter_memory()
TestLSTDataset.test_intensity_lstchain_filter_file()
TestLSTDataset.test_intensity_lstchain_filter_memory()
TestLSTDataset.test_mono_file()
TestLSTDataset.test_mono_memory()
TestLSTDataset.test_mono_memory_test_mode()
TestLSTDataset.test_subarray()
TestLSTRealDataset
TestLSTRealDataset.setUp()
TestLSTRealDataset.test_intensity_filter_file()
TestLSTRealDataset.test_intensity_filter_memory()
TestLSTRealDataset.test_intensity_lstchain_filter_file()
TestLSTRealDataset.test_intensity_lstchain_filter_memory()
TestLSTRealDataset.test_mono_file()
TestLSTRealDataset.test_mono_memory()
TestLSTRealDataset.test_mono_memory_test_mode()
- gammalearn.tests.test_nets module
- gammalearn.tests.test_optimizers module
- gammalearn.tests.test_utils module
MockLSTDataset
TestIndexMatrix
TestTransformerUtils
TestUtils
TestUtils.setUp()
TestUtils.test_cleaning()
TestUtils.test_emission_cone()
TestUtils.test_energy()
TestUtils.test_impact_distance()
TestUtils.test_inject_geometry_into_parameters()
TestUtils.test_leakage()
TestUtils.test_multiplicity()
TestUtils.test_multiplicity_strict()
TestUtils.test_nets_definition_path()
TestUtils.test_rotated_indices()
TestWrite
- gammalearn.tests.test_version module
- Module contents
gammalearn.callbacks module
- class gammalearn.callbacks.LogFeatures[source]
Bases:
Callback
- class gammalearn.callbacks.LogGradNormTracker[source]
Bases:
Callback
Callback to send gradnorm parameters to logger
- class gammalearn.callbacks.LogGradientCosineSimilarity[source]
Bases:
Callback
Callback to send the tasks gradient cosine similarity to logger
- class gammalearn.callbacks.LogGradientNorm[source]
Bases:
Callback
Callback to send the gradient total norm to logger
- on_train_epoch_end(trainer, pl_module)[source]
Called when the train epoch ends.
To access all batch outputs at the end of the epoch, either:
Implement training_epoch_end in the LightningModule and access outputs via the module OR
Cache data across train batch hooks inside the callback implementation to post-process in this hook.
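As an illustration of what a "gradient total norm" usually means, the sketch below computes the L2 norm over the gradients of all parameters. It is plain Python, not the callback's actual code; the function name and the list-of-lists layout are assumptions made for the example.

```python
import math


def total_gradient_norm(gradients):
    """L2 norm over every gradient value of every parameter.

    `gradients` holds one flat list of gradient values per parameter
    (a hypothetical layout chosen for this sketch).
    """
    squared_sum = sum(g * g for grad in gradients for g in grad)
    return math.sqrt(squared_sum)


# Two "parameters" with gradients [3, 0] and [4]: sqrt(9 + 0 + 16)
norm = total_gradient_norm([[3.0, 0.0], [4.0]])
print(norm)
```

In a real callback the same quantity would be accumulated from the `.grad` attribute of each model parameter before being sent to the logger.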
- class gammalearn.callbacks.LogGradientNormPerTask[source]
Bases:
Callback
Callback to send the tasks gradient norm to logger
- class gammalearn.callbacks.LogLambda[source]
Bases:
Callback
Callback to send the loss gradient weighting from BaseW to logger
- class gammalearn.callbacks.LogLinearGradient[source]
Bases:
Callback
Callback to send gradients to logger
- class gammalearn.callbacks.LogLossWeighting[source]
Bases:
Callback
Callback to send loss weight coefficients to logger
- class gammalearn.callbacks.LogModelParameters[source]
Bases:
Callback
Callback to send the network parameters to logger
- on_train_epoch_end(trainer, pl_module)[source]
Called when the train epoch ends.
To access all batch outputs at the end of the epoch, either:
Implement training_epoch_end in the LightningModule and access outputs via the module OR
Cache data across train batch hooks inside the callback implementation to post-process in this hook.
- class gammalearn.callbacks.LogModelWeightNorm[source]
Bases:
Callback
Callback to send the sum of squared weights of the network to logger
- on_train_epoch_end(trainer, pl_module)[source]
Called when the train epoch ends.
To access all batch outputs at the end of the epoch, either:
Implement training_epoch_end in the LightningModule and access outputs via the module OR
Cache data across train batch hooks inside the callback implementation to post-process in this hook.
- class gammalearn.callbacks.LogReLUActivations[source]
Bases:
Callback
Callback to send activations to logger
- class gammalearn.callbacks.LogUncertaintyTracker[source]
Bases:
Callback
Callback to send loss log vars and precisions of the Uncertainty estimation method to logger
- class gammalearn.callbacks.WriteADistance[source]
Bases:
WriteData
Callback to produce testing result data files.
- Parameters:
- trainer: (Trainer)
- pl_module: (LightningModule)
- class gammalearn.callbacks.WriteAccuracy[source]
Bases:
WriteData
Callback to produce testing result data files.
- Parameters:
- trainer: (Trainer)
- pl_module: (LightningModule)
- class gammalearn.callbacks.WriteAccuracyDomain[source]
Bases:
WriteData
Callback to produce testing result data files.
- Parameters:
- trainer: (Trainer)
- pl_module: (LightningModule)
- class gammalearn.callbacks.WriteAutoEncoder[source]
Bases:
WriteData
Callback to produce testing result data files.
- Parameters:
- trainer: (Trainer)
- pl_module: (LightningModule)
- class gammalearn.callbacks.WriteAutoEncoderDL1[source]
Bases:
Callback
Callback to produce testing result data files.
- Parameters:
- trainer: (Trainer)
- pl_module: (LightningModule)
- class gammalearn.callbacks.WriteConfusionMatrix[source]
Bases:
WriteData
Callback to produce testing result data files.
- Parameters:
- trainer: (Trainer)
- pl_module: (LightningModule)
- class gammalearn.callbacks.WriteDL2Files[source]
Bases:
Callback
Callback to produce testing result data files.
- Parameters:
- trainer: (Trainer)
- pl_module: (LightningModule)
- gammalearn.callbacks.make_activation_sender(pl_module, name)[source]
Creates the adapted activations sender to tensorboard.
- Parameters:
- pl_module: (LightningModule) the tensorboardX writer
- name: (string) name of the layer whose activation is logged
- Returns:
- An adapted function
gammalearn.constants module
gammalearn.criterions module
- class gammalearn.criterions.DANNLoss(training_class: Optional[list] = None, gamma: Optional[int] = None)[source]
Bases:
Module
Implementation of the Domain Adversarial Neural Network (DANN) loss. From the DANN article https://arxiv.org/abs/1505.07818.
- Parameters:
- training_class: (dict) The dict of all the classes that trigger the training of the domain classifier. If set to None, no domain conditional is applied. In the LST dataset, MC labels are processed using the particle dictionary defined in the experiment settings, whereas the real labels remain the same.
- gamma: (int) If gamma is not None, the weight associated with the loss is computed according to the lambda_p strategy.
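In the DANN article, the lambda_p strategy is a schedule that ramps the adversarial weight smoothly from 0 to 1 as training progresses. A minimal sketch, assuming `progress` is the fraction of training already completed; how GammaLearn derives that value is not stated here, and the function name is hypothetical:

```python
import math


def lambda_p(progress, gamma=10.0):
    """DANN schedule: 2 / (1 + exp(-gamma * p)) - 1, with p in [0, 1].

    `progress` (p) is assumed to be the fraction of training completed.
    gamma=10 is the value used in the DANN article's experiments.
    """
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


# The weight starts at 0 and saturates close to 1 at the end of training.
for p in (0.0, 0.25, 0.5, 1.0):
    print(p, round(lambda_p(p), 4))
```

A larger gamma makes the weight reach its plateau earlier in training.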
- static fetch_domain_conditional_from_targets(targets: dict) → bool [source]
In DANN training and validation steps, check if domain conditional is True. If it is, the step functions will update the domain loss mask at each iteration.
- Parameters:
- targets: (dict) The experiment setting targets dictionary.
- forward(output: Tensor, labels: Tensor)[source]
DANN loss function.
- Parameters:
- output: (torch.Tensor) The model’s output.
- labels: (torch.Tensor) The ground truth domain labels.
- set_domain_loss_mask(labels: Tensor) → None [source]
Update the domain loss mask if domain conditional is True.
- Parameters:
- labels: (torch.Tensor) The ground truth class labels.
- static set_domain_loss_mask_from_targets(targets: dict, labels: Tensor) → None [source]
Update the domain loss mask at each iteration of the training and validation steps from the experiment setting targets variable.
- Parameters:
- targets: (dict) The experiment setting targets dictionary.
- labels: (torch.Tensor) The ground truth class labels.
- training: bool
- class gammalearn.criterions.DeepCORALLoss[source]
Bases:
Module
Implementation of the CORAL loss. From the DeepCORAL article https://arxiv.org/abs/1607.01719.
- forward(ds: Tensor, dt: Tensor) → Tensor [source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
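For orientation, the DeepCORAL article defines the CORAL loss as the squared Frobenius norm of the difference between the source and target feature covariance matrices, divided by 4d², where d is the feature dimension. The plain-Python sketch below follows that definition; it is not the GammaLearn implementation, and the helper names are hypothetical.

```python
def covariance(features):
    """Unbiased covariance matrix of a batch given as a list of rows."""
    n, d = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def coral_loss(source, target):
    """Squared Frobenius distance between covariances, scaled by 1 / (4 d^2)."""
    d = len(source[0])
    cs, ct = covariance(source), covariance(target)
    frob_sq = sum((cs[a][b] - ct[a][b]) ** 2 for a in range(d) for b in range(d))
    return frob_sq / (4 * d * d)


source = [[0.0, 0.0], [2.0, 2.0]]
target = [[0.0, 0.0], [1.0, 1.0]]
print(coral_loss(source, target))
```

The loss is zero when both batches share the same second-order statistics, which is what the alignment aims for.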
- class gammalearn.criterions.DeepJDOTLoss[source]
Bases:
Module
Implementation of the Wasserstein loss using the Optimal Transport theory. From the DeepJDOT article https://arxiv.org/abs/1803.10081.
- forward(latent_features_source: Tensor, latent_features_target: Tensor)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class gammalearn.criterions.EqualWeighting(targets: Dict[str, Dict], requires_gradients: bool = False, layer: Optional[str] = None)[source]
Bases:
MultiLossBalancing
Assigns the same weight to all the losses.
- training: bool
- class gammalearn.criterions.FocalLoss(alpha=0.5, gamma=2.0, reduction='mean')[source]
Bases:
Module
Criterion that computes Focal loss. According to [1], the Focal loss is computed as follows:
\[\text{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \, \log(p_t)\]
- where:
\(p_t\) is the model’s estimated probability for each class.
- Parameters:
alpha (float) – Weighting factor \(\alpha \in [0, 1]\).
gamma (float) – Focusing parameter \(\gamma >= 0\).
reduction (str, optional) – Specifies the reduction to apply to the output: ‘none’ | ‘mean’ | ‘sum’. ‘none’: no reduction will be applied, ‘mean’: the sum of the output will be divided by the number of elements in the output, ‘sum’: the output will be summed. Default: ‘mean’ (per the class signature).
- Shape:
Input: \((N, C, H, W)\) where C = number of classes.
Target: \((N, H, W)\) where each value is \(0 ≤ targets[i] ≤ C−1\).
Examples
>>> import torch
>>> from gammalearn.criterions import FocalLoss
>>> N = 5  # num_classes
>>> args = {"alpha": 0.5, "gamma": 2.0, "reduction": 'mean'}
>>> loss = FocalLoss(**args)  # unpack the dict as keyword arguments
>>> x = torch.randn(1, N, 3, 5, requires_grad=True)
>>> target = torch.empty(1, 3, 5, dtype=torch.long).random_(N)
>>> output = loss(x, target)
>>> output.backward()
References
[1] https://arxiv.org/abs/1708.02002
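To make the formula concrete, the sketch below evaluates the focal term for a single probability in plain Python. It only illustrates the equation and is independent of the tensor implementation; the function name is hypothetical.

```python
import math


def focal_term(p_t, alpha=0.5, gamma=2.0):
    """FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t) for one probability."""
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# A well-classified example (p_t = 0.9) contributes far less than a
# poorly classified one (p_t = 0.1), which is the point of the focusing term.
easy = focal_term(0.9)
hard = focal_term(0.1)
print(easy, hard)
```

Setting gamma to 0 removes the focusing factor and leaves an alpha-weighted cross-entropy term.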
- forward(x, target)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class gammalearn.criterions.GaussianKernel(alpha: torch.float32)[source]
Bases:
Module
Gaussian kernel matrix. This implementation is inspired from https://github.com/thuml/Transfer-Learning-Library/blob/0fdc06ca87c71fbf784d58e7388cf03a3f13bf00/tllib/modules/kernels.py
- Parameters:
- alpha: (float) magnitude of the variance of the Gaussian
- forward(x: Tensor, y: Tensor) → Tensor [source]
- Parameters:
- x: (torch.Tensor) the first input feature vector of size (batch_size, feature_size).
- y: (torch.Tensor) the second input feature vector of size (batch_size, feature_size).
- Returns:
- The kernel value of size (batch_size, batch_size).
- training: bool
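The shape contract above (two batches in, a batch_size by batch_size matrix out) can be sketched with a textbook Gaussian kernel. How `alpha` is turned into a variance is not specified in this section, so the sketch takes the variance directly as a hypothetical `sigma_square` argument; it is not the library's code.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def gaussian_kernel_matrix(x, y, sigma_square):
    """Return K with K[i][j] = exp(-||x_i - y_j||^2 / (2 * sigma_square))."""
    return [
        [math.exp(-squared_distance(xi, yj) / (2.0 * sigma_square)) for yj in y]
        for xi in x
    ]


x = [[0.0, 0.0], [1.0, 0.0]]  # batch_size = 2, feature_size = 2
kernel = gaussian_kernel_matrix(x, x, sigma_square=1.0)
print(kernel)
```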
- class gammalearn.criterions.GradNorm(targets: Dict[str, Dict], alpha: float = 1.0, layer: Optional[Module] = None, requires_gradients: bool = True)[source]
Bases:
MultiLossBalancing
From the article GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks (https://arxiv.org/abs/1711.02257). The method consists in computing the gradients of the loss with respect to the shared weights and then computing the norm of these gradients. The loss weights are then updated according to the norm of the gradients. Inspired from https://github.com/NVIDIA/modulus-sym/blob/main/modulus/sym/loss/aggregator.py#L111.
- training: bool
- class gammalearn.criterions.GradientToolBox(targets: Dict[str, Dict], layer: Optional[str] = None)[source]
Bases:
object
This class gathers some functions to calculate the gradients on a specified set of weights. Inspired from https://github.com/median-research-group/LibMTL/blob/main/LibMTL/weighting/abstract_weighting.py
- class gammalearn.criterions.LossComputing(targets, conditional=False, gamma_class=None, path_distrib_weights: Optional[str] = None)[source]
Bases:
object
- class gammalearn.criterions.MKMMDLoss(kernels: Optional[List[GaussianKernel]] = None)[source]
Bases:
Module
Implementation of the Multiple Kernel Maximum Mean Discrepancy (MK-MMD) loss. This implementation is inspired from https://github.com/thuml/Transfer-Learning-Library/blob/0fdc06ca87c71fbf784d58e7388cf03a3f13bf00/tllib/alignment/dan.py
- Parameters:
- kernels: (list(GaussianKernel)) The list of kernels to apply. Currently, only Gaussian kernels are implemented. If kernels is None, then it is instantiated as GaussianKernel(alpha=2**k) for k in range(-3, 2).
- forward(xs: Tensor, xt: Tensor) → Tensor [source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
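The default kernel set described for `kernels=None` corresponds to five alpha values. The short snippet below just expands the `2**k for k in range(-3, 2)` expression from the parameter description; the variable name is illustrative.

```python
# Alpha values used when no kernel list is given, per the description above.
default_alphas = [2 ** k for k in range(-3, 2)]
print(default_alphas)  # [0.125, 0.25, 0.5, 1, 2]
```

Each value would be passed as `alpha` to one GaussianKernel, so the default MK-MMD combines five bandwidths.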
- class gammalearn.criterions.ManualWeighting(targets: Dict[str, Dict], requires_gradients: bool = False, layer: Optional[str] = None)[source]
Bases:
MultiLossBalancing
Manual weighting of the loss. These hyperparameters must be defined in the targets dictionary of the experiment setting file.
- training: bool
- class gammalearn.criterions.MovingAverageMetric(window_size: int = 10)[source]
Bases:
object
Compute the moving average of a metric.
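A moving average over a fixed window can be sketched with a bounded deque. The class and method names below are hypothetical and only illustrate the idea behind `window_size`; they are not the interface of MovingAverageMetric.

```python
from collections import deque


class MovingAverage:
    """Keep the last `window_size` values and report their mean."""

    def __init__(self, window_size=10):
        # A deque with maxlen silently drops the oldest value when full.
        self.values = deque(maxlen=window_size)

    def update(self, value):
        self.values.append(value)

    def compute(self):
        # Returns None until at least one value has been recorded.
        return sum(self.values) / len(self.values) if self.values else None


metric = MovingAverage(window_size=3)
for loss_value in (1.0, 2.0, 3.0, 4.0):
    metric.update(loss_value)
print(metric.compute())  # mean of the last three values: 2.0, 3.0, 4.0
```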
- class gammalearn.criterions.MultiLossBalancing(targets: Dict[str, Dict], balancing: bool = True, requires_gradients: bool = False, layer: Optional[str] = None)[source]
Bases:
Module
Generic function for loss balancing.
- Parameters:
- targets: (dict) The loss dictionary defining for every objective of the experiment the loss function
- forward(loss: Dict[str, Tensor], module: Optional[LightningModule] = None) → dict [source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class gammalearn.criterions.OutOfBalancing(targets: Dict[str, Dict], requires_gradients: bool = False, layer: Optional[str] = None)[source]
Bases:
MultiLossBalancing
Manual weighting of the loss when mt_balancing is set to False. These hyperparameters must be defined in the targets dictionary of the experiment setting file.
- training: bool
- class gammalearn.criterions.RandomLossWeighting(targets: Dict[str, Dict], requires_gradients: bool = False, layer: Optional[str] = None)[source]
Bases:
MultiLossBalancing
From the article Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning ( https://arxiv.org/abs/2111.10603). The method consists in assigning a random weight to each task drawn from a normal distribution. The random weight is recomputed at each iteration. Implementation inspired from https://github.com/median-research-group/LibMTL/blob/main/LibMTL/weighting/RLW.py
- training: bool
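The random weighting idea can be sketched in a few lines: draw one normal sample per task at every iteration and turn the samples into weights. Normalising with a softmax so that the weights are positive and sum to 1 is an assumption of this sketch, as are the function and argument names.

```python
import math
import random


def random_loss_weights(num_tasks, rng=random):
    """Draw one N(0, 1) sample per task and normalise them with a softmax."""
    samples = [rng.gauss(0.0, 1.0) for _ in range(num_tasks)]
    shift = max(samples)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in samples]
    total = sum(exps)
    return [e / total for e in exps]


# New weights would be drawn at every training iteration.
weights = random_loss_weights(3, random.Random(0))
print(weights)
```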
- class gammalearn.criterions.UncertaintyWeighting(targets: Dict[str, Dict], log_var_coefficients: Optional[list] = None, penalty: int = 0, requires_gradients: bool = False, layer: Optional[str] = None)[source]
Bases:
MultiLossBalancing
Create the function to compute the loss in case of multi regression experiment with homoscedastic uncertainty loss balancing. See the paper https://arxiv.org/abs/1705.07115. In the paper the total loss is defined as:
\[\text{L}(W,\sigma_1,\sigma_2,...,\sigma_i) = \sum_i \frac{1}{2\sigma_i^2} \text{L}_i + \log\sigma_i^2\]
but in https://github.com/yaringal/multi-task-learning-example/blob/master/multi-task-learning-example.ipynb as:
\[\text{L}(W,\sigma_1,\sigma_2,...,\sigma_i) = \sum_i \frac{1}{\sigma_i^2} \text{L}_i + \log\sigma_i^2 - 1\]
which should not make a big difference. However, we introduce log_var_coefficients and penalty to let the user choose:
\[\text{L} = \sum_i \frac{1}{\text{log\_var\_coefficients} \, \sigma_i^2} \text{L}_i + \log\sigma_i^2 - \text{penalty}\]
- Parameters:
- targets (dict): The loss dictionary defining for every objective of the experiment the loss function and its initial log_var
- Returns:
- The function to compute the loss
- training: bool
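The generalised formula with log_var_coefficients and penalty can be evaluated by hand as follows. This sketch assumes that each task stores s_i = log σ_i² (so that 1/σ_i² = exp(-s_i)), that there is one coefficient per task, and that the penalty is subtracted once per task; none of these details is confirmed by the text above, and the function name is hypothetical.

```python
import math


def uncertainty_weighted_loss(losses, log_vars, log_var_coefficients, penalty=0):
    """Sum over tasks of exp(-s_i) / c_i * L_i + s_i - penalty.

    s_i is the log variance of task i and c_i its coefficient
    (both per-task lists, an assumption of this sketch).
    """
    total = 0.0
    for loss, log_var, coeff in zip(losses, log_vars, log_var_coefficients):
        precision = math.exp(-log_var)  # equals 1 / sigma_i^2
        total += precision / coeff * loss + log_var - penalty
    return total


# Coefficients of 2 and no penalty reproduce the paper's 1 / (2 sigma^2) form.
paper_style = uncertainty_weighted_loss([2.0, 4.0], [0.0, 0.0], [2, 2], penalty=0)
# Coefficients of 1 and a penalty of 1 reproduce the notebook's form.
notebook_style = uncertainty_weighted_loss([2.0, 4.0], [0.0, 0.0], [1, 1], penalty=1)
print(paper_style, notebook_style)
```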
- gammalearn.criterions.focal_loss(x, target, gamma=2.0, reduction='none')[source]
Function that computes Focal loss. See FocalLoss for details.
- gammalearn.criterions.one_hot(labels, num_classes, device=None, dtype=None, eps=1e-06)[source]
Converts an integer label 2D tensor to a one-hot 3D tensor.
- Parameters:
labels – tensor with labels of shape \((N, H, W)\), where N is batch size. Each value is an integer representing correct classification.
num_classes (int) – number of classes in labels.
device (Optional[torch.device]) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
dtype (Optional[torch.dtype]) – the desired data type of returned tensor. Default: if None, infers data type from values.
eps –
- Returns:
the labels in one hot tensor.
- Return type:
torch.Tensor
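The encoding itself can be illustrated without tensors. The sketch below handles a single (H, W) label grid and returns a (C, H, W) nested list; it ignores the `eps`, `device` and `dtype` arguments, and the function name is hypothetical.

```python
def one_hot_image(labels, num_classes):
    """Turn an (H, W) grid of integer labels into a (C, H, W) one-hot grid."""
    return [
        [[1.0 if value == c else 0.0 for value in row] for row in labels]
        for c in range(num_classes)
    ]


labels = [[0, 2], [1, 1]]
encoded = one_hot_image(labels, num_classes=3)
print(encoded)
```

Channel c of the result is 1 exactly where the label grid equals c.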
gammalearn.data_handlers module
- class gammalearn.data_handlers.BaseDataModule(experiment)[source]
Bases:
LightningDataModule
Create datasets and dataloaders.
- Parameters:
- experiment: (Experiment) the experiment
- setup(stage=None)[source]
In the case where the train and the test data modules are different, two setup functions are defined in order to avoid loading data twice.
- setup_test()[source]
This function is used if test is set to True in experiment setting file. If no data module test is provided, test is completed on the validation set. If neither a data module test nor a validation set is provided, an error will be raised.
- train_dataloader()[source]
Implement one or more PyTorch DataLoaders for training.
- Returns:
A collection of torch.utils.data.DataLoader specifying training samples. In the case of multiple dataloaders, please see the corresponding section of the PyTorch Lightning documentation.
The dataloader you return will not be reloaded unless you set the Trainer argument reload_dataloaders_every_n_epochs to a positive integer.
For data processing use the following pattern:
download in prepare_data()
process and split in setup()
However, the above are only necessary for distributed processing.
Warning
do not assign state in prepare_data
fit()
prepare_data()
Note
Lightning adds the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.
Example:
    # single dataloader
    def train_dataloader(self):
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize((0.5,), (1.0,))])
        dataset = MNIST(root='/path/to/mnist/', train=True, transform=transform,
                        download=True)
        loader = torch.utils.data.DataLoader(
            dataset=dataset,
            batch_size=self.batch_size,
            shuffle=True
        )
        return loader

    # multiple dataloaders, return as list
    def train_dataloader(self):
        mnist = MNIST(...)
        cifar = CIFAR(...)
        mnist_loader = torch.utils.data.DataLoader(
            dataset=mnist, batch_size=self.batch_size, shuffle=True
        )
        cifar_loader = torch.utils.data.DataLoader(
            dataset=cifar, batch_size=self.batch_size, shuffle=True
        )
        # each batch will be a list of tensors: [batch_mnist, batch_cifar]
        return [mnist_loader, cifar_loader]

    # multiple dataloader, return as dict
    def train_dataloader(self):
        mnist = MNIST(...)
        cifar = CIFAR(...)
        mnist_loader = torch.utils.data.DataLoader(
            dataset=mnist, batch_size=self.batch_size, shuffle=True
        )
        cifar_loader = torch.utils.data.DataLoader(
            dataset=cifar, batch_size=self.batch_size, shuffle=True
        )
        # each batch will be a dict of tensors: {'mnist': batch_mnist, 'cifar': batch_cifar}
        return {'mnist': mnist_loader, 'cifar': cifar_loader}
- val_dataloader()[source]
Implement one or multiple PyTorch DataLoaders for validation.
The dataloader you return will not be reloaded unless you set the Trainer argument reload_dataloaders_every_n_epochs to a positive integer.
It’s recommended that all data downloads and preparation happen in prepare_data().
fit()
validate()
prepare_data()
Note
Lightning adds the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.
- Returns:
A torch.utils.data.DataLoader or a sequence of them specifying validation samples.
Examples:
    def val_dataloader(self):
        transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Normalize((0.5,), (1.0,))])
        dataset = MNIST(root='/path/to/mnist/', train=False,
                        transform=transform, download=True)
        loader = torch.utils.data.DataLoader(
            dataset=dataset,
            batch_size=self.batch_size,
            shuffle=False
        )
        return loader

    # can also return multiple dataloaders
    def val_dataloader(self):
        return [loader_a, loader_b, ..., loader_n]
Note
If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.
Note
In the case where you return multiple validation dataloaders, the validation_step() will have an argument dataloader_idx which matches the order here.
- class gammalearn.data_handlers.GLearnDataModule(experiment)[source]
Bases:
BaseDataModule
- class gammalearn.data_handlers.GLearnDomainAdaptationDataModule(experiment)[source]
Bases:
GLearnDataModule
- class gammalearn.data_handlers.VisionDataModule(experiment)[source]
Bases:
BaseDataModule
Create datasets and dataloaders.
- Parameters:
- experiment: (Experiment) the experiment
- class gammalearn.data_handlers.VisionDomainAdaptationDataModule(experiment)[source]
Bases:
VisionDataModule
Create datasets and dataloaders.
- Parameters:
- experiment: (Experiment) the experiment
- gammalearn.data_handlers.create_datasets(datafiles_list, experiment, train=True, **kwargs)[source]
Create datasets from a list of data files; data are loaded in memory.
- Parameters:
- datafiles_list: (List) files to load data from
- experiment: (Experiment) the experiment
- Returns:
- Datasets
- gammalearn.data_handlers.split_dataset(datasets, ratio)[source]
Split a list of datasets into a train and a validation set.
- Parameters:
- datasets: (list of Dataset) the list of datasets
- ratio: (float) the ratio of data for validation
- Returns:
- train set, validation set
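The role of `ratio` can be illustrated on plain indices. The sketch below shuffles the sample indices of one dataset and reserves a `ratio` fraction for validation; the shuffling, the seed and the function name are assumptions of the example, not a description of how split_dataset is implemented.

```python
import random


def split_indices(num_samples, ratio, seed=0):
    """Return (train_indices, validation_indices) for one dataset."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_val = int(ratio * num_samples)      # size of the validation part
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(10, ratio=0.2)
print(len(train_idx), len(val_idx))
```

Applying the same procedure to every dataset in the list and concatenating the parts would give one train set and one validation set.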
gammalearn.datasets module
- class gammalearn.datasets.BaseLSTDataset(hdf5_file_path, camera_type, group_by, targets=None, particle_dict=None, use_time=False, train=True, subarray=None, transform=None, target_transform=None, domain_dict=None, **kwargs)[source]
Bases:
Dataset
Camera simulation dataset for lstchain DL1 HDF5 files. Tested with lstchain 0.7.
- class gammalearn.datasets.CleanImages(new_channel=False, **opts)[source]
Bases:
TransformIACT
Cleaning transform.
- Parameters:
- new_channel: (bool) If True, adds the cleaning mask to the data as a new channel. If False, applies the cleaning mask to the data.
- class gammalearn.datasets.FileLSTDataset(hdf5_file_path, camera_type, group_by, targets=None, particle_dict=None, use_time=False, train=True, subarray=None, transform=None, target_transform=None, **kwargs)[source]
Bases:
BaseLSTDataset
- class gammalearn.datasets.GLearnCompose(transforms)[source]
Bases:
Compose
Custom transform Compose that can receive extra parameters when called.
- class gammalearn.datasets.HDF5Dataset(path, camera_type, transform=None, target_transform=None, telescope_transform=None)[source]
Bases:
Dataset
Loads data in a Dataset from a HDF5 file.
- Parameters:
path (str) – The path to the HDF5 file.
transform (callable, optional) – A callable or a composition of callable to be applied to the data.
target_transform (callable, optional) – A callable or a composition of callable to be applied to the labels.
- class gammalearn.datasets.MemoryLSTDataset(hdf5_file_path, camera_type, group_by, targets=None, particle_dict=None, use_time=False, train=True, subarray=None, transform=None, target_transform=None, **kwargs)[source]
Bases:
BaseLSTDataset
- class gammalearn.datasets.MockLSTDataset(hdf5_file_path, camera_type, group_by, targets=None, particle_dict=None, use_time=False, train=True, subarray=None, transform=None, target_transform=None, **kwargs)[source]
Bases:
Dataset
- class gammalearn.datasets.NumpyDataset(data, labels, transform=None, target_transform=None)[source]
Bases:
Dataset
- class gammalearn.datasets.Patchify(n_h, n_w, n_c=0)[source]
Bases:
object
Transform an image into patches.
- class gammalearn.datasets.ResampleImage(mapping, output_size)[source]
Bases:
TransformIACT
Resample a hexagonal image with the DL1 Data Handler Image Mapper. The class needs to be instantiated first to be passed to a Dataset class, but it can be set up later, when the camera information is available to the user.
- class gammalearn.datasets.RotateImage(rotated_indices)[source]
Bases:
object
Rotate telescope image based on rotated indices
- class gammalearn.datasets.VisionDataset(paths, dataset_parameters, transform=None, target_transform=None, train=True, domain=None, max_files=-1, num_workers=10)[source]
Bases:
Dataset
The Dataset class corresponding to the following vision datasets:
digits: mnist, mnistm, usps
visda: synthetic, real
- Parameters:
- paths: (list) Paths to the datasets.
- dataset_parameters: (dict) The corresponding parameters to the datasets. These parameters will be applied to all datasets.
- transform: The transforms applied to the images.
- target_transform: The transforms applied to the targets.
- train: (bool) If train is ‘True’, returns the train datasets, otherwise returns the test datasets.
- domain: (str) In the domain adaptation context, gives an extra domain label ‘source’ or ‘target’. Otherwise, set to ‘None’ (default).
- max_files: (int) The number of images to load in each dataset. If set to ‘-1’ (default), all images and targets will be loaded.
- class gammalearn.datasets.VisionDomainAdaptationDataset(source_dataset, target_dataset)[source]
Bases:
Dataset
- gammalearn.datasets.augment_via_duplication(datasets, scale, num_workers)[source]
Augment data by duplicating events based on the inverse detected energies distribution.
- Parameters:
- datasets: (list) list of Subsets
- scale: (float) the scale to constrain the maximum duplication factor
- num_workers: (int) number of processes to use
gammalearn.experiment_runner module
- class gammalearn.experiment_runner.Experiment(settings)[source]
Bases:
object
Loads the settings of the experiment from the settings object, checks them, and defines default values for the unspecified ones.
- class gammalearn.experiment_runner.LitGLearnModule(experiment)[source]
Bases:
LightningModule
- configure_optimizers()[source]
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple.
- Returns:
Any of these 6 options.
Single optimizer.
List or Tuple of optimizers.
Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
Tuple of dictionaries as described above, with an optional "frequency" key.
None - Fit will run without any optimizer.
The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.
    lr_scheduler_config = {
        # REQUIRED: The scheduler instance
        "scheduler": lr_scheduler,
        # The unit of the scheduler's step size, could also be 'step'.
        # 'epoch' updates the scheduler on epoch end whereas 'step'
        # updates it after an optimizer update.
        "interval": "epoch",
        # How many epochs/steps should pass between calls to
        # `scheduler.step()`. 1 corresponds to updating the learning
        # rate after every epoch/step.
        "frequency": 1,
        # Metric to monitor for schedulers like `ReduceLROnPlateau`
        "monitor": "val_loss",
        # If set to `True`, will enforce that the value specified 'monitor'
        # is available when the scheduler is updated, thus stopping
        # training if not found. If set to `False`, it will only produce a warning
        "strict": True,
        # If using the `LearningRateMonitor` callback to monitor the
        # learning rate progress, this keyword can be used to specify
        # a custom logged name
        "name": None,
    }
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
Note
The frequency value specified in a dict along with the optimizer key is an int corresponding to the number of sequential batches optimized with the specific optimizer. It should be given to none or to all of the optimizers. There is a difference between passing multiple optimizers in a list, and passing multiple optimizers in dictionaries with a frequency of 1:
In the former case, all optimizers will operate on the given batch in each optimization step.
In the latter, only one optimizer will operate on the given batch at every step.
This is different from the frequency value specified in the lr_scheduler_config mentioned above.
    def configure_optimizers(self):
        optimizer_one = torch.optim.SGD(self.model.parameters(), lr=0.01)
        optimizer_two = torch.optim.SGD(self.model.parameters(), lr=0.01)
        return [
            {"optimizer": optimizer_one, "frequency": 5},
            {"optimizer": optimizer_two, "frequency": 10},
        ]
In this example, the first optimizer will be used for the first 5 steps, the second optimizer for the next 10 steps and that cycle will continue. If an LR scheduler is specified for an optimizer using the lr_scheduler key in the above dict, the scheduler will only be updated when its optimizer is being used.
Examples:
    # most cases. no learning rate scheduler
    def configure_optimizers(self):
        return Adam(self.parameters(), lr=1e-3)

    # multiple optimizer case (e.g.: GAN)
    def configure_optimizers(self):
        gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
        dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
        return gen_opt, dis_opt

    # example with learning rate schedulers
    def configure_optimizers(self):
        gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
        dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
        dis_sch = CosineAnnealingLR(dis_opt, T_max=10)
        return [gen_opt, dis_opt], [dis_sch]

    # example with step-based learning rate schedulers
    # each optimizer has its own scheduler
    def configure_optimizers(self):
        gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
        dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
        gen_sch = {
            'scheduler': ExponentialLR(gen_opt, 0.99),
            'interval': 'step'  # called after each training step
        }
        dis_sch = CosineAnnealingLR(dis_opt, T_max=10)  # called every epoch
        return [gen_opt, dis_opt], [gen_sch, dis_sch]

    # example with optimizer frequencies
    # see training procedure in `Improved Training of Wasserstein GANs`, Algorithm 1
    # https://arxiv.org/abs/1704.00028
    def configure_optimizers(self):
        gen_opt = Adam(self.model_gen.parameters(), lr=0.01)
        dis_opt = Adam(self.model_dis.parameters(), lr=0.02)
        n_critic = 5
        return (
            {'optimizer': dis_opt, 'frequency': n_critic},
            {'optimizer': gen_opt, 'frequency': 1}
        )
Note
Some things to know:
- Lightning calls .backward() and .step() on each optimizer as needed.
- If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler's .step() method automatically in case of automatic optimization.
- If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizers.
- If you use multiple optimizers, training_step() will have an additional optimizer_idx parameter.
- If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
- If you use multiple optimizers, gradients will be calculated only for the parameters of the current optimizer at each training step.
- If you need to control how often those optimizers step or override the default .step() schedule, override the optimizer_step() hook.
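The frequency cycle described for configure_optimizers() can be illustrated without any training framework. The following is a minimal sketch in plain Python; the helper name `active_optimizer` is hypothetical and is not part of Lightning or GammaLearn. It only shows which optimizer index would be active at a given step when the frequencies are 5 and 10.

```python
from bisect import bisect_right
from itertools import accumulate


def active_optimizer(step, frequencies):
    """Return the index of the optimizer in use at a given global step.

    Hypothetical helper: it mimics the cycling rule only, not Lightning itself.
    """
    boundaries = list(accumulate(frequencies))  # [5, 10] -> [5, 15]
    position = step % boundaries[-1]            # position inside one full cycle
    return bisect_right(boundaries, position)


# One full cycle (15 steps) plus the first step of the next cycle.
schedule = [active_optimizer(step, [5, 10]) for step in range(16)]
```

With frequencies 5 and 10, steps 0 to 4 map to the first optimizer, steps 5 to 14 to the second, and step 15 starts the cycle again.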
- forward(x)[source]
Same as torch.nn.Module.forward().
- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- log_from_console_logger(output: dict, labels: dict, loss: dict, loss_data: dict, batch_idx: int) None [source]
- test_step(batch, batch_idx)[source]
Operates on a single batch of data from the test set. In this step you’d normally generate examples or calculate anything of interest such as accuracy.
# the pseudocode for these calls
test_outs = []
for test_batch in test_data:
    out = test_step(test_batch)
    test_outs.append(out)
test_epoch_end(test_outs)
- Parameters:
batch – The output of your DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch (only if multiple test dataloaders are used).
- Returns:
Any of:
- Any object or value
- None - Testing will skip to the next batch

# if you have one test dataloader:
def test_step(self, batch, batch_idx):
    ...

# if you have multiple test dataloaders:
def test_step(self, batch, batch_idx, dataloader_idx=0):
    ...
Examples:
# CASE 1: A single test dataset
def test_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    test_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'test_loss': loss, 'test_acc': test_acc})
If you pass in multiple test dataloaders, test_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.

# CASE 2: multiple test dataloaders
def test_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to test you don’t need to implement this method.
Note
When test_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of the test epoch, the model goes back to training mode and gradients are enabled.
- training_epoch_end(outputs)[source]
Called at the end of the training epoch with the outputs of all training steps. Use this in case you need to do something with all the outputs returned by training_step().

# the pseudocode for these calls
train_outs = []
for train_batch in train_data:
    out = training_step(train_batch)
    train_outs.append(out)
training_epoch_end(train_outs)
- Parameters:
outputs – List of outputs you defined in training_step(). If there are multiple optimizers or when using truncated_bptt_steps > 0, the lists have the dimensions (n_batches, tbptt_steps, n_optimizers). Dimensions of length 1 are squeezed.
- Returns:
None
Note
If this method is not overridden, this won’t be called.
def training_epoch_end(self, training_step_outputs):
    # do something with all training_step outputs
    for out in training_step_outputs:
        ...
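The call order in the pseudocode above can be made concrete with a framework-free sketch. Everything here is illustrative: the toy `training_step`, the `logged` dictionary and the data are assumptions standing in for a real LightningModule and its logger. The hook returns None, as documented, and stores the epoch-level value as a side effect.

```python
from statistics import mean

logged = {}  # hypothetical stand-in for a logger


def training_step(batch):
    # Toy step: mean squared error between inputs and targets.
    inputs, targets = batch
    loss = mean((x - y) ** 2 for x, y in zip(inputs, targets))
    return {"loss": loss}


def training_epoch_end(outputs):
    # Aggregate the per-batch losses collected during the epoch.
    logged["train_loss_epoch"] = mean(out["loss"] for out in outputs)


train_data = [([1.0, 2.0], [1.0, 4.0]), ([0.0, 0.0], [2.0, 0.0])]
train_outs = [training_step(batch) for batch in train_data]
result = training_epoch_end(train_outs)
```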
- training_step(batch, batch_idx)[source]
Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
- Parameters:
batch (Tensor | (Tensor, …) | [Tensor, …]) – The output of your DataLoader. A tensor, tuple or list.
batch_idx (int) – Integer displaying index of this batch
optimizer_idx (int) – When using multiple optimizers, this argument will also be present.
hiddens (Any) – Passed in if truncated_bptt_steps > 0.
- Returns:
Any of:
- Tensor - The loss tensor
- dict - A dictionary. Can include any keys, but must include the key 'loss'
- None - Training will skip to the next batch. This is only for automatic optimization. This is not supported for multi-GPU, TPU, IPU, or DeepSpeed.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
If you define multiple optimizers, this step will be called with an additional optimizer_idx parameter.

# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx, optimizer_idx):
    if optimizer_idx == 0:
        # do training_step with encoder
        ...
    if optimizer_idx == 1:
        # do training_step with decoder
        ...
If you add truncated back propagation through time you will also get an additional argument with the hidden states of the previous step.

# Truncated back-propagation through time
def training_step(self, batch, batch_idx, hiddens):
    # hiddens are the hidden states from the previous truncated backprop step
    out, hiddens = self.lstm(batch, hiddens)
    loss = ...
    return {"loss": loss, "hiddens": hiddens}
Note
The loss value shown in the progress bar is smoothed (averaged) over the last values, so it differs from the actual loss returned in train/validation step.
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
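To see why dividing the loss by accumulate_grad_batches matters, the sketch below accumulates scaled gradients of a toy squared-error loss and compares the result with the mean gradient over the same batches. The `grad` function, the data and the weight value are made up for illustration; this is not Lightning's internal code.

```python
def grad(w, x, y):
    # Derivative with respect to w of the squared error (w * x - y) ** 2.
    return 2.0 * (w * x - y) * x


accumulate_grad_batches = 4
batches = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 6.0)]
w = 0.5

# Each batch contributes a gradient scaled by 1 / accumulate_grad_batches.
accumulated = 0.0
for x, y in batches:
    accumulated += grad(w, x, y) / accumulate_grad_batches

# Reference: the plain mean of the per-batch gradients.
mean_grad = sum(grad(w, x, y) for x, y in batches) / len(batches)
```

The accumulated value matches the mean gradient, so the effective step does not grow with the number of accumulated batches.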
- training_step_end(batch_parts)[source]
Use this when training with dp because training_step() will operate on only part of the batch. However, this is still optional and only needed for things like softmax or NCE loss.
Note
If you later switch to ddp or some other mode, this will still be called so that you don’t have to change your code.

# pseudocode
sub_batches = split_batches_for_dp(batch)
step_output = [training_step(sub_batch) for sub_batch in sub_batches]
training_step_end(step_output)
- Parameters:
step_output – What you return in training_step for each batch part.
- Returns:
Anything
When using the DP strategy, only a portion of the batch is inside the training_step:
def training_step(self, batch, batch_idx):
    # batch is 1/num_gpus big
    x, y = batch

    out = self(x)

    # softmax uses only a portion of the batch in the denominator
    loss = self.softmax(out)
    loss = nce_loss(loss)
    return loss
If you wish to do something with all the parts of the batch, then use this method to do it:
def training_step(self, batch, batch_idx):
    # batch is 1/num_gpus big
    x, y = batch

    out = self.encoder(x)
    return {"pred": out}

def training_step_end(self, training_step_outputs):
    gpu_0_pred = training_step_outputs[0]["pred"]
    gpu_1_pred = training_step_outputs[1]["pred"]
    gpu_n_pred = training_step_outputs[n]["pred"]

    # this softmax now uses the full batch
    loss = nce_loss([gpu_0_pred, gpu_1_pred, gpu_n_pred])
    return loss
See also
See the Multi GPU Training guide for more details.
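The reason a batch-wide softmax needs training_step_end() under DP can be shown with plain Python lists. This sketch assumes two devices and a four-element batch; the `softmax` helper is written here for illustration and is not a GammaLearn or Lightning function.

```python
import math


def softmax(values):
    # Numerically stable softmax over a list of floats.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


full_batch = [1.0, 2.0, 3.0, 4.0]
sub_batches = [full_batch[:2], full_batch[2:]]  # what each of two devices sees

# Softmax computed independently on each device: each part sums to 1.
per_device = [p for part in sub_batches for p in softmax(part)]

# Softmax computed after gathering all parts: the whole batch sums to 1.
gathered = softmax([v for part in sub_batches for v in part])
```

The per-device probabilities sum to 2 in total, while the gathered ones sum to 1, which is why the normalization must happen after the parts are collected.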
- validation_epoch_end(*args, **kwargs)[source]
Called at the end of the validation epoch with the outputs of all validation steps.
# the pseudocode for these calls
val_outs = []
for val_batch in val_data:
    out = validation_step(val_batch)
    val_outs.append(out)
validation_epoch_end(val_outs)
- Parameters:
outputs – List of outputs you defined in validation_step(), or if there are multiple dataloaders, a list containing a list of outputs for each dataloader.
- Returns:
None
Note
If you didn’t define a validation_step(), this won’t be called.
Examples
With a single dataloader:

def validation_epoch_end(self, val_step_outputs):
    for out in val_step_outputs:
        ...
With multiple dataloaders, outputs will be a list of lists. The outer list contains one entry per dataloader, while the inner list contains the individual outputs of each validation step for that dataloader.

def validation_epoch_end(self, outputs):
    for dataloader_output_result in outputs:
        dataloader_outs = dataloader_output_result.dataloader_i_outputs

    self.log("final_metric", final_value)
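The list-of-lists layout used with multiple dataloaders can be illustrated with hand-written outputs. The values, the `summarize` helper and the metric names below are hypothetical; the sketch only shows how one aggregate per dataloader might be derived from the nested structure.

```python
from statistics import mean

# outputs[i][j] is the result of validation step j on dataloader i.
outputs = [
    [{"val_loss": 0.5}, {"val_loss": 0.25}],                    # dataloader 0
    [{"val_loss": 1.0}, {"val_loss": 2.0}, {"val_loss": 3.0}],  # dataloader 1
]


def summarize(outputs):
    # One mean loss per dataloader, keyed by the dataloader index.
    return {
        f"val_loss/dataloader_{idx}": mean(out["val_loss"] for out in outs)
        for idx, outs in enumerate(outputs)
    }


summary = summarize(outputs)
```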
- validation_step(batch, batch_idx)[source]
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest like accuracy.

# the pseudocode for these calls
val_outs = []
for val_batch in val_data:
    out = validation_step(val_batch)
    val_outs.append(out)
validation_epoch_end(val_outs)
- Parameters:
batch – The output of your DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch (only if multiple val dataloaders are used).
- Returns:
- Any object or value
- None - Validation will skip to the next batch

# pseudocode of order
val_outs = []
for val_batch in val_data:
    out = validation_step(val_batch)
    if defined("validation_step_end"):
        out = validation_step_end(out)
    val_outs.append(out)
val_outs = validation_epoch_end(val_outs)

# if you have one val dataloader:
def validation_step(self, batch, batch_idx):
    ...

# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.

# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
gammalearn.gl_dl1_to_dl2 module
- gammalearn.gl_dl1_to_dl2.build_argparser()[source]
Construct main argument parser for the gl_dl1_to_dl2 script
- Returns:
argparser: argparse.ArgumentParser
- gammalearn.gl_dl1_to_dl2.create_dataset(file_queue: Queue, dl1_queue: Queue, dataset_class: BaseLSTDataset, dataset_parameters: dict) None [source]
Create the datasets and fill the corresponding queue.
- Parameters:
- file_queue: (Queue) The queue containing the file names of the dl1 folder.
- dl1_queue: (Queue) The queue containing the datasets.
- dataset_class: (gammalearn.datasets.BaseLSTDataset) the dataset class as specified in the experiment settings file.
- dataset_parameters: (dict) The dataset parameters as specified in the experiment settings file.
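The producer pattern described for create_dataset can be sketched with the standard library alone. This is an assumption-laden illustration, not the GammaLearn implementation: `ToyDataset`, `create_dataset_sketch`, the file names and the parameter dictionary are all hypothetical, and a plain `queue.Queue` is used where the real script may rely on a multiprocessing queue.

```python
from queue import Queue


class ToyDataset:
    """Hypothetical stand-in for a dataset class such as a BaseLSTDataset subclass."""

    def __init__(self, file_name, **parameters):
        self.file_name = file_name
        self.parameters = parameters


def create_dataset_sketch(file_queue, dl1_queue, dataset_class, dataset_parameters):
    # Drain the file queue and push one dataset per file onto the output queue.
    while not file_queue.empty():
        file_name = file_queue.get()
        dl1_queue.put(dataset_class(file_name, **dataset_parameters))


file_queue, dl1_queue = Queue(), Queue()
for name in ("run1.h5", "run2.h5"):
    file_queue.put(name)

create_dataset_sketch(file_queue, dl1_queue, ToyDataset, {"camera_type": "LST_LSTCam"})
```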
gammalearn.logging module
gammalearn.optimizers module
- class gammalearn.optimizers.DANNLR(optimizer, domain_classifier=False)[source]
Bases:
_LRScheduler
- load_state_dict(state_dict) None [source]
Loads the scheduler's state.
- Parameters:
state_dict (dict) – scheduler state. Should be an object returned from a call to state_dict().
- gammalearn.optimizers.elastic(net)[source]
Elastic penalty (L1 + L2).
- Parameters:
- net (nn.Module): the network.
- Returns:
- the penalty
- gammalearn.optimizers.freeze(net, parameters)[source]
Freeze the network parameters
- Parameters:
- net (nn.Module): the network or the subnetwork (e.g. feature)
- parameters (dict): a dictionary describing the parameters of the optimizer
- Returns:
- the optimizer
- gammalearn.optimizers.get_layer_id_for_prime(name: str) int [source]
Retrieve GammaPhysNetPrime layer id from parameter name
- gammalearn.optimizers.l1(net)[source]
Simple L1 penalty.
- Parameters:
- net (nn.Module): the network.
- Returns:
- the penalty
- gammalearn.optimizers.l2(net)[source]
Simple L2 penalty.
- Parameters:
- net (nn.Module): the network.
- Returns:
- the penalty
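As a rough illustration of the l1, l2 and elastic penalties listed in this module, the sketch below computes them on nested lists of floats instead of an nn.Module. The function names and the choice of a sum of squares for L2 (rather than a norm with a square root) are assumptions; the actual GammaLearn functions may differ in scaling.

```python
def l1_penalty(weights):
    # Sum of absolute values of all weights.
    return sum(abs(w) for layer in weights for w in layer)


def l2_penalty(weights):
    # Sum of squared weights (assumed form, no square root).
    return sum(w * w for layer in weights for w in layer)


def elastic_penalty(weights):
    # Elastic penalty taken as the plain sum of the L1 and L2 terms.
    return l1_penalty(weights) + l2_penalty(weights)


weights = [[1.0, -2.0], [0.5]]  # two toy "layers"
penalties = (l1_penalty(weights), l2_penalty(weights), elastic_penalty(weights))
```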
- gammalearn.optimizers.load_adam(net, parameters)[source]
Load the Adam optimizer
- Parameters:
- net (nn.Module): the network of the experiment
- parameters (dict): a dictionary describing the parameters of the optimizer
- Returns:
- the optimizer
- gammalearn.optimizers.load_adam_w(net, parameters)[source]
Load the AdamW optimizer
- Parameters:
- net (nn.Module): the network of the experiment
- parameters (dict): a dictionary describing the parameters of the optimizer
- Returns:
- the optimizer
- gammalearn.optimizers.load_per_layer_sgd(net, parameters)[source]
Load the SGD optimizer with a different learning rate for each layer.
- Parameters:
- net (nn.Module): the network of the experiment
- parameters (dict): a dictionary describing the parameters of the optimizer
- Returns:
- the optimizer
- gammalearn.optimizers.load_rmsprop(net, parameters)[source]
Load the RMSprop optimizer
- Parameters:
- net (nn.Module): the network of the experiment
- parameters (dict): a dictionary describing the parameters of the optimizer
- Returns:
- the optimizer
- gammalearn.optimizers.load_sgd(net, parameters)[source]
Load the SGD optimizer
- Parameters:
- net (nn.Module): the network of the experiment
- parameters (dict): a dictionary describing the parameters of the optimizer
- Returns:
- the optimizer
- gammalearn.optimizers.prime_optimizer(net: Module, parameters: dict) Optimizer [source]
Load the optimizer for Masked AutoEncoder fine tuning (transformers)
- Parameters:
- net (nn.Module): the network of the experiment
- parameters (dict): a dictionary describing the parameters of the optimizer
- Returns:
- the optimizer
- gammalearn.optimizers.srip(net)[source]
Spectral Restricted Isometry Property (SRIP) regularization penalty. See https://arxiv.org/abs/1810.09102
- Parameters:
- net (nn.Module): the network.
- Returns:
- the penalty
gammalearn.steps module
- gammalearn.steps.run_model(module, batch, requires_latent_features=False, requires_pointing=False, forward_params=None, train=True)[source]
If the training is not in the context of domain adaptation, the information will be stored in xxx_source.
- Parameters:
- module: The current module.
- batch: (torch.tensor) The current batch of data.
- requires_latent_features: (bool) Creates a hook to get the latent space. The model must respect the following design rule and have either a ‘module.net.main_task_model.feature’ or a ‘module.net.feature’ component.
- requires_pointing: (bool) Whether to include the pointing information as the model’s argument.
- forward_params: (dict) Allows passing extra parameters to the model’s forward function. The model must respect the corresponding design rule of the model’s forward function.
- train: (bool) Whether the current step is a training or a test step.
gammalearn.utils module
- class gammalearn.utils.BaseW[source]
Bases:
object
This class is inspired by the PyTorch LRScheduler, which defines a learning rate scheduler. Analogously, the purpose of this class is to introduce a time-dependent loss/gradient weight. The module parameter is set within the ‘gammalearn.experiment_runner.Experiment’ class and provides the current and the max step information from the PyTorch Lightning Trainer that is involved in the training.
- Parameters:
- apply_on_grads (bool): Whether the weight must be applied on the gradients (e.g. for DANN).
- Returns:
- lambda_p (float): the step-dependent loss/gradient weight
- class gammalearn.utils.DistributionW(path: str)[source]
Bases:
object
- Parameters:
- path (str): the path to the csv file containing the distribution
- class gammalearn.utils.ExponentialW(gamma=10)[source]
Bases:
BaseW
Compute the exponential weight corresponding to the domain adaptation loss weight proposed in Domain-Adversarial Training of Neural Networks (DANN) (https://doi.org/10.48550/arxiv.1505.07818) by Y. Ganin. This class is particularly useful when applied to the DANN ‘grad_weight’ argument but may also be applied in other contexts. In more detail, in the experiment setting file and in the case of DANN, this class can be used as follows:

targets = collections.OrderedDict({
    'domain_class': {
        ...,
        'grad_weight': ExponentialW(gamma=10),
        ...,
    }
})
- Parameters:
- gamma (int): the exponential coefficient in exp(-gamma*p).
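For reference, the schedule from the DANN paper is lambda_p = 2 / (1 + exp(-gamma * p)) - 1, where p is the training progress between 0 and 1. The sketch below evaluates that formula in plain Python; the function name `dann_lambda` is hypothetical, and whether ExponentialW computes p and lambda_p in exactly this way is an assumption.

```python
import math


def dann_lambda(p, gamma=10.0):
    # DANN paper schedule: rises smoothly from 0 (p = 0) towards 1 (p = 1).
    return 2.0 / (1.0 + math.exp(-gamma * p)) - 1.0


# Weight at 11 evenly spaced progress values between 0 and 1.
weights = [dann_lambda(step / 10) for step in range(11)]
```

With gamma=10 the weight starts at 0 and saturates close to 1 well before the end of training, so the adversarial term is phased in gradually.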
- class gammalearn.utils.TrainerLogger[source]
Bases:
object
The TrainerLogger class is used to define the logger used by the Pytorch Lightning Trainer.
- class gammalearn.utils.TrainerTensorboardLogger[source]
Bases:
TrainerLogger
The TrainerTensorboardLogger is a wrapper around the TensorBoardLogger class. It is used to define the logger used by the Pytorch Lightning Trainer, based on Tensorboard.
- class gammalearn.utils.TrainerWandbLogger(offline: bool = False)[source]
Bases:
TrainerLogger
The TrainerWandbLogger is a wrapper around the WandbLogger class. It is used to define the logger used by the Pytorch Lightning Trainer, based on Weights and Biases. More info at https://docs.wandb.ai/guides/integrations/lightning.
- gammalearn.utils.add_tokens_to_pos_embed(pos_embed, additional_tokens, add_pointing, embed_dim)[source]
- gammalearn.utils.browse_folder(data_folder, extension=None)[source]
Browse the given folder to find hdf5 files
- Parameters:
- data_folder (string)
- extension (string)
- Returns:
- set of hdf5 files
- gammalearn.utils.center_time(dataset, **opts)[source]
Center pixel time based on the max intensity pixel
- Parameters:
- dataset: `Dataset`
- Returns:
- indices: numpy.array
- gammalearn.utils.check_patches(patch_indices, patch_centroids, geom, width_ratio=1.2)[source]
Check patch indices validity
- Parameters:
- patch_indices (torch.Tensor): pixel indices for each patch, corresponding to pixel modules
- patch_centroids (torch.Tensor): position of the module centroids
- geom (ctapipe.CameraGeometry): geometry
- width_ratio (float): tolerance to check pixel distance to centroid
- gammalearn.utils.cleaning_filter(dataset, dl1=False, **opts)[source]
Filter images according to a cleaning operation.
- Parameters:
- dataset: `Dataset`
- dl1: (bool) whether to use the info computed by lstchain or to recompute the value
- Returns:
- (list of bool): the mask to filter the data
- gammalearn.utils.compute_total_parameter_number(net)[source]
Compute the total number of parameters of a network
- Parameters:
- net (nn.Module): the network
- Returns:
- int: the number of parameters
- gammalearn.utils.dump_experiment_config(experiment)[source]
Load experiment info from the settings file
- Parameters:
- experiment (Experiment): experiment
- Returns:
- gammalearn.utils.emission_cone_filter(dataset, max_angle=inf)[source]
Filter the dataset on the maximum angle between the telescope pointing direction and the direction of the shower
- Parameters:
- dataset (Dataset): the dataset to filter
- max_angle (float): the max angle between the telescope pointing direction and the direction of the shower in rad
- Returns:
- (list of bool): the mask to filter the data
- gammalearn.utils.energyband_filter(dataset, energy=None, filter_only_gammas=False)[source]
Filter dataset on energy (in TeV).
- Parameters:
- dataset (Dataset): the dataset to filter
- energy (float): energy in TeV
- filter_only_gammas (bool)
- Returns:
- (list of bool): the mask to filter the data
- gammalearn.utils.fetch_data_module_settings(experiment, train, domain)[source]
Load the data module described in the experiment setting file.
- Parameters:
- experiment: the experiment instance
- train: True or False depending on the train/test context
- domain: ‘source’ or ‘target’ if domain adaptation or None if no domain adaptation
- Returns:
- The data module.
- gammalearn.utils.find_datafiles(data_folders, files_max_number=-1)[source]
Find datafiles in the folders specified
- Parameters:
- data_folders (list): the folders where the data are stored
- files_max_number (int, optional): the maximum number of files to keep per folder
- Returns:
- gammalearn.utils.get_2d_sincos_pos_embedding_from_grid(grid, embed_dim, additional_tokens=None, add_pointing=False)[source]
- gammalearn.utils.get_2d_sincos_pos_embedding_from_patch_centroids(centroids, embed_dim, additional_tokens=None, add_pointing=False)[source]
Compute 2d sincos positional embedding from pixel module centroid positions
- Parameters:
- centroids (torch.Tensor): x and y position of pixel module centroids
- embed_dim (int): dimension of embedding
- additional_tokens (list): list of additional tokens for which to add an embedding
- Return type:
torch.Tensor
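A 2d sincos positional embedding is commonly built by encoding each coordinate with sines and cosines at geometrically spaced frequencies and concatenating the two halves. The sketch below follows that common recipe in plain Python for a single (x, y) position; the function names, the temperature of 10000 and the channel ordering are assumptions and may not match the layout produced by GammaLearn.

```python
import math


def sincos_1d(position, dim, temperature=10000.0):
    # First half of the channels holds sines, second half cosines.
    half = dim // 2
    omegas = [1.0 / temperature ** (i / half) for i in range(half)]
    return ([math.sin(position * w) for w in omegas]
            + [math.cos(position * w) for w in omegas])


def sincos_2d(x, y, embed_dim):
    # Split the embedding evenly between the x and y coordinates.
    assert embed_dim % 4 == 0, "embed_dim must be divisible by 4"
    return sincos_1d(x, embed_dim // 2) + sincos_1d(y, embed_dim // 2)


embedding = sincos_2d(0.0, 0.0, embed_dim=8)
```

At the origin every sine channel is 0 and every cosine channel is 1, which is a quick sanity check on the channel layout.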
- gammalearn.utils.get_camera_layout_from_geom(camera_geometry)[source]
From a ctapipe camera geometry, compute the index matrix and the camera layout (Hex or Square) for indexed conv
- Parameters:
- camera_geometry: `ctapipe.instrument.CameraGeometry`
- Returns:
- tensor_matrix: torch.Tensor
shape (1, 1, n, n)
- camera_layout: str
Hex or Square
- gammalearn.utils.get_centroids_from_patches(patches, geom)[source]
Compute module centroid positions from patch indices and geometry
- Parameters:
- patches (torch.Tensor): pixel indices for each patch, corresponding to pixel modules
- geom (ctapipe.CameraGeometry): geometry
- Return type:
torch.Tensor
- gammalearn.utils.get_dataset_geom(d, geometries)[source]
Update geometries by appending the geometries from d
- Parameters:
- d: list or `torch.utils.data.ConcatDataset` or `torch.utils.data.Subset` or `torch.utils.data.Dataset`
- geometries: list
- gammalearn.utils.get_index_matrix_from_geom(camera_geometry)[source]
Compute the index matrix from a ctapipe CameraGeometry
- Parameters:
- camera_geometry: `ctapipe.instrument.CameraGeometry`
- Returns:
- indices_matrix: numpy.ndarray
shape (n, n)
- gammalearn.utils.get_patch_indices_and_centroids_from_geometry(geom)[source]
Compute patch indices and centroid positions from geometry
- Parameters:
- geom (ctapipe.CameraGeometry): geometry
- Return type:
(torch.Tensor, torch.Tensor)
- gammalearn.utils.get_torch_weights_from_lightning_checkpoint(checkpoint)[source]
- Parameters:
- checkpoint
- Returns:
- Torch state dict
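Lightning checkpoints are dictionaries that store the module weights under the "state_dict" key. The sketch below shows one way a plain state dict could be extracted from such a dictionary; the function name `extract_state_dict`, the "net." prefix that it strips and the toy checkpoint contents are assumptions, not the behaviour of the GammaLearn function.

```python
def extract_state_dict(checkpoint, prefix="net."):
    # Lightning keeps the LightningModule weights under "state_dict".
    state_dict = checkpoint["state_dict"]
    # Drop the (assumed) attribute prefix so keys match the bare network.
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Toy checkpoint with lists standing in for tensors.
checkpoint = {
    "epoch": 3,
    "state_dict": {"net.conv.weight": [0.1], "net.fc.bias": [0.0]},
}
weights = extract_state_dict(checkpoint)
```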
- gammalearn.utils.impact_distance_filter(dataset, max_distance=inf)[source]
Filter the dataset on the maximum distance between the impact point and the telescope position in km
- Parameters:
- dataset (Dataset): the dataset to filter
- max_distance (float): the maximum distance between the impact point and the telescope position in km
- Returns:
- (list of bool): the mask to filter the data
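The filters in this module return a boolean mask with one entry per event. The sketch below shows that idea on a plain list of impact distances; `impact_distance_mask` is a hypothetical helper that takes distances directly rather than a Dataset, and the strict comparison is an assumption.

```python
import math


def impact_distance_mask(impact_distances_km, max_distance=math.inf):
    # True where the event is kept, False where it is filtered out.
    return [d < max_distance for d in impact_distances_km]


distances = [0.05, 0.30, 0.12, 0.80]  # toy impact distances in km
mask = impact_distance_mask(distances, max_distance=0.2)

# Applying the mask keeps only the events that passed the cut.
kept = [d for d, keep in zip(distances, mask) if keep]
```

With the default max_distance of infinity every event passes, which mirrors the `max_distance=inf` default in the signature above.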
- gammalearn.utils.inject_geometry_into_parameters(parameters: dict, camera_geometry)[source]
Adds camera geometry in model backbone parameters
- gammalearn.utils.intensity_filter(dataset, intensity=None, cleaning=False, dl1=False, **opts)[source]
Filter dataset on intensity (in pe)
- Parameters:
- dataset (Dataset): the dataset to filter
- intensity (int): total intensity in photoelectrons
- cleaning (bool): cut after cleaning
- dl1 (bool): whether to use the info computed by lstchain or to recompute the value
- opts (dict): cleaning options
- Returns:
- (list of bool): the mask to filter the data
- gammalearn.utils.is_datafile_healthy(file_path)[source]
Check that the data file does not contain an empty dataset
- Parameters:
- file_path (str): the path to the file
- Returns:
- A boolean
- gammalearn.utils.leakage_filter(dataset, leakage1_cut=None, leakage2_cut=None, dl1=False, **opts)[source]
Filter images according to their leakage values.
- Parameters:
- dataset: `Dataset`
- leakage1_cut: `int`
- leakage2_cut: `int`
- dl1: `bool` whether to use the info computed by lstchain or to recompute the value
- Returns:
- (list of bool): the mask to filter the data
- gammalearn.utils.load_camera_parameters(camera_type)[source]
Deprecated since version 20-08-2021: the camera parameters are now loaded from the camera geometry, read from datafiles
Load camera parameters: nbCol and injTable
- Parameters:
- datafiles (List): files to load data from
- camera_type (str): the type of camera to load data for; e.g. ‘LST_LSTCam’
- Returns:
- A dictionary
- gammalearn.utils.nets_definition_path()[source]
Return the path to the net definition file
- Returns:
- str
- gammalearn.utils.post_process_data(merged_outputs, merged_dl1_params, dataset_parameters)[source]
Post process data produced by the inference of a model on dl1 data to make them dl2 ready
- Parameters:
- merged_outputs (pd.Dataframe): merged outputs produced by the model at test time
- merged_dl1_params (pd.Dataframe): corresponding merged dl1 parameters
- dataset_parameters (dict): parameters used to instantiate dataset objects
- Returns:
- dl2_params
- gammalearn.utils.prepare_experiment_folder(main_directory, experiment_name)[source]
Prepare experiment folder and check if it already exists
- Parameters:
- main_directory (string)
- experiment_name (string)
- Returns:
- gammalearn.utils.prepare_gammaboard_folder(main_directory, experiment_name)[source]
Prepare tensorboard run folder for the experiment
- Parameters:
- main_directory (string)
- experiment_name (string)
- Returns:
- gammalearn.utils.rgb_to_grays(dataset)[source]
Function to convert RGB images to 2 channels gray images.
- Parameters:
- dataset (Dataset)
- gammalearn.utils.rotated_indices(pixels_position, theta)[source]
Function that returns the rotated indices of an image from the pixels position.
- Parameters:
- pixels_position (numpy.Array): an array of shape (n, 2) of the position of the pixels
- theta (float): angle of rotation
- Returns:
- Rotated indices
- gammalearn.utils.shower_position_filter(dataset, max_distance, dl1=False, **opts)[source]
Filter images according to the position of the centroid of the shower. The image is cleaned then Hillas parameters are computed. For the LST a distance of 0.5 m corresponds to 10 pixels.
- Parameters:
- dataset (`Dataset`)
- max_distance (float): distance to the center of the camera in meters
- opts (dict): cleaning options
- dl1 (bool): whether to use the info computed by lstchain or to recompute the value
- Returns:
- (list of bool): the mask to filter the data
- gammalearn.utils.telescope_multiplicity_filter(dataset, multiplicity, strict=False)[source]
Filter dataset on number of telescopes that triggered (for a particular type of telescope)
- Parameters:
- dataset (Dataset): the dataset to filter
- multiplicity (int): the number of telescopes that triggered
- strict (bool)
- Returns:
- (list of bool): the mask to filter the data
- gammalearn.utils.write_dataframe(dataframe, outfile, table_path, mode='a', index=False, complib='blosc:zstd', complevel=1)[source]
Write a pandas dataframe to a HDF5 file using pytables formatting. Re-implementation of https://github.com/cta-observatory/cta-lstchain/blob/1280e47950726f92200e1624ca38c672760e9d77/lstchain/io/io.py#L771 to allow for compression
- Parameters:
- dataframe: `pandas.DataFrame`
- outfile: path
- table_path: str
path to the table to write in the HDF5 file
- mode: str
‘a’ to append to an existing file, ‘w’ to overwrite
- index: bool
whether to write the index of the dataframe
- complib: str
compression library to use
- complevel: int
compression level to use
- Returns:
- None