span_marker.trainer module¶
- class span_marker.trainer.Trainer(model=None, args=None, train_dataset=None, eval_dataset=None, model_init=None, compute_metrics=None, callbacks=None, optimizers=(None, None), preprocess_logits_for_metrics=None)[source]¶
Bases: Trainer
Trainer is a simple but feature-complete training and eval loop for SpanMarker, built tightly on top of the 🤗 Transformers Trainer.
- Parameters:
model (Optional[SpanMarkerModel]) – The model to train, evaluate, or use for predictions. If not provided, a model_init must be passed.
args (Optional[TrainingArguments]) – The arguments to tweak for training. Will default to a basic instance of TrainingArguments with the output_dir set to a directory named models/my_span_marker_model in the current directory if not provided.
train_dataset (Optional[Dataset]) – The dataset to use for training. Must contain tokens and ner_tags columns, and may contain document_id and sentence_id columns for document-level context during training.
eval_dataset (Optional[Dataset]) – The dataset to use for evaluation. Must contain tokens and ner_tags columns, and may contain document_id and sentence_id columns for document-level context during evaluation.
model_init (Optional[Callable[[], SpanMarkerModel]]) – A function that instantiates the model to be used. If provided, each call to Trainer.train() will start from a new instance of the model as given by this function. The function may take zero arguments, or a single argument containing the optuna/Ray Tune/SigOpt trial object, so it can choose different architectures according to hyperparameters (such as layer count, sizes of inner layers, dropout probabilities, etc.).
compute_metrics (Optional[Callable[[EvalPrediction], Dict]]) – The function that will be used to compute metrics at evaluation. Must take an EvalPrediction and return a dictionary mapping metric names to metric values.
callbacks (Optional[List[TrainerCallback]]) – A list of callbacks to customize the training loop. These are added to the list of default callbacks detailed in the Hugging Face Callback documentation. If you want to remove one of the default callbacks, use the remove_callback() method.
optimizers (Tuple[Optional[Optimizer], Optional[LambdaLR]]) – A tuple containing the optimizer and the scheduler to use. Will default to an instance of AdamW on your model and a scheduler given by get_linear_schedule_with_warmup, controlled by args.
preprocess_logits_for_metrics (Optional[Callable[[Tensor, Tensor], Tensor]]) – A function that preprocesses the logits right before caching them at each evaluation step. Must take two tensors, the logits and the labels, and return the logits once processed as desired. The modifications made by this function will be reflected in the predictions received by compute_metrics. Note that the labels (second parameter) will be None if the dataset does not have them.
Important attributes:
model – Always points to the core model.
model_wrapped – Always points to the most external model in case one or more other modules wrap the original model. This is the model that should be used for the forward pass. For example, under DeepSpeed, the inner model is wrapped in DeepSpeed and then again in torch.nn.DistributedDataParallel. If the inner model hasn’t been wrapped, then self.model_wrapped is the same as self.model.
is_model_parallel – Whether or not a model has been switched to a model parallel mode (different from data parallelism, this means some of the model layers are split on different GPUs).
place_model_on_device – Whether or not to automatically place the model on the device - it will be set to False if model parallel or DeepSpeed is used, or if the default TrainingArguments.place_model_on_device is overridden to return False.
is_in_train – Whether or not a model is currently running train() (e.g. when evaluate is called while in train).
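A minimal usage sketch follows. It assumes the conll2003 dataset and a bert-base-cased encoder purely as examples; any dataset with tokens and ner_tags columns and any supported encoder checkpoint can be substituted, and the hyperparameter values are illustrative only.

```python
from datasets import load_dataset
from transformers import TrainingArguments

from span_marker import SpanMarkerModel, Trainer

# Any NER dataset with "tokens" and "ner_tags" columns works; conll2003 is just an example.
dataset = load_dataset("conll2003")
labels = dataset["train"].features["ner_tags"].feature.names

# Initialize a SpanMarker model from an encoder checkpoint and the label set.
model = SpanMarkerModel.from_pretrained("bert-base-cased", labels=labels)

args = TrainingArguments(
    output_dir="models/my_span_marker_model",
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
```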
- preprocess_dataset(dataset, label_normalizer, tokenizer, dataset_name='train', is_evaluate=False)[source]¶
Normalize the ner_tags labels and call the tokenizer on tokens.
- Parameters:
dataset (Dataset) – A Hugging Face dataset with tokens and ner_tags columns.
label_normalizer (LabelNormalizer) – A callable that normalizes ner_tags into start-end-label tuples.
tokenizer (SpanMarkerTokenizer) – The tokenizer responsible for tokenizing tokens into input IDs, and adding start and end markers.
dataset_name (str, optional) – The name of the dataset. Defaults to “train”.
is_evaluate (bool, optional) – Whether to return the number of words for each sample. Required for evaluation. Defaults to False.
- Raises:
ValueError – If the dataset does not contain tokens and ner_tags columns.
- Returns:
The normalized and tokenized version of the input dataset.
- Return type:
Dataset
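This method is normally invoked internally when a Trainer is given a train_dataset or eval_dataset, so most users never call it directly. The sketch below only illustrates the column layout such a dataset is expected to have; the sentence and the integer label IDs are hypothetical placeholders.

```python
from datasets import Dataset

# A toy dataset with the required "tokens" and "ner_tags" columns.
# The integer tags are placeholders; they must follow the label scheme
# the model was configured with.
train_dataset = Dataset.from_dict(
    {
        "tokens": [["Amelia", "Earhart", "flew", "to", "Paris", "."]],
        "ner_tags": [[1, 2, 0, 0, 3, 0]],
    }
)
```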
- static add_context(dataset, model_max_length, max_prev_context=None, max_next_context=None, show_progress_bar=True)[source]¶
Add document-level context from previous and next sentences in the same document.
- Parameters:
dataset (Dataset) – The partially processed dataset, containing “input_ids”, “start_position_ids”, “end_position_ids”, “document_id” and “sentence_id” columns.
model_max_length (int) – The total number of tokens that can be processed before truncation.
max_prev_context (Optional[int]) – The maximum number of previous sentences to include. Defaults to None, representing as many previous sentences as fit.
max_next_context (Optional[int]) – The maximum number of next sentences to include. Defaults to None, representing as many next sentences as fit.
show_progress_bar (bool) – Whether to show a progress bar. Defaults to True.
- Returns:
A copy of the Dataset with additional previous and next sentences added to input_ids.
- Return type:
Dataset
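add_context operates on a partially processed dataset, but the document-level context it relies on enters through the document_id and sentence_id columns of the raw dataset passed to the Trainer. Below is a hedged toy example of such a dataset; the sentences and integer labels are placeholders.

```python
from datasets import Dataset

# Two consecutive sentences from the same document; when "document_id" and
# "sentence_id" are present, neighbouring sentences can be added as extra context.
train_dataset = Dataset.from_dict(
    {
        "tokens": [
            ["Amelia", "Earhart", "was", "an", "aviation", "pioneer", "."],
            ["She", "flew", "across", "the", "Atlantic", "Ocean", "."],
        ],
        "ner_tags": [
            [1, 2, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 3, 4, 0],
        ],
        "document_id": [0, 0],
        "sentence_id": [0, 1],
    }
)
```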
- static spread_sample(batch, model_max_length, marker_max_length)[source]¶
Spread sentences across multiple samples if there is not enough space in a single sample.
- Parameters:
batch (Dict[str, List[Any]]) – A dictionary of dataset keys to lists of values.
model_max_length (int) – The total number of tokens that can be processed before truncation.
marker_max_length (int) – The maximum length for each of the span markers. A value of 128 means that each training and inferencing sample contains a maximum of 128 start markers and 128 end markers, for a total of 256 markers per sample.
- Returns:
A dictionary of dataset keys to lists of values.
- Return type:
Dict[str, List[Any]]
- Trainer.train(resume_from_checkpoint=None, trial=None, ignore_keys_for_eval=None, **kwargs)[source]¶
Main training entry point.
- Parameters:
resume_from_checkpoint (str or bool, optional) – If a str, local path to a saved checkpoint as saved by a previous instance of [Trainer]. If a bool and equals True, load the last checkpoint in args.output_dir as saved by a previous instance of [Trainer]. If present, training will resume from the model/optimizer/scheduler states loaded here.
trial (optuna.Trial or Dict[str, Any], optional) – The trial run or the hyperparameter dictionary for hyperparameter search.
ignore_keys_for_eval (List[str], optional) – A list of keys in the output of your model (if it is a dictionary) that should be ignored when gathering predictions for evaluation during the training.
kwargs (Dict[str, Any], optional) – Additional keyword arguments used to hide deprecated arguments.
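A brief sketch, assuming the trainer constructed in the earlier example:

```python
# Train from scratch; checkpoints and logs go to args.output_dir.
trainer.train()

# Or resume from the most recent checkpoint in args.output_dir, if one exists.
trainer.train(resume_from_checkpoint=True)
```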
- Trainer.evaluate(eval_dataset=None, ignore_keys=None, metric_key_prefix='eval')[source]¶
Run evaluation and return metrics.
The calling script will be responsible for providing a method to compute metrics, as they are task-dependent (pass it to the init compute_metrics argument).
You can also subclass and override this method to inject custom behavior.
- Parameters:
eval_dataset (Union[Dataset, Dict[str, Dataset]], optional) –
Pass a dataset if you wish to override self.eval_dataset. If it is a [~datasets.Dataset], columns not accepted by the model.forward() method are automatically removed. If it is a dictionary, it will evaluate on each dataset, prepending the dictionary key to the metric name. Datasets must implement the __len__ method.
Tip: If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation. When used with load_best_model_at_end, make sure metric_for_best_model references exactly one of the datasets. If you, for example, pass in {“data1”: data1, “data2”: data2} for two datasets data1 and data2, you could specify metric_for_best_model=”eval_data1_loss” for using the loss on data1 and metric_for_best_model=”eval_data2_loss” for the loss on data2.
ignore_keys (List[str], optional) – A list of keys in the output of your model (if it is a dictionary) that should be ignored when gathering predictions.
metric_key_prefix (str, optional, defaults to “eval”) – An optional prefix to be used as the metrics key prefix. For example, the metric “bleu” will be named “eval_bleu” if the prefix is “eval” (the default).
- Returns:
A dictionary containing the evaluation loss and the potential metrics computed from the predictions. The dictionary also contains the epoch number which comes from the training state.
- Return type:
Dict[str, float]
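A short sketch, assuming the trainer from the earlier example. The exact metric keys depend on the compute_metrics in use; the keys shown below are only indicative, and conll_test and fewnerd_test are hypothetical datasets.

```python
# Evaluate on the eval_dataset given at initialization.
metrics = trainer.evaluate()
print(metrics)
# e.g. {"eval_loss": ..., "eval_overall_f1": ..., ...}  (keys depend on compute_metrics)

# Evaluate on multiple datasets; the dict key is prepended to every metric name.
metrics = trainer.evaluate({"conll": conll_test, "fewnerd": fewnerd_test})
```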
- Trainer.push_to_hub(commit_message='End of training', blocking=True, token=None, revision=None, **kwargs)[source]¶
Upload self.model and self.processing_class to the 🤗 model hub on the repo self.args.hub_model_id.
- Parameters:
commit_message (str, optional, defaults to “End of training”) – Message to commit while pushing.
blocking (bool, optional, defaults to True) – Whether the function should return only when the git push has finished.
token (str, optional, defaults to None) – Token with write permission to overwrite Trainer’s original args.
revision (str, optional) – The git revision to commit from. Defaults to the head of the “main” branch.
kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to [~Trainer.create_model_card].
- Returns:
The URL of the repository where the model was pushed if blocking=False, or a Future object tracking the progress of the commit if blocking=True.
- Return type:
str or Future
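A hedged sketch of a typical call; it assumes hub_model_id was set in the TrainingArguments and that you are logged in with a token that has write access (for example via huggingface-cli login).

```python
# Push the trained model (and a generated model card) to the Hugging Face Hub.
trainer.push_to_hub(commit_message="Add SpanMarker model trained on my dataset")
```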
- Trainer.hyperparameter_search(hp_space=None, compute_objective=None, n_trials=20, direction='minimize', backend=None, hp_name=None, **kwargs)[source]¶
Launch a hyperparameter search using optuna, Ray Tune, or SigOpt. The optimized quantity is determined by compute_objective, which defaults to a function returning the evaluation loss when no metric is provided, and the sum of all metrics otherwise.
Warning: To use this method, you need to have provided a model_init when initializing your [Trainer]: we need to reinitialize the model at each new run. This is incompatible with the optimizers argument, so you need to subclass [Trainer] and override the method [~Trainer.create_optimizer_and_scheduler] for custom optimizer/scheduler.
- Parameters:
hp_space (Callable[[“optuna.Trial”], Dict[str, float]], optional) – A function that defines the hyperparameter search space. Will default to [~trainer_utils.default_hp_space_optuna] or [~trainer_utils.default_hp_space_ray] or [~trainer_utils.default_hp_space_sigopt] depending on your backend.
compute_objective (Callable[[Dict[str, float]], float], optional) – A function computing the objective to minimize or maximize from the metrics returned by the evaluate method. Will default to [~trainer_utils.default_compute_objective].
n_trials (int, optional, defaults to 20) – The number of trial runs to test.
direction (str or List[str], optional, defaults to “minimize”) – For single-objective optimization, direction is a str and can be “minimize” or “maximize”; for multi-objective optimization, direction is a List[str] containing “minimize” and/or “maximize” values. Pick “minimize” when optimizing the validation loss and “maximize” when optimizing one or several metrics.
backend (str or [~training_utils.HPSearchBackend], optional) – The backend to use for hyperparameter search. Will default to optuna or Ray Tune or SigOpt, depending on which one is installed. If all are installed, will default to optuna.
hp_name (Callable[[“optuna.Trial”], str], optional) – A function that defines the trial/run name. Will default to None.
kwargs (Dict[str, Any], optional) –
Additional keyword arguments for each backend:
optuna: parameters from [optuna.study.create_study](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.create_study.html) and also the parameters timeout, n_jobs and gc_after_trial from [optuna.study.Study.optimize](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.Study.html#optuna.study.Study.optimize)
ray: parameters from [tune.run](https://docs.ray.io/en/latest/tune/api_docs/execution.html#tune-run). If resources_per_trial is not set in the kwargs, it defaults to 1 CPU core and 1 GPU (if available). If progress_reporter is not set in the kwargs, [ray.tune.CLIReporter](https://docs.ray.io/en/latest/tune/api/doc/ray.tune.CLIReporter.html) is used.
sigopt: the parameter proxies from [sigopt.Connection.set_proxies](https://docs.sigopt.com/support/faq#how-do-i-use-sigopt-with-a-proxy).
- Returns:
All the information about the best run, or best runs for multi-objective optimization. For the Ray backend, an experiment summary can be found in the run_summary attribute.
- Return type:
[trainer_utils.BestRun or List[trainer_utils.BestRun]]
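A minimal optuna-backed sketch, assuming optuna is installed and reusing the dataset, labels, and args from the earlier example. The metric key passed to compute_objective is an assumption and depends on what your evaluation actually returns.

```python
from span_marker import SpanMarkerModel, Trainer

def model_init() -> SpanMarkerModel:
    # A fresh model for every trial, as required by hyperparameter_search.
    return SpanMarkerModel.from_pretrained("bert-base-cased", labels=labels)

def optuna_hp_space(trial):
    # Search space over TrainingArguments fields.
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 3),
    }

trainer = Trainer(
    model_init=model_init,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)

best_run = trainer.hyperparameter_search(
    hp_space=optuna_hp_space,
    n_trials=10,
    direction="maximize",
    backend="optuna",
    # Assumed metric key; adapt it to whatever your compute_metrics returns.
    compute_objective=lambda metrics: metrics["eval_overall_f1"],
)
print(best_run.hyperparameters)
```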