span_marker.trainer module¶

class span_marker.trainer.Trainer(model=None, args=None, train_dataset=None, eval_dataset=None, model_init=None, compute_metrics=None, callbacks=None, optimizers=(None, None), preprocess_logits_for_metrics=None)[source]¶

Bases: Trainer

Trainer is a simple but feature-complete training and eval loop for SpanMarker, built tightly on top of the 🤗 Transformers Trainer.

Parameters:
  • model (Optional[SpanMarkerModel]) – The model to train, evaluate or use for predictions. If not provided, a model_init must be passed.

  • args (Optional[TrainingArguments]) – The arguments to tweak for training. Will default to a basic instance of TrainingArguments with the output_dir set to a directory named models/my_span_marker_model in the current directory if not provided.

  • train_dataset (Optional[Dataset]) – The dataset to use for training. Must contain tokens and ner_tags columns, and may contain document_id and sentence_id columns for document-level context during training.

  • eval_dataset (Optional[Dataset]) – The dataset to use for evaluation. Must contain tokens and ner_tags columns, and may contain document_id and sentence_id columns for document-level context during evaluation.

  • model_init (Optional[Callable[[], SpanMarkerModel]]) –

    A function that instantiates the model to be used. If provided, each call to Trainer.train() will start from a new instance of the model as given by this function.

    The function may have zero arguments, or a single one containing the optuna/Ray Tune/SigOpt trial object, so that it can choose different architectures according to hyperparameters (such as layer count, sizes of inner layers, dropout probabilities, etc.).

  • compute_metrics (Optional[Callable[[EvalPrediction], Dict]]) – The function that will be used to compute metrics at evaluation. Must take an EvalPrediction and return a dictionary mapping metric names to metric values.

  • callbacks (Optional[List[TrainerCallback]]) –

    A list of callbacks to customize the training loop. Will add those to the list of default callbacks detailed in the Hugging Face Callback documentation.

    If you want to remove one of the default callbacks used, use the remove_callback() method.

  • optimizers (Tuple[Optional[Optimizer], Optional[LambdaLR]]) – A tuple containing the optimizer and the scheduler to use. Will default to an instance of AdamW on your model and a scheduler given by get_linear_schedule_with_warmup controlled by args.

  • preprocess_logits_for_metrics (Optional[Callable[[Tensor, Tensor], Tensor]]) –

    A function that preprocesses the logits right before caching them at each evaluation step. Must take two tensors, the logits and the labels, and return the logits processed as desired. The modifications made by this function will be reflected in the predictions received by compute_metrics.

    Note that the labels (second parameter) will be None if the dataset does not have them.

Important attributes:

  • model – Always points to the core model.

  • model_wrapped – Always points to the most external model in case one or more other modules wrap the original model. This is the model that should be used for the forward pass. For example, under DeepSpeed, the inner model is wrapped in DeepSpeed and then again in torch.nn.DistributedDataParallel. If the inner model hasn’t been wrapped, then self.model_wrapped is the same as self.model.

  • is_model_parallel – Whether or not a model has been switched to a model parallel mode (different from data parallelism, this means some of the model layers are split on different GPUs).

  • place_model_on_device – Whether or not to automatically place the model on the device - it will be set to False if model parallel or deepspeed is used, or if the default TrainingArguments.place_model_on_device is overridden to return False.

  • is_in_train – Whether or not a model is currently running train() (e.g. when evaluate is called while in train).
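A minimal end-to-end sketch of typical usage (the conll2003 dataset and bert-base-cased encoder are only illustrative choices; substitute your own data and labels):

    from datasets import load_dataset
    from transformers import TrainingArguments

    from span_marker import SpanMarkerModel, Trainer

    # Any NER dataset with "tokens" and "ner_tags" columns works; conll2003 is just an example.
    dataset = load_dataset("conll2003")
    labels = dataset["train"].features["ner_tags"].feature.names

    # Initialize a SpanMarker model on top of an encoder of your choice.
    model = SpanMarkerModel.from_pretrained("bert-base-cased", labels=labels)

    args = TrainingArguments(
        output_dir="models/my_span_marker_model",
        learning_rate=5e-5,
        per_device_train_batch_size=32,
        num_train_epochs=1,
    )

    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=dataset["train"],
        eval_dataset=dataset["validation"],
    )
    trainer.train()
    metrics = trainer.evaluate()
    print(metrics)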

preprocess_dataset(dataset, label_normalizer, tokenizer, dataset_name='train', is_evaluate=False)[source]¶

Normalize the ner_tags labels and call tokenizer on tokens.

Parameters:
  • dataset (Dataset) – A Hugging Face dataset with tokens and ner_tags columns.

  • label_normalizer (LabelNormalizer) – A callable that normalizes ner_tags into start-end-label tuples.

  • tokenizer (SpanMarkerTokenizer) – The tokenizer responsible for tokenizing tokens into input IDs, and adding start and end markers.

  • dataset_name (str, optional) – The name of the dataset. Defaults to “train”.

  • is_evaluate (bool, optional) – Whether to return the number of words for each sample. Required for evaluation. Defaults to False.

Raises:

ValueError – If the dataset does not contain tokens and ner_tags columns.

Returns:

The normalized and tokenized version of the input dataset.

Return type:

Dataset
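This method is used when the Trainer prepares its train and eval datasets; in most cases you only need to ensure your data exposes the expected columns. A minimal sketch of a compatible dataset (the sentence and labels are made up):

    from datasets import Dataset

    # Labels as passed to SpanMarkerModel.from_pretrained(..., labels=...):
    # ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]
    # "tokens" holds pre-tokenized words; "ner_tags" holds the label index of each word.
    train_dataset = Dataset.from_dict({
        "tokens": [["Amelia", "Earhart", "flew", "her", "plane", "to", "Paris", "."]],
        "ner_tags": [[1, 2, 0, 0, 0, 0, 3, 0]],
    })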

static add_context(dataset, model_max_length, max_prev_context=None, max_next_context=None, show_progress_bar=True)[source]¶

Add document-level context from previous and next sentences in the same document.

Parameters:
  • dataset (Dataset) – The partially processed dataset, containing “input_ids”, “start_position_ids”, “end_position_ids”, “document_id” and “sentence_id” columns.

  • model_max_length (int) – The total number of tokens that can be processed before truncation.

  • max_prev_context (Optional[int]) – The maximum number of previous sentences to include. Defaults to None, representing as many previous sentences as fit.

  • max_next_context (Optional[int]) – The maximum number of next sentences to include. Defaults to None, representing as many next sentences as fit.

  • show_progress_bar (bool) – Whether to show a progress bar. Defaults to True.

Returns:

A copy of the Dataset with additional previous and next sentences added to input_ids.

Return type:

Dataset
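Typically you do not call add_context directly; instead, provide the optional document_id and sentence_id columns described above so that document-level context can be added during preprocessing. A sketch of what those columns might look like for a two-sentence document (made-up data):

    from datasets import Dataset

    # Two consecutive sentences from the same document: both carry document_id=0,
    # and sentence_id gives their order within that document.
    train_dataset = Dataset.from_dict({
        "tokens": [
            ["Amelia", "Earhart", "was", "a", "pioneer", "."],
            ["She", "flew", "across", "the", "Atlantic", "."],
        ],
        "ner_tags": [
            [1, 2, 0, 0, 0, 0],
            [0, 0, 0, 0, 3, 0],
        ],
        "document_id": [0, 0],
        "sentence_id": [0, 1],
    })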

static spread_sample(batch, model_max_length, marker_max_length)[source]¶

Spread sentences across multiple samples if a lack of space in a single sample requires it; see the illustrative sketch after the parameter list below.

Parameters:
  • batch (Dict[str, List[Any]]) – A dictionary of dataset keys to lists of values.

  • model_max_length (int) – The total number of tokens that can be processed before truncation.

  • marker_max_length (int) – The maximum length for each of the span markers. A value of 128 means that each training and inferencing sample contains a maximum of 128 start markers and 128 end markers, for a total of 256 markers per sample.

Returns:

A dictionary of dataset keys to lists of values.

Return type:

Dict[str, List[Any]]
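spread_sample is likewise an internal preprocessing step. Purely as an illustration of the arithmetic (assuming the number of candidate spans drives the split and that each sample holds at most marker_max_length markers; the helper below is hypothetical, not part of the library):

    import math

    def estimated_samples(num_words: int, entity_max_length: int, marker_max_length: int) -> int:
        # Hypothetical helper: estimate how many samples one sentence is spread across.
        # Count candidate spans of length 1..entity_max_length within num_words words.
        num_spans = sum(max(num_words - length + 1, 0) for length in range(1, entity_max_length + 1))
        # Each sample holds at most marker_max_length start/end marker pairs.
        return math.ceil(num_spans / marker_max_length)

    print(estimated_samples(num_words=60, entity_max_length=8, marker_max_length=128))  # 4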

create_model_card(*_args, **_kwargs)[source]¶

Creates a draft of a model card using the information available to the Trainer, the SpanMarkerModel and the SpanMarkerModelCardData.

Return type:

None

Trainer.train(resume_from_checkpoint=None, trial=None, ignore_keys_for_eval=None, **kwargs)[source]¶

Main training entry point.

Parameters:
  • resume_from_checkpoint (str or bool, optional) – If a str, the local path to a checkpoint saved by a previous instance of Trainer. If a bool and True, load the last checkpoint in args.output_dir saved by a previous instance of Trainer. If present, training will resume from the model/optimizer/scheduler states loaded here, as shown in the example after this parameter list.

  • trial (optuna.Trial or Dict[str, Any], optional) – The trial run or the hyperparameter dictionary for hyperparameter search.

  • ignore_keys_for_eval (List[str], optional) – A list of keys in the output of your model (if it is a dictionary) that should be ignored when gathering predictions for evaluation during the training.

  • kwargs (Dict[str, Any], optional) – Additional keyword arguments used to hide deprecated arguments.
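For example, resuming a run (the checkpoint path is illustrative):

    # Resume from the most recent checkpoint in args.output_dir ...
    trainer.train(resume_from_checkpoint=True)

    # ... or from a specific checkpoint directory.
    trainer.train(resume_from_checkpoint="models/my_span_marker_model/checkpoint-500")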

Trainer.evaluate(eval_dataset=None, ignore_keys=None, metric_key_prefix='eval')[source]¶

Run evaluation and return metrics.

The calling script will be responsible for providing a method to compute metrics, as they are task-dependent (pass it to the init compute_metrics argument).

You can also subclass and override this method to inject custom behavior.

Parameters:
  • eval_dataset (Union[Dataset, Dict[str, Dataset]], optional) –

    Pass a dataset if you wish to override self.eval_dataset. If it is a datasets.Dataset, columns not accepted by the model.forward() method are automatically removed. If it is a dictionary, it will evaluate on each dataset, prepending the dictionary key to the metric name. Datasets must implement the __len__ method.

    Tip: If you pass a dictionary with names of datasets as keys and datasets as values, evaluate will run separate evaluations on each dataset. This can be useful to monitor how training affects other datasets or simply to get a more fine-grained evaluation. When used with load_best_model_at_end, make sure metric_for_best_model references exactly one of the datasets. If you, for example, pass in {“data1”: data1, “data2”: data2} for two datasets data1 and data2, you could specify metric_for_best_model=”eval_data1_loss” to use the loss on data1 and metric_for_best_model=”eval_data2_loss” to use the loss on data2.

  • ignore_keys (List[str], optional) – A list of keys in the output of your model (if it is a dictionary) that should be ignored when gathering predictions.

  • metric_key_prefix (str, optional, defaults to “eval”) – An optional prefix to be used as the metrics key prefix. For example, the metric “bleu” will be named “eval_bleu” if the prefix is “eval” (default).

Returns:

A dictionary containing the evaluation loss and the potential metrics computed from the predictions. The dictionary also contains the epoch number which comes from the training state.

Return type:

Dict[str, float]
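A short sketch, assuming a held-out test split with the same columns as the training data:

    # Evaluate on a separate test set; metric keys will be prefixed with "test" instead of "eval".
    test_metrics = trainer.evaluate(eval_dataset=dataset["test"], metric_key_prefix="test")
    print(test_metrics)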

Trainer.push_to_hub(commit_message='End of training', blocking=True, **kwargs)[source]¶

Upload self.model and self.tokenizer to the 🤗 model hub on the repo self.args.hub_model_id.

Parameters:
  • commit_message (str, optional, defaults to “End of training”) – Message to commit while pushing.

  • blocking (bool, optional, defaults to True) – Whether the function should return only when the git push has finished.

  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to Trainer.create_model_card().

Returns:

The URL of the repository where the model was pushed if blocking=True, or a Future object tracking the progress of the commit if blocking=False.

Return type:

str
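A sketch of a typical push, assuming you are logged in to the Hugging Face Hub and that hub_model_id was set in the TrainingArguments (the repository name is illustrative):

    from transformers import TrainingArguments

    args = TrainingArguments(
        output_dir="models/my_span_marker_model",
        hub_model_id="your-username/span-marker-bert-base-conll03",  # illustrative repository name
        push_to_hub=True,
    )

    # ... after training with a Trainer constructed with these args:
    trainer.push_to_hub(commit_message="Add trained SpanMarker model")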

Trainer.hyperparameter_search launches a hyperparameter search using optuna, Ray Tune, or SigOpt. The optimized quantity is determined by compute_objective, which defaults to a function returning the evaluation loss when no metric is provided, and the sum of all metrics otherwise.

Warning: To use this method, you need to have provided a model_init when initializing your Trainer: the model needs to be reinitialized at each new run. This is incompatible with the optimizers argument, so you need to subclass Trainer and override the create_optimizer_and_scheduler() method to supply a custom optimizer/scheduler.

Parameters:
  • hp_space (Callable[[“optuna.Trial”], Dict[str, float]], optional) – A function that defines the hyperparameter search space. Will default to trainer_utils.default_hp_space_optuna, trainer_utils.default_hp_space_ray, or trainer_utils.default_hp_space_sigopt, depending on your backend.

  • compute_objective (Callable[[Dict[str, float]], float], optional) – A function computing the objective to minimize or maximize from the metrics returned by the evaluate method. Will default to trainer_utils.default_compute_objective.

  • n_trials (int, optional, defaults to 100) – The number of trial runs to test.

  • direction (str or List[str], optional, defaults to “minimize”) – For single-objective optimization, direction is a str and can be “minimize” or “maximize”; pick “minimize” when optimizing the validation loss and “maximize” when optimizing one or several metrics. For multi-objective optimization, direction is a List[str] of “minimize” and “maximize” values, chosen in the same way for each objective.

  • backend (str or training_utils.HPSearchBackend, optional) – The backend to use for hyperparameter search. Will default to optuna, Ray Tune, or SigOpt, depending on which one is installed. If all are installed, will default to optuna.

  • hp_name (Callable[[“optuna.Trial”], str], optional) – A function that defines the trial/run name. Will default to None.

  • kwargs (Dict[str, Any], optional) – Additional keyword arguments passed along to optuna.create_study or ray.tune.run; see the documentation of those functions for more information.

Returns:

All the information about the best run or best runs for multi-objective optimization. Experiment summary can be found in run_summary attribute for Ray backend.

Return type:

trainer_utils.BestRun or List[trainer_utils.BestRun]
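A sketch using the optuna backend (assuming optuna is installed; args, labels, and dataset are as in the earlier example, and the search space is purely illustrative):

    from span_marker import SpanMarkerModel, Trainer

    def model_init():
        # Reinitialized at the start of every trial.
        return SpanMarkerModel.from_pretrained("bert-base-cased", labels=labels)

    def hp_space(trial):
        # Illustrative search space.
        return {
            "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True),
            "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 3),
        }

    trainer = Trainer(
        model_init=model_init,
        args=args,
        train_dataset=dataset["train"],
        eval_dataset=dataset["validation"],
    )
    best_run = trainer.hyperparameter_search(
        hp_space=hp_space,
        n_trials=10,
        direction="minimize",
        backend="optuna",
    )
    print(best_run.hyperparameters)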