
Prompt Tuning Parameters

Parameters for selecting the training algorithm

  • base_model

    • Description: Path to the base model, or a caikit.Resource object of the base model to be used for tuning. A model-name string may also be provided; in that case, the Transformers API will load the model from the local Hugging Face model cache if it is available. If it is not, the model may be downloaded by setting the ALLOW_DOWNLOADS environment variable to true.
    • Accepted values:
      • The model needs to be of type causal-lm or seq2seq, i.e., loadable via the Hugging Face AutoModelForCausalLM or AutoModelForSeq2SeqLM loading methods (see the loading sketch below).
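For illustration, a minimal sketch of the kind of load the Transformers API performs for a model-name string (the model names here are assumptions chosen for the example):

```python
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM

# causal-lm base model (illustrative model name)
causal_model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

# seq2seq base model (illustrative model name)
seq2seq_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
```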
  • tuning_type: (str or caikit_nlp.modules.text_generation.TuningType)

    • The type of PEFT tuning config to build.
    • Accepted values: PROMPT_TUNING and MULTITASK_PROMPT_TUNING
    • Default: PROMPT_TUNING
  • num_epochs: (int)

    • Training quality depends heavily on the number of epochs.
    • Expose to end user recommendation: True
    • Accepted values: any positive int
    • Default: 50
    • NOTE: defaults might also differ based on base_model
  • verbalizer

    • Verbalizer template used to format data at train and inference time. The template may use double curly braces to indicate which fields from the data model TrainGenerationRecord should be rendered (see the rendering sketch below).
    • Default: "{{input}}", i.e., the raw text.
    • Expose to end user recommendation: True
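As a rough illustration of the rendering semantics, assuming simple {{field}} substitution (the render_verbalizer helper below is hypothetical, not the library's actual implementation):

```python
import re

def render_verbalizer(template: str, record: dict) -> str:
    """Hypothetical renderer: replaces each {{field}} with record[field]."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(record[m.group(1)]), template)

record = {"input": "The movie was fantastic!"}

# The default template passes the raw text through unchanged
assert render_verbalizer("{{input}}", record) == "The movie was fantastic!"

# A task-specific template wraps the input in an instruction
print(render_verbalizer("Classify the sentiment: {{input}}", record))
# -> Classify the sentiment: The movie was fantastic!
```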
  • batch_size:

    • Number of examples per training batch.
    • Default: 8
    • Expose to end user recommendation: False
  • max_length:

    • Maximum length of sequences being considered. Currently this limit applies to both the input sequence and the output sequence.
    • Default: 256
    • Expose to end user recommendation: False
    • Could be configurable per model.
  • tuning_config.num_virtual_tokens:

    • Number of virtual tokens to be used for training.
    • Default: 20 (might depend on base_model and tuning_type)
    • If a source prompt exists, this should correspond to it; i.e., users who want to use MPT source prompts need to select the number of virtual tokens to match the available source prompt. (See the PEFT config sketch after tuning_config.prompt_tuning_init_text below.)
  • tuning_config.prompt_tuning_init_method:

    • Accepted values: RANDOM, TEXT, ONLY_SOURCE_SHARED, and AVERAGE_SOURCE
    • TEXT requires tuning_config.prompt_tuning_init_text to be set
    • ONLY_SOURCE_SHARED and AVERAGE_SOURCE require tuning_config.prompt_tuning_init_source_model to be set and a source prompt model to be available for the given base_model
    • Default: RANDOM
    • Expose to end user recommendation: False (TBD)
  • tuning_config.prompt_tuning_init_text:

    • Initialization text to be used if tuning_config.prompt_tuning_init_method is set to TEXT; otherwise this is ignored.
    • Default: none.
    • Expose to end user recommendation: True (if the TEXT init method is exposed to customers)
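The num_virtual_tokens and init fields map naturally onto Hugging Face PEFT's prompt-tuning config; a minimal sketch of that mapping (the correspondence is an assumption for illustration, and the tokenizer name is made up):

```python
from peft import PromptTuningConfig, PromptTuningInit, TaskType

# RANDOM init: only the virtual token count is needed
random_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,
    prompt_tuning_init=PromptTuningInit.RANDOM,
)

# TEXT init: an init text (and the tokenizer used to encode it) is required
text_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Classify the sentiment of the text:",
    tokenizer_name_or_path="bigscience/bloom-560m",  # illustrative
)
```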
  • tuning_config.prompt_tuning_init_source_model:

    • Path pointing to the source prompt model. This path is relative to config.source_prompt_base (or the SOURCE_PROMPT_BASE environment variable).
    • The source model selection needs to correspond to the base_model.
    • There may be cases where multiple source prompts are available for a given model; in that case, their selection criteria needs to be determined.
    • Default: depends on the base_model. If MULTITASK_PROMPT_TUNING is not selected as the tuning type, this parameter is ignored.
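For the multitask case, PEFT exposes a separate config whose source-prompt init options roughly parallel ONLY_SOURCE_SHARED and AVERAGE_SOURCE; a hedged sketch (the parallel is an assumption, and the state-dict path is illustrative):

```python
from peft import MultitaskPromptTuningConfig, MultitaskPromptTuningInit, TaskType

# Initialize the shared prompt by averaging the source tasks' prompts
mpt_config = MultitaskPromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,
    prompt_tuning_init=MultitaskPromptTuningInit.AVERAGE_SOURCE_TASKS,
    # Trained source prompt weights (illustrative path)
    prompt_tuning_init_state_dict_path="source_prompts/bloom-560m/adapter_model.bin",
)
```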
  • tuning_config.output_model_types: List(str)

    • A list containing the strings ENCODER and/or DECODER.
    • Acceptable values per model type:
      • CausalLM: ["DECODER"]
      • Seq2Seq: ["ENCODER"], ["DECODER"], ["ENCODER", "DECODER"]
    • Default:
      • CausalLM: ["DECODER"]
      • Seq2Seq: ["ENCODER"]
    • Expose to end user recommendation: False
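Putting the parameters together, a hypothetical end-to-end training call might look like the following. The module path, train signature, and TuningConfig constructor are assumptions based on this document's parameter names, not a verified API; the model name and data stream are likewise illustrative:

```python
import caikit_nlp

# Hypothetical invocation; the real module API may differ.
model = caikit_nlp.modules.text_generation.PeftPromptTuning.train(
    base_model="bigscience/bloom-560m",   # or a path / caikit.Resource
    train_stream=train_stream,            # stream of TrainGenerationRecord
    tuning_type="PROMPT_TUNING",
    num_epochs=50,
    verbalizer="Classify the sentiment: {{input}}",
    batch_size=8,
    max_length=256,
    tuning_config=caikit_nlp.data_model.TuningConfig(num_virtual_tokens=20),
)
```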