CPT

A pre-trained model for Chinese language understanding and generation. This implementation is based on the BartModel from transformers.

Model

CPT supports both understanding and generation tasks. It has two kinds of decoders: a decoder for understanding (DU) and a decoder for generation (DG). Using DU alone makes the model similar to BERT, while using DG alone makes it similar to BART. The CPT model is placed in models/, and you can set cls_mode to run different tasks:

  • cls_mode = 1: encoder only (DU only).
  • cls_mode = 2: decoder only (DG only).
  • cls_mode = 3: both encoder and decoder.
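
The sketch below shows how such a cls_mode switch could select the features fed to a classification head. The function and argument names are illustrative assumptions, not the actual API in models/; check the code there for the real signatures.

import oneflow as flow

# Illustrative only: pick the pooled representation(s) for classification
# according to cls_mode, as described in the list above.
def pick_cls_features(enc_hidden, dec_hidden, cls_mode):
    # enc_hidden / dec_hidden: pooled tensors of shape [batch, hidden]
    if cls_mode == 1:    # DU only: classify on the encoder output (BERT-like)
        return enc_hidden
    if cls_mode == 2:    # DG only: classify on the decoder output (BART-like)
        return dec_hidden
    if cls_mode == 3:    # both: concatenate encoder and decoder features
        return flow.cat([enc_hidden, dec_hidden], dim=1)
    raise ValueError(f"unknown cls_mode: {cls_mode}")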

We implemented some extra functions:

  • tensor_unique in models/bart_utils.py (sketched below)
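
As a rough, NumPy-based illustration of what a tensor-level unique helper usually computes (the unique values plus inverse indices), here is a minimal sketch; the actual tensor_unique in models/bart_utils.py works on OneFlow tensors and may differ in signature and return values.

import numpy as np

def tensor_unique_sketch(x):
    # Illustrative only; not the repo's implementation.
    # Returns the sorted unique values of a 1-D array and, for each original
    # element, its index into that unique array (inverse indices).
    order = np.argsort(x, kind="stable")
    sorted_x = x[order]
    # Mark positions where a new value starts in the sorted array.
    is_new = np.concatenate(([True], sorted_x[1:] != sorted_x[:-1]))
    unique_vals = sorted_x[is_new]
    # Map each sorted position to its unique-value id, then undo the sort.
    unique_ids = np.cumsum(is_new) - 1
    inverse = np.empty_like(unique_ids)
    inverse[order] = unique_ids
    return unique_vals, inverse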

Requirements

This project uses the nightly build of OneFlow. You can install it with one of the following commands. CPU:

python3 -m pip install -f  https://staging.oneflow.info/branch/master/cpu  --pre oneflow

GPU:

python3 -m pip install -f  https://staging.oneflow.info/branch/master/cu112  --pre oneflow

You can install other dependencies using the following command.

pip install -r requirements.txt
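
You can then check that OneFlow is importable; the printed version should be a pre-release (nightly) build:

import oneflow as flow
print(flow.__version__)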

Train

We use the AFQMC task from CLUE (the Chinese GLUE benchmark) to test our CPT model. See the CLUE benchmark and AFQMC task documentation for details.

After training, CPT can determine whether two input sentences have the same meaning.

First, get the dataset:

wget https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/nlp/CPT/clue_afqmc.tar.gz
tar -xzf clue_afqmc.tar.gz

The dataset contains two JSON files: train.json and eval.json.
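
To peek at a few training examples, you can read the file as shown below. The snippet assumes the archive extracts to a clue_afqmc/ directory and that each line is one JSON object in the standard CLUE AFQMC format (sentence1, sentence2, label); adjust the path and field names if the repackaged files differ.

import json

with open("clue_afqmc/train.json", encoding="utf-8") as f:
    for i, line in enumerate(f):
        example = json.loads(line)
        print(example)  # expected fields: sentence1, sentence2, label
        if i == 2:
            break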

Then download the pretrained model cpt-base, which contains 12 encoder layers and 2 decoder layers:

wget https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/nlp/CPT/cpt-base.tar.gz
tar -xzf cpt-base.tar.gz

or the pretrained model cpt-large, which contains 24 encoder layers and 4 decoder layers:

wget https://oneflow-public.oss-cn-beijing.aliyuncs.com/model_zoo/nlp/CPT/cpt-large.tar.gz
tar -xzf cpt-large.tar.gz

Finally, run train.sh to train the model. Remember to set the CPT_PRETRAIN_DIR parameter so that the pretrained weights are loaded correctly.

sh train.sh
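
For reference, here is a minimal sketch of how the pretrained weights could be loaded from that directory, assuming CPT_PRETRAIN_DIR points at the extracted cpt-base/ or cpt-large/ folder and the checkpoint was saved with oneflow.save; check train.sh and the training code for the actual loading logic.

import os
import oneflow as flow

# Assumption: the checkpoint directory is loadable with flow.load.
pretrain_dir = os.environ.get("CPT_PRETRAIN_DIR", "./cpt-base")
state_dict = flow.load(pretrain_dir)
print(f"loaded {len(state_dict)} entries from {pretrain_dir}")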

Inference

The bash script infer.sh is used to test the trained model.

sh infer.sh

The default inputs are "订单退款成功了,为什么仍然要还花呗" ("the order was refunded successfully, so why do I still have to repay Huabei") and "退款到了花呗,为什么还要还款" ("the refund went to Huabei, so why do I still have to repay"). The model predicts label 1, which means the two sentences have the same meaning. You can change the inputs in infer.sh.
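
For reference, the sketch below shows how such a sentence pair is typically encoded before being fed to the classifier. The checkpoint path and the use of transformers' BertTokenizer are assumptions (CPT uses a BERT-style Chinese vocabulary); the actual model call lives in the repo's inference code, so treat infer.sh as the authoritative entry point.

from transformers import BertTokenizer

# Assumption: the extracted cpt-base/ directory contains the tokenizer vocabulary.
tokenizer = BertTokenizer.from_pretrained("./cpt-base")
encoded = tokenizer(
    "订单退款成功了,为什么仍然要还花呗",
    "退款到了花呗,为什么还要还款",
)
print(encoded["input_ids"])  # one [CLS] A [SEP] B [SEP] sequence
# The fine-tuned model maps this pair to label 1 (same meaning) or 0 (different meaning).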