Commit

Created using Colaboratory
thtrieu committed Mar 11, 2021
1 parent 728793b commit 05bf549
Showing 1 changed file with 56 additions and 34 deletions.
90 changes: 56 additions & 34 deletions colab/Interactive_Back_Translation.ipynb
@@ -4,7 +4,6 @@
"metadata": {
"colab": {
"name": "Interactive Back Translation.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [],
"include_colab_link": true
@@ -29,8 +28,7 @@
{
"cell_type": "markdown",
"metadata": {
"id": "FuzuxFlWeWh2",
"colab_type": "text"
"id": "FuzuxFlWeWh2"
},
"source": [
"# Data Augmentation by Backtranslation\n",
@@ -41,8 +39,7 @@
{
"cell_type": "markdown",
"metadata": {
"id": "HlgMKwzE0wMu",
"colab_type": "text"
"id": "HlgMKwzE0wMu"
},
"source": [
"**MIT License**\n",
@@ -71,8 +68,7 @@
{
"cell_type": "markdown",
"metadata": {
"id": "FkgZPK_GpTF0",
"colab_type": "text"
"id": "FkgZPK_GpTF0"
},
"source": [
"## Introduction\n",
@@ -85,7 +81,6 @@
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "6Uar8Ae88MmM"
},
"source": [
@@ -129,9 +124,10 @@
"cell_type": "code",
"metadata": {
"id": "kRO6TXGLT4Qb",
"colab_type": "code",
"cellView": "both",
"colab": {}
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "93778ca0-21e8-4ed7-8ed3-85578218adda"
},
"source": [
"%tensorflow_version 1.x\n",
@@ -168,25 +164,58 @@
"to_data_dir = 'data/translate_vien_iwslt32k/' # @param {type:\"string\"}\n",
"\n",
"bucket_path = 'gs://' + google_cloud_bucket\n",
"from_ckpt = os.path.join(bucket_path, from_ckpt)\n",
"from_ckpt_dir = os.path.join(bucket_path, from_ckpt)\n",
"to_ckpt = os.path.join(bucket_path, to_ckpt)\n",
"from_data_dir = os.path.join(bucket_path, from_data_dir)\n",
"to_data_dir = os.path.join(bucket_path, to_data_dir)\n",
"\n",
"# Convert directory into checkpoints\n",
"if tf.gfile.IsDirectory(from_ckpt):\n",
" from_ckpt = tf.train.latest_checkpoint(from_ckpt)\n",
"if tf.gfile.IsDirectory(from_ckpt_dir):\n",
" from_ckpt = tf.train.latest_checkpoint(from_ckpt_dir)\n",
"if tf.gfile.IsDirectory(to_ckpt):\n",
" to_ckpt = tf.train.latest_checkpoint(to_ckpt)\n"
],
"execution_count": 0,
"outputs": []
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": [
"Requirement already satisfied: tensorflow-datasets==3.2.1 in /usr/local/lib/python3.7/dist-packages (3.2.1)\n",
"Requirement already satisfied: promise in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (2.3)\n",
"Requirement already satisfied: absl-py in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (0.10.0)\n",
"Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (4.41.1)\n",
"Requirement already satisfied: termcolor in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (1.1.0)\n",
"Requirement already satisfied: dill in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (0.3.3)\n",
"Requirement already satisfied: wrapt in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (1.12.1)\n",
"Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (2.23.0)\n",
"Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (1.19.5)\n",
"Requirement already satisfied: attrs>=18.1.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (20.3.0)\n",
"Requirement already satisfied: protobuf>=3.6.1 in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (3.12.4)\n",
"Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (1.15.0)\n",
"Requirement already satisfied: tensorflow-metadata in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (0.28.0)\n",
"Requirement already satisfied: future in /usr/local/lib/python3.7/dist-packages (from tensorflow-datasets==3.2.1) (0.16.0)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->tensorflow-datasets==3.2.1) (2020.12.5)\n",
"Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->tensorflow-datasets==3.2.1) (2.10)\n",
"Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->tensorflow-datasets==3.2.1) (3.0.4)\n",
"Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests>=2.19.0->tensorflow-datasets==3.2.1) (1.24.3)\n",
"Requirement already satisfied: setuptools in /usr/local/lib/python3.7/dist-packages (from protobuf>=3.6.1->tensorflow-datasets==3.2.1) (54.0.0)\n",
"Requirement already satisfied: googleapis-common-protos<2,>=1.52.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow-metadata->tensorflow-datasets==3.2.1) (1.53.0)\n",
"/content\n",
"/content/dab\n",
"Already up to date.\n",
"/\n",
"back_translate.py gif\t\t__pycache__\tt2t_decoder.py\n",
"colab\t\t LICENSE\tREADME.md\tt2t_trainer.py\n",
"decoding.py\t problems.py\tt2t_datagen.py\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cwH1Iqau-udo",
"colab_type": "text"
"id": "cwH1Iqau-udo"
},
"source": [
"## Step 2. Run back translation!"
@@ -195,8 +224,7 @@
{
"cell_type": "markdown",
"metadata": {
"id": "3ITqk72kujSK",
"colab_type": "text"
"id": "3ITqk72kujSK"
},
"source": [
"### a. Back-translating an English sentence"
@@ -205,10 +233,7 @@
{
"cell_type": "code",
"metadata": {
"id": "hEoMmnXO2UaZ",
"colab_type": "code",
"cellView": "form",
"colab": {}
"id": "hEoMmnXO2UaZ"
},
"source": [
"beam_size = 2 #@param {type: \"integer\"}\n",
@@ -233,20 +258,20 @@
"--hparams_set=$hparams_set \\\n",
"--from_problem=$from_problem \\\n",
"--to_problem=$to_problem \\\n",
"--output_dir=$from_ckpt_dir \\\n",
"--from_ckpt=$from_ckpt \\\n",
"--to_ckpt=$to_ckpt \\\n",
"--from_data_dir=$from_data_dir \\\n",
"--to_data_dir=$to_data_dir \\\n",
"--backtranslate_interactively\n"
],
"execution_count": 0,
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "QB8WUiyaul48",
"colab_type": "text"
"id": "QB8WUiyaul48"
},
"source": [
"### b. Back translating sentences in the intermediate language"
@@ -256,9 +281,7 @@
"cell_type": "code",
"metadata": {
"id": "KJ5IMgC1uJPe",
"colab_type": "code",
"cellView": "form",
"colab": {}
"cellView": "form"
},
"source": [
"beam_size = 2 #@param {type: \"integer\"}\n",
@@ -283,14 +306,13 @@
"--to_data_dir=$to_data_dir \\\n",
"--backtranslate_interactively\n"
],
"execution_count": 0,
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "Gn5aGaGvADKL",
"colab_type": "text"
"id": "Gn5aGaGvADKL"
},
"source": [
"## Acknowledgements\n",
@@ -307,4 +329,4 @@
]
}
]
}
}
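
Note on the checkpoint change in the setup cell: the commit keeps the joined bucket path in `from_ckpt_dir` (later passed as `--output_dir`) and only resolves `from_ckpt` to a concrete checkpoint when that path is a directory. A minimal sketch of that pattern follows. The helper name `resolve_checkpoint` and the bucket and checkpoint names are hypothetical, the two callables stand in for `tf.gfile.IsDirectory` and `tf.train.latest_checkpoint` so the sketch runs without TensorFlow, and falling back to the full joined path when the setting is not a directory is a choice of this sketch rather than something the diff shows.

```python
import posixpath


def resolve_checkpoint(bucket_path, ckpt, is_directory, latest_checkpoint):
    """Return (ckpt_dir, ckpt_file) for a checkpoint setting.

    `is_directory` and `latest_checkpoint` are injected so this sketch runs
    without TensorFlow; in the notebook they would be tf.gfile.IsDirectory
    and tf.train.latest_checkpoint.
    """
    # posixpath keeps '/' separators on every platform, which is what a
    # gs:// path needs (the notebook itself uses os.path.join on Colab).
    ckpt_dir = posixpath.join(bucket_path, ckpt)

    # Assumption of this sketch: if the path is not a directory, treat the
    # joined path itself as the checkpoint.
    ckpt_file = ckpt_dir
    if is_directory(ckpt_dir):
        ckpt_file = latest_checkpoint(ckpt_dir)
    return ckpt_dir, ckpt_file


# Usage with stand-in callables (hypothetical bucket and checkpoint names).
known_dirs = {"gs://my-bucket/checkpoints/envi"}
ckpt_dir, ckpt_file = resolve_checkpoint(
    "gs://my-bucket",
    "checkpoints/envi",
    is_directory=lambda path: path in known_dirs,
    latest_checkpoint=lambda path: path + "/model.ckpt-100",
)
print(ckpt_dir)
print(ckpt_file)
```

Keeping the directory and the resolved checkpoint in separate variables means the directory is still available after the checkpoint lookup, which the single-variable version before this commit did not allow.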
