GitHub - EveryOneIsGross/ERNIE: BERT used as a bidirectional sentence generator

The concept of bidirectional expansion in this script refers to the method of growing the generated text both forward (rightward) and backward (leftward) from the initial seed point, using masked language modeling. This technique is implemented to enhance the contextuality and richness of the generated text by allowing the model to add relevant content on both sides of the initial input.

Implementation of Bidirectional Expansion

Initial Seed Processing:
- The process begins with a seed word or phrase which is first preprocessed to fit the model's requirements, including replacing special placeholders with actual model tokens (like replacing [MASK] with BERT’s mask token).
Adding Mask Tokens:
- At each iteration of the text generation cycle, mask tokens are added at both the beginning and the end of the current text sequence. This is done to prompt the model to consider expanding the text in both directions.
- For example, if your current text is "today is a beautiful", the script might modify it to "[MASK] today is a beautiful [MASK]". This encourages the model to predict suitable words to add before "today" and after "beautiful".
Prediction and Replacement of Mask Tokens:
- With the newly added mask tokens, the sequence is fed into the BERT model. BERT predicts the most probable replacements for the mask tokens based on the surrounding context.
- The script selects top candidates for each masked position, and these candidates are then evaluated for how well they fit into the overall sentence, maintaining grammatical and contextual coherence.
Iterative Refinement:
- This expansion process is not done in a single step but rather iteratively over several cycles, allowing for gradual development of the text. Each cycle can introduce new masked tokens at the edges of the text, continuously prompting the model to expand further.
- During each iteration, the model’s predictions are refined based on the feedback from both the BERT and GPT-2 models, where BERT suggests possible word choices and GPT-2 evaluates the coherence of the resulting sentences.
Dynamic Adjustment:
- The process dynamically adjusts the length and direction of expansion based on various factors, such as reaching a maximum length limit or achieving satisfactory coherence in the generated text, at which point no new mask tokens might be added.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
README.md		README.md
divineBEDSIDETABLE_07.py		divineBEDSIDETABLE_07.py
ernie_20.py		ernie_20.py
ernie_22_nonoise.py		ernie_22_nonoise.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Implementation of Bidirectional Expansion

About

Releases

Packages

Languages

EveryOneIsGross/ERNIE

Folders and files

Latest commit

History

Repository files navigation

Implementation of Bidirectional Expansion

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages