GANs for Natural Language Processing
This project aims to explore the use of Generative Adversarial Networks for NLP tasks like text generation, machine translation and question/answer systems.
Project status: Concept
Intel Technologies: AI DevCloud / Xeon, MKL, Intel Opt ML/DL Framework
Overview / Usage
Applying GANs to text is not straightforward, as they were originally designed to work with continuous-valued data (such as image data), where slight adjustments to the fake samples can make them increasingly realistic.
The issues with textual data are:
- Discreteness of textual data: picking a discrete token is not a differentiable operation, so the discriminator's gradients cannot flow back to the generator and standard backprop cannot be applied directly.
- Slight adjustments in continuous data like images lead to only a slightly different image. A slight change in textual data, on the other hand, may produce a completely different word or sentence, which is undesirable.
- Mode collapse in sentence generation, where the generator keeps producing a small set of similar sentences.
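The first two issues can be made concrete with a small sketch. It is only an illustration under stated assumptions: the three-word vocabulary, the logit values, and the nudge sizes are all hypothetical, and the helper functions are written here for the example rather than taken from any framework.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value (the discrete token choice)."""
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical three-word vocabulary and generator scores (logits).
vocab = ["cat", "car", "cap"]
logits = [2.0, 1.9, 0.5]
eps = 0.05  # a small nudge to the second logit

nudged = [logits[0], logits[1] + eps, logits[2]]

# Continuous view: the probabilities shift smoothly with the nudge.
prob_shift = softmax(nudged)[1] - softmax(logits)[1]

# Discrete view: the chosen token does not react to the small nudge at all...
same_token = vocab[argmax(logits)] == vocab[argmax(nudged)]

# ...until a slightly larger nudge crosses a threshold and the word flips outright.
big_nudge = [logits[0], logits[1] + 0.2, logits[2]]
flipped_token = vocab[argmax(big_nudge)]

print(prob_shift, same_token, flipped_token)
```

The probabilities respond to every small change, which is what gradient-based training relies on. The selected word either ignores the change or jumps to a different word, so it offers no useful gradient signal.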
The goal of this project is to find ways to use GANs on text and to apply these methods to NLP tasks.
Methodology / Approach
Proposed approaches:
- Making a system where the discriminator works on continuous data
- Reinforcement learning based sentence generation where the generator's goal is to maximize the rewards for sentence completion.
- Exploring actor-critic CGANs for filling in missing text
- Stochastic generator
- Exploring evaluation techniques like BLEU score
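As a rough illustration of the reinforcement learning approach, the sketch below trains a toy generator with a REINFORCE-style policy gradient. It is not the project's implementation. The vocabulary, the target sentence, the `reward` function (a fixed stand-in for a discriminator score), and all hyperparameters are assumptions made for the example, and the generator is just one independent row of logits per word position.

```python
import math
import random

VOCAB = ["the", "cat", "sat", "down"]  # hypothetical toy vocabulary
TARGET = [0, 1, 2]                      # "the cat sat", the stand-in "real" sentence

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample(probs, rng):
    """Draw one index from a categorical distribution."""
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def reward(sentence):
    """Stand-in for a discriminator score: fraction of positions matching TARGET."""
    return sum(a == b for a, b in zip(sentence, TARGET)) / len(TARGET)

def expected_reward(logits):
    """Exact expected reward of the policy (feasible only because the toy is tiny)."""
    return sum(softmax(row)[t] for row, t in zip(logits, TARGET)) / len(TARGET)

def train(steps=300, batch=8, lr=0.5, seed=0):
    rng = random.Random(seed)
    # One independent row of logits per position: a deliberately simple generator.
    logits = [[0.0] * len(VOCAB) for _ in TARGET]
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        samples = [[sample(p, rng) for p in probs] for _ in range(batch)]
        rewards = [reward(s) for s in samples]
        baseline = sum(rewards) / batch  # batch-mean baseline to reduce variance
        for s, r in zip(samples, rewards):
            advantage = r - baseline
            for pos, token in enumerate(s):
                for k in range(len(VOCAB)):
                    # d log p(token) / d logit_k = 1[k == token] - p_k
                    grad = (1.0 if k == token else 0.0) - probs[pos][k]
                    logits[pos][k] += lr * advantage * grad / batch
    return logits

logits = train()
greedy = [VOCAB[max(range(len(VOCAB)), key=lambda k: row[k])] for row in logits]
print(expected_reward([[0.0] * len(VOCAB) for _ in TARGET]),
      expected_reward(logits), greedy)
```

The update never differentiates through the sampled words. It only needs the log-probability gradient of the words that were sampled, weighted by the reward, which is why this family of methods sidesteps the discreteness problem. In a real setup the fixed `reward` would be replaced by a trained discriminator and the logit table by a sequence model.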
Technologies Used
Proposed tools/resources:
Intel AI DevCloud – a free Intel platform offering cloud-based computing resources for deep learning training and AI workloads. The cloud is powered by Intel Xeon Scalable processors and is expected to provide the stable compute environment needed during GAN training.
Intel Math Kernel Library for Deep Neural Networks (Intel MKL-DNN) – a set of highly optimized building blocks that accelerate the compute-intensive parts of DNN frameworks such as TensorFlow, which tends to be the primary choice for implementing GANs. It can speed up the matrix and vector operations that are heavily involved in the training process.