imagecaptioning-robot

yang stone

Shenzhen, Guangdong

An image captioning application: the user uploads an image or a video, and the application automatically generates captions and text descriptions.

Project status: Under Development

Artificial Intelligence

Overview / Usage

We use TensorFlow to implement the image-to-text model described in the paper:
"Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge."
Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan.
IEEE transactions on pattern analysis and machine intelligence (2016).
Full text available at: http://arxiv.org/abs/1609.06647

We then plan to improve the model:
1. Add an attention mechanism so the model focuses on the relevant part of the image as it generates each word.
2. Use Faster R-CNN to improve the model.
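
The attention idea in item 1 can be illustrated with a minimal sketch in plain Python. It is not the project's implementation: the dot-product scoring, the function names, and the toy feature vectors are all assumptions chosen for brevity. At each decoding step the decoder state is compared with every image region, the scores are normalized into weights, and the weighted sum of region features becomes the context used to predict the next word.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region_features, decoder_state):
    """Return (weights, context) for one decoding step.

    region_features: list of feature vectors, one per image region.
    decoder_state: the decoder's current hidden state (same length).
    """
    # Dot-product scoring is an assumption made for this sketch.
    scores = [sum(f * h for f, h in zip(feat, decoder_state))
              for feat in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    # Context vector: attention-weighted average of the region features.
    context = [sum(w * feat[i] for w, feat in zip(weights, region_features))
               for i in range(dim)]
    return weights, context

# Toy example: three regions with 2-dimensional features (made-up values).
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [2.0, 0.0]
weights, context = attend(regions, state)
```

In this toy input the first region aligns best with the decoder state, so it receives the largest weight and dominates the context vector.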

We are building a web app that connects to our model server to provide an image captioning service.

Methodology / Approach

We use TensorFlow to build a deep learning model:
The model generates captions from a fixed vocabulary that describe the contents of images in the COCO Dataset. The model consists of an encoder model - a deep convolutional net using the Inception-v3 architecture trained on ImageNet-2012 data - and a decoder model - an LSTM network that is trained conditioned on the encoding from the image encoder model. The input to the model is an image, and the output is a sentence describing the image content.
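
To make the encoder-decoder flow concrete, here is a minimal sketch of how a decoder can emit a caption one word at a time from a fixed vocabulary. It uses greedy decoding for simplicity, whereas the Show and Tell paper uses beam search at inference time. The `step_fn` interface, the start and end tokens, and the `toy_step` stand-in are hypothetical; in the real system the trained LSTM would supply the word scores.

```python
START, END = "<S>", "</S>"  # assumed start/end-of-sentence tokens

def greedy_decode(image_encoding, step_fn, max_len=20):
    """Generate a caption one word at a time.

    step_fn(image_encoding, words_so_far) returns a dict mapping each
    vocabulary word to a score; it stands in for the trained decoder.
    """
    words = [START]
    for _ in range(max_len):
        scores = step_fn(image_encoding, words)
        next_word = max(scores, key=scores.get)  # pick the best-scoring word
        if next_word == END:
            break
        words.append(next_word)
    return " ".join(words[1:])  # drop the start token

def toy_step(image_encoding, words):
    """Hypothetical stand-in decoder that replays a canned caption."""
    canned = ["a", "dog", "on", "grass", END]
    target = canned[min(len(words) - 1, len(canned) - 1)]
    return {w: (1.0 if w == target else 0.0) for w in canned}

# The encoding would come from the image encoder; these values are made up.
caption = greedy_decode(image_encoding=[0.1, 0.2], step_fn=toy_step)
```

Swapping `toy_step` for a function that runs one LSTM step conditioned on the image encoding would leave the decoding loop itself unchanged.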
We use Vue to build a web app:
The web application captions images and lets the user filter images based on their content. It provides an interactive user interface backed by a lightweight Python server built with Tornado. The server receives images from the UI, sends them to a REST endpoint for the model, and displays the generated captions in the UI.
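
The forwarding step can be sketched as follows. This is an illustration only, written with the Python standard library so it stays self-contained; in the application described above the same logic would sit inside the Tornado server. The endpoint URL, the `image_b64` request field, and the `captions` response shape are all assumptions, since the document does not specify the model server's API.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint: the real model server URL is not given here.
MODEL_URL = "http://localhost:8501/caption"

def build_caption_request(image_bytes, url=MODEL_URL):
    """Wrap uploaded image bytes in a JSON POST request for the model."""
    payload = json.dumps(
        {"image_b64": base64.b64encode(image_bytes).decode("ascii")}
    )
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_captions(response_body):
    """Extract caption strings from the assumed JSON response shape."""
    data = json.loads(response_body)
    return [c["text"] for c in data.get("captions", [])]

request = build_caption_request(b"fake image bytes")
# To actually send it (requires a running model server):
#   body = urllib.request.urlopen(request).read()
#   captions = parse_captions(body)
```

Keeping request building and response parsing as separate pure functions makes them easy to test without a running model server.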

Technologies Used

deep learning, Python, Vue

Repository

https://github.com/stoensin/IC/
