High-quality YouTube video streaming at a lower (internet) data rate.

Debapriya Tula

Bengaluru, Karnataka

We want to watch YouTube videos at high resolution, but not at the cost of losing a lot of our data quota for the day. What if we could reduce the number of frames per second YouTube sends and reconstruct the missing frames on the client? For background, do go through this paper by Niklaus et al.: https://arxiv.org/abs/1708.01692

Project status: Under Development

Artificial Intelligence, Graphics and Media

Intel Technologies
DevCloud


Overview / Usage

What if we could watch videos on YouTube (for now, only YouTube) at a resolution of 1080p or 1440p without using much data? Say, a data rate of 400 kbps could load entire videos, and you could watch them without buffering.

How do we achieve that? Videos are typically streamed at around 30 frames per second (fps). What if we could bring this down to, say, 20 fps and regenerate/interpolate the intermediate frames on the client? This idea of regeneration, known as video frame interpolation, is studied in the field of computer vision.
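As a rough, back-of-the-envelope sketch of the potential saving (assuming bitrate scales roughly linearly with frame rate, which real codecs only approximate):

```python
# Back-of-the-envelope estimate only: real codecs do not scale bitrate
# linearly with frame rate, so treat these numbers as illustrative.
original_fps = 30
reduced_fps = 20
stream_kbps = 400  # the hypothetical target data rate mentioned above

fraction_sent = reduced_fps / original_fps
print(f"Frames sent by the server: {fraction_sent:.0%}")            # 67%
print(f"Frames interpolated client-side: {1 - fraction_sent:.0%}")  # 33%
print(f"Equivalent full-rate stream: {stream_kbps / fraction_sent:.0f} kbps")  # 600 kbps
```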

For now, it's just a concept. I am looking for like-minded people to work on this project with me.

If successfully implemented, this project has the potential to benefit not just the users of YouTube but also the people who upload videos, and ultimately YouTube itself.

Methodology / Approach

The approach I first used follows the design of the paper "From Here to There: Video Inbetweening Using Direct 3D Convolutions". The model receives three inputs: the starting frame x_s, the ending frame x_e, and a Gaussian noise vector u ∈ R^d. The output of the model is a video x_s, x_1, x_2, ..., x_{T-2}, x_e, which varies with the choice of the Gaussian noise vector.

The model consists of three components: an image encoder, a latent representation generator and a video generator. A video discriminator and an image discriminator are added so that the whole model can be trained using adversarial learning to produce realistic video sequences.
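As a minimal PyTorch sketch of how those three components could fit together (layer sizes, the 64x64 resolution, and all names here are my own illustrative assumptions, not the paper's actual architecture):

```python
import torch
import torch.nn as nn

class ToyInbetweener(nn.Module):
    """Toy sketch of the three-component generator; layer sizes and
    names are illustrative assumptions, not the paper's architecture."""

    def __init__(self, d=128, T=16, feat=64):
        super().__init__()
        self.T = T
        # Image encoder: maps a 3x64x64 frame to an 8x8 feature map.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, feat, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Latent representation generator: fuses the two encoded frames
        # and the noise u into a spatio-temporal latent volume of T steps.
        self.latent_generator = nn.Sequential(
            nn.Conv3d(2 * feat + d, feat, 3, padding=1), nn.ReLU(),
            nn.Conv3d(feat, feat, 3, padding=1), nn.ReLU(),
        )
        # Video generator: decodes the latent volume back to T frames.
        self.video_generator = nn.Sequential(
            nn.ConvTranspose3d(feat, feat, (1, 4, 4), stride=(1, 2, 2),
                               padding=(0, 1, 1)), nn.ReLU(),
            nn.ConvTranspose3d(feat, feat, (1, 4, 4), stride=(1, 2, 2),
                               padding=(0, 1, 1)), nn.ReLU(),
            nn.ConvTranspose3d(feat, 3, (1, 4, 4), stride=(1, 2, 2),
                               padding=(0, 1, 1)), nn.Tanh(),
        )

    def forward(self, x_s, x_e, u):
        e_s = self.image_encoder(x_s)                   # (B, feat, 8, 8)
        e_e = self.image_encoder(x_e)
        B, _, H, W = e_s.shape
        # Broadcast the noise over space and add a time axis.
        u_map = u[:, :, None, None].expand(-1, -1, H, W)
        z = torch.cat([e_s, e_e, u_map], dim=1)         # (B, 2*feat+d, 8, 8)
        z = z.unsqueeze(2).repeat(1, 1, self.T, 1, 1)   # (B, ., T, 8, 8)
        latents = self.latent_generator(z)              # (B, feat, T, 8, 8)
        # T frames (endpoints included here for simplicity).
        return self.video_generator(latents)            # (B, 3, T, 64, 64)

model = ToyInbetweener()
video = model(torch.randn(2, 3, 64, 64),   # x_s
              torch.randn(2, 3, 64, 64),   # x_e
              torch.randn(2, 128))         # u
print(video.shape)                         # torch.Size([2, 3, 16, 64, 64])
```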

It takes a GAN-based approach, in which each of the components mentioned above is trained with an adversarial loss function.
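For reference, a standard non-saturating GAN loss pair as commonly used with such discriminators (my own formulation for illustration; the paper's exact losses may differ):

```python
import torch
import torch.nn.functional as F

# Standard GAN losses for illustration; the paper's exact formulation may
# differ. `real_logits`/`fake_logits` come from the video or image
# discriminator scoring real clips/frames vs. generated ones.
def discriminator_loss(real_logits, fake_logits):
    # Push real samples toward label 1 and generated ones toward label 0.
    return (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
            + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))

def generator_loss(fake_logits):
    # The generator tries to make the discriminator label fakes as real.
    return F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))
```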

This method was tried but found to be unsuitable for our problem statement. We are now following a new approach based on adaptive convolutions (AdaConv).
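The core idea of adaptive convolution is to predict, for every output pixel, a small convolution kernel that is applied to the patches around that pixel in both input frames. A minimal sketch of that step (a toy 5x5 kernel instead of the paper's 41x41, with the kernel-prediction CNN omitted; names and shapes are my own assumptions):

```python
import torch
import torch.nn.functional as F

def apply_per_pixel_kernels(frame, kernels, k):
    """Weigh the kxk patch around each pixel by that pixel's own kernel.

    frame:   (B, C, H, W)
    kernels: (B, k*k, H, W), one predicted kernel per output pixel
    """
    B, C, H, W = frame.shape
    pad = k // 2
    patches = F.unfold(F.pad(frame, [pad] * 4), k)   # (B, C*k*k, H*W)
    patches = patches.view(B, C, k * k, H * W)
    weights = kernels.view(B, 1, k * k, H * W)       # broadcast over channels
    return (patches * weights).sum(dim=2).view(B, C, H, W)

def adaconv_interpolate(frame1, frame2, k1, k2, k=5):
    # The interpolated frame is the sum of both input frames convolved
    # with their per-pixel kernels.
    return (apply_per_pixel_kernels(frame1, k1, k)
            + apply_per_pixel_kernels(frame2, k2, k))

# Usage with dummy kernels; a real kernel-prediction CNN (not shown)
# would estimate k1 and k2 from the two input frames.
B, C, H, W, k = 1, 3, 64, 64, 5
f1, f2 = torch.rand(B, C, H, W), torch.rand(B, C, H, W)
k1 = torch.softmax(torch.randn(B, k * k, H, W), dim=1) * 0.5
k2 = torch.softmax(torch.randn(B, k * k, H, W), dim=1) * 0.5
middle = adaconv_interpolate(f1, f2, k1, k2, k)      # (B, C, H, W)
```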

The approach is outlined well in this Medium article: https://medium.com/analytics-vidhya/review-adaconv-video-frame-interpolation-via-adaptive-convolution-video-frame-interpolation-fbce6acaa2a5

Technologies Used

  1. PyTorch.

  2. Supporting libraries such as NumPy and Matplotlib.

  3. Intel DevCloud for model training.

Repository

https://github.com/Debapriya-Tula/AdaConv-Pytorch

