DataGAN: Leveraging Synthetic Data for Self-Driving Vehicles


DataGAN leverages Generative Adversarial Networks, specifically DCGANs, to generate high-quality synthetic self-driving images at scale that can be used to train and improve the performance of computer vision models.

Project status: Published/In Market

oneAPI, Artificial Intelligence

Intel Technologies
Intel Integrated Graphics


Overview / Usage

So far, more than $16 billion has been spent on self-driving research. What's the problem? Self-driving is expensive. Why? Gathering data and training these models is not only time-consuming but extremely costly. Consider that Waymo has driven more than 20 million miles on public roads just to gather data (the energy consumed is a topic for a whole other article). And what percentage of that data is actually useful? Very little. Why? Most of it comes from "normal" driving scenes, not edge-case scenarios such as overtaking, parking, traffic, etc.

How do we solve that?

Generative adversarial networks. For now, the focus is on general road scenes rather than edge-case scenarios like parking and overtaking. Being able to generate vast amounts of self-driving data without spending excessive capital on data gathering could be crucial for self-driving deployment. As an attempt to solve this problem, I've been building out DataGAN. By focusing on DCGANs, I'm creating high-quality self-driving images that can be used to train and improve the performance of computer vision models such as lane and object detection, and semantic segmentation.

Methodology / Approach

Rather than using straightforward Dense layers to generate and discriminate images, DCGANs leverage Convolutional Neural Networks (CNNs) for both tasks. The TL;DR on CNNs: they break an image down while capturing its spatial dependencies through learned filters.
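
As a minimal illustration (not code from the DataGAN repo), a single PyTorch convolution shows how learned filters turn an image into spatial feature maps:

```python
import torch
import torch.nn as nn

# One convolutional layer: 3 input channels (RGB), 16 learned filters.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

image = torch.randn(1, 3, 128, 128)  # dummy batch of one RGB image
features = conv(image)               # each filter responds to local spatial structure
print(features.shape)                # torch.Size([1, 16, 128, 128])
```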

In a DCGAN, the generator uses upsampling techniques such as transpose convolutions along with Leaky ReLU activations. The discriminator, on the other hand, uses convolutions to downsample the input until it reaches a 1x1 output that tells us whether the input image is real or fake.
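
A minimal sketch of that generator/discriminator pairing in PyTorch for 128x128 RGB images (the layer widths and latent size are illustrative assumptions, not the exact DataGAN architecture):

```python
import torch.nn as nn

LATENT_DIM = 100  # assumed latent vector size

# Generator: upsample a latent vector to a 128x128 RGB image
# via transpose convolutions + Leaky ReLU, as described above.
generator = nn.Sequential(
    nn.ConvTranspose2d(LATENT_DIM, 512, 4, 1, 0, bias=False),  # 1x1 -> 4x4
    nn.BatchNorm2d(512), nn.LeakyReLU(0.2, inplace=True),
    nn.ConvTranspose2d(512, 256, 4, 2, 1, bias=False),         # 4x4 -> 8x8
    nn.BatchNorm2d(256), nn.LeakyReLU(0.2, inplace=True),
    nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),         # 8x8 -> 16x16
    nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
    nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),          # 16x16 -> 32x32
    nn.BatchNorm2d(64), nn.LeakyReLU(0.2, inplace=True),
    nn.ConvTranspose2d(64, 32, 4, 2, 1, bias=False),           # 32x32 -> 64x64
    nn.BatchNorm2d(32), nn.LeakyReLU(0.2, inplace=True),
    nn.ConvTranspose2d(32, 3, 4, 2, 1, bias=False),            # 64x64 -> 128x128
    nn.Tanh(),  # outputs in [-1, 1] to match normalized training images
)

# Discriminator: downsample a 128x128 image to a single 1x1 real/fake score.
discriminator = nn.Sequential(
    nn.Conv2d(3, 32, 4, 2, 1, bias=False),     # 128 -> 64
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(32, 64, 4, 2, 1, bias=False),    # 64 -> 32
    nn.BatchNorm2d(64), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 128, 4, 2, 1, bias=False),   # 32 -> 16
    nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(128, 256, 4, 2, 1, bias=False),  # 16 -> 8
    nn.BatchNorm2d(256), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(256, 512, 4, 2, 1, bias=False),  # 8 -> 4
    nn.BatchNorm2d(512), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(512, 1, 4, 1, 0, bias=False),    # 4 -> 1x1 logit
    nn.Sigmoid(),
)
```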

DataGAN leverages the Fully Convolutional Network (FCN) architecture along with the Deep Convolutional Generative Adversarial Network to create trainable synthetic data.

With respect to training, I'm currently running DataGAN on a modified version of the Cityscapes dataset. Its quality and resolution make it an ideal choice, especially since generating urban scenes, as opposed to highways, is far more beneficial to the overall AV industry.

For faster processing and training, I'm resizing the images to 128x128 rather than the original 256x256, mainly due to hardware constraints.
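
A sketch of that preprocessing step with torchvision (the dataset path and batch size here are placeholders, not taken from the repo):

```python
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.Resize((128, 128)),                 # downsize from 256x256 for the 4 GB GPU
    transforms.ToTensor(),
    transforms.Normalize((0.5,) * 3, (0.5,) * 3),  # scale to [-1, 1], matching the Tanh output
])

# "data/cityscapes" is a placeholder path for the modified Cityscapes images.
dataset = datasets.ImageFolder("data/cityscapes", transform=transform)
loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=2)
```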

My generator uses convolution-transpose operations to create realistic driving-scene images, updating its weights during training to produce robust outputs.

I've been training for roughly 4,000 epochs (3,875 to be exact) on this dataset, with around 3,075 training images. So far I haven't faced mode collapse, which is a good indicator!
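
For reference, here is a hedged sketch of one DCGAN training iteration with binary cross-entropy loss, reusing the generator, discriminator, and loader names from the sketches above (the Adam hyperparameters are the standard DCGAN-paper defaults, assumed rather than confirmed from the repo). Watching both losses, for example the discriminator loss collapsing toward zero, is one practical way to catch mode collapse early:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
generator.to(device)
discriminator.to(device)

criterion = nn.BCELoss()
# Standard DCGAN-paper Adam settings (an assumption, not taken from the repo).
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))

for epoch in range(3875):
    for real, _ in loader:
        real = real.to(device)
        b = real.size(0)
        noise = torch.randn(b, LATENT_DIM, 1, 1, device=device)
        fake = generator(noise)

        # Discriminator step: push real images toward 1, generated images toward 0.
        opt_d.zero_grad()
        loss_real = criterion(discriminator(real).view(-1), torch.ones(b, device=device))
        loss_fake = criterion(discriminator(fake.detach()).view(-1), torch.zeros(b, device=device))
        (loss_real + loss_fake).backward()
        opt_d.step()

        # Generator step: try to make the discriminator score fakes as real.
        opt_g.zero_grad()
        loss_g = criterion(discriminator(fake).view(-1), torch.ones(b, device=device))
        loss_g.backward()
        opt_g.step()
```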

Technologies Used

Python, PyTorch (torch, torchvision), matplotlib, numpy, Kaggle datasets, NVIDIA GTX 1650 (4 GB)

Repository

https://github.com/srianumakonda/DataGAN
