AI/ML Projects

Deep Learning Projects

  1. Artistic style transfer
    • Built deep convolutional models that capture an artist's style and reproduce arbitrary images in that style. Used CNNs to extract features and explored different optimization algorithms to improve the aesthetic quality of the generated image.
    • Trained perceptual-loss networks to generate a stylized image in a single forward pass, and fed live webcam video through the network to achieve real-time video style transfer.
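
For illustration, the style term commonly used in perceptual-loss style transfer compares Gram matrices of CNN feature maps. The sketch below is a minimal pure-Python version under stated assumptions: the feature maps are plain nested lists (one flat list of activations per channel), and the function names `gram_matrix` and `style_loss` are hypothetical, not taken from the project code.

```python
def gram_matrix(features):
    """Gram matrix of a feature map.

    `features` is assumed to be a list of C channels, each a flat list
    of H*W activations. Entries are normalised by C * H * W.
    """
    channels = len(features)
    positions = len(features[0])
    norm = channels * positions
    return [
        [sum(a * b for a, b in zip(features[i], features[j])) / norm
         for j in range(channels)]
        for i in range(channels)
    ]


def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    g_gen = gram_matrix(generated)
    g_sty = gram_matrix(style)
    n = len(g_gen)
    return sum(
        (g_gen[i][j] - g_sty[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    ) / (n * n)
```

In a real pipeline the feature maps would come from a pretrained CNN and the loss would be summed over several layers; this sketch only shows the per-layer computation.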
  2. Analysis of Depth estimation models using an error detection network
    • Implemented a UNet architecture with a ResNet50 encoder to analyse state-of-the-art depth estimation models.
    • Analysed the PlaneRCNN, From Big to Small, and Multi-Scale Deep Network methods for depth estimation accuracy and scope for improvement.
  3. 3D Image segmentation using graph neural networks
    • Attempted to improve image segmentation using graph-based deep learning algorithms, representing the depth information as a graph.
    • Tried a novel graph generation method to improve proposal generation.
  4. Audio Equalization of Musical Instruments using Deep Learning
    • Approached audio equalisation with several deep learning architectures.
    • Generated a synthetic dataset from the DSD100 dataset for the task.
    • Compared the architectures: an encoder-decoder architecture taking STFT input gave the best SNR, with a WaveNet-inspired architecture operating on raw audio a close second.
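
As a point of reference for the SNR comparison above, one common definition is the ratio of reference-signal power to error power, in decibels. The sketch below assumes that definition; the project may have computed SNR differently, and the name `snr_db` is hypothetical.

```python
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB between a reference and an estimate.

    Both arguments are assumed to be equal-length sequences of samples.
    """
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0:
        return math.inf   # perfect reconstruction
    if signal_power == 0:
        return -math.inf  # silent reference, non-zero error
    return 10.0 * math.log10(signal_power / noise_power)
```

A higher value means the equalised output is closer to the target audio.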
  5. Various Deep Learning Algorithm Implementations
    • Generative Adversarial Network: Implemented a generative adversarial network to generate handwritten digits similar to those in the MNIST dataset.
    • Conditional Generative Adversarial Network: Implemented a conditional GAN to control the digit generation process using the MNIST dataset.
    • Siamese Network for Voice Identification: Implemented a Siamese network for speaker classification of audio clips.
    • Variational Autoencoder on MNIST: Learned a Gaussian distribution from which MNIST digits can be generated. Varied the parameters of the distribution to observe their effect on the generated digits.
    • Link to the repository
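
To illustrate the variational autoencoder item above, the following sketch shows the standard reparameterisation step for sampling from a diagonal Gaussian, together with the KL divergence to a standard normal. It is a minimal pure-Python sketch, not the repository's code: the latent vector is a plain list, no decoder is included, and the names `sample_latent` and `kl_to_standard_normal` are hypothetical.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv)
        for m, lv in zip(mu, log_var)
    )


# Varying the distribution parameters changes where latent samples fall;
# each sample would then be passed to a trained decoder to produce a digit.
rng = random.Random(0)
narrow = sample_latent([0.0, 0.0], [-4.0, -4.0], rng)
shifted = sample_latent([2.0, -2.0], [0.0, 0.0], rng)
```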