SunnyJaneH/DeepLearning-ComputerVision-PyTorch

DeepLearning-ComputerVision-PyTorch

A collection of deep learning and computer vision projects implemented in PyTorch, covering foundational neural networks, CNN architectures, autoencoders, and NPU-deployed super resolution models.


Repository Structure

DeepLearning-ComputerVision-PyTorch/
├── 01_neural_network_from_scratch/
├── 02_cnn_lenet5/
├── 03_cnn_alexnet/
├── 04_autoencoder/
├── 05_image_classification_npu/
├── 06_super_resolution_high_psnr/
└── 07_super_resolution_low_latency/

Projects

01 Neural Network from Scratch

Implements the forward pass, backpropagation, and weight updates with raw PyTorch tensors, without using nn.Module.

| File | Description | Dataset | Result |
| --- | --- | --- | --- |
| binary_classification_nn.py | 2-layer NN (ReLU + Sigmoid) | MNIST binary | Loss: 0.704 |
| backpropagation_from_scratch.py | Manual backprop with chain rule | MNIST binary | Test Acc: 98.43% |
| multiclass_nn_softmax.py | Deep NN with Softmax output | MNIST 10-class | Test Acc: 92.10% |

Key concepts: He/Xavier initialization, binary cross-entropy, softmax + categorical cross-entropy, manual gradient computation
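As a rough illustration of the softmax + categorical cross-entropy step listed above, here is a minimal sketch in plain Python. It is not taken from the repository scripts, which use PyTorch tensors; the function names and the sample logits are assumptions made for illustration. It shows the well-known result that the gradient of the loss with respect to the logits is the softmax output minus the one-hot target.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    """Categorical cross-entropy for one sample with an integer class label."""
    return -math.log(probs[target])

def grad_logits(probs, target):
    """Gradient of the loss w.r.t. the logits: probs - one_hot(target)."""
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

# Hypothetical logits for a 3-class problem; the true class is 0.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss = cross_entropy(probs, target=0)
grad = grad_logits(probs, target=0)
```

The gradient entries sum to zero, and only the entry for the true class is negative, which pushes that logit up during gradient descent.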


02 CNN — LeNet-5

Classic LeNet-5 architecture with Average Pooling and Tanh activations, trained on MNIST.

| File | Architecture | Dataset | Result |
| --- | --- | --- | --- |
| lenet5_mnist.py | LeNet-5 (2 Conv + 3 FC) | MNIST 10-class | Test Acc: 98.65% |

Key concepts: Convolutional layers, average pooling, SGD + momentum
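To make the "2 Conv + 3 FC" layout more concrete, the sketch below traces feature-map sizes through a LeNet-5-style stack using the standard convolution output-size formula. The hyperparameters are assumptions based on the classic LeNet-5 design (5×5 kernels, 2×2 average pooling, 6 and 16 channels, padding of 2 on the first convolution so that 28×28 MNIST images behave like the original 32×32 inputs); lenet5_mnist.py may differ.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def lenet5_shapes(input_size=28, first_padding=2):
    """Return (layer, channels, spatial size) for an assumed LeNet-5 layout."""
    shapes = []
    s = conv_out(input_size, kernel=5, padding=first_padding)
    shapes.append(("conv1", 6, s))
    s = conv_out(s, kernel=2, stride=2)  # 2x2 average pooling
    shapes.append(("pool1", 6, s))
    s = conv_out(s, kernel=5)
    shapes.append(("conv2", 16, s))
    s = conv_out(s, kernel=2, stride=2)  # 2x2 average pooling
    shapes.append(("pool2", 16, s))
    return shapes

shapes = lenet5_shapes()
_, channels, size = shapes[-1]
flat_features = channels * size * size  # input width of the first FC layer
```

Under these assumptions the spatial size goes 28 → 14 → 10 → 5, so the first fully connected layer receives 16 × 5 × 5 = 400 features.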


03 CNN — AlexNet

Custom AlexNet implementation trained on a 20-class ImageNet subset with data augmentation and cosine LR scheduling.

| File | Architecture | Dataset | Result |
| --- | --- | --- | --- |
| alexnet_train.py | AlexNet (5 Conv + 3 FC) | ImageNet-20 | Val Acc: 56.80% (epoch 40) |

Key concepts: Dropout, CosineAnnealingLR, gradient clipping, ImageNet normalization, feature map visualization
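The cosine schedule mentioned above can be sketched with the closed-form expression behind cosine annealing without restarts. The base learning rate (0.01) and the schedule length (40 epochs) are placeholder assumptions, not values confirmed by alexnet_train.py.

```python
import math

def cosine_annealing_lr(epoch, base_lr, t_max, eta_min=0.0):
    """Learning rate at a given epoch under cosine annealing (no restarts)."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)

# Hypothetical settings: start at 0.01 and decay to 0 over 40 epochs.
schedule = [cosine_annealing_lr(e, base_lr=0.01, t_max=40) for e in range(41)]
```

The rate starts at base_lr, passes through the midpoint between base_lr and eta_min halfway through, and reaches eta_min at t_max, decaying slowly at both ends and fastest in the middle.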


04 Autoencoder

Convolutional autoencoder for unsupervised image reconstruction on Fashion-MNIST.

| File | Architecture | Dataset | Result |
| --- | --- | --- | --- |
| convolutional_autoencoder.py | Encoder (4 Conv) + Decoder (3 ConvTranspose) | Fashion-MNIST | MSE converged |

Key concepts: ConvTranspose2d, latent space (dim=128), MSE reconstruction loss, BatchNorm
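To show how strided convolutions shrink an image and transposed convolutions restore it, the sketch below applies the output-size formulas documented for Conv2d and ConvTranspose2d. The kernel, stride, and padding values are hypothetical and cover only two down/up steps; they are not the actual layer settings of convolutional_autoencoder.py.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_transpose_out(size, kernel, stride=1, padding=0,
                       output_padding=0, dilation=1):
    """Spatial output size of a transposed convolution."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)

# Assumed encoder steps: 3x3 kernels, stride 2, padding 1.
down1 = conv_out(28, kernel=3, stride=2, padding=1)
down2 = conv_out(down1, kernel=3, stride=2, padding=1)

# Assumed decoder steps mirror the encoder; output_padding=1 recovers
# the exact size lost to integer division in the strided convolution.
up1 = conv_transpose_out(down2, kernel=3, stride=2, padding=1, output_padding=1)
up2 = conv_transpose_out(up1, kernel=3, stride=2, padding=1, output_padding=1)
```

With these settings a 28×28 Fashion-MNIST image goes 28 → 14 → 7 in the encoder and 7 → 14 → 28 in the decoder.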

[Figure: original vs. reconstructed images]

[Figure: training and test loss curves]


05 Image Classification — NPU Deployment (JaneCNN)

Custom 5-block CNN designed for NPU deployment. Full pipeline: training → ONNX export → INT8 quantization → NPU inference.

| File | Description |
| --- | --- |
| janecnn_train.py | Train JaneCNN on ImageNet-20, export to ONNX |
| compile_onnx_to_mxq.py | Compile ONNX to MXQ via Qubee (INT8 post-training quantization) |
| inference_npu.py | Run inference on MX1601 NPU, compute accuracy |

Architecture: 5 blocks (Conv + BatchNorm + LeakyReLU + MaxPool) + AdaptiveAvgPool + FC
Result: Best Val Acc 68.30% (PyTorch) → deployed on NPU (MXQ)
Key concepts: AdaptiveAvgPool2d, LeakyReLU, INT8 quantization, ONNX export, NPU deployment
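For intuition about what INT8 post-training quantization does to weights, here is a minimal sketch of symmetric per-tensor quantization in plain Python. This is a generic textbook scheme, not Qubee's actual algorithm, which this README does not describe; the function names and sample weights are assumptions.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Map INT8 codes back to approximate floating-point values."""
    return [q * scale for q in quantized]

# Hypothetical weights; the largest magnitude determines the scale.
weights = [0.52, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
```

The round-trip error per weight is bounded by half the scale, so a single large outlier coarsens the grid for every other weight. That is one reason per-channel scales are often preferred for convolution weights.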


06 Super Resolution — StrongSRNet, High PSNR (NPU)

Residual super resolution network with 24 FlashBlocks optimized for maximum reconstruction quality. Trained on LR/HR image pairs and deployed on NPU.

| File | Description |
| --- | --- |
| strongsrnet_train.py | Train StrongSRNet, export to ONNX |
| compile_onnx_to_mxq.py | Compile ONNX to MXQ (INT8, per-channel quantization) |
| inference_psnr.py | NPU inference + PSNR evaluation |

Architecture: Head Conv + 24× FlashBlock (residual) + Tail Conv + Sigmoid
Result: PSNR ~23.3 dB on NPU (baseline FP32: 22.08 dB)
Key concepts: Residual blocks, PSNR metric, LR/HR pairs, ReduceLROnPlateau
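Since PSNR is the headline metric here, a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE), may help. It assumes pixel values in [0, 1], which matches a Sigmoid output layer, and uses flat lists in place of image tensors. The exact evaluation code in inference_psnr.py may differ, for example in colour space or border cropping.

```python
import math

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical pixels: every value is off by 0.1, so the MSE is 0.01.
hr = [0.0, 0.5, 1.0, 0.25]
sr = [0.1, 0.4, 0.9, 0.35]
value = psnr(hr, sr)
```

Because the scale is logarithmic, each 10× reduction in MSE adds 10 dB, so a gain of about 1 dB corresponds to roughly a 20% drop in mean squared error.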


07 Super Resolution — PlainSRNet, Low Latency (NPU)

Lightweight super resolution optimized for NPU inference speed with combined L1+MSE loss and cosine LR annealing. Fine-tuned to 200 epochs.

| File | Description |
| --- | --- |
| plainsrnet_train.py | Train PlainSRNet with CombinedLoss, export to ONNX |
| compile_onnx_to_mxq.py | Compile ONNX to MXQ for NPU |
| inference_psnr.py | NPU inference + PSNR + latency evaluation |

Architecture: Head Conv + 16× PlainBlock (residual) + Tail Conv + Sigmoid
Result: PSNR 23.98 dB | NPU latency 13.18 ms/image
Key concepts: CombinedLoss (L1 + MSE), CosineAnnealingLR, INT8 quantization, latency optimization
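The combined L1 + MSE objective can be sketched as a weighted sum of the two per-pixel losses. The equal weights below are an assumption, since the README does not state how CombinedLoss balances the terms, and flat lists stand in for image tensors.

```python
def l1_loss(pred, target):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mse_loss(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def combined_loss(pred, target, l1_weight=1.0, mse_weight=1.0):
    """Weighted sum of L1 and MSE; the weights here are hypothetical."""
    return l1_weight * l1_loss(pred, target) + mse_weight * mse_loss(pred, target)

# Hypothetical predictions and targets in [0, 1].
pred = [0.2, 0.8, 0.5]
target = [0.0, 1.0, 0.5]
loss = combined_loss(pred, target)
```

MSE is the quantity PSNR is computed from, while L1 penalizes small errors relatively more and is less dominated by outliers. Mixing the two is a common compromise in super-resolution training.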


Tech Stack

  • Framework: PyTorch
  • Deployment: ONNX → MXQ (Qubee) → MX1601 NPU
  • Datasets: MNIST, Fashion-MNIST, ImageNet-20, Custom SR datasets
  • Environment: Google Colab (GPU) + Windows NPU workstation

Key Skills Demonstrated

  • Neural network implementation from scratch (forward pass, backpropagation, weight updates)
  • CNN architecture design (LeNet-5, AlexNet, custom CNNs)
  • Image classification and super resolution
  • Model optimization: BatchNorm, Dropout, LR scheduling, gradient clipping
  • Full deployment pipeline: PyTorch → ONNX → INT8 quantization → NPU inference
  • Evaluation metrics: accuracy, loss curves, PSNR

About

Neural networks from scratch to super-resolution: LeNet-5, AlexNet, Autoencoder, custom SR models · PSNR 23.98dB · NPU deployment · PyTorch
