A collection of deep learning and computer vision projects implemented in PyTorch, covering foundational neural networks, CNN architectures, autoencoders, and NPU-deployed super resolution models.
DeepLearning-ComputerVision-PyTorch/
├── 01_neural_network_from_scratch/
├── 02_cnn_lenet5/
├── 03_cnn_alexnet/
├── 04_autoencoder/
├── 05_image_classification_npu/
├── 06_super_resolution_high_psnr/
└── 07_super_resolution_low_latency/
Implements the forward pass, backpropagation, and weight updates using raw PyTorch tensors, without nn.Module.
| File | Description | Dataset | Result |
|---|---|---|---|
| binary_classification_nn.py | 2-layer NN (ReLU + Sigmoid) | MNIST binary | Loss: 0.704 |
| backpropagation_from_scratch.py | Manual backprop with chain rule | MNIST binary | Test Acc: 98.43% |
| multiclass_nn_softmax.py | Deep NN with Softmax output | MNIST 10-class | Test Acc: 92.10% |
Key concepts: He/Xavier initialization, binary cross-entropy, softmax + categorical cross-entropy, manual gradient computation
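The softmax + categorical cross-entropy pairing listed above has a convenient gradient: with respect to the logits it is simply the softmax output minus the one-hot target. The following is a minimal sketch in plain Python (no PyTorch), not the code from multiclass_nn_softmax.py; the function names and the sample logits are illustrative assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    """Categorical cross-entropy for one sample with an integer class label."""
    return -math.log(probs[target])

def grad_logits(probs, target):
    """Gradient of the loss w.r.t. the logits: softmax output minus one-hot target."""
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

# Hypothetical 3-class example (values chosen only for illustration)
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
loss = cross_entropy(probs, target=0)
grads = grad_logits(probs, target=0)
```

The gradient entries sum to zero, and only the entry for the true class is negative, so a gradient descent step raises the true-class logit and lowers the others.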
Classic LeNet-5 architecture with Average Pooling and Tanh activations, trained on MNIST.
| File | Architecture | Dataset | Result |
|---|---|---|---|
| lenet5_mnist.py | LeNet-5 (2 Conv + 3 FC) | MNIST 10-class | Test Acc: 98.65% |
Key concepts: Convolutional layers, average pooling, SGD + momentum
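To see how two convolutional layers and average pooling reduce an MNIST image to the input of the fully connected layers, the spatial sizes can be traced by hand. The sketch below assumes the classic LeNet-5 layout (5×5 kernels, 2×2 average pooling, 16 output channels in the second conv layer, and padding of 2 on the first conv so a 28×28 input behaves like the original 32×32); the actual settings in lenet5_mnist.py may differ.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output width/height of a pooling layer along one spatial dimension."""
    return (size - kernel) // stride + 1

size = 28                                    # MNIST image side length
size = conv_out(size, kernel=5, padding=2)   # conv 1 (assumed padding=2)
after_conv1 = size
size = pool_out(size)                        # average pool 1
size = conv_out(size, kernel=5)              # conv 2
size = pool_out(size)                        # average pool 2
final_size = size

channels = 16                                # assumed conv 2 output channels
flat_features = channels * final_size * final_size  # input width of the first FC layer
```

Under these assumptions the first fully connected layer receives a flattened vector of `flat_features` values per image.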
Custom AlexNet implementation trained on a 20-class ImageNet subset with data augmentation and cosine LR scheduling.
| File | Architecture | Dataset | Result |
|---|---|---|---|
| alexnet_train.py | AlexNet (5 Conv + 3 FC) | ImageNet-20 | Val Acc: 56.80% (epoch 40) |
Key concepts: Dropout, CosineAnnealingLR, gradient clipping, ImageNet normalization, feature map visualization
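Two of the training techniques above reduce to short formulas. The sketch below writes out the closed-form cosine annealing schedule and gradient clipping by global L2 norm in plain Python. The base learning rate, `t_max`, and clipping threshold are illustrative assumptions, not the values used in alexnet_train.py, and the clipping helper omits the small epsilon that PyTorch's implementation adds to the norm.

```python
import math

def cosine_annealing_lr(epoch, base_lr, t_max, eta_min=0.0):
    """Closed-form cosine annealing: decays from base_lr to eta_min over t_max epochs."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * epoch / t_max))

def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]

# Assumed hyperparameters, for illustration only
schedule = [cosine_annealing_lr(e, base_lr=0.01, t_max=40) for e in range(41)]
clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)  # input norm is 5.0
```

The schedule starts at the base rate, passes through half of it at the midpoint, and reaches `eta_min` at `t_max`. Clipping keeps the gradient direction and only shrinks its length.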
Convolutional autoencoder for unsupervised image reconstruction on Fashion-MNIST.
| File | Architecture | Dataset | Result |
|---|---|---|---|
| convolutional_autoencoder.py | Encoder (4 Conv) + Decoder (3 ConvTranspose) | Fashion-MNIST | MSE converged |
Key concepts: ConvTranspose2d, latent space (dim=128), MSE reconstruction loss, BatchNorm
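A decoder built from transposed convolutions has to land exactly on the input resolution, so it helps to check the output-size formula that PyTorch documents for ConvTranspose2d. The sketch below applies it in plain Python; the kernel, stride, padding, and output_padding values are assumptions chosen to show a 7 → 14 → 28 upsampling path and are not taken from convolutional_autoencoder.py.

```python
def conv_transpose_out(size, kernel, stride=1, padding=0, output_padding=0, dilation=1):
    """Output size of a transposed convolution along one spatial dimension."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

def mse(a, b):
    """Mean squared error between two equally long lists of pixel values."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Assumed decoder settings: each layer doubles the resolution
up1 = conv_transpose_out(7, kernel=3, stride=2, padding=1, output_padding=1)
up2 = conv_transpose_out(up1, kernel=3, stride=2, padding=1, output_padding=1)

# Without output_padding the same layer falls one pixel short
short = conv_transpose_out(7, kernel=3, stride=2, padding=1)

# Reconstruction loss on a toy pair of "images"
toy_loss = mse([0.0, 0.5, 1.0], [0.0, 0.5, 0.0])
```

With these settings `up2` matches the 28-pixel side of a Fashion-MNIST image, which is the condition the MSE reconstruction loss needs in order to compare input and output pixel by pixel.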
Figures: original vs. reconstructed images; training and test loss curves.
Custom 5-block CNN designed for NPU deployment. Full pipeline: training → ONNX export → INT8 quantization → NPU inference.
| File | Description |
|---|---|
| janecnn_train.py | Train JaneCNN on ImageNet-20, export to ONNX |
| compile_onnx_to_mxq.py | Compile ONNX to MXQ via Qubee (INT8 post-training quantization) |
| inference_npu.py | Run inference on MX1601 NPU, compute accuracy |
Architecture: 5 blocks (Conv + BatchNorm + LeakyReLU + MaxPool) + AdaptiveAvgPool + FC
Result: Best Val Acc 68.30% (PyTorch) → deployed on NPU (MXQ)
Key concepts: AdaptiveAvgPool2d, LeakyReLU, INT8 quantization, ONNX export, NPU deployment
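INT8 post-training quantization maps floating-point values onto 8-bit integers with a scale factor. The sketch below shows a generic symmetric per-tensor scheme in plain Python to illustrate the idea. It is an assumption for illustration only: the source does not describe how Qubee quantizes a model, and the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0  # avoid division by zero
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map the integers back to approximate float values."""
    return [x * scale for x in q]

# Hypothetical weights, for illustration only
weights = [0.52, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The round-trip error per value is bounded by half the scale, which is why a tensor with a few large outliers loses precision on its small values.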
Residual super resolution network with 24 FlashBlocks optimized for maximum reconstruction quality. Trained on LR/HR image pairs and deployed on NPU.
| File | Description |
|---|---|
| strongsrnet_train.py | Train StrongSRNet, export to ONNX |
| compile_onnx_to_mxq.py | Compile ONNX to MXQ (INT8, per-channel quantization) |
| inference_psnr.py | NPU inference + PSNR evaluation |
Architecture: Head Conv + 24× FlashBlock (residual) + Tail Conv + Sigmoid
Result: PSNR ~23.3 dB on NPU (baseline FP32: 22.08 dB)
Key concepts: Residual blocks, PSNR metric, LR/HR pairs, ReduceLROnPlateau
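PSNR is derived from the mean squared error between the high-resolution reference and the network output. The sketch below computes it in plain Python and assumes pixel values scaled to [0, 1], which is consistent with the Sigmoid output layer; it is not the code from inference_psnr.py, and the sample pixel values are made up.

```python
import math

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally long pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy example: every pixel is off by 0.1, so MSE is about 0.01
hr = [0.0, 0.5, 1.0, 0.25]
sr = [0.1, 0.4, 0.9, 0.35]
score = psnr(hr, sr)
```

Because the scale is logarithmic, halving the MSE adds about 3 dB, so a difference of roughly 1 dB between two models corresponds to a noticeable change in reconstruction error.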
Lightweight super resolution optimized for NPU inference speed with combined L1+MSE loss and cosine LR annealing. Fine-tuned to 200 epochs.
| File | Description |
|---|---|
| plainsrnet_train.py | Train PlainSRNet with CombinedLoss, export to ONNX |
| compile_onnx_to_mxq.py | Compile ONNX to MXQ for NPU |
| inference_psnr.py | NPU inference + PSNR + latency evaluation |
Architecture: Head Conv + 16× PlainBlock (residual) + Tail Conv + Sigmoid
Result: PSNR 23.98 dB | NPU latency 13.18 ms/image
Key concepts: CombinedLoss (L1 + MSE), CosineAnnealingLR, INT8 quantization, latency optimization
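A combined L1 + MSE loss is a weighted sum of the mean absolute error and the mean squared error. The sketch below shows the arithmetic in plain Python. The weights `l1_weight` and `mse_weight` are hypothetical parameters defaulting to 1.0; the weighting actually used by CombinedLoss in plainsrnet_train.py is not stated in this README.

```python
def l1_loss(pred, target):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mse_loss(pred, target):
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def combined_loss(pred, target, l1_weight=1.0, mse_weight=1.0):
    """Weighted sum of L1 and MSE (weights are illustrative assumptions)."""
    return l1_weight * l1_loss(pred, target) + mse_weight * mse_loss(pred, target)

# Toy prediction where both pixels are off by 0.2
pred = [0.2, 0.8]
target = [0.0, 1.0]
total = combined_loss(pred, target)
```

The L1 term keeps a constant-magnitude gradient for small errors, while the MSE term penalizes large errors more strongly, which is a common motivation for mixing the two in super resolution training.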
- Framework: PyTorch
- Deployment: ONNX → MXQ (Qubee) → MX1601 NPU
- Datasets: MNIST, Fashion-MNIST, ImageNet-20, Custom SR datasets
- Environment: Google Colab (GPU) + Windows NPU workstation
- Neural network implementation from scratch (forward pass, backpropagation, weight updates)
- CNN architecture design (LeNet-5, AlexNet, custom CNNs)
- Image classification and super resolution
- Model optimization: BatchNorm, Dropout, LR scheduling, gradient clipping
- Full deployment pipeline: PyTorch → ONNX → INT8 quantization → NPU inference
- Evaluation metrics: accuracy, loss curves, PSNR



