Optimizing BERT Beyond FlashAttention
Feb 2025 — Apr 2025
Categories: Deep Learning, NLP, CUDA, GPU Architecture
GPU optimizations for BERT-like models using kernel fusion.
Oct 2024 — Dec 2024
Categories: High-Performance Computing
Accelerated large VLMs with CUDA and compression, achieving a 390× speedup over CPU and real-time inference.
Feb 2024 — Mar 2024
Categories: Computer Architecture, Software Development
Built a cycle-accurate VMIPS vector processor simulator with ML workloads and architecture optimizations.
Jan 2024 — May 2024
Categories: Deep Learning, Computer Vision, GANs
Achieved high PSNR and SSIM in super-resolution with minimal training using pre-upsampled UNets.
Oct 2023 — Nov 2023
Categories: Computer Architecture, Software Development
Designed and optimized a configurable L1–L2 exclusive cache simulator with performance-tuned eviction policies.
Sep 2023 — Nov 2023
Categories: Computer Architecture, Software Development
Built a cycle-accurate MIPS simulator in C++ with Tomasulo's algorithm and advanced branch prediction.
Aug 2022 — May 2023
Categories: NLP, Deep Learning
Used knowledge graphs and graph-based machine learning to predict disease associations and recommend preventive drugs.
Apr 2022 — Jul 2022
Categories: Deep Learning, Computer Vision
Developed novel architectures for fine-grained and coarse-level classification using ensemble learning on bird taxonomy data.
Jul 2021 — Oct 2021
Categories: Web Development
Led the redevelopment of DJ Unicode's website, improving SEO and integrating a graph database for content management.
Aug 2020 — Jan 2021
Categories: Web Development
Built a feature-rich e-commerce platform with secure payment systems, cost-optimized deployment, and Redux state management.