DeepSeek-R1: Model Architecture. This article provides an in-depth exploration of the DeepSeek-R1 model architecture. Let’s trace the DeepSeek-R1 model from input to output… (Feb 5)
Understanding Scalable Deployment Tools on AWS: AWS ECR, ECS, ALB, IAM and Secrets Manager. Deploying a Large Language Model (LLM) chat application that can scale efficiently on AWS requires understanding key AWS services. This… (Feb 17)
Learning to Build Scalable LLM Chat Application: Microservices Architecture and Docker… (Feb 14)
DeepSeek-R1: Training Recipe and Data. For simplified understanding, the training pipeline of DeepSeek-R1 is presented in 6 stages. The official technical report describes it in… (Feb 5)
Evaluate Robustness of Convolutional Neural Networks (CNNs) with CIFAR100-C and CIFAR10-C datasets. CIFAR100-C and CIFAR10-C datasets explained and GitHub code provided. (Published in The Startup, Jan 12, 2022)
Visualizing Hyperparameter Tuning Results of KerasTuner With Weights & Biases. My previous blog post explains how to use KerasTuner for hyperparameter tuning in Keras/TensorFlow 2. This article shows how to visualize… (Published in Geek Culture, Mar 8, 2021)
Optuna: Hyperparameter Optimization in PyTorch. Hyperparameter tuning of PyTorch models with Optuna. (Published in The Startup, Jan 19, 2021)
Solution to TensorFlow 2 not using GPU. Making TensorFlow 2 code or Keras code run on GPU. (Published in Analytics Vidhya, Jan 16, 2021)
Hyperparameter Tuning in Keras: TensorFlow 2: With Keras Tuner: RandomSearch, Hyperband… This article will explore the options available in Keras Tuner for hyperparameter optimization with example TensorFlow 2 code for… (Published in The Startup, Jan 10, 2021)