Articles
Latest news and updates from PySpur - AI Agents Builder

•
Guide
Quantization of Large Language Models
A comprehensive guide to quantizing Large Language Models, covering both Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT) methods, with mathematical formulations, code examples, and interactive quizzes.

•
Guide
Introduction to CUDA Programming for Python Developers
A guide for Python programmers who want to get started with GPU programming in CUDA. It covers parallel computing concepts, thread management, and memory hierarchies, with practical code examples in both CUDA C and Python (with Numba) that show where the two approaches differ and where they are alike.

•
Guide
DeepSeek's Multi-Head Latent Attention and Other KV Cache Tricks
How a Key-Value (KV) cache reduces Transformer inference time by trading memory for computation: keys and values from earlier tokens are stored rather than recomputed at every decoding step.