DIY LLMs are moving from hobbyist side projects into real research conversation. The spark: hands-on tutorials from Andrej Karpathy that show how to build a GPT-like model from scratch. And the news that Discrete Distribution Networks (DDN) earned an ICLR 2025 slot signals a shift from toy models to serious architectures. [1][2]
DIY LLM Tutorials
- Andrej Karpathy's video "Let's Build GPT: from scratch, in code, spelled out" lays out the practical path [1].
- NanoGPT, Karpathy's minimal GPT training repository, is accessible and easy to run [1].
- His other videos range from approachable explanations to deep technical dives into what LLMs really do [1].
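The "from scratch" path in Karpathy's lecture starts with a character-level bigram model before attention is introduced. That starting point can be sketched in plain Python; this is a toy count-based version for illustration, not the PyTorch code from the video:

```python
from collections import defaultdict
import random

def train_bigram(text):
    """Count how often each character follows another."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Sample characters one at a time from the learned bigram counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:  # dead end: no observed successor
            break
        chars, weights = zip(*followers.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

corpus = "hello world, hello llm world"
model = train_bigram(corpus)
print(generate(model, "h", 10))
```

The lecture then replaces the count table with a learned embedding and, step by step, adds self-attention to reach a GPT-like model; the sampling loop stays conceptually the same.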
DDN: A New Architecture on the Rise
- Discrete Distribution Networks (DDN) is a novel generative model built on simple, elegant principles; the paper has been accepted to ICLR 2025 [2].
- It generates multiple outputs in a single forward pass, and together these outputs form a discrete distribution. It also features Zero-Shot Conditional Generation, a one-dimensional discrete latent representation organized as a tree, and full end-to-end differentiability [2].
- The approach can be combined with GPT-style systems and even explored as DDN LLMs, including ideas like minimizing tokenizers and using speculative sampling [2].
- In comparisons, DDN sits alongside diffusion, GANs, VAEs, and autoregressive models, offering a distinctive, hierarchical take [2].
- ICLR reviewers called the method novel and elegant, hinting at new directions for generative modeling in LLMs [2].
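The tree-structured latent described above can be illustrated with a toy sketch: at each level the model proposes K candidate outputs, one is chosen (closest to a guidance target for zero-shot conditional generation, or at random for unconditional sampling), and the sequence of choices forms the 1-D discrete latent code. This is an assumption-laden illustration; `propose` is a hypothetical stand-in for the network, not the paper's architecture:

```python
import random

K, L = 4, 6                 # branching factor and number of levels (toy values)
rng = random.Random(42)

def propose(current, level):
    """Hypothetical stand-in for the network: K candidate refinements
    of the current output, finer-grained at deeper levels."""
    step = 1.0 / (2 ** level)
    return [current + rng.uniform(-step, step) for _ in range(K)]

def sample(target=None):
    """Walk the K-ary tree of candidates. With a `target`, greedily pick
    the closest candidate at each level (a toy analogue of zero-shot
    guided generation); without one, pick uniformly at random."""
    x, latent = 0.0, []
    for level in range(L):
        cands = propose(x, level)
        if target is None:
            idx = rng.randrange(K)
        else:
            idx = min(range(K), key=lambda i: abs(cands[i] - target))
        x = cands[idx]
        latent.append(idx)
    return x, latent

x, code = sample(target=0.7)
print(x, code)  # `code` is a length-L sequence of choices, each in [0, K)
```

The point of the sketch is the data structure: one forward pass per level yields K discrete options, so a full sample is just a path through a K-ary tree, and that path is the compact discrete latent the thread discusses.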
Closing thought: the DIY LLM era is maturing, from hands-on builds to architecturally innovative paths that could reshape how we think about chat, code, and compression.
References
[1] Ask HN: Build Your Own LLM? A thread asking for tutorials on building toy LLMs from scratch; cites Karpathy's videos, NanoGPT, and related resources for learning the concepts faster.
[2] Show HN: I invented a new generative model and got accepted to ICLR. A thread discussing Discrete Distribution Networks (DDN): the ICLR acceptance, integration with GPT/LLMs, zero-shot generation, and comparisons to diffusion, GANs, and VQ-VAE.