Hacker News on Gopher (unofficial)

COMMENT PAGE FOR:
  DeepDive in everything of Llama3: revealing detailed insights and implementation

aghilmort wrote 22 hours 28 min ago:
  great need; mulling over; shows up all the time in AI paradigms

  therealoliver wrote 14 hours 47 min ago:
    glad to have helped you :)

simonw wrote 22 hours 49 min ago:
  I hadn't realized OpenAI's tiktoken Python library could work with
  other models outside of the OpenAI family, that's really useful:

  [1]: https://github.com/therealoliver/Deepdive-llama3-from-scratch?...

  moffkalast wrote 4 hours 31 min ago:
    It's more than just that, practically every notable open model
    released in the past year or so uses tiktoken as the tokenizer.

  therealoliver wrote 14 hours 48 min ago:
    I'm glad to have helped you :)

kevmo314 wrote 22 hours 55 min ago:
  I like the use of the functional API here. I learned through a
  similar route and it was very helpful for me compared to trying to
  understand `torch.nn.Module`.

  Here's a gist of my learning path if it's helpful to anyone:

  [1]: https://gist.github.com/kevmo314/294001659324429bae6749062a900...

  therealoliver wrote 14 hours 49 min ago:
    Yes, these are two different learning paths. The detailed process
    learning is beneficial for future research, while the API-style
    approach is convenient and quick for getting started and using.
    Both are very useful!
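
Editor's sketch of the "functional" versus module contrast raised in kevmo314's thread. This is not code from the linked repository or gist. It uses plain Python lists as a stand-in for tensors so that it runs without PyTorch, and the names `rms_norm` and `RMSNorm` and the default `eps` value are assumptions chosen for illustration. The functional version takes its weights as an explicit argument, while the module-style class stores them and hides them from the call site.

```python
import math


def rms_norm(x, weight, eps=1e-5):
    """Functional style: every input, including the weights, is an argument."""
    # Root mean square of the vector, with eps added for numerical stability.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


class RMSNorm:
    """Module style: the weights are state held inside the object."""

    def __init__(self, dim, eps=1e-5):
        self.weight = [1.0] * dim  # hypothetical initial weights (all ones)
        self.eps = eps

    def __call__(self, x):
        # Delegates to the functional version using the stored state.
        return rms_norm(x, self.weight, self.eps)


x = [1.0, 2.0, 3.0, 4.0]
functional_out = rms_norm(x, [1.0] * len(x))
module_out = RMSNorm(len(x))(x)
```

Both calls compute the same values. The difference is only where the weights live, which is why the functional form can be easier to follow step by step, while the module form is more convenient once the mechanics are understood.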