Hacker News on Gopher (unofficial)
COMMENT PAGE FOR:
URI DeepDive in everything of Llama3: revealing detailed insights and implementation
aghilmort wrote 1 day ago:
great need; mulling over; shows up all the time in AI paradigms
therealoliver wrote 1 day ago:
glad to have helped you :)
aghilmort wrote 6 hours 28 min ago:
just realized Siri typo'd; meant to say "great read"
simonw wrote 1 day ago:
I hadn't realized OpenAI's tiktoken Python library could work with
models outside of the OpenAI family. That's really useful:
URI [1]: https://github.com/therealoliver/Deepdive-llama3-from-scratch?...
moffkalast wrote 15 hours 31 min ago:
It's more than just that: practically every notable open model
released in the past year or so uses tiktoken as the tokenizer.
therealoliver wrote 1 day ago:
I'm glad to have helped you :)
kevmo314 wrote 1 day ago:
I like the use of the functional API here. I learned through a similar
route and it was very helpful for me compared to trying to understand
`torch.nn.Module`.
Here's a gist of my learning path if it's helpful to anyone:
URI [1]: https://gist.github.com/kevmo314/294001659324429bae6749062a900...
therealoliver wrote 1 day ago:
Yes, these are two different learning paths. Working through the
detailed process is beneficial for future research, while the
API-style approach is a convenient, quick way to get started. Both
are very useful!