Describe 3 different optimisations applied to LLM inference.
Anonymous
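KV caching (reuse the attention keys and values of earlier tokens instead of recomputing them at every decoding step), speculative decoding (a small draft model proposes several tokens that the large model then verifies in a single forward pass), and operator fusion (merge adjacent operations into one kernel to cut memory traffic and kernel-launch overhead).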
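To make the first of these concrete, here is a minimal sketch of KV caching for a single attention head, in plain Python with no framework. The names `project`, `attend`, `decode_with_cache` and `decode_without_cache` are made up for illustration, and the toy weight matrices stand in for a real model's projections; production implementations work on batched tensors on the GPU.

```python
import math

def project(x, w):
    """Multiply vector x (length n) by matrix w (n rows, m columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def attend(q, keys, values):
    """Scaled dot-product attention of one query over all keys/values."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]

def decode_without_cache(tokens, wq, wk, wv):
    """Baseline: re-project every earlier token at every step."""
    outputs = []
    for t in range(len(tokens)):
        keys = [project(x, wk) for x in tokens[: t + 1]]
        values = [project(x, wv) for x in tokens[: t + 1]]
        outputs.append(attend(project(tokens[t], wq), keys, values))
    return outputs

def decode_with_cache(tokens, wq, wk, wv):
    """KV caching: project each token once and keep its key and value."""
    cached_keys, cached_values = [], []
    outputs = []
    for x in tokens:
        cached_keys.append(project(x, wk))    # only the new token is projected
        cached_values.append(project(x, wv))
        outputs.append(attend(project(x, wq), cached_keys, cached_values))
    return outputs
```

Both functions return the same outputs. The cached version performs one key and one value projection per token, while the baseline repeats them for the whole prefix at every step, so its projection work grows quadratically with sequence length. The price is the memory needed to hold the cache.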