Inworld AI interview question

How would you improve LLM serving performance?