Large prompts or extensive documents (such as a knowledge base) can significantly lengthen the LLM's response time for certain tasks. It would be helpful to see this response time in the terminal while running uvicorn, since developers will commonly try different prompts and compare how long each one takes to respond.
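
As a rough illustration of what this could look like, here is a minimal sketch of a plain ASGI middleware that times each HTTP request and logs the result. It is not taken from the project: the names `TimingMiddleware` and `slow_app` are hypothetical, and `slow_app` only stands in for an endpoint that calls the LLM. It assumes uvicorn's default logging configuration, under which messages sent to the `uvicorn.error` logger at INFO level show up in the terminal.

```python
import asyncio
import logging
import time

# Assumption: with uvicorn's default logging config, INFO messages on this
# logger are printed to the terminal alongside uvicorn's own output.
logger = logging.getLogger("uvicorn.error")


class TimingMiddleware:
    """Hypothetical ASGI middleware that logs how long each HTTP request takes."""

    def __init__(self, app, log=logger.info):
        self.app = app
        self.log = log  # any callable accepting (msg, *args), like logger.info

    async def __call__(self, scope, receive, send):
        # Pass through non-HTTP traffic (lifespan, websocket) untimed.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()
        try:
            await self.app(scope, receive, send)
        finally:
            # Measured until the wrapped app returns, so it includes the
            # time spent sending the response body.
            elapsed = time.perf_counter() - start
            self.log(
                "%s %s took %.3f s",
                scope.get("method"),
                scope.get("path"),
                elapsed,
            )


async def slow_app(scope, receive, send):
    """Stand-in ASGI app; the sleep simulates a slow LLM call."""
    await asyncio.sleep(0.05)
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


# Wrap the application before handing it to uvicorn.
app = TimingMiddleware(slow_app)
```

If this were saved as a hypothetical `main.py`, running `uvicorn main:app` would print one timing line per request. Any ASGI application could be wrapped the same way in place of `slow_app`, which makes it easy to compare response times across different prompts.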