Since the summary is not user-specific, the same response can be cached and reused across users. Serving it from the cache also improves latency, because repeat requests skip regenerating the summary.
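As a rough illustration of the caching idea, here is a minimal in-memory sketch in Python. The names `SummaryCache`, `generate`, `item_id`, and `ttl_seconds` are hypothetical and not taken from the original; the sketch assumes each summary can be identified by a key that does not depend on the user, and that a simple time-to-live is an acceptable way to expire stale entries.

```python
import time
from typing import Callable, Dict, Tuple


class SummaryCache:
    """Caches summaries by item key only, so every user shares one entry."""

    def __init__(self, generate: Callable[[str], str], ttl_seconds: float = 3600.0):
        # `generate` is a hypothetical stand-in for the expensive summary call.
        self._generate = generate
        self._ttl = ttl_seconds
        # Maps item_id -> (time stored, summary text).
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, item_id: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(item_id)
        if entry is not None and now - entry[0] < self._ttl:
            # Cache hit: return the stored summary without regenerating it.
            return entry[1]
        # Cache miss or expired entry: generate once and store for reuse.
        summary = self._generate(item_id)
        self._entries[item_id] = (now, summary)
        return summary
```

Because the key contains no user identifier, the first request for an item pays the generation cost and later requests from any user are served from memory until the entry expires. A shared store would be needed if several server processes should reuse the same entries.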