Identifying RAG outputs in an LLM-generated response

When an LLM response is augmented with external knowledge (retrieval-augmented generation, or RAG), it is hard to verify which parts of the output actually come from the retrieved documents. I demonstrate a simple technique that blends chain-of-thought (CoT) prompting with token manipulation. The result is a practical method that highlights the retrieved parts of a generated response in a different color, making the output more interpretable.
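One way to realize this idea is to instruct the model to wrap any text it copies from the retrieved context in explicit marker tags, and then post-process the response to render those spans in color. The sketch below illustrates this under assumptions of my own: the `<retrieved>...</retrieved>` tag name, the prompt wording, and the ANSI terminal colors are illustrative choices, not a fixed part of the method.

```python
import re

# Assumed instruction sent alongside the retrieved context; the exact
# wording and tag name are illustrative, not prescribed by the method.
SYSTEM_PROMPT = (
    "Answer the question using the provided context. "
    "Wrap any text taken verbatim from the context in "
    "<retrieved>...</retrieved> tags."
)

ANSI_CYAN = "\033[96m"
ANSI_RESET = "\033[0m"

def highlight_retrieved(response: str) -> str:
    """Replace <retrieved> tags with ANSI color codes so retrieved
    spans show up in a different color in a terminal."""
    return re.sub(
        r"<retrieved>(.*?)</retrieved>",
        lambda m: f"{ANSI_CYAN}{m.group(1)}{ANSI_RESET}",
        response,
        flags=re.DOTALL,
    )

# Example model output containing one tagged (retrieved) span.
demo = "Paris is <retrieved>the capital of France</retrieved>."
print(highlight_retrieved(demo))
```

In a web UI the same post-processing step would emit a styled `<span>` instead of ANSI codes; the key point is that the markers let a simple regex separate retrieved text from generated text.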