This was a great read! I’ve been wondering about the best ways to evaluate LLMs. I experimented with several LLM-Modulo approaches but didn’t have much success. I’ll try implementing the LLM-as-a-judge technique and see if it works better. Thanks for sharing, Stella.
Thank you, Lily! Do you mind sharing the challenges you ran into with LLM-Modulo?