Discussion about this post

User's avatar
Kevin R. Haylett's avatar

Well done, dynamical testing is just what is needed. Most static tests fail and only given numbers that do not represent user experiences. LLMs are indeed complex non-linear dynamical systems, and measures need to recognise this fact. One would never measure a person with just one question - it takes me quite a while to get my train of thought going! :)

Expand full comment

No posts

Ready for more?