Discussion about this post

Andy W

I am intrigued by the idea that "human evaluation is the only way to get a reliable signal."

Do you mean RLHF or something more? For example, Alan Cowen from Hume AI believes RLHF will always be biased and that we instead need to move to something more akin to evaluation based on how models actually affect users (e.g., https://x.com/AlanCowen/status/1613293979071664146).
