3 Comments

So the labeling is assigned to AI responses? How does this interact with prompt engineering?

Hi Christine, here is the reply from Jimmy Whitaker:

"There are 2 pieces to the labeling. For raw data, then the model is generating predictions for the data to be verified by human annotators. For ground truth data (which can pre-exist or be created by verifying predictions) are used as the guide for the agent. The agent ends up being responsible for the prompt engineering, tuning the prompt to get a better score on the ground truth data. For instance, the "improved skill prompt" in the article was generated by Adala."

Clever.