Welcome to llm(artyn)!

Here's an early exploration of how we might make the coaching transcripts useful.

The "Compare" tab

This is a simple way to see how different models perform at predicting the best next response. In each round, you'll:

See the "current state" of the conversation. This is a transcript of one of martyn's real coaching conversations, edited for readability. We cut the conversation off at an arbitrary point, and everything up to that point is given to the models as context for their responses.
Compare two different options for how to continue the conversation. You may be presented with a pair of LLM responses, or a human response and an LLM response.
Choose which response you think is better.
Try to guess which response was the real human response, if any.

The LLM responses are either generated by a model with a "vanilla" prompt or a model with a "coach" prompt. The "vanilla" prompt simply tells the model it is an instructional coach. The "coach" prompt was developed from analyzing the coaching transcripts martyn shared with us.
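
To make the round structure a bit more concrete, here is a minimal sketch of how the choices from each round could be tallied into a preference rate per response source. This is not the app's actual code. It assumes a source's rate is the number of times it was chosen divided by the number of times it was shown, and the labels `human`, `coach`, and `vanilla` and the function `preference_rates` are hypothetical names made up for illustration.

```python
from collections import Counter

# Hypothetical labels for the three response sources described above.
SOURCES = ("human", "coach", "vanilla")


def preference_rates(rounds):
    """Return, for each source, the share of its appearances in which it was chosen.

    `rounds` is a list of (option_a, option_b, chosen) tuples of source labels.
    Assumption: rate = times chosen / times shown; the app may define it differently.
    """
    shown = Counter()
    chosen = Counter()
    for option_a, option_b, pick in rounds:
        if pick not in (option_a, option_b):
            raise ValueError(f"chosen source {pick!r} was not one of the options")
        shown[option_a] += 1
        shown[option_b] += 1
        chosen[pick] += 1
    # Sources that never appeared get a rate of 0.0 rather than a division error.
    return {s: (chosen[s] / shown[s] if shown[s] else 0.0) for s in SOURCES}


# Made-up example rounds, purely for illustration (not real results).
example_rounds = [
    ("human", "coach", "human"),
    ("coach", "vanilla", "coach"),
    ("human", "vanilla", "vanilla"),
    ("coach", "vanilla", "coach"),
]
rates = preference_rates(example_rounds)
```

Dividing by appearances rather than by total rounds keeps the rates comparable even if some sources are shown more often than others.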

The "Chat" tab

You can also try the "Chat" tab, where you can chat with the best model or a model of your choice. Note that the chatbot has no context about specific lessons, so it isn't useful for lesson-specific questions. It's meant more as a way to play around with the model and see what chatting with it is like.

LLM(artyn)

Response Preference Rates

H: 0.0%
Coach Prompt: 0.0%
Vanilla: 0.0%
