
learning design case study

Algebuds: Student Tutors Chatbot

The student is no longer a recipient of explanations; they are the source of them.

Dot works through an equation in a notebook while chatting with the student tutor.
Dot uses the student's explanation in the workbook, making the student's thinking visible.

You may be familiar with Seymour Papert's LOGO turtle [1], which draws shapes on screen from instructions programmed by elementary school students. In one story, a child sees FW 10 draw a line and is then asked to draw a square. He scripts FW 10 FW 10 FW 10 FW 10 and watches as the turtle draws... a longer line.

In this moment, the child receives "recursive feedback" and has the opportunity to refine his understanding and explanation of a square. He excitedly asks how to make the turtle turn...

LOGOwriter turtle drawing demo. Selecting FW 10 draws a short line. Selecting FW 10 FW 10 FW 10 FW 10 adds a second turtle that draws a longer line.

Chatbot tutors aren't working for most students.

Chatbot tutors aren't working for most students [2]. They can supply explanations and answer lots of questions, but in doing so they take on all the cognitive work. Students lapse into passivity: "idk". The typical chatbot tutor assumes the student comes with a question and the chatbot comes with the answer. This leaves little room for students to contribute their own ideas.

In this prototype I flip the tutoring roles.

In this prototype I flip the tutoring roles. The agent-as-tutee (AT) is the one asking for help, and the student user (SU) is the tutor. The AT is deliberately constrained so that it cannot solve the problems on its own. It genuinely needs the student's ideas to proceed. Now the student is no longer a recipient of explanations; they are the source of them.

Algebuds classroom world with Dot, a workbook, a classroom board, and chat.
The chat is placed alongside a classroom world and workbook modal to support the SU's positioning as tutor.

Recursive feedback turns partial explanations into something students can inspect.

The child programming in LOGO knew a square had four sides, and that explanation had served him well in previous contexts. He used that knowledge to supply four copies of FW 10, and he observed the turtle earnestly interpret his instructions. In that moment of recursive feedback, he realized he had to modify his partial explanation to fit this new context and connect it with new ideas such as angles and rotation.
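
The turtle's literal reading of the child's program can be sketched in a few lines of Python. This is a minimal illustration, not Papert's LOGO: `FW` is the forward command as written in the demo above, while `RT` (turn right by a number of degrees) and the `run` helper are assumptions introduced only for this example.

```python
import math


def run(program: str) -> list[tuple[float, float]]:
    """Interpret `FW <distance>` and `RT <degrees>` commands; return the points visited."""
    x, y = 0.0, 0.0
    heading = 0.0  # degrees; 0 means facing along the positive x-axis
    path = [(x, y)]
    tokens = program.split()
    # Commands come in (name, argument) pairs, e.g. "FW 10".
    for name, arg in zip(tokens[0::2], tokens[1::2]):
        amount = float(arg)
        if name == "FW":
            x += amount * math.cos(math.radians(heading))
            y += amount * math.sin(math.radians(heading))
            # Round only the recorded point so floating-point noise stays out of the path.
            path.append((round(x, 6), round(y, 6)))
        elif name == "RT":  # hypothetical turn command, not named in the story
            heading = (heading - amount) % 360
        else:
            raise ValueError(f"unknown command: {name}")
    return path


line = run("FW 10 FW 10 FW 10 FW 10")
square = run("FW 10 RT 90 FW 10 RT 90 FW 10 RT 90 FW 10")

print(line[-1])                 # the turtle ends 40 units from the start: a longer line
print(square[-1] == square[0])  # with turns, the turtle returns to where it began
```

The first program is a faithful execution of "a square has four sides"; only the second, which adds rotation, closes the shape.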

The structure of this prototype is built on theories of student conceptions-in-progress and layers of feedback. First, we reconceive misconceptions [3] not as wrong ideas to be replaced, but instead as useful conceptions in previous contexts that need refinement as contexts mature. Thus, we aim to draw out student explanations of their conceptions in order to interpret, refine, and grow them with feedback.

The feedback typically afforded by computers is evaluative: the computer can quickly say if a highly structured response is right or wrong. With LLMs, computers are now much more capable of giving interpretive [4] feedback on natural language inputs. We elicit a student's informal, imprecise partial conception in natural language and represent it back to them as the AT's solution attempt and commentary. This method also supplies recursive feedback [5]: the student is not just "learning by teaching" the AT but also observes the AT use their explanation to perform a task.

Dot tries new equation problems in the workbook after learning about keeping equations balanced.
The SU and AT converse informally while Dot's procedural work remains observable.

The important constraint is that Dot genuinely needs the student's ideas to proceed.

Multiple LLMs

Constrained Agent-as-Tutee LLM

The AT is deliberately limited so that it cannot solve the problems and needs help from the student. Fine-tuning this constraint is a good area for future research.

Knowledge State Modifier LLM

A separate LLM evaluates each of the SU's explanations and updates the AT's internal knowledge state: a structured list of active misconceptions and conceptual gaps. This state is never surfaced to the student. Making it visible would do the metacognitive work for them. The student infers the AT's progress only by watching its behavior change. This is a good area for further refinement.
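
One way to picture the knowledge state is as a small structured record that the modifier model edits after each explanation. The sketch below is an assumption about its shape, not the prototype's actual schema: the `KnowledgeState` and `StateUpdate` types, their field names, and the example labels are all hypothetical, and the model's judgment is represented by an already-parsed `StateUpdate` rather than a real LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeState:
    """Hypothetical internal state of the AT; never shown to the student."""
    misconceptions: set[str] = field(default_factory=set)
    gaps: set[str] = field(default_factory=set)


@dataclass
class StateUpdate:
    """Hypothetical parsed output of the modifier model for one explanation."""
    resolved: set[str] = field(default_factory=set)    # items the explanation addressed
    introduced: set[str] = field(default_factory=set)  # misconceptions the explanation taught


def apply_update(state: KnowledgeState, update: StateUpdate) -> KnowledgeState:
    """Return a new state; the old one is left untouched so turns can be compared."""
    return KnowledgeState(
        misconceptions=(state.misconceptions - update.resolved) | update.introduced,
        gaps=state.gaps - update.resolved,
    )


before = KnowledgeState(
    misconceptions={"move terms across the equals sign without changing sign"},
    gaps={"same operation on both sides"},
)
after = apply_update(before, StateUpdate(resolved={"same operation on both sides"}))

print(sorted(after.gaps))            # the gap the student explained is gone
print(sorted(after.misconceptions))  # the untouched misconception is still active
```

Nothing in this record would be rendered in the interface; the student would see only the change in Dot's behavior that follows from it.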

Problem Generator LLM

A third model selects pedagogically sequenced problems drawing on the rich landscape of partial conceptions students may bring to an Algebra 1 course.

Environment design

Classroom world

The AT lives in a classroom scene that grounds the relationship between the SU, the AT, and the problems. The SU visits the AT in its world, a less threatening position than being the one who needs help, and can both chat with and observe it.

Chat style

The AT communicates like a peer: it is casual and earnest, and it sends multiple short chat bubbles rather than one long reply. The bubbles serve as punctuation and pacing cues, making the conversation feel live rather than like receiving a document.
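
The multi-bubble pacing could be produced by splitting one model reply into short messages before display. This is a sketch of one possible approach, not the prototype's code: the `to_bubbles` helper and its `max_chars` limit are hypothetical.

```python
import re


def to_bubbles(reply: str, max_chars: int = 60) -> list[str]:
    """Split a reply into chat bubbles at sentence boundaries.

    Short neighboring sentences are merged while the bubble stays within
    max_chars; a single sentence longer than max_chars is kept whole.
    """
    # Break after ., !, or ? followed by whitespace, keeping the punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", reply) if s.strip()]
    bubbles: list[str] = []
    for sentence in sentences:
        if bubbles and len(bubbles[-1]) + 1 + len(sentence) <= max_chars:
            bubbles[-1] = f"{bubbles[-1]} {sentence}"
        else:
            bubbles.append(sentence)
    return bubbles


reply = "ohh ok. so i subtract 3 from both sides? let me try that on the next one!"
for bubble in to_bubbles(reply, max_chars=30):
    print(bubble)
```

With the limit set to 30 characters, each of the three sentences lands in its own bubble; a larger limit lets short sentences share one.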

Observable workbook with side chat

Watching the AT perform is essential; conversation alone is not enough. The workbook makes the AT's procedural work visible while the AT holds a conceptual conversation with the student. These two channels run in parallel deliberately.

Feedback

Interpretive

The key contribution of LLMs here is interpretation: a student's informal natural language explanation is received, understood, and represented back as the AT's solution attempt and commentary. The student sees their own thinking enacted, which is the interpretive feedback that computers could not previously provide.

Evaluative

A timer marks the AT's final answer and resolves to a correct or incorrect indicator. A subtle affordance: the timer gives the SU a beat to judge the work themselves and anticipate the outcome before it is confirmed.
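
The evaluative beat can be sketched as a short pause between showing Dot's final answer and revealing whether it is right. This is an illustration under assumptions: the `reveal_after_beat` helper, its two-second default, and checking a linear equation `a*x + b = c` by substitution are all hypothetical choices, not the prototype's implementation.

```python
import time
from fractions import Fraction
from typing import Callable


def satisfies(a: int, b: int, c: int, x: Fraction) -> bool:
    """Check a proposed solution to a*x + b = c by substitution, using exact arithmetic."""
    return a * x + b == c


def reveal_after_beat(
    is_correct: bool,
    beat_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> str:
    """Hold the verdict for a beat so the student can judge the work first."""
    sleep(beat_seconds)
    return "correct" if is_correct else "incorrect"


# Suppose Dot's final answer to 2x + 3 = 11 is x = 4.
verdict = reveal_after_beat(satisfies(2, 3, 11, Fraction(4)), beat_seconds=0.1)
print(verdict)
```

The pause carries no information of its own; its job is to leave room for the student's prediction before the indicator confirms or contradicts it.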

Recursive

Finally, the student observes how well their explanation has served the AT in performing the task of solving algebra problems.

Onboarding

I included an onboarding flow built around much simpler content (arithmetic). It shows the core loop in brief by supplying suggested input text, so the user does not have to type yet. The user arrives at the main interaction having already had a taste of success.

Success will look like getting student engagement first.

There are a number of open questions about this prototype. First, can the constrained LLM AT both adapt to new inputs and remain properly constrained, so as not to break the necessary relationship with the SU? Second, how do real students engage with this? There are also other possible modes to consider beyond the classroom-workbook-chat arrangement.

Success will look like getting student engagement first. Finding a modality that real students in real classrooms will work with would allow those students to generate a lot of written explanations. This wealth of new written content could then be analyzed and summarized by an LLM for the class's teacher, giving them a free-form and powerful formative assessment tool.

History and Branches

The V2 prototype is not currently deployed because it uses LLM resources. See the repos if you are interested, or check out V1.

Animated Algebuds kitchen mode prototype with robot pets moving around an algebra kitchen.
Kitchen mode explores another modality: algebra practice as collaborative prep, try, fix, and retry.
  1. Papert, Seymour. Mindstorms: Children, Computers, and Powerful Ideas. Basic Books, Inc., 1980.
  2. Meyer, Dan. "RIP Khanmigo & Edtech Industry Dreams of AI Tutors." Mathworlds, April 15, 2026.
  3. Smith, John P., Andrea A. diSessa, and Jeremy Roschelle. "Misconceptions Reconceived: A Constructivist Analysis of Knowledge in Transition." The Journal of the Learning Sciences 3, no. 2 (1994): 115-63.
  4. Wiliam, Dylan. Embedded Formative Assessment. Solution Tree Press, 2011.
  5. Okita, Sandra Y., and Daniel L. Schwartz. "Learning by Teaching Human Pupils and Teachable Agents: The Importance of Recursive Feedback." Journal of the Learning Sciences 22, no. 3 (2013): 375-412.