AI & Learning · 9 min read · 2 May 2026

BKT vs DKT: How AI Actually Models What You've Mastered

Omar Fouab
Founder, Omie

Most corporate learning platforms know two things about you: your job title and which courses you've completed. From those two signals, they make recommendations. It's roughly the equivalent of a doctor diagnosing you based on your age and which waiting rooms you've sat in.

Knowledge tracing is the field that asks a better question: what do you actually know right now, and how confident are we about that estimate? It's a branch of educational data mining and AI that has been producing rigorous results since the early 1990s — and it's almost entirely absent from corporate L&D tooling.

This is a technical explainer for L&D professionals, HR technology buyers, and edtech builders who want to understand what "AI-powered personalization" can actually mean when it's built on cognitive science.

Callout: Completion rates measure inputs. Knowledge tracing measures learning.


What Is Knowledge Tracing?

Knowledge tracing is the problem of inferring a learner's hidden knowledge state from observable behavior — typically their responses to questions, exercises, or application prompts.

The key word is hidden. We can't directly observe whether someone has mastered giving feedback. We can observe whether they answered a scenario question correctly, whether their manager reports behavior change, or whether they successfully applied a framework in a real situation. Knowledge tracing uses those observable signals to maintain a running probabilistic estimate of mastery.

The two dominant approaches are BKT (Bayesian Knowledge Tracing, 1994) and DKT (Deep Knowledge Tracing, 2015). The roughly two decades between them brought a fundamental shift in how we think about the relationship between skills.


BKT: Bayesian Knowledge Tracing

BKT was introduced by Corbett and Anderson in 1994 and remains widely used in intelligent tutoring systems today. Its core insight was simple and powerful: model mastery as a hidden binary variable that changes over time.

The Four Parameters

BKT models each skill independently using four parameters:

  • P(L₀) — Prior probability of mastery before any observation (e.g., 20% of learners already know this)
  • P(T) — Probability of transitioning from unmastered to mastered after a single learning opportunity (the "learn" rate)
  • P(S) — Probability of making a mistake even when the skill is mastered (the "slip" rate)
  • P(G) — Probability of answering correctly even when the skill is not mastered (the "guess" rate)

After each observation, BKT applies Bayes' theorem to update its estimate of whether the learner has mastered the skill. If they answer correctly, the posterior probability of mastery increases. If they answer incorrectly, it decreases — but not to zero, because they might have slipped.
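Written out, the update has two steps: condition the estimate on the observed response, then apply the learning transition. The equations below are the standard BKT update in terms of the four parameters listed above. The subscripted P(L_n), meaning the mastery estimate before the n-th response, and the shorthand "obs" for the observed response are notation introduced here for illustration.

```latex
\begin{align*}
P(L_n \mid \text{correct}) &= \frac{P(L_n)\,\bigl(1 - P(S)\bigr)}{P(L_n)\,\bigl(1 - P(S)\bigr) + \bigl(1 - P(L_n)\bigr)\,P(G)} \\[4pt]
P(L_n \mid \text{incorrect}) &= \frac{P(L_n)\,P(S)}{P(L_n)\,P(S) + \bigl(1 - P(L_n)\bigr)\,\bigl(1 - P(G)\bigr)} \\[4pt]
P(L_{n+1}) &= P(L_n \mid \text{obs}) + \bigl(1 - P(L_n \mid \text{obs})\bigr)\,P(T)
\end{align*}
```

The slip rate P(S) is what keeps an incorrect answer from driving the estimate to zero, and the guess rate P(G) is what keeps a single correct answer from driving it to one.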

BKT in Practice

Say we're tracking mastery of decision-making under uncertainty. A learner starts with P(L₀) = 0.25 and answers three scenario questions correctly. BKT updates after each response and, depending on the fitted slip, guess, and learn rates, might report: "Based on these three correct responses, and accounting for the possibility of guessing, this learner has a 73% probability of having mastered this skill."

That's a real number. It's not a completion percentage or a quiz score — it's a probabilistic statement about an underlying cognitive state. HR could act on it: "Everyone below 60% on conflict resolution gets surfaced this content next week."
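To make the worked example concrete, here is a minimal sketch of a BKT trace over three correct answers. The prior of 0.25 comes from the example above. The slip, guess, and learn values are assumptions picked for illustration, and the function names `bkt_update` and `trace` are hypothetical, so the printed numbers will not match the 73% figure.

```python
def bkt_update(p_mastery, correct, slip, guess, learn):
    """Return the mastery estimate after one observed response."""
    if correct:
        evidence_mastered = p_mastery * (1 - slip)
        evidence_unmastered = (1 - p_mastery) * guess
    else:
        evidence_mastered = p_mastery * slip
        evidence_unmastered = (1 - p_mastery) * (1 - guess)
    # Bayes' theorem: posterior probability of mastery given the response.
    posterior = evidence_mastered / (evidence_mastered + evidence_unmastered)
    # Learning transition: an unmastered learner may acquire the skill.
    return posterior + (1 - posterior) * learn


def trace(prior, responses, slip, guess, learn):
    """Return the mastery estimate after each response, in order."""
    estimates = []
    p = prior
    for correct in responses:
        p = bkt_update(p, correct, slip, guess, learn)
        estimates.append(p)
    return estimates


# Assumed parameters for illustration only; real values are fitted from data.
estimates = trace(prior=0.25, responses=[True, True, True],
                  slip=0.10, guess=0.20, learn=0.10)
print([round(p, 3) for p in estimates])
```

With these assumed values the estimate climbs to roughly 0.64, 0.90, and 0.98. A higher guess rate makes each correct answer less informative and pulls those numbers down, which is why the same three responses can produce very different mastery estimates on different skills.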

BKT's Limitations

BKT's elegant simplicity is also its constraint. It treats each skill as independent. The model for "giving developmental feedback" has no knowledge of "managing difficult conversations" — even though in practice, mastery of one is highly predictive of performance on the other.

In a domain like leadership or management, skills are deeply entangled. You can't cleanly separate "prioritization" from "delegation" from "strategic thinking." BKT doesn't know that. It runs independent belief updates for each, missing the cross-skill signal entirely.


DKT: Deep Knowledge Tracing

DKT was published by Piech et al. from Stanford in 2015 and represented a step-change in performance. The key contribution: replace the per-skill Bayesian model with a Recurrent Neural Network (RNN) that processes the full sequence of a learner's interactions.

The Architecture

Instead of four parameters per skill, DKT has a hidden state vector that encodes the learner's entire knowledge state at each point in time. At each step:

  1. The network receives the current interaction: which skill was tested, and whether the response was correct
  2. The hidden state is updated through the RNN's gated architecture (typically an LSTM)
  3. The network outputs a vector of probabilities, one per skill, predicting the likelihood of a correct response on any skill, including ones that haven't been tested yet

That last point is the breakthrough. DKT can say: "You've been practicing feedback delivery, and based on your performance, we predict you're at 61% on conflict resolution even though we haven't given you any conflict scenarios yet." This is cross-skill transfer inference.
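The three steps can be sketched in a few dozen lines. This is a toy under loud assumptions: it uses a plain tanh recurrent cell instead of an LSTM, it omits bias terms, and its weights are random rather than trained, so the probabilities it prints mean nothing. The class name `TinyDKT` and the helper names are hypothetical. What it does show is the shape of the computation: one interaction in, one probability per skill out.

```python
import math
import random


def one_hot_interaction(skill, correct, num_skills):
    """Encode (skill, correctness) as a one-hot vector of length 2 * num_skills."""
    x = [0.0] * (2 * num_skills)
    x[skill + (num_skills if correct else 0)] = 1.0
    return x


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyDKT:
    """Untrained vanilla-RNN stand-in for DKT (not the LSTM from the paper)."""

    def __init__(self, num_skills, hidden_size, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.num_skills = num_skills
        self.w_in = rand_matrix(hidden_size, 2 * num_skills)
        self.w_hidden = rand_matrix(hidden_size, hidden_size)
        self.w_out = rand_matrix(num_skills, hidden_size)
        self.hidden = [0.0] * hidden_size  # the learner's knowledge state

    def step(self, skill, correct):
        """Consume one interaction and return P(correct) for every skill."""
        x = one_hot_interaction(skill, correct, self.num_skills)
        pre = [a + b for a, b in zip(matvec(self.w_in, x),
                                     matvec(self.w_hidden, self.hidden))]
        self.hidden = [math.tanh(v) for v in pre]
        return [sigmoid(v) for v in matvec(self.w_out, self.hidden)]


model = TinyDKT(num_skills=3, hidden_size=8)
for skill, correct in [(0, True), (0, True), (1, False)]:
    predictions = model.step(skill, correct)
# One probability per skill, including skill 2, which was never tested.
print([round(p, 2) for p in predictions])
```

Skill 2 never appears in the input sequence, yet the output still contains a prediction for it. In a trained model, that prediction is where cross-skill inference comes from.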

Why This Matters

In the real world, skills are not independent. Learning science tells us that communication skills transfer across contexts; that mastery of perspective-taking supports both feedback delivery and negotiation; that someone with high metacognitive awareness (knowing what they don't know) tends to accelerate across multiple domains simultaneously.

BKT is blind to all of this. DKT captures it implicitly through the learned representations in the hidden state. The network discovers the latent structure of skill interdependence from data, without anyone having to specify it manually.

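In the original paper's benchmark on the ASSISTments math dataset (a standard benchmark for the field), DKT was reported to outperform BKT by approximately 25% in AUC (area under the ROC curve, the standard evaluation metric for knowledge tracing). Later replications found a much smaller gap once duplicate records in the dataset were removed and BKT was given comparable extensions, so that figure is best read as an optimistic estimate. In more complex, interdependent skill graphs, exactly the kind you find in professional soft skills, the advantage may be larger, though that is a hypothesis rather than a benchmark result.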
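If AUC is unfamiliar, it helps to see what it measures. For a knowledge tracing model, AUC is the probability that a randomly chosen correct response was given a higher predicted probability than a randomly chosen incorrect one: 0.5 is chance and 1.0 is a perfect ranking. The sketch below computes it by direct pairwise comparison. The labels and scores are made-up values for illustration.

```python
def auc(labels, scores):
    """Probability that a random correct response outranks a random incorrect one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one correct and one incorrect response")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical model predictions for six responses (1 = answered correctly).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
print(auc(labels, scores))
```

Here six of the nine correct/incorrect pairs are ranked the right way round, so the AUC is about 0.67. The pairwise loop is quadratic and only suitable for small examples. Production evaluation would use a rank-based implementation.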

Callout: BKT says "you've probably mastered feedback skills at 73%." DKT says "because you've mastered feedback, you're likely at 61% on conflict resolution even though we haven't tested it." That difference is the gap between a quiz and a mastery model.


Limitations of DKT

DKT is not without problems, and it's worth being honest about them.

Interpretability: A Bayesian model is transparent. You can explain why BKT estimated 73% mastery: here are the three correct answers, here are the prior and transition parameters. With a deep neural network, the explanation is a 200-dimensional hidden state vector. That's not useful to a manager or an L&D designer who wants to understand why someone is being served particular content.

Data requirements: DKT needs thousands of interaction sequences per learner population to learn reliable skill representations. In a consumer math education app with millions of students, this is fine. In a corporate L&D system with 500 employees, the data is sparse. A poorly trained DKT model can be worse than a well-calibrated BKT model.

Cold start: Both models struggle with new users and new skills. BKT falls back to priors. DKT struggles more because its cross-skill inference depends on having seen similar learner trajectories in training.

Subsequent research has produced hybrid approaches, including DKVMN (Dynamic Key-Value Memory Networks, 2017) and AKT (Attentive Knowledge Tracing, 2020), that aim to address interpretability and cold-start issues while preserving DKT's cross-skill advantages. The field is active and moving fast.


What This Means for HR

If you're evaluating L&D platforms or edtech tools and someone tells you their system is "AI-powered" or "personalized," here are the questions worth asking:

1. What is your knowledge representation? Is it a completion percentage? A quiz score? Or a probabilistic mastery estimate per skill, updated with each interaction? The first two are inputs. The third is the beginning of actual learning science.

2. Do your skill models capture interdependence? If a learner masters communication frameworks, does that update the system's estimate of their performance on related skills like negotiation or executive presence? If the answer is no, the system is running independent BKT-style models at best.

3. How do you handle cold start? Any honest vendor will acknowledge this is a real problem. The good ones have a prior model based on role, seniority, and pre-assessment. The bad ones just show everyone the same content for the first 30 days and call it "onboarding."

4. How does the model handle skill decay? Mastery isn't permanent. Someone who demonstrated decision-making mastery six months ago and hasn't practiced since should have a lower current estimate than someone who was tested last week. Does the system model forgetting?

Most platforms can't answer questions 2, 3, and 4 with specificity. That's informative.


How Omie Uses Knowledge Tracing

Omie's recommendation engine uses a simplified hybrid approach — conceptually closer to DKT than to BKT, but designed for the sparse data reality of a mid-size company deployment.

The key architectural decisions:

Skill graph with explicit dependency edges. Rather than letting the network discover skill interdependencies from data (which requires more data than most enterprise deployments have), Omie pre-defines a skill graph based on learning science literature and domain expertise. When you demonstrate mastery of a prerequisite skill, that directly updates the prior for dependent skills in the graph.
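As a rough illustration of the idea, and not Omie's actual implementation, a pre-defined skill graph can be as simple as a dictionary of weighted edges. Everything in this sketch is an assumption: the skill names, the transfer weights, the one-hop propagation, and the rule that a prior is only ever raised.

```python
# Hypothetical graph: prerequisite -> {dependent skill: transfer weight in [0, 1]}.
SKILL_GRAPH = {
    "perspective_taking": {"feedback_delivery": 0.5, "negotiation": 0.3},
    "feedback_delivery": {"conflict_resolution": 0.4},
}


def propagate_mastery(priors, skill, mastery, graph=SKILL_GRAPH):
    """Return new priors after observing `mastery` on `skill`.

    Each direct dependent's prior moves toward the observed mastery by its
    edge weight. Only one hop is propagated, and priors are never lowered.
    """
    updated = dict(priors)
    updated[skill] = mastery
    for dependent, weight in graph.get(skill, {}).items():
        current = updated.get(dependent, 0.0)
        blended = current + weight * (mastery - current)
        updated[dependent] = max(current, blended)
    return updated


priors = {
    "perspective_taking": 0.3,
    "feedback_delivery": 0.2,
    "negotiation": 0.2,
    "conflict_resolution": 0.2,
}
print(propagate_mastery(priors, "perspective_taking", 0.8))
```

Demonstrating 0.8 mastery of the prerequisite lifts the feedback delivery prior to about 0.5 and the negotiation prior to about 0.38, while conflict resolution stays at 0.2 because it is two hops away. The appeal of hand-specified edges is that they work from the first learner. The cost is that a wrong edge stays wrong until someone fixes it.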

Contextual signals as input. Beyond quiz-style responses, Omie uses engagement depth (did you complete the 10-minute nugget? did you save it? did you apply the practice prompt?), self-reported difficulty, and manager observation signals where available. This gives the model more signal than pure quiz data.

Forgetting-aware decay. Mastery estimates are time-decayed using FSRS-inspired parameters. A skill not reinforced for 45 days loses 20-30% of its estimated stability, which surfaces it for reinforcement before forgetting becomes complete.
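A minimal sketch of forgetting-aware decay, assuming a power-law forgetting curve of the kind FSRS-style schedulers use, where estimated recall is 90% once the elapsed time equals the skill's stability. The exact formula, the 15-day stability, the 0.6 reinforcement threshold, and all function names are assumptions for illustration, not Omie's parameters. Reading the quoted 20-30% loss as a drop in estimated recall is also an assumption.

```python
def retrievability(days_since_review, stability_days):
    """Power-law forgetting curve: recall is 0.9 when elapsed time equals stability."""
    if stability_days <= 0:
        raise ValueError("stability_days must be positive")
    return 1.0 / (1.0 + days_since_review / (9.0 * stability_days))


def decayed_mastery(mastery, days_since_review, stability_days):
    """Scale a mastery estimate by the estimated probability of recall."""
    return mastery * retrievability(days_since_review, stability_days)


def needs_reinforcement(mastery, days_since_review, stability_days, threshold=0.6):
    """Flag a skill once its decayed estimate drops below the threshold."""
    return decayed_mastery(mastery, days_since_review, stability_days) < threshold


# Hypothetical learner: 0.7 mastery, 15-day stability, 45 days without practice.
print(retrievability(45, 15))
print(decayed_mastery(0.7, 45, 15))
print(needs_reinforcement(0.7, 45, 15))
```

With a 15-day stability, 45 days without practice leaves estimated recall at about 0.75, a 25% drop that falls inside the range quoted above. The power-law shape matters: it decays quickly at first and then flattens, so a skill is surfaced for review well before the estimate approaches zero.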

Interpretable outputs. Each user can see their mastery estimates per skill cluster in their dashboard, expressed as a confidence range rather than a false-precision point estimate. "You're between 60-80% on feedback delivery" is more honest than "73.4%."

The practical result: on day 1, Omie's recommendations look similar for users with the same role and skill gaps. By day 30, they diverge based on observed learning behavior. By day 90, the divergence is large enough that two managers at the same company, same tenure, and same role are seeing entirely different content — because their mastery trajectories are different.

That's the bar for real personalization.


The Business Case for Knowledge Tracing

HR technology investments are made on business cases. Here's the one for knowledge tracing over completion-tracking:

False positive reduction. Completion-based systems let "compliant but incompetent" learners slip through. Someone who scores 65% on a quiz but completes the module is marked green. A mastery model flags them as requiring reinforcement. This is the difference between a checkbox and an outcome.
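In data terms, the false positives are simply the records where the two views disagree. A small sketch follows. The record fields, the 0.8 threshold, and the learner data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LearnerRecord:
    name: str
    completed_module: bool
    mastery_estimate: float  # probability in [0, 1] from a tracing model


def completed_but_not_mastered(records, threshold=0.8):
    """Return names a completion report marks green but a mastery model flags."""
    return [r.name for r in records
            if r.completed_module and r.mastery_estimate < threshold]


records = [
    LearnerRecord("A", True, 0.65),
    LearnerRecord("B", True, 0.91),
    LearnerRecord("C", False, 0.40),
]
print(completed_but_not_mastered(records))  # ['A']
```

Learner C is already visible to a completion report. Learner A is the one only a mastery estimate surfaces.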

Resource efficiency. If mastery tracing reveals that 60% of your workforce already has >80% mastery of a skill you were planning to train, you don't spend the training budget. You redirect to the actual gaps. At enterprise scale, this is meaningful.

Manager signal generation. When a management layer can see skill mastery estimates per team member — and those estimates are calibrated against real performance data — they can have different conversations in 1:1s. "The system suggests you're developing well on feedback delivery but showing slower progress on conflict handling" is a better coaching input than "you completed module 7."

The difference between a learning platform and a learning intelligence system is knowledge tracing. The former tells you what people did. The latter tells you what people know.

If you want to see where your team's skill mastery actually stands — not completion rates, but real gap analysis — run a team Learning Scan. It won't give you a DKT probability vector, but it will give you the starting prior your next L&D decision should be built on.
