Let's talk about W'

Every now and then, someone in this forum asks whether a certain workout is designed to make you fail sooner or later. I think the only workout designed to make you fail is “Half Monty”, since this test protocol derives its insights from the point where you can no longer hold a certain power. The other workouts may make you struggle, but they make you faster in the long term.
Anyway, I wanted to start a discussion about the idea of the W' value. If I’m not completely mistaken, the W' value should help predict if and when you will fail in a workout. As far as I understand, this value tells you how many matches you can burn. You can think of it as a battery pack. If you use your headlamp as a flashlight, it will be very, very bright for a short time, but the battery will die quickly if the light stays on. On the other hand, if you use a dim light the whole time, you might come home with the battery still half full.
If I look at my training analysis data in Golden Cheetah I can see how “Full Frontal” and “Half Monty” deplete my battery completely. And I can see that Nine Hammers almost does the same.
That’s why I wanted to reach out to the coaches here on the forum to give us more insight into W’.

  • Could you elaborate on W’ a bit more?
  • Is W’ a value a cyclist should consider when pacing an event or a workout?
  • Do you consider the W’ when creating a workout?
  • Is W’ a significant KPI in the SUF’s 4DP approach?

Whatever W’ is, I always run out of it long before the end of 9H!


Mmm, W’ :grinning:, one of my favourite made up models of cycling performance. Possibly because anaerobic capacity absolutely dominates my profile and ability (and inability!) as a rider. And cos, well, numbers :nerd_face:

I don’t quite use Skiba’s (or Golden Cheetah’s) formula but use something very similar when analyzing my workouts and it really does work well for me for Sufferfest HIIT-style workouts (after I’ve “fixed” the targets and metrics :speak_no_evil: :wink: )

I think of it more like a bucket than a battery pack. BTW, my bucket is yellow which coincidentally matches the colour used for AC.

I’ll leave elaboration to the coaches as, like many things, it’s something that can start simple but quickly become not simple and it’s how they, not me, view/use it that matters to you. I’ll certainly be interested to read what they say…


I like numbers.

And I’d like to know what my W’ is.

Whatever that may be.

Like above comment on NH. I think my W’s bucket has a leak developing during HM.


I think this is referred to in WKO5 as Functional Reserve Capacity (FRC). Basically how many times you can deplete your battery without blowing completely. It’s more than just sustained pressure, it’s about repeated efforts (think Violator).
WKO dFRC Explained (wko5.com)
Building Pmax and Functional Reserve Capacity (FRC) in WKO4 – TrainingPeaks Help Center

I can easily turn on something very similar to W’ analysis for your GSL rides Sir @Martin (although maybe not properly tuned to your leak). But let’s not talk about that here :shushing_face: :grin:

Simplistically W’ is about work above “critical power”, CP. CP may or may not be related to FTP depending on many a multi-dimensional thing. The size of your bucket is the W’, the current level of the bucket is W’bal (these are the Skiba names, other names are available). Everybody has a different sized bucket, those strong for AC generally have bigger ones :face_with_hand_over_mouth:.

If you just head out and ride for as long as you can above “CP” then, by the model, the bucket will be empty, W’bal = 0, at the point you can ride no more (although complexities apply, as always). The harder you ride, the quicker the bucket empties. But if you back off for a bit, drop below CP, then the bucket starts to refill. The more you back off, the quicker it refills (some models may have a constant drip of refill regardless of what you’re doing, others add a delay, more complexities apply).
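For a steady effort, the emptying bucket can be put into numbers. This is only a minimal sketch, assuming the plain two-parameter critical power model, where time to empty = W’ / (power − CP). The function name and the example values (CP of 250 W, W’ of 20 kJ) are made up for illustration, and it ignores all the complexities mentioned above.

```python
def time_to_empty(power, cp, w_prime):
    """Seconds until the bucket is empty at a constant power (watts).

    Assumes the simple two-parameter critical power model; cp is in
    watts and w_prime in joules. The example values below are invented.
    """
    if power <= cp:
        # At or below CP the bucket never drains in this model.
        return float("inf")
    return w_prime / (power - cp)


# Hypothetical rider: CP = 250 W, W' = 20 kJ.
for watts in (300, 350, 450):
    print(watts, "W ->", time_to_empty(watts, 250, 20_000), "s")
```

Doubling how far you ride above CP halves the time the bucket lasts, which is the “harder you ride, quicker it empties” part.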

So: recovery intervals are your friend… they allow you to do more hard intervals! Suffer hard, recover easy, and you can do way more work above CP than your W’.
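The drain-and-refill idea can be sketched in a few lines. To be clear, this is a rough illustration and not Skiba’s or Golden Cheetah’s exact formula: the refill rule (faster the further you drop below CP, and faster the emptier the bucket is) is an assumption, and the rider numbers and the workout are invented.

```python
def w_bal(power, cp, w_prime, dt=1.0):
    """Bucket level in joules after each power sample (watts, dt seconds apart)."""
    level = w_prime  # start with a full bucket
    levels = []
    for p in power:
        if p > cp:
            # Above CP: drain by the work done above CP in this step.
            level -= (p - cp) * dt
        else:
            # Below CP: refill faster the easier you ride and the emptier
            # the bucket is (an assumed rule, for illustration only).
            level += (cp - p) * dt * (w_prime - level) / w_prime
        levels.append(level)
    return levels


# Hypothetical rider and workout: 6 x (60 s at 400 W, 120 s at 100 W).
CP, W_PRIME = 250, 20_000
workout = ([400] * 60 + [100] * 120) * 6
levels = w_bal(workout, CP, W_PRIME)

work_above_cp = sum(p - CP for p in workout if p > CP)  # joules, since dt = 1 s
print("work done above CP:", work_above_cp, "J")
print("lowest bucket level:", round(min(levels)), "J")
```

With these made-up numbers the total work above CP comes out well over the size of the bucket, yet the level never reaches zero, because every recovery tops it up a bit. The level is deliberately not clamped at zero, so a rider who outperforms the model would show up as a negative balance.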

Sorry, I said I’d leave the elaboration to the coaches. And I mostly did :wink: . It’s a wonderfully simple or gloriously complex topic depending on your leaning.



Whenever discussing performance models, this is the first thing that pops into my head.



Xert tries to do the same with its model of Maximum Power Available (MPA).


As a data nerd I love all performance models but I’m also sceptical about most of them.
W’bal (or matches you can burn) is a cool metric to see if you were indeed empty after hammer 8. My experience is that on fresh legs the metric is pretty spot on (for me). When I’m tired it’s way off. So (as with all metrics) in its context it can be useful, but I would not use it in a race.

As the saying goes, “All models are wrong, some models are useful.”

For example, if you want to measure weight loss, it does not matter if the scale is accurate, it just has to be consistent.

The SUF model of FF, HM, NM, AC, MAP, and FTP seems to be consistent, and it seems to help people develop their potential. Whether it accurately measures the underlying biological mechanisms is another question.


W’ is just energy use above a certain level, most commonly Critical Power which is sort of similar to FTP. I think a model only using one value goes against the 4DP principle. You would really need W’, W’’, W’’’ and W’’’’ if you think there are 4 thresholds, as 4DP assumes.

The free computer software ‘Golden Cheetah’ has a training mode with a live W’ graph. I haven’t tried but I’m pretty sure you could run Sufferfest and GC simultaneously on the same computer if you had 2 ANT+ sticks or send ANT+ to one and BT to the other.


i think it would work for some people if their physiology matches the way that W’ degrades in the model. Like, at least for things like Xert and FRC in WKO, the amount by which you are above FTP or TP or whatever they call it affects the speed at which that bucket drains, and if their assumptions match up to what you’re really like, it should work, right?

i am one of the people where it’s a neat toy but it doesn’t really match up and work for me. Like re: FRC, even with up-to-date numbers and model, I can bring it down to 0 and hold it there for multiple times in an A/C type workout and that shouldn’t happen, right, since 0 is supposed to = 0. So they tell me i need to re-test because this suggests that my max 30 second or 1 minute power or whatever should be higher, and then i go out and retest, and get the same numbers. In other words my battery is bigger than my max power numbers would indicate in these capacity-type models.


Your reserve capacity is probably the most difficult thing to model.

You can directly measure instantaneous power (i.e. how much energy is available without any work by the body). You can directly measure long-term power (however you define it). Both correspond to measurements in watts.

Since, as you say, reserve capacity can be drained quickly or slowly it really should be measured in energy units, not power.
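To make the units concrete: power is in watts, and energy is watts times seconds, i.e. joules. A rough sketch, with an invented helper name and made-up numbers (a 250 W threshold), of two very different efforts that spend the same amount of reserve:

```python
def work_above(power, threshold, dt=1.0):
    """Energy in joules spent above `threshold` (watts), samples dt seconds apart."""
    return sum((p - threshold) * dt for p in power if p > threshold)


hard_and_short = [650] * 50    # 50 s at 650 W
easier_and_long = [350] * 200  # 200 s at 350 W

print(work_above(hard_and_short, 250), "J")
print(work_above(easier_and_long, 250), "J")
```

The two efforts differ by 300 W, but the joules spent above the threshold are the same, which is why a single power number cannot describe the reserve.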

Here you are indirectly measuring a biological process. Given the current inadequate state of analytics and scientific knowledge, I am sure there are many people that fall into your category.

Xert measures it in energy units which results in a more complicated workout builder. I presume SUF models it as AC and MAP because they found it easier to design workouts, and for athletes to understand those workouts.

How do you do with a workout such as Nine Hammers?


yeah, that makes a lot of sense. Thanks for the explanation!

And yes, as you probably are guessing i do pretty okay at 9 hammers. It doesn’t feel great (lol) but even on aggressive / up-to-date numbers i can generally complete it if i’m not exhausted. Whereas FTP workouts are a lot harder for me.


But how would you know if you were one of the few riders for whom it worked?

And even if it did I think you’d need to have different versions for different sorts of races. Going above FTP during a long flat TT is important but if you were doing a race with lots of short hard climbs it’d make more sense to consider going above your aerobic threshold more.

Lots of software plots individual graphs of power vs time. With today’s computers I’m pretty sure you could count the number of times you get close to the line. I don’t know when W’ was first described but it seems to have been around for ages. I bet the bike computers back then were relatively simple.

on the first part, what do you mean? Like how would you test it on yourself? Because the main way i can think of to test any performance model is to give it a shot :slight_smile: . Then you can see whether, for your purposes, it’s a useful model with inevitable limitations, or not useful.

i agree about the different types of efforts needing a different model, or maybe rather, the W’ type models that i’ve seen thus far like Xert have not worked identically well for different types of use cases. Like for example, i used to use Xert back in the day, and after a long zwift TT it still told me i had plenty of MPA (they use the term Maximum Power Available i believe), and under their model I did, because i’d only been not that far above threshold, but in reality i had zero gas in the tank at finish.

If you did that then the sample size would inherently be 1 & there’s no way you could know any apparent association hadn’t just arisen through chance. You’d never know if it really worked or was just imagined. And even that’s imagining it is marked against an objective outcome in a double blind fashion.

well, it’s true that we’re constantly doing, over and over, what are essentially experiments with an n of 1. But that definitely, 100% does not mean that over time, you can’t learn what generally seems to work for you and what does not.

First, double blind experiments are not possible here. The person doing the test certainly knows what they are doing. Second, even with a large “n”, if the subjects studied do not correspond to your characteristics, the study is not relevant to you. Third, a sufficient number of controlled experiments on one individual can be equivalent to a large “n”. The problem with the last statement, of course, is that you are not the same after every experiment.

This is why understanding human biology is hard.
