reviews SYSTM


The two areas of improvement:
Outdoor ride integration: At present, there are two areas that I believe need to be improved, and both are focused on the training calendar aspect rather than the content library. Despite being able to set a training plan, there is no way to accommodate outdoor riding into this, which I am sure the vast majority of riders would appreciate being able to do.

Analytics: SYSTM also lacks any true analytical tools, and if you have the option to use a power meter and train with focus and towards a goal, you’ll likely want or already use another software program with this analytical ability. The ability to track progress and fatigue are key factors.


Interesting read. Surprised that the result of FTP +14W in just over 4 weeks was buried in the text… the reviewer has had fun and gained heaps of power!

If I recall correctly, GvA has said the Minions are working on both of those areas for improvement.


I also felt like he buried the lede! He was so intent on proving the slogan wrong that he didn’t realize (or maybe value) how well the app was working for him.


“The idea of the 4DP training and focusing on different elements of your power profile is likely to have a big positive effect on your fitness and progress, but the 4DP test will not work for everyone; I believe it would be better to test each area separately to get truer numbers and power figures to train by.”

This bugs me, as I believe the whole ethos of SYSTM (for cycling) is based upon this test, and all the separate areas are bound into one test for the sole reason of avoiding the skewed numbers you get when each parameter is tested on fresh legs.

I fear that when someone has an expectation of what their power numbers are and then does not achieve them through 4DP, they fault the test because their expectations have not been met.

The 4DP test is hard, and the guidance around doing the test is sometimes not adhered to; the result is that the testee faults the test rather than his application of it.

I could be wrong. However, if The Sufferfest previously based itself on its prime test, and that test is regarded as not being for everyone, that will deter a number of people from ever joining SYSTM before they have even tried out the platform.

From my point of view, those are harsh words that the reviewer has laid down.


The reviewer (“Matt”, who classed himself as expert but gave no qualification to that term) came across as someone who was put out that FF returned an FTP value lower than his expectation (quite normal when doing FF, I believe).

Basically, “I didn’t like the result so the science is dubious”.



While I am a quite satisfied SYSTM user, the science of measuring cycling ability is fraught with difficulty.

The connection between the actual biological mechanisms and the measurements is imperfect. In addition, the measurements depend on population statistics, which makes their application to any individual potentially difficult.


I guess the reviewer is missing the main purpose of FF. Whether you could actually achieve a higher number in a different test is irrelevant.


I agree, which is why I choose to trust the sports scientists with years of experience, not only in the actual science but with a demonstrable record at the highest level.


Actually, my point is that you cannot completely rely on them. The nature of population statistics is that they may or may not apply to the individual rider. These statistics are a good starting point, but they have to be adapted to your particular circumstances.

4DP, WKO5, power curves, Xert fitness signatures, critical power, Coggan levels, FTP tests, etc. are all imperfect mappings to biology that can even vary with an individual rider on a given day.

For example, in a 20 minute test, riders will have varying aerobic or anaerobic contributions to their power. FF tries to compensate for that, but it is not guaranteed to work in all cases.
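To make that concrete, here is a rough sketch. It is not anything FF or SYSTM actually computes. It assumes the simple two-parameter critical power model, P(t) = CP + W'/t, and the commonly cited rule of thumb that FTP is about 95% of 20 minute power. The two riders, their CP and W' values, and the function names are all made up for illustration.

```python
# Illustrative only: two hypothetical riders with the SAME aerobic ceiling (CP)
# but different anaerobic capacity (W'), under the two-parameter critical
# power model P(t) = CP + W'/t. All numbers are assumptions, not measurements.

def twenty_minute_power(cp_watts, w_prime_joules, duration_s=20 * 60):
    """Best average power for the duration predicted by the CP model."""
    return cp_watts + w_prime_joules / duration_s


def ftp_estimate(power_20min, factor=0.95):
    """Rule-of-thumb FTP estimate: a fixed fraction of 20 minute power."""
    return factor * power_20min


# (CP in watts, W' in joules) for each hypothetical rider
riders = {
    "diesel": (250.0, 10_000.0),    # small anaerobic contribution
    "puncheur": (250.0, 25_000.0),  # large anaerobic contribution
}

for name, (cp, w_prime) in riders.items():
    p20 = twenty_minute_power(cp, w_prime)
    print(f"{name}: 20 min power {p20:.0f} W, FTP estimate {ftp_estimate(p20):.0f} W")
```

Under these assumptions, both riders share the same CP of 250 W, yet the fixed 95% factor gives them FTP estimates about 12 W apart, purely because of the different anaerobic contribution to the 20 minute effort.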

We do not have any biological markers to measure fatigue. I may have a great FTP or MAP, but how soon after a ToS can I achieve those levels again? Nobody knows for sure. There are averages, but they could be too short or too long for individuals.

In my particular case, I find the plans do not give me enough rejuvenation time between hard rides so I have to make changes to the plan. Years of experience at the highest levels does not map well to my experience levels.


Would you agree with the reviewer that doing individual tests is the way to go?

An AC effort 50+ minutes into a 4DP test will (surely/should) result in very different numbers compared to a standalone 1 minute all-out AC test. As I read it, standalone testing is what the reviewer is pushing for. I don’t get the science behind that. I do get the science behind a 4DP test.

Further, and not to deviate from this topic, what would be your thoughts on TR’s AI FTP deduction tool, wrt your notion of using science to measure cycling ability being fraught with difficulty?


The “science behind that” is that egos like bigger numbers.
Stand-alone tests are likely to generate bigger numbers.


4DP is a tool to allow focused application of further training and does that job fairly well.
I’ve been having a “similar” discussion with a Zwift user recently, where I have serious issues with the numbers that Zwift pumps out for the benefit of Strava. He is adamant that Zwift numbers are “accurate”; I keep pointing out that his average speed on Zwift rides is massively above (by more than a handful of kph) his fastest outdoor rides. His take is “if I pushed as hard outdoors as I do in Zwift races”, but the reality is that some of us in the group we ride with are as fast outdoors as his Zwift speeds, and he simply can’t keep up on outdoor rides when we step it up.

People want big numbers, it makes them feel good, even if it means nothing or, even worse, is counter-productive to actually producing results.


I think you misunderstood me. I don’t get the science of the reviewer pushing for standalone tests.

One sentence on, you should have realized that, and read that one quoted sentence not in isolation but as a continuation of what I say about the reviewer.

No, I agree with you.
I think his point is nonsense.
My point was that there was no “science” behind it; it’s simply his ego. My further point was that people will compromise their training choices, and even platform choices, for the sake of ego-bolstering numbers rather than choosing something that actually works for their original goal.

The same carries into a lot of other aspects of life also.


Yeah, it’s funny how he doesn’t question Half Monty as a test when it gives him higher numbers twice.

Not that he should, of course, but it shows up the way he’s thinking about FF.


The biggest issue I have with the review is that he mixed $ and £.

This makes SYSTM appear more expensive in comparison to the other platforms.

Also, he didn’t take into consideration the discount associated with an annual SYSTM membership. (But hey)


Zwift’s motto “Fun is Fast” is pretty telling regarding Zwift virtual vs irl. One reason that brought me over to Sufferlandria was my belief that I wasn’t as strong a rider as Zwift was making me out to be. You see some pretty incredible ride times from riders on Zwift and have to wonder about over-inflation of a rider’s ability. Better to have a Sufferfest video reveal the truth.


The first question about testing is deciding what you want to measure and how you are going to measure it. In cycling, we are fortunate that we can directly measure output, unlike running, where they have to rely on proxy measures such as heart rate, time, and distance.

Even power, as you note, has a complicated relationship to your past output (both earlier in your current ride and on previous rides). What part of the relationship is important to you depends on what type of riding you do, or your goals in riding.

Take myself: I generally do long/longer endurance rides, and I never race. I also wish to keep up my general fitness as I am in my very late 60s. Hence, for me, NM power (as opposed to neuromuscular recruitment) and recovery from repeated efforts (which is why the AC is measured where it is in FF) are not that important. So for me, if there were no AC test at the end of FF, I would not care. For people who race, or do other kinds of riding, recovery from repeated efforts is important. Since a 4DP test is for a large range of riders, I think the AC effort belongs where it is. Since I do not care about rider types (although the plans might), HM testing might be good enough for me. Yes, there are differences between the two, but for the riding I do, the difference in numbers really does not matter.

First, let us say that the AI moniker used in these kinds of tools is marketing. They are using Machine Learning, which is, for our purposes, just a very sophisticated (and useful) statistical analysis. All the caveats about population statistics apply here. For example, weather prediction ML models have trouble predicting 100 year storms because the data from which they have learnt generally does not include such events.

I am trying out Xert’s Adaptive Training Tool to help with workout and ride planning. So I take a SYSTM plan, and schedule the rides based on my recovery status. Their predictions are based on both my ride history and their statistical modelling.


If I were using a tool that relied on maximal efforts for its algorithms, then I would agree. However, maximal efforts are not relevant for SYSTM workouts nor for road cycling in general, except for the first couple of intervals, where a maximal effort may be possible.

That being the case, I think the reviewer’s assumptions show a lack of understanding regarding what 4DP represents. And that lack should be addressed.