This video received a lot of attention. So far it has received more than 300,000 views, 2,000 comments and almost 1,000 "likes". Moreover, the reactions to this video are all of a particular kind, nicely reflected in the title "PVV lady doesn't understand statistics".
Lilian Helder is an MP from the Partij Voor de Vrijheid (PVV) - indeed, the infamous Dutch right-wing party of Geert Wilders. Regular readers of this blog know that I'm against almost everything this party stands for: anti-Islam, anti-immigration, anti-development aid, anti-Europe, etc. As a result, I take any opportunity to take a shot at the PVV - even the cheap ones. However, Lilian Helder - probably without knowing it herself - touches on a fundamental issue in statistics.
Person A is indeed different from person B. In order to make a proper causal claim one would have to compare the occurrence of backsliding by person A after community service with the occurrence of backsliding by exactly that same person A after a prison sentence. Of course this is not possible - we only live one life, so we only ever observe one of the two outcomes. This problem is so important in statistics that it is called "the fundamental problem of causal inference".
Comparing person A, who receives community service, with person B, who receives a prison sentence, does indeed not make sense because they are different people. Any difference in the occurrence of backsliding between these two might very well be due to individual characteristics. Maybe person A is well-educated and quickly finds a job, while person B is not and thus might backslide. It is true that one can use statistical regression techniques to control for these factors. However, not all variables can be measured (psychological variables, for example), and one can never be sure that all the necessary variables are included in the regression. These are very important problems in causal inference.
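To make the point above concrete, here is a small simulation sketch. All the numbers in it are invented for illustration and do not come from any of the studies discussed: I assume half of the offenders are well-educated, that a hypothetical judge gives community service more often to the well-educated, and that the type of sentence has no effect at all on backsliding. The function name `naive_comparison` is my own.

```python
import random


def naive_comparison(n=10_000, seed=0):
    """Compare backsliding rates when sentences are NOT assigned randomly.

    All probabilities are made-up assumptions for illustration only.
    By construction the sentence itself has zero effect on backsliding.
    """
    rng = random.Random(seed)
    counts = {"community": [0, 0], "prison": [0, 0]}  # [people, backsliders]
    for _ in range(n):
        educated = rng.random() < 0.5
        # Assumed judge behaviour: the well-educated mostly get community service.
        p_community = 0.8 if educated else 0.2
        sentence = "community" if rng.random() < p_community else "prison"
        # Backsliding depends only on education, not on the sentence.
        p_backslide = 0.2 if educated else 0.6
        backslides = rng.random() < p_backslide
        counts[sentence][0] += 1
        counts[sentence][1] += backslides
    return {s: backsliders / people for s, (people, backsliders) in counts.items()}


rates = naive_comparison()
print(rates)
```

Under these assumptions the prison group shows a clearly higher backsliding rate than the community service group, even though the sentence does nothing. The whole gap comes from who ends up in which group.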
In order to make a correct causal claim one has to make sure to find a correct comparison. One way to do this is by a so-called Randomized Controlled Trial (RCT). Under this technique one compares groups instead of individuals. In our case above it would look as follows (I'll keep it brief). Let's say we have 10,000 people that have to be punished. We then randomly select 5,000 people for community service and 5,000 people for a prison sentence. Because these two groups have been selected randomly, the groups will on average have the same characteristics, including the ones we cannot measure. The number of people that backslide in each of these two groups (community service vs prison sentence) can then be compared. While this technique has been used in the bio-medical sciences for decades, it has only recently been introduced in the social sciences. For a more complete discussion (with development aid as an example) please see here (in English) or here (in Dutch).
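The 10,000-person example can be sketched in a few lines of code. Again, this is only an illustration under invented assumptions: half of the people are well-educated, backsliding depends only on education (20% vs 60%), and the sentence itself has no effect. The function name `randomized_trial` is hypothetical.

```python
import random


def randomized_trial(n=10_000, seed=0):
    """Randomly split n people into two equal sentence groups and compare them.

    All probabilities are made-up assumptions for illustration only.
    By construction the sentence itself has zero effect on backsliding.
    """
    rng = random.Random(seed)
    people = [rng.random() < 0.5 for _ in range(n)]  # True = well-educated
    rng.shuffle(people)  # random order, so the split below is a random assignment
    groups = {"community": people[: n // 2], "prison": people[n // 2:]}
    result = {}
    for sentence, members in groups.items():
        # Backsliding depends only on education, not on the sentence.
        backsliders = sum(
            rng.random() < (0.2 if educated else 0.6) for educated in members
        )
        result[sentence] = {
            "share_educated": sum(members) / len(members),
            "backslide_rate": backsliders / len(members),
        }
    return result


for sentence, stats in randomized_trial().items():
    print(sentence, stats)
```

With random assignment the share of well-educated people should be roughly the same in both groups, and so should the backsliding rates, which is the correct answer here because the sentence was assumed to have no effect. Any remaining difference is chance, and it shrinks as the groups get larger.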
I haven't read the studies that the MPs refer to, but it is quite possible that they did not make use of an RCT or a related strategy to make these causal claims. It pains me to write this, but Lilian Helder's remarks might not have been that dumb.