Interverbal: Reviews of Autism Statements and Research: A Review of the Challenge- Dechallenge- Rechallenge Design

Thursday, October 11, 2007

A Review of the Challenge- Dechallenge- Rechallenge Design

Most of the readers here are probably familiar with testimonials that claim a given treatment was implemented for a young child and then caused drastic improvement. Many of us on the skeptical side of the argument would correctly note that this is a post hoc, ergo propter hoc fallacy and that just because a change was noted after the treatment doesn’t mean the treatment caused the change.

I think it goes without saying that pointing this out doesn’t always go over well. These issues deal with the core of what different people consider to be acceptable evidence. It also inevitably ties into an extremely emotive issue: their children’s health and wellbeing. Sometimes though, the people who have been accused of using such a fallacy do take the criticism to heart. They want to know what they could do to offer proof and be convincing.

I have been asked this several times. Frankly it is a frustrating question for me to answer, because I know the answer itself is unfortunate and unsatisfactory. And the answer is: “Not much”. The very limited list of possibilities includes:

1. Volunteering their child to be a participant in trials of their preferred treatments.

2. Conducting group based research like a double blind, placebo controlled, crossover design.

3. Advocating for studies to be undertaken concerning their preferred treatment.

4. Conducting a well controlled single case design.

The first option is unlikely because such trials are rare and the logistics of the thing make having the opportunity difficult. The second option is exceptionally hard because it requires significant technical expertise as well as considerable time and money. The third option is very indirect and often feels unsatisfying. It takes the ball out of the advocate’s court and places it into the hands of some third party.

This leaves the fourth option. This seems to be the option that many of the caregivers who accept the criticism of the post hoc fallacy attempt to use. Specifically, they attempt to make use of the Challenge- Dechallenge- Rechallenge Design. Said another way, this design has three phases: a treatment phase, followed by a baseline (non- treatment) phase, followed by another treatment phase. The advocates or caregivers claim this is a scientific design, which proves that a treatment was effective for their child.

To such advocates it may look like a good plan. This design is well considered in some mainstream scientific circles; it doesn’t involve the expense and difficulty recruiting for a group design or the detailed knowledge of inferential statistics. The basic mechanics of the design are easy enough to understand. It puts the ball back in the caregiver’s court. And maybe best of all, it seems to provide a direct answer about the effectiveness of a treatment, for someone very dear to the advocate or caregiver.

Unfortunately, a certain cliché is apt here: nothing is ever easy. The treatment- baseline- treatment design, even in the best possible case is a quasi- experimental design. It has reduced validity to answer if the treatment is what caused the improvement for the child. This has nothing to do with the fact that there is only one child. In fact there are several excellent single subject designs that allow us to be quite confident that treatment caused the improvement. Unfortunately, this design isn’t one of them.

What is worse is that even though this design, when conducted under excellent conditions and high control, might have some validity, it still requires years of study, guided practice, and hard work to understand. Single subject research has its own quirks and pitfalls just like any other type of research from epidemiology to double blind designs. The point is you are not going to be competent in this type of design from reading a book or researching the topic on the internet. You could become so, but it takes years of hard study and practice.

The concerned caregivers or advocates who claim that they administered a treatment, withdrew it, and administered it again are not drawing level with trained researchers who use this design. They don’t have the same safeguards and controls. A good research project isn’t just the basal design, it is the level and appropriateness of the controls in the design. A trained researcher’s design might have:

1. More than one participant being investigated at the same time or in sequence to control for maturation or outside factors.

2. Specific targeted behaviors, with objective definitions, as opposed to someone’s general impression of “wellness” or competency.

3. Appropriate, reliable, and valid assessment.

4. Inter- observer agreement assessment, to make sure that those who give the assessment or record the behaviors are doing so in a valid and reliable manner.

5. Careful notation of possible confounds that emerge over the conditions.

6. Appropriate graphical analysis to observe trend, level, variability, and possible patterns indicative of additional or outside concerns (e.g. cyclical behavior).

7. Careful control of the context and environment in which the child is exposed to the treatment.

8. Submission and analysis in peer reviewed publications, where one’s knowledgeable peers can point out problems, weaknesses, or graphical shenanigans.

9. Knowledge of what type of questions the design can and can not answer in the first place.

10. Knowledge that the conditions can not be arbitrarily switched, that there are rules for when we can transition to the next condition.

If one doesn’t have at least some of the above and especially 2, 3, 8, 9, and 10 intact, then they have no real design to speak of.
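To make item 4 a little more concrete, here is a minimal sketch of one simple way inter- observer agreement is often summarized: the percentage of observation intervals on which two observers recorded the same thing. The function name and the sample records are hypothetical and only for illustration; a real study would choose its agreement measure to fit the behavior and the recording method.

```python
def interval_agreement(observer_a, observer_b):
    """Percent of intervals on which two observers recorded the same thing."""
    if len(observer_a) != len(observer_b):
        raise ValueError("Both observers must score the same number of intervals")
    if not observer_a:
        raise ValueError("Need at least one interval")
    # Count the intervals where the two records match exactly.
    agreements = sum(1 for a, b in zip(observer_a, observer_b) if a == b)
    return 100.0 * agreements / len(observer_a)


# Hypothetical records: True means the target behavior was seen in that interval.
parent = [True, True, False, True, False, False, True, True, False, True]
second_observer = [True, False, False, True, False, True, True, True, False, True]

agreement = interval_agreement(parent, second_observer)
print(f"Agreement: {agreement:.0f}%")
```

A caregiver recording alone has no second record to compare against, so there is no way to compute even this simple check.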

Look at Figure 1. This is a graph of a hypothetical treatment- baseline- treatment design. There are only two points of comparison in this design: the first treatment phase to the baseline, and then the baseline to the second treatment phase. We have absolutely no idea what was going on before the first treatment phase was implemented. And even if there was a dip in the baseline condition we have no idea if this was an artifact from an outside event or maybe based on a cyclical pattern of behavior. If you only have two points of comparison it is very hard to know.
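The "two points of comparison" idea can be sketched in a few lines. The numbers below are entirely hypothetical (they are not data from any study), and comparing phase means only looks at level; it ignores trend and variability, which a real graphical analysis would also examine.

```python
from statistics import mean

# Hypothetical session scores for each phase, in the order they were run.
phases = [
    ("treatment 1", [8, 9, 8, 9]),
    ("baseline", [5, 4, 5, 6]),
    ("treatment 2", [8, 9, 9, 8]),
]


def adjacent_comparisons(phases):
    """Return (earlier phase, later phase, change in mean level) for each adjacent pair."""
    results = []
    for (name_a, data_a), (name_b, data_b) in zip(phases, phases[1:]):
        results.append((name_a, name_b, mean(data_b) - mean(data_a)))
    return results


comparisons = adjacent_comparisons(phases)
for earlier, later, change in comparisons:
    print(f"{earlier} -> {later}: change in mean level = {change:+.1f}")
```

Three phases yield only two adjacent comparisons, and nothing in these numbers says what the child's level was before the first treatment phase began.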

This is Figure 2. This is what happens when someone tries to use a treatment- baseline- treatment design for a task that involves learning. You get stability where we should see variability between the phases. So, even if the treatment was biomedical and the measure was on something not easily lost like acquisition of language, the negative effect of withdrawing the treatment would not be observed during the language tests.

The next time someone says they used a Challenge- Dechallenge- Rechallenge design, which is an appropriate scientific research design, ask them if they can send you the write up so you can see if they used the same controls the trained scientists do.

8 Comments:

Club 166 said...: Thanks. It's always good to see real science explained. I doubt that those that really need to see it will look at this or understand, but it's good to have it out there, anyway.

Joe; 7:55 AM
Anonymous said...: I think it goes without saying that pointing this out doesn’t always go over well.

The understatement of the year, Interverbal.; 8:17 AM
M.J. said...: Many of us on the skeptical side of the argument would correctly note that this is a post hoc, ergo propter hoc fallacy and that just because a change was noted after the treatment doesn’t mean the treatment caused the change.

I think this is the crux of the argument. Just because it can doesn't automatically imply that it is a fallacy. Furthermore this is standard for attempting to prove a fact scientifically.

None of the commonly used therapies for autism have been proven scientifically. So if you are to take the "hasn't been proven" position you basically rule out doing any thing to help your child.

I don't think that any parent who try biomed or other therapies are looking to prove the science - they are looking for therapies to help their children. So in those cases the standard of proof is much lower.; 7:51 AM
Interverbal said...: Hi MJ,

Forgive me for the dissection. Your post was very interesting and deserves a point by point response.

You write "Just because it can doesn't automatically imply that it is a fallacy."

I can think of no occasion where the above is true. If you can, you are welcome to share it.

You write "Furthermore this is standard for attempting to prove a fact scientifically."

Again, I can't think of an occasion where that is true. If you can, please share it.

You write "None of the commonly used therapies for autism have been proven scientifically."

The best researched category is education based interventions. These are common. The group based research in these therapies, whether they are ABA or otherwise, shows mixed or slight effect.

The research on smaller strategies such as chaining, fading, picture schedules, PECS, etc. are usually conducted using multiple true-single case designs and are much more robust. However while they can prove that a child learned a skill because of the method, they can not prove that the child would not have gained it through natural progression given more time.

You write "So if you are to take the "hasn't been proven" position you basically rule out doing any thing to help your child."

I advocate specific education based interventions that have robust findings in research; this is regardless of whether or not they arose in the field of ABA. I am well aware that they already exist. An IEP team can help you spot which ones are right for your child.

I think that an IEP team, who includes team members who have had a chance to observe the child in real life, and have been able to administer appropriate, reliable and valid assessments can create a very accurate picture of the child's present level of ability.

Based on this, I advocate that specialists such as reading specialists, behavior specialists, speech pathologists, etc. can help the team craft an IEP that includes research based practices.

You write "I don't think that any parent who try biomed or other therapies are looking to prove the science - they are looking for therapies to help their children."

I agree, I don't think anyone starts out that way. But if it does seem to help, I think many of these folks, then consider it to be "proved" or "scientific". They seem (based on what I have read from what they themselves write) to latch on to the C- D- R design in particular, without ever really understanding what this design entails.

You write "So in those cases the standard of proof is much lower."

I would go so far as to say that many things in life don't require a full out, hard core research design. The issue just isn't that important vs. the time and effort required to do the research.

But, tell me, when is the last time you saw someone say: “Yes we proved that the GF/CF worked well for our Susie, but our standards are an awful lot lower than real scientists”. I have never seen this happen. Now maybe it does, but I am far more used to seeing statements about “proof” without the qualifier of a lower level. I am also very used to seeing an attempt to latch onto the C- D- R design, but I see no evidence that these folks actually understand such a design.

If advocates want to claim science, they need to do the science. Please also see my new post on what modifies the need for research.; 10:01 AM
M.J. said...: Interverbal,

I apologize for the length of this comment, as my wife likes to point out, I can't seem to say something without rambling on for a while.

You write "Just because it can doesn't automatically imply that it is a fallacy."

I can think of no occasion where the above is true. If you can, you are welcome to share it.

I don't think this is a matter of a specific example. I think this is an exercise in logic. Using the following example (lifted from wikipedia):

The form of the post hoc fallacy can be expressed as follows:

A occurred, then B occurred.
Therefore, A caused B.

The fallacy here is assuming that A causes B.

However, it is equally invalid to assume that A does not cause B - not without considering the other facts involved.

So as a general statement claiming that every testimonial is fallacy is equally invalid. You don't really have enough information to make a determination either way.

You write "Furthermore this is standard for attempting to prove a fact scientifically."

Again, I can't think of an occasion where that is true. If you can, please share it.

My point was that this is a higher standard to "prove" something is true or not to a reasonable standard. For more mundane, everyday events we all make assumptions that if B follows A that A could have caused B.

A simple example would be that after touching a hot object a blister develops. Most people would assume, correctly, that it is likely that touching the object caused the blister.

People are good at making this sort of relative judgment call. Now obviously this is an overly simple example. And when you get into complex areas where the cause and effect isn't obvious or there are many events occurring at the same time then this breaks down and people can jump to the wrong conclusion.

However, as an everyday standard when it doesn't really matter if the given concept is "proven" it is a good rule of thumb. Which comes back to my statement that worrying about the higher standards of proof is really only a scientific matter.

The best researched category is education based interventions. These are common. The group based research in these therapies, whether they are ABA or otherwise, shows mixed or slight effect.

Hence they are not "proven" to work. And from my understanding, the studies of ABA type therapies were not in the autism domain, so the evidence that they help autism is rather weak. And yet they are still used because they are known to be effective with other conditions that are thought to be related.

Similarly most of the major biomed treatments are known to help with some other condition that is thought to be related to autism. It is the relation that is unproven, not the treatment. And to be clear, I am talking about some of the better biomed out there (GFCF, etc) not the snake-oil eat this herb and wear pink on Tuesdays variety.

I think that an IEP team, who includes team members who have had a chance to observe the child in real life, and have been able to administer appropriate, reliable and valid assessments can create a very accurate picture of the child's present level of ability.

I may be digressing from the main point, but after my twins have been through more than five or six assessments from different agencies for services, I have yet to see even one be a "very accurate picture" of their current ability.

The only ones who come close to having an accurate view are the BSC and TSS who have been working with them for months. And that understanding developed over the months.

I agree, I don't think anyone starts out that way. But if it does seem to help, I think many of these folks, then consider it to be "proved" or "scientific".

I would have to disagree with your conclusion here. The people that I have run into tend to say that a treatment worked for them and that they had good results with it. I have not seen many people who claim that a given treatment is a proven fact.

However, on the other hand, they may suggest trying it because they had good results with it.

This is where the research should step in and attempt to verify the assumption one way or the other.

The problem is that this has not happened very well to date; most of the research can be called inconclusive at best.

Until this happens properly, the best you can hope for is the recommendations of what other parents have had success with and best practices and suggestions from medical professionals.

But, tell me, when is the last time you saw someone say: “Yes we proved that the GF/CF worked well for our Susie, but our standards are an awful lot lower than real scientists”.

I actually try to say that. I have seen good results with my twins on the GFCF diet (although the majority of the improvements was from the CF part). I try to remember to qualify the statement that it is my experience rather than a general fact.; 1:40 PM
Interverbal said...: Hi MJ,

No problem re the length, some topics require a bit of length to do them justice.

You write “However, it is equally invalid to assume that A does not cause B - not without considering the other facts involved.”

Even without considering other facts, we can not say that because someone uses a fallacy that whatever they were arguing for is factually incorrect. That is not the way logic works. Logic can only tell us if the argument or proof was made in a valid way.

When we break this rule we use what is called argumentum ad logicum, and it is in itself a fallacy. However, the valid point you hint at concerning the argumentum ad logicum only works if we try to say that whatever was argued for fallaciously is not factually true.

If we say it is invalid we are safe.
If we say that we can’t know based on the logic used, we are safe.
If we say that it is poor logic, then we are safe.
If we say that it is a fallacy we are safe.
If we say that you can not logically know the truth based on the fallacy, then we are safe.

Again, it is only if we say that it is not factually true because it is a fallacy, that we use a no-no.

You write “So as a general statement claiming that every testimonial is fallacy is equally invalid”.

Did I say that? If so, please provide a quote. Not every testimonial is fallacy. However, I will say that every testimonial is anecdote, a lesser proof that leaves much to be desired. And I have no problem saying that every testimonial that has a post hoc, ergo propter hoc argument, contains at least one fallacy.

You write “You don't really have enough information to make a determination either way.”

Burden of proof MJ. It falls on the one trying to offer the proof, to give enough evidence for a valid decision.

You write “My point was that this is a higher standard to "prove" something is true or not to a reasonable standard. For more mundane, everyday events we all make assumptions that if B follows A that A could have caused B.”

I have written to some extent now about mundane issues and what isn’t worth research. I have also made some suggestions about what modifies a mundane issue into requiring research. If the issue is mundane, then we accept a lower level of proof. This doesn’t mean the proof was good, it just means it was good enough.

However, this issue was addressing your statement "Furthermore this is standard for attempting to prove a fact scientifically." So, which is it here? Does science accept a lower standard of proof in terms of the mundane? Or is this about what falls between the cracks of science because it isn’t worth the time and energy to do real research on it? And if it is the former, I want to see some examples please.

You write “A simple example would be that after touching a hot object a blister develops. Most people would assume, correctly, that it is likely that touching the object caused the blister.”

You are right, this is a mundane example. A blister is a minor issue, probably not worthy of research. Are there things that could modify the importance of this issue and lead to research? I think there could be. What if a rash of body blisters breaks out among very young children? Parents think it may be associated with a popular new electric blanket that gets quite warm. A rash of lawsuits begins based on the common sense premise that heat = blisters. However, maybe a little research here looking at whether the blanket gets hot enough to cause blisters is in order. Maybe the real cause is a fire retardant recently put into children’s clothes that causes an allergic reaction in some children.

The mundane issue, that fell through the cracks of science, suddenly could use science if we make it important. We upped the ante, in terms of what proof is good enough.

You write “People are good at making this sort of relative judgment call”

Are they? The parents in Hoover & Milich (1994) who were sure that sugar caused hyperactivity in their kids, couldn’t tell what their children were given by their behavior. The parents in the recent Elder et al. GF/CF study couldn’t tell what condition their kids were in. For my part I think people can sometimes be right and they can sometimes be wrong. I don’t place the issue in terms of the “people”; I place it in terms of how logically they made a point, or how strong their science based evidence was.

You write “However, as an everyday standard when it doesn't really matter if the given concept is "proven" it is a good rule of thumb. Which comes back to my statement that worrying about the higher standards of proof is really only a scientific matter.”

That is my point. This has been the core of what I have been writing about. You can’t claim science if you don’t actually do science. Again, I have written what modifies the need for science. And again, I think what is “proof” never changes. Merely the acceptable level of “proof”. The difference here is saying “we proved A causes B” vs. “we have enough proof for us”.

You write “Hence they are not "proven" to work”

No, they are not proven to provide a better effect than natural progression. The issue changes if we are willing to look at single case design as opposed to group design.

You write “And from my understanding, the studies of ABA type therapies were not in the autism domain, so the evidence that they help autism is rather weak. And yet they are still used because they are known to be effective with other conditions that are thought to be related.”

Of the many valid criticisms of ABA in autism, (some I have made myself), this simply is not one of them. There are well over a thousand (no joking) single case designs that have a true experimental design, concerning some aspect of behavior analysis and autism. They don’t all cover the same technique however. The research concerning behavior analysis in general is far larger still.

You write “Similarly most of the major biomed treatments are known to help with some other condition that is thought to be related to autism. It is the relation that is unproven, not the treatment.”

Yes, I understand this point. I believe my writing reflects this. If not you are welcome to quote me.

You wrote “I may be digressing from the main point, but after my twins have been through more than five or six assessments from different agencies for services, I have yet to see even one be a "very accurate picture" of their current ability.”

I am sorry, without knowing the details I can’t offer any comment here.

“I would have to disagree with your conclusion here. The people that I have run into tend to say that a treatment worked for them and that they had good results with it. I have not seen many people who claim that a given treatment is a proven fact.”

I can only offer anecdotes based on the interactions I have had with pro biomed types on autism forums, or the various autism blogs. I maintain that my earlier statement represents an accurate picture of the argument given by many (not all) of them.

You write “This is where the research should step in and attempt to verify the assumption one way or the other.”

I agree, and I think it speaks well of you that you are willing to put this to a true test (one that could go either way). There are some advocates and parents who think that it is research’s job to verify what they think. If the evidence is against what they think then the evidence is wrong. That is not the way science works.

At the moment we have had brief and imperfect attempts to do research in autism for the GF/CF diet and for B12. In both cases the research found no benefit.

You write “I actually try to say that. I have seen good results with my twins on the GFCF diet (although the majority of the improvements was from the CF part). I try to remember to qualify the statement that it is my experience rather than a general fact.”

Noted, you are the first person I have talked to who takes this approach.; 5:12 PM
M.J. said...: Even without considering other facts, we can not say that because someone uses a fallacy that whatever they were arguing for is factually incorrect.

I think we are saying the same thing here. You cannot assume that simply because B follows A that A caused B nor can you say that B was not caused by A. In many cases there simply isn't enough information either way. This is what I was attempting (probably badly) to say the first time.

Did I say that? If so, please provide a quote. Not every testimonial is fallacy. However, I will say that every testimonial is anecdote, a lesser proof that leaves much to be desired.

I'm sorry, I tend to assume from reading other blogs that when someone asserts that testimonials can contain fallacies that is implying the statement that they are fallacies.

And I have no problem saying that every testimonial that has a post hoc, ergo propter hoc argument, contains at least one fallacy.

Just for the sake of argument I would say could (or even probably) contain at least one fallacy - the minute you make it an absolute you are going to run into the cases where the argument is valid.

Burden of proof MJ. It falls on the one trying to offer the proof, to give enough evidence for a valid decision.

True - and I think they either prove or fail to prove the argument. But it does not follow that failing to prove the argument means disproved, rather it seems it would be unproven.

Does science accept a lower standard of proof in terms of the mundane? Or is this about what falls between the cracks of science because it isn’t worth the time and energy to do real research on it?

I think it is a combination of the two. For items that are more mundane science will assume that the common sense answer is the valid one because it is assumed correct.

A relevant example here is the refrigerator mom theory of autism. It was put forth and to a large extent considered correct at the time. So how many other research projects started from the basic assumption that the theory was correct?

The parents in Hoover & Milich (1994) who were sure that sugar caused hyperactivity in their kids, couldn’t tell what their children were given by their behavior.

Without reading the study I would hazard to guess that parents can tell when their child is being hyperactive, not necessarily the cause of said hyperactivity.

Of the many valid criticisms of ABA in autism, (some I have made myself), this simply is not one of them.

Well, it isn't the first time that my understanding was wrong and certainly won't be the last.

I can only offer anecdotes based on the interactions I have had with pro biomed types on autism forums, or the various autism blogs.

I think the problem here can be two fold. First, as a general rule, only the more opinionated people tend to blog or write comments. Second, I think the issue of autism has become very polarized, especially in online forums. When you add those two facts together what you get is a lot of rhetoric.

If the evidence is against what they think then the evidence is wrong. That is not the way science works.

I would welcome some real studies on the topics. However, I think part of the problem is that the studies are being performed by people or organizations with a conflict of interest. So when you have people who appear to have an agenda (like Geier) or organizations with conflicting roles (drug companies) putting out research, people tend not to trust it - even if it is perfectly valid.

At the moment we have had brief and imperfect attempts to do research in autism for the GF/CF diet and for B12. In both cases the research found no benefit.

As to B12 - I don't really have any knowledge of it so I can't comment.

For the GFCF I assume that you are referring to the Elder et al. study. I don't find that study is particularly meaningful for establishing that the GFCF diet doesn't work. The study doesn't begin to account for how the diet is supposed to work (at least for my understanding of how it works).; 7:32 PM
Interverbal said...: Hi MJ

You write “Just for the sake of argument I would say could (or even probably) contain at least one fallacy - the minute you make it an absolute you are going to run into the cases where the argument is valid.”

By definition, what I have offered is correct. There are no cases with valid post hoc, ergo propter hoc arguments. If you can find a valid example against this, I will be happy to modify my argument.

I think you are struggling with the concept a bit. Similar to how many people don’t understand that an insult or name calling is not always an ad hominem. There are things that look sort of like post hoc logic, but they have controls or some sort of manipulation of variables that change the nature of logic. A single subject, reversal design (baseline- treatment- baseline- treatment) kind of looks like post hoc logic at first, but the analysis is quite different.

You wrote “I think it is a combination of the two. For items that are more mundane science will assume that the common sense answer is the valid one because it is assumed correct.”

Well no…. Science requires that an idea be potentially disprovable. If you don’t have that, you really don’t have science. Also, science moves through stages of proof; this ranges from the doubtful to the tentatively acceptable. The issue doesn’t change the level of “proof”, merely the level of proof we accept.

You write “A relevant example here is the refrigerator mom theory of autism. It was put forth and to a large extent considered correct at the time.”

Now you have just opened up a can of worms. The psychoanalytic field is not based on science. It is a pseudo scientific field. Karl Popper, the philosopher of science from whom we get the “must be disprovable” logic discussed above, made it clear that psychodynamic theory wasn’t science.

A lot of the real scientists of the time didn’t buy the cold mother theory. Dr. Rimland wrote “Infantile Autism”, a book that did more than anything else to decrease this myth’s popularity. But he was not the first one to make the argument and he relied heavily on existing science of autism done by others to make his case.

The case you describe here isn’t one of science having a lower standard, but of non-science that became popular for a short time. Also, as recently as 2005 there was a letter in a US psychoanalytic journal about how autism could result from mother’s eye contact. In France and in other places in Europe, psychoanalytic theory is still all the rage. Go down a few posts to read what they are up to in France.

You write “So how many other research projects started from the basic assumption that the theory was correct?”

A lot, maybe an awful lot. The difference here is whether the theory can be potentially disproved and if the advocates of the theory are honest enough to give it a fair trial. If you can’t potentially look like an idiot (be proved wrong) you can’t do science.

You write “Without reading the study I would hazard to guess that parents can tell when their child is being hyperactive, not necessarily the cause of said hyperactivity.”

That seems like a reasonable guess, but it is not correct. The Hoover and Milich study randomized supposedly sugar reactive kids with hyperactivity into two groups. Both groups received a drink with artificial sweetener. One group’s parents were told it was artificial sweetener; the other group’s parents were told it was “real” sugar. The parents who were told it was real sugar rated their children’s behavior statistically significantly worse.

But my all-time favorite study showed teachers what they were told were videos of an ADHD boy in his classroom. In reality the boy was just a typically developing but active child. The teachers began to label all of his obvious and unmistakable ADHD behaviors.

Why did smart people like the parents and the teachers get it wrong? I would recommend that you go to wikipedia and look up the Forer Effect and confirmation bias.

You write “For the GFCF I assume that you are referring to the Elder et al. study. I don't find that study is particularly meaningful for establishing that the GFCF diet doesn't work. The study doesn't begin to account for how the diet is supposed to work (at least for my understanding of how it works).”

The problem is, there seems to be no central authority on the GF/CF. I strongly suspect that if the diet had been changed to reflect what you wanted, I would be having this exact same conversation with someone else. It sure is hard to hit a moving target.; 9:54 AM
