Sunday, October 28, 2007

Comparing Group Design and Single Case Design

One of the claims that I hear from the more scientifically oriented readers or other commenters in the blogosphere is that the way to be really certain of an answer in science is to use some variant of the blinded methodology with random assignment.

My thoughts are “Well… not exactly”. Group designs are the right tool to assess certain scientific questions, but not others. It really depends on the nature of the question. Below I have created a brief list of some of the differences between single case design and group design research logics. My objective is not to thoroughly explore the differences, but merely to offer a brief introduction.

Group Design

-Tests the null hypothesis

-Uses deductive logic

-Uses inferential statistics

-Washes out the individual variance in data

-Effect significance is assessed in objective terms

-Compares the experimental group to the control group

-Uses reliable and valid assessment


Single Case Design

-Uses inductive logic

-Answers the question of to what extent the change in the independent variable caused the change in the dependent variable

-Uses graphical analysis

-Can detect patterns in the data that might otherwise be missed

-Effect significance is determined in a more subjective manner, gray area is possible

-Systematically alters the level/presence of the independent variable

-Uses objective behavioral definitions and inter-rater reliability checks


23 Comments:

Blogger Alyric said...

I think you left out the single most important difference. Your group design answers can be inferred to be useful outside of the local context. Your single study design is limited to the local context. Therefore one might say that the usefulness of single study design is inherently limited.

6:53 AM  
Blogger Interverbal said...

Hi Alyric,

Actually, I did cover that. I mentioned that group design uses deductive logic and single case design uses inductive logic.

The implication of this is that the findings of single case designs apply to the individual not the group.

You write " Therefore one might say that the usefulness of single study design is inherently limited."

Careful.... while that is true it is worded in a way that might be misleading. All designs have limitations. I can as easily spin that around and say that a group design has limited utility to tell me how effective an intervention was for any one person in the research.

And this is exactly why I argued for understanding what type of question we are seeking to answer. There is no one-size-fits-all in research.

8:58 AM  
Anonymous Anonymous said...

I think the problem with single case designs is that they don't adequately control for coincidence, placebo response and bias.

Also, if I find 100 single case reports in the literature reporting positive results, how do I know they aren't self-selected coincidences? I would venture a guess negative results are not published as often.

10:42 AM  
Blogger KeithABA said...

Joseph,

In reference to your first comment, I completely disagree. In fact in an ABAB design, placebo effect and coincidence are strongly controlled for by the removal of the independent variable.

In reference to your second comment: no, studies in which the independent variable is not shown to have a reliable and predictable effect on the dependent variable do get published. Sometimes articles like that are published in JABA. The point being that the journal saw an advantage in publishing a given report because it may lead to better or revised future research.

During analysis of group designs, there are often several individuals in the group who did not show statistically significant change as a result of the intervention. They somewhat fall through the cracks due to the nature of statistical analysis. Not 100% of people who take antidepressants in medical research improve. Would you say the group design is worthless because of the 2-3% of people for whom the independent variable did not have the desired effect?

12:45 PM  
Blogger Interverbal said...

Hi Joseph,

There are seven major threats to the validity of research:

1 History (cultural occurrences while the research is occurring)
2 Experimental mortality
3 Regression to the mean
4 Maturation
5 Instrumentation
6 Testing bias
7 Selection bias

Good group designs control for these, just as good single case design does (although they apply in different ways).

To answer your points:

-Coincidence

There is more than one type of single case design. Some are pretty robust. The best ones leave almost no standing for this criticism.

Let's take the reversal design for example. Changes between conditions only occur when certain criteria are met. The widely agreed informal rule is to have 3 stable data points before you switch conditions.

This has led to some long baselines. But this in itself can be good because certain data patterns like cyclical behavior may become evident from the baselines alone.

The dependent variable is continuously measured in response to repeated manipulations of the independent variable. This can be done in a pre-experimental way using baseline-treatment data. Or in a quasi-experimental way with the ever popular Challenge-Dechallenge-Rechallenge design.

It can also be done in a true experimental way with a baseline-treatment-baseline-treatment reversal design. Often the reversal will be carried out more times than that. The data must always immediately rise/fall in response to the treatment condition and must always immediately reverse in response to the baseline condition.

The odds of repeated "coincidences" in the data begin to look pretty slim. If the reversals are carried on for 6 conditions or more, then the odds are infinitesimal.
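To put rough numbers on this, here is a minimal sketch and not a formal analysis. It assumes, purely for illustration, that each change of condition is an independent event and that the data would shift in the predicted direction by coincidence with some fixed probability. Both the function name and the probability values are hypothetical; real sessions are not independent coin flips, and the stability and immediacy requirements would plausibly push the per-change probability well below one half.

```python
def chance_all_changes_match(n_conditions: int, p_per_change: float) -> float:
    """Probability that every change of condition coincidentally shifts the
    data in the predicted direction, assuming independent changes."""
    if n_conditions < 2:
        raise ValueError("need at least two conditions to have a change")
    if not 0.0 <= p_per_change <= 1.0:
        raise ValueError("p_per_change must be a probability")
    n_changes = n_conditions - 1  # e.g. ABAB has 4 conditions, 3 changes
    return p_per_change ** n_changes


if __name__ == "__main__":
    # Illustrative values only: 0.5 is a coin flip, 0.2 is a stricter guess.
    for n in (2, 4, 6, 8):
        print(n,
              chance_all_changes_match(n, 0.5),
              chance_all_changes_match(n, 0.2))
```

The point of the sketch is the shape of the curve: under these assumptions each added reversal multiplies the chance of a pure coincidence by the per-change probability, so it shrinks geometrically with the number of conditions.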

And remember, this is just the reversal design. Other single case designs handle this in different ways.

-Placebo

Now this criticism might have some validity for certain designs. This would be legitimate for a vanilla reversal design. However in some reversal designs, one or more additional treatment conditions are added. This serves, in part, as placebo control.

Also, I think it is worth noting that not all true experimental designs in group research use placebos either. This is especially true in certain crossover designs or in matched pairs designs.

My point is not to knock group design, but to show that placebo control is not a requirement for a true experimental design.

-bias

Interobserver agreement is almost universally taken in single case design. I can't remember the last time I saw a single case design without this control.

This is where two or more observers independently watch the same behaviors. Their data are then compared and an agreement percentage is calculated.

This controls for bias. Also, sometimes the behaviors involve permanent products (e.g., math dittos), so there is little chance for bias. Also, sometimes reliable and valid assessments are used in place of objective behavior definitions.

And don't forget there are more than a few examples of hybrids or designs that combine parts of single case and parts of group design.

Also, you are right, negative results of single case design are difficult to publish just like they are in group design. That seems to be a universal truth, no matter what paradigm one operates in.

However I can think of several examples where negative results were published. Late in my undergrad career, stimulus-stimulus pairing to increase vocalizations in non-verbal children was looking really promising. It made perfect sense theoretically.

I was really excited to potentially add to the research base in this area, but by the time I became a grad student the existing research didn't show the effect I had hoped. This is only because the journals published negative results.

1:09 PM  
Blogger KeithABA said...

This is a conversation I have been waiting to see on the Autism blogs for a long time.

Like Interverbal pointed out, the studies are good for doing different things. Each design has its strengths and weaknesses.

A single subject design would not be good for testing if a 40 hour a week ABA program helps children with autism.

But just because such a paper (other than the Lovaas paper) doesn't exist, does not mean you can draw the conclusion that ABA has little or no empirical support. Despite that fact there are numerous claims that this is so.

The individual procedures that would be utilized within a 40 hour a week ABA program would come from the vast wealth of research done using within subject designs (which are not always only with a single subject).

Speech therapists frequently use the Kaufman cards to treat apraxia. This has been going on for over 10 years, and there is little to no experimental research. Same can be said for auditory processing disorder and listening to beeps on headphones. In fact, ASHA admits there is a huge debate as to whether it is effective. Yet insurance companies are paying for speech therapy across the country on what is referred to as best practice.

To deny that ABA is among the most research supported "best practices" because there aren't good group design studies baffles my mind...

This was not the intention of the original post made by interverbal.

So any rebuttals should be aimed at me, not interverbal, unless you are specifically talking about the strengths and weaknesses of the designs.

1:53 PM  
Anonymous Anonymous said...

Interverbal: Let me just address interobserver agreement as a control for bias. Usually I believe observers expect positive change, so it would not be unusual for both observers to agree on their assessment. It's not implausible that this is what accounts for overall improvements generally observed in placebo-controlled trials in autism.

So I'd say this control is not a great control. The control for coincidence is pretty good. The control for placebo effect basically does not exist, which, sure, is true of some group designs as well, even in Level-I randomized trials.

7:04 PM  
Anonymous Anonymous said...

KeithABA: In ABA there are many group studies, and I think that's where its efficacy should be evaluated. The main reason I doubt ABA efficacy is that the most rigorous trial (the only one that is Level-I, randomized) was largely unsuccessful. That's Smith et al. (2000) with errata (Smith, 2001).

Whenever effects fade away as you improve trial methodology, that's a red flag. (I'm thinking homeopathy.)

7:08 PM  
Blogger Interverbal said...

Hi Joseph

I agree that observers expect positive change. However remember that in most single case designs there are many opportunities or occurrences for data to be recorded. In order for your criticism to stand we have to accept that this subtle bias is not only shared, but that the bias shows up at the same time in the same way.

Let’s say that Bob’s bias was demonstrated in intervals #18 and #22 of a 30 interval scoring period, by incorrectly marking a “non-occurrence” of a behavior targeted for decrease. What you are proposing here is that the other independent observer, Susie, would also mark #18 and #22 as non-occurrences. And that Bob and Susie would be consistent enough in this regard to score at least 85% agreement overall (if they want to get published, and often higher). I would be openly skeptical of this claim.
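One way to explore the Bob and Susie scenario is a back-of-the-envelope calculation. The sketch below is only an illustration under stated assumptions: the behavior truly occurs in a known number of intervals, each observer independently fails to mark a true occurrence with the same probability (the hypothetical `bias_rate`), and neither observer ever marks an occurrence that did not happen. The function name and all the numbers are made up for the example.

```python
def expected_agreement(n_intervals: int, n_occurrences: int,
                       bias_rate: float) -> float:
    """Expected interval-by-interval agreement between two observers who
    each independently drop true occurrences with probability bias_rate."""
    if n_intervals <= 0 or not 0 <= n_occurrences <= n_intervals:
        raise ValueError("need 0 <= n_occurrences <= n_intervals, n_intervals > 0")
    if not 0.0 <= bias_rate <= 1.0:
        raise ValueError("bias_rate must be a probability")
    # On an interval where the behavior occurred, the observers agree if
    # both mark it or both drop it.
    both_same = (1 - bias_rate) ** 2 + bias_rate ** 2
    # Intervals with no behavior are always scored the same under these
    # assumptions, so they count as agreements.
    n_quiet = n_intervals - n_occurrences
    return (n_quiet + n_occurrences * both_same) / n_intervals


if __name__ == "__main__":
    # 30 intervals, behavior truly present in 20 of them (illustrative).
    for bias in (0.0, 0.05, 0.1, 0.3, 0.5):
        print(bias, round(expected_agreement(30, 20, bias), 3))
```

Under these assumptions, the larger the independent bias, the more the two records diverge and the lower the agreement score falls. For a sizeable shared bias to survive an 85% agreement criterion, it would have to land on the same intervals for both observers.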

The control for placebo in vanilla ABAB is non-present, I agree here. However, if a second IV is added (and sometimes they are) then this aspect is controlled for. Also, moving past the reversal designs, the Alternating Treatments Design involves 2 or more treatments randomly alternated at least once every 3 sessions. There again, this is controlled for.

Also, I am going to have to disagree with you in terms of looking at how to assess the efficacy of ABA. If the question is, does this ABA program produce significant results in terms of standardized testing when compared to a control group? Then yes, a group design is the right design. Also, I agree with you re negative results in the presence of improved methodology throwing up red flags.

However, if we were looking at a specific aspect of ABA, and we wanted to know what kind of behavior pattern we would see, in the presence of the IV, then a single case design would be the right choice.

8:19 PM  
Blogger Interverbal said...

Keith,

I have also noticed that certain speech pathology practices (and OT practices too) lack any research based support, but seem to escape criticism.

I think there are a number of reasons why this is true. First and foremost I think this is because there have not been the extraordinary claims about recovery or cure due to speech therapy.

Although I suspect there are other reasons as well. A post and discussion on that topic would be really interesting.

8:28 PM  
Blogger Alyric said...

Hey Jonathon

"Answers the question of to what extent the change in the independent variable caused the change in the dependent variable"

The really big question, to which there is no watertight answer, is what your ABA practitioner determines is the independent variable and the dependent variable. Having read some apparently quite respectable papers (the authors have been well known at any rate), to my reckoning they seemed a bit arbitrary. You know the mathematical theorem - let x = and carry on from there? It's nice in theory, but not so good in the real world with real people who do weird things on a regular basis for no obvious reason. I know you think that people are more tabula rasa than that, but I don't and I've yet to find anything to persuade me that that is a wrong view of people.

As for controlling for bias. Hmmm. I don't think this is done at all in a way that might make sense to people of scientific bents who are not behaviourists. Even in single subject studies - surely it would be possible to blind the observers to the purpose of the experiment at least. Even better if the observer was not of a behaviourist background, since much of the 'objectivity' of the professional here isn't so much objective as a species of group think. Your inter-rater reliability isn't worth much really.

Cheers

Al

PS

This is one of my favourite quotes

"It is on the whole an inappropriate methodology [behaviourism] in developing improved cognitive images of complex, unstable systems with changing parameters and cumulative structures, where rare events are significant. Humans are a supreme example of systems of this kind" K. E. Boulding (1984)

6:50 AM  
Blogger KeithABA said...

Joseph: Smith et al. (2000) with errata (Smith, 2001). Yeah, that one didn't look so good for an intensive ABA program. I think that the hard part of evaluating that article is that the specific procedures utilized are not listed in the article.

To me the more important question is, what did Smith do that worked and didn't work. The answer can't be ABA, because ABA isn't a procedure. In other words, there was no component analysis, and I'm not sure there could be...

Again, what is important is the distinction between a 40 hour program, and its ability to "recover" kids, versus the efficacy of specific procedures used to address specific behavior.

For the latter, there is a slew of good evidence, much of it replicated. A group study wouldn't be good for many of these procedures. For example, take a child whose tantrums are maintained by escape vs. a child whose tantrums are maintained by attention. A group study testing one procedure such as escape extinction simply would not work to reduce tantrums for the child whose behavior was maintained by attention. A group design with randomized groups would thus only work if all the children in the study already had a functional analysis that determined the function of the behavior was escape. Only then could the group study be used to evaluate the efficacy of escape extinction between subjects.

Interverbal: The claims are probably a big part of it. However, it seems like bloggers think ABA is getting sooooo much funding. In comparison to speech and OT, the funding is a minute fraction of the total amount of dollars spent.

8:22 AM  
Blogger Interverbal said...

Hi Alyric,

Well we are officially off topic now, but since the question is interesting, let's plow into it.

You write “The really big question, to which there is no watertight answer, is what your ABA practitioner determines is the independent variable and the dependent variable. Having read some apparently quite respectable papers (the authors have been well known at any rate), to my reckoning they seemed a bit arbitrary.”

I have never seen this argument before, and I find it interesting. Arbitrary in what way?

You write “It's nice in theory, but not so good in the real world with real people who do weird things on a regular basis for no obvious reason.”

Are you arguing that in the real world people do unexpected things, therefore behaviors do not have a cause?

You wrote “I know you think that people are more tabula rasa than that, but I don't and I've yet to find anything to persuade me that that is a wrong view of people.”

Really, what level of tabula rasa do you think I consider people: quasi tabula rasa, 3/4 tabula rasa, 1/16 tabula rasa? Hey, just asking. You know, I have previously gone to some length in an attempt to explain that my position is not the same as that of an enlightenment philosopher who has been dead for over 200 years. However, if you think otherwise please feel free to quote me where I argue that humans = tabula rasa.

Like other behaviorists, I am a determinist. However, the same is also true of more than a few in cognitive science. If you are seeking (feel free to correct this) to establish a case for free will, you are not necessarily going to do better with the Pinkers of the world.

”As for controlling for bias. Hmmm. I don't think this is done at all in a way that might make sense to people of scientific bents who are not behaviourists.”

I am sorry this doesn’t make sense to you. Feel free to offer a specific criticism as to why we should expect bias to cause the same answer on the same intervals in a highly consistent manner.




You write “Even in single subject studies - surely it would be possible to blind the observers to the purpose of the experiment at least.”

Yes, you are right. But because we use interobserver agreement, I would argue that we fairly address this concern. I am willing to hear an argument against it, but you will have to deal with the interobserver agreement issue, and a statement about “group think” isn’t going to cut it.

You write “Even better if the observer was not of a behaviourist background, since much of the 'objectivity' of the professional here isn't so much objective as a species of group think.”

Group think is achieving a plan without adequate testing or criticism, for the sake of consensus. To get the same scores, merely having the same general bias isn’t going to do the trick. One would need not so much group-think, as telepathy. Ironically, wasn’t it you, who claimed to know what I was thinking earlier?

You write “Your inter-rater reliability isn't worth much really.”

And I am willing to listen to criticisms of it, but you need to form a cogent argument against it first.

8:28 AM  
Anonymous Anonymous said...

Certainly one thing ABA has that homeopathy (my example) doesn't have is plausibility. I mean, it's a training method, and I think there's no doubt you could use it to train skills.

That said, I think Lovaas (1987) was an unfortunately skewed and hyped paper. Most of the ABA discourse, and even recommendations by authorities, proceed from this one deeply flawed paper. At this point, if we look at more rigorous science since then, I think it would be difficult to argue, for example, that ABA is clearly superior to TEACCH.

10:40 AM  
Blogger Interverbal said...

I think beyond plausibility certain techniques have research based validity. Take for example functional assessment.

I agree that Lovaas (1987) had a large result for a single piece of research. And I also agree, it is not the type of research that we would want as a strong proof of a methodology.

I would not argue that ABA has a superior track record in group research compared to TEACCH. Although the TEACCH research also deserves a critical look.

However, if we are willing to look at individual techniques, then this is where ABA is quite strong in terms of research.

8:25 PM  
Blogger KeithABA said...

Joseph:

Your position and thoughts on ABA vs TEACCH are totally valid. I'm not sure I'd make a blanket statement that ABA is superior to TEACCH either. In fact, the techniques promoted by TEACCH are more similar to than different from what a proficient ABA practitioner would recommend.

Your line of reasoning is very different than what I see on other blogs. Scepticism is good!!!

Other blogs (I think anyone who reads this one knows which I'm referring to) make blanket statements about ABA that are nothing more than mudslinging.

ABA practitioners are mean, slap kids, don't care about feelings, and are hellbent on punishment. I know so because Autism Diva posted about Lovaas and some article in which time out was being used inappropriately.

ABA is evil because the Judge Rotenberg Center shocks children. Hello, they are not doing ABA there. Israel may be a behaviorist, but to assume his beliefs are the same as all behaviorists' beliefs is pretty bad logic.

ABA practitioners con parents into thinking that ABA is the only thing that will save their child. Not any I know!

And Finally, the golden goose that ties back to this post specifically:

ABA has no research basis and is quackery! I know so because Michelle Dawson said there's no good group design study proving its efficacy.

8:16 AM  
Blogger Alyric said...

Sorry about this - haven't had a lot of time


The 'arbitrary' point was a reference to what I see as the host of assumptions around what leads to what. It seems eminently sensible until you consider that Aristotle's 4 element version of the universe is founded on exactly the same thing - careful observation. Observation has some important differences to measurement.

How tabula rasa are you? An interesting question that :) Didn't I say you were more so than I am, and that has to be true since you're the behaviorist and I am not of that persuasion.

"Like other behaviorists, I am a determinist. However, the same is also true of more than a few in cognitive science. If you are seeking (feel free to correct this) to establish a case for free will, you are not necessarily going to do better with the Pinkers of the world."

All too true - Pinker is not what I call endorsement of anything including what makes us human. Like all determinists, he has the tendency to lop off the bits of humanity - like free will and self-agency - that don't fit too well with his pet theory.



"I am sorry this doesn’t make sense to you. Feel free to offer a specific criticism as to why we should expect bias to cause the same answer on the same intervals in a highly consistent manner."

I think some of that consistency is contrived or a fair amount of inconsistency is ignored - I believe we've talked about the foibles of Pragmatism in the past.


"Yes, you are right. But because we use interobserver agreement, I would argue that we fairly address this concern. I am willing to hear an argument against it, but you will have to deal with the interobserver agreement issue, and a statement about “group think” isn’t going to cut it."

No insult intended :)

Once upon a time I was working as a Haematologist. Now Haematologists get to look down microscopes a lot and pontificate on what they see. This person has a viral infection - note the many lymphocytes and some of them have changed in characteristic ways and so on. This is observation and fairly skilled observation at that. It is not measurement however and I noted that it took time for inter-observer agreement to reach the point that folks agreed 90% of the time on what was significant variation in erythrocyte size for instance. It never got better than that and was often a great deal worse, but the significant point was that this, like your ABA, is agreed upon uniformity, which is a species of group think and not independent at all.

Keith wrote:

1. ABA is evil because the Judge Rotenberg Center shocks children. Hello, they are not doing ABA

If they wanted to distance themselves from it they could have it shut down. Where is the ABA criticism of the JRC?

2. ABA has no research basis and is quackery! I know so because Michelle Dawson said there's no good group design study proving its efficacy.

Not correct at all and proves Michelle's point that some folks think they're beyond criticism. When Michelle makes a statement about the methodology of a particular study, the chances of that statement being incorrect approach zero. You might be able to disagree, but you'll need to work at it. She might make a mistake as we all do, but point me to it specifically please. Otherwise this is mudslinging.

You know the more I read the more I appreciate the simple statement that "Autistics Deserve Better" which is Michelle's 'brick wall' quote. They do and if we don't criticise the shoddy that's never going to happen.

11:06 AM  
Blogger KeithABA said...

Alyric,

1. Who could have closed down the Judge Rotenberg Center? Your comment reads like somehow the Behavior Analyst Certification Board, or the community, has control over what he does. He's not even board certified, so how could any action be taken against him by behavior analysts?

The criticism is evident in the recent Mother Jones article "School of Shock." In that article Dr. Brian Iwata made some very sharp criticisms of the use of aversives at the Rotenberg Center.

2. You took this out of context. The point in my post is that this is the line of reason used to bash ABA by many others who don't like ABA in the Blogosphere. Not that Michelle Dawson ever said that, in those exact words.

How you can take that comment and deduce that it proves Michelle's point that ABA is beyond criticism baffles my mind. I never said ABA was above criticism. I clearly did say that the claim that there isn't a great group design does not mean that there isn't good scientific evidence that the procedures are effective.

5:37 PM  
Blogger Interverbal said...

Hi Alyric,

No problem about the time, we all get busy.

“The 'arbitrary' point was a reference to what I see as the host of assumptions around what leads to what. It seems eminently sensible until you consider that Aristotle's 4 element version of the universe is founded on exactly the same thing - careful observation. Observation has some important differences to measurement.”

Well, in terms of the IV and DV, this can be put to the test. Did the repeated manipulation of the IV cause a change in the DV? What does that change look like when compared to the change in the IV?

In response to your citation of Aristotle, I will mention that science and metaphysics are alike in very many ways. They both can utilize rationality; they both can allow us to make hypotheses based on observation. The real difference is the one you bring up, that science has a laboratory, while metaphysics like what the logician Aristotle used, does not. But science in behavior analysis does use measurement even if the measurement is of behavior. I would strongly argue that behaviors with objective definitions can be measured. I think not only most behavior analysts, but most cognitive scientists and neurologists would agree.

“How tabula rasa are you? An interesting question that :) Didn't I say you were more so than I am, and that has to be true since you're the behaviorist and I am not of that persuasion.”

That is a fair question. I do not believe in tabula rasa at all. I think it is an idea propagated by an enlightenment philosopher who was really trying to establish a case against a hereditary nobility and government in his country. I think some of the opponents of philosophical behaviorism have tried very hard to force fit the idea of tabula rasa into behavior analysis, because it is so easily rebutted by the likes of Kant and friends.

I think behaviors, which include thoughts, feelings, and emotions, are determined by a complex interweaving between the environment and biology as described by Susan Oyama. I think we can use contingencies and other environmental factors to increase the likelihood of the (non) occurrence of behavior. However, I don’t think we can assign a probability to it.

“All too true - Pinker is not what I call endorsement of anything including what makes us human. Like all determinists, he has the tendency to lop off the bits of humanity - like free will and self-agency - that don't fit too well with his pet theory.”

To lop off bits of humanity…. But isn’t that assuming what you should be proving? In order for this criticism to hold weight you must first establish that these parts (whatever they are) exist. In the absence of this, you are in danger of committing the circular reasoning fallacy of begging the question.

“I think some of that consistency is contrived or a fair amount of inconsistency is ignored - I believe we've talked about the foibles of Pragmatism in the past.”

Okay, will you now provide some examples?

“No insult intended :)”

None taken ;)

“It never got better than that and was often a great deal worse, but the significant point was that this, like your ABA, is agreed upon uniformity, which is a species of group think and not independent at all.”

We do reach consensus on what we mean very specifically by a behavior before we ever take data. That is true. I think this is where objective definitions come into play. If I say “that child is hurting himself” and we attempt to measure this behavior, then there is a great deal of room for interpretation. But I can make the definition so much more specific than that. In well defined behaviors, there is not a whole lot of room for interpretation. If errors occur, they are more likely to be caused by the observer’s attention lagging for a moment.

Unless you are willing to call all well defined, pre-agreed upon standards groupthink, I don’t see any way forward for you here. And this would pull the rug out from almost any branch of science that I can think of. If you do, I would be happy to listen to it.

The next points you addressed are in response to Keith, but I feel the need to answer them as well.

“If they wanted to distance themselves from it they could have it shut down. Where is the ABA criticism of the JRC?”

And how are they to do this? Israel is not board certified, he has no approval to be yanked. His center does not operate by the grace of the Association for Behavior Analysis. His center does present sometimes at ABA conventions, but it seems fair to at least hear them out, even if we really disagree. I have read many disagreeable books and spoken with many disagreeable persons. I have never yet met a person who I wouldn’t even listen to however.



“Not correct at all and proves Michelle's point that some folks think they're beyond criticism.”

It looked to me from reading Keith, that he was actually criticizing the fallacy of the appeal to authority.

I can add to this and mention that some people who are supportive in part or whole of Michelle, sometimes misunderstand the nature of Michelle’s criticisms and what they accomplish.

Moreover, I think that while ABA can be insular, I sometimes wonder if this isn’t a response to the criticism we receive. And I don’t mean the volume or the excellence of the criticism, but the unfortunately common nonsense.

The early ND page “Asperger’s Express” contained a link to a conspiracy theorist style page about how Skinner tortured his daughters and other complete twaddle. I love a good debate, where I can really learn something, but when I hear the words “robotization” or “culture of death” I usually don’t bother. I think Michelle has a few things to say about this as well.

“When Michelle makes a statement about the methodology of a particular study, the chances of that statement being incorrect approach zero. You might be able to disagree, but you'll need to work at it.”

And yet she has made mistakes, like we all do. To her credit, she is often the first to spot them, and even when she is not, she usually accepts the criticism. However, there have been at least a few times Michelle and I have been irreconcilable on a specific issue of science. She believes her statement is correct, and I contest it, and vice versa.

Michelle is a brilliant person, an autodidact, and a true scientist in her own right. But she is usually, scrupulously careful about making the issue about the science or about the ethics and not about Michelle, or me, or you, or anyone. Not everyone who supports Michelle is as careful in this regard as she is and I wish they would be.

11:14 PM  
Blogger KeithABA said...

A quick attempt to see if this makes more sense about how inter-observer agreement is calculated.

Target Behavior: Hand Mouthing - Hand mouthing is defined as placing any part of the hand or fingers past the plane of the lips.

2 independent observers observe for 1 minute (typically it would be longer, like 15 minutes, but I don't want to make that many columns!)

They X an interval if the behavior defined above occurs. If the behavior does not occur, they do not X the interval.

Observer 1 Observer 2
0:05 X X
0:10
0:15 X
0:20 X X
0:25 X X
0:30
0:35
0:40 X
0:45 X X
0:50 X X
0:55
1:00 X X

Total number of Agreements + Disagreements = 10

Total intervals = 12

10/12 = 83%

The observers agreed on 83% of the intervals as to whether the behavior occurred.

Could two observers possibly have expectations for a behavior to occur less? Sure. But how would this bias lead them to pick the same intervals to "groupthink" that the behavior did not occur when it did?

Even if they were "unconsciously" marking that the behavior occurred less in the treatment condition versus a baseline condition, they would still have to agree upon which intervals to falsely mark. Hmmm, methinks they have some psychic connection to rate the treatment intervals lower.

3:57 PM  
Blogger KeithABA said...

Oops, made a little error because I was typing fast.

Inter-observer agreement = total number of agreements / (total number of agreements + disagreements)

Numbers are right, description was wrong.
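
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the interval-by-interval calculation, using the one-minute record from the table above. The function name and the list layout are made up for illustration. One assumption: the table does not show which observer marked the lone X at 0:15 and at 0:40, so the sketch assigns both to Observer 1. The agreement figure comes out the same whichever observer you pick.

```python
# Interval records transcribed from the table above (True = interval marked X).
# ASSUMPTION: the lone X at 0:15 and at 0:40 is credited to Observer 1.
# The table does not say which observer marked it; the result does not depend on it.
INTERVALS = ["0:05", "0:10", "0:15", "0:20", "0:25", "0:30",
             "0:35", "0:40", "0:45", "0:50", "0:55", "1:00"]
observer_1 = [True, False, True, True, True, False,
              False, True, True, True, False, True]
observer_2 = [True, False, False, True, True, False,
              False, False, True, True, False, True]


def interval_agreement(a, b):
    """Return (agreements, disagreements, agreements / (agreements + disagreements))."""
    if len(a) != len(b) or not a:
        raise ValueError("both records must cover the same, non-empty set of intervals")
    # An agreement is any interval where both observers marked it, or both left it blank.
    agreements = sum(x == y for x, y in zip(a, b))
    disagreements = len(a) - agreements
    return agreements, disagreements, agreements / (agreements + disagreements)


agreements, disagreements, ioa = interval_agreement(observer_1, observer_2)
print(f"{agreements} agreements, {disagreements} disagreements, IOA = {ioa:.0%}")
```

Note that intervals both observers left blank count as agreements here, which is how the 10 in the example is reached.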

3:59 PM  
Anonymous Anonymous said...

I am a fairly new student of ABA, so I hope it is ok if I post on this site and I apologize if my questions seem naive. Regarding interobserver agreement. I understand the idea of using interobserver agreement to establish reliability and maybe validity, but to use it correctly it seems to me must take quite a bit of time. My questions are these:
In real life treatment is interobserver agreement used regularly, and secondly is there research that shows that the use of interobserver agreement actually makes a difference in treatment success?

10:28 PM  
Blogger Interverbal said...

Hi Anonymous

“I hope it is ok if I post on this site”

Relevant questions are always welcome.

“but to use it correctly it seems to me must take quite a bit of time”

Yes, time and hassle.

“In real life treatment is interobserver agreement used regularly”

In my experience IOA data are almost never taken in real life. In fact this sometimes leads to problems. Let’s say I have a 5 year old student named Becky. She masters expressive alphabet naming with me. However, her Kindergarten teacher swears she can only name a handful of letters. This type of concern is more common than you might think. It can be a hassle to straighten out.

“and secondly is there research that shows that the use of interobserver agreement actually makes a difference in treatment success”

You are beginning to consider what leads to research success. Continue that behavior; it will serve you well in the future. However, your current question is disconnected from the purpose of IOA. IOA is not designed to play a role in treatment success. Remember, IOA is simply a check of measurement reliability. It is used to ensure that the behavior defined/targeted by researchers is the thing being measured and not something else. There is a lot more that could be said here, but I think this is enough for now.

Let me know if I can help you with additional questions.

4:20 PM  
