Article quality pre-and post-Sanger
- Randy from Boise
- Been Around Forever
- Posts: 12236
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Article quality pre-and post-Sanger
I'm not sure where main page comments are supposed to be made, but since it's a Kohs piece there and a Kohs thread here, this may suffice.
Per: >>Without Sanger, Wikipedia became less reliable over time, although continuing to grow in popularity.
I defy you, Greg, to demonstrate how any randomly selected 10 articles from Sanger-Wikipedia 2002 are "more reliable" than the 10 same pieces from Wikipedia 2012. This is an absolutely farcically incorrect statement and it's hanging there on the front page, like a BA out the back window of a 1956 Chevy.
RfB
(mod note, added link to blog post by Greg.)
- Moonage Daydream
- Habitué
- Posts: 1866
- Joined: Tue Mar 20, 2012 12:41 pm
Re: Who's got Jimbo's back?
Randy from Boise wrote: I'm not sure where main page comments are supposed to be made, but since it's a Kohs piece there and a Kohs thread here, this may suffice.
See where it says "leave comment" or "x comments" at the bottom right of each item? That's where you leave your comments.
- thekohser
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
- Contact:
Re: Who's got Jimbo's back?
Randy from Boise wrote: I defy you, Greg, to demonstrate how any randomly selected 10 articles from Sanger-Wikipedia 2002 are "more reliable" than the 10 same pieces from Wikipedia 2012. This is an absolutely farcically incorrect statement and it's hanging there on the front page, like a BA out the back window of a 1956 Chevy.
How much do you want to bet? I'm not going to put in the work, if you're not going to put your money where your mouth is.
Mods, could you split off Randy's idiotic placement of his challenge in this completely unrelated thread? Maybe just put it in "Off topic", or start a thread in the Wikipedia forum, called "Accuracy challenge".
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
- EricBarbour
- Posts: 10891
- Joined: Wed Mar 14, 2012 11:32 pm
- Location: hell
Re: Article quality pre-and post-Sanger
Done, and gentlemen, if you're going to make statements like that, please feel free to offer some evidence to back your position up. Thankyew.
- thekohser
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
- Contact:
Re: Article quality pre-and post-Sanger
EricBarbour wrote: Done, and gentlemen, if you're going to make statements like that, please feel free to offer some evidence to back your position up. Thankyew.
Fortunately, there was a University of Minnesota study about "damaged views" of Wikipedia, to which I handily linked in the blog post. Maybe Randy disagrees with the Golden Gophers.
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
- Randy from Boise
- Been Around Forever
- Posts: 12236
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Who's got Jimbo's back?
Randy from Boise wrote: I defy you, Greg, to demonstrate how any randomly selected 10 articles from Sanger-Wikipedia 2002 are "more reliable" than the 10 same pieces from Wikipedia 2012. This is an absolutely farcically incorrect statement and it's hanging there on the front page, like a BA out the back window of a 1956 Chevy.
thekohser wrote: How much do you want to bet? I'm not going to put in the work, if you're not going to put your money where your mouth is.
Whassamatter, is business so bad that you're hustling bets these days?
One hundred dollars. If I win, you make a check to the WMF. If you win, I pay the charity of your choice. Articles selected by hitting the SELECT RANDOM ARTICLE button until 10 pieces are selected that date back to 2002. Selection to be performed 5 by you and 5 by me on the honor system. Your entry is the last entry on record for 2002. My entry is the current version at the time of selection. Judgment of "reliability" to be performed by an independent person not affiliated with either Wikipediocracy or Wikipedia. A 5 to 5 score is no action.
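For anyone who wants the settlement rule spelled out, here is a minimal sketch in Python. It assumes the judge returns one verdict per article naming the more reliable version, and that a simple majority of the ten decides; the post only states that 5 to 5 is no action, so the majority rule, the function name, and the verdict labels are all assumptions made for illustration.

```python
from typing import Sequence


def settle_bet(verdicts: Sequence[str]) -> str:
    """Settle the ten-article wager from the judge's per-article verdicts.

    Each verdict is "2002" or "2012", naming the version the judge found
    more reliable. These labels are hypothetical, not from the thread.
    """
    if len(verdicts) != 10:
        raise ValueError("the wager is defined over exactly 10 articles")
    if any(v not in ("2002", "2012") for v in verdicts):
        raise ValueError("each verdict must be '2002' or '2012'")

    wins_2012 = sum(1 for v in verdicts if v == "2012")
    wins_2002 = len(verdicts) - wins_2012

    if wins_2012 == wins_2002:
        return "no action"  # a 5 to 5 score voids the bet
    if wins_2012 > wins_2002:
        return "2012 side wins"  # the check goes to the WMF
    return "2002 side wins"  # the check goes to the other side's charity


if __name__ == "__main__":
    print(settle_bet(["2012"] * 6 + ["2002"] * 4))
```

A split decision returns "no action" and nobody pays.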
RfB
- Kevin
- Critic
- Posts: 157
- Joined: Sun Mar 18, 2012 1:56 am
- Wikipedia User: Kevin
- Wikipedia Review Member: Kevin
- Actual Name: Kevin Godfrey
- Location: Adelaide, Australia
- Contact:
Re: Who's got Jimbo's back?
Randy from Boise wrote: I defy you, Greg, to demonstrate how any randomly selected 10 articles from Sanger-Wikipedia 2002 are "more reliable" than the 10 same pieces from Wikipedia 2012. This is an absolutely farcically incorrect statement and it's hanging there on the front page, like a BA out the back window of a 1956 Chevy.
thekohser wrote: How much do you want to bet? I'm not going to put in the work, if you're not going to put your money where your mouth is.
Randy from Boise wrote: Whassamatter, is business so bad that you're hustling bets these days? One hundred dollars. If I win, you make a check to the WMF. If you win, I pay the charity of your choice. Articles selected by hitting the SELECT RANDOM ARTICLE button until 10 pieces are selected that date back to 2002. Selection to be performed 5 by you and 5 by me on the honor system. Your entry is the last entry on record for 2002. My entry is the current version at the time of selection. Judgment of "reliability" to be performed by an independent person not affiliated with either Wikipediocracy or Wikipedia. A 5 to 5 score is no action.
Good luck finding 10 articles. The earliest I found with 100 clicks or so was 'Toilets in Japan' from 2004.
- Peter Damian
- Habitué
- Posts: 4206
- Joined: Thu Mar 15, 2012 8:14 pm
- Wikipedia User: Peter Damian
- Wikipedia Review Member: Peter Damian
- Location: London
- Contact:
Re: Article quality pre-and post-Sanger
This would be a stupid study, if we are picking random articles. The majority of articles in Wikipedia are trivia articles, and I believe in some sense they have 'improved' since the early days of Wikipedia. At least, they are much longer, and certainly more numerous than ten years ago, and since more is always better in the case of trivia, Wikipedia has improved, certainly.
However when it comes to 'serious subjects worthy of inclusion in a comprehensive and reliable reference work' it's a bit different. I have just completed a study of the Philosophy article from its inception as lecture notes for Larry's teaching classes, to the version of February 2012. The article is much longer now but unquestionably worse. There is much more of it, but by the same token there are many more errors, so it hasn't improved. The same could be said of many of the articles which were plagiarised wholesale from Britannica 1911. They weren't that good to start with, relative to more recent scholarship, and they have tended to deteriorate because of random silly additions over the years, by "Randies". I discussed this a while ago http://ocham.blogspot.co.uk/2010/06/wil ... ckham.html in respect of the article about Ockham.
I've given up blogging about these errors, because as soon as I do, some Wikipedian will correct them (as with the blog post I mentioned) and claim that this proves Wikipedia works. Well no it doesn't. There are thousands of these articles and the fact that one article has been slightly corrected doesn't mean that the others have been fixed. I can write about a homeless family, and perhaps a wealthy person will read it and help the family get a home, and then claim that the system of blogging about homeless families helps the problem of homelessness. No, the problem is more fundamental than that. You need to tackle the root causes of homelessness, unemployment, and by the same token you need to tackle the root causes of error in Wikipedia.
"The rule of many is not good; let there be one ruler."
- EricBarbour
- Posts: 10891
- Joined: Wed Mar 14, 2012 11:32 pm
- Location: hell
Re: Article quality pre-and post-Sanger
PD hits the nail again: there is no point in talking about "article quality" or "accuracy" unless you're just talking about trivia collections. Wikipedia, as it sits, does a great job with sports trivia.
But if you're talking about science, or philosophy, or literature, or other serious academic subjects, Wikipedia can either be good--or worse than nothing. Because they actively discourage actual experts from writing about such subjects in any manner other than their hopelessly stilted, brain damaged way, and thus eliminate all the people who could make it into an encyclopedia. Just because a few math freaks like Mattbuck or GTBacchus have written some useful math articles doesn't mean Wikipedia covers the entire field well---it's all over the place, one specialty is well done, the next one is crap. Randomizing your expertise and forcing out real experts isn't going to be effective.
(And of course, there's no point in bringing this up, because very few Wikipedians ever gave a rat's ass about "boring" academic subjects. As long as Doctor Bloody Who and Japanese cartoons and their fave video-game franchise are all well-documented, they get what they want, and the hell with the rest.)
- thekohser
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
- Contact:
Re: Who's got Jimbo's back?
Randy from Boise wrote: Whassamatter, is business so bad that you're hustling bets these days?
I'm not exactly certain, but in terms of personal income, my annual rate puts me at least in the top 3% to 6% of income earners in the United States. I have so many prospective customers for the Wikipedia-editing enterprise, I'm having to turn away some. Perhaps you're not aware, the wiki business constitutes less than 2% of my annual income. As I said, it's not about the money for me, it's about my not wasting my valuable time for you.
Randy from Boise wrote: Judgment of "reliability" to be performed by an independent person not affiliated with either Wikipediocracy or Wikipedia. A 5 to 5 score is no action.
Judgment of "reliability" should be according to what "reliable" means. That is, "conforming to fact and therefore worthy of belief". If there is one factual error in an article, the article is not reliable. What do you not understand about the University of Minnesota study? Is this graphic confusing to you, or something?
Your terms still sound like a lot of work for me, for no purpose. (Who would be our judge? Is that for me to decide, or for you to decide?) Also, a donation from me to the WMF would be reprehensible, considering their waste, deceit, and child-porn and pedophilia agenda. I wouldn't want my name associated with that. My asking you to donate to the Bournelyf Special Camp, how would that be grating on you personally?
The University of Minnesota has already done our work, Randy. In January 2006, the chance of a Wikipedia article returning a damaged view was at least 20 times greater than it was in January 2003. Sure, most of the reason for that is that as articles get longer, there's more opportunity for a mistake. But, that's like comparing surgical procedures that take 30 minutes to complete, versus those that take four hours to complete, and saying that if the four-hour surgeries have a 5x greater operating-room mortality rate, that's okay, because the surgeries take longer to complete.
An encyclopedia article with a mistake in it is not "reliable", by definition.
I think we're done here, unless you want to pick the articles and pick our judge and allow my payment to be made to you, rather than the WMF.
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
- Vigilant
- Sonny, I've got a whole theme park full of red delights for you.
- Posts: 31776
- Joined: Thu Mar 29, 2012 8:16 pm
- Wikipedia User: Vigilant
- Wikipedia Review Member: Vigilant
Re: Article quality pre-and post-Sanger
I think a better metric might be 'factual errors per KB'.
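A rough sketch of what that metric would look like, in Python. The byte sizes and error counts below are made-up placeholders, not measurements of any real article, and the `Revision` class and `errors_per_kb` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    label: str
    size_bytes: int
    factual_errors: int


def errors_per_kb(rev: Revision) -> float:
    """Factual errors per kilobyte (1 KB = 1024 bytes) of article text."""
    if rev.size_bytes <= 0:
        raise ValueError("revision must have a positive size")
    return rev.factual_errors / (rev.size_bytes / 1024)


# Placeholder numbers only: a short early revision with one error and a
# revision twenty times longer with five errors.
old = Revision(label="2002", size_bytes=2048, factual_errors=1)
new = Revision(label="2012", size_bytes=40960, factual_errors=5)

if __name__ == "__main__":
    for rev in (old, new):
        print(rev.label, rev.factual_errors, errors_per_kb(rev))
```

With these placeholder inputs the raw error count goes up while the per-KB figure goes down, which is why the choice between "errors per article" and "errors per KB" can decide the argument on its own.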
Hello, John. John, hello. You're the one soul I would come up here to collect myself.
- Randy from Boise
- Been Around Forever
- Posts: 12236
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Article quality pre-and post-Sanger
Mr. Kohs didn't say a word about "factual errors per kb," he said:
"Without Sanger, Wikipedia became less reliable over time, although continuing to grow in popularity."
And I say that's horseshit -- any 10 random articles from the extremely small and almost utterly unsourced Sangerpedia2002 are MORE reliable today. The fact that actually FINDING ten (or five!) articles from the glorious Sangerpedia2002 using the random search feature might be problematic would itself be a good exercise for the man pining away for the halcyon days of yesteryear.
The fact of the matter is that Greg's crazed hatred of Jimmy Wales and WMF has clouded his judgment, causing him to make patently nonsensical statements like the above. Anyone who pays the slightest attention to edit histories and actually LOOKS at the state of articles in Sangerpedia2002 knows full well that an examination of 10 at random by any disinterested observer willing to make a blind ruling on reliability would result in a $100 check to WMF and great mirth from me.
Go Broncos!
RfB
- Randy from Boise
- Been Around Forever
- Posts: 12236
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Article quality pre-and post-Sanger
Just in case Greg really does want me to put my money where my mouth is and needs to find a charity that would cause me equal pain to his having to pay out to WMF, I would suggest anything relating to the anti-abortion movement, the anti-union movement, pro-fundamentalist Christian organizations, or the Republican Party would be loathsome to me. I'm sure there's something in this vein he'd be happy to support by making his banner.
As for an independent arbiter of "reliability," I have in mind a journalist or academic who has no dog in the fight — someone like Edwin Black, who has no love for either Wikipedia or Wikipedia Review/Wikipediocracy would work fine. I do know him so he may not be the ideal guy, but I'll bet he could suggest someone.
But I don't have any physical need to gamble, I just didn't want to puss out from a challenge. My basic point stands.
tim
- Vigilant
- Sonny, I've got a whole theme park full of red delights for you.
- Posts: 31776
- Joined: Thu Mar 29, 2012 8:16 pm
- Wikipedia User: Vigilant
- Wikipedia Review Member: Vigilant
Re: Article quality pre-and post-Sanger
Dude.
All I did was propose a better metric.
You OK?
Hello, John. John, hello. You're the one soul I would come up here to collect myself.
- Randy from Boise
- Been Around Forever
- Posts: 12236
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Article quality pre-and post-Sanger
Per this from Greg, above:
"In January 2006, the chance of a Wikipedia article returning a damaged view was at least 20 times greater than it was in January 2003."
If you meant to say a Wikipedia article in Jan. 2006 was more likely to have a damaged view than a Wikipedia article in January 2003, why didn't you say that?
This is utterly irrelevant to the reliability of Wikipedia 2012 vs. the reliability of Sangerpedia2002, which is clearly what you were attempting to intimate, rationalized by misinterpretation of a really old study.
RfB
- Randy from Boise
- Been Around Forever
- Posts: 12236
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Article quality pre-and post-Sanger
Vigilant wrote: Dude. All I did was propose a better metric. You OK?
No worries. Just killing time before I get busy on an article...
tim
- thekohser
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
- Contact:
Re: Article quality pre-and post-Sanger
Vigilant wrote: I think a better metric might be 'factual errors per KB'.
That would be a useful metric, I agree. It would also take much, much more work.
The University of Minnesota published a paper that used a very simple metric -- if someone looked at an article at a given time, did it have "damage" in the article?
Now, how they defined "damage" was squirrelly, in my opinion.
In order that Randy/Tim's challenge not turn into a semester-long research initiative, I had in my mind that we'd look at a Wikipedia article in 2002 and count the number of factual errors, then look at that article again in 2012 and count the factual errors.
I remain very confident that the 2012 version will contain more factual errors on average than the 2002 version. As Randy/Tim kind of foams at the mouth here, I'm sure most of that would be due to expansion in article size. Then we'd clearly be venturing into some debate about "errors per kilobyte", because Randy/Tim would not likely be pleased with the (probable) fact that 2012 Wikipedia contains many more errors than 2002 Wikipedia.
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
- thekohser
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
- Contact:
Re: Article quality pre-and post-Sanger
Randy from Boise wrote: ...I would suggest anything relating to the anti-abortion movement, the anti-union movement, pro-fundamentalist Christian organizations, or the Republican Party would be loathsome to me. I'm sure there's something in this vein he'd be happy to support by making his banner.
I can't imagine myself taking any bit of pleasure contributing money to any of those organizations. So, no thanks.
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
- thekohser
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
- Contact:
Re: Article quality pre-and post-Sanger
Randy from Boise wrote: If you meant to say a Wikipedia article in Jan. 2006 was more likely to have a damaged view than a Wikipedia article in January 2003, why didn't you say that? This is utterly irrelevant to the reliability of Wikipedia 2012 vs. the reliability of Sangerpedia2002, which is clearly what you were attempting to intimate, rationalized by misinterpretation of a really old study.
The age of a study doesn't have any bearing on how a temporal trendline can (or cannot) be derived from a known portion of time-based data. Granted, margin of error would begin to come into play if you tried to extrapolate a trendline up to the present from data collected from 1944 through 1947. However, in this case we are trying to extrapolate a trendline from 2002 through 2012, based on available data from 2003 to 2006. I fail to see how that's "utterly irrelevant", but then again, I've never played football on a blue field. Maybe you're more enlightened than me.
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
- Randy from Boise
- Been Around Forever
- Posts: 12236
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Article quality pre-and post-Sanger
thekohser wrote: The age of a study doesn't have any bearing on how a temporal trendline can (or cannot) be derived from a known portion of time-based data. Granted, margin of error would begin to come into play if you tried to extrapolate a trendline up to the present from data collected from 1944 through 1947. However, in this case we are trying to extrapolate a trendline from 2002 through 2012, based on available data from 2003 to 2006. I fail to see how that's "utterly irrelevant", but then again, I've never played football on a blue field. Maybe you're more enlightened than me.
You're mistakenly presenting a vandalism study as a "factual reliability" study.
You are also assuming a false linearity — bot-aided vandalism reversion in 2012 is far more effective than the hunt-and-peck corrections of 2006.
The study shows changes from Jan. 2006 over Jan. 2003. It does not demonstrate that WP2012 is "less reliable" than Sangerpedia2002.
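To make the linearity point concrete, here is a small Python sketch. The only input taken from the thread is the "at least 20 times" ratio between January 2003 and January 2006; the 2003 level is normalized to 1.0 (an assumption), and the three curve shapes are illustrative assumptions, not anything the study itself proposes.

```python
def plateau(y1: float) -> float:
    """Assume the rate stops changing after the last observation."""
    return y1


def linear(t0: float, y0: float, t1: float, y1: float, t: float) -> float:
    """Assume the same absolute increase per year continues."""
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


def exponential(t0: float, y0: float, t1: float, y1: float, t: float) -> float:
    """Assume the same multiplicative growth per year continues."""
    return y0 * (y1 / y0) ** ((t - t0) / (t1 - t0))


# Relative damaged-view rate: 1.0 in 2003 (normalization, an assumption)
# and 20.0 in 2006 (the ratio quoted in the thread).
T0, Y0, T1, Y1 = 2003.0, 1.0, 2006.0, 20.0

if __name__ == "__main__":
    print("plateau    ", plateau(Y1))
    print("linear     ", linear(T0, Y0, T1, Y1, 2012.0))
    print("exponential", exponential(T0, Y0, T1, Y1, 2012.0))
```

The same two data points give wildly different 2012 figures depending on which shape is assumed, and nothing in the 2003 to 2006 window says which shape, if any, holds afterward.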
RfB
- thekohser
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
- Contact:
Re: Article quality pre-and post-Sanger
Randy from Boise wrote: bot-aided vandalism reversion in 2012 is far more effective than the hunt-and-peck corrections of 2006.
Is it now? What study proved that out?
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
- Vigilant
- Sonny, I've got a whole theme park full of red delights for you.
- Posts: 31776
- Joined: Thu Mar 29, 2012 8:16 pm
- Wikipedia User: Vigilant
- Wikipedia Review Member: Vigilant
Re: Article quality pre-and post-Sanger
Vigilant wrote: I think a better metric might be 'factual errors per KB'.
thekohser wrote: That would be a useful metric, I agree. It would also take much, much more work.
The University of Minnesota published a paper that used a very simple metric -- if someone looked at an article at a given time, did it have "damage" in the article?
Now, how they defined "damage" was squirrelly, in my opinion.
In order that Randy/Tim's challenge not turn into a semester-long research initiative, I had in my mind that we'd look at a Wikipedia article in 2002 and count the number of factual errors, then look at that article again in 2012 and count the factual errors.
I remain very confident that the 2012 version will contain more factual errors on average than the 2002 version. As Randy/Tim kind of foams at the mouth here, I'm sure most of that would be due to expansion in article size. Then we'd clearly be venturing into some debate about "errors per kilobyte", because Randy/Tim would not likely be pleased with the (probable) fact that 2012 Wikipedia contains many more errors than 2002 Wikipedia.
What you're talking around is the effective bit error rate for a set of Wikipedia articles over a 5-year time domain.
While this is more work than "at some pseudo-random intervals, an error was present in an article," it is also a much more useful metric.
I would argue against choosing random articles for this experiment.
Choose high traffic articles that are not natural vandalism (hehe, nigger, hehe) magnets, yet represent those articles where the overall wikipedia editor demographic is present in approximately proportional representation.
Further, a mix of articles would be best.
Math/science, politics, religion, philosophy, economics. Nothing too intense in these domains or you only end up with subject experts and cranks.
Steer clear of pop culture articles. There's nothing but cranks there.
Same for P/I, troubles, animal welfare, etc, etc
I'd be willing to bet that 100 articles tracked over 5 years would give you a reasonable confidence interval.
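As a rough check on what 100 articles buys you, here is a Python sketch of a Wilson score interval. It assumes the quantity being estimated is the share of sampled articles containing at least one factual error, which is one possible reading of the proposal and not something stated in the post; the function name and the 50-out-of-100 input are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(hits: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if n <= 0 or not 0 <= hits <= n:
        raise ValueError("need n > 0 and 0 <= hits <= n")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical outcome: 50 of the 100 sampled articles contain an error.
    low, high = wilson_interval(50, 100)
    print(round(low, 3), round(high, 3))
```

In the worst case for width (half the sample flagged), 100 articles pin the share down to roughly plus or minus 10 percentage points. Whether that counts as "reasonable" depends on how big a difference between years you are trying to detect.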
Hello, John. John, hello. You're the one soul I would come up here to collect myself.
- Vigilant
- Sonny, I've got a whole theme park full of red delights for you.
- Posts: 31776
- Joined: Thu Mar 29, 2012 8:16 pm
- Wikipedia User: Vigilant
- Wikipedia Review Member: Vigilant
Re: Article quality pre-and post-Sanger
Randy from Boise wrote: You are also assuming a false linearity -- bot-aided vandalism reversion in 2012 is far more effective than the hunt-and-peck corrections of 2006.
That's an interesting point.
What's even more interesting is that the "bot-aided vandalism reversion" is very much a Maxwell's Demon.
Edits that increase the size of wikipedia's data store are, generally, assumed to be "good" edits while those that revert and/or remove byte counts are assumed to be bad edits.
Recently, an insane person, named Jeff Merkey, went on a tear through the orchid articles, adding megabytes of text to around 100 articles. His edits are almost certainly plagiarism and copyright violations. When I attempted to revert them, I was intercepted by one of these bots and eventually blocked. The edits, from over 100 IP addresses, are still in the orchid articles. There has been deafening silence from the administrators and WP:Plant project members.
Given this anecdotal evidence and, assuming much, that it is indicative of a larger tendency to keep rather than delete, I will make a prediction that the raw error rate in wikipedia articles tends to increase as we move forward in the time domain, assuming a static bit error rate. Bad edits will be 'sticky'. This increases byte count, generally, over time.
What happens to bit error rates, I am still undecided.
Hello, John. John, hello. You're the one soul I would come up here to collect myself.