Page view -to- Article length ratios
-
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
Page view -to- Article length ratios
Are there any Wikipedia tech gurus who would be able to run an analysis of all Wikipedia articles, taking the number of average daily page views per article (throwing out the weird outliers that get something like a million views one day, then twelve views the next day), and dividing that by the length of the article in words or in bytes?
I would be interested to know which are the articles on Wikipedia that some editors have greatly expanded to enormous lengths, but nobody cares to read them. And likewise, which articles get tons of traffic, but editors have barely fleshed them out.
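A minimal sketch of the ratio being asked for, assuming the median is an acceptable way to throw out one-day spikes (the post does not say how outliers should be handled). The function names and the sample numbers are invented for illustration; real page-view counts and article lengths would have to come from an actual data source.

```python
from statistics import median


def typical_daily_views(daily_views):
    """Median of the daily counts; a single one-day spike barely moves it."""
    return median(daily_views)


def views_per_byte(daily_views, length_bytes):
    """Typical daily views divided by article length in bytes."""
    if length_bytes <= 0:
        raise ValueError("article length must be positive")
    return typical_daily_views(daily_views) / length_bytes


# Invented article: about a dozen views a day, plus one million-view spike.
spiky = [12, 11, 1_000_000, 13, 12, 10, 12]
print(typical_daily_views(spiky))
print(views_per_byte(spiky, 20_000))
```

A plain mean of the same week would be dominated by the spike, which is why the sketch uses the median instead.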
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
-
- Been Around Forever
- Posts: 12277
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Page view -to- Article length ratios
VIEWS (#) / LENGTH (bytes)
How is "views per byte" meaningful?
Doesn't every popular article have a high "views per byte" number?
Doesn't every unpopular article have a low "views per byte" number?
An article with a million views in a year that is 100K long would be 10 views per byte.
An article with ten views in a year that is 100K long would be 0.0001 views per byte.
That's just another way of saying that an article with 1 million views gets 100,000 times more views than one with 10 views.
So what? What hypothesis are you trying to test?
RfB
-
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
Re: Page view -to- Article length ratios
Appropriate example for the shoe salesman...
Randy from Boise wrote: VIEWS (#) / LENGTH (bytes)
How is "views per byte" meaningful?
It would suggest where readers are being "over-served" and "underserved" by content versus the demand for said content.
Doesn't every popular article have a high "views per byte" number?
Doesn't every unpopular article have a low "views per byte" number?
Have you ever seen a scatter plot? Note, it may be necessary to multiply or divide one of the measures by a constant, so that the ratios become more meaningful.
Wouldn't you be curious to learn more about the 54-inch person who wears a size 7 shoe? Better yet, the 51-inch person in the size 12 shoe? How about the 80-inch person who squeezes into a size 10 -- maybe they lost their toes to frostbite?
My dear Tim, I am not interested in the fat middle of the distribution. I am interested in learning more about the outliers.
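One way to pick the outliers out of the "fat middle" could be sketched as follows. This is only an illustration under assumptions that are not in the thread: it compares each article's log10(views / bytes) with the median of all articles and flags anything more than one order of magnitude away. The titles, numbers, threshold, and function names are all made up.

```python
import math
from statistics import median


def log_ratio(daily_views, length_bytes):
    """log10 of daily views per byte, so ratios compare on a multiplicative scale."""
    return math.log10(daily_views / length_bytes)


def flag_outliers(articles, threshold=1.0):
    """Return (title, deviation) pairs for articles whose log ratio is more than
    `threshold` away from the median, sorted from most over-written to most under-written."""
    ratios = {title: log_ratio(v, b) for title, (v, b) in articles.items()}
    centre = median(ratios.values())
    flagged = [(t, r - centre) for t, r in ratios.items() if abs(r - centre) > threshold]
    return sorted(flagged, key=lambda pair: pair[1])


# Invented (daily views, bytes) pairs, not real Wikipedia data.
sample = {
    "Typical A": (500, 25_000),
    "Typical B": (800, 40_000),
    "Typical C": (300, 15_000),
    "Long and unread": (5, 120_000),
    "Short and busy": (9_000, 3_000),
}

for title, deviation in flag_outliers(sample):
    print(f"{title}: {deviation:+.2f} orders of magnitude from the median")
```

Negative deviations are the long articles nobody reads; positive ones are the short articles with heavy traffic.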
Here's an example on Wikipedia:
The Empty Child (T-H-L) is about a Doctor Who episode. It is 22,046 bytes long. In a day, it receives about 240 page views. So, about 92 bytes of content per daily visitor.
IP address (T-H-L) is about the numerical label assigned to Internet devices. It is 28,921 bytes long, somewhat longer than The Empty Child. But each day, this article receives about 5,100 page views -- twenty-one times more than the article about a TV show episode. That works out to fewer than 6 bytes of content per daily visitor.
While these are just examples that I pulled manually, it tends to inform us that (perhaps) the content and quality of the IP address article is something that Wikipedians should be encouraged to tend to, since so many more Internet users are craving an explanation of this phenomenon. However, Wikipedians don't want to be bothered with such arguments, because they are too busy building out articles about individual episodes of Doctor Who.
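The arithmetic behind the two examples can be checked with a short script. The byte counts and daily view counts are the ones quoted in the post; the dictionary layout and the function name are just for illustration.

```python
# Figures as quoted in the post above.
articles = {
    "The Empty Child": {"bytes": 22_046, "daily_views": 240},
    "IP address": {"bytes": 28_921, "daily_views": 5_100},
}


def bytes_per_daily_view(article):
    """Bytes of article content per daily page view."""
    return article["bytes"] / article["daily_views"]


for title, data in articles.items():
    print(f"{title}: {bytes_per_daily_view(data):.1f} bytes per daily view")

# How many times more traffic the IP address article gets.
view_ratio = articles["IP address"]["daily_views"] / articles["The Empty Child"]["daily_views"]
print(f"View ratio: {view_ratio:.2f}")
```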
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
-
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
Re: Page view -to- Article length ratios
Another interesting comparison is to look at the size of the articles about the Cleveland Browns (T-H-L) and the Detroit Lions (T-H-L). I mean, really. They are both NFL teams that have been around for more than seven decades, both comparably hapless in performance on the field. The Browns article does get almost (but not quite) twice the viewership of the Lions article. Yet, Cleveland's article is nearly 8 times longer than Detroit's article. Why?
Your response may be, "Who cares? They both suck!" But, then, aren't we inspired by the Free Culture Movement to strive for knowledge, for its own sake?
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
-
- Genius
- Posts: 25599
- Joined: Wed Jan 02, 2013 8:15 pm
- Nom de plume: Poetlister
- Location: London, living in a similar way
Re: Page view -to- Article length ratios
If you want to look up something in Wikipedia, you probably won't know in advance how long it is. At one time, the longest article on the site was a list of hundreds of people who could claim the throne of England if everyone above them died. That article is now vastly shorter. Is it getting more or fewer views than before?
"The higher we soar the smaller we appear to those who cannot fly" - Nietzsche
-
- Habitué
- Posts: 4446
- Joined: Thu Mar 15, 2012 6:18 pm
- Wikipedia User: Nastytroll
- Wikipedia Review Member: Lilburne
Re: Page view -to- Article length ratios
Poetlister wrote: If you want to look up something in Wikipedia, you probably won't know in advance how long it is. At one time, the longest article on the site was a list of hundreds of people who could claim the throne of England if everyone above them died. That article is now vastly shorter. Is it getting more or fewer views than before?
This.
People do a Google search and get a link to wiki crap, they click the link, they locate what they were searching for and move on. What you need to look at is the time that people spend on each page, and whether they spend more time on longer articles. I suspect that there is very little variation in time between a 100K and a 10K page; that is, page size does not correlate with the length of time people spend on it.
They have been inserting little memes in everybody's mind
So Google's shills can shriek there whenever they're inclined
-
- Been Around Forever
- Posts: 12277
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Page view -to- Article length ratios
Obfuscate much?
thekohser wrote: Appropriate example for the shoe salesman...
Randy from Boise wrote: VIEWS (#) / LENGTH (bytes)
How is "views per byte" meaningful?
It would suggest where readers are being "over-served" and "underserved" by content versus the demand for said content.
Doesn't every popular article have a high "views per byte" number?
Doesn't every unpopular article have a low "views per byte" number?
Have you ever seen a scatter plot? Note, it may be necessary to multiply or divide one of the measures by a constant, so that the ratios become more meaningful.
Wouldn't you be curious to learn more about the 54-inch person who wears a size 7 shoe? Better yet, the 51-inch person in the size 12 shoe? How about the 80-inch person who squeezes into a size 10 -- maybe they lost their toes to frostbite?
My dear Tim, I am not interested in the fat middle of the distribution. I am interested in learning more about the outliers.
Here's an example on Wikipedia:
The Empty Child (T-H-L) is about a Doctor Who episode. It is 22,046 bytes long. In a day, it receives about 240 page views. So, about 92 bytes of content per daily visitor.
IP address (T-H-L) is about the numerical label assigned to Internet devices. It is 28,921 bytes long, somewhat longer than The Empty Child. But each day, this article receives about 5,100 page views -- twenty-one times more than the article about a TV show episode.
While these are just examples that I pulled manually, it tends to inform us that (perhaps) the content and quality of the IP address article is something that Wikipedians should be encouraged to tend to, since so many more Internet users are craving an explanation of this phenomenon. However, Wikipedians don't want to be bothered with such arguments, because they are too busy building out articles about individual episodes of Doctor Who.
RfB
-
- Trustee
- Posts: 14122
- Joined: Wed Mar 14, 2012 11:54 pm
- Wikipedia User: Stanistani
- Wikipedia Review Member: Zoloft
- Actual Name: William Burns
- Nom de plume: William Burns
- Location: San Diego
Re: Page view -to- Article length ratios
My avatar is sometimes indicative of my mood:
- Actual mug ◄
- Uncle Cornpone
- Zoloft bouncy pill-thing
-
- Habitué
- Posts: 2620
- Joined: Fri Jan 31, 2014 5:05 pm
- Wikipedia User: Johnny Au
- Actual Name: Johnny Au
- Location: Toronto, Ontario, Canada
Re: Page view -to- Article length ratios
Even if the Cebuano Wikipedia were twice the size of the English Wikipedia, practically nobody would read it.
-
- Genius
- Posts: 25599
- Joined: Wed Jan 02, 2013 8:15 pm
- Nom de plume: Poetlister
- Location: London, living in a similar way
Re: Page view -to- Article length ratios
Johnny Au wrote: Even if the Cebuano Wikipedia were twice the size of the English Wikipedia, practically nobody would read it.
Right, so it would get very few page views. But we're only discussing the English site here.
"The higher we soar the smaller we appear to those who cannot fly" - Nietzsche
-
- Been Around Forever
- Posts: 12277
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Page view -to- Article length ratios
Let me save people the time of constructing a methodology and gathering and summarizing data.
Conclusion: There are many long articles on Wikipedia that get few readers. There are a few very popular articles that are not long.
Take, for example, Socialist Party of Washington (T-H-L), weighing in at a hefty 116.8K and attracting a paltry 2,404 hits during all of 2016.
Indeed, literally every single popular article dwarfs the abysmal 0.02 Kohs Score™® for that piece.
We really have to stop paying content people by the word, it is not cost-effective for Wikipedia.
RfB
Last edited by Randy from Boise on Fri Jan 20, 2017 5:03 pm, edited 1 time in total.
-
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
Re: Page view -to- Article length ratios
lilburne wrote: People do a Google search and get a link to wiki crap, they click the link, they locate what they were searching for and move on. What you need to look at is the time that people spend on each page, and whether they spend more time on longer articles. I suspect that there is very little variation in time between a 100K and a 10K page; that is, page size does not correlate with the length of time people spend on it.
Lilburne makes an excellent point.
I am still interested in some Wiki-coding nerd who has the ability and access to get me what I'm looking for. In the interest of social science.
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
-
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
Re: Page view -to- Article length ratios
Randy from Boise wrote: Take, for example, Socialist Party of Washington (T-H-L), weighing in at a hefty 116.8K and attracting a paltry 2,404 hits during all of 2016.
Looks like someone should have written a copyrighted monograph and sold it to an academic publisher, rather than plunking it down on Wikipedia for free.
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
-
- Been Around Forever
- Posts: 12277
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Page view -to- Article length ratios
Randy from Boise wrote: Take, for example, Socialist Party of Washington (T-H-L), weighing in at a hefty 116.8K and attracting a paltry 2,404 hits during all of 2016.
thekohser wrote: Looks like someone should have written a copyrighted monograph and sold it to an academic publisher, rather than plunking it down on Wikipedia for free.
You think academic publishers write checks, do you? Ha!!!
Let me put it another way: I spent over 8 months going full out as co-editor on a book project that was put out by an illustrious academic publisher. I believe — I need to check, but I believe — that a total of 38 copies were sold in the first year. These are locked in a few libraries and if 10 copies have even been opened by readers, I would be surprised. For my work, I received a royalty of three books, a notch in my gunbelt, and prospects of a paperback edition this year that might sell in the low hundreds.
So now you tell me what the most effective information-transmission mechanism is: academic publishers or Wikipedia?
Compare and contrast: Lovestoneites (T-H-L) — 2,831 hits in 2016 / 62.75K = 0.045 Kohs Score™®
With this:
http://www.brill.com/products/book/amer ... es-1929-40
One is easily accessible by anyone anywhere with an interest or a question, the other is locked up by a greedhead publisher who will presumably turn over the manuscript to a Trotskyist publisher in Chicago shortly...
tim
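The "Kohs Score" figures quoted in this thread appear to be annual hits divided by article size in bytes. A quick sketch under that assumption, also assuming that 1K means 1,000 bytes (the thread does not say); the function name is invented for illustration.

```python
def kohs_score(annual_views, length_kb):
    """Annual page views per byte, taking 1K as 1,000 bytes (an assumption)."""
    return annual_views / (length_kb * 1_000)


# Figures as quoted in the thread for 2016.
examples = {
    "Socialist Party of Washington": (2_404, 116.8),
    "Lovestoneites": (2_831, 62.75),
}

for title, (views, size_kb) in examples.items():
    print(f"{title}: {kohs_score(views, size_kb):.3f}")
```

Under these assumptions the two results round to about 0.02 and 0.045, in line with the scores given in the posts.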
-
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
Re: Page view -to- Article length ratios
Randy from Boise wrote: One is easily accessible by anyone anywhere with an interest or a question, the other is locked up by a greedhead publisher who will presumably turn over the manuscript to a Trotskyist publisher in Chicago shortly...
And one I can go and mess with later tonight.
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
-
- Been Around Forever
- Posts: 12277
- Joined: Sun Mar 18, 2012 2:32 am
- Wikipedia User: Carrite
- Wikipedia Review Member: Timbo
- Actual Name: Tim Davenport
- Nom de plume: T. Chandler
- Location: Boise, Idaho
Re: Page view -to- Article length ratios
Randy from Boise wrote: One is easily accessible by anyone anywhere with an interest or a question, the other is locked up by a greedhead publisher who will presumably turn over the manuscript to a Trotskyist publisher in Chicago shortly...
thekohser wrote: And one I can go and mess with later tonight.
I actually haven't had much trouble with that across the Wiki. One good thing about limiting oneself to esoteric shit: vandals get no bang out of their vandalism. If an idiot takes a dump in the woods, does anyone care?
RfB
-
- the Merciless
- Posts: 3002
- Joined: Wed Apr 03, 2013 1:35 pm
Re: Page view -to- Article length ratios
thekohser wrote: Another interesting comparison is to look at the size of the articles about the Cleveland Browns (T-H-L) and the Detroit Lions (T-H-L). I mean, really. They are both NFL teams that have been around for more than seven decades, both comparably hapless in performance on the field. The Browns article does get almost (but not quite) twice the viewership of the Lions article. Yet, Cleveland's article is nearly 8 times longer than Detroit's article. Why?
It took Ming about 90 seconds to work out why: both of the articles link off to a "History of" subarticle, but in the case of the Lions someone also deleted all that material from the main article. For the Browns, nobody did the same. It's also possible that the Browns subarticle might be longer because nobody bothered to try to move the Lions to Baltimore.
-
- the Merciless
- Posts: 3002
- Joined: Wed Apr 03, 2013 1:35 pm
Re: Page view -to- Article length ratios
The other thing is that most articles probably don't get read much beyond the lead, no matter how long they are.
-
- Majordomo
- Posts: 13410
- Joined: Thu Mar 15, 2012 5:07 pm
- Wikipedia User: Thekohser
- Wikipedia Review Member: thekohser
- Actual Name: Gregory Kohs
- Location: United States
Re: Page view -to- Article length ratios
Ming wrote: The other thing is that most articles probably don't get read much beyond the lead, no matter how long they are.
Would be nice if the great content management company, The Wikimedia Foundation, had some statistically-reliable research about that, wouldn't it?
"...making nonsensical connections and culminating in feigned surprise, since 2006..."
-
- Gregarious
- Posts: 650
- Joined: Mon Apr 21, 2014 1:29 pm
- Wikipedia Review Member: Text
- Actual Name: Anonyymi
Re: Page view -to- Article length ratios
Randy from Boise wrote: If an idiot takes a dump in the woods, does anyone care?
Groundwater contamination could be a problem.
As fewer and fewer editors participate actively in cleaning up pages from vandalism, more outlier pages will retain pieces of vandalism, and consequently more pages in the middle will start receiving incorrect data. Just yesterday the page about right hand and left hand traffic had some incorrect text above the big template at the top, which persisted for about 9 hours. Broken windows theory - Vandalized page theory!
-
- Habitué
- Posts: 4446
- Joined: Thu Mar 15, 2012 6:18 pm
- Wikipedia User: Nastytroll
- Wikipedia Review Member: Lilburne
Re: Page view -to- Article length ratios
Randy from Boise wrote: One is easily accessible by anyone anywhere with an interest or a question, the other is locked up by a greedhead publisher who will presumably turn over the manuscript to a Trotskyist publisher in Chicago shortly...
thekohser wrote: And one I can go and mess with later tonight.
Personally, if I were interested I'd go for the book. If I don't really give a shit and just have a momentary interest then I'd take Wikipedia; I doubt I'd remember much about it later, and would most likely have this reaction to the Wikipedia article
They have been inserting little memes in everybody's mind
So Google's shills can shriek there whenever they're inclined
-
- Habitué
- Posts: 2620
- Joined: Fri Jan 31, 2014 5:05 pm
- Wikipedia User: Johnny Au
- Actual Name: Johnny Au
- Location: Toronto, Ontario, Canada
Re: Page view -to- Article length ratios
Randy from Boise wrote: If an idiot takes a dump in the woods, does anyone care?
Textnyymi wrote: Groundwater contamination could be a problem.
As fewer and fewer editors participate actively in cleaning up pages from vandalism, more outlier pages will retain pieces of vandalism, and consequently more pages in the middle will start receiving incorrect data. Just yesterday the page about right hand and left hand traffic had some incorrect text above the big template at the top, which persisted for about 9 hours. Broken windows theory - Vandalized page theory!
Cluebot NG isn't omniscient.
In fact, it didn't fix the vandalism in History of immigration to Canada (T-H-L), which has an inaccurate lead.
-
- Genius
- Posts: 25599
- Joined: Wed Jan 02, 2013 8:15 pm
- Nom de plume: Poetlister
- Location: London, living in a similar way
Re: Page view -to- Article length ratios
Johnny Au wrote: Cluebot NG isn't omniscient.
In fact, it didn't fix the vandalism in History of immigration to Canada (T-H-L), which has an inaccurate lead.
Fortunately, as I pointed out in another thread, we have Wikipediocracybot, which fixes any errors that people here mention.
"The higher we soar the smaller we appear to those who cannot fly" - Nietzsche