[Wikipedia-l] Re: 90000 articles

tarquin tarquin at planetunreal.com
Mon Nov 11 15:33:50 UTC 2002


Magnus Manske wrote:

> Larry Sanger wrote:
>
>> I think we should implement the greatly-pared-down counting scheme we
>> were discussing on the list earlier.  We don't have 90K articles.  We
>> have about 50K articles and about 40K geographical entries
>> automatically created from a database.
>
> How about "90K entries, 50K of them articles" on the Main Page? (well, 
> the last part needs rewording...)

Good idea.

>
>
> I thought of something else: Automated "evaluation" of articles. I'm 
> uncertain if that has been mentioned before, though.
>
> We could add a score to each article. Certain properties are usually 
> good:
> * many edits (and many contributors)
> * many outgoing internal links
> * image links
> * links to other languages
>
> Furthermore, the following could be scored somehow:
> * text to list (* and #) ratio
> * text to external links ratio
> * text to headings ratio
> * text to number of paragraphs ratio
>
> Granted, such a scoring method won't be able to separate all good and 
> bad articles from each other. But, it could find many "bad" articles 
> (lists, text dumps, stubs, etc.), and give us some kind of statistics 
> about where we stand.

Good idea.
But don't be too harsh on * and #. I've often turned a large paragraph 
of "there are three types of trout commonly found, the blah trout which 
is ... ... ... , the foo trout which is ..." into a clearer list format.
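
A minimal sketch of the kind of score Magnus describes, in Python. Every
specific here is an assumption made for illustration and not anything
agreed on the list: the Article record, the regular expressions used to
spot [[internal]], [[Image:...]] and [[xx:...]] language links, the caps,
the weights and the thresholds. The list penalty is deliberately mild and
only applies when nearly every line is a * or # item, so a prose article
with a short list in it is not marked down.

```python
import re
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record: raw wikitext plus history counts."""
    text: str
    edits: int = 0
    contributors: int = 0


# Assumed wikitext conventions; real markup has more cases than these.
LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")   # [[target]] or [[target|label]]
EXTERNAL = re.compile(r"https?://\S+")
HEADING = re.compile(r"^=+[^=].*=+\s*$")               # == Heading ==
LANG = re.compile(r"^[a-z]{2,3}:")                     # [[de:...]], [[fr:...]]


def features(a):
    """Count the properties named in the proposal."""
    lines = [l for l in a.text.splitlines() if l.strip()]
    n = max(len(lines), 1)
    targets = LINK.findall(a.text)
    images = sum(1 for t in targets if t.lower().startswith("image:"))
    langs = sum(1 for t in targets if LANG.match(t))
    return {
        "edits": a.edits,
        "contributors": a.contributors,
        "internal_links": len(targets) - images - langs,
        "image_links": images,
        "language_links": langs,
        "external_links": len(EXTERNAL.findall(a.text)),
        "list_share": sum(1 for l in lines if l.startswith(("*", "#"))) / n,
        "heading_share": sum(1 for l in lines if HEADING.match(l)) / n,
        "words": len(a.text.split()),
    }


def score(a):
    """Combine the features with arbitrary, capped weights."""
    f = features(a)
    s = 0.0
    s += min(f["edits"], 20) * 0.5
    s += min(f["contributors"], 10) * 1.0
    s += min(f["internal_links"], 20) * 0.5
    s += min(f["image_links"], 3) * 2.0
    s += min(f["language_links"], 5) * 1.0
    if f["list_share"] > 0.8:                   # almost nothing but list items
        s -= 5.0
    if f["words"] < 50:                         # likely a stub
        s -= 5.0
    if f["external_links"] * 20 > f["words"]:   # more than one URL per 20 words
        s -= 5.0
    return s


# Made-up sample pages, for illustration only.
trout = Article(
    text=(
        "== Types ==\n"
        "There are three types of [[trout]] commonly found.\n"
        "* the blah trout\n"
        "* the foo trout\n"
        "[[de:Forelle]]\n"
        "[[Image:Trout.jpg]]"
    ),
    edits=12,
    contributors=4,
)
stub = Article(text="A trout is a fish.", edits=1, contributors=1)
dump = Article(text="* a\n* b\n* c", edits=1, contributors=1)

ranked = sorted([stub, dump, trout], key=score, reverse=True)
```

With these made-up pages, the trout article with its two-item list ranks
above the one-line stub, and the bare list ranks last. heading_share is
computed but left unscored here, since it is not obvious which direction
should count as good.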




