Wordcraft Home Page    Wordcraft Community Home Page    Forums  Hop To Forum Categories  Potpourri    One-look's perspective
One-look's perspective
 
Member
Picture of shufitz
posted
Does the OneLook dictionary site have an obsession? Enter the ordinary word come, and see what's first listed in the Quick Definitions, in the box on the right.
 
Posts: 2666 | Location: Chicago, IL USA
Member
posted
I don't know whether to take this as a serious question, or if it's just meant to point out a curiosity.

I do know that the 'quick definitions' are taken from the indexed public domain dictionaries (for obvious reasons) and at one time I think they used dictionary.com (until that source started using copyrighted material). I don't know how the current algorithm(s) work.
 
Posts: 334
Member
Picture of BobHale
posted
Seems a rather odd algorithm that puts that particular definition top and

quote:
move toward, travel toward something or somebody or approach something or somebody


at thirteenth.


"No man but a blockhead ever wrote except for money." Samuel Johnson.
 
Posts: 9422 | Location: England
Member
posted
quote:
Originally posted by BobHale:
Seems a rather odd algorithm that puts that particular definition top and

quote:
move toward, travel toward something or somebody or approach something or somebody


at thirteenth.


obviously it's not making value judgments.
 
Posts: 334
Member
Picture of Kalleh
posted
It just doesn't make sense. They can't be so stupid as to not be able to figure out that some people just look at that top definition. Or...maybe they can be...
 
Posts: 24735 | Location: Chicago, USA
Member
posted
this is why I suggested that there may just be an algorithm that's rotating through a list of their unprotected sources.

edit: FWIW, this statement appears on OneLook’s acknowledgment page:

The content that appears in the "Quick Definitions" section of our results pages derives primarily from WordNet, a project of Princeton University [license info], and data from the U.S. Census Bureau. It also derives from the hundreds of user-submitted additions and corrections we've received over the years.

I get the impression that they aren't running a very tight ship* over at OneLook since Bob Ware left the captaincy -- perhaps they've just got a loose cannon.

*to wit, it often takes 6 months to get them to do a db update.

This message has been edited. Last edited by: tsuwm,
 
Posts: 334
Member
posted
here's a little more information on OneLook for those wots innerested..

OneLook is just part of a larger operation called Datamuse (YCLIU). Currently, Datamuse is managed on a day-to-day basis by a gent name of Harvey Beeferman. I used to have somewhat of a working relationship with Harvey's son, Doug.

At that time, Doug had a real interest in analyzing OneLook's DBs, and he was sending me a list of "chokes", i.e., words not indexed. This list was ordered by frequency, which eliminated a lot of the spelling errors (but for the common ones), and I could cull the list for obscure words, add them to wwftd, improving OL, and Viola's your aunt!

In late 2006, I got this note from Harvey:

Yes, I’m Doug’s Dad. He works full time in the computer industry and I manage Datamuse on a day-to-day basis. He does technical work for Datamuse when he has time. We’re using Bob Ware, the originator of OneLook on a consulting basis to maintain the database.

<shrugs>
 
Posts: 334
Member
Picture of Richard English
posted
I don't know how other dictionaries order their definitions - but there must obviously be some amount of personal judgement. Even the obvious system of frequency of use would only be possible for the written word. And I would imagine that the definition selected here would be more frequently used vocally than in writing. ;-)


Richard English
 
Posts: 8038 | Location: Partridge Green, West Sussex, UK
Member
posted
quote:
Originally posted by Richard English:
I don't know how other dictionaries order their definitions


Remember that Onelook is not a dictionary.
 
Posts: 2428
Member
Picture of Richard English
posted
quote:
Remember that Onelook is not a dictionary.

Whatever it is it's a reference source and this challenge must be common to all reference sources. How do you decide priorities?


Richard English
 
Posts: 8038 | Location: Partridge Green, West Sussex, UK
Member
posted
quote:
Originally posted by Richard English:
there must obviously be some amount of personal judgement.


I can't agree.

Consider our sample word: if you take OL's ack'ment at face value, the WordNet source says "The verb come has 21 senses," of which the offending one comes (no pun) 20th. The current word count of WordNet, per OL, is 119160. If you allow, say, 5 senses per average word (conservative), that gives you more than 500,000 senses to put in some order.
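For anyone who wants to check that back-of-envelope figure, here is a minimal sketch of the arithmetic. The word count is the one quoted from OL above; the 5 senses per word is only the guess made in this post, not a measured WordNet statistic, and the function name is made up for illustration.

```python
# Back-of-envelope estimate of how many senses would need ordering.
WORD_COUNT = 119_160      # WordNet word count, as reported by OneLook (from the post)
SENSES_PER_WORD = 5       # ASSUMPTION: the post's own rough average, not a measured value


def estimate_total_senses(words: int, senses_per_word: int) -> int:
    """Multiply the word count by an assumed average number of senses per word."""
    return words * senses_per_word


total = estimate_total_senses(WORD_COUNT, SENSES_PER_WORD)
print(f"{total:,} senses")  # prints "595,800 senses"
```

So under that assumption the total lands a little under 600,000, which squares with "more than 500,000".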

I'm not going to do an extensive study, but pick a random word. I'm going to try 'average': WordNet lists 6 senses for the adj., 'statistical norm' being no. 1; quick def'ns gives this second.

So what's obvious is that some reordering is going on. One might reasonably ask, why? WordNet is "freely and publicly available for download"; but perhaps there are issues having to do with copying large amounts of data, such as OL is doing.

in any event, personal judgment or randomizing, for up to a million data points..
 
Posts: 334
Member
Picture of Richard English
posted
I find it hard to understand why you don't agree with my comment that there must be some personal judgement when it comes to ordering.

From the statistics you quote (although I confess I find some of them confusing), it seems clear that it would be impossible to use any strict numerical system or algorithm since there are just too many variables. In your own example of 5000 senses - how is one to decide how such senses are to be ordered? Does the 4,555th sense come before or after the 3,585th sense? And why?

Personal judgement must inevitably be used.


Richard English
 
Posts: 8038 | Location: Partridge Green, West Sussex, UK
Member
posted
let me try this from another angle. I know from the situation at OL that they don't have the resources to make value judgments on 500,000 (not 5000) senses. I assume, from this, that they must have had their part-timer who "does technical work" gin up a randomizer to sequence all these senses, in a one-shot effort. (any value-judgments would then, perforce, be programmed into his algorithm -- maybe he was the loose cannon.)

does this make any sense?

edit: I suppose, if one were really curious about this, one could imagine some other words with salacious shadings and see if they are consistently emPHAsized.

edit²: I tried 'head'; 'oral-genital stimulation' comes *way* down the list.

This message has been edited. Last edited by: tsuwm,
 
Posts: 334
Member
posted
In WordNet, the senses are typically ordered by the number of annotations of that sense. There is no specific "ordering" of senses, although the most annotated synsets tend to be the most common. Also, the data is somewhat quirky. One will notice a preponderance of baseball terms ranking higher than they should.
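To make the "ordered by number of annotations" idea concrete, here is a rough sketch. It does not read real WordNet data: the `Sense` type, the `order_by_annotations` helper, and above all the counts are invented purely for illustration, and the glosses are shortened stand-ins.

```python
from typing import List, NamedTuple


class Sense(NamedTuple):
    gloss: str
    tag_count: int  # HYPOTHETICAL annotation count, not taken from WordNet


def order_by_annotations(senses: List[Sense]) -> List[Sense]:
    """Most-annotated sense first; ties keep their original order (sorted is stable)."""
    return sorted(senses, key=lambda s: s.tag_count, reverse=True)


# Made-up counts for three senses of "come", only to show the mechanics.
senses = [
    Sense("be the product or result", 45),
    Sense("move toward, travel toward something or somebody", 300),
    Sense("reach a destination", 120),
]

ordered = order_by_annotations(senses)
for rank, sense in enumerate(ordered, start=1):
    print(rank, sense.tag_count, sense.gloss)
```

With a scheme like this, a sense that is rare in the annotated texts would sink toward the bottom of the list, so a rarely annotated sense showing up first would suggest some reordering after the fact.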
 
Posts: 886 | Location: Illinois
Copyright © 2002-12