11-09-2008, 12:19 PM
|
A theoretical question
|
Posts: 3,985
Name: Abel Mohler
Location: Asheville, North Carolina USA
|
I did some analysis of searches using the word "Wayfarer" today, and noticed something interesting. Here it is: http://www.google.com/search?hl=en&c...er&btnG=Search
The interesting thing about this word is that it has a couple of different meanings, and contexts in which it is used. My question is, do you think it is possible that Google examines the context that words are used in, as one measure of weighting results?
The reason I use this as an example, is that I've often noticed that search results tend to be grouped together by context. On the wayfarer results page, it is obvious that the most commonly used context is that of the brand of sunglasses. Although it would make sense that the most links would be coming in to one of those pages, my quick analysis of the Amazon Page, for example, found that this was not in fact the case.
Google also has a shopping results section for this term. I am curious if this result is used as part of the context measurement, placing the Amazon page above others that have more links, and also more content, though they are not as relevant, since they are used in a less common context.
The original meaning of the word Wayfarer is 1. someone who travels on foot. Later, it also came to mean 2. a class of sailboat. Much later than that it became a name for 3. a type of sunglasses. It could be said that the importance of the word, historically, is ordered 1-2-3, but the importance of the word, contextually, is rather 3-2-1, the exact opposite, which is how the listings seem to be grouped.
If there are more results overall for Wayfarer, grouped with Sunglasses, wouldn't it make sense that the strongest results for that group are the most relevant?
Just a thought. Of course, I could be completely wrong. 
Last edited by wayfarer07; 11-09-2008 at 12:22 PM..
|
|
|
|
11-09-2008, 01:41 PM
|
Re: A theoretical question
|
Posts: 41,517
Name: Chris Hirst
Location: Blackpool. UK
|
Determining "context" of ambiguous one word searches is one thing that totally eludes all search engines. So the results are simply ranked for the word and any "grouping" is the natural order rather than any particular effort.
If the word is more associated with one particular context then "grouping" simply does not exist. Take Ford, for example: 8 results for the car manufacturer. Harrison gets a look in near the end and the Ford Foundation breaks up the rest. Not a mention of a crossing place in a flow of water, and no former US presidents feature in the list.
If there was any attempt to show a variation for context...
.. it failed miserably 
__________________
Chris. ->> Links are advertising NOT optimising!! <<-
A foolish consistency is the hobgoblin of little minds
Thought for today:- Is SEO the only industry where all the cowboys are Indians?
|
|
|
|
11-09-2008, 02:33 PM
|
Re: A theoretical question
|
Posts: 3,985
Name: Abel Mohler
Location: Asheville, North Carolina USA
|
While I think you make good points, and the results for Ford are interesting, they don't entirely disprove the idea. After all, when you are talking about a vast conglomerate such as Ford Motors, it is quite possible that the vast amount of links and other data may drown out certain other factors such as context.
As far as context being ambiguous, it really isn't, if you think about it. The one thing that might be confusing, is that context to a computer is much different than context to the human brain, which is an abstract concept. Context in computer terms would be about simple associations:
HTML Code:
<h1>Wayfarer</h1>
<p>...............sunglasses...................
.....................................................
......sunglasses.............</p>
HTML Code:
<strong>Wayfarer Sunglasses</strong>
HTML Code:
<dl>
<dt>Wayfarer</dt>
<dd>Sunglasses</dd>
</dl>
HTML is structured so that words are constantly grouped together and placed into context, from content structured with headings, to definition lists, to words grouped by <strong> text, etc, etc.
It wouldn't be difficult to define associations as strong, weak, or somewhere in between, by percentage, by cataloging these associations based on various types of associations. This might not work in every language, but it would work for some.
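To make the idea above concrete, here is a rough sketch of cataloging associations by element type and turning them into percentages. Everything in it is an assumption for illustration: the `ELEMENT_WEIGHTS` numbers, the `WeightedWords` and `association_strengths` names, and the sample pages are all made up, and nothing here is claimed to be how Google works.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# Illustrative weights only: how strongly each element "binds" its words to the page topic.
ELEMENT_WEIGHTS = {"h1": 3.0, "dt": 2.5, "dd": 2.5, "strong": 2.0, "p": 1.0}


class WeightedWords(HTMLParser):
    """Collect every word on a page, weighted by the heaviest enclosing element."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.scores = defaultdict(float)

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        weight = max((ELEMENT_WEIGHTS.get(t, 0.0) for t in self.open_tags), default=0.0)
        for word in re.findall(r"[a-z]+", data.lower()):
            self.scores[word] += weight


def association_strengths(term, pages):
    """Return each co-occurring word's share of total weight, over pages mentioning `term`."""
    term = term.lower()
    totals = defaultdict(float)
    for page in pages:
        if term not in page.lower():
            continue
        parser = WeightedWords()
        parser.feed(page)
        parser.close()
        for word, score in parser.scores.items():
            if word != term:
                totals[word] += score
    grand_total = sum(totals.values())
    return {w: s / grand_total for w, s in totals.items()} if grand_total else {}


# Hypothetical sample pages, loosely based on the snippets above.
pages = [
    "<h1>Wayfarer</h1><p>classic sunglasses with dark lenses</p>",
    "<strong>Wayfarer Sunglasses</strong>",
    "<dl><dt>Wayfarer</dt><dd>Sunglasses</dd></dl>",
    "<h1>Wayfarer</h1><p>a sailboat class</p>",
]
strengths = association_strengths("wayfarer", pages)
for word, share in sorted(strengths.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{word}: {share:.0%}")
```

With these made-up pages, "sunglasses" ends up with a much larger share than "sailboat", which is the kind of strong versus weak association I mean. A real system would need far more data and would have to deal with filler words, which this sketch does not attempt.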
Now, just for argument's sake, let's say it is possible for Google to do this, and to create a weighting system based on this type of context, then use it as a factor when determining rankings. Would it be in their interest to do so? I think Google is interested in using factors that are very difficult to spoof by people like us. As far as context is concerned, it would be literally impossible to fake, since you would be forced to take over a huge percentage of the internet in order to do so.
Links can be built in an automated fashion (whether they are then useful is highly debatable), but the ultimate democracy of the internet is content. Not everyone builds links, but everyone does create content.
If Google doesn't do this already, even to a very minor degree, I wonder if they are aware of its potential?
|
|
|
|
11-10-2008, 05:53 AM
|
Re: A theoretical question
|
Posts: 41,517
Name: Chris Hirst
Location: Blackpool. UK
|
Quote:
|
it is quite possible that the vast amount of links and other data may drown out certain other factors such as context
|
That goes for anything and everything
Quote:
|
As far as context being ambiguous, it really isn't, if you think about it. The one thing that might be confusing, is that context to a computer is much different than context to the human brain, which is an abstract concept. Context in computer terms would be about simple associations:
|
Absolutely agree.
Context, more often than not, is not ambiguous, but a single word that has two or more meanings or contexts it can be used in, is ambiguous.
Take "context" for example. To us "context" has one meaning, to a Google search it has several contexts.
Search word context is one of the reasons that we humans are given some visual pointers on the SERP, in the form of a short snippet from the page and bolding of the word. The end user can glance at the results and see what context the word they looked for is being used in on the displayed page. This of course is where the "experts" get "density" confused with "proximity" and come up with the fuzzy concept of "stop words".
Quote:
|
It wouldn't be difficult to define associations as strong, weak, or somewhere in between, by percentage, by cataloging these associations based on various types of associations. This might not work in every language, but it would work for some.
|
Which they do already, once you get past a one word search and the context is defined.
wayfarer rigging
wayfarer lens
ford water
Quote:
|
Now, just for argument's sake, let's say it is possible for Google to do this, and to create a weighting system based on this type of context, then use it as a factor when determining rankings. Would it be in their interest to do so?
|
And they do, but as part of the personalised and geolocation results, rather than the general results.
Quote:
|
If Google doesn't do this already, even to a very minor degree, I wonder if they are aware of its potential?
|
They are definitely aware of the potential; if I recall, they have a patent on defining context for search patterns. BUT they will also be aware of the pitfalls if they get it wrong too often.
|
|
|
|
11-10-2008, 07:31 AM
|
Re: A theoretical question
|
Posts: 19
|
Google already does what you are suggesting. It's something to do with keyword density. They can discriminate between stuffed keywords and contextual keywords up to a degree. Further, they use HTML elements such as headings to weigh the importance of that keyword.
When you search for a single word such as 'wayfarer' you cannot talk about context, but when you search for a phrase the results will be contextual.
For instance, when I search for 'wayfarer definition' it brings up contextual results about the definition of the word, not the sunglasses.
|
|
|
|
11-10-2008, 07:36 AM
|
Re: A theoretical question
|
Posts: 41,517
Name: Chris Hirst
Location: Blackpool. UK
|
Quote:
|
It's something to do with keyword density.
|
It's not "keyword density", but the word proximity (how close the words are in relation to each other) that determines context.
|
|
|
|
11-10-2008, 08:09 AM
|
Re: A theoretical question
|
Posts: 19
|
Quote:
Originally Posted by chrishirst
It's not "keyword density", but the word proximity (how close the words are in relation to each other) that determines context.
|
I don't agree on that one. Look at this:
'He was a wayfarer wearing dark sunglasses'.
Would you associate it with the trade mark of the sunglasses? If we are talking about artificial intelligence both density and proximity could be a factor, density being the determining one.
You wouldn't tell that the document is about wayfarer sunglasses if it doesn't have enough density in the document, no matter how proximal it is on that occasion.
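For anyone who wants to see the two measures side by side, here is a minimal sketch that computes both for the sentence above. The `tokens`, `density` and `min_distance` helpers are hypothetical names of my own, and the tokenizer is deliberately crude; it is only meant to show what each number measures, not how any search engine computes it.

```python
import re


def tokens(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def density(text, word):
    """Fraction of all tokens in `text` that are exactly `word`."""
    toks = tokens(text)
    return toks.count(word) / len(toks) if toks else 0.0


def min_distance(text, a, b):
    """Smallest gap in token positions between `a` and `b`, or None if either is missing."""
    toks = tokens(text)
    pos_a = [i for i, t in enumerate(toks) if t == a]
    pos_b = [i for i, t in enumerate(toks) if t == b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


sentence = "He was a wayfarer wearing dark sunglasses"
product = "Wayfarer sunglasses"

print(density(sentence, "sunglasses"))                # share of tokens that are "sunglasses"
print(min_distance(sentence, "wayfarer", "sunglasses"))  # gap in the traveller sentence
print(min_distance(product, "wayfarer", "sunglasses"))   # gap in the product phrase
```

The two words sit closer together in the product phrase than in the traveller sentence, but on a single sentence neither number settles which sense is meant, which is really what is being argued about here.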
|
|
|
|
11-11-2008, 04:12 PM
|
Re: A theoretical question
|
Posts: 5,662
Name: John Alexander
|
Quote:
Originally Posted by carofl
'He was a wayfarer wearing dark sunglasses'.
Would you associate it with the trade mark of the sunglasses? If we are talking about artificial intelligence both density and proximity could be a factor, density being the determining one.
|
Now you need a part-of-speech tagger based on the maximum entropy model. Wayfarer = noun, wearing = inflected verb. In your example, "wearing dark sunglasses" is an auxiliary description - "he was a wayfarer" is the noun phrase that heads the sentence. Draw out the tree.
Quote:
Originally Posted by carofl
You wouldn't tell that the document is about wayfarer sunglasses if it doesn't have enough density in the document, no matter how proximal it is on that occasion.
|
Density as an explanation is idiotic. Read Chomsky and Pinker.
|
|
|
|
11-12-2008, 11:23 AM
|
Re: A theoretical question
|
Posts: 19
|
Quote:
Originally Posted by Learning Newbie
Now you need a part-of-speech tagger based on the maximum entropy model. Wayfarer = noun, wearing = inflected verb. In your example, "wearing dark sunglasses" is an auxiliary description - "he was a wayfarer" is the noun phrase that heads the sentence. Draw out the tree.
Density as an explanation is idiotic. Read Chomsky and Pinker.
|
What does Chomsky have to do with search engine theory? Are we talking about literature or search engines here? If you cannot discriminate between them, that's what I would call idiotic.
Just tell me how does Adsense provide contextual ads? By checking the figures of speech?
Further, you don't understand what you read at all; I wonder how you read Chomsky. I intentionally emphasized there 'If we are talking about artificial intelligence' -----> search engines
If you think checking keyword density to get the context of a document is idiotic, then tell it to the people at Google.
I'm getting angry every day seeing such retarded remarks on webmaster forums, especially when it is a moderator.
Please think before insulting others
Last edited by carofl; 11-12-2008 at 11:41 AM..
|
|
|
|
11-12-2008, 12:08 PM
|
Re: A theoretical question
|
Posts: 3,985
Name: Abel Mohler
Location: Asheville, North Carolina USA
|
Note that he didn't say YOU ARE AN IDIOT. All he said was that density is an idiotic explanation. You are not an idiot, you have just picked up an idiotic explanation from other places on the web. Kind of like a common cold.
When something has "proper density", it is more likely just a symptom of noun-verb, header-text associations that are present on the page. Saying it is just simple density is kind of like assuming a big-league pitcher will throw you a soft-ball.
Quote:
Originally Posted by Learning Newbie
Read Chomsky and Pinker.
|
Yeah... I'm not sure what he meant by that either... 
|
|
|
|
11-12-2008, 02:06 PM
|
Re: A theoretical question
|
Posts: 5,662
Name: John Alexander
|
Quote:
Originally Posted by carofl
What does Chomsky have to do with the search engine theory? Are we talking about literature or search engines here?
|
Literature??
Noam Chomsky is the most acclaimed linguist of the 20th century. Steven Pinker isn't very far behind. Search engines search text, which is written language - I thought that would be obvious.
Quote:
Originally Posted by carofl
Just tell me how does Adsense provide contextual ads? By checking the figures of speech?
|
Again, pick up a book on linguistics. You'll learn (1) what's possible, (2) what's not possible, (3) how humans understand context and express it in language.
Quote:
Originally Posted by carofl
Further, you don't understand what you read at all; I wonder how you read Chomsky. I intentionally emphasized there 'If we are talking about artificial intelligence' -----> search engines
|
You can't understand artificial intelligence without first grasping natural intelligence. Search engines and other text mining applications do their best to mimic human intelligence, so that they can recommend answers humans will like.
Quote:
Originally Posted by carofl
If you think checking keyword density to get the context of a document is idiotic, then tell it to the people at google.
|
While I'm telling them things they already know, I'd better suggest they don't build the Googleplex out of straw, and should brush their teeth 2 or 3 times a day?
Quote:
Originally Posted by carofl
Please think before insulting others
|
Look, I'm not trying to insult you, but keyword density is an idiotic way to try (and fail miserably) to pull any meaningful information out of a text document. I'm suggesting that you read some books on linguistics so that you'll get an explicit sense of how people read meaning and relatedness out of text, because this is well known, and it's what search engines are modeled on.
|
|
|
|
11-13-2008, 08:12 PM
|
Re: A theoretical question
|
Posts: 19
|
Ok anyway, you're just mistaking getting the context of a document for the cognitive comprehension of it.
Google doesn't need to comprehend the document to serve contextual results, which is what they do and invest all their money in. It's statistics.
If it's 99% right in assuming my context, that's ok for me. You tend to call it relatedness but I call it context. I could write more on why Google serves contextual results, but you'll keep denying it from that standpoint.
I search 'interest rates' and find contextual results for the word interest.
So bye, you keep reading Chomsky and I'll do my job.
|
|
|
|
11-13-2008, 08:21 PM
|
Re: A theoretical question
|
Posts: 181
Name: James Spinosa
Location: Fourth Floor Marketing
|
Google's algorithms cause people to link to commercial products far more than to the definition of a word; there is not much incentive to link to the definition of the word wayfarer.
|
|
|
|
11-13-2008, 08:49 PM
|
Re: A theoretical question
|
Posts: 5,662
Name: John Alexander
|
Quote:
Originally Posted by carofl
Google doesn't need to comprehend the document to serve contextual results, which is what they do and invest all their money in. It's statistics.
|
Yes, statistics play a big role. Keyword density does not. Else every English language page on the internet would be related or contextually similar to "the" and "and".
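A quick sketch of that point, assuming the simplest possible definition of density (occurrences of a word divided by total words). The `top_by_density` helper and the sample text are made up for illustration:

```python
from collections import Counter
import re


def top_by_density(text, n=3):
    """Return the n most frequent words with their share of all tokens."""
    toks = re.findall(r"[a-z']+", text.lower())
    counts = Counter(toks)
    total = len(toks)
    return [(word, count / total) for word, count in counts.most_common(n)]


# Hypothetical page text about the sunglasses.
sample = ("The Wayfarer is the best known model of sunglasses, and the frame "
          "and the lenses are sold in shops and online.")

for word, share in top_by_density(sample, 2):
    print(f"{word}: {share:.1%}")
```

On this sample the two densest words are "the" and "and", not "wayfarer" or "sunglasses", so raw density on its own says nothing about what the page is about.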
Seriously, what I'm saying is people shouldn't invest their time and effort into things that don't work. If you don't want to do what does work, that's fine, but you'd do better by yourself to spend that time having fun, rather than fine tuning a number that nobody (including Google's robots) will ever look at. I'm sorry that offends you.