
PHP Forum


Remove a specific character when accessed by search engine crawlers
08-25-2007, 07:05 PM Remove a specific character when accessed by search engine crawlers
Moldarin's Avatar
Extreme Talker

Latest Blog Post:
Keyword Density and Title Tags
Posts: 201
Trades: 0
Hi,

I want to use the Unicode character U+AD ‘SOFT HYPHEN’, which is absolutely lovely. Now even Firefox supports it! (Which makes a complete set.)

However, search engines, including Ask.com, Google, and Yahoo!, treat the character as a word separator. On search result pages they display it as white space. But I figured this could be solved very simply:

Do not let the search engines crawl pages with that character. I thought that somehow it would be possible to serve search engine crawlers another version of the page, one without the U+AD characters.

What I am asking for is this: How can I search a page and strip away all U+AD characters using PHP?
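
A minimal sketch of the stripping step asked about here, assuming the page is UTF-8 encoded (where U+00AD is the byte pair 0xC2 0xAD) and that soft hyphens may also be written as the HTML entities `&shy;`, `&#173;` or `&#xAD;`. The function name `strip_soft_hyphens` is made up for illustration and needs PHP 7 or later for the `\u{...}` escape.

```php
<?php
// Hypothetical helper: removes soft hyphens (U+00AD) from UTF-8 text.
function strip_soft_hyphens(string $html): string
{
    // "\u{00AD}" is the UTF-8 encoding of U+00AD, i.e. "\xC2\xAD".
    $html = str_replace("\u{00AD}", '', $html);

    // Also drop the named and numeric entity forms of the same character.
    return preg_replace('/&(?:shy|#0*173|#x0*AD);/i', '', $html);
}

// Prints: value added presentation
echo strip_soft_hyphens("val\u{00AD}ue add\u{00AD}ed pres&shy;en&#173;ta&#xAD;tion"), "\n";
```

If the page is served in a single-byte encoding such as ISO-8859-1, the character is the single byte 0xAD instead, so the search string would have to change accordingly.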
__________________
I do not share ad revenue.
08-25-2007, 07:07 PM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy's Avatar
Defies a Status

Posts: 6,521
Name: Dan
Location: Swindon
Trades: 0
Is there any great point to this amazingly pointless exercise?
__________________
Discounted Web Hosting With XDnet!
>> Get 25% of hosting~ Promo: Webmaster-talk <<

08-25-2007, 07:56 PM Re: Remove a specific character when accessed by search engine crawlers
Moldarin
Yes. Otherwise search engines such as Google would read my content as ‘val ue add ed pres en ta tion’ instead of ‘value added presentation’.

Last edited by Moldarin; 08-25-2007 at 07:58 PM. Reason: Fixed my hyphenation example.

08-25-2007, 07:59 PM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy
I'm sorry, wtf?

I'm sorry, why would it read it like that?

MY point is that it's not really going to matter, as there's hardly any reason to use it at all. And if it causes a problem that you're worried about, the simple answer is DON'T USE IT.

Sometimes it's just not worth making a script to play with it ¬.¬

Last edited by dansgalaxy; 08-25-2007 at 08:07 PM.

08-25-2007, 08:10 PM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy
ereg_replace might do this task


miserable git

Last edited by dansgalaxy; 08-25-2007 at 08:40 PM.

08-26-2007, 11:26 AM Re: Remove a specific character when accessed by search engine crawlers
Moldarin
There are plenty of reasons to use hyphenation. The point of a soft hyphen is that it only appears when it is needed. But search engines read it as a separating character. They should ignore it altogether.

Could you explain how you had pictured ereg_replace for this task? I am not too familiar with that one.

08-26-2007, 11:47 AM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy
PHP Code:
<?php
// Sample text containing ordinary hyphens.
$filecontents = 'all the text which-could have some-in-my opinion-unneeded hyphenations :p hehe';

// Replace every "-" with nothing. str_replace() is used here because
// ereg_replace() was deprecated in PHP 5.3 and removed in PHP 7.0.
echo str_replace("-", "", $filecontents);
?>
And that should replace them with nothing.

So not-hing will become nothing.

And then all you have to do is stick that in an if/else statement: if it's a search engine, run that script to remove all the hyphens; if not, show the hyphens.

Happy now? ¬.¬ Can I have my talkupation back now, please?
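
The if/else idea could be sketched roughly as follows, here applied to the soft hyphen (U+00AD) from the original question rather than the ordinary "-". The user-agent tokens in `CRAWLER_PATTERN` and the function names are assumptions for illustration only; the real crawler strings should be checked against each search engine's own documentation. Serving crawlers different content than visitors can also be treated as cloaking, a concern raised later in this thread.

```php
<?php
// Assumed crawler tokens, for illustration only.
const CRAWLER_PATTERN = '/Googlebot|Slurp|Teoma/i';

// Hypothetical helper: true when the User-Agent looks like a known crawler.
function is_crawler(string $userAgent): bool
{
    return preg_match(CRAWLER_PATTERN, $userAgent) === 1;
}

// Hypothetical helper: strips soft hyphens for crawlers, leaves them for everyone else.
function prepare_page(string $html, string $userAgent): string
{
    if (is_crawler($userAgent)) {
        return str_replace("\u{00AD}", '', $html);
    } else {
        return $html;
    }
}

// The header is absent on the command line, so fall back to an empty string.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
echo prepare_page("jour\u{00AD}nal", $userAgent), "\n";
```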

08-26-2007, 11:58 AM Re: Remove a specific character when accessed by search engine crawlers
Moldarin
You will get it back once I have a scalable version running. As I said: I am trying to do this on an entire document. The document is generated by WordPress, if that helps.

[whisper=dansgalaxy]I gave you negative talkupation because I thought it was really unnecessary to criticize me for trying to present content in the best way possible for users. What I am attempting is a hack that will allow search engines to understand the content as well.[/whisper]
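
One way to apply a replacement to an entire generated document, sketched under the assumption that the output is UTF-8, is PHP's output buffering: `ob_start()` accepts a callback that receives the buffered output before it is sent. The callback name below is hypothetical, and where exactly to start the buffer in a WordPress install is not covered in this thread, so that part is left open.

```php
<?php
// Hypothetical callback: receives the buffered page and returns it without U+00AD.
function strip_soft_hyphens_from_buffer(string $buffer): string
{
    return str_replace("\u{00AD}", '', $buffer);
}

// Everything echoed after this call is passed through the callback when flushed.
ob_start('strip_soft_hyphens_from_buffer');

echo "<p>pres\u{00AD}en\u{00AD}ta\u{00AD}tion</p>\n";

// Sends the filtered buffer to the client and turns buffering off.
ob_end_flush();
```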

Last edited by Moldarin; 08-26-2007 at 11:58 AM. Reason: Grammar lessons, anyone? :-(

08-26-2007, 12:15 PM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy
Well, like I said, there isn't a way to easily do this, and to be honest I would think that at most only a few words on a page would need a hyphen, and unless you are using them in a title or meta description they will not affect search engines that much.

08-26-2007, 12:31 PM Re: Remove a specific character when accessed by search engine crawlers
Moldarin
I do not think you know what a soft hyphen is. Here is the potted version: Soft hyphens are suggested points of hyphenation. Examples, where · marks a suggested hyphenation point: ‘Micro·soft’, ‘pres·en·ta·tion’, and ‘jour·nal’. When presented on screen these would look like ‘Microsoft’, ‘presentation’, and ‘journal’. BUT if a word appears at the end of a line and can be broken with hyphenation, it would look like this (| marks the line break): ‘Micro-|soft’, ‘presenta-|tion’, and ‘jour-|nal’. Good hyphenation will make any large block of text look very good. Just pick up a newspaper!

The problem child of suggested hyphenation is search. It breaks browser-side in-document search (though Opera 9.5 and Firefox 3.a7 do finally ignore soft hyphens in search!!) and it breaks search engines. They read it as ‘Micro soft’, ‘pres en ta tion’, and ‘jour nal’. In other words, it looks like rubbish to them.

The fix: Remove soft hyphens when accessed by crawlers.
The problem: To achieve the above.
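
The difference between the two readings described in this post can be sketched as below. This is only an illustration of the two behaviours, not a claim about how any particular search engine tokenizes text; the variable names are made up.

```php
<?php
// A word with soft hyphens at its suggested break points.
$word = "pres\u{00AD}en\u{00AD}ta\u{00AD}tion";

// Reading A (what the post reports): U+00AD is treated as a word separator.
$asSeparator = preg_split('/[\s\x{00AD}]+/u', $word, -1, PREG_SPLIT_NO_EMPTY);

// Reading B (what the post asks for): U+00AD is ignored.
$ignored = str_replace("\u{00AD}", '', $word);

echo implode(' ', $asSeparator), "\n"; // pres en ta tion
echo $ignored, "\n";                   // presentation
```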

08-27-2007, 11:20 AM Re: Remove a specific character when accessed by search engine crawlers
metho's Avatar
Ultra Talker

Posts: 481
Location: Gold Coast - Brisbane QLD, Australia
Trades: 0
How many people are going to search for "Micro·soft"? Where the hell is that on my keyboard anyway?

jebus!

Last edited by metho; 08-27-2007 at 11:21 AM. Reason: I can spell 'many', I really can!

08-27-2007, 11:29 AM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy
thaaank you

08-27-2007, 11:30 AM Re: Remove a specific character when accessed by search engine crawlers
Moldarin
Am I really that bad at explaining this?

The hyphens are not rendered if they are not needed. So they only occur at potential line breaks.

08-27-2007, 11:31 AM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy
But the whole point is it doesn't really matter, at least not enough to go to all this trouble.

08-27-2007, 11:35 AM Re: Remove a specific character when accessed by search engine crawlers
metho
You're concise enough. However in my opinion, your question is ridiculous and comes across as pretentious. str_replace and ereg_replace will do the trick.

08-27-2007, 12:27 PM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy
I'm going to have one last go at trying to get you to see MY point.

Yes, I understand what a soft hyphen is and how it is used.
I see its purpose.
I can understand why you might feel you want to use it on your site.

BUT as a search engine ONLY indexes your page title,
keywords,
and a short description from your meta/page,

IT doesn't matter!

You sure as hell shouldn't be using them in your title or keywords.

THIS IS MY POINT. You have no reason to use it in places Google is likely to look, if you have some reasonable SEO (i.e. some metas).

If you still don't get it then I'm sorry, bye!

08-27-2007, 06:53 PM Re: Remove a specific character when accessed by search engine crawlers
metho
Dan, only meta search engines behave in such a way, such as meta-crawler. Google, Yahoo, MSN and most of the others will cache a whole copy of the page. Do a search and click on 'cached result' underneath a result...

In simple terms, search engines index a page for results by breaking it down into unique words, tallying how many times each word appears on the page, and rating each word by the HTML element it appears in.
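
The tallying step described here can be sketched with a toy indexer. This is a simplified illustration with a made-up function name, not how any real search engine works, and it leaves out the per-element rating entirely. Because it splits on anything that is not a letter, it also shows how a word containing U+00AD ends up counted as fragments.

```php
<?php
// Hypothetical toy indexer: counts how often each word appears in a page.
function tally_words(string $html): array
{
    // Put a space before each tag so words in adjacent elements do not merge.
    $text = strtolower(strip_tags(str_replace('<', ' <', $html)));

    // Split on anything that is not a letter. U+00AD is not a letter,
    // so this toy indexer breaks words at soft hyphens too.
    $words = preg_split('/\P{L}+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    return array_count_values($words);
}

$tally = tally_words("<h1>Value added</h1><p>A value added pres\u{00AD}en\u{00AD}ta\u{00AD}tion.</p>");
print_r($tally);
```

In this run "value" and "added" are each counted twice, while "presentation" never appears as a word; only its fragments do.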

Mold's reasoning is unsound because of 3 points. His proposal to show SE bots one version and human visitors another of his webpage could be interpreted as Cloaking.

Secondly, Moldy is apparently so vain about his blog that he would rather not have his page visited/indexed by search engines if they "display [U+AD] as white space". This got me riled. How fricking VAIN.

Lastly, this preciousness about paragraph appearances is folly. Many users will have browser preferences that will define a minimum font size. Once the font size goes up, the lines in his paragraph will break where the browser dictates, not where U+AD chars are placed.

I've been posting here for over 3 1/2 years and this takes the cake for most ridiculous question ever asked (some of your earliest questions came pretty close, Dan, remember those days?).

08-28-2007, 07:24 AM Re: Remove a specific character when accessed by search engine crawlers
dansgalaxy
yes :P

I know about how it caches the page. I was thinking more about the actual results page: unless they click the cached page, which I would think they do not do that often, users are only gonna see a brief clip of it, and I'm thinking all his keywords would have soft hyphens in.

08-28-2007, 09:04 AM Re: Remove a specific character when accessed by search engine crawlers
chrishirst's Avatar
Missing! presumed drunk.

Posts: 42,385
Name: Chris Hirst
Location: Blackpool. UK
Trades: 0
I find it much simpler to use text-align:justify and not bother about which words possibly might need to break at line end just in case an end user is browsing at WxH or W1xH1 resolution and might just resize their browser whilst looking at this page.
__________________
A foolish consistency is the hobgoblin of little minds
Thought for today:- Is SEO the only industry where all the cowboys are Indians?

08-28-2007, 12:40 PM Re: Remove a specific character when accessed by search engine crawlers
Moldarin
Quote:
Originally Posted by metho
Lastly, this preciousness about paragraph appearances is folly. Many users will have browser preferences that will define a minimum font size. Once the font size goes up, the lines in his paragraph will break where the browser dictates, not where U+AD chars are placed.
Well, U+AD is suited even better to scaling. I will do a static example, as no one seems to understand its purpose.

Please note how the first example is a whole line shorter than the second, and how much better the text block looks. The hyphenation points occur where the soft hyphens (or U+AD, as we have all come to know them) are.

With defined U+AD characters:
Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
Cras sit amet quam. Morbi nec tellus. Mauris ante orci,
mattis at, euismod ut, iaculis vitae, dolor. Mauris nec tor-
tor at ante sagittis pulvinar. Mauris augue. Proin feugiat
mauris. Suspendisse lobortis facilisis urna. Nulla facilisi.
Nam dignissim, lacus a tristique pellentesque, eros elit por-
ta mi, et egestas eros dui sit amet sem. Nam eget augue
nec tellus porttitor imperdiet. Pellentesque rutrum suscip-
it risus. Cras vitae purus.

Without:
Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
Cras sit amet quam. Morbi nec tellus. Mauris ante orci,
mattis at, euismod ut, iaculis vitae, dolor. Mauris nec
tortor at ante sagittis pulvinar. Mauris augue. Proin
feugiat mauris. Suspendisse lobortis facilisis urna. Nulla
facilisi. Nam dignissim, lacus a tristique pellentesque,
eros elit porta mi, et egestas eros dui sit amet sem.
Nam eget augue nec tellus porttitor imperdiet.
Pellentesque rutrum
suscipit risus. Cras vitae purus.
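
The line-breaking behaviour shown in the static example can be imitated with a small greedy wrapper. This is a rough sketch with a made-up function name, assuming ASCII text apart from the soft hyphens, a fixed character width and single spaces between words. Browsers use proportional fonts and their own line-breaking rules, so it only illustrates the idea.

```php
<?php
const SHY = "\u{00AD}"; // U+00AD SOFT HYPHEN, UTF-8 encoded

// Hypothetical greedy wrapper: breaks only at spaces or at soft hyphens.
function wrap_with_soft_hyphens(string $text, int $width): array
{
    $lines = [];
    $line = '';
    foreach (preg_split('/ +/', trim($text)) as $word) {
        $parts = explode(SHY, $word); // fragments between suggested break points
        while ($parts) {
            $whole = implode('', $parts);
            $prefix = $line === '' ? '' : $line . ' ';
            if (strlen($prefix . $whole) <= $width) {
                $line = $prefix . $whole; // the rest of the word fits as is
                break;
            }
            // Try the longest leading run of fragments that fits with a visible hyphen.
            $placed = false;
            for ($n = count($parts) - 1; $n >= 1; $n--) {
                $head = implode('', array_slice($parts, 0, $n)) . '-';
                if (strlen($prefix . $head) <= $width) {
                    $lines[] = $prefix . $head;
                    $line = '';
                    $parts = array_slice($parts, $n);
                    $placed = true;
                    break;
                }
            }
            if (!$placed) {
                if ($line === '') {
                    $lines[] = $whole; // too long even for an empty line: emit unbroken
                    $parts = [];
                } else {
                    $lines[] = $line;  // close the current line and retry on a fresh one
                    $line = '';
                }
            }
        }
    }
    if ($line !== '') {
        $lines[] = $line;
    }
    return $lines;
}

print_r(wrap_with_soft_hyphens("Mauris nec tor" . SHY . "tor at ante", 15));
print_r(wrap_with_soft_hyphens("Mauris nec tortor at ante", 15));
```

With the soft hyphen the first line is filled up to "tor-"; without it the whole word "tortor" moves to the next line and the first line ends early.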