
Google Slap.......!

Wednesday, December 22, 2010

Basically, the Google Slap accomplishes several things at once. First, it drives up the price of your pay per click amount, sometimes demanding as much as US$10 per click, which most small vendors cannot afford to pay. Second, Google may reduce your page ranking (PR), which automatically means you have to pay more to have your ads featured through AdWords.

Google obviously has a right to determine which pages represent the greatest quality and best match keywords, but in 2008, a round of Google Slap actions on small vendors took a considerable toll, driving many companies out of business quickly. Many argued that their pages conformed to Google’s AdWords recommendations, and yet they still received very low PR rankings. This meant the enormously high pay per click fees were prohibitive, which was comparable to banning certain vendors from using AdWords, even though these pages might still appear in a regular Google search, just not in ads.

When some companies receive a Google Slap, just about the only thing they can do is completely change their domain, because it is very difficult to challenge Google on its ratings or assessment of pages. There are Internet horror stories of well-known pages with excellent, authoritative and original information being slapped and never recovering. If you decide Google AdWords is for you, you should definitely read all the information about what Google looks for, to avoid receiving a Google Slap.

A few things that Google consistently appears to look for include original and significant content on landing pages (the page a person goes to when they click on your ad), transparent business dealings and upfront information about how you conduct business, and easy methods for searching your website from the landing page. To avoid a Google Slap, do not create a landing page that is just a table of contents surrounded by ads for other vendors, and make sure your page conforms to Google’s Editorial Guidelines, too.


Location of keywords on a page

Monday, December 6, 2010

SEO Expert Advice

A very short rule for seo experts – the closer a keyword or keyword phrase is to the beginning of a document, the more significant it becomes for the search engine.


Text format and seo

Search engines pay special attention to page text that is highlighted or given special formatting. We recommend:

Heading Tag

use keywords in headings. Headings are text highlighted with the «H» HTML tags. The «h1» and «h2» tags are most effective. Currently, the use of CSS allows you to redefine the appearance of text highlighted with these tags. This means that «H» tags are used less nowadays, but they are still very important in seo work.

Bold Text

Highlight keywords with bold fonts. Do not highlight the entire text! Just highlight each keyword two or three times on the page. Use the «strong» tag for highlighting instead of the more traditional «B» bold tag.

«TITLE» tag

This is one of the most important tags for search engines. Make use of this fact in your seo work. Keywords must be used in the TITLE tag. The link to your site that is normally displayed in search results will contain text derived from the TITLE tag. It functions as a sort of virtual business card for your pages. Often, the TITLE tag text is the first information about your website that the user sees. This is why it should not only contain keywords, but also be informative and attractive. You want the searcher to be tempted to click on your listed link and navigate to your website. As a rule, 50-80 characters from the TITLE tag are displayed in search results and so you should limit the size of the title to this length.

Keywords in links

A simple seo rule – use keywords in the text of page links that refer to other pages on your site and to any external Internet resources. Keywords in such links can slightly enhance page rank.

«ALT» attributes in images

Any page image has a special optional attribute known as "alternative text". It is specified using the «ALT» attribute of the image tag. This text will be displayed if the browser fails to download the image or if the browser image display is disabled. Search engines save the value of image ALT attributes when they parse (index) pages, but do not use it to rank search results.

Currently, the Google search engine takes into account text in the ALT attributes of those images that are links to other pages. The ALT attributes of other images are ignored. There is no information regarding other search engines, but we can assume that the situation is similar. We consider that keywords can and should be used in ALT attributes, but this practice is not vital for seo purposes.

Description Meta tag

This is used to specify page descriptions. It does not influence the seo ranking process but it is very important. A lot of search engines (including the largest one – Google) display information from this tag in their search results if this tag is present on a page and if its content matches the content of the page and the search query.

Experience has shown that a high position in search results does not always guarantee large numbers of visitors. For example, if your competitors' search result description is more attractive than the one for your site then search engine users may choose their resource instead of yours. That is why it is important that your Description Meta tag text be brief, but informative and attractive. It must also contain keywords appropriate to the page.

Keywords Meta tag

This Meta tag was initially used to specify keywords for pages but it is hardly ever used by search engines now. It is often ignored in seo projects. However, it would be advisable to specify this tag just in case there is a revival in its use. The following rule must be observed for this tag: only keywords actually used in the page text must be added to it.
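
To make these on-page recommendations easier to apply, here is a minimal sketch of an automated check written with Python's standard html.parser module. It is only an illustration: the keyword, the sample HTML and the OnPageChecker class are invented for this example, and a real audit would cover far more than these few signals.

from html.parser import HTMLParser

class OnPageChecker(HTMLParser):
    """Collects the signals discussed above: TITLE length, the Description
    meta tag, the keyword in the H1 heading, and missing ALT attributes."""
    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword.lower()
        self.title = ""
        self.has_description = False
        self.h1_has_keyword = False
        self.images_missing_alt = 0
        self._in_title = False
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self._in_h1 = True
        elif tag == "meta" and attrs.get("name", "").lower() == "description":
            self.has_description = bool(attrs.get("content"))
        elif tag == "img" and not attrs.get("alt"):
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        if self._in_h1 and self.keyword in data.lower():
            self.h1_has_keyword = True

# A made-up page for demonstration purposes only
page = ("<html><head><title>Discount church bells - hypothetical example page title</title>"
        "<meta name='description' content='Hand-made discount church bells.'>"
        "</head><body><h1>Discount church bells</h1>"
        "<img src='bell.jpg' alt='bronze church bell'></body></html>")

checker = OnPageChecker("church bells")
checker.feed(page)
print("Title length 50-80 chars:", 50 <= len(checker.title) <= 80)
print("Description meta tag present:", checker.has_description)
print("Keyword found in H1:", checker.h1_has_keyword)
print("Images missing ALT text:", checker.images_missing_alt)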

– Guidelines by

Gaurav Mulherkar

(SEO Expert)

List Of Yahoo Services

Monday, November 29, 2010

Flickr
Flickr is a popular photo sharing service which Yahoo! purchased on 29 March 2005.

Yahoo! Advertising
A combination of advertising services owned by Yahoo!.

Yahoo! Answers
Yahoo! Answers is a service that allows users to ask and answer questions other users post. It competes with Ask MetaFilter. Yahoo! Answers uses a points system whereby points are awarded for asking and answering questions, and deducted for deleting a question or answer, or getting reported.

Yahoo! Avatars
Yahoo! Avatars allows users to create personalized character images, also known as avatars, which are displayed on Yahoo! Messenger, Yahoo! Answers and the user's Yahoo! 360° profile.

Yahoo! Babel Fish
Yahoo! Babel Fish is a translation service.

Yahoo! Bookmarks
Yahoo! Bookmarks is a private bookmarking service. All Users from Yahoo MyWeb were transferred to this service.

Yahoo! Buzz
Yahoo! Buzz is a community based publishing service much like that of Digg, where users can buzz about certain
stories and allow them to be featured on the main page of the site.

Yahoo! Developer Network
Yahoo! Developer Network offers resources for software developers who use Yahoo! technologies and Web services.

Yahoo! Digu

Yahoo! Directory
Yahoo! was first formed as a web directory of web sites, organized into a hierarchy of categories and subcategories, which became the Yahoo! Directory. Once a human-compiled directory, Yahoo! Directory now offers two methods of inclusion: Standard, which is free and only available for non-commercial categories, and Express, which charges over US$300 for a quick inclusion in the directory.

Yahoo! Finance
Yahoo! Finance offers financial information, including stock quotes and stock exchange rates.

Yahoo! Games
Yahoo! Games allows users to play games, such as chess, billiards, checkers and backgammon, against each other. Users can join one of various rooms and find players in these rooms to play with. Most of the games are Java applets, although some require the user to download the game, and some games are single-player. Yahoo! acquired a one person effort called ClassicGames.com in 1997, which became Yahoo! Games.

Yahoo! Groups
Yahoo! Groups is a free groups and mailing list service which competes with Google Groups. It was formed when Yahoo! acquired eGroups in August 2000. Groups are sorted in categories similar to the Yahoo! Directory. Yahoo! Groups also offers other features such as a photographic album, file storage and a calendar.

Yahoo! Kids
Yahoo! Kids is a children's version of the Yahoo! portal. It also offers some online safety tips.

Yahoo! Local
Find local businesses and services and view the results on a map. Refine and sort results by distance, topic, or other factors. Read ratings and reviews. Uses hCalendar and hCard microformats, so that event and contact details can be downloaded directly into calendar and address-book applications.

Yahoo! Mail
Yahoo! acquired Four11 on 8 October 1997, and its webmail service Rocketmail became Yahoo! Mail. Since Google released Gmail on 1 April 2004, Yahoo! Mail has made several improvements to keep ahead of the competition, which also includes MSN Hotmail and AOL Mail. Yahoo! Mail is the only web-based email service that offers unlimited storage for all users. On 9 July 2004, Yahoo! acquired an e-mail provider named Oddpost and used its technology to create Yahoo! Mail Beta, which uses Ajax to mimic the look and feel of an e-mail client. On 19 June 2008, Yahoo! Mail introduced two new email domains: ymail.com and rocketmail.com ("@ymail.com" and "@rocketmail.com" at http://mail.yahoo.com).[1]

Yahoo! Maps
Yahoo! Maps offers driving directions and traffic.

Yahoo! Meme
Yahoo! Meme is a beta social service, similar to the popular social networking site Twitter.

Yahoo! Messenger
Yahoo! Messenger is an instant messaging service first released on 21 July 1999, which competes with AOL Instant Messenger, MSN Messenger, Google Talk, ICQ and QQ. It offers several unique features, such as IMvironments, custom status messages, and custom avatars. On 13 October 2005, Yahoo! announced that Yahoo! Messenger and MSN Messenger would become interoperable.

Yahoo! Mobile
Yahoo! Mobile is a mobile website used predominantly in the UK. It offers mobile downloads such as ringtones.

Yahoo! Movies
Yahoo! Movies offers showtimes, movie trailers, movie information, gossip, and others.

Yahoo! Music
Yahoo! Music offers music videos and internet radio (LAUNCHcast), a for-fee service known as Yahoo! Music Unlimited, and the Yahoo! Music Engine, which was sold to Rhapsody on 31 October 2008.

Yahoo! News
Yahoo! News offers news updates and top stories, including world, national, business, entertainment, sports, weather, technology, and weird news.

Yahoo! OMG
OMG is a Yahoo! Entertainment online tabloid with most content provided by Access Hollywood and X17.

Yahoo! Parental Controls
Yahoo! Parental Controls are special controls that parents can set for their children, closely associated with Yahoo! Kids.

Yahoo! Personals
Yahoo! Personals is an online dating service with both free and paid versions. However, the free service is limited, as only paying users can contact users they meet through Yahoo! Personals and exchange contact information.

Yahoo! Pipes
Yahoo Pipes is a free RSS mashup visual editor and hosting service.

Yahoo! Publisher Network
Yahoo! Publisher Network is an advertising program, which is currently in beta and only accepts US publishers.

Yahoo! Real Estate
Yahoo! Real Estate offers real estate-related information and allows users to find rentals, new houses, real estate agents, mortgages and more.

Yahoo! Search
Yahoo! Search is a search engine which competes with MSN Search and market leader Google. Yahoo! relied on Google results from 26 June 2000 to 18 February 2004, but returned to using its own technology after acquiring Inktomi and Overture (which owned AlltheWeb and AltaVista). Yahoo! Search uses a crawler named Yahoo! Slurp.

Yahoo! Search Marketing
Yahoo! Search Marketing provides pay per click inclusion of links in search engine result lists, and also delivers targeted ads. The service was previously branded as Overture Services after Yahoo! acquired Overture in 2003.

Yahoo! Shopping
Yahoo! Shopping is a price comparison service that allows users to search for products and compare prices of various online stores.

Yahoo! Small Business
Yahoo! Small Business offers web hosting, domain names and e-commerce services for small businesses.

Yahoo! Smush.it
Yahoo! Smush.it optimizes digital images by removing unnecessary bytes and reducing their file size.

Yahoo! Sports
Yahoo! Sports offers sports news, including scores, statistics, and fixtures. It includes a "fantasy team" game.

Yahoo! Travel
Yahoo! Travel offers travel guides, booking and reservation services.

Yahoo! TV
Yahoo! TV offers TV listings and lets users schedule recordings on a TiVo box remotely.

Yahoo! Video
Yahoo! Video is a video sharing site.

Yahoo! Voice
Yahoo! Voice was formerly known as Dialpad. It is a Voice over IP PC-PC, PC-Phone and Phone-to-PC service.

Yahoo! Web Analytics
IndexTools was acquired by Yahoo! and re-branded as 'Yahoo! Web Analytics'.

Yahoo! Widgets
Yahoo! Widgets is a cross-platform desktop widget runtime environment. The software was previously distributed as a commercial product called 'Konfabulator' for Mac OS X and Windows until it was acquired by Yahoo! and rebranded 'Yahoo! Widgets' and made available for free.

Yahoo! 360° Plus Vietnam
A social networking service popular in Vietnam.

General Search Engine Information....

Saturday, November 27, 2010
History of search engines

In the early days of Internet development, its users were a privileged minority and the amount of available information was relatively small. Access was mainly restricted to employees of various universities and laboratories who used it to access scientific information. In those days, the problem of finding information on the Internet was not nearly as critical as it is now. Site directories were one of the first methods used to facilitate access to information resources on the network. Links to these resources were grouped by topic. Yahoo, opened in April 1994, was the first project of this kind. As the number of sites in the Yahoo directory inexorably increased, the developers of Yahoo made the directory searchable. Of course, it was not a search engine in its true form because searching was limited to those resources whose listings were put into the directory. It did not actively seek out resources and the concept of seo was yet to arrive. Such link directories have been used extensively in the past, but nowadays they have lost much of their popularity. The reason is simple – even modern directories with lots of resources only provide information on a tiny fraction of the Internet.

For example, the largest directory on the network is currently DMOZ (or the Open Directory Project). It contains information on about five million resources. Compare this with the Google search engine database containing more than eight billion documents. The WebCrawler project started in 1994 and was the first full-featured search engine. The Lycos and AltaVista search engines appeared in 1995 and for many years AltaVista was the major player in this field. In 1997 Sergey Brin and Larry Page created Google as a research project at Stanford University. Google is now the most popular search engine in the world.

Currently, there are three leading international search engines – Google, Yahoo and MSN Search. They each have their own databases and search algorithms. Many other search engines use results originating from these three major search engines and the same seo expertise can be applied to all of them. For example, the AOL search engine (search.aol.com) uses the Google database while AltaVista, Lycos and AllTheWeb all use the Yahoo database.

Common search engine principles

To understand seo you need to be aware of the architecture of search engines. They all contain the following main components:

Spider – a browser-like program that downloads web pages.

Crawler – a program that automatically follows all of the links on each web page.

Indexer – a program that analyzes web pages downloaded by the spider and the crawler.

Database – storage for downloaded and processed pages.

Results engine – extracts search results from the database.

Web server – a server that is responsible for interaction between the user and the other search engine components.

Specific implementations of search mechanisms may differ. For example, the Spider+Crawler+Indexer component group might be implemented as a single program that downloads web pages, analyzes them and then uses their links to find new resources. However, the components listed are inherent to all search engines and the seo principles are the same.

Spider.

This program downloads web pages just like a web browser. The difference is that a browser displays the information presented on each page (text, graphics, etc.) while a spider does not have any visual components and works directly with the underlying HTML code of the page. You may already know that there is an option in standard web browsers to view source HTML code.

Crawler.

This program finds all links on each page. Its task is to determine where the spider should go either by evaluating the links or according to a predefined list of addresses. The crawler follows these links and tries to find documents not already known to the search engine.

Indexer.

This component parses each page and analyzes the various elements, such as text, headers, structural or stylistic features, special HTML tags, etc.

Database.

This is the storage area for the data that the search engine downloads and analyzes. Sometimes it is called the index of the search engine.

Results engine.

The results engine ranks pages. It determines which pages best match a user's query and in what order the pages should be listed.

This is done according to the ranking algorithms of the search engine. It follows that page rank is a valuable and interesting property and any seo specialist is most interested in it when trying to improve his site's search results. In this article, we will discuss the seo factors that influence page rank in some detail.

Web server.

The search engine web server usually contains an HTML page with an input field where the user can specify the search query he or she is interested in. The web server is also responsible for displaying search results to the user in the form of an HTML page.
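
The description above maps naturally onto a small program. Below is a purely illustrative sketch in Python that folds the spider, crawler and indexer into one script, just as the article notes real implementations sometimes do. The start URL, the page limit and the final lookup word are made up, and a real engine is vastly more sophisticated.

from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkAndTextParser(HTMLParser):
    """Pulls out the two things the components above care about:
    the links on a page and the words on a page."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.words.extend(data.lower().split())

def crawl(start_url, max_pages=5):
    index = defaultdict(set)              # word -> set of URLs: the "database"
    queue = deque([start_url])
    seen = {start_url}
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:                              # the "spider": download the page
            html = urlopen(url, timeout=5).read().decode("utf-8", "ignore")
        except (OSError, ValueError):
            continue
        fetched += 1
        parser = LinkAndTextParser()
        parser.feed(html)
        for word in parser.words:         # the "indexer": record the words
            index[word].add(url)
        for link in parser.links:         # the "crawler": queue new links
            absolute = urljoin(url, link)
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index

index = crawl("http://www.example.com/")
print(sorted(index.get("example", [])))   # a toy "results engine" lookup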

List of Google services

Friday, November 19, 2010

1. Google Search
2. Google AdWords
3. Google AdSense
4. Google Apps
5. Google Analytics
6. Google Maps
7. Google Webmaster Tools
8. Google Sites
9. Google FeedBurner
10. Google Picasa
11. Google Orkut
12. Google Gmail
13. Google Labs
14. Google Earth
15. Google Local Business Center
16. Google Books Library
17. GTalk
18. Google Blogger
19. Google Docs
20. Google Trends
21. Google Global
22. Google Checkout
23. Google Pack
24. Google Calendar
25. Google Desktop

Do you know about Clickthrough rate (CTR)..?

Friday, November 19, 2010

Clickthrough rate (CTR) is the number of clicks your ad receives divided by the number of times your ad is shown (impressions). Your ad and keyword each have their own CTRs, unique to your own campaign performance.
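
As a quick illustration, the calculation is just a division; the click and impression figures below are invented, not real campaign data.

# Hypothetical figures for one keyword
clicks = 12
impressions = 800
ctr = clicks / impressions
print(f"CTR = {ctr:.2%}")   # prints "CTR = 1.50%"
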
A keyword's CTR is a strong indicator of its relevance to the user and the overall success of the keyword. For example, a well targeted keyword that shows a similarly targeted ad is more likely to have a higher CTR than a general keyword with non-specific ad text. The more your keywords and ads relate to each other and to your business, the more likely a user is to click on your ad after searching on your keyword phrase.
A low CTR may point to poor keyword performance, indicating a need for ad or keyword optimization. Therefore, you can use CTR to gauge which ads and keywords aren't performing as well for you and then optimize them.
CTR is also used to determine your keyword's Quality Score. Higher CTR and Quality Score can lead to lower costs and higher ad position.

Dynamic content used to be a red flag for search engine friendly design

Thursday, November 11, 2010

Dynamic content used to be a red flag for search engine friendly design, but times have changed. Search engines now include dynamically-generated pages in their indexes, but some particulars of dynamic pages can still be obstacles to getting indexed. Whether it’s keeping in synch with inventory or updating a blog, more than likely if you’re a website owner you have some level of dynamic or CMS-managed content on your site (and if not, you should really be looking into it for your next redesign). Follow the guidelines here to avoid major pitfalls and ensure that your dynamic body of work is search engine friendly from head to toe.

Rule #1: Be sure that search engines can follow regular HTML links to all pages on your site.

Any website needs individually linkable URLs for all unique pages on the site. This way every page can be bookmarked and deep linked by users, and indexed by search engines. But dynamic websites have an additional concern: making sure the search engine robots can reach all of these pages.

For example, suppose you have a form on your website: you ask people to select their location from a pull-down, and then when people submit the form your website generates a page with content that is specifically written for that geographical area. Search engine robots don't fill out forms or select from pull-down menus, so there will be no way for them to get to that page.

This problem can be easily remedied by providing standard type HTML links that point to all of your dynamic pages. The easiest way to do this is to add these links to your site map.

Rule #2: Set up an XML site map if you can’t create regular HTML links to all of your pages, or if it appears that search engines are having trouble indexing your pages.

If you have a large (10K pages or more) dynamic site, or you don’t think that providing static HTML links is an option, you can use an XML site map to tell search engines the locations of all your pages.

Most website owners tell Google and Yahoo! about their site maps through the search engines' respective webmaster tools. But if you're an early adopter, you should look into the new system whereby a site map can be easily designated in the robots.txt file using sitemap autodiscovery. Ask.com, Google and Yahoo! currently support this feature. Cool!

Rule #3: If you must use dynamic URLs, keep them short and tidy

Another potential problem - and this is one that is subject to some debate - is with dynamic pages that have too many parameters in the URL. Google itself in its webmaster guidelines states the following: "If you decide to use dynamic pages (i.e., the URL contains a "?" character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few."

Here are a few guidelines you should follow for your website parameters:

Limit the number of parameters in the URL to a maximum of 2

Use the parameter "?id=" only when in reference to a session id

Be sure that the URL functions if all dynamic items are removed

Be sure your internal links are consistent - always link with parameters in the same order and format (a small sketch of one way to do this follows this list)
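
As promised above, here is one small, illustrative way to keep parameter order consistent, using Python's standard urllib.parse module. The sample URL is made up; a real site would apply something like this wherever internal links are generated.

from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def normalize_url(url):
    """Sort the query parameters so every internal link to the same page
    uses the same order and format."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(params), parts.fragment))

print(normalize_url("http://www.example.com/products.php?id=9876&cat=12"))
# http://www.example.com/products.php?cat=12&id=9876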

Rule #4: Avoid dynamic-looking URLs if possible

Besides being second-class citizens of search, dynamic-looking URLs are also less attractive to your human visitors. Most people prefer to see URLs that clearly communicate the content on the page. Since reading the URL is one of the ways that people decide whether to click on a listing in search engines, you are much better off having a URL that looks like this:

www.example.com/discount-church-bells.html

rather than this:

www.example.com/products.php?cat=12&id=9876&sessionid=43794

We also think that static-looking, “human-readable” URLs are more likely to receive inbound links, because some people will be less inclined to link to pages with very long or complicated URLs.

Furthermore, keywords in a URL are a factor, admittedly not a huge one, in search engine ranking algorithms. Notice how, in the above example, the static URL contains the keywords “discount” and “church bells” while the dynamic URL does not.

There are many tools available that will re-create a dynamic site in static form. There are also tools that will re-write your URLs, if you have too many parameters, to "look" like regular non-dynamic URLs. We think these are both good options for dynamic sites. Intrapromote has a helpful post on dynamic URL rewriting.

Rule #5: De-index stubs and search results

Have you heard of “website stubs?” These are pages that are generated by dynamic sites but really have no independent content on them. For example, if your website is a shopping cart for toys, there may be a page generated for the category “Age 7-12 Toys” but you may not actually have any products in this category. Stub pages are very annoying to searchers, and search engines, by extension, would like to prevent them from displaying in their results. So do us all a favor and either figure out a way to get rid of these pages, or exclude them from indexing using the robots.txt file or robots meta tag.

Search results from within your website are another type of page for which Google has stated a dislike: “Typically, web search results don’t add value to users, and since our core goal is to provide the best search results possible, we generally exclude search results from our web search index.” Here’s our advice: either make sure your search results pages add value for the searcher (perhaps by containing some unique content related to the searched term), or exclude them from indexing using the robots.txt file or robots meta tag.

Bonus Points: Handling duplicate content

While it's not a problem that's specific to dynamic sites, this rule is one that dynamic sites are more likely to break than static ones. If multiple pages on your site display materials that are identical or nearly identical, duplicates should be excluded from indexing using the robots.txt file or a robots meta tag. Think of it this way: you don’t want all your duplicate pages competing with each other on the search engines. Choose a favorite, and exclude the rest. [Editor's note: we no longer (2009) recommend de-indexing duplicate content. A better approach is to either redirect your duplicate pages to the primary page using a server-side 301 redirect, or to set up a canonical tag for any page that has been duplicated. A good explanation of best practices for handling duplicate content in 2009 can be found at Matt Cutts' Blog]

Dynamic content is usually timely and useful, which is why users love it, and the search engines want to list it. And now you know how to help your dynamic website reach its full search engine potential.



Web Crawling.............!!!

Monday, November 1, 2010

When most people talk about Internet search engines, they really mean World Wide Web search engines. Before the Web became the most visible part of the Internet, there were already search engines in place to help people find information on the Net. Programs with names like "gopher" and "Archie" kept indexes of files stored on servers connected to the Internet, and dramatically reduced the amount of time required to find programs and documents. In the late 1980s, getting serious value from the Internet meant knowing how to use gopher, Archie, Veronica and the rest.

Today, most Internet users limit their searches to the Web, so we'll limit this article to search engines that focus on the contents of Web pages.

Before a search engine can tell you where a file or document is, it must be found. To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites. When a spider is building its lists, the process is called Web crawling. (There are some disadvantages to calling part of the Internet the World Wide Web -- a large set of arachnid-centric names for tools is one of them.) In order to build and maintain a useful list of words, a search engine's spiders have to look at a lot of pages.

How does any spider start its travels over the Web? The usual starting points are lists of heavily used servers and very popular pages. The spider will begin with a popular site, indexing the words on its pages and following every link found within the site. In this way, the spidering system quickly begins to travel, spreading out across the most widely used portions of the Web.


Google began as an academic search engine. In the paper that describes how the system was built, Sergey Brin and Lawrence Page give an example of how quickly their spiders can work. They built their initial system to use multiple spiders, usually three at one time. Each spider could keep about 300 connections to Web pages open at a time. At its peak performance, using four spiders, their system could crawl over 100 pages per second, generating around 600 kilobytes of data each second.

Keeping everything running quickly meant building a system to feed necessary information to the spiders. The early Google system had a server dedicated to providing URLs to the spiders. Rather than depending on an Internet service provider for the domain name server (DNS) that translates a server's name into an address, Google had its own DNS, in order to keep delays to a minimum.

When the Google spider looked at an HTML page, it took note of two things:

  • The words within the page
  • Where the words were found

Words occurring in the title, subtitles, meta tags and other positions of relative importance were noted for special consideration during a subsequent user search. The Google spider was built to index every significant word on a page, leaving out the articles "a," "an" and "the." Other spiders take different approaches.

These different approaches usually attempt to make the spider operate faster, allow users to search more efficiently, or both. For example, some spiders will keep track of the words in the title, sub-headings and links, along with the 100 most frequently used words on the page and each word in the first 20 lines of text. Lycos is said to use this approach to spidering the Web.

Other systems, such as AltaVista, go in the other direction, indexing every single word on a page, including "a," "an," "the" and other "insignificant" words. The push to completeness in this approach is matched by other systems in the attention given to the unseen portion of the Web page, the meta tags.
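
To make the "which words and where" idea concrete, here is a tiny, illustrative index in Python that records word positions and skips the articles, roughly in the spirit of the Google approach described above. The URL and sample text are invented.

from collections import defaultdict

STOP_WORDS = {"a", "an", "the"}

def index_page(url, text, index):
    # Record (url, position) for every significant word on the page
    for position, word in enumerate(text.lower().split()):
        if word not in STOP_WORDS:
            index[word].append((url, position))

index = defaultdict(list)
index_page("http://www.example.com/bells", "The big bell and the small bell", index)
print(index["bell"])
# [('http://www.example.com/bells', 2), ('http://www.example.com/bells', 6)]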


Use Of Robots.txt File ................. !

Thursday, October 28, 2010


Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.
It works like this: a robot wants to visit a Web site URL, say http://www.example.com/index.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:
User-agent: *
Disallow: /
The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
There are two important considerations when using /robots.txt:
  • robots can ignore your /robots.txt. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers, will pay no attention.
  • the /robots.txt file is a publicly available file. Anyone can see what sections of your server you don't want robots to use.
So don't try to use /robots.txt to hide information.
See also:
o Can I block just bad robots?
o Why did this robot ignore my /robots.txt?
o What are the security implications of /robots.txt?
The details
The /robots.txt is a de-facto standard, and is not owned by any standards body. There are two historical descriptions:
the original 1994 A Standard for Robot Exclusion document.
a 1997 Internet Draft specification A Method for Web Robots Control
In addition there are external resources:
HTML 4.01 specification, Appendix B.4.1
Wikipedia - Robots Exclusion Standard
The /robots.txt standard is not actively developed. See What about further development of /robots.txt? for more discussion.
The rest of this page gives an overview of how to use /robots.txt on your server, with some simple recipes. To learn more see also the FAQ.
How to create a /robots.txt file
Where to put it
The short answer: in the top-level directory of your web server.
The longer answer:
When a robot looks for the "/robots.txt" file for a URL, it strips the path component from the URL (everything from the first single slash) and puts "/robots.txt" in its place.
For example, for "http://www.example.com/shop/index.html", it will remove "/shop/index.html", replace it with "/robots.txt", and end up with "http://www.example.com/robots.txt".
So, as a web site owner you need to put it in the right place on your web server for that resulting URL to work. Usually that is the same place where you put your web site's main "index.html" welcome page. Where exactly that is, and how to put the file there, depends on your web server software.
Remember to use all lower case for the filename: "robots.txt", not "Robots.TXT".
See also:
  • What program should I use to create /robots.txt?
  • How do I use /robots.txt on a virtual host?
  • How do I use /robots.txt on a shared host?
What to put in it

The "/robots.txt" file is a text file, with one or more records. It usually contains a single record looking like this:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/

In this example, three directories are excluded.
Note that you need a separate "Disallow" line for every URL prefix you want to exclude -- you cannot say "Disallow: /cgi-bin/ /tmp/" on a single line. Also, you may not have blank lines in a record, as they are used to delimit multiple records.

Note also that globbing and regular expressions are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "User-agent: *bot*", "Disallow: /tmp/*" or "Disallow: *.gif".

What you want to exclude depends on your server. Everything not explicitly disallowed is considered fair game to retrieve. Here follow some examples:

To exclude all robots from the entire server
User-agent: *
Disallow: /

To allow all robots complete access
User-agent: *
Disallow:
(or just create an empty "/robots.txt" file, or don't use one at all)

To exclude all robots from part of the server
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/

To exclude a single robot
User-agent: BadBot
Disallow: /

To allow a single robot
User-agent: Google
Disallow:

User-agent: *
Disallow: /

To exclude all files except one
This is currently a bit awkward, as there is no "Allow" field. The easy way is to put all files to be disallowed into a separate directory, say "stuff", and leave the one file in the level above this directory:
User-agent: *
Disallow: /~joe/stuff/

Alternatively you can explicitly disallow all disallowed pages:

User-agent: *
Disallow: /~joe/junk.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html
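
For completeness, here is a small sketch of how a well-behaved crawler written in Python can apply these rules before fetching anything, using the standard library's urllib.robotparser module. The host name and robot names are just placeholders.

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("http://www.example.com/robots.txt")
rp.read()   # downloads and parses the robots.txt file

# can_fetch(useragent, url) answers "may this robot visit this URL?"
print(rp.can_fetch("BadBot", "http://www.example.com/index.html"))
print(rp.can_fetch("*", "http://www.example.com/cgi-bin/report"))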


How XML sitemap is useful in Search Engine Optimization?

Wednesday, October 27, 2010
Search engine sitemaps are organized structures of websites' pages, usually useful to search engines for better content indexing. A search engine sitemap (meant only for search engines to see) is a tool provided by Google, Yahoo!, and MSN to allow webmasters to suggest how often, and in what order, a search engine should spider each page within their websites.
Google Sitemaps has become a popular tool in the world of webmasters. The service helps them continually submit fresh content to the search engine.

How to build Sitemaps?

Creating a search engine sitemap is quite simple thanks to a free online tool at XML-Sitemaps.com that will automatically spider your website and create the sitemap for you. It spiders up to 500 pages per site for free.

Alternative tools :

1. Simply create a sitemap with the free Search Engine Sitemap Generator and upload it to your server.
2. The Google Sitemap Generator is an open source tool that can help you build a sitemap from scratch. It is a Python script that creates valid search engine sitemaps for your sites using the Sitemaps protocol. (A minimal hand-rolled alternative is sketched just below.)
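
If you would rather see what such a script boils down to, here is a minimal hand-rolled sketch in Python using the standard xml.etree.ElementTree module. The URLs, change frequencies and priorities are placeholders; real generators also add last-modified dates and can crawl your site to discover the URLs for you.

import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Placeholder pages: (URL, change frequency, priority)
pages = [
    ("http://www.example.com/", "daily", "1.0"),
    ("http://www.example.com/about.html", "monthly", "0.5"),
]

urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
for loc, changefreq, priority in pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "changefreq").text = changefreq
    ET.SubElement(url, "priority").text = priority

# Writes a valid sitemap.xml in the current directory
ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)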

Upload the generated file (say sitemap.xml) to your Web site, and tell Google how to find it. Note the file name, and which URL you will use to find it. If you upload it to the root of your domain, then it will be http://www.seo-expert-gaurav.blogspot.com/sitemap.xml.

Now log into your Google sitemap account (Google Webmaster Tools), and point Google to the sitemap stored on your site, using the URL that you already know.

Should you get stuck at any point, feel free to browse through Google's official documentation and tutorials on the subject at Using the Sitemap Generator.

Advantages of search engine site map :

The advantage of Search Engine Sitemaps (or XML sitemaps) over a normal “page of links” sitemap is that you can:

1. Specify the priority of pages to be crawled and/or indexed.
2. Exclude lower priority pages.
3. Ensure that Search Engines know about every page on your website.

Until recently, only Google, Yahoo and MSN supported this protocol. Today, however, there is a new member of the family: Ask.com.

You also no longer have to submit your sitemap to each engine separately: Vanessa announced that all of these Search Engines have now agreed to accept Sitemap submissions through the robots.txt file on your server.

The robots.txt file is a Search Engine industry standard file that is the VERY FIRST file a LEGIT Search Engine will view when it first comes to your website. Now, you can simply add your sitemap URL to this file in the form of:
Sitemap: http://www.seo-expert-gaurav.blogspot.com/sitemap.xml

Simply create a sitemap with the free Search Engine Sitemap Generator and upload it to your server. Then open the robots.txt file on your server and add the address as above. This makes it as simple as ever to ensure that all Search Engines know about your site, know what pages are in your site and know what pages of your site to list in the search results.

How to Check Your Google Sitemap Reports :

Google will identify any errors in your site and publish the results to you in the form of a report.
Steps to check your sitemap report:
1. Visit the Google Web master tools section of the site. This can be found at www.google.com/webmasters.
2. Add your site to Google if you haven't already done so.
3. Verify that you are the site's owner by either uploading an HTML file to your site or adding a Meta tag.
4. View the statistics and report that Google has already generated about your sitemap. It details the last time the spider went to your page. If your site is not new, then chances are Google has already crawled your site.
5. Change the option to "Enable Google Page Rank" when prompted in the install process of the program. Then hit Finish and wait.
6. Click the Add sitemap link to create a new Google sitemap.
7. Enter your sitemap to tell Google all about your pages.
8. Visit again to view the reports on your pages.


What is back link or inbound links ??

Friday, October 22, 2010
Backlinks are links to a website or web page. Inbound links were originally important (prior to the emergence of search engines) as a primary means of web navigation; today their significance lies in search engine optimization (SEO). The number of backlinks is one indication of the popularity or importance of that website or page (for example, this is used by Google to determine the PageRank of a webpage). Outside of SEO, the backlinks of a webpage may be of significant personal, cultural or semantic interest: they indicate who is paying attention to that page.
In basic link terminology, a backlink is any link received by a web node (web page, directory, website, or top level domain) from another web node [1]. Backlinks are also known as incoming links, inbound links, inlinks, and inward links.

Search engine rankings
Search engines often use the number of backlinks that a website has as one of the most important factors for determining that website's search engine ranking, popularity and importance. Google's description of their PageRank system, for instance, notes that Google interprets a link from page A to page B as a vote, by page A, for page B.[2] Knowledge of this form of search engine rankings has fueled a portion of the SEO industry commonly termed linkspam, where a company attempts to place as many inbound links as possible to their site regardless of the context of the originating site.
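
To make the "link as a vote" idea concrete, here is a toy power-iteration sketch in Python over a hand-made three-page link graph. It is purely illustrative: the graph, damping factor and iteration count are arbitrary, and real ranking combines PageRank-style scores with many other signals.

links = {            # page -> pages it links to
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}
damping = 0.85
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):  # iterate until the values settle
    new_rank = {}
    for p in pages:
        votes = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
        new_rank[p] = (1 - damping) / len(pages) + damping * votes
    rank = new_rank

print({p: round(score, 3) for p, score in rank.items()})
# Page C receives the most "votes" here, so it ends up with the highest rank
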
Websites often employ various techniques (called search engine optimization, usually shortened to SEO) to increase the number of backlinks pointing to their website. Some methods are free for use by everyone whereas others, like linkbaiting, require quite a bit of planning and marketing to work. Some websites stumble upon "linkbaiting" naturally; the sites that are the first with a tidbit of 'breaking news' about a celebrity are good examples of that. When "linkbait" happens, many websites will link to the 'baiting' website because there is information there that is of extreme interest to a large number of people.
There are several factors that determine the value of a backlink. Backlinks from authoritative sites on a given topic are highly valuable.[3] If both sites have content geared toward the keyword topic, the backlink is considered relevant and believed to have strong influence on the search engine rankings of the webpage granted the backlink. A backlink represents a favorable 'editorial vote' for the receiving webpage from another granting webpage. Another important factor is the anchor text of the backlink. Anchor text is the descriptive labeling of the hyperlink as it appears on a webpage. Search engine bots (i.e., spiders, crawlers, etc.) examine the anchor text to evaluate how relevant it is to the content on a webpage. Anchor text and webpage content congruency are highly weighted in search engine results page (SERP) rankings of a webpage with respect to any given keyword query by a search engine user.
Increasingly, inbound links are being weighed against link popularity and originating context. This transition is reducing the notion of one link, one vote in SEO, a trend proponents hope will help curb linkspam as a whole.
It should also be noted that building too many backlinks over a short period of time can get a website's ranking penalized and, in extreme cases, the website de-indexed altogether. Anything above a couple of hundred a day is considered "dangerous".
Technical
When HTML (Hyper Text Markup Language) was designed, there was no explicit mechanism in the design to keep track of backlinks in software, as this carried additional logistical and network overhead.
Some website software internally keeps track of backlinks. Examples of this include most wiki and CMS software.
Most commercial search engines provide a mechanism to determine the number of backlinks they have recorded to a particular web page. For example, Google can be searched using link:wikipedia.org to find the number of pages on the Web pointing to http://wikipedia.org/. Google only shows a small fraction of the number of links pointing to a site. It credits many more backlinks than it shows for each website.
Other mechanisms have been developed to track backlinks between disparate webpages controlled by organizations that aren't associated with each other. The most notable example of this is TrackBacks between blogs.


Page Rank Based On Popularity

Saturday, October 16, 2010

The web search technology offered by Google is often the technology of choice of the world’s leading portals and websites. It has also benefited the advertisers with its unique advertising program that does not hamper the web surfing experience of its users but still brings revenues to the advertisers.


When you search for a particular keyword or a phrase, most of the search engines return a list of pages in order of the number of times the keyword or phrase appears on the website. Google web search technology involves the use of its indigenously designed Page Rank Technology and hypertext-matching analysis, which makes several instantaneous calculations without any human intervention. Google’s structural design also expands simultaneously as the internet expands.


Page Rank technology involves the use of an equation with more than 500 million variables and 3 billion terms, which produces an objective measurement of the significance of web pages. Unlike some other search engines, Google does not simply count links, but utilizes the extensive link structure of the web as an organizational tool. When Page A links to another page, let’s say Page B, that link is treated as a vote for Page B on behalf of Page A.





GOOGLE Algorithm Is Key

Saturday, October 16, 2010

Google has a comprehensive and highly developed technology, a straightforward interface and a wide-ranging array of search tools which enable the users to easily access a variety of information online.

Google users can browse the web and find information in various languages, retrieve maps, stock quotes and news, search for a long-lost friend using the phonebook listings available on Google for all US cities, and basically surf the 3 billion odd web pages on the internet! Google boasts the world’s largest archive of Usenet messages, dating all the way back to 1981. Google’s technology can be accessed from any conventional desktop PC as well as from various wireless platforms such as WAP and i-mode phones, handheld devices and other such Internet-equipped gadgets.


Ultimate Benefits of SEO

Saturday, October 16, 2010

When selecting search engine optimization to promote your business on the internet, one must know the ultimate benefits of an seo campaign.

Perspective (Global / Regional)
By selecting keywords or phrases that target your audience, search engine optimization ensures that you and your company are found globally or regionally by those who require exactly what you offer. SEO has many benefits for any organization which wants to reach all potential customers locally or globally. You can reach the targeted customers of your own choice.

Targeted Traffic
A search engine optimization campaign can increase the number of visitors to your website for the targeted keyword(s) or phrase. Converting those visitors into potential customers is one of the arts of search engine optimization. Search engine optimization is the only campaign which can drive targeted traffic to your website. Essentially, more targeted traffic equals more sales.

Increase Visibility
Once a website has been optimized, it will increase the visibility of your website in search engines. More people will visit your website and it will give international recognition to your products/services.

High ROI (Return on Investment)
An effective SEO campaign can bring a higher return on your investment than any other type of marketing for your company. This will therefore increase your volume of sales and profit overall.

Long term positioning
Once a website obtains a position through an SEO campaign, it should stay there for the long term, as opposed to PPC (Pay Per Click). SEO is a cheaper and longer-term solution than any other search engine marketing strategy.

Cost-effective
One of the great benefits of search engine optimization is that it is cost effective and requires the minimum amount of capital for the maximum exposure of your website.

Flexibility
It is possible to reach an audience of your own choice through a seo campaign. You can get traffic according to the organizational strategy to meet the needs and requirements of your choice.

Measurable results
It is a unique quality of seo campaigns that you can quantify the results of seo through search engine positioning reports, visitor conversion and other factors of this nature.


Types of Search Engines

Friday, October 15, 2010
Three Types of Search Engines

The term "search engine" is often used generically to describe crawler-based search engines, human-powered directories, and hybrid search engines. These types of search engines gather their listings in different ways.

Crawler-based search engines

Crawler-based search engines, such as Google (http://www.google.com), create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found. If web pages are changed, crawler-based search engines eventually find these changes, and that can affect how those pages are listed. Page titles, body copy and other elements all play a role.
The life span of a typical web query normally lasts less than half a second, yet involves a number of different steps that must be completed before results can be delivered to a person seeking information. The following steps illustrate this life span (adapted from http://www.google.com/corporate/tech.html):
1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book - it tells which pages contain the words that match the query.
2. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.

3. The search results are returned to the user in a fraction of a second.

Human-powered directories

A human-powered directory, such as the Open Directory Project (http://www.dmoz.org/about.html) depends on humans for its listings. (Yahoo!, which used to be a directory, now gets its information from the use of crawlers.) A directory gets its information from submissions, which include a short description to the directory for the entire site, or from editors who write one for sites they review. A search looks for matches only in the descriptions submitted. Changing web pages, therefore, has no effect on how they are listed. Techniques that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.

Hybrid search engines

Today, it is extremely common for crawler-type and human-powered results to be combined when conducting a search. Usually, a hybrid search engine will favor one type of listings over another. For example, MSN Search (http://www.imagine-msn.com/search/tour/moreprecise.aspx) is more likely to present human-powered listings from LookSmart (http://search.looksmart.com/). However, it also presents crawler-based results, especially for more obscure queries.
