Ads

Tuesday, July 31, 2012

What’s “webspam,” as Google calls it, or search spam? Pages that try to gain better rankings through things like:


  • Keyword stuffing
  • Link schemes
  • Cloaking, “sneaky” redirects or “doorway” pages
  • Purposeful duplicate content

Keyword stuffing

"Keyword stuffing" refers to the practice of loading a webpage with keywords in an attempt to manipulate a site's ranking in Google's search results. Filling pages with keywords results in a negative user experience, and can harm your site's ranking. Focus on creating useful, information-rich content that uses keywords appropriately and in context.

To fix this problem, review your site for misused keywords. Typically, these will be lists or paragraphs of keywords, often randomly repeated. Check carefully, because keywords can often be in the form of hidden text, or they can be hidden in title tags or alt attributes.

Once you've made your changes and are confident that your site no longer violates our guidelines, submit your site for reconsideration.

Link schemes

Your site's ranking in Google search results is partly based on analysis of those sites that link to you. The quantity, quality, and relevance of links count towards your rating. The sites that link to you can provide context about the subject matter of your site, and can indicate its quality and popularity. However, some webmasters engage in link exchange schemes and build partner pages exclusively for the sake of cross-linking, disregarding the quality of the links, the sources, and the long-term impact it will have on their sites. This is in violation of Google's Webmaster Guidelines and can negatively impact your site's ranking in search results. Examples of link schemes can include:
  • Links intended to manipulate PageRank
  • Links to web spammers or bad neighborhoods on the web
  • Excessive reciprocal links or excessive link exchanging ("Link to me and I'll link to you.")
  • Buying or selling links that pass PageRank

The best way to get other sites to create relevant links to yours is to create unique, relevant content that can quickly gain popularity in the Internet community. The more useful content you have, the greater the chances someone else will find that content valuable to their readers and link to it. Before making any single decision, you should ask yourself the question: Is this going to be beneficial for my page's visitors?

It is not only the number of links you have pointing to your site that matters, but also the quality and relevance of those links. Creating good content pays off: Links are usually editorial votes given by choice, and the buzzing blogger community can be an excellent place to generate interest.

Once you've made your changes and are confident that your site no longer violates our guidelines, submit your site for reconsideration.

Cloaking, sneaky JavaScript redirects, and doorway pages

  • Cloaking

What is cloaking?

Cloaking refers to the practice of presenting different content or URLs to users and search engines. Serving up different results based on user-agent may cause your site to be perceived as deceptive and removed from the Google index.
Some examples of cloaking include:

  • Serving a page of HTML text to search engines, while showing a page of images or Flash to users.
  • Serving different content to search engines than to users.
If your site contains elements that aren't crawlable by search engines (such as rich media files other than Flash, JavaScript, or images), you shouldn't provide cloaked content to search engines. Rather, you should consider visitors to your site who are unable to view these elements as well. For instance:

  • Provide alt text that describes images for visitors with screen readers or images turned off in their browsers.
  • Provide the textual contents of JavaScript in a noscript tag.

Ensure that you provide the same content in both elements (for instance, provide the same text in the JavaScript as in the noscript tag). Including substantially different content in the alternate element may cause Google to take action on the site.
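As an illustrative sketch of the parity advice above (the file names and text here are hypothetical, not from Google's guidelines), the JavaScript, noscript, and alt-text elements might look like:

```html
<!-- The same text appears in the script and its noscript fallback -->
<script type="text/javascript">
  document.write("Welcome to our guide to classic cars.");
</script>
<noscript>
  Welcome to our guide to classic cars.
</noscript>

<!-- Descriptive alt text for visitors using screen readers or with images disabled -->
<img src="roadster.jpg" alt="Red 1965 roadster parked at a car show">
```

Because both visitors and crawlers receive the same message, neither technique counts as cloaking.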

  • Sneaky JavaScript redirects

When Googlebot indexes a page containing JavaScript, it will index that page but it may not follow or index any links hidden in the JavaScript itself. Use of JavaScript is an entirely legitimate web practice. However, use of JavaScript with the intent to deceive search engines is not. For instance, placing different text in JavaScript than in a noscript tag violates our Webmaster Guidelines because it displays different content for users (who see the JavaScript-based text) than for search engines (which see the noscript-based text). Along those lines, it violates the Webmaster Guidelines to embed a link in JavaScript that redirects the user to a different page with the intent to show the user a different page than the search engine sees. When a redirect link is embedded in JavaScript, the search engine indexes the original page rather than following the link, whereas users are taken to the redirect target. Like cloaking, this practice is deceptive because it displays different content to users and to Googlebot, and can take a visitor somewhere other than where they intended to go.

Note that placement of links within JavaScript is alone not deceptive. When examining JavaScript on your site to ensure your site adheres to our guidelines, consider the intent.
Keep in mind that since search engines generally can't access the contents of JavaScript, legitimate links within JavaScript will likely be inaccessible to them (as well as to visitors without JavaScript-enabled browsers). You might instead keep links outside of JavaScript or replicate them in a noscript tag.

  • Doorway pages

Doorway pages are typically large sets of poor-quality pages where each page is optimized for a specific keyword or phrase. In many cases, doorway pages are written to rank for a particular phrase and then funnel users to a single destination.

Whether deployed across many domains or established within one domain, doorway pages tend to frustrate users, and are in violation of our Webmaster Guidelines.

Google's aim is to give our users the most valuable and relevant search results. Therefore, we frown on practices that are designed to manipulate search engines and deceive users by directing them to sites other than the ones they selected, and that provide content solely for the benefit of search engines. Google may take action on doorway sites and other sites making use of these deceptive practices, including removing these sites from the Google index.

If your site has been removed from our search results, review our Webmaster Guidelines for more information. Once you've made your changes and are confident that your site no longer violates our guidelines, submit your site for reconsideration.

Duplicate content

Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin. Examples of non-malicious duplicate content could include:

  • Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
  • Store items shown or linked via multiple distinct URLs
  • Printer-only versions of web pages

If your site contains multiple pages with largely identical content, there are a number of ways you can indicate your preferred URL to Google. (This is called "canonicalization.")
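One common canonicalization signal is the rel="canonical" link element, sketched here with a hypothetical URL:

```html
<!-- In the <head> of each duplicate page, point to the preferred URL -->
<link rel="canonical" href="http://www.example.com/page/">
```

Search engines that support the element treat the annotated duplicates as copies of the preferred URL.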

However, in some cases, content is deliberately duplicated across domains in an attempt to manipulate search engine rankings or win more traffic. Deceptive practices like this can result in a poor user experience, when a visitor sees substantially the same content repeated within a set of search results.

Google tries hard to index and show pages with distinct information. This filtering means, for instance, that if your site has a "regular" and "printer" version of each article, and neither of these is blocked with a noindex meta tag, we'll choose one of them to list. In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we'll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results.

There are some steps you can take to proactively address duplicate content issues, and ensure that visitors see the content you want them to.
Use 301s: If you've restructured your site, use 301 redirects ("RedirectPermanent") to smartly redirect users, Googlebot, and other spiders. (In Apache, you can do this with an .htaccess file; in IIS, you can do this through the administrative console.)
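As a sketch, assuming Apache with mod_alias (and mod_rewrite for the second form) enabled, and hypothetical old and new paths, the .htaccess entries might look like:

```apache
# Permanently (301) redirect a single moved page
Redirect permanent /old-page.html http://www.example.com/new-page.html

# Or permanently redirect an entire renamed directory
RewriteEngine On
RewriteRule ^old-dir/(.*)$ /new-dir/$1 [R=301,L]
```

The 301 status tells both browsers and crawlers that the move is permanent, so search engines can transfer signals to the new URL.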

Be consistent: Try to keep your internal linking consistent. For example, don't link to http://www.example.com/page/ and http://www.example.com/page and http://www.example.com/page/index.htm.

Use top-level domains: To help us serve the most appropriate version of a document, use top-level domains whenever possible to handle country-specific content. We're more likely to know that http://www.example.de contains Germany-focused content, for instance, than http://www.example.com/de or http://de.example.com.

Syndicate carefully: If you syndicate your content on other sites, Google will always show the version we think is most appropriate for users in each given search, which may or may not be the version you'd prefer. However, it is helpful to ensure that each site on which your content is syndicated includes a link back to your original article. You can also ask those who use your syndicated material to use the noindex meta tag to prevent search engines from indexing their version of the content.
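The noindex request to syndication partners can be met with a standard robots meta tag, as in this sketch:

```html
<!-- In the <head> of the syndicated copy: ask search engines not to index it -->
<meta name="robots" content="noindex">
```

With the copies excluded from the index, the original article remains the only indexable version.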

Use Webmaster Tools to tell us how you prefer your site to be indexed: You can tell Google your preferred domain (for example, http://www.example.com or http://example.com).

Minimize boilerplate repetition: For instance, instead of including lengthy copyright text on the bottom of every page, include a very brief summary and then link to a page with more details. In addition, you can use the Parameter Handling tool to specify how you would like Google to treat URL parameters.

Avoid publishing stubs: Users don't like seeing "empty" pages, so avoid placeholders where possible. For example, don't publish pages for which you don't yet have real content. If you do create placeholder pages, use the noindex meta tag to block these pages from being indexed.

Understand your content management system: Make sure you're familiar with how content is displayed on your web site. Blogs, forums, and related systems often show the same content in multiple formats. For example, a blog entry may appear on the home page of a blog, in an archive page, and in a page of other entries with the same label.

Minimize similar content: If you have many pages that are similar, consider expanding each page or consolidating the pages into one. For instance, if you have a travel site with separate pages for two cities, but the same information on both pages, you could either merge the pages into one page about both cities or you could expand each page to contain unique content about each city.

Google does not recommend blocking crawler access to duplicate content on your website, whether with a robots.txt file or other methods. If search engines can't crawl pages with duplicate content, they can't automatically detect that these URLs point to the same content and will therefore effectively have to treat them as separate, unique pages. A better solution is to allow search engines to crawl these URLs, but mark them as duplicates by using the rel="canonical" link element, the URL parameter handling tool, or 301 redirects. In cases where duplicate content leads to us crawling too much of your website, you can also adjust the crawl rate setting in Webmaster Tools.

Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.

However, if our review indicates that you engaged in deceptive practices and your site has been removed from our search results, review your site carefully and consult our Webmaster Guidelines for more information. Once you've made your changes and are confident that your site no longer violates our guidelines, submit your site for reconsideration.

In rare situations, our algorithm may select a URL from an external site that is hosting your content without your permission. If you believe that another site is duplicating your content in violation of copyright law, you may contact the site’s host to request removal. In addition, you can request that Google remove the infringing page from our search results by filing a request under the Digital Millennium Copyright Act.

What is a Blogroll?

Thursday, May 3, 2012

How Bloggers Use Blogrolls to Boost Traffic to Their Blogs

A blogroll is a list of links to blogs that the blogger likes. A blogroll is usually included in the blog's sidebar. Some bloggers divide their blogrolls into categories. For example, a blogger who writes about cars could divide his blogroll up into categories for links to other blogs he writes, other blogs about cars and other blogs he likes about unrelated topics. The blogroll can be set up based on each blogger's personal preferences, and it can be updated at any time.
Blogroll Etiquette
It's an unwritten rule in the blogosphere that if a blogger puts a link to your blog in his or her blogroll, you should reciprocate and add that blog's link in your own blogroll. Of course, each blogger approaches this with their own blogging goals in mind. Sometimes you may not like a blog that links to you through its blogroll. There are many reasons why you may decide not to reciprocate a blogroll link, but it's good blogging etiquette to at least review each blog that links to you through its blogroll to determine if you'd like to add that blog to your own blogroll or not.
Blogrolls as Blog Traffic Boosters

Blogrolls are great traffic-driving tools. With each blogroll that your blog is listed on comes the possibility that readers of that blog will click on your link and visit your blog. Blogrolls equate to publicity and exposure across the blogosphere. Additionally, blogs with many incoming links (particularly those from high-quality blogs, as rated by Google PageRank or Technorati authority) are usually ranked higher by search engines, which can bring additional traffic to your blog.
By: Susan Gunelius, About.com Guide



Pages With Too Many Ads “Above The Fold” Now Penalized By Google’s “Page Layout” Algorithm

Friday, April 27, 2012

Do you shove lots of ads at the top of your web pages? Think again. Tired of doing a Google search and landing on these types of pages? Rejoice. Google has announced that it will penalize sites with pages that are top-heavy with ads.

Top Heavy With Ads? Look Out!

The change — called the “page layout algorithm” — takes direct aim at any site with pages where content is buried under tons of ads.
From Google’s post on its Inside Search blog today:
We’ve heard complaints from users that if they click on a result and it’s difficult to find the actual content, they aren’t happy with the experience. Rather than scrolling down the page past a slew of ads, users want to see content right away.
So sites that don’t have much content “above-the-fold” can be affected by this change. If you click on a website and the part of the website you see first either doesn’t have a lot of visible content above-the-fold or dedicates a large fraction of the site’s initial screen real estate to ads, that’s not a very good user experience.
Such sites may not rank as highly going forward.
Google also posted the same information to its Google Webmaster Central blog.
Sites using pop-ups, pop-unders or overlay ads are not impacted by this. It only applies to static ads in fixed positions on pages themselves, Google told me.

How Much Is Too Much?

How can you tell if you’ve got too many ads above-the-fold? When I talked with the head of Google’s web spam team, Matt Cutts, he said that Google wasn’t going to provide an official tool for this, the way it provides tools to tell if your site is too slow (site speed is another ranking signal).
Instead, Cutts told me that Google is encouraging people to make use of its Google Browser Size tool or similar tools to understand how much of a page’s content (as opposed to ads) is visible at first glance to visitors under various screen resolutions.
But how far down the page is too far? That’s left to the publisher to decide for themselves. However, the blog post stresses the change should only hit pages with an abnormally large number of ads above-the-fold, compared to the web as a whole:
We understand that placing ads above-the-fold is quite common for many websites; these ads often perform well and help publishers monetize online content.
This algorithmic change does not affect sites who place ads above-the-fold to a normal degree, but affects sites that go much further to load the top of the page with ads to an excessive degree or that make it hard to find the actual original content on the page.
This new algorithmic improvement tends to impact sites where there is only a small amount of visible content above-the-fold or relevant content is persistently pushed down by large blocks of ads.

Impacts Less Than 1% Of Searches

Clearly, you’re in trouble if you have little-to-no content showing above the fold for commonly-used screen resolutions. You’ll know you’re in trouble shortly, because the change is now going into effect. If you suddenly see a drop in traffic today, and you’re heavy on the ads, chances are you’ve been hit by the new algorithm.
For those ready to panic, Cutts told me the change will impact less than 1% of Google’s searches globally, which today’s post also stresses.

Fixed Your Ads? Penalty Doesn’t Immediately Lift

What happens if you’re hit? Make changes, then wait a few weeks.
Similar to how last year’s Panda Update works, Google is examining sites it finds and effectively tagging them as being too ad-heavy or not. If you’re tagged that way, you get a ranking decrease attached to your entire site (not just particular pages) as part of today’s launch.
If you reduce ads above-the-fold, the penalty doesn’t instantly disappear. Instead, Google will make note of the change when it next visits your site. But it can take several weeks until Google’s next “push” or “update,” when the changes it has found are integrated into its overall ranking system, effectively removing penalties from sites that have changed and adding them to new ones that have been caught.
Google’s post explains this more:
If you decide to update your page layout, the page layout algorithm will automatically reflect the changes as we re-crawl and process enough pages from your site to assess the changes.
How long that takes will depend on several factors, including the number of pages on your site and how efficiently Googlebot can crawl the content.
On a typical website, it can take several weeks for Googlebot to crawl and process enough pages to reflect layout changes on the site.
Our Why Google Panda Is More A Ranking Factor Than Algorithm Update article explains the situation with Panda, and how it took time between when publishers made changes to remove “thin” content to when they were restored to Google’s good graces. That process is just as applicable to today’s change, even though Panda itself now has much less flux.

Meanwhile, Google AdSense Pushes Ads

Ironically, on the same day that Google’s web search team announced this change, I received this message from Google’s AdSense team encouraging me to put more ads on my site:
This was in relation to my personal blog, Daggle. The image in the email suggests that Google thinks content pretty much should be surrounded by ads.
Of course, if you watch the video that Google refers me (and others) to in the email, it promotes careful placement, asks that user experience be considered and, at one point, shows a page top-heavy with ads as an example of what shouldn’t be done.
Still, it’s not hard to find sites using Google’s own AdSense ads that are pushing content as far down on their pages as they can, or trying to hide it. Those pages, AdSense or not, are subject to the new rules, Cutts said.

Pages Ad-Heavy, But Not Top-Heavy With Ads, May Escape

As a searcher, I’m happy with the change. But it might not be perfect. For example, here’s something I tweeted about last year:
Yes, that’s my finger being used as an arrow. I was annoyed that to find the actual download link I was after was surrounded by AdSense-powered ads telling me to download other stuff.
This particular site was heavily used by kids who might easily click on an ad by mistake. That’s potentially bad ROI for those advertisers. Heck, as a net-savvy adult, I found it a challenge.
But the problem here wasn’t that the content was pushed “below the fold” by ads. It was that the ratio of ads was so high in relation to the content (a single link), plus the misleading nature of the ads around the content.

Are Google’s Own Search Results Top Heavy?

Another issue is that ads on Google’s own search results pages push the “content” — the unpaid editorial listings — down toward the bottom of the page. For example, here’s exactly what’s visible on my MacBook Pro’s 1680×1050 screen:
(Side note: that yellow color around the ads in the screenshot? It’s much darker in the screenshot than what I see with my eyes. In reality, the color is so washed out that it might as well be invisible. That’s something some have felt has been deliberately engineered by Google to make ads less noticeable as ads.)
The blue box surrounds the content, the search listings that lead you to actual merchants selling trash cans, in this example. Some may argue that the Google shopping results box is further pushing down the “real content” of listings that lead out of Google. But the shopping results themselves do lead you to external merchants, so I consider them to be content.
The example above is pretty extreme, showing the maximum of three ads that Google will ever show above its search results (with a key exception, below). Even then, there’s content visible, with it making up around half the page or more, if you include the Related Searches area as content.
My laptop’s screen resolution is pretty high, of course. Others would see less (Google’s Browser Size tool doesn’t work to measure its own search results pages). But you can expect Google will take “do as I say, not as I do” criticism on this issue.
Indeed, I shared this story initially with the main details, then started working on this section. After that was done, I could see this type of criticism already happening, both in the comments and over on my Google+ post and Facebook post about the change.
Here’s a screenshot that Daniel Weadley shared in my Google+ post about what he sees on his netbook:
In this example, Google’s doing a rare display of four ads. That’s because it’s showing the maximum of three regular ads it will show with a special Comparison Ads unit on top of those. And that will just add fuel to criticisms that if Google is taking aim at pages top-heavy with ads, it might need to also look closer to home.
NOTE: About three hours after I wrote this, Google clearly saw the criticisms about ads on its own search results pages and sent this statement:
This is a site-based algorithm that looks at all the pages across an entire site in aggregate. Although it’s possible to find a few searches on Google that trigger many ads, it’s vastly more common to have no ads or few ads on a page.
Again, this algorithm change is designed to demote sites that make it difficult for a user to get to the content and offer a bad user experience.
Having an ad above-the-fold doesn’t imply that you’re affected by this change. It’s that excessive behavior that we’re working to avoid for our users.

Algorithms? Signals?

Does all this talk about ranking signals and algorithms have you confused? Our video below explains briefly how a search engine’s algorithm works to rank web pages:
Also see our Periodic Table Of SEO Ranking Factors, which explains some of the other ranking signals that Google uses in its algorithm:

Name The Update & More Info

Today’s change is a new, significant ranking factor for our table, one we’ll add in a future update, probably as Va, for “Violation, Ad-Heavy site.”
Often when Google rolls out new algorithms, it gives them names. Last year’s Panda Update was a classic example of this. But Google’s not given one to this update (I did ask). It’s just being called the “page layout algorithm.”
Boring. Unhelpful for easy reference. If you’d like to brainstorm a name, visit our posts on Google+ and on Facebook, where we’re asking for ideas.
Now for the self-interested closing. You can bet this will be a big topic of discussion at our upcoming SMX West search marketing conference at the end of next month, especially on the Ask The Search Engines panel. So check out our full agenda and consider attending.
Postscript: Some have been asking in the comments about how Google knows what an ad is. I asked, and here’s what Google said:
We have a variety of signals that algorithmically determine what type of ad or content appears above the fold, but no further details to share. It is completely algorithmic in its detection–we don’t use any sort of hard-coded list of ad providers.


The Penguin Update: Google’s Webspam Algorithm Gets Official Name

Friday, April 27, 2012
Move over Panda, there’s a new Google update in town: Penguin. That’s the official name Google has given to the webspam algorithm that it released on Tuesday.

What’s An Update?

For those unfamiliar with Google updates, I’d recommend reading my Why Google Panda Is More A Ranking Factor Than Algorithm Update post from last year. It explains how Google has a variety of algorithms used to rank pages.
Google periodically changes these algorithms. When this happens, that’s known as an “update,” which in turn has an impact on the search results we get. Sometimes the updates have a big impact; sometimes they’re hardly noticed.

Who Names Updates?

Google also periodically creates new algorithms. When this happens, sometimes they’re given names by Google itself, as with the Vince update in 2009. If Google doesn’t give a name, sometimes others such as Webmaster World may name them, as with the Mayday update in 2010.
With Penguin, history is repeating itself, where Google is belatedly granting a name to an update after-the-fact. The same thing happened with Panda last year.
When the Panda Update was first launched in February 2011, Google didn’t initially release the name it was using internally. I knew it, but I wasn’t allowed to say what it was. Without an official name, I gave it an unofficial one of “Farmer,” since one of the reasons behind the update was to combat low-quality content that was often associated with content farms.
In the end, I suspect Google didn’t want the update to sound like it was especially aimed at content farms, so it eventually let the “Panda” name go public, in a Steven Levy interview for Wired about a week after the update launched. Panda took its name from one of the key engineers involved.

Say Hello To Penguin

Since Panda, Google’s been avoiding names. The new algorithm in January designed to penalize pages with too many ads above the fold was called the “page layout algorithm.” When Penguin rolled out earlier this week, it was called the “webspam algorithm update.”
Without a name for the new webspam algorithm, Search Engine Land was asking people for their own ideas at Google+ and Facebook, with the final vote making “Titanic” the leading candidate. A last check with Google got it to release its own official name of “Penguin.”

Search quality highlights: 40 changes for February

Friday, March 2, 2012

This month we have many improvements to celebrate. With 40 changes reported, that marks a new record for our monthly series on search quality. Most of the updates rolled out earlier this month, and a handful are actually rolling out today and tomorrow. We continue to improve many of our systems, including related searches, sitelinks, autocomplete, UI elements, indexing, synonyms, SafeSearch and more. Each individual change is subtle and important, and over time they add up to a radically improved search engine. 

Here’s the list for February:
  • More coverage for related searches. [launch codename “Fuzhou”] This launch brings in a new data source to help generate the “Searches related to” section, increasing coverage significantly so the feature will appear for more queries. This section contains search queries that can help you refine what you’re searching for.
  • Tweak to categorizer for expanded sitelinks. [launch codename “Snippy”, project codename “Megasitelinks”] This improvement adjusts a signal we use to try and identify duplicate snippets. We were applying a categorizer that wasn’t performing well for our expanded sitelinks, so we’ve stopped applying the categorizer in those cases. The result is more relevant sitelinks.
  • Less duplication in expanded sitelinks. [launch codename “thanksgiving”, project codename “Megasitelinks”] We’ve adjusted signals to reduce duplication in the snippets for expanded sitelinks. Now we generate relevant snippets based more on the page content and less on the query.
  • More consistent thumbnail sizes on results page. We’ve adjusted the thumbnail size for most image content appearing on the results page, providing a more consistent experience across result types, and also across mobile and tablet. The new sizes apply to rich snippet results for recipes and applications, movie posters, shopping results, book results, news results and more.
  • More locally relevant predictions in YouTube. [project codename “Suggest”] We’ve improved the ranking for predictions in YouTube to provide more locally relevant queries. For example, for the query [lady gaga in ] performed on the US version of YouTube, we might predict [lady gaga in times square], but for the same search performed on the Indian version of YouTube, we might predict [lady gaga in India].
  • More accurate detection of official pages. [launch codename “WRE”] We’ve made an adjustment to how we detect official pages to make more accurate identifications. The result is that many pages that were previously misidentified as official will no longer be.
  • Refreshed per-URL country information. [Launch codename “longdew”, project codename “country-id data refresh”] We updated the country associations for URLs to use more recent data.
  • Expand the size of our images index in Universal Search. [launch codename “terra”, project codename “Images Universal”] We launched a change to expand the corpus of results for which we show images in Universal Search. This is especially helpful to give more relevant images on a larger set of searches.
  • Minor tuning of autocomplete policy algorithms. [project codename “Suggest”] We have a narrow set of policies for autocomplete for offensive and inappropriate terms. This improvement continues to refine the algorithms we use to implement these policies.
  • “Site:” query update [launch codename “Semicolon”, project codename “Dice”] This change improves the ranking for queries using the “site:” operator by increasing the diversity of results.
  • Improved detection for SafeSearch in Image Search. [launch codename "Michandro", project codename “SafeSearch”] This change improves our signals for detecting adult content in Image Search, aligning the signals more closely with the signals we use for our other search results.
  • Interval based history tracking for indexing. [project codename “Intervals”] This improvement changes the signals we use in document tracking algorithms. 
  • Improvements to foreign language synonyms. [launch codename “floating context synonyms”, project codename “Synonyms”] This change applies an improvement we previously launched for English to all other languages. The net impact is that you’ll more often find relevant pages that include synonyms for your query terms.
  • Disabling two old fresh query classifiers. [launch codename “Mango”, project codename “Freshness”] As search evolves and new signals and classifiers are applied to rank search results, sometimes old algorithms get outdated. This improvement disables two old classifiers related to query freshness.
  • More organized search results for Google Korea. [launch codename “smoothieking”, project codename “Sokoban4”] This significant improvement to search in Korea better organizes the search results into sections for news, blogs and homepages.
  • Fresher images. [launch codename “tumeric”] We’ve adjusted our signals for surfacing fresh images. Now we can more often surface fresh images when they appear on the web.
  • Update to the Google bar. [project codename “Kennedy”] We continue to iterate in our efforts to deliver a beautifully simple experience across Google products, and as part of that this month we made further adjustments to the Google bar. The biggest change is that we’ve replaced the drop-down Google menu in the November redesign with a consistent and expanded set of links running across the top of the page.
  • Adding three new languages to classifier related to error pages. [launch codename "PNI", project codename "Soft404"] We have signals designed to detect crypto 404 pages (also known as “soft 404s”), pages that return valid text to a browser but the text only contain error messages, such as “Page not found.” It’s rare that a user will be looking for such a page, so it’s important we be able to detect them. This change extends a particular classifier to Portuguese, Dutch and Italian.
  • Improvements to travel-related searches. [launch codename “nesehorn”] We’ve made improvements to triggering for a variety of flight-related search queries. These changes improve the user experience for our Flight Search feature with users getting more accurate flight results.
  • Data refresh for related searches signal. [launch codename “Chicago”, project codename “Related Search”] One of the many signals we look at to generate the “Searches related to” section is the queries users type in succession. If users very often search for [apple] right after [banana], that’s a sign the two might be related. This update refreshes the model we use to generate these refinements, leading to more relevant queries to try.
  • International launch of shopping rich snippets. [project codename “rich snippets”]Shopping rich snippets help you more quickly identify which sites are likely to have the most relevant product for your needs, highlighting product prices, availability, ratings and review counts. This month we expanded shopping rich snippets globally (they were previously only available in the US, Japan and Germany).
  • Improvements to Korean spelling. This launch improves spelling corrections when the user performs a Korean query in the wrong keyboard mode (also known as an "IME", or input method editor). Specifically, this change helps users who mistakenly enter Hangul queries in Latin mode or vice-versa.
  • Improvements to freshness. [launch codename “iotfreshweb”, project codename “Freshness”] We’ve applied new signals which help us surface fresh content in our results even more quickly than before.
  • Web History in 20 new countries. With Web History, you can browse and search over your search history and webpages you've visited. You will also get personalized search results that are more relevant to you, based on what you’ve searched for and which sites you’ve visited in the past. In order to deliver more relevant and personalized search results, we’ve launched Web History in Malaysia, Pakistan, Philippines, Morocco, Belarus, Kazakhstan, Estonia, Kuwait, Iraq, Sri Lanka, Tunisia, Nigeria, Lebanon, Luxembourg, Bosnia and Herzegowina, Azerbaijan, Jamaica, Trinidad and Tobago, Republic of Moldova, and Ghana. Web History is turned on only for people who have a Google Account and previously enabled Web History.
  • Improved snippets for video channels. Some search results are links to channels with many different videos, whether on mtv.com, Hulu or YouTube. We’ve had a feature for a while now that displays snippets for these results including direct links to the videos in the channel, and this improvement increases quality and expands coverage of these rich “decorated” snippets. We’ve also made some improvements to our backends used to generate the snippets.
  • Improvements to ranking for local search results. [launch codename “Venice”] This improvement improves the triggering of Local Universal results by relying more on the ranking of our main search results as a signal. 
  • Improvements to English spell correction. [launch codename “Kamehameha”] This change improves spelling correction quality in English, especially for rare queries, by making one of our scoring functions more accurate.
  • Improvements to coverage of News Universal. [launch codename “final destination”] We’ve fixed a bug that caused News Universal results not to appear in cases when our testing indicates they’d be very useful.
  • Consolidation of signals for spiking topics. [launch codename “news deserving score”, project codename “Freshness”] We use a number of signals to detect when a new topic is spiking in popularity. This change consolidates some of the signals so we can rely on signals we can compute in realtime, rather than signals that need to be processed offline. This eliminates redundancy in our systems and helps to ensure we can continue to detect spiking topics as quickly as possible.
  • Better triggering for Turkish weather search feature. [launch codename “hava”] We’ve tuned the signals we use to decide when to present Turkish users with the weather search feature. The result is that we’re able to provide our users with the weather forecast right on the results page with more frequency and accuracy.
  • Visual refresh to account settings page. We completed a visual refresh of the account settings page, making the page more consistent with the rest of our constantly evolving design.
  • Panda update. This launch refreshes data in the Panda system, making it more accurate and more sensitive to recent changes on the web.
  • Link evaluation. We often use characteristics of links to help us figure out the topic of a linked page. We have changed the way in which we evaluate links; in particular, we are turning off a method of link analysis that we used for several years. We often rearchitect or turn off parts of our scoring in order to keep our system maintainable, clean and understandable.
  • SafeSearch update. We have updated how we deal with adult content, making it more accurate and robust. Now, irrelevant adult content is less likely to show up for many queries.
  • Spam update. In the process of investigating some potential spam, we found and fixed some weaknesses in our spam protections.
  • Improved local results. We launched a new system to find results from a user’s city more reliably. Now we’re better able to detect when both queries and documents are local to the user.

And here are a few more changes we’ve already blogged about separately:


Read more ...

Google Algorithm Updates Announced: Panda Gets More Sensitive

Friday, March 2, 2012
Will Panda now be even pickier about your content?

I wasn’t expecting this to come until early March, since the month isn’t even over yet, but Google has gone ahead and released its monthly list of updates: 40 changes for February.


While we’ll take a deeper look into the list soon, it’s worth noting right off the bat that there is a Panda update listed. Late last week, in light of Panda’s one-year anniversary, I asked Google if the Panda adjustment from January’s list had been the most recent adjustment to Panda. The response I received from a spokesperson was:

“We improved how Panda interacts with our indexing and ranking systems, making it more integrated into our pipelines. We also released a minor update to refresh the data for Panda.”

This was basically what the company said in January. Now, in today’s list for February, Google says:

“This launch refreshes data in the Panda system, making it more accurate and more sensitive to recent changes on the web.”

So between January’s and February’s Panda news, it sounds like Panda is more ingrained into how Google indexes the web than ever before, and may even be pickier about quality.

Here’s the full list in Google’s words:

More coverage for related searches. [launch codename “Fuzhou”] This launch brings in a new data source to help generate the “Searches related to” section, increasing coverage significantly so the feature will appear for more queries. This section contains search queries that can help you refine what you’re searching for.


Tweak to categorizer for expanded sitelinks. [launch codename “Snippy”, project codename “Megasitelinks”] This improvement adjusts a signal we use to try and identify duplicate snippets. We were applying a categorizer that wasn’t performing well for our expanded sitelinks, so we’ve stopped applying the categorizer in those cases. The result is more relevant sitelinks.


Less duplication in expanded sitelinks. [launch codename “thanksgiving”, project codename “Megasitelinks”] We’ve adjusted signals to reduce duplication in the snippets for expanded sitelinks. Now we generate relevant snippets based more on the page content and less on the query.
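Google doesn’t say how it measures snippet duplication, but a standard way to flag near-duplicate text is to compare word shingles with Jaccard similarity. The sketch below is illustrative only; the function names and the 0.6 threshold are invented, not Google’s:

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(snippets, threshold=0.6):
    """Return index pairs of snippets whose shingle overlap meets the threshold."""
    sets = [shingles(s) for s in snippets]
    pairs = []
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if jaccard(sets[i], sets[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

Two snippets that differ by only a word or two share most of their shingles and get flagged; unrelated snippets share almost none.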


More consistent thumbnail sizes on results page. We’ve adjusted the thumbnail size for most image content appearing on the results page, providing a more consistent experience across result types, and also across mobile and tablet. The new sizes apply to rich snippet results for recipes and applications, movie posters, shopping results, book results, news results and more.


More locally relevant predictions in YouTube. [project codename “Suggest”] We’ve improved the ranking for predictions in YouTube to provide more locally relevant queries. For example, for the query [lady gaga in ] performed on the US version of YouTube, we might predict [lady gaga in times square], but for the same search performed on the Indian version of YouTube, we might predict [lady gaga in India].


More accurate detection of official pages. [launch codename “WRE”] We’ve made an adjustment to how we detect official pages to make more accurate identifications. The result is that many pages that were previously misidentified as official will no longer be.


Refreshed per-URL country information. [Launch codename “longdew”, project codename “country-id data refresh”] We updated the country associations for URLs to use more recent data.


Expand the size of our images index in Universal Search. [launch codename “terra”, project codename “Images Universal”] We launched a change to expand the corpus of results for which we show images in Universal Search. This is especially helpful to give more relevant images on a larger set of searches.


Minor tuning of autocomplete policy algorithms. [project codename “Suggest”] We have a narrow set of policies for autocomplete for offensive and inappropriate terms. This improvement continues to refine the algorithms we use to implement these policies.


“Site:” query update [launch codename “Semicolon”, project codename “Dice”] This change improves the ranking for queries using the “site:” operator by increasing the diversity of results.


Improved detection for SafeSearch in Image Search. [launch codename "Michandro", project codename “SafeSearch”] This change improves our signals for detecting adult content in Image Search, aligning the signals more closely with the signals we use for our other search results.


Interval based history tracking for indexing. [project codename “Intervals”] This improvement changes the signals we use in document tracking algorithms.


Improvements to foreign language synonyms. [launch codename “floating context synonyms”, project codename “Synonyms”] This change applies an improvement we previously launched for English to all other languages. The net impact is that you’ll more often find relevant pages that include synonyms for your query terms.


Disabling two old fresh query classifiers. [launch codename “Mango”, project codename “Freshness”] As search evolves and new signals and classifiers are applied to rank search results, sometimes old algorithms get outdated. This improvement disables two old classifiers related to query freshness.


More organized search results for Google Korea. [launch codename “smoothieking”, project codename “Sokoban4”] This significant improvement to search in Korea better organizes the search results into sections for news, blogs and homepages.


Fresher images. [launch codename “tumeric”] We’ve adjusted our signals for surfacing fresh images. Now we can more often surface fresh images when they appear on the web.


Update to the Google bar. [project codename “Kennedy”] We continue to iterate in our efforts to deliver a beautifully simple experience across Google products, and as part of that this month we made further adjustments to the Google bar. The biggest change is that we’ve replaced the drop-down Google menu in the November redesign with a consistent and expanded set of links running across the top of the page.


Adding three new languages to classifier related to error pages. [launch codename "PNI", project codename "Soft404"] We have signals designed to detect crypto 404 pages (also known as “soft 404s”), pages that return valid text to a browser but the text only contains error messages, such as “Page not found.” It’s rare that a user will be looking for such a page, so it’s important we be able to detect them. This change extends a particular classifier to Portuguese, Dutch and Italian.
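Google hasn’t published its soft-404 classifier, but the core idea — a page that returns HTTP 200 while its body is really an error message — can be illustrated with a simple heuristic. The phrase lists below are illustrative guesses, not Google’s signals:

```python
# Error phrases per language; the Portuguese, Dutch and Italian entries mirror
# the languages the change added (the phrases themselves are illustrative).
ERROR_PHRASES = {
    "en": ["page not found", "404 error"],
    "pt": ["página não encontrada"],
    "nl": ["pagina niet gevonden"],
    "it": ["pagina non trovata"],
}

def is_soft_404(status_code, body):
    """A soft 404 returns HTTP 200 but the body is really an error message."""
    if status_code != 200:
        return False  # a real 404/410 is already an explicit error
    text = body.lower()
    return any(phrase in text
               for phrases in ERROR_PHRASES.values()
               for phrase in phrases)
```

A real classifier would use many more signals (page length, boilerplate ratio, templates shared with known error pages), but the status-code/body mismatch is the defining feature.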


Improvements to travel-related searches. [launch codename “nesehorn”] We’ve made improvements to triggering for a variety of flight-related search queries. These changes improve the user experience for our Flight Search feature with users getting more accurate flight results.


Data refresh for related searches signal. [launch codename “Chicago”, project codename “Related Search”] One of the many signals we look at to generate the “Searches related to” section is the queries users type in succession. If users very often search for [apple] right after [banana], that’s a sign the two might be related. This update refreshes the model we use to generate these refinements, leading to more relevant queries to try.
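The successive-queries signal described above is easy to picture in code. This is a toy sketch, not Google’s system: it simply counts which query most often follows another across user sessions:

```python
from collections import Counter

def related_query_counts(sessions):
    """Count how often query B immediately follows query A across sessions."""
    pairs = Counter()
    for queries in sessions:
        for a, b in zip(queries, queries[1:]):
            if a != b:  # ignore exact repeats
                pairs[(a, b)] += 1
    return pairs

def related_searches(sessions, query, top_n=3):
    """Suggest the queries most often typed right after `query`."""
    pairs = related_query_counts(sessions)
    followers = [(b, n) for (a, b), n in pairs.items() if a == query]
    followers.sort(key=lambda item: (-item[1], item[0]))
    return [b for b, _ in followers[:top_n]]
```

The “data refresh” in the change above amounts to recomputing these counts over newer logs, so the suggestions track current behavior.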


International launch of shopping rich snippets. [project codename “rich snippets”] Shopping rich snippets help you more quickly identify which sites are likely to have the most relevant product for your needs, highlighting product prices, availability, ratings and review counts. This month we expanded shopping rich snippets globally (they were previously only available in the US, Japan and Germany).


Improvements to Korean spelling. This launch improves spelling corrections when the user performs a Korean query in the wrong keyboard mode (also known as an “IME”, or input method editor). Specifically, this change helps users who mistakenly enter Hangul queries in Latin mode or vice-versa.
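The kind of fix described here can be sketched by mapping QWERTY keystrokes back to the Hangul jamo on the same keys of the standard two-set (Dubeolsik) layout. The table below covers only a few keys, and a real correction would also compose the jamo into syllable blocks (e.g. ㅎㅏㄴ → 한), which is not shown:

```python
# Partial Dubeolsik (two-set) layout: QWERTY key -> Hangul jamo.
# Only a handful of keys are included for illustration.
QWERTY_TO_JAMO = {
    "g": "ㅎ", "k": "ㅏ", "s": "ㄴ", "r": "ㄱ", "m": "ㅡ", "f": "ㄹ",
    "d": "ㅇ", "j": "ㅓ", "q": "ㅂ", "t": "ㅅ",
}

def latin_to_jamo(query):
    """Re-interpret Latin keystrokes as the Hangul jamo the user meant to type.

    Unmapped characters are passed through unchanged."""
    return [QWERTY_TO_JAMO.get(ch, ch) for ch in query.lower()]
```

The classic example: a user who forgets to switch modes and types “gksrmf” actually meant the jamo for 한글 (“Hangul”).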


Improvements to freshness. [launch codename “iotfreshweb”, project codename “Freshness”] We’ve applied new signals which help us surface fresh content in our results even more quickly than before.


Web History in 20 new countries. With Web History, you can browse and search over your search history and webpages you’ve visited. You will also get personalized search results that are more relevant to you, based on what you’ve searched for and which sites you’ve visited in the past. In order to deliver more relevant and personalized search results, we’ve launched Web History in Malaysia, Pakistan, Philippines, Morocco, Belarus, Kazakhstan, Estonia, Kuwait, Iraq, Sri Lanka, Tunisia, Nigeria, Lebanon, Luxembourg, Bosnia and Herzegovina, Azerbaijan, Jamaica, Trinidad and Tobago, Republic of Moldova, and Ghana. Web History is turned on only for people who have a Google Account and previously enabled Web History.


Improved snippets for video channels. Some search results are links to channels with many different videos, whether on mtv.com, Hulu or YouTube. We’ve had a feature for a while now that displays snippets for these results including direct links to the videos in the channel, and this improvement increases quality and expands coverage of these rich “decorated” snippets. We’ve also made some improvements to our backends used to generate the snippets.


Improvements to ranking for local search results. [launch codename “Venice”] This change improves the triggering of Local Universal results by relying more on the ranking of our main search results as a signal.


Improvements to English spell correction. [launch codename “Kamehameha”] This change improves spelling correction quality in English, especially for rare queries, by making one of our scoring functions more accurate.


Improvements to coverage of News Universal. [launch codename “final destination”] We’ve fixed a bug that caused News Universal results not to appear in cases when our testing indicates they’d be very useful.


Consolidation of signals for spiking topics. [launch codename “news deserving score”, project codename “Freshness”] We use a number of signals to detect when a new topic is spiking in popularity. This change consolidates some of the signals so we can rely on signals we can compute in realtime, rather than signals that need to be processed offline. This eliminates redundancy in our systems and helps to ensure we can continue to detect spiking topics as quickly as possible.
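Google doesn’t detail its realtime signals, but a common way to detect a spiking topic from counts you can compute in realtime is a z-score against recent history. A minimal sketch, with an invented threshold:

```python
from statistics import mean, stdev

def is_spiking(history, current, z_threshold=3.0):
    """Flag a topic as spiking when the current-interval query count sits far
    above its historical mean, measured in standard deviations."""
    if len(history) < 2:
        return False  # not enough history to estimate variance
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current > mu
    return (current - mu) / sigma >= z_threshold
```

The appeal of a scheme like this, matching the motivation in the change above, is that it needs only a short rolling window of counts, all computable in realtime, rather than an offline batch job.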


Better triggering for Turkish weather search feature. [launch codename “hava”] We’ve tuned the signals we use to decide when to present Turkish users with the weather search feature. The result is that we’re able to provide our users with the weather forecast right on the results page with more frequency and accuracy.


Visual refresh to account settings page. We completed a visual refresh of the account settings page, making the page more consistent with the rest of our constantly evolving design.


Panda update. This launch refreshes data in the Panda system, making it more accurate and more sensitive to recent changes on the web.


Link evaluation. We often use characteristics of links to help us figure out the topic of a linked page. We have changed the way in which we evaluate links; in particular, we are turning off a method of link analysis that we used for several years. We often rearchitect or turn off parts of our scoring in order to keep our system maintainable, clean and understandable.
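One widely known example of using link characteristics to infer a linked page’s topic is aggregating the anchor text of inbound links. This toy version (stopword list and all, invented for illustration) just counts the most common anchor terms:

```python
from collections import Counter

# Illustrative stopword list: words that carry no topical signal in anchors.
STOPWORDS = {"the", "a", "an", "of", "to", "for", "click", "here"}

def topic_terms(anchor_texts, top_n=3):
    """Guess a linked page's topic from the anchor text of its inbound links."""
    terms = Counter()
    for anchor in anchor_texts:
        for word in anchor.lower().split():
            if word not in STOPWORDS:
                terms[word] += 1
    return [word for word, _ in terms.most_common(top_n)]
```

Which specific method of link analysis Google turned off was never disclosed; this example only illustrates the general idea the paragraph describes.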


SafeSearch update. We have updated how we deal with adult content, making it more accurate and robust. Now, irrelevant adult content is less likely to show up for many queries.


Spam update. In the process of investigating some potential spam, we found and fixed some weaknesses in our spam protections.


Improved local results. We launched a new system to find results from a user’s city more reliably. Now we’re better able to detect when both queries and documents are local to the user.

More analysis to come.



By Chris Crum

Facebook readying new premium ads?

Monday, February 27, 2012

Leaked document shows ad upgrade presentation apparently targeting the friends of advertisers' fans.

A snapshot of the leaked advertiser materials.
(Credit: GigaOm)
Facebook is expected to launch an upgrade next week to its premium ads program that will target the friends of advertisers' fans.
The new ads, which will launch February 29, will be created from content posted organically to pages, according to a purportedly leaked document (see below) obtained by GigaOm.
"Anything you can post to on your page, you can turn into an ad," the document says. "Upgraded ads can be targeted to anyone you want."
When the users seeing the ad are friends with fans of the page, Facebook will automatically expand the ad and provide social context about the friends. The ads will also feature an expanded interface allowing fans to comment on the post directly from the ad.
The social network promises advertisers that the new ads will be in a larger format, boosting engagement by 40 percent and increasing retention by 80 percent. Fan rates are also expected to increase 16 percent, as well as purchase intent, according to the documents, which appear to be a presentation prepared for Facebook advertisers.
The new premium ad types can feature a photo, video, question, status, event, or link. They will replace "classic" ad options such as Premium Like (Photo and Video), Premium Event, Video Comment, and Premium Poll (Photo and Video), which will be phased out on February 29. Marketplace ads on the right column will not be affected.
Facebook representatives did not immediately respond to a request for comment.
Facebook Premium Ads Overview


by Steven Musil

Read more: http://news.cnet.com/8301-1023_3-57382395-93/facebook-readying-new-premium-ads/#ixzz1nYgZv6IQ
 

Google’s New Algorithm Update Impacts 35% Of Searches - 2012

Wednesday, January 4, 2012

Today, Google announced a change to its search algorithm that the company says will impact 35% of Web searches. The change builds on top of its previous “Caffeine” update in order to deliver more up-to-date and relevant search results, specifically those in areas where freshness matters. This includes things like recent events, hot topics, current reviews and breaking news items.

Google says that the new algorithm knows that different types of searches have different freshness needs, and weighs them accordingly. For example, a search for a favorite recipe posted a few years ago may still be popular enough to rank highly, but searches for an unfolding news story or the latest review of the iPhone 4S should bring the newer, fresher content first, followed by older results.


For searches about recent events and news, Google may now show search results towards the top of the page that are only minutes old, the company says. For regularly occurring events, like the Presidential election, the Oscars, a football game, company earnings, etc., Google knows that you’re likely interested in the most recent event, even if you don’t specify keywords indicating that.

That means a search for “Apple earnings” won’t (in theory) require you to also type in “Q4 2011” in order to see the latest information. It will be implied that you meant the latest quarter, without the need for the extra text. Of course, Google was already ranking news items and stock symbols at the top of the page when users performed finance-related searches or searches for current information, but this algorithm change has an impact on the organic search results, too, not just those from the verticals (search, finance, images, etc.) which have been integrated into Google’s Universal search.

For items that see regular updates, like consumer electronics reviews, reviews of a particular kind of car and more, Google will also feature the most current and up-to-date information above the rest.
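Google hasn’t published a formula, but the behavior described — different freshness needs weighted differently per query type — can be sketched as relevance multiplied by a category-dependent exponential decay. The categories and half-lives below are invented for illustration:

```python
# Hypothetical per-category freshness half-lives, in days: how quickly a
# result's value decays for that kind of query.
HALF_LIFE_DAYS = {
    "breaking_news": 0.25,   # hours matter
    "product_review": 90.0,  # months matter
    "recipe": 3650.0,        # age barely matters
}

def freshness_score(relevance, age_days, query_category):
    """Combine topical relevance with an exponential age decay whose rate
    depends on the query's freshness needs."""
    half_life = HALF_LIFE_DAYS.get(query_category, 365.0)
    decay = 0.5 ** (age_days / half_life)
    return relevance * decay
```

Under this sketch, a day-old news story decays sharply while a years-old recipe barely loses ground, matching the recipe-versus-breaking-news contrast described above.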

This “freshness update” is an extension of what Google began last year with Caffeine, an under-the-hood improvement that, among other things, helped Google index content more quickly, so results were more realtime. This year, Google also brought out its Panda update, which was meant to decrease the rankings of so-called “content farms” – SEO-optimized entities that critics said filled Google with low-quality results.

Now, it’s clear that Google understands that the most relevant search result is more often the one that’s relevant now – the one that’s bringing you new information. The update’s impact on Google Search is fairly substantial, with Google claiming that roughly 35% of search results will be affected by the changes.

Google used to have a search vertical specifically for the most recent updates at www.google.com/realtime, where it was indexing Twitter updates. However, when the contract with Twitter expired, Google shuttered the site (it now redirects to the Google homepage). Google said at the time that it planned to re-open the site with Google+ search results alongside other realtime sources of information. But with the new Google search update, a specific vertical for realtime information feels less necessary.
Thursday, November 3rd, 2011
 
