Thousands of Pages Missing From Google SERPs...And How to Prevent the Problem! By Christine Smith

Upload: search-marketing-expo-smx

Post on 15-Apr-2017

TRANSCRIPT

Page 1: Thousands of Pages Missing From Google SERPs...And How to Prevent the Problem! By Christine Smith

#SMX #25A Christine Smith @websmithc

Thousands of Pages Missing from Google SERPs

...and how to prevent the problem!

Page 2:

IBM’s self-support site

• >1M technical documents
• Self-support for server and software admins
• Searchable by error codes, etc.

http://www-01.ibm.com/support/docview.wss?uid=swg21363866

Page 3:

Thousands of pages missing from SERPs!

Page 4:

Google monthly referrals dropped 28%

Page 5:

The Good News: The pages were fine

✓ Pages/URLs displayed correctly
✓ Redirects were working normally
    • 302 www.ibm.com -> www-01.ibm.com (multiple infrastructures)
✓ Canonical URLs were correct
    <link href="http://www.ibm.com/support/docview.wss?uid=swg21363866" rel="canonical"/>
✓ Robots.txt was not blocking anything
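The canonical-URL check on this slide can be scripted. The following is a minimal sketch using only the Python standard library; the class and function names are hypothetical, and the sample HTML simply wraps the canonical tag shown on the slide.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing <link .../> tags are also routed here by HTMLParser.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr_map.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Sample page built around the canonical tag from the slide.
SAMPLE_HTML = (
    '<html><head>'
    '<link href="http://www.ibm.com/support/docview.wss?uid=swg21363866" '
    'rel="canonical"/>'
    '</head><body></body></html>'
)

print(find_canonical(SAMPLE_HTML))
```

Comparing the returned value with the URL that was actually requested is one way to spot pages whose canonical tag points somewhere unexpected.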

Page 6:

Sitemaps were not optimal

• Only 10% of Sitemap URLs were indexed
• Regenerated sitemaps
    ✓ Corrected URLs to match canonical URLs
    ✓ Improved to 60% of Sitemap URLs indexed
    ✓ Today, 88% are indexed
• But, still no Google referral improvement
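One way to catch sitemap URLs that do not match the canonical URLs is to compare hostnames. The sketch below assumes the mismatch was in the hostname (www-01.ibm.com versus www.ibm.com), which the slides do not state explicitly; the function names and the two-entry sample sitemap are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def urls_off_canonical_host(sitemap_xml, canonical_host):
    """Return sitemap URLs whose hostname differs from canonical_host."""
    return [
        url
        for url in sitemap_urls(sitemap_xml)
        if urlsplit(url).hostname != canonical_host
    ]


# Illustrative sitemap: one entry on the canonical host, one on another host.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.ibm.com/support/docview.wss?uid=swg21363866</loc></url>
  <url><loc>http://www-01.ibm.com/support/docview.wss?uid=swg21397422</loc></url>
</urlset>"""

for url in urls_off_canonical_host(SAMPLE_SITEMAP, "www.ibm.com"):
    print("Does not match canonical host:", url)
```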

Page 7:

Engaged Google

• Opened Google Site Search support ticket
• Google findings:
    • A sampling of the missing URLs were:
        1. Marked as duplicates of the Support Registration page and
        2. Last crawled five months before!

Page 8:

In other words…

• Since Google’s index marked the pages as duplicates…

Thousands of Support pages were effectively deindexed

Page 9:

Meanwhile, according to Google

• There was no way to get a list of all URLs affected
• Panda was updated and data refreshed within a few weeks

Page 10:

We kept pulling every lever available…

• Submitted a Manual Actions Reconsideration Request
    • Some technical docs were incorrectly flagged
• Requested increased crawl rate for the domain
    • Hoping the pages would be revisited

Page 11:

Google referrals increased 22% the following month

Page 12:

What was the fix?

• Traffic bounced back... almost overnight.
• GSS ticket was closed, but not resolved
• Was it...
    • Increased crawl rate?
    • Better sitemaps?
    • Panda and data refresh?
    • Normal re-crawl of the pages?

Page 13:

More importantly: What was the cause?

• Likely culprit:
    • Faulty redirect or
    • A bad site maintenance redirect
• Typical server response during outage:
    • 404 or 500 or 504 HTTP response
    • Or 302 redirect to a maintenance page

Page 14:

Site maintenance – the right way

• Give a 503 Service Unavailable HTTP response!
    • A Retry-After time in the header is helpful
    • Tells Google to come back later
• Do not set all 5xx responses to 503
    • Google will ignore and assume site is down

Reference: http://googlewebmastercentral.blogspot.com/2011/01/how-to-deal-with-planned-site-downtime.html
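To make the shape of such a response concrete, here is a minimal sketch of a WSGI application that answers every request with 503 and a Retry-After header. It is an illustration only, not the setup used in the talk; the function name, the body text, and the one-hour retry value are assumptions.

```python
# Assumed value for illustration: ask clients to retry in one hour.
RETRY_AFTER_SECONDS = 3600

MAINTENANCE_BODY = b"Down for planned maintenance. Please try again later.\n"


def maintenance_app(environ, start_response):
    """WSGI app that returns 503 Service Unavailable for every request."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(MAINTENANCE_BODY))),
        # Retry-After tells well-behaved clients when to come back.
        ("Retry-After", str(RETRY_AFTER_SECONDS)),
    ]
    start_response("503 Service Unavailable", headers)
    return [MAINTENANCE_BODY]


# To try it locally (port 8503 is an arbitrary choice):
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 8503, maintenance_app).serve_forever()
```

The important part is the status line: the maintenance text is delivered with a 503, not with a 200 or behind a redirect.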

Page 15:

How to indicate site maintenance

• Some web platforms automatically give a 503 HTTP response during an upgrade
    • e.g. WordPress
• Others require workarounds, like
    • Apache, IHS (IBM), IIS (Microsoft) rewrite rules
    • Akamai logic

Page 16:

Update httpd.conf during maintenance

RewriteEngine on

# Whole-site outage: uncomment both lines to return 503 for every
# URL except the maintenance page itself
#RewriteCond %{REQUEST_URI} !^/maintenance\.html$
#RewriteRule ^(.*)$ - [R=503]

# Planned outage for /services/ path
RewriteRule ^(/services.*)$ - [R=503]

# Planned outage for /support/ path
RewriteRule ^(/support.*)$ - [R=503]

# Page served as the body of every 503 response
ErrorDocument 503 /maintenance.html

• Requires an Apache / IBM HTTP Server restart to update httpd.conf before and after maintenance
• Remember to update both the port 80 and port 443 configurations
• If the entire site is down, return a 503 response on robots.txt

Page 17:

Akamai flow for 503 HTTP response

During planned outage, upload a filename matching the application path to a NetStorage /maintenance/ directory

Custom Flow
• If Origin gives 500 or 504 response, then
    • If the maintenance file exists, then
        • Serve maintenance page with 503 response
        • Set retry-after to 1 day
    • Otherwise, serve error page matching Origin response (500/504)
• Otherwise, serve Origin response
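The custom flow above can be read as a small decision function. The sketch below restates it in Python purely to make the branches explicit; it is not Akamai configuration, and the function name, the page labels, and the return shape are hypothetical.

```python
ONE_DAY_SECONDS = 24 * 60 * 60  # "Set retry-after to 1 day"


def edge_response(origin_status, maintenance_file_exists):
    """Return (status, page, headers) the edge would serve under the flow.

    origin_status: HTTP status code returned by the origin server.
    maintenance_file_exists: whether a file matching the application path
        was uploaded to the maintenance directory.
    """
    if origin_status in (500, 504):
        if maintenance_file_exists:
            # Planned outage: maintenance page with 503 and Retry-After.
            headers = {"Retry-After": str(ONE_DAY_SECONDS)}
            return 503, "maintenance page", headers
        # Unplanned failure: error page matching the origin response.
        return origin_status, "error page", {}
    # Origin is healthy, or failed in a way the flow does not handle.
    return origin_status, "origin response", {}


print(edge_response(500, True))
print(edge_response(504, False))
print(edge_response(200, True))
```

Note that the maintenance file alone does not trigger a 503: in this flow the origin must also be returning 500 or 504.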

Page 18:

Lessons Learned

✓ Never give 200 or 301 HTTP response when site is down!
    ✓ Confuses the crawler/indexer
    ✓ May cause pages to be “deindexed” as duplicates
✓ Configure 503 HTTP responses only during planned outages and…
    ✓ Don’t forget to remove when maintenance is over!
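These lessons can be turned into a quick review of what a server returns while it is down. The following sketch only encodes the guidance from the slides; the function name and the wording of the messages are hypothetical.

```python
def review_outage_response(status):
    """Classify an HTTP status observed while the site is down."""
    if status == 503:
        return "ok: 503 is the recommended response during planned outages"
    if status in (200, 301):
        return "avoid: may cause pages to be deindexed as duplicates"
    if status == 302:
        return "avoid: a maintenance redirect was the likely culprit here"
    return "review: not the recommended 503 response"


# Status codes named in the talk, checked one by one.
for code in (503, 200, 301, 302, 404, 500, 504):
    print(code, "->", review_outage_response(code))
```

Running the same review after maintenance ends, and expecting no 503 at that point, covers the final lesson about removing the rule.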

Page 19:

THANK YOU!

SEE YOU @SMX WEST SAN JOSE, CA

MARCH 1-3, 2016

Page 20:

References: Rewrite Rules for 503 Response

• Apache:
    • https://gist.github.com/jjulian/1889874
• IBM HTTP Server (IHS):
    • http://www.ibm.com/support/docview.wss?uid=swg21397422
• Microsoft IIS:
    • http://serverfault.com/questions/483145/how-to-add-a-site-wide-downtime-error-message-in-iis-with-a-custom-503-error-co