No doubt at some point you have done a search in Google, clicked on an attractive result, and run straight into a frustrating wall: the article or page in question requires a subscription! 😉 As users we all find this annoying, and the last thing we want is yet another username and password to remember. But as a content provider, it's an excellent business move. Premium/paid content is a fine monetization strategy for anyone with content good enough to sell.
It also brings up an interesting question for SEO. How exactly does Google index paid content?
I got this email from my loyal reader Wing Yew:
Hamlet,
I've read your blog since the day you launched. That said, I can
completely appreciate if you don't have time to respond to this
message or post a blog about it. On the off chance you do know an
answer, I knew I had to ask.
Question: How do you have google/yahoo/msn spider password protected
content? I know that SEOMoz does it with their premium content, but
I'm not sure how. I'm rather desperately seeking out a hard and fast
answer… and I know of no better person to whom to go.
for His renown,
Wing Yew
Saying that I've been extremely busy lately is an understatement, but how can I say no to a loyal reader who has been following my blog from day one? Thanks for your support, Wing! Letting search engines index paid content is not only a good idea, it is also a very clever one.
Activate cloaking device
In order to do this you need to use cloaking. Before you panic and run for the hills, convinced that this is black-hat stuff that will get you penalized, know that Google does not penalize every type of cloaking. It is all about the intention. Let me explain the main concept and then dive into the technical details.
Your paid or sensitive content can be protected by your web server or by a web application. Let's call it the gatekeeper. The gatekeeper is responsible for asking for credentials anytime a visitor lands on a protected page. It validates the credentials and, assuming they are good, allows access to the page.
In this case we need to make the gatekeeper a little bit smarter by teaching it how to distinguish search engine spiders from regular users. The gatekeeper should still ask for passwords from any web surfer, but it should not ask search engine spiders for credentials. This is where cloaking comes in.
As I explained before, cloaking means presenting different content to crawlers than we show regular users. Traditionally I have used two detection strategies: by user agent or by IP address. The first has the code check whether the HTTP_USER_AGENT server variable contains a bot identification string (e.g. Googlebot, Yahoo Slurp, etc.); the second checks the requestor's IP against a list of known bots. You can get such an IP list from http://iplists.com/, and a list of search engine user agents from http://www.user-agents.org/.
Both approaches are relatively simple, but they have flaws and are not difficult to exploit if an advanced user wants access to paid content for free. The user agent can be forged; there is, for example, a Firefox extension that makes the gatekeeper think the visitor is a search engine robot simply by sending a search engine user agent instead of the browser's. The IP-list method is stronger, but maintaining an accurate, up-to-date list of bot IP addresses is extremely difficult and time consuming.
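To make this concrete, here is roughly what both traditional checks look like in Python (a sketch only; the user-agent strings and the IP shown are illustrative placeholders, not authoritative lists):

    # Traditional detection: user-agent substring match plus a static IP allowlist.
    # Both lists below are placeholders; real data would come from the sites linked above.
    BOT_USER_AGENTS = ('googlebot', 'slurp', 'msnbot')
    BOT_IPS = {'66.249.66.1'}  # normally loaded from a maintained IP list

    def looks_like_bot(user_agent, remote_addr):
        ua = user_agent.lower()
        return any(bot in ua for bot in BOT_USER_AGENTS) or remote_addr in BOT_IPS

The first test is trivially spoofed and the second goes stale as the engines add crawler addresses, which is exactly why we need something better.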
Here is a better strategy
Let's use a method I've discussed before to protect against CGI hijackers. The method is not infallible, but it is extremely powerful for our purposes here. Here are the steps; a short sketch follows the list:
1. Do a simple user agent detection as explained above.
2. In order to detect fake robots, we use reverse-forward DNS detection. We only do this check if the requestor has been identified as a known search engine robot in step 1. Making two DNS requests for every single request will definitely slow your server down and we don't want that.
3. Once the code confirms that the requestor is a search engine, we allow the robot to access the paid content.
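Here is a minimal sketch of the three steps in Python. The user-agent signatures and the reverse-DNS suffixes are my assumptions for illustration; verify against each engine's documentation which host names its crawlers actually resolve to.

    import socket

    # Assumed mapping of user-agent signatures to the DNS suffixes their
    # crawlers are expected to reverse-resolve to (illustrative values only).
    KNOWN_BOTS = {
        'googlebot': ('.googlebot.com', '.google.com'),
        'slurp': ('.crawl.yahoo.net',),
        'msnbot': ('.search.msn.com',),
    }

    def is_verified_crawler(user_agent, remote_addr):
        # Step 1: cheap user-agent check; ordinary visitors stop here.
        ua = user_agent.lower()
        suffixes = None
        for signature, valid_suffixes in KNOWN_BOTS.items():
            if signature in ua:
                suffixes = valid_suffixes
                break
        if suffixes is None:
            return False  # not claiming to be a bot: ask for credentials as usual

        # Step 2: reverse-forward DNS check, done only for requests that claim to be a bot.
        try:
            host, _, _ = socket.gethostbyaddr(remote_addr)  # reverse (PTR) lookup
            forward_ip = socket.gethostbyname(host)         # forward (A) lookup
        except socket.error:
            return False

        # Step 3: the forward lookup must point back to the same IP, and the host
        # name must belong to the engine the user agent claims to be.
        return forward_ip == remote_addr and host.lower().endswith(suffixes)

The gatekeeper simply calls is_verified_crawler() with the request's HTTP_USER_AGENT and REMOTE_ADDR values: if it returns True it serves the paid page, otherwise it asks for credentials as usual.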
A word of caution
It is wise to prevent the search engines from caching the paid content. Clever users will simply go back to the search results and read the content from the engine's cache. I see a lot of sites that implement this type of cloaking, yet forget to prevent the search engines from caching the protected content.
As regular readers know, this is as simple as setting the robots meta tag to “noarchive.” Alternatively, you can send the X-Robots-Tag HTTP header with the value “noarchive,” but only Google supports that header at the moment.
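In a plain CGI-style script the directive can be emitted like this (a sketch; frameworks provide their own ways to set headers and templates):

    # Send the noarchive hints before any body output.
    print("Content-Type: text/html")
    print("X-Robots-Tag: noarchive")  # header form: only Google honors it right now
    print()  # a blank line ends the headers
    print('<html><head><meta name="robots" content="noarchive"></head>')
    print('<body>...protected article served to the verified crawler...</body></html>')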
Now to the technical details
If the gatekeeper is the web server itself, using HTTP authentication, you can use mod_rewrite to set up rules that identify the bot and return the status code 401 (Authorization Required) when the requestor is not a search engine. Doing more advanced detection with this type of gatekeeper deserves a post of its own.
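As a rough server-level illustration, here is one common way to exempt crawlers from HTTP Basic authentication in Apache 2.2. Note that this sketch uses mod_setenvif with Satisfy Any rather than mod_rewrite, and the user-agent patterns are placeholders; it only covers step 1, so pair it with the DNS verification described below.

    # .htaccess sketch: challenge everyone except requests whose user agent
    # matches a known crawler (patterns are illustrative only).
    SetEnvIfNoCase User-Agent (Googlebot|Slurp|msnbot) known_crawler
    AuthType Basic
    AuthName "Premium content"
    AuthUserFile /path/to/.htpasswd
    Require valid-user
    Order Allow,Deny
    Allow from env=known_crawler
    Satisfy Any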
Robot detection by user agent or IP address
The code simply needs to check a couple of variables that are set by the web server for every request. These are HTTP_USER_AGENT and REMOTE_ADDR.
Most scripting languages, such as Python, PHP and Ruby, have a class or module named CGI that provides access to these variables. If you are using a framework such as Django, Ruby on Rails or CakePHP, look in the relevant documentation to see how you can access and modify the HTTP headers from your controller or view. Keep in mind that any code that modifies headers needs to run before any other code that sends output to the browser.
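For example, a bare Python CGI script can read the variables and decide whether to challenge the visitor before printing anything else (a sketch; in a framework you would read the request object and set response headers instead):

    import os

    # CGI exposes the request metadata as environment variables.
    user_agent = os.environ.get('HTTP_USER_AGENT', '')
    remote_addr = os.environ.get('REMOTE_ADDR', '')

    # Headers, including the 401 challenge, must go out before any body output.
    if 'googlebot' in user_agent.lower():  # placeholder check; use the full verification above
        print("Status: 200 OK")
    else:
        print("Status: 401 Authorization Required")
        print('WWW-Authenticate: Basic realm="Premium content"')
    print("Content-Type: text/html")
    print()  # blank line: end of the headers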
Reverse forward DNS
To do this type of detection you need to query your DNS cache or name server. The low-level way is to call the C system functions available in any BSD-based TCP/IP implementation: gethostbyaddr() and gethostbyname(). With the first call your script does the reverse lookup, providing the IP address obtained from the server variable REMOTE_ADDR and getting back a host name. With the second call your script passes that host name back to the DNS to confirm that it does indeed resolve to the same IP. For all this to work, it is very important that the search engines maintain accurate forward (A) and reverse (PTR) DNS records for all their crawler IPs. It is also very important that you have a solid DNS cache if your site receives a lot of traffic.
Most web developers are not big fans of C (I don't think it is that bad), but it is good to know that those APIs have been ported and are accessible as functions or methods in any modern scripting language such as PHP, Python, Ruby and Perl.
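Because each verification costs two DNS round trips, it also helps to cache the verdict per IP address so each crawler address is resolved only once. A small sketch, with an assumed suffix list:

    import socket

    _verdicts = {}  # remote IP -> True/False, so each address is resolved only once

    def crawler_dns_ok(remote_addr,
                       suffixes=('.googlebot.com', '.crawl.yahoo.net', '.search.msn.com')):
        if remote_addr not in _verdicts:
            try:
                host, _, _ = socket.gethostbyaddr(remote_addr)   # reverse (PTR) lookup
                ok = (socket.gethostbyname(host) == remote_addr  # forward (A) lookup
                      and host.lower().endswith(suffixes))
            except socket.error:
                ok = False
            _verdicts[remote_addr] = ok
        return _verdicts[remote_addr]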
Jez
September 3, 2007 at 1:16 am
Hi Hamlet, Are you saying that by setting the user agent in Curl I could spider and rip SEOMOZ premium content? ;-) Jez
Jez
September 3, 2007 at 1:19 am
On a related note, how do you maintain the lists of IP's for Google and other SE's? Outdated IP lists seem to be the risk with any form of cloaking...
Hamlet Batista
September 3, 2007 at 6:31 am
Jez - With my updated strategy you don't need to maintain IP lists. Please read it again carefully. You simply check the user agent, and if it is XXBot you do a forward-reverse DNS lookup to confirm it is indeed that bot. No need to check a list of IPs. Well, maybe to improve performance you'd want to cache the IPs of confirmed bots ;-)
randfish
September 3, 2007 at 6:05 am
Jez - actually, we don't cloak. Neither search engines, nor humans have access to the premium content unless they're logged into a premium account. Instead, we show non-logged-in users and bots a page with a summary of the content and an outline (for the guides).
Hamlet Batista
September 3, 2007 at 6:27 am
Rand - Thanks for your comment. I was about to say the same. I checked the cached versions of your premium article pages and they ask for credentials.
David Hopkins
September 3, 2007 at 10:13 am
Again, any hope of being a smart ass was dashed as I read further through the article. I think you being a programmer (I’m sure you said Perl was your favourite?) is an important attribute of your SEO knowledge. Most SEOs don’t seem to have any programming knowledge. They are quite happy to talk about 301 redirects, but how many actually know what an HTTP header is?

They have a pretty bullet-proof setup over there at SEOmoz. I also had a sniff around and found it watertight. I have had quite a few adventures with the curl libraries – all strictly legitimate, I should add. The only flaw that comes to mind is the possibility of using proxies. I am not sure what you are doing at stage 2? Are you checking against a list? As you may have guessed, that list I sent over was the product of curl – I am adding to it from further sources at the moment.

As for my idea of scoping out domains that you can buy for a few dollars, it’s been pretty unsuccessful. I got so excited at seeing a domain with 20,000 links available that I bought it without hesitation, only to find out that the links came from a handful of domains. Thankfully it only cost $9. Although I’ve since picked up a genuine PR5 domain for $9 and a three-letter .com for $150. I have no idea what to do with them though.

P.S. I’m impressed with RankSense. Like yourself I’m really busy and hope to get a better look at it later. I’ll give you some linkage when the time is right.
Hamlet Batista
September 3, 2007 at 2:55 pm
David, aka Mutiny, glad to see you using your real name.

“I think you being a programmer (I’m sure you said Perl was your favourite?) is an important attribute of your SEO knowledge. Most SEOs don’t seem to have any programming knowledge. They are quite happy to talk about 301 redirects, but how many actually know what an HTTP header is?”

My favorite is Python. You could say that I have an unfair advantage, as many things in SEO are highly technical. I still need to play catch-up with the marketing aspect, though.

“I have no idea what to do with them though.”

Although I’ve bought domains in the past, I think the whole domaining thing is a little bit overrated. I prefer to buy sites where I can see profit potential beforehand. Before spending a dime, make sure you have a plan to make it back. ;-)
David Hopkins
September 4, 2007 at 8:40 am
Unfortunately they don't correlate on your top commenters. :(
Hamlet Batista
September 6, 2007 at 5:37 am
David, if you want I can change your site name to your real name in all your comments.
MB Web Design
September 3, 2007 at 8:31 am
Nice try - just give those good people at SEOmoz your money :p
egorych
September 4, 2007 at 9:58 am
Very interesting, really. I've translated your article into Russian. Good job.
Hamlet Batista
September 4, 2007 at 3:01 pm
egorych - Yes, I noticed that. I used Google Translate to understand it. Thanks for the translation. I thought that was some new kind of automated scraping.
Advanced cloaking: how robots index paid content.
October 19, 2007 at 6:10 pm
[...] Original article: Advanced Cloaking Technique: How to feed password-protected content to search engine spiders [...]