Blind Ape Seo

The path of the ape

Filling autoblogs with Google news

Google News is a nice way to get some content into your niche autoblog, but it takes a little preparation.

(For starters, get and install WP-o-Matic)

- Point your browser to news.google.com
- Enter your search phrase. For this example we will use “chimpanzee birth” (always good to keep up with the family)
- In the navigation at the lower left, click the link for “RSS”
- You are taken to the page of the RSS feed. Grab that URL. It should look like this:
http://news.google.ch/news?ie=UTF-8&oe=UTF-8&rls=org.mozilla%3Aen-US%3Aofficial&client=firefox-a&um=1&tab=wn&q=chimpanzee+birth&output=rss&ned=:ePkh8BM9EwLbwQq0w4AFqy1AqVyIFNweIwH_SKsbIrNcC6-xa3UWy88sBgBW2w2e

Unbelievably long Google link here
- Enter that as the feed address in a new WP-o-Matic campaign
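
If you would rather build the feed address than click for it, here is a minimal Python sketch. It only reuses the query parameters visible in the feed URLs on this page (`q`, `ie`, `output=rss`). The function name `google_news_feed_url` and the default host are my own assumptions, and Google may change or drop this endpoint at any time.

```python
from urllib.parse import urlencode


def google_news_feed_url(phrase: str, host: str = "news.google.com") -> str:
    """Build a Google News RSS search URL for a phrase.

    Assumption: the /news endpoint still accepts q, ie and output=rss,
    as it does in the example URLs shown in this post.
    """
    params = {"q": phrase, "ie": "UTF-8", "output": "rss"}
    # urlencode turns spaces into '+', matching the hand-copied URL
    return f"http://{host}/news?{urlencode(params)}"


if __name__ == "__main__":
    print(google_news_feed_url("chimpanzee birth"))
```

Paste the printed address into the campaign's feed field just as you would the copied one.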

Now we have to get rid of the nastiness that is the Google News link. Basically, Google does not give you the direct links in the feed, but redirects them via its own servers. Luckily, we can strip those redirects with the rewrite feature of WP-o-Matic.

- Go to the “Rewrite” tab
- Enter this for origin:
/http...news.*amp;url=(.*)&cid.*"{1}/
- Check “Regex”
- Under “Rewrite to” enter
$1
- Check “Rewrite”
- Submit

Done.
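
To see what the rewrite is doing outside of WP-o-Matic, here is a rough Python sketch of the same idea: pull the real address out of the redirect's `url` parameter. It is not the plugin's code, and it parses the query string instead of using the regex above. The sample link in the usage part is made up for illustration; the real feed markup may look different.

```python
import html
from urllib.parse import parse_qs, urlsplit


def unwrap_redirect(link: str) -> str:
    """Return the target of a redirect link.

    Assumption: the redirect carries the real address in a `url`
    query parameter, which is what the rewrite regex above captures.
    Links without that parameter are returned unchanged.
    """
    # Feed HTML escapes '&' as '&amp;', so undo that before parsing
    query = urlsplit(html.unescape(link)).query
    targets = parse_qs(query).get("url")
    return targets[0] if targets else link


if __name__ == "__main__":
    # Hypothetical redirect link, not copied from a real feed
    sample = ("http://news.google.com/news/url?sa=T"
              "&amp;url=http://example.com/chimp-story&amp;cid=0")
    print(unwrap_redirect(sample))
```

Parsing the query string is a bit less fragile than a regex when Google adds or reorders parameters, but it still breaks if the parameter gets renamed.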


6 Comments so far

  1. Stuart December 5th, 2007 4:37 pm

    Well, I can't agree more.

  2. randall March 12th, 2008 3:00 am

    The rewrite does not seem to work for me? Did G change the format? I am hopeless with regex :(

  3. emp March 12th, 2008 9:01 am

    Sometimes Google changes the format (redirects are quite common).

    Regexes are really worth your time, though.

    Get yourself a book and learn. Or bribe me to write a tutorial.

    ::emp::

  4. mat May 10th, 2008 3:18 am

    Hi,
    The link I get following your instructions is:
    http://news.google.com/news?hl=en&ned=us&q=chimpanzee+birth&ie=UTF-8&output=rss

    WP-o-Matic is then not able to fetch the individual news items.
    Any idea?

  5. mat May 11th, 2008 12:06 am

    Hi,
    I followed the instructions mentioned above. The problem I get is that no posts are fetched. Although I get no error message, WP-o-Matic says the following after trying to fetch for the first time: “Campaign processed. 0 posts fetched”.

    Any idea?

  6. emp May 15th, 2008 12:48 am

    You are running into what is known as the “inherent brittleness” of scraping.

    This basically means that as soon as your source (in this case, Google News) changes its code, you are out of luck.

    Google has been known to add tracking code every once in a while. That might be the case here. Check the source code of the feed and see if you need to adjust the regular expression.
