Nokogiri scraping with cookies
Nokogiri is a great library for scraping other websites' content with XPath or CSS selectors (you know: the way jQuery lets you query for DOM elements). You can do something like this:
As you can see in the example above, the query is easy, and so is fetching the document through Nokogiri. The tricky part is that the URL will redirect you to a language selection page, because the Makro.be site requires a cookie to determine your language.
So you won’t get the page you actually want to scrape. Luckily, open-uri, which is part of Ruby’s standard library, lets you send a cookie header along with the ‘url’ in the ‘open()’ method. You can get the cookie string for your site from the browser console (Firebug in Firefox or the Developer Tools in Chrome), for example by evaluating ‘document.cookie’ on the page once your language has been set. All together: