Getting started with scraping metadata from websites using Node.js is pretty easy, especially for newer sites. The difficulty comes once you want to support the long tail of older or non-standard sites out there. That long tail is where OpenGraph.io can help.

First, let’s init a new Node project:

npm init

Next, we’ll install the opengraph-io NPM module:

npm install opengraph-io --save

Finally, let’s scrape some websites using our Node app:

// Create a client to make requests to opengraph.io
var opengraph = require('opengraph-io')();

// Scrape Spotify.com (has OpenGraph tags)
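opengraph.getSiteInfo('http://spotify.com', function(err, result){
   // Bail out early so a failed request does not crash on an undefined result
   if (err) {
      return console.error('Request failed:', err);
   }
   console.log('Site title is', result.hybridGraph.title);
});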

// Scrape Amazon Product
opengraph.getSiteInfo('https://www.amazon.com/Mountain-Three-Short-Sleeve-Green/dp/B000NZW3KC/ref=sr_1_1?ie=UTF8&qid=1481311192&sr=8-1&keywords=three+wolf+moon', function(err, result){
   if (err) {
      return console.error('Request failed:', err);
   }
   console.log('Site title is', result.hybridGraph.title);
});

You’ll notice that OpenGraph.io returns a site’s OpenGraph tags when they are available. If any (or all) of the tags are missing from a site, OpenGraph.io will infer what they probably should be. It doesn’t get much easier than that.
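
Since a request can fail and an individual tag can still come back empty, it may be worth reading the title defensively. The snippet below is only a minimal sketch and not part of the opengraph-io module: titleFrom is a hypothetical helper, and stubClient is a made-up stand-in for the real client so the example runs without network access. The only names taken from the example above are getSiteInfo and hybridGraph.title.

```javascript
// Hypothetical helper (not provided by opengraph-io): returns the title
// from a getSiteInfo callback's arguments, or null if it cannot be read.
function titleFrom(err, result) {
  if (err || !result || !result.hybridGraph) {
    return null;
  }
  return result.hybridGraph.title || null;
}

// Made-up stand-in for require('opengraph-io')() so the sketch runs offline.
// It mimics only the callback shape used in the example above.
var stubClient = {
  getSiteInfo: function (url, callback) {
    if (url === 'http://example.com') {
      callback(null, { hybridGraph: { title: 'Example Domain' } });
    } else {
      callback(new Error('lookup failed for ' + url));
    }
  }
};

// Same calling pattern as the real client, with the defensive read applied.
stubClient.getSiteInfo('http://example.com', function (err, result) {
  console.log('Site title is', titleFrom(err, result));
});
```

Swapping stubClient for the real client should leave the callback unchanged, assuming the response keeps the hybridGraph shape shown earlier.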

Since OpenGraph.io is a hosted service that is free for most users, you won’t have to worry about scaling infrastructure or supporting that long tail of sites yourself. Anytime you come across a site that isn’t well supported, just drop us a line.