codeburst

Bursts of code to power through your day. Web Development articles, tutorials, and news.

An Introduction to Web Scraping with Node JS

Brandon Morelli
Published in codeburst
5 min read · Aug 1, 2017
Web Scraping. Photo by michael podger

What is web scraping?

Warnings.

What will we need?

Project Setup.

npm install --save request request-promise cheerio

const rp = require('request-promise');
const cheerio = require('cheerio');

Setting up the Request

const options = {
  uri: `https://www.yourURLhere.com`,
  transform: function (body) {
    return cheerio.load(body);
  }
};

const rp = require('request-promise');
const cheerio = require('cheerio');

const options = {
  uri: `https://www.yourURLhere.com`,
  transform: function (body) {
    return cheerio.load(body);
  }
};

Make the Request

rp(options)
  .then(function (data) {
    // REQUEST SUCCEEDED: DO SOMETHING
  })
  .catch(function (err) {
    // REQUEST FAILED: ERROR OF SOME KIND
  });

rp(options)
  .then(($) => {
    // $ is the value returned by the transform function: the loaded page
    console.log($);
  })
  .catch((err) => {
    console.log(err);
  });
node index.js
// LOGS THE FOLLOWING:
{ [Function: initialize]
  fn:
   initialize {
     constructor: [Circular],
     _originalRoot:
      { type: 'root',
        name: 'root',
        namespace: 'http://www.w3.org/1999/xhtml',
        attribs: {},
        ...
Boilerplate web scraping code

Using the Data

Selectors

<ul id="cities">
  <li class="large">New York</li>
  <li id="medium">Portland</li>
  <li class="small">Salem</li>
</ul>

$('.large').text()
// New York
$('#medium').text()
// Portland
$('li[class=small]').html()
// Salem (.html() returns the inner HTML of the matched element)

Looping

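const cities = [];
$('li').each(function (i, elem) {
  // inside a regular function, `this` is the current element (same as elem)
  cities[i] = $(this).text();
});
// cities: ['New York', 'Portland', 'Salem']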

Finding

<ul id="cities">
  <li class="large">New York</li>
  <li id="c-medium">Portland</li>
  <li class="small">Salem</li>
</ul>
<ul id="towns">
  <li class="large">Bend</li>
  <li id="t-medium">Hood River</li>
  <li class="small">Madras</li>
</ul>

$('#cities').find('.small').text()
// Salem
$('#towns').find('.small').text()
// Madras

Children

$('#cities').children('#c-medium').text();
// Portland

Text & HTML

$('#towns .large').text()
// Bend
$('#towns .large').html()
// Bend (inner HTML only)
$.html($('#towns .large'))
// <li class="large">Bend</li>

Additional Methods

Chrome Developer Tools

Finding class names with chrome dev tools

Limitations

Go forth and scrape!
