Class: WebScraper

new WebScraper(uri)

Create a web scraper.

Parameters:
Name  Type    Description
uri   string  The URI of the web page to scrape.

Members

count

Number of pages to fetch.

parser

Parser instance to handle the generator string.

uri

The URI of the first page to fetch.

Methods

cleanup() → {string}

Clean up a string's contents.

Returns:

A new string.

Type
string

createQueue(result) → {Promise|Queue}

Create a queue of tracks.

Parameters:
Name    Type    Description
result  string  A newline-separated list of tracks.

Returns:

A queue of results.

Type
Promise | Queue
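
As a rough illustration of the input that createQueue() receives, the sketch below splits a newline-separated list into individual entries. The helper name splitEntries is hypothetical and not part of WebScraper, and the real method builds a Queue object whose interface is not documented here.

```javascript
// Hypothetical helper (not part of WebScraper): split a newline-separated
// result string into trimmed, non-empty entries, one per track.
function splitEntries(result) {
  return result
    .split(/\r?\n/)                      // accept both LF and CRLF line endings
    .map((line) => line.trim())          // drop surrounding whitespace
    .filter((line) => line.length > 0);  // skip blank lines
}

const entries = splitEntries('artist1 - track1\n\n  artist2 - track2  \n');
console.log(entries);
```

Each entry could then be handed to whatever builds the actual queue.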

dispatch() → {Promise|Queue}

Dispatch entry.

Returns:

A queue of results.

Type
Promise | Queue

lastfm(uri, [count]) → {Promise|string}

Scrape a Last.fm tracklist.

Parameters:
Name   Type     Attributes  Description
uri    string               The URI of the web page to scrape.
count  integer  <optional>  The number of pages to scrape.

Returns:

A newline-separated list of tracks.

Type
Promise | string

pitchfork(uri, [count]) → {Promise|string}

Scrape a Pitchfork list.

Parameters:
Name   Type     Attributes  Description
uri    string               The URI of the web page to scrape.
count  integer  <optional>  The number of pages to scrape.

Returns:

A newline-separated list of albums.

Type
Promise | string

rateyourmusic(uri, [count]) → {Promise|string}

Scrape a Rate Your Music chart.

Parameters:
Name   Type     Attributes  Description
uri    string               The URI of the web page to scrape.
count  integer  <optional>  The number of pages to scrape.

Returns:

A newline-separated list of albums.

Type
Promise | string

reddit(uri, [count]) → {Promise|string}

Scrape a Reddit forum.

Handles post listings and comment threads. Uses Bob Nisco's heuristic for parsing comments.

Parameters:
Name   Type     Attributes  Description
uri    string               The URI of the web page to scrape.
count  integer  <optional>  The number of pages to scrape.

Returns:

A newline-separated list of tracks.

Type
Promise | string

scrape(uri, count) → {Promise|string}

Scrape a web page.

This function inspects the host of the web page and invokes the appropriate scraping function. The scraping functions share a common shape: each takes the web page URI as input, fetches the page, and returns a generator string as output (wrapped in a Promise). Schematically:

      web page:                      generator string
+-------------------+                   (Promise):
| track1 by artist1 |    scraping
+-------------------+    function    artist1 - track1
| track2 by artist2 |    =======>    artist2 - track2
+-------------------+                artist3 - track3
| track3 by artist3 |
+-------------------+

In the example above, the scraping function converts a table of tracks to a generator string of the form ARTIST - TRACK. If the input were an album chart, the output would be a string of #album commands instead. In other words, a scraping function should extract the meaning of the web page and express it as input to the generator.
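
The conversion in the diagram can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper toGeneratorString and the row shape { artist, title } are assumptions, and the page-fetching and HTML-parsing steps are left out.

```javascript
// Hypothetical helper (not part of WebScraper): convert scraped rows into a
// generator string with one "ARTIST - TRACK" entry per line.
// The row shape { artist, title } is an assumption made for this sketch.
function toGeneratorString(rows) {
  return rows
    .map(({ artist, title }) => `${artist} - ${title}`)
    .join('\n');
}

const rows = [
  { artist: 'artist1', title: 'track1' },
  { artist: 'artist2', title: 'track2' },
  { artist: 'artist3', title: 'track3' },
];

// Wrapped in a Promise to mirror the return shape of the scraping functions.
const generatorString = Promise.resolve(toGeneratorString(rows));
generatorString.then((str) => console.log(str));
```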

Parameters:
Name   Type     Description
uri    string   The URI of the web page to scrape.
count  integer  Number of pages to fetch.

Returns:

A generator string.

Type
Promise | string
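
The host inspection that scrape() performs might look roughly like the sketch below. The host patterns and the helper pickScraper are assumptions made for illustration; only the method names (lastfm, pitchfork, rateyourmusic, reddit, youtube, webpage) come from this page, and the actual matching rules may differ.

```javascript
// Hypothetical host table: maps a hostname pattern to the name of a scraping
// method documented on this page. The patterns themselves are assumptions.
const scrapersByHost = [
  [/(^|\.)last\.fm$/, 'lastfm'],
  [/(^|\.)pitchfork\.com$/, 'pitchfork'],
  [/(^|\.)rateyourmusic\.com$/, 'rateyourmusic'],
  [/(^|\.)reddit\.com$/, 'reddit'],
  [/(^|\.)youtube\.com$/, 'youtube'],
];

// Hypothetical helper: pick a scraping method name for a URI, falling back
// to the generic webpage() scraper when no host pattern matches.
function pickScraper(uri) {
  const { hostname } = new URL(uri);
  const match = scrapersByHost.find(([pattern]) => pattern.test(hostname));
  return match ? match[1] : 'webpage';
}

console.log(pickScraper('https://www.last.fm/user/example/library'));
console.log(pickScraper('https://example.com/some-list'));
```

A table keeps the host rules in one place, so supporting another site would mean adding one row and one scraping method.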

trim() → {string}

Clean up a string's whitespace.

Returns:

A new string.

Type
string
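
One plausible reading of "clean up a string's whitespace" is sketched below: collapse runs of whitespace into single spaces and strip both ends. The helper normalizeWhitespace is hypothetical, and the actual trim() may behave differently.

```javascript
// Hypothetical helper (not part of WebScraper): collapse every run of
// whitespace into a single space, then strip leading and trailing spaces.
// Returns a new string; the input is left unchanged.
function normalizeWhitespace(str) {
  return str.replace(/\s+/g, ' ').trim();
}

console.log(normalizeWhitespace('  artist1 \t-  track1 \n'));
```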

webpage(uri) → {Promise|string}

Scrape a web page.

This is a fall-back function in case none of the other scraping functions apply.

Parameters:
Name  Type    Description
uri   string  The URI of the web page to scrape.

Returns:

A newline-separated list of tracks.

Type
Promise | string

youtube(uri) → {Promise|string}

Scrape a YouTube playlist.

Parameters:
Name  Type    Description
uri   string  The URI of the web page to scrape.

Returns:

A newline-separated list of tracks.

Type
Promise | string