Package 'postlightmercury' reference manual

Title:	Parses Web Pages using Postlight Mercury
Description:	This is a wrapper for the Mercury Parser API. The Mercury Parser is a single API endpoint that takes a URL and gives you back the content reliably and easily. With just one API request, Mercury takes any web article and returns only the relevant content — headline, author, body text, relevant images and more — free from any clutter. It’s reliable, easy-to-use and free. See the webpage here: <https://mercury.postlight.com/>.
Authors:	Mikkel Freltoft Krogsholm
Maintainer:	Mikkel Freltoft Krogsholm <[email protected]>
License:	MIT + file LICENSE
Version:	1.2
Built:	2025-02-19 02:57:58 UTC
Source:	https://github.com/cran/postlightmercury

Turns NULL values in a list into NAs.

Description

Turns NULL values in a list into NAs.

Usage

null_to_na(mylist)

Arguments

`mylist`	A list whose NULL values are to be turned into NAs.
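The behaviour described above can be sketched in base R. This is a minimal illustration under assumptions, not the package's source code: the name `null_to_na_sketch` and the sample record are hypothetical, and the sketch only handles NULLs at the top level of the list.

```r
# Hypothetical stand-in for null_to_na(); not the package's source code.
null_to_na_sketch <- function(mylist) {
  # lapply() keeps the list's length and names; each NULL becomes a logical NA.
  lapply(mylist, function(x) if (is.null(x)) NA else x)
}

# Invented record with one missing field.
record <- list(title = "An article", author = NULL, word_count = 350L)
cleaned <- null_to_na_sketch(record)
str(cleaned)
```

A conversion like this is useful before turning a list into a table row, because a NULL element has length zero and cannot fill a cell, whereas NA can.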

Removes html

Description

The function uses tools from the rvest and xml2 packages to strip the HTML markup and turn the content into plain text.

Usage

remove_html(strings, trim = TRUE)

Arguments

`strings`	The string(s) you want to clean
`trim`	Logical; should the string be trimmed? Defaults to TRUE.

Value

a string
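How HTML stripping with xml2 might look can be sketched as follows. This is an illustration under assumptions rather than the actual body of `remove_html()`: the name `strip_html_sketch` is hypothetical, only xml2 is used (not rvest), and trimming is assumed to mean removing leading and trailing whitespace.

```r
library(xml2)

# Hypothetical sketch; the real remove_html() may work differently.
strip_html_sketch <- function(strings, trim = TRUE) {
  out <- vapply(strings, function(s) {
    if (is.na(s)) return(NA_character_)
    # Wrapping in <body> makes read_html() treat the input as literal HTML,
    # even when the string itself contains no tags.
    doc <- read_html(paste0("<body>", s, "</body>"))
    xml_text(doc)
  }, character(1), USE.NAMES = FALSE)
  # Assumption: trimming removes leading and trailing whitespace.
  if (trim) out <- trimws(out)
  out
}

strip_html_sketch("<p>Hello <b>world</b></p>")
```

Parsing the markup and extracting the text nodes is generally more robust than removing tags with regular expressions, which is presumably why the package relies on an HTML parser.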

Examples

## Not run: 
# First get an API key here: https://mercury.postlight.com/web-parser/

# Then run the code below, replacing the X's with your API key.
url <- "https://trackchanges.postlight.com/building-awesome-cms-f034344d8ed"
my_data <- web_parser(page_urls = url,
                      api_key = "XXXXXXXXXXXXXXXXXXXXXXX")

# With html formatting:
my_data$content

# Now remove it:
my_data$content <- remove_html(my_data$content)

# Without html formatting:
my_data$content

## End(Not run)

Parses web pages

Description

With just one API request, Mercury takes any web article and returns only the relevant content (headline, author, body text, relevant images and more), free from any clutter. It’s reliable, easy-to-use and free.

Usage

web_parser(page_urls, api_key)

Arguments

`page_urls`	One or more URLs to be parsed
`api_key`	Your key for the Mercury Parser API

Value

a tibble
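Since `page_urls` accepts several URLs and the result is a single table, one plausible shape is one row per URL. The sketch below illustrates that idea only: `fake_parse_one` is a stub standing in for an API call, its field names are invented, `parse_many_sketch` is hypothetical, and a base data frame is used in place of a tibble to keep the example dependency-free.

```r
# Stub standing in for a single API call; field names are invented for illustration.
fake_parse_one <- function(url) {
  list(url = url,
       title = paste("Title for", url),
       author = if (grepl("anonymous", url)) NULL else "A. Writer")
}

# Hypothetical helper: parse each URL, then stack the results into one table.
parse_many_sketch <- function(page_urls, parse_one = fake_parse_one) {
  rows <- lapply(page_urls, function(u) {
    record <- parse_one(u)
    # A NULL field cannot form a column, so replace it with NA first.
    record <- lapply(record, function(x) if (is.null(x)) NA else x)
    as.data.frame(record, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

urls <- c("https://example.com/signed-post",
          "https://example.com/anonymous-post")
parsed <- parse_many_sketch(urls)
parsed
```

Passing the single-URL parser as an argument keeps the row-binding logic testable without network access or an API key.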

Source

https://mercury.postlight.com/web-parser/

Examples

## Not run: 
# First get an API key here: https://mercury.postlight.com/web-parser/

# Then run the code below, replacing the X's with your API key:
web_parser(page_urls = "https://trackchanges.postlight.com/building-awesome-cms-f034344d8ed",
           api_key = "XXXXXXXXXXXXXXXXXXXXXXX")

## End(Not run)

Help Index

`null_to_na`	Turns NULL values in a list into NAs.
`remove_html`	Removes html
`web_parser`	Parses web pages