Fetch page title from URL | Andrew's Tech Classes

This function loads the URL and returns two titles from the page: the text of the title tag and the text of the first h1.

It lives in smart_link.module, which Drupal loads whenever the module is enabled, so it can be called directly with drush eval.

drush ev "print_r(smart_link_get_info('https://andrewsclasses.com/courses'));"

This is handy during development, but the function really belongs in a service class or another file that is loaded only when it is needed.

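/**
 * For a given URL, return the HTML title and first h1 text.
 *
 * @param string $url
 *   Absolute URL of the page to fetch.
 *
 * @return array
 *   Array with 'html_title' and 'html_h1' keys. A value is NULL when the
 *   page has no matching element.
 */
function smart_link_get_info($url) {
  $info = [];
  $html5 = new \Masterminds\HTML5();
  // With Guzzle's default settings, get() throws on connection errors and
  // on 4xx/5xx responses.
  $page = (string) \Drupal::httpClient()->get($url)->getBody();
  $parsed = $html5->loadHTML($page);

  // item(0) returns NULL when no element matches, so check before reading.
  $title = $parsed->getElementsByTagName('title')->item(0);
  $info['html_title'] = $title ? trim($title->textContent) : NULL;
  $h1 = $parsed->getElementsByTagName('h1')->item(0);
  $info['html_h1'] = $h1 ? trim($h1->textContent) : NULL;

  return $info;
}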
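The element lookup can be tried outside Drupal, which makes it easier to see what happens when a page has no h1. The following is a minimal sketch, not the module's code: it assumes PHP's bundled DOM extension in place of \Masterminds\HTML5, takes an HTML string instead of fetching a URL, and uses a hypothetical function name, smart_link_parse_titles().

```php
<?php

/**
 * Extracts the title and first h1 text from an HTML string.
 *
 * Hypothetical helper for illustration, not part of the Smart Link module.
 * Uses PHP's built-in DOMDocument rather than \Masterminds\HTML5.
 */
function smart_link_parse_titles(string $html): array {
  $doc = new \DOMDocument();
  // Collect parser warnings (for example on HTML5-only tags) instead of
  // printing them, then restore the previous setting.
  $previous = libxml_use_internal_errors(TRUE);
  $doc->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  $info = [];
  foreach (['html_title' => 'title', 'html_h1' => 'h1'] as $key => $tag) {
    // item(0) is NULL when the document has no such element.
    $node = $doc->getElementsByTagName($tag)->item(0);
    $info[$key] = $node ? trim($node->textContent) : NULL;
  }
  return $info;
}

// Sample page with a title but no h1.
$sample = '<html><head><title>Courses</title></head><body><p>No heading here.</p></body></html>';
print_r(smart_link_parse_titles($sample));
```

Keeping the parsing separate from the HTTP request means the lookup can be checked against fixed HTML strings without any network access.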

Topics

Module Development

Guzzle
