Challenge #2 – 5 Minute Read

12 November 2018 Solved Twig Easy

In this challenge you will design an algorithm that predicts how long a piece of text takes to read. This challenge was inspired by the amusing discovery that this excellent, rather lengthy article was hard-coded (as were all other blog articles on the site) to be a 5 minute read.

Challenge

The challenge is to write a twig macro called "readTime" that will accept a single parameter: text (a string of plaintext or markdown). The macro should output a predicted length of time that the text will take to read, using medium.com as a measure of accuracy. So for example, calling:

{% set entry = craft.entries.slug('our-credentials').one() %}

{{ readTime(entry.text) }}

Might output:

1 minute

Whereas calling:

{% set entry = craft.entries.slug('an-annotated-webpack-4-config').one() %}

{{ readTime(entry.text) }}

Might output:

50 minutes

How you choose to calculate the read time based on the content of the text is entirely up to you. The goal is to come up with a solution that predicts read times similar to (or better than) those on medium.com.

Rules

The macro must output a predicted read time in minutes, given the parameter described above. It should not rely on any plugins. The code will be evaluated on the following criteria, in order of priority:

  1. Originality
  2. Readability
  3. Accuracy

It can use whatever ingenious algorithm you come up with for predicting an article's read time; the more creative the better. The code should nevertheless be readable and easy to understand, and the macro should output read times similar to those on medium.com.

Tips

Begin with the easiest metric, word count, then experiment and tweak from there. Get inventive and add other metrics to the mix. If you feel up to the challenge then see what you can do with markdown text as the input (an HTML to markdown converter such as this one may help in testing).

Your results don't need to be identical to every article on medium.com, but the following articles will be used as an initial measure of accuracy:

Finally, when you have a working macro, see what read time your macro outputs for An Annotated webpack 4 Config for Frontend Web Development and then go tell Andrew Welch!

Solution

Depending on your source, the average reading speed of most adults is around 250 words per minute. According to this page on medium.com, read time is calculated as follows:

Read time is based on the average reading speed of an adult (roughly 265 WPM). We take the total word count of a post and translate it into minutes, with an adjustment made for images.

So to get us started let's split the text into words using the split filter with the assumption that words are separated by spaces. We can then get the number of words using the length filter and divide it by 265 words per minute to give us a read time.

{% macro readTime(text) %}

    {% set words = text|split(' ') %}
    {% set readTime = words|length / 265 %}
    {{ readTime }} minutes

{% endmacro %}

It is possible to solve this with a single line of code by simply chaining the twig filters together.

{% macro readTime(text) %}

    {{ text|split(' ')|length / 265 }} minutes

{% endmacro %}

We'll go back to the first, more readable solution and fix two potential issues. The first occurs if the read time turns out to be a fraction such as 3.25, in which case we'll round it up (to 4) using Craft's ceil function. The second occurs if the read time is 1, in which case we'll leave the "s" out of "minutes" to make it singular.

{% macro readTime(text) %}

    {% set words = text|split(' ') %}
    {% set readTime = ceil(words|length / 265) %}
    {{ readTime }} minute{{ readTime > 1 ? 's' }}

{% endmacro %}

Similar solutions: Quentin Delcourt, Spenser Hannon, Doug St. John, Philip Thygesen.
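As a usage note, a macro defined in a template generally has to be imported before it can be called. A minimal sketch, assuming the macro sits in the same template as the call; the `macros` alias is a name chosen here for illustration, and `entry.text` is the field used in the challenge examples:

```twig
{# Import the macros defined in this template under a hypothetical alias #}
{% import _self as macros %}

{% set entry = craft.entries.slug('our-credentials').one() %}

{{ macros.readTime(entry.text) }}
```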


Another way to determine whether to output "minute" or "minutes" is to use Yii's internationalization (I18N) plural feature along with Craft's translate or t filter.

{% macro readTime(text) %}

    {% set words = text|split(' ') %}
    {% set readTime = ceil(words|length / 265) %}
    {{ readTime }} {{ '{n,plural,=1{minute} other{minutes}}'|t({n: readTime}) }}

{% endmacro %}

We could also take advantage of the duration feature to have the duration be automatically output in words. The parameter is expected in seconds, so we multiply readTime by 60 to accommodate this.

{% macro readTime(text) %}

    {% set words = text|split(' ') %}
    {% set readTime = ceil(words|length / 265) %}
    {{ '{n,duration,%with-words}'|t({n: readTime * 60}) }}

{% endmacro %}

Up until now we've assumed that the text is provided as plain text, yet the challenge stated that it can also be provided as markdown text. In order to deal with this, we use the markdown filter provided by Craft to convert it into HTML code, followed by the striptags filter to remove all HTML tags, thereby converting it to plain text.

We can also get smarter about how we identify words, replacing all dashes and new lines with spaces using the replace filter before splitting the text.

{% macro readTime(text) %}

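    {# Render the markdown to HTML, then strip the tags to get plain text #}
    {% set html = text|markdown %}
    {% set plaintext = html|striptags|replace({'—': ' ', '–': ' ', '-': ' ', '\n': ' '}) %}
    {# Count words in the cleaned-up plain text, not the raw input #}
    {% set words = plaintext|split(' ') %}
    {% set readTime = ceil(words|length / 265) %}
    {{ readTime }} minute{{ readTime > 1 ? 's' }}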

{% endmacro %}

Similar solutions: Patrick Harrington.
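One thing that splitting on single spaces does not handle is runs of whitespace: two consecutive spaces produce an empty string in the resulting array, and that empty string is counted as a word. A possible refinement, sketched here on the assumption that Craft's replace filter treats a search string wrapped in slashes as a regular expression, is to collapse all whitespace before splitting. The `normalized` variable is a name introduced for this example only.

```twig
{% macro readTime(text) %}

    {% set plaintext = text|markdown|striptags|replace({'—': ' ', '–': ' ', '-': ' '}) %}
    {# Collapse any run of whitespace (including new lines) into a single space #}
    {% set normalized = plaintext|replace('/\\s+/', ' ')|trim %}
    {% set words = normalized|split(' ') %}
    {% set readTime = ceil(words|length / 265) %}
    {{ readTime }} minute{{ readTime > 1 ? 's' }}

{% endmacro %}
```

The backslash is doubled because Twig processes escape sequences in string literals before the pattern reaches the filter.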


If we assume that the text is provided as markdown, we can take the number of images into account in the read time calculation, using the replace filter with a regular expression that looks for all instances of ![...](...) in the markdown text, or for all instances of <img ...> in the HTML, followed by the split filter. That allows us to calculate the number of images in the text and divide it by an arbitrary "images per minute" of 12, assuming that people spend an average of 5 seconds looking at each image (this will of course depend on the type of image: a cat versus a complex graph).

{% macro readTime(text) %}

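    {% set html = text|markdown %}
    {# Swap each <img> tag for a marker; splitting on it yields (image count + 1) parts #}
    {% set imageCount = html|replace('/<img ([^>]+?)>/', '%%IMAGE%%')|split('%%IMAGE%%')|length - 1 %}
    {% set plaintext = html|striptags|replace({'—': ' ', '–': ' ', '-': ' ', '\n': ' '}) %}
    {# Count words in the cleaned-up plain text, not the raw input #}
    {% set words = plaintext|split(' ') %}
    {# 12 images per minute assumes roughly 5 seconds per image #}
    {% set readTime = ceil((words|length / 265) + (imageCount / 12)) %}
    {{ readTime }} minute{{ readTime > 1 ? 's' }}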

{% endmacro %}

Similar solutions: Nate Iler, Alex Roper.
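To make the combined formula concrete, here is a worked example with made-up numbers (2,000 words and 6 images are purely illustrative, not taken from any of the test articles):

```latex
\text{read time}
  = \left\lceil \frac{2000}{265} + \frac{6}{12} \right\rceil
  \approx \lceil 7.55 + 0.50 \rceil
  = \lceil 8.05 \rceil
  = 9 \text{ minutes}
```

Because the sum is rounded up only once, at the end, the six images add half a minute to the raw total before rounding instead of each being rounded separately.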


At this stage we've got a pretty accurate solution that takes words and images per minute into account. I even ran the code above on the article An Annotated webpack 4 Config for Frontend Web Development and got an amazingly round "90 minutes"!

So, what other approaches could we take?

This solution by Matt Stein counts the number of unique words in the text using Craft's unique filter and does some maths to take the complexity of the text into account when calculating a read time.

This solution by Andrew Welch takes a different approach to calculating the number of words in the text. It counts the number of characters in the entire text and divides that by an average word length of 5.1 characters. The solution also caches the calculation globally using Craft's {% cache %} tags with a key constructed from the URL and the name of the entry field. This avoids having to calculate the entry's read time on every request.

{% cache globally using key craft.app.request.url~"entry.someRichText" %}
    {{ readTime(entry.someRichText) }}
{% endcache %}
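The character-count idea described above could be sketched roughly as follows. This is not Andrew Welch's submitted code, only an illustration of the technique under the same assumptions as the earlier macros (markdown input, 265 words per minute). The `wordCount` variable is a name introduced here.

```twig
{% macro readTime(text) %}

    {% set plaintext = text|markdown|striptags %}
    {# Estimate the word count from the character count, assuming 5.1 characters per word #}
    {% set wordCount = plaintext|length / 5.1 %}
    {% set readTime = ceil(wordCount / 265) %}
    {{ readTime }} minute{{ readTime > 1 ? 's' }}

{% endmacro %}
```

Note that the length filter counts every character in the string, spaces and punctuation included, so the estimate depends on how the average word length was measured.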

Submitted Solutions

  • Quentin Delcourt
  • Patrick Harrington
  • Nate Iler
  • Andrew Welch
  • Spenser Hannon
  • Doug St. John
  • Matt Stein
  • Alex Roper
  • Philip Thygesen