In this challenge you will design an algorithm that predicts how long a piece of text takes to read. This challenge was inspired by the amusing discovery that this excellent, rather lengthy article was hard-coded (as were all other blog articles on the site) to be a 5 minute read.
The challenge is to write a Twig macro called “readTime” that accepts a single parameter, text (a string of plain text or markdown). The macro should output a predicted length of time that the text will take to read, using medium.com as a measure of accuracy. So for example, calling:
{% set entry = craft.entries.slug('our-credentials').one() %}
{{ readTime(entry.text) }}
Might output:
1 minute
Whereas calling:
{% set entry = craft.entries.slug('an-annotated-webpack-4-config').one() %}
{{ readTime(entry.text) }}
Might output:
50 minutes
How you choose to calculate the read time based on the content of the text is entirely up to you. The goal is to come up with a solution that predicts read times similar to (or better than) those on medium.com.
The macro must output a predicted read time in minutes, given the parameter as described above. It should not rely on any plugins and the code will be evaluated based on the following criteria in order of priority:
It can use whatever ingenious algorithm you come up with for predicting an article’s read time, the more creative the better. The code should nevertheless be readable and easy to understand and the result should output read times similar to medium.com.
Begin with the easiest metric, word count, and experiment and tweak from there. Get inventive and add other metrics to the mix. If you feel up to the challenge then see what you can do with markdown text as the input (an HTML to markdown converter such as this one may help in testing).
Your results don’t need to be identical to every article on medium.com, but the following articles will be used as an initial measure of accuracy:
Finally, when you have a working macro, see what read time your macro outputs for An Annotated webpack 4 Config for Frontend Web Development and then go tell Andrew Welch!
Depending on your source, the average reading speed of most adults is around 250 words per minute. According to this page on medium.com, read time is calculated as follows:
Read time is based on the average reading speed of an adult (roughly 265 WPM). We take the total word count of a post and translate it into minutes, with an adjustment made for images.
So to get us started, let’s split the text into words using the split filter, with the assumption that words are separated by spaces. We can then get the number of words using the length filter and divide it by 265 words per minute to give us a read time.
{% macro readTime(text) %}
{% set words = text|split(' ') %}
{% set readTime = words|length / 265 %}
{{ readTime }} minutes
{% endmacro %}
It is possible to solve this with a single line of code by simply chaining the twig filters together.
{% macro readTime(text) %}
{{ text|split(' ')|length / 265 }} minutes
{% endmacro %}
We’ll go back to the first, more readable solution and fix two potential issues. The first occurs if the read time turns out to be a fraction such as 3.25, in which case we’ll round it up using Craft’s ceil function. The second occurs if the read time is 1, in which case we’ll leave the “s” out of “minutes” to make it singular.
{% macro readTime(text) %}
{% set words = text|split(' ') %}
{% set readTime = ceil(words|length / 265) %}
{{ readTime }} minute{{ readTime > 1 ? 's' }}
{% endmacro %}
Similar solutions: Quentin Delcourt, Spenser Hannon, Doug St. John, Philip Thygesen.
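Twig macros are normally brought into scope with an import or from tag before they are called. A minimal usage sketch, assuming the macro above is saved in a hypothetical template at templates/_macros/readtime.twig (the path is an illustration only, not something the challenge prescribes):

```twig
{# Hypothetical path: the readTime macro is assumed to live in templates/_macros/readtime.twig #}
{% from '_macros/readtime' import readTime %}

{# Same entry lookup as in the challenge description #}
{% set entry = craft.entries.slug('our-credentials').one() %}
{{ readTime(entry.text) }}
```

If the macro is defined in the same template that calls it, `{% from _self import readTime %}` does the same job, as one of the submitted solutions further down shows.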
Another way to determine whether to output “minute” or “minutes” is to use Yii’s internationalization (I18N) plural feature along with Craft’s translate or t filter.
{% macro readTime(text) %}
{% set words = text|split(' ') %}
{% set readTime = ceil(words|length / 265) %}
{{ readTime }} {{ '{n,plural,=1{minute} other{minutes}}'|t({n: readTime}) }}
{% endmacro %}
We could also take advantage of the duration feature to have the duration output automatically in words. The parameter is expected in seconds, so we multiply readTime by 60 to accommodate this.
{% macro readTime(text) %}
{% set words = text|split(' ') %}
{% set readTime = ceil(words|length / 265) %}
{{ '{n,duration,%with-words}'|t({n: readTime * 60}) }}
{% endmacro %}
Up until now we’ve assumed that the text is provided as plain text, yet the challenge stated that it can also be provided as markdown text. In order to deal with this, we use the markdown filter provided by Craft to convert it into HTML code, followed by the striptags filter to remove all HTML tags, thereby converting it to plain text.
We can also get smarter about how we identify words, replacing all dashes and new lines with spaces using the replace filter before splitting the text.
{% macro readTime(text) %}
{% set html = text|markdown %}
{% set plaintext = html|striptags|replace({'—': ' ', '–': ' ', '-': ' ', '\n': ' '}) %}
{% set words = plaintext|split(' ') %}
{% set readTime = ceil(words|length / 265) %}
{{ readTime }} minute{{ readTime > 1 ? 's' }}
{% endmacro %}
Similar solutions: Patrick Harrington.
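One caveat with splitting on a single space is that consecutive spaces produce empty strings, and each of those is counted as a word. The following is a minimal sketch of one way to guard against that, not a method taken from any submission: it loops over the pieces and counts only the non-empty ones. For brevity it replaces only new lines and tabs with spaces, and it assumes an empty text should still report one minute.

```twig
{% macro readTime(text) %}
    {# Convert markdown to HTML, strip the tags, and turn new lines and tabs into spaces #}
    {% set plaintext = text|markdown|striptags|replace({'\n': ' ', '\t': ' '}) %}

    {# Splitting 'a  b' on ' ' yields 'a', '' and 'b', so count only the non-empty pieces #}
    {% set wordCount = 0 %}
    {% for piece in plaintext|split(' ') %}
        {% if piece|trim != '' %}
            {% set wordCount = wordCount + 1 %}
        {% endif %}
    {% endfor %}

    {# Round up to whole minutes; assumption: never report less than 1 minute #}
    {% set readTime = max(1, (wordCount / 265)|round(0, 'ceil')) %}
    {{ readTime }} minute{{ readTime > 1 ? 's' }}
{% endmacro %}
```

The loop is more verbose than chaining filters, but it avoids regular expressions entirely and keeps the counting rule easy to read.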
If we assume that the text is provided as markdown, we can take the number of images into account in the read time calculation, using the replace filter with a regular expression that looks for all instances of ![...](...) in the markdown text, or for all instances of <img ...> in the HTML, followed by the split filter. That allows us to calculate the number of images in the text and divide it by an arbitrary “images per minute” of 12, assuming that people spend an average of 5 seconds looking at each image (this will of course depend on the type of image: a cat versus a complex graph).
{% macro readTime(text) %}
{% set html = text|markdown %}
{% set imageCount = html|replace('/<img ([^>]+?)>/', '%%IMAGE%%')|split('%%IMAGE%%')|length - 1 %}
{% set plaintext = html|striptags|replace({'—': ' ', '–': ' ', '-': ' ', '\n': ' '}) %}
{% set words = plaintext|split(' ') %}
{% set readTime = ceil((words|length / 265) + (imageCount / 12)) %}
{{ readTime }} minute{{ readTime > 1 ? 's' }}
{% endmacro %}
Similar solutions: Nate Iler, Alex Roper.
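To make the arithmetic concrete, here is a small worked example using the same 265 words per minute and 12 images per minute as above. The word and image counts are hypothetical values chosen only so the numbers are easy to follow.

```twig
{# Hypothetical counts, chosen only for illustration #}
{% set wordCount = 1325 %}
{% set imageCount = 6 %}

{# 1325 / 265 = 5 minutes of reading, 6 / 12 = 0.5 minutes of looking at images #}
{% set minutes = ((wordCount / 265) + (imageCount / 12))|round(0, 'ceil') %}

{# 5.5 rounds up to 6 #}
{{ minutes }} minutes
```

Because the result is rounded up, even a single image can tip an article over into the next whole minute.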
At this stage we’ve got a pretty accurate solution that takes words and images per minute into account. I even ran the code above on the article An Annotated webpack 4 Config for Frontend Web Development and got an amazingly round “90 minutes”!
So, what other approaches could we take?
This solution by Matt Stein counts the number of unique words in the text using Craft’s unique filter and does some maths to take the complexity of the text into account when calculating a read time.
This solution by Andrew Welch takes a different approach to calculating the number of words in the text. It counts the number of characters in the entire text and divides that by an average word length of 5.1 characters, plus 1 to account for the space or punctuation that follows each word. The solution also caches the calculation globally using Craft’s {% cache %} tags with a key constructed from the URL and the name of the entry field. This avoids having to calculate the entry’s read time on every request.
{% cache globally using key craft.app.request.url~"entry.someRichText" %}
{{ readTime(entry.someRichText) }}
{% endcache %}
Solution submitted by Quentin Delcourt on 12 November 2018.
{% macro readTime(text) %}
{% set wordsPerMinute = 265 %}
{% set words = text|split(' ') %}
{% set readingMinutes = ceil((words|length)/wordsPerMinute) %}
{{ readingMinutes }} minute{{ (readingMinutes > 1) ? 's' }}
{% endmacro %}
Solution submitted by Patrick Harrington on 12 November 2018.
{% macro readTime(text, wpm) %}
{%- spaceless %}
{% set wordCount = text
|md|striptags
|replace({
"—": " ",
"–": " ",
"-": " ",
"\n": " ",
"\r\n": " ",
"\t": " ",
"\n\r": " "
})
|split(' ')
|length
%}
{% set readingTime = (wordCount / wpm|default(265)) |round %}
{{ "#{readingTime} minute#{readingTime != 1 ? 's'}" }}
{% endspaceless -%}
{% endmacro %}
Solution submitted by Nate Iler on 12 November 2018.
{% macro readTime(text, wordsPerMinute = 260, imagesPerMinute = 24) %}
{# normalize the input #}
{% set markup = text|markdown %}
{# img markup placeholder #}
{% set imagePlaceholder = '[[ IMAGE_PLACEHOLDER_' ~ random() ~ ' ]]' %}
{# replace 'img' tags with placeholder / split on placeholder / count'em #}
{% set images = (markup|replace('/<img([\w\W]+?)>/', imagePlaceholder)|split(imagePlaceholder)|length) - 1 %}
{# Remove markup / split on space / count'em #}
{% set words = markup|striptags|split(' ')|length %}
{# read time (whole minutes -> round up) #}
{% set minutes = ((words / wordsPerMinute) + (images / imagesPerMinute))|round(0, 'ceil') %}
{# the minute(s) label (translatable) #}
{% set label = ('minute' ~ (minutes == 1 ? '' : 's'))|t('site') %}
{{ minutes ~ ' ' ~ label }}
{% endmacro %}
Solution submitted by Andrew Welch on 13 November 2018.
{#
# Challenge #2 – 5 Minute Read
# https://craftcodingchallenge.com/challenge-2-5-minute-read
# @author Andrew Welch, nystudio107
#}
{#
# Calculate an estimated reading time in minutes, making some assumptions based
# on English as the language, average word length, and average words per minute read
# @param string text
#}
{% macro readTime(text) %}
{#
# Assume English, based on the contest parameters/examples
# http://www.wolframalpha.com/input/?i=average+english+word+length
#}
{% set averageWordLength = 5.1 %}
{% set wordsReadPerMinute = 200 %}
{# Ballpark the number of words in the text; add 1 because spaces/punctuation #}
{% set averageNumberOfWords = (text | length) / (averageWordLength + 1) %}
{% set readingTimeInMins = round(averageNumberOfWords / wordsReadPerMinute) %}
{{ "{0, duration,%with-words}" | t([readingTimeInMins * 60]) }}
{% endmacro %}
{% from _self import readTime %}
{# No reason to do the work on every page load; cache the result #}
{% cache globally using key craft.app.request.url~"entry.someRichText" %}
{{ readTime(entry.someRichText) }}
{% endcache %}
Solution submitted by Spenser Hannon on 13 November 2018.
{% macro readTime(text) %}
{% set words = text|trim|split(' ') %}
{% set wordCount = words|length %}
{% set readTime = (wordCount / 233)|round %}
{{ max(readTime, 1) }} Minute{{ readTime > 1 ? 's' : '' }}
{% endmacro %}
Solution submitted by Doug St. John on 14 November 2018.
{% set entry = craft.entries.section('news').order('RAND()').one() %}
{% import _self as self %}
<p>{{ entry.title }} (ID: {{ entry.id }})</p>
{{ self.readTime(entry.body) }}
{% macro readTime(text) %}
{#
Words per minute as average adult reading speed
https://www.irisreading.com/what-is-the-average-reading-speed/
#}
{% set wpm = 225 %}
{% set numberOfWords = text|striptags|split(' ')|length %}
{# Round to whole numbers >= 1 #}
{% set readingMinutes = max(numberOfWords/wpm, 1) | round %}
<p>{{ numberOfWords }} words / {{ readingMinutes }} minutes</p>
{% endmacro %}
{# Other considerations: images, emojis, links, sidebar items, etc #}
Solution submitted by Matt Stein on 16 November 2018.
{# takes html or markdown #}
{% macro readTime(text) %}
{% spaceless %}
{# turn it into markup, whatever it is #}
{% set markup = text | trim | markdown %}
{# strip it down and replace characters that join perfectly good words #}
{% set strippedText = markup | striptags | replace('/[—+]/', ' ') %}
{# lowercase everything for fun, replace whitespace variations with spaces, and get rid of non-alphanumerics #}
{% set tokenizedText = strippedText | lower | replace('/[\s\n\r]+/', ' ') | replace('/[^a-zA-Z0-9- ]/', '') %}
{# separate the bits into "words" #}
{% set words = tokenizedText | split(' ') %}
{% set wordCount = words | length %}
{% set uniqueWords = words | unique %}
{# calculate the ratio of unique words used, which is our pedestrian difficulty factor #}
{% set uniqueWordRatio = uniqueWords | length / wordCount %}
{# plugin number vaguely provided by Googling human reading speed #}
{% set wordsReadPerMinute = 220 %}
{# calculate a multiplier based on a sensible and obvious bit of division #}
{% set magicNumber = uniqueWordRatio / 0.37934668071654 %}
{% set readTimeMinutes = ceil((wordCount / wordsReadPerMinute) * magicNumber) %}
{{ readTimeMinutes }} minutes
{% endspaceless %}
{% endmacro %}
Solution submitted by Alex Roper on 16 November 2018.
{% macro readTime(text) %}
{# Text filters:
# 1. Convert Markdown to HTML
# 2. Strip out all HTML tags
# 3. Replace more than one whitespace character
# (tabs, space, new lines) with only 1 space.
# 4. Trim excess trailing or leading spaces.
# 5. Create an array for words splitting at spaces.
# 6. Count the length.
-#}
{% set words = text|md|striptags|replace('/\s{1,}/', ' ')|trim()|split(' ')|length %}
{# 1. Find all images in Markdown: ![alt text](image.jpg)
# and replace them with "%%IMAGE%%"
# 2. Create an array for words splitting at "%%IMAGE%%"
# 3. Count the length less 1.
-#}
{% set images = text|replace('/!\[[^\]]*?\]\([^)]+\)/', '%%IMAGE%%')|split('%%IMAGE%%')|length - 1 %}
{# Calculate read time for images in seconds.
# First image = 12 seconds.
# Each image after that is 1 second less than the previous image.
# Minimum is 3 seconds.
#
# Taken from
# Read Time and You
# Here’s how read time is calculated
# https://blog.medium.com/read-time-and-you-bc2048ab620c
-#}
{% set secondsPerImage = 12 %}
{% set imageDuration = 0 %}
{% for i in 1..images if images > 0 %}
{% set imageDuration = imageDuration + secondsPerImage %}
{% if secondsPerImage > 3 %}
{% set secondsPerImage = secondsPerImage - 1 %}
{% endif %}
{% endfor %}
{% set wordsPerMin = 275 %}
{% set readTime = ((words / wordsPerMin) + (imageDuration / 60))|round(0, 'ceil') %}
{{ readTime }} minute{{ readTime > 1 ? 's' }}
{% endmacro %}
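The image rule described in the comments above (12 seconds for the first image, one second less for each following image, never less than 3 seconds) can also be written without the `for ... if` form, which was deprecated in Twig 2.10 and removed in Twig 3. The following is a sketch of the same arithmetic, not a replacement for the submission; the image count is a hypothetical value used only for illustration.

```twig
{# Hypothetical image count, chosen only for illustration #}
{% set imageCount = 12 %}
{% set imageSeconds = 0 %}

{# Guard against zero images, because 1..0 would count down and loop twice #}
{% if imageCount > 0 %}
    {% for i in 1..imageCount %}
        {# 12 seconds for image 1, 11 for image 2, and so on, never below 3 #}
        {% set imageSeconds = imageSeconds + max(13 - i, 3) %}
    {% endfor %}
{% endif %}

{# The first ten images add 12 + 11 + ... + 3 = 75 seconds; the last two add 3 each #}
{{ imageSeconds }} seconds
```

With 12 images this outputs 81 seconds, which is the same total the loop in the submission arrives at.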
Solution submitted by Philip Thygesen on 17 November 2018.
{% macro readTime(text) %}
{% set numberOfWords = text|split(' ')|length %}
{% set wordsPerMinute = 220 %}
{% set readTime = (numberOfWords/wordsPerMinute) %}
{% if readTime > 1 %}
{% set readTime = readTime|number_format(0, '.', ',') ~ ' Minutes'%}
{% else %}
{% set readTime = '1 Minute' %}
{% endif %}
{{ readTime }}
{% endmacro %}