Code used for this blog
Here are some of the custom plugins for Jekyll that I use.
- Customizations for kramdown
- Math support with KaTeX
- Verbatim include
- Post-processing HTML to handle table of contents
- Generating a sitemap from git data
Customizations for kramdown
This code adds a span with class allcaps around words in uppercase and helps to avoid hyphenating URLs. See my blog post for details about the implementation and the rationale behind this.
# coding: utf-8
#
# Customizations for kramdown.
#
# Copyright (c) 2017 Tobias Brink
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
module TobiasBrink
module_function # This makes the following functions available
# without including the module.
# Put a span with class "allcaps" around ALL CAPS words.
def span_allcaps(text)
text.gsub(%r{
[\p{Upper}0-9] # start with uppercase letter or digit
(?:[\p{Upper}0-9.'’]|&)+ # allow & or apostrophe
(?!\w\w) # not followed by two alphanumeric chars,
# one is still allowed for plurals like "URLs"
}x) do |m|
if m.to_s.gsub(/\p{^Upper}/, "").length > 1
"<span class=\"allcaps\">#{m}</span>"
else
m
end
end
end
end
# Custom kramdown converter ############################################
module Kramdown
module Converter
class TBHtml < Html
# Custom <a> tags.
#
# Add HTML class to avoid hyphenation to those links that have
# the same link text and href.
#
# Kramdown API usage:
# * inner: get the content of the HTML element, i.e., the
# link text
def convert_a(el, indent)
link_text = inner(el, indent)
if el.attr["href"] == link_text
if el.attr.key?("class")
el.attr["class"] += " no-hyphenation"
else
el.attr["class"] = "no-hyphenation"
end
end
super(el, indent)
end
# Increase spacing for ALL CAPS.
#
# Kramdown API usage:
# none
def convert_text(el, indent)
return TobiasBrink::span_allcaps(super)
end
end
end
end
# Liquid filter for the allcaps thing, because it is useful ############
module Jekyll
module AllCapsFilter
def span_allcaps(text)
return TobiasBrink::span_allcaps(text)
end
end
end
Liquid::Template.register_filter(Jekyll::AllCapsFilter)
# Use it in this custom Jekyll converter. ##############################
module Jekyll
module Converters
class Markdown
class KramdownTB < KramdownParser
# Override only the convert method to use my converter.
def convert(content)
Kramdown::Document.new(content, @config).to_t_b_html
end
end
end
end
end
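To see what the ALL CAPS matching does in practice, here is a minimal standalone sketch that copies the regular expression from the listing above into a plain function, without kramdown or Jekyll. The sample strings are my own and only meant for illustration.

```ruby
# Standalone copy of the ALL CAPS regex from the plugin, for experimenting
# outside of Jekyll. The sample inputs below are illustrative assumptions.
def span_allcaps(text)
  text.gsub(%r{
    [\p{Upper}0-9]            # start with uppercase letter or digit
    (?:[\p{Upper}0-9.'’]|&)+  # continue with uppercase, digits, dot, apostrophe, or &
    (?!\w\w)                  # reject if two word characters follow
  }x) do |m|
    # Only wrap matches that contain at least two uppercase letters,
    # so plain numbers such as "42" stay untouched.
    if m.gsub(/\p{^Upper}/, "").length > 1
      %(<span class="allcaps">#{m}</span>)
    else
      m
    end
  end
end

puts span_allcaps("The NASA URLs")
# => The <span class="allcaps">NASA</span> <span class="allcaps">URL</span>s
puts span_allcaps("A 42 test")
# => A 42 test
```

The plural "URLs" is wrapped up to the "s" because the lookahead only rejects a match when two word characters follow it.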
Math support with KaTeX
This plugin pre-renders math with KaTeX (via execjs) while the site is built. It provides Liquid tags to render math, and it also overrides kramdown’s math renderer with my own rendering.
#
# Math with KaTeX - Liquid tag and kramdown
#
# Copyright (c) 2017 Tobias Brink
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
require 'execjs'
require 'digest'
require 'set'
# KaTeX renderer #######################################################
module TobiasBrink
class CustomKaTeXRenderer
def initialize(config)
katex_config = config['katex'] || {}
path = katex_config['path_to_js'] || "./public/js/katex.min.js"
# File.read closes the file handle again, unlike a bare open(path).read.
katexsrc = File.read(path)
@katex_renderer = ExecJS.compile(katexsrc)
end
def render(latex_source, displaystyle)
return @katex_renderer.call("katex.renderToString",
latex_source,
displayMode: displaystyle)
end
end
# Have a global instance of the renderer. Do not reinitialize to
# speed up incremental generation.
def self.katex_set(config)
if !defined? @@katex_renderer
@@katex_renderer = CustomKaTeXRenderer.new(config)
end
end
def self.katex_renderer
return @@katex_renderer
end
end
# The KaTeX renderer is a global object, since this simplifies
# handling of config and we do not need multiple instances. As the
# config file is read once in the beginning, we hook to that point and
# create the KaTeX renderer.
Jekyll::Hooks.register :site, :post_read do |site|
TobiasBrink.katex_set(site.config)
end
# Liquid tag ###########################################################
module Jekyll
module Tags
class KatexBlock < Liquid::Block
def initialize(tag, markup, tokens)
super
@tag = tag
@tokens = tokens
@markup = markup.strip
@displaystyle = false
@label = nil
# Parse.
state = :token
@markup.split(/\b/).each do |i|
if state == :token
if (i == 'block') \
|| (i == 'displaystyle') \
|| (i == 'centred') \
|| (i == 'centered')
@displaystyle = true
state = :whitespace
elsif i == 'label'
state = :label_eq
else
raise "unknown token '#{i}' in latex tag"
end
elsif state == :whitespace
(i =~ /\s+/) || (raise "expected whitespace, " \
+ "not '#{i}' in latex tag")
state = :token
elsif state == :label_eq
i.gsub!(/\s/, "")
(i == '="') || (raise "expected '=\"', " \
+ "not '#{i}' in latex tag")
state = :label
@label = ""
elsif state == :label
if i.gsub(/\s/, "") == '"'
# label finished
state = :whitespace
else
@label += i
end
end
end
if (state != :whitespace) && (state != :token)
raise "incomplete label in latex tag"
end
end
def render(context)
latex_source = super
rendered = TobiasBrink.katex_renderer.render(latex_source,
@displaystyle)
if @displaystyle
if @label
page = context.registers[:page]
# Ensure the data structures are there.
page["current_equation_number"] ||= 1
page["equation_number_map"] ||= {}
# Check if the label is unique.
if page["equation_number_map"].key? @label
raise "duplicate equation label '#{@label}'"
end
# Store reference.
num = page["current_equation_number"]
page["equation_number_map"][@label] = num
# Create an ID, making sure it is using only alphanumeric
# chars.
the_id = " id=\"equation-" \
+ Digest::SHA256.hexdigest(@label) \
+ '"'
# Create div.
eq_num = "<div class=\"math-equation-number\">(#{num})</div>"
# Increase equation number.
page["current_equation_number"] += 1
else
eq_num = ""
the_id = ""
end
return "<div class=\"math-block\"#{the_id}>" \
+ '<div class="math-block-flex">' \
+ '<div class="math-block-inner">' \
+ rendered \
+ '</div>' \
+ eq_num \
+ '</div>' \
+ '</div>'
else
return rendered
end
end
end
end
end
Liquid::Template.register_tag('latex', Jekyll::Tags::KatexBlock)
module Jekyll
module Tags
class EqRefTag < Liquid::Tag
def initialize(tag, markup, tokens)
super
@label = markup.gsub(/^["\s]*/, "").gsub(/["\s]*$/, "")
end
def render(context)
page = context.registers[:page]
# Calculate the id.
the_id = "equation-" + Digest::SHA256.hexdigest(@label)
# Try to find the referenced equation's number. If we cannot
# find it, maybe because it will be defined later in the
# document, insert a magic string that will automatically be
# replaced in a post_render hook if the referenced equation
# actually exists. In order to be able to do that, we also
# store the deferred labels in the context.
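# The map only exists once a labelled equation has been rendered,
# so guard against nil for references that come first.
num = (page["equation_number_map"] || {})[@label]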
if not num
# Ensure the data structure is there.
page["deferred_equation_labels"] ||= Set.new
# Store our thing.
page["deferred_equation_labels"].add(@label)
num = "<!-- jwDBwQ7VD #{@label} -->"
end
# Done.
return "<a href=\"\##{the_id}\">(#{num})</a>"
end
end
end
end
Liquid::Template.register_tag('eqref', Jekyll::Tags::EqRefTag)
# Fix up references that come before the equation they refer to.
Jekyll::Hooks.register [:pages, :posts], :post_render do |post|
# Magic string replacements.
if post.data["deferred_equation_labels"]
post.data["deferred_equation_labels"].each do |label|
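# Fall back to an empty map so that a missing label triggers the
# error message below instead of a NoMethodError on nil.
num = (post.data["equation_number_map"] || {})[label]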
num || (raise "unknown equation label '#{label}' in #{post.path}")
# Replace all the magic strings for the given label.
post.output.gsub!("<!-- jwDBwQ7VD #{label} -->", "#{num}")
end
end
end
# Kramdown support #####################################################
module Kramdown
module Converter
class TBHtml < Html
# Convert math using my KaTeX renderer instead of the
# standard renderers in kramdown. Mine has the advantage
# of executing the Javascript during build, keeping the
# actual website Javascript free.
#
# Kramdown API usage:
# * format_as_block_html:
# * format_as_span_html: as the name says
def convert_math(el, indent)
displaystyle = (el.options[:category] == :block)
rendered = TobiasBrink.katex_renderer.render(el.value,
displaystyle)
if displaystyle
# Create the nested structure we need for styling.
math_block_inner = \
format_as_block_html('div',
{"class" => "math-block-inner"},
rendered, indent)
math_block_flex = \
format_as_block_html('div',
{"class" => "math-block-flex"},
math_block_inner, indent)
# Add "math-block" class
if el.attr.key?("class")
el.attr["class"] += " math-block"
else
el.attr["class"] = "math-block"
end
return format_as_block_html('div', el.attr,
math_block_flex, indent)
else
return format_as_span_html('span', el.attr, rendered)
end
end
end
end
end
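The forward-reference handling in the eqref tag and the post_render hook can be hard to follow inside the Jekyll machinery. The sketch below isolates the idea in plain Ruby: a reference to a not-yet-known label emits a placeholder comment, and a second pass substitutes the number. The helper names eqref and resolve_deferred and the sample label are hypothetical; only the marker string is taken from the plugin.

```ruby
require 'set'

MARKER = "jwDBwQ7VD" # magic string, as used in the plugin above

# Return the reference text; remember the label if its number is unknown.
def eqref(label, numbers, deferred)
  num = numbers[label]
  unless num
    deferred.add(label)
    num = "<!-- #{MARKER} #{label} -->"
  end
  "(#{num})"
end

# Second pass: replace every placeholder with the final equation number.
def resolve_deferred(output, numbers, deferred)
  deferred.reduce(output) do |html, label|
    num = numbers.fetch(label) { raise "unknown equation label '#{label}'" }
    html.gsub("<!-- #{MARKER} #{label} -->", num.to_s)
  end
end

numbers = {}
deferred = Set.new
html = "See #{eqref('energy', numbers, deferred)}. "
numbers['energy'] = 1 # the equation is only defined after the first reference
html += "Again #{eqref('energy', numbers, deferred)}."

puts resolve_deferred(html, numbers, deferred)
# => See (1). Again (1).
```

Using an HTML comment as the placeholder means that an unresolved reference would at least stay invisible in the rendered page, although the plugin raises an error in that case anyway.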
Verbatim include
The files shown on this page are actually read from the source code in the _plugins directory. This can be done with this simple custom Liquid tag:
#
# Liquid tag to include a file verbatim.
#
# Use inside code tags and so on.
#
# Copyright (c) 2017 Tobias Brink
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
module Jekyll
module Tags
class VerbatimIncludeTag < Liquid::Tag
def initialize(tag, markup, tokens)
super
@fname = markup.gsub(/^["\s]*/, "").gsub(/["\s]*$/, "")
end
def render(context)
return File.read(@fname)
end
end
end
end
Liquid::Template.register_tag('include_verbatim',
Jekyll::Tags::VerbatimIncludeTag)
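The only processing the tag does on its argument is to strip surrounding quotes and whitespace. Here is a small sketch of just that step outside of Liquid; the helper name strip_quotes and the file name are hypothetical.

```ruby
# Same two substitutions as in VerbatimIncludeTag#initialize:
# drop leading and trailing double quotes and whitespace.
def strip_quotes(markup)
  markup.gsub(/^["\s]*/, "").gsub(/["\s]*$/, "")
end

puts strip_quotes(' "_plugins/example.rb" ')
# => _plugins/example.rb
puts strip_quotes("_plugins/example.rb ")
# => _plugins/example.rb
```

Both the quoted and the unquoted form therefore lead to the same file name. Since the tag passes that name to File.read unchanged, a relative path is resolved against the directory from which Jekyll is run.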
Post-processing HTML to handle table of contents
For CSS styling reasons, I need the table of contents outside of the div that contains the output of the Markdown converter. This is not possible within the framework provided by Jekyll. Instead, I use Nokogiri to find the table of contents and copy it to the correct place. The original table is hidden via CSS. To avoid duplicate IDs, I strip all IDs from the copy before inserting it. The insertion point is marked by a magic string.
#
# Move table of contents to the right parent element.
#
# Copyright (c) 2017 Tobias Brink
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
require 'nokogiri'
# A hook to copy the TOC to the footer.
#
# We do not use Nokogiri to manipulate the DOM directly because it
# introduces small errors and is slow. Instead, we just extract what
# we need and do basic string manipulation to insert the copy where we
# want. This is also more flexible since the layouts can specify where
# the TOC should go.
Jekyll::Hooks.register [:pages, :posts], :post_render do |post|
# Find all TOCs.
document = Nokogiri::HTML(post.output)
tocs = document.css(".post .post-content .page-toc")
# css returns a NodeSet, which is never nil, so test for emptiness.
next if tocs.empty? # no TOC, nothing to do.
# Into one string...
toc = tocs.to_a.map do |i|
# Remove IDs, to avoid duplicates.
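# The leading dot keeps the XPath relative to this TOC node; a bare
# '//@id' would select the IDs of the whole document.
i.xpath('.//@id').remove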
# To HTML.
i.to_html
end.join
# ...and into the document.
post.output.gsub!("<!--exl8hs TOC goes here exl8hs-->", toc)
end
Generating a sitemap from git data
Generate a sitemap.xml file with modification dates from git. See my blog post for details.
#
# Auto-generate a sitemap.xml from git data.
#
# Copyright (c) 2017,2018,2019,2021 Tobias Brink
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation files
# (the "Software"), to deal in the Software without restriction,
# including without limitation the rights to use, copy, modify, merge,
# publish, distribute, sublicense, and/or sell copies of the Software,
# and to permit persons to whom the Software is furnished to do so,
# subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
module Jekyll
class TBSitemap < Page
def initialize(site, base, dir, entries)
@site = site
@base = base
@dir = dir
@name = 'sitemap.xml'
self.process(@name)
self.read_yaml(File.join(base, '_layouts'), 'sitemap.xml')
self.data['entries'] = entries
end
end
class TBSitemapGenerator < Generator
priority :lowest # run this one last so it can catch all
# generated pages
def generate(site)
# Get the last change time for the site. We'll assume this
# command will never fail.
sitetime = Time.at(get_timestamp_global)
.utc
.strftime "%Y-%m-%dT%H:%M:%SZ"
# Assemble sitemap.
sitemap = []
# For now, we do not include HTML files in the static files list.
# If this is desired at some point, add `site.static_files` below.
[site.posts.docs, site.pages].each do |l|
l.each do |page|
# Skip all non-html pages.
next if !(page.url.end_with?("/") || page.url.end_with?(".html"))
# Skip the 404 page.
next if page.url == "/404.html"
# Try to find source file.
src_file = page.path
# Get time.
timestamp = get_timestamp(src_file)
if timestamp == 0
# There is no source file, use sitetime.
time_str = sitetime
else
time_str = Time.at(timestamp)
.utc
.strftime "%Y-%m-%dT%H:%M:%SZ"
end
# We got it.
sitemap.push( { "loc" => page.url,
"lastmod" => time_str } )
end
end
# Check size.
if sitemap.length >= 45000
puts "WARNING, sitemap has #{sitemap.length} entries, limit is 50,000"
end
# Create page.
sitemap_page = TBSitemap.new(site, site.source, "/", sitemap)
site.pages << sitemap_page
end
private
def exec_git(cmdline)
env = ENV.to_hash
# Remove all GIT_* env variables to ensure clean execution!
env.delete_if { |key, value| key.start_with?("GIT_") }
# Execute and return stdout
IO.popen(env, cmdline,
:unsetenv_others=>true) { |stdout| stdout.read }
end
def get_timestamp_global
exec_git(["git", "log", "-1", "--format=%ct"]).to_i
end
def get_timestamp(fname)
exec_git(["git", "log", "-1", "--format=%at", "--", fname]).to_i
end
end
end
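The timestamp handling in the generator relies on the fact that an empty git output converts to 0 with to_i, which is then treated as "no source file". Here is a minimal sketch of that logic without git or Jekyll; the helper name lastmod and the sample timestamp are assumptions for illustration.

```ruby
# Format a Unix timestamp as the sitemap's lastmod value. A timestamp of 0
# (what "".to_i yields when git prints nothing) falls back to the site time.
def lastmod(timestamp, sitetime)
  return sitetime if timestamp == 0
  Time.at(timestamp).utc.strftime("%Y-%m-%dT%H:%M:%SZ")
end

sitetime = lastmod(1_500_000_000, nil)
puts sitetime
# => 2017-07-14T02:40:00Z

# A file unknown to git: empty output, so the site time is used instead.
puts lastmod("".to_i, sitetime)
# => 2017-07-14T02:40:00Z
```

The format string produces a UTC timestamp in the W3C datetime form that sitemaps accept for lastmod.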