Typographic improvements for Jekyll/kramdown
My blog is built with Jekyll and I write my posts with kramdown. This works nicely, but I wanted to implement some typographic subtleties that need support in the converter that produces the HTML. This blog post describes how I implemented them using some small Jekyll plugins. All of these plugins can be found on the plugin page of the blog.
No hyphenation for URLs
The blog enables hyphenation via CSS (hyphens: auto;). This works
well, but certain content should not be hyphenated. URLs may contain
dashes, so any additional hyphenation leads to the display of invalid
URLs. While the actual link still works, the user will see the mangled
string. Of course, I cannot simply disable hyphenation for links
(<a> tags), since the link text may consist of normal words, such
as this one. Hyphenation should be disabled only if the link text is
an actual URL, like https://en.wikipedia.org/. For that I use a simple
heuristic: If the link text and the actual link are the same, disable
hyphenation. This is implemented by adding a class no-hyphenation to
these links and disabling hyphenation with CSS.
The Jekyll plugin (at the time of writing) subclasses the kramdown HTML converter to modify how kramdown produces links:
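require "kramdown"

class Kramdown::Converter::TBHtml < Kramdown::Converter::Html
  def convert_a(el, indent)
    # Render the link text so it can be compared with the link target.
    link_text = inner(el, indent)
    if el.attr["href"] == link_text
      # The visible text is the URL itself: mark the link as not hyphenatable.
      if el.attr.key?("class")
        el.attr["class"] += " no-hyphenation"
      else
        el.attr["class"] = "no-hyphenation"
      end
    end
    super(el, indent)
  end
end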
This simply adds a class no-hyphenation to the <a> element if the
link equals the link text. This class can then be styled using CSS. In
case the automatic detection fails, you can still add the class
manually. The CSS is very simple:
.post-content {
  hyphens: auto;
}
.post-content .no-hyphenation {
  hyphens: manual;
}
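The heuristic itself does not depend on kramdown, so it can be tried out in isolation. The following is a minimal sketch and not the plugin's code: mark_url_link is a hypothetical helper, and a plain Hash stands in for the attributes of a kramdown link element.

```ruby
# Illustration only: `mark_url_link` is a hypothetical helper, and a plain
# Hash stands in for the attributes of a kramdown link element.
def mark_url_link(attrs, link_text)
  # Only links whose visible text is the URL itself are marked.
  return attrs unless attrs["href"] == link_text

  # Append to an existing class list, or start a new one.
  classes = [attrs["class"], "no-hyphenation"].compact.join(" ")
  attrs.merge("class" => classes)
end

url = "https://en.wikipedia.org/"
p mark_url_link({ "href" => url }, url)
p mark_url_link({ "href" => url, "class" => "external" }, url)
p mark_url_link({ "href" => url }, "Wikipedia")
```

Unlike the plugin, this version returns a new Hash instead of modifying the attributes in place, which makes it easier to experiment with.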
The final missing piece is how to make Jekyll use this converter. First, we need to register a new Markdown dialect:
class Jekyll::Converters::Markdown
  class KramdownTB < KramdownParser
    def convert(content)
      Kramdown::Document.new(content, @config).to_t_b_html
    end
  end
end
This code subclasses the standard kramdown converter in order to keep
all its features intact. Finally, add something like the following to
_config.yml:
markdown: KramdownTB
I unimaginatively used my initials and named the converter
KramdownTB, but the name does not matter. If adapting this code, you
should take note of the magic in the above Ruby snippet: The name of
the converter class (TBHtml) is mangled into the method
to_t_b_html, i.e., every uppercase letter is converted to lowercase
and, except for the first one, preceded by an underscore.
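The relation between the names TBHtml and to_t_b_html can be written down in a few lines of plain Ruby. This is only a sketch of the naming convention, not kramdown's own implementation, and both helper names are made up for this example.

```ruby
# Illustration only: both helpers are hypothetical and merely mirror the
# naming convention between a converter class and its to_* method.

# "TBHtml" -> "to_t_b_html"
def converter_method_name(class_name)
  # Lowercase every capital and put an underscore in front of it ...
  snake = class_name.gsub(/\p{Upper}/) { |c| "_" + c.downcase }
  # ... then drop the underscore that ended up at the very beginning.
  "to_" + snake.sub(/\A_/, "")
end

# "to_t_b_html" -> "TBHtml"
def converter_class_name(method_name)
  method_name.sub(/\Ato_/, "").split("_").map(&:capitalize).join
end

puts converter_method_name("TBHtml")
puts converter_class_name("to_t_b_html")
```

If you pick different initials for your own converter class, a helper like this is a quick way to double-check which method name to call on the document.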
Adding space to ALL CAPS
If you use all caps text, you usually want to add some letterspacing. Why? Because it is easier to read and looks better: lammps vs. LAMMPS. The effect is subtle, but I think it improves legibility. While one should generally avoid all caps text, the appearance of acronyms is also improved. Therefore, I wanted this on my blog.
There is no 100% reliable detection of all caps, so I use heuristics that are good enough in most cases. The rest can be fixed manually. The plugin implementing this searches for the regular expression:
[\p{Upper}0-9](?:[\p{Upper}0-9.'’]|&)+(?!\w\w)
The first part, [\p{Upper}0-9], matches strings that start with a
number or an uppercase letter.
This must be followed by one or more uppercase letters, numbers, full
stops, apostrophes, or ampersands: (?:[\p{Upper}0-9.'’]|&)+.
Finally, the negative lookahead (?!\w\w) requires that the matched
string is not followed by more than one word character. This ensures
that plurals like “URLs” are matched, but not “XCharter”.
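A quick way to check these claims is to run the pattern over a few sample strings. The snippet below is only an experiment on plain, unescaped text. ALLCAPS_PATTERN and allcaps_candidates are names invented here, and the typographic apostrophe is written as \u2019 so that the snippet does not depend on the encoding of the source file.

```ruby
# Illustration only: the constant and the helper are names invented for this
# experiment. The pattern is the one quoted above, with the typographic
# apostrophe spelled as \u2019.
ALLCAPS_PATTERN = /[\p{Upper}0-9](?:[\p{Upper}0-9.'\u2019]|&)+(?!\w\w)/

# Returns every substring that the pattern picks out.
def allcaps_candidates(text)
  text.scan(ALLCAPS_PATTERN)
end

p allcaps_candidates("The URLs of LAMMPS")  # => ["URL", "LAMMPS"]
p allcaps_candidates("the XCharter font")   # => []
p allcaps_candidates("in 1969")             # => ["1969"]
```

The last line shows that the pattern on its own also picks up plain numbers, so a further check on the matched string is needed before wrapping it.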
The code looks somewhat like this:
def span_allcaps(text)
  text.gsub(%r{[\p{Upper}0-9]
               (?:[\p{Upper}0-9.'’]|&)+
               (?!\w\w)
              }x) do |m|
    if m.to_s.gsub(/\p{^Upper}/, "").length > 1
      "<span class=\"allcaps\">#{m}</span>"
    else
      m
    end
  end
end
There is an additional check to ensure that we have not matched a plain number: the match is only wrapped if it contains at least two uppercase letters. The matching CSS is simply:
.allcaps {
  letter-spacing: 0.05em;
}
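The additional check inside span_allcaps can also be looked at on its own. The sketch below counts uppercase letters in the same way the function does. uppercase_count is a hypothetical helper introduced only for illustration.

```ruby
# Illustration only: `uppercase_count` is a hypothetical helper that isolates
# the check used inside span_allcaps.
def uppercase_count(match)
  # Delete everything that is not an uppercase letter, then count the rest.
  match.gsub(/\p{^Upper}/, "").length
end

%w[LAMMPS U.S. A1 1969].each do |candidate|
  count = uppercase_count(candidate)
  # span_allcaps only wraps matches with more than one uppercase letter.
  puts "#{candidate}: #{count} uppercase, wrapped: #{count > 1}"
end
```

With this threshold, plain numbers such as 1969 and a single capital followed by a digit such as A1 are left untouched, while acronyms are wrapped.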
I found that the best place to employ this function is in the
convert_text method of the kramdown HTML converter. So let’s add it
to the TBHtml class from above:
class Kramdown::Converter::TBHtml < Kramdown::Converter::Html
  def convert_text(el, indent)
    span_allcaps(super)
  end
end
In my testing this seems to apply it in all the right cases.
Small caps
Sometimes, it is nice to have support for small caps in your
text. In theory, we can do this in Markdown by adding a class sc, as
in *small caps*{:.sc}, and styling it using CSS:
.sc {
  font-style: normal; /* disable italics */
  font-variant: small-caps;
  font-variant-numeric: oldstyle-nums; /* small caps look
                                          best together with oldstyle numbers */
}
For my blog, there is one problem though: The font that I use, Bitstream Charter, does not support small caps. One would have to buy a version that includes them, which is not only expensive but also comes with weird licenses. I wanted to stay with free software. So what happens when applying the above CSS? Ugliness: what is this? These fake small caps should be completely avoided.
Luckily, there is a little-known solution. Michael Sharpe created an extension to the Charter font for LaTeX and named it XCharter. This package also contains OpenType font files that can be converted for the web. Using this font for small caps yields much better results: in july 1969, man first landed on the moon. As you can see, the font also supports old style numbers.
Small improvements for great effect?
A large part of good typography is attention to detail. I am only an amateur, but I like how such simple steps can improve the quality of typesetting. Maybe I’m the only one who will ever notice these little details, but I believe the readability is improved either way.