Where I give less of a technical explanation and more of a slight rant on slowly moving away from big tech/proprietary solutions and modern software. There are some notes on the technical “how”, which are to be taken more as an inspiration than as a tutorial.
Like many other devs, I followed the fairly standard strategy
of using a static website generator (in my case it was Hugo) and hosting on GitHub Pages
via an action that triggered the build after a push to the
main branch. I will not get into details, as this is all
very well explained in both Hugo’s and GitHub’s docs.
There were several things I disliked about this, not least of which was the relatively indecent level of hidden/abstracted complexity (in terms of the number of tools and supporting architecture required, rather than from a user/dev-facing perspective) behind my very simple needs. Said needs were as follows:
I did not need:
There was essentially a profound mismatch between the power that Hugo offers (and thus its abstractions) and my absolutely basic needs (apart, maybe, from the mathematical part).
Add onto that the fact that I was also relying on GitHub Actions to generate a website I had already generated locally, from scratch, every time. This only adds to the “hidden complexity” I was hinting at earlier. Realistically, it is deeply ridiculous for a blog of this size to rely on a workflow as sophisticated as GitHub’s continuous integration. I need to make a few html pages and push them onto a server, I don’t really need to be precious about versioning…
This is a topic that could be a post (or several) on its own. There are essentially three aspects to it:
The first step for me was to find an alternative to Hugo. It handled a few things for me that required finding alternatives:
I considered writing my own solution to this for a moment. I did not think that parsing minimal markdown and generating the appropriate html would be very hard considering how little I needed. I thought of leaving the maths untouched, using KaTeX, and being done with it.
Then, I was made aware that KaTeX, MathJax, and (obviously) generated images of maths formulas make them inaccessible to screen readers, and that a much better alternative was MathML. The idea of having a website free of JavaScript, and where maths was accessible and semantically embedded in the source was particularly enticing. Writing in MathML directly or writing a tool to convert LaTeX to MathML? Much less so.
Enter the one real dependency of my website: pandoc!
Want to generate an output.html from an
input.md file, turning your ugly LaTeX into horrifying
MathML?
pandoc -f markdown -t html --mathml input.md -o output.html
I also needed syntax highlighting, a table of contents, and my own template.html file. And that’s achieved just as easily as:
pandoc -f markdown -t html \
--mathml \
--highlight-style=zenburn \
--toc \
--template=template.html \
input.md -o output.html
Remark: I have to admit relying on something as big as pandoc (or a LaTeX compiler) is pretty annoying, especially given part of my reasons for migrating. I don’t see a better solution to both generate html from markdown at a minimal cost and “compile” maths snippets to html. If you know of one, do let me know.
Now obviously you don’t want to have to do that for each and every post individually (although why not, if you write as little as me?). I wanted to automate the website generation fully, so I did it as follows.
First, I structured my “source” like so:
sources/
|-- content/
| |-- blog/
| | |-- post1/
| | | |-- asset1.png
| | | \-- index.md
| | |-- post2/
| | | |-- asset2.png
| | | \-- index.md
| | \-- index.md
| \-- index.md
\-- template.html
so as to easily produce a static website that mirrors this structure as:
public/
|-- blog/
| |-- post1/
| | |-- asset1.png
| | \-- index.html
| |-- post2/
| | |-- asset2.png
| | \-- index.html
| \-- index.html
|-- index.html
\-- template.html
This can easily be achieved with any scripting language, copying over or syncing the assets and using pandoc to generate the html pages where needed. You can also check against the last-written times to handle incremental builds.
I used make with a Makefile to do that, but
again, that’s pretty overkill. I will probably just take the time
to write a small bash script instead at some point.
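A minimal sketch of what such a script could look like follows. The function names (`out_path`, `needs_rebuild`, `build_site`) are made up for illustration, and it assumes it is run from inside sources/ with pandoc on the PATH. The pandoc flags are the ones from the command above. It does not track changes to the template, so editing template.html would still require a full rebuild.

```shell
#!/bin/sh
# Hypothetical build script: mirrors a content tree into an output tree.

# Map a source path to its output path: .md files become .html,
# everything else keeps its name.
out_path() {
    src_root=$1; out_root=$2; src=$3
    rel=${src#"$src_root"/}
    case $rel in
        *.md) printf '%s\n' "$out_root/${rel%.md}.html" ;;
        *)    printf '%s\n' "$out_root/$rel" ;;
    esac
}

# Succeed when the output ($2) is missing or older than its source ($1).
# Note: -nt is not strictly POSIX, but bash, dash and busybox support it.
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Walk the source tree, converting markdown and copying assets,
# skipping anything that is already up to date.
build_site() {
    src_root=$1; out_root=$2; template=$3
    find "$src_root" -type f | while IFS= read -r src; do
        out=$(out_path "$src_root" "$out_root" "$src")
        needs_rebuild "$src" "$out" || continue
        mkdir -p "$(dirname "$out")"
        case $src in
            *.md) pandoc -f markdown -t html --mathml --toc \
                      --highlight-style=zenburn \
                      --template="$template" "$src" -o "$out" ;;
            *)    cp "$src" "$out" ;;
        esac
    done
}
```

With these definitions sourced, something like `build_site content public template.html` would regenerate only the pages and assets whose sources changed since the last run.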
There are many ways this could be achieved. One thing I thought about was just locally changing my editor’s save shortcut to also call make after saving, but I decided against it. I don’t want my editor to be responsible for this, especially if I want to write and save without rebuilding every time.
A good Linux utility for that is entr. You can
pipe your .md files’ paths (through the use of ls or
find for example) to entr and have it
call make so your website is rebuilt every time a save occurs. This
could also be the place to force a web-browser refresh if you are
also checking the output on localhost.
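As a rough illustration, the watch loop could be sketched as below. The helper names (`watched_files`, `watch_and_build`) and the default content/ path are assumptions. The loop around entr follows the pattern suggested in its man page: the -d flag makes entr exit when a new file shows up in a watched directory, so the file list gets refreshed.

```shell
#!/bin/sh
# Hypothetical helpers for rebuilding on save.

# List the markdown sources to watch (defaults to content/).
watched_files() {
    find "${1:-content}" -type f -name '*.md'
}

# Feed the list to entr, which runs make whenever one of the files changes.
# When entr exits (for instance because a new post was added), the loop
# restarts it with a fresh file list.
watch_and_build() {
    while sleep 1; do
        watched_files "$1" | entr -d make
    done
}
```

Running `watch_and_build content` in a spare terminal keeps the editor out of the loop: saving a file triggers the build, and closing the terminal stops it.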
public website on localhost

There is not much to it there: Python saves the day once more. Just:
python -m http.server
and open your favourite browser on
http://localhost:8000.
Again, maybe I should look into doing it from scratch just to learn a little bit about how this is achieved, but this is pretty low on my list of priorities.
The last thing to do was to get off GitHub and onto Codeberg
and, additionally, not rely on Actions and just push the
public/ directory to be served on Codeberg Pages.
There are several valid approaches but I decided to keep the
generation side as its own private repo and to have a public pages
repo which only contains a copy of the public/
directory (plus the .git/ files). So the last piece
of the puzzle was just to write a small script in
sources/ that would copy over the changes in its
local public/ directory to the served one, and that
would cd into it to commit and push automatically,
with a prompted commit message.
It’s overall pretty primitive, but it works, saves me a lot of typing, and keeps the generation on my end rather than offloading it god knows where at god knows what cost.
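To give an idea, a script along those lines could be sketched as follows. This is not the actual script: the function names (`sync_public`, `publish`) and the example path of the pages clone are hypothetical, and it assumes the pages clone already exists with its remote configured.

```shell
#!/bin/sh
# Hypothetical publish script, meant to live in sources/.

# Mirror the contents of src into dest while leaving dest/.git untouched.
sync_public() {
    src=$1; dest=$2
    # Remove everything except .git first, so deleted pages disappear too.
    find "$dest" -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
    # The trailing /. copies the directory's contents, hidden files included.
    cp -R "$src"/. "$dest"/
}

# Sync, then commit and push from inside the pages clone,
# asking for the commit message on the terminal.
publish() {
    sync_public "$1" "$2" || return 1
    printf 'Commit message: '
    IFS= read -r message
    (
        cd "$2" || exit 1
        git add -A &&
        git commit -m "$message" &&
        git push
    )
}
```

Something like `publish public ../pages` would then be the whole deployment step. The subshell around `cd` keeps the calling shell in sources/ afterwards.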
A few quality-of-life features are missing. Most notably,
it would be comfortable to have the blog/ index page
be updated automatically whenever a new entry is added. There are
definitely ways to do that simply by leveraging the
YAML headers in the .md source files so as to pass
info such as date, title, abstract etc… and then use a Lua filter
in conjunction with pandoc. It’s something I will have to do at
some point, especially if the post count grows large enough that
it necessitates separating the blog index into several pages.
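As a stopgap, the same information can be pulled out with plain shell instead of a Lua filter. The sketch below is a different technique from the one mentioned above, and it makes several assumptions: each post lives in its own directory with an index.md, the header starts on the first line with `---`, and it holds simple unquoted one-line `title:` and `date:` keys with ISO dates. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical index generator based on the YAML headers of each post.

# Print the value of a simple one-line key from a file's YAML header.
front_matter_value() {
    key=$1; file=$2
    awk -v key="$key" '
        NR == 1 && $0 != "---" { exit }                  # no header at all
        NR > 1 && ($0 == "---" || $0 == "...") { exit }  # end of header
        NR > 1 {
            prefix = key ":"
            if (index($0, prefix) == 1) {
                value = substr($0, length(prefix) + 1)
                sub(/^[ \t]+/, "", value)
                sub(/[ \t]+$/, "", value)
                print value
                exit
            }
        }
    ' "$file"
}

# Print a markdown list of posts, newest first (ISO dates sort as text).
blog_index() {
    blog_dir=$1
    tab=$(printf '\t')
    for post in "$blog_dir"/*/index.md; do
        [ -f "$post" ] || continue
        slug=$(basename "$(dirname "$post")")
        title=$(front_matter_value title "$post")
        date=$(front_matter_value date "$post")
        printf '%s\t%s\t%s\n' "$date" "$title" "$slug"
    done | sort -r | while IFS="$tab" read -r date title slug; do
        printf '%s\n' "- $date: [$title]($slug/)"
    done
}
```

The output of `blog_index content/blog` is a markdown list that could be appended to the blog index source before it goes through pandoc. Posts missing either key would come out mangled, which is one reason the Lua filter route remains the more robust option.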
Another loss from my method is that with my current setup, it
would be pretty awkward to try and update the blog if I didn’t
have local clones of both the source and
pages repos. Say I spot a typo and want to fix it quickly
straight from the Codeberg interface. Well now I can’t generate
the html and I have to also edit it on pages so as to
keep things in sync, or just accept that I need to push as soon as
I get back on my machine. This can also be addressed but would
defeat the purpose of not relying on remote tooling more than is
necessary. Realistically, I only write on my machines anyways.
In the same vein, my solution only really works for
me. A command such as make is ubiquitous on
Linux boxes, so it makes sense to use it. A Windows user trying
the same method may, rightly, find that installing a Linux
subsystem or MSYS2 or what-have-you just to build a blog is peak
insanity.
Lastly, although I like the accessibility and lightness of MathML, its rendering is really not the best. I will have to explore options that remain JS-free, whilst still having a more visually pleasing maths display. This probably means generating .png or .svg of the maths and linking to it whilst keeping the MathML as a fallback or description for screen readers. This is at the cost of more memory, which is probably not a load I am willing to put on Codeberg. Maybe if I end up self-hosting…
This migration was just part of me wanting to overall be more involved in the FOSS space. I don’t really have the time or the inclination to produce good code outside of work, so I have been pretty shy about showing it online, or contributing to actually useful projects. That being said, I can also see the pedagogical value of having some such snippets on a platform that advocates for freedom and openness.
I will be migrating my scrappy code from my private GitHub repos, making it public on Codeberg, and trying to give it some much-needed TLC. I will also try and contribute more to this blog. At a time when the answer always seems to be to throw compute at a problem, and use more and more abstract machinery, I feel it’s valuable to be reminded that sometimes, more “primitive” or lower-level solutions are not only more efficient but also much simpler to comprehend.