How We Built Our Blog
Yesterday, we announced the redesign of our blog and the addition of our engineering channel. This is our first engineering post: a walkthrough of what we built. And what better way to kick off an engineering blog than to blog about rebuilding the blog itself? How meta.
It started with an engineering blog
A few months ago, I took up a de facto role heading up developer evangelism efforts at Stack Exchange. I say that with a caveat: we treat developer evangelism at Stack very differently than most other companies do. What we don't want is a team of people who travel around, speak at events, try to sell something, and code occasionally – that didn't make sense to us. Instead, we have two goals. First, we want to highlight the amazing public outreach work members of our engineering team are already doing. Most if not all of our developers are active members not only of Stack Overflow, but of the larger technical community: writing blog posts, doing open source work, and speaking at conferences. Second, we want to make the philosophy that has made Stack Overflow so successful for the developer community more widely known. Who better to do that than the developers who helped shape that community?
So the natural first solution to address these goals was to blog about the work we're doing. We were inspired by great technical blogs like Code as Craft and OkTrends, and the idea of a similar engineering blog had been thrown around. However, there were reservations about creating a completely separate blog: why fragment our readership even further? We already had the official Stack Exchange company blog, the Server Fault blog for our SRE team, and the many personal blogs of our individual developers. There were so many different avenues to publish our work that we couldn't figure out where this content should live. Creating new kinds of content in yet another separate blog would fragment our audience even further; shoehorning it into one of the existing blogs would make it feel out of place. What we really needed was a single destination that could accommodate many different kinds of content, instead of multiple destinations that each specialized in just one kind.
Revisiting the Stack Exchange blog
After sending out the original proposal to the company, I quickly realized I might have stepped on a landmine of a project. The blog ecosystem was something we had wanted to address for a very long time, which meant pretty much every part of the company would be affected by it and had strong opinions about it. After considering all the comments I received, we came to a general conclusion: the ideal solution would be to take our most popular blog, the official Stack Exchange blog, and use it to house the new content we wanted – including the engineering posts. That turned into a much larger project that took six to eight weeks. A few key parts made this solution work:
Channels
In the previous blog, we had all of our posts in a single column organized by tag. That meant everything we posted went out to everyone reading our blog. The concern was that if we just started publishing very technical posts into that single stream, they would be relevant to the technical subset of our audience, but not to everyone. On the flip side, developers who came to our blog for technical content wouldn't necessarily want to read everything else we're doing – they're here for the engineering stuff.
What we came up with were "channels" – high-level categories that would let us separate the major kinds of posts we'd be publishing. There are two main ones: company news and engineering. The company news channel would house all of the familiar content like podcasts and company announcements, and would also let us add new kinds of posts, like those about internal culture and diversity efforts. The engineering channel effectively became our solution for an engineering blog, hosting all of our technical walkthroughs, write-ups of evangelism efforts, and the technical opinion pieces we originally wanted.
Reposts
Another key part of this solution was the ability to repost. Many of the most popular posts related to Stack Exchange – especially the technical ones – were decentralized, housed on our many developers' personal blogs. There was a good reason for this: we believe our developers should get public credit for the things they build internally, and one of the best ways to claim that credit is to write about it on a personal blog. Once we created an engineering channel, we didn't want our developers to feel like they had to choose between posting on the company blog and posting on their own. That seemed like the wrong move.
Instead, we came up with a different approach: post on your personal blog about your work just like you always have, and if you want to repost on the company blog for more exposure, we will do so with a prominent link to the original source. This does two things: it gives our technical audience an inside look into how we build things, and it keeps the credit with the people who build them.
A new blogging engine
During the original proposal stage for the engineering blog, we also had a conversation about which engine we would use. At the time, all of our blogs were running WordPress, which we weren't so happy about. It was buggy, difficult to log in to, not very performant, and had caused our SRE team more than a few headaches. If we were really going to revamp the company blog, it seemed like a lot of work to keep wrestling with our WordPress installation.
In comes Jekyll
Because we weren't happy with WordPress, we started looking at other engines. We looked at static site generation in Go and even considered building our own engine, but what we decided to do was switch to Jekyll. It's an open source static site generator built in Ruby that's great for building blogs. A lot of developers I knew were switching to Jekyll for their personal sites, and it was something we had played around with on some design projects. Switching to Jekyll had a number of advantages:
- Posts are in Markdown, something most of our company was familiar with
- Jekyll is just static site generation, so it’s much more performant
- Complete flexibility for front end work, no need to wrestle with templates
- Open source with a strong community, which we love
- Not WordPress or PHP
Jekyll seemed to fit the bill for what we wanted, and I started building a prototype. One of the other nice things about Jekyll is that it was supported by GitHub Pages, so I could actually share my work to the rest of the company without creating new builds. That leads me to my next point…
Open Source
We have a policy of being “default public” at Stack Exchange. That means we try and work in the open when we can, and that includes our technical work. Because the blog prototype I was building was already on GitHub, it made sense to just keep it publicly available. It was different from the work I was used to – having code available to those outside of our internal team changed the way I approached building the blog:
- Our blog was going to be used by people who are not very technical. Having a GUI for our community and marketing teams to create posts and preview Markdown without learning Git was great.
- It meant we left the door open for people outside of the company to make changes, fix bugs, and possibly even contribute posts in the future.
- There are few examples of major migration efforts to Jekyll out there, and ours could be one of them.
- People could take the work we did and use it for themselves. There are insights in the migration that we want to give back to the community.
- If there are bugs you can fix them too!
Building the blog
Alright – enough about the ideation process. Now to the meaty technical parts. There was a lot of development work that went into this, so I’m only going to highlight some of the major pieces and insights we had building the blog:
Infrastructure
Starting up a Jekyll project is pretty straightforward. It’s ridiculously easy to install and with just a few naming conventions in the file structure, Jekyll takes care of the static site generation. Easy as it is, I decided to fork an existing bootstrapping repository called Jekyll Now made by a friend of mine (and VP of Engineering at Trello), Barry Clark.
Out of the box, most of what we needed was there: a posting engine with Markdown and code highlight support, JavaScript and CSS flexibility, and it was fast. Because all the hard work was being done up front, we were just serving static files and that was a huge performance boost over WordPress. In just a few minutes, I had a basic working blog that was being hosted on GitHub Pages. We were already well on our way to what we had originally set out to do.
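To give a sense of how little machinery Jekyll actually needs: each post is just a text file that opens with a small YAML "front matter" block, and the generator splits that metadata from the Markdown body before rendering. Here's a toy sketch of that split – the sample post and the helper function are invented for illustration, not Jekyll's actual code:

```python
# A Jekyll-style post: YAML front matter between "---" fences, then Markdown.
post = """---
title: How We Built Our Blog
author: jonhmchan
tags: [engineering]
---
Yesterday, we announced the redesign of our blog...
"""

def split_front_matter(text):
    # Split off the metadata block between the first two "---" fences
    _, front, body = text.split("---\n", 2)
    meta = {}
    for line in front.strip().split("\n"):
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, body

meta, body = split_front_matter(post)
print(meta["title"])  # How We Built Our Blog
```

Everything the templates need – title, author, tags – comes from that metadata dictionary, and the body is handed to the Markdown renderer.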
Importing the old content
Moving all of our previous blog posts to the new solution was a hard requirement, and that meant taking all the content on our WordPress instance and converting it into Markdown files. This was not easy: we had over 700 blog posts over the history of the company, each with comments, static assets, and deep links that had to be preserved. The process seemed straightforward, if riddled with edge cases that needed to be resolved: WordPress has an export function that spits out a giant XML file (a whopping 30MB of text), and we had to use that file to convert the posts and pull down their assets.
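For a feel of what that export contains: a WXR file is essentially RSS with WordPress extensions, where each post is an <item> element carrying its title, slug, and HTML body. Here's a sketch of pulling posts out with Python's standard library – the sample item is invented, though the element names follow the WXR format:

```python
import xml.etree.ElementTree as ET

# A tiny, invented slice of a WordPress WXR export
wxr = """<?xml version="1.0"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Welcome to the Blog</title>
      <wp:post_name>welcome-to-the-blog</wp:post_name>
      <content:encoded><![CDATA[<p>Hello, world!</p>]]></content:encoded>
    </item>
  </channel>
</rss>
"""

# Map the namespace prefixes used in the export to their full URIs
ns = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wp": "http://wordpress.org/export/1.2/",
}

root = ET.fromstring(wxr)
posts = []
for item in root.iter("item"):
    posts.append({
        "title": item.findtext("title"),
        "slug": item.findtext("wp:post_name", namespaces=ns),
        "body": item.findtext("content:encoded", namespaces=ns),
    })

print(posts[0]["slug"])  # welcome-to-the-blog
```

A converter walks every <item> like this and writes one Markdown file per post – the edge cases come from everything hiding inside that HTML body.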
Thankfully, there are a few libraries that make this migration easier. I tried a few of the import methods recommended in the Jekyll documentation, but none of them seemed to do it cleanly. After a few more attempts, I finally stumbled upon exitwp, and that seemed to work pretty well. You can see the commit here – it was over 5,000 file changes in the initial import.
For the most part, this worked pretty well. I was surprised by how much was preserved and the fact that Jekyll supports arbitrary HTML in the Markdown files helped a lot. There were a few things that were lost in the process though: some images with captions were misformatted, embeds were lost for many of our podcasts, and the references to uploaded files in WordPress broke images across the board.
This is when I had to really intervene programmatically. I started writing Python scripts to go through every post and repair the damage. Here's the high-level overview:
- For HTML errors, much of the repair work was manual. We took the 50 highest-trafficked posts and the 100 most recent posts and went through them by hand to make sure everything was formatted well. This was done with a lot of help from our marketing team.
- I wrote a Python script to go through all the imported posts, look for references to WordPress-hosted content, download all the static assets into images/wordpress in the repo, and change all the links to relative paths on Jekyll. It worked surprisingly well.
- We used SoundCloud embeds for all of our podcasts, and the vast majority of these were broken. I wrote a Python script that used a regular expression to find the relevant URL for each embed and reinserted the code necessary to restore the player. You can see the code for this below; it's very similar to the script I wrote for the image import.
{% highlight python %}
import os, re, requests

rootdir = '_posts'
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        filename = os.path.join(subdir, file)
        f = open(filename, "r")
        contents = f.readlines()
        f.close()
        # Reconstruct the original WordPress URL from the post filename
        # (_posts/YYYY-MM-DD-title.markdown -> /YYYY/MM/title)
        slug = filename.replace("_posts/", "").replace(".markdown", "")
        splits = slug.split("-")
        year = splits[0]
        month = splits[1]
        end = "-".join(splits[3:])
        link = "/" + "/".join([year, month, end])
        wordpress_url = "http://blog.stackoverflow.com" + link
        if re.search('podcast', wordpress_url):
            print wordpress_url
            response = requests.get(wordpress_url)
            if response:
                # Find the SoundCloud embed on the old WordPress page and
                # append it to the imported Markdown file
                for line in response.content.split("\n"):
                    if re.search('<iframe|<object', line) and re.search("soundcloud", line):
                        contents.append('\n' + line)
                        f = open(filename, "w")
                        f.write("".join(contents))
                        f.close()
                        break
{% endhighlight %}
Feeds and links
Another hard requirement was making sure that we didn't break deep linking. The posts on our blog are referenced a lot across the Stack Exchange network, as well as on the wider Internet. Breaking our URL scheme for posts would have been absolutely disastrous. In addition, we needed to preserve the XML feed on our blog: there are integrations for the community bulletin, it's a dependency for parts of the Stack Exchange sites themselves, and thousands of people rely on the feed to get news from us. Preserving the links to our content, and at least our main feed, was extremely important.
Again, Jekyll thankfully had a feature to customize post URL structures. This was simply a setting in our _config.yml file, and it took a single line:
{% highlight yaml %}
permalink: /:year/:month/:title/
{% endhighlight %}
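In effect, that setting makes each post file's date and title the source of its URL, which is what let our old WordPress links keep working. A quick sketch of the mapping that /:year/:month/:title/ implies – the helper function here is mine for illustration, not Jekyll's:

```python
def permalink(filename):
    # _posts/YYYY-MM-DD-title.markdown -> /YYYY/MM/title/
    # (the day is dropped, matching the /:year/:month/:title/ setting)
    slug = filename.replace("_posts/", "").replace(".markdown", "")
    year, month, _day, title = slug.split("-", 3)
    return "/{0}/{1}/{2}/".format(year, month, title)

print(permalink("_posts/2015-06-10-how-we-built-our-blog.markdown"))
# /2015/06/how-we-built-our-blog/
```

As long as the imported files were named with the original dates and slugs, every old URL resolved to the same post.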
The feed was a little more complicated. Looking at the XML feed we had before, we had to make sure all the necessary data was preserved, but there wasn't any native functionality in Jekyll that could meet our needs. There was a workaround in sight, though: Jekyll's templating engine isn't just good for parsing Markdown into HTML; it can generate arbitrary static files that are exposed on the site. That meant I could create something like /feed/index.xml and use the templating language to generate the feed. Here's what that looks like:
{% highlight xml %}
{% raw %}
---
layout: null
---
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:wfw="http://wellformedweb.org/CommentAPI/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
  xmlns:media="http://search.yahoo.com/mrss/"
…
{% for post in site.posts limit:40 %}
{% unless post.draft %}
{% for author in site.authors %}
{% if author.id == post.author %}
{% endif %}
{% endfor %}
{% for category in post.tags %}
{% endfor %}
{% endunless %}
{% endfor %}
{% endraw %}
{% endhighlight %}
Pagination
This was actually one of the most fascinating technical challenges I ran into. As I mentioned before, we have over 700 posts on our blog, and that number was only going to grow, probably at an even faster pace. It was simply unreasonable to leave out pagination support. Sadly, Jekyll has only limited support for pagination. It can paginate all posts reasonably well, but we needed finer-grained pagination that would also filter by author and tag. On top of the lack of customization, using the native capability really slowed down the build: Jekyll was walking every post, every tag, and every author, trying to generate HTML for every permutation, so generation took O(M×N×O) time. Our build time for all posts went from about 5 seconds to several minutes. There were simply no good solutions for this, either natively or in libraries other people had written.
So I ended up trying something completely different. Instead of relying on Jekyll to statically generate each page up front, I decided to generate a single JSON file with all the posts and their metadata, then lazy load that JSON via AJAX and filter by tag and author on the client. It meant only one file had to be generated instead of every permutation, JavaScript gave us more programmatic flexibility for filtering, and we could inject pagination into any page for any set of posts we wanted.
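Stripped of the DOM work, the idea is simple: load all post metadata once, filter it by channel tag or author, and slice five posts per page. Here's a sketch of that logic in Python – the sample posts are invented, and the real implementation below does the equivalent in JavaScript:

```python
import math

# Invented stand-in for the generated json/index.json payload
posts = [{"title": "Post %d" % i,
          "tags": ["engineering"] if i % 2 == 0 else ["company"],
          "author": "jon"} for i in range(12)]

PER_PAGE = 5

def filter_posts(posts, channel=None, author=None):
    # Mirror the client-side filter: keep posts matching channel tag and author
    result = []
    for post in posts:
        if channel and channel not in post["tags"]:
            continue
        if author and post["author"] != author:
            continue
        result.append(post)
    return result

def page_of(posts, page_num):
    # One page of five posts; pages are 1-indexed like the #page URL hash
    total = max(int(math.ceil(len(posts) / float(PER_PAGE))), 1)
    page_num = min(max(page_num, 1), total)
    start = (page_num - 1) * PER_PAGE
    return posts[start:start + PER_PAGE], total

engineering = filter_posts(posts, channel="engineering")
page, total = page_of(engineering, 1)
print(len(engineering), total)  # 6 2
```

Because the filtering and slicing happen after the single JSON fetch, adding a new filtered view costs nothing at build time.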
Here's the json/index.json file that is generated by Jekyll and used to populate the overview pages on each channel:
{% highlight json %}
{% raw %}
---
layout: null
---
{
  "posts": [
    {% for post in site.posts %}
    {
      "title": "{{ post.title | raw | escape }}",
      "url": "{{ site.baseurl }}{{ post.url }}",
      "hero": "{{ post.hero }}",
      "tags": [{% for tag in post.tags %}"{{ tag }}"{% unless forloop.last %},{% endunless %}{% endfor %}],
      "categories": [{% for category in post.categories %}"{{ category }}"{% unless forloop.last %},{% endunless %}{% endfor %}],
      "date": "{{ post.date | date: "%B %e, %Y" }}",
      "author": "{{ post.author }}",
      "draft": {% if post.draft or post.date > site.time %}true{% else %}false{% endif %},
      "content": "{% if post.description %}{{ post.description | strip_html | strip_newlines }}{% else %}{% if post.excerpt %}{{ post.excerpt | markdownify | strip_html | strip_newlines }}{% else %}{{ site.description }}{% endif %}{% endif %}"
    {% if forloop.last %}}{% else %}},{% endif %}{% endfor %}
  ],
  "authors": {
    {% for author in site.authors %}
    "{{ author.id }}": {
      "name": "{{ author.name }}",
      "avatar": "{{ author.avatar }}",
      "twitter": "{{ author.twitter }}",
      "url": "{{ site.baseurl }}/authors/{{ author.id }}",
      "job": "{{ author.job }}"
    {% if forloop.last %}}{% else %}},{% endif %}{% endfor %}
  }
}
}
{% endraw %}
{% endhighlight %}
And here's the js/index.js file included on every page; it loads that JSON, filters the right posts, and injects the pagination controls.
{% highlight javascript %}
{% raw %}
$(document).ready(function() {
var page = 1;
var data;
if (typeof pagination !== 'undefined') {
if (pagination === true) {
$(".pagination").css("display", "block");
}
}
if (typeof top_active !== 'undefined') {
$("nav a." + top_active).addClass("active");
}
if (typeof channel !== 'undefined') {
if (channel != 'company' && channel != 'engineering') {
$(".subheader a.category").removeClass("active");
$(".subheader a.category#" + channel).addClass("active");
}
}
if ($("div.pagination").length > 0) {
check_page();
} else {
$(".posts").css("visibility", "visible");
}
...
function check_page() {
var href = window.location.hash;
var matched = href.match('^#page');
if (matched) {
var page_num = href.split("#page")[1];
if (page_num = parseInt(page_num)) {
page = page_num;
}
}
$.getJSON(prefix + "/json/index.json", function(response) {
if (response) {
var result = [];
for (key in response.posts) {
var post = response.posts[key];
if (post.draft) continue;
if (typeof channel !== 'undefined') {
if (post.tags.indexOf(channel) == - 1) {
continue;
}
}
if (typeof top_active !== 'undefined' && (top_active == "engineering" || top_active == "company")) {
if (post.tags.indexOf(top_active) == - 1) {
continue;
}
}
if (typeof author_id !== 'undefined') {
if (post.author != author_id) {
continue;
}
}
result.push(post);
}
response.posts = result;
data = response;
set_page(page);
// Set total pages
$("span.page.total_pages").html(Math.max(Math.ceil(data.posts.length / 5), 1));
// Bind page clicks
$("a.page").click(function() {
var selected_page = ($(this).hasClass("older") ? page + 1 : page - 1);
if (valid_page(selected_page)) {
set_page(selected_page);
}
});
$(window).hashchange(function() {
var selected_page = parseInt(window.location.hash.substr(5));
if (valid_page(selected_page)) {
set_page(selected_page);
}
})
}
});
}
function set_page(page_num) {
if (data == null) return;
var max = Math.ceil(data.posts.length / 5);
if (!valid_page(page_num)) {
if (page_num > max) page_num = max;
}
page = page_num;
if (typeof pagination !== 'undefined' && pagination === true) {
window.location.hash = "#page" + page_num.toString();
}
$(window).scrollTop(0);
var next = page_num + 1;
var previous = page_num - 1;
$("span.page.current_page").html(page_num);
$("a.page.older").css("visibility", page_num >= max ? "hidden" : "visible");
$("a.page.newer").css("visibility", page_num <= 1 ? "hidden" : "visible");
var offset = (page_num - 1) * 5;
var authors_posted = [];
for (var i = 0; i < 5; i++) {
var post = data.posts[offset + i];
var article = $("article.post:eq(" + i + ")");
article.css("border-bottom", i == data.posts.length - 1 ? "none" : null);
if (i >= data.posts.length || typeof post == 'undefined') {
article.css("display", "none");
continue;
}
var author = data.authors[post.author];
article.find(".title").html(post.title).attr("href", post.url);
article.find(".avatar").attr("src", author.avatar);
var post_info = [];
post_info.push(post.date);
post_info.push('By ' + author.name);
post_info.push('In ' + post.tags.join(", "))
article.find(".post-info").html(post_info.join(" • "));
article.find("img.hero").attr("src", post.hero);
article.find("a.hero_url").attr("href", post.url);
article.find(".excerpt").html(post.content);
article.find("a.read-more").attr("href", post.url);
article.css("display", "block");
var podcast = (post.tags.indexOf('podcasts') > -1);
article.find(".podcast-container").css("display", podcast ? "block" : "none");
var no_background = (podcast || post.hero.length == 0);
article.find(".hero-container").css("display", no_background ? "none" : "block");
var auth = $("#authors-container .author-container:eq(" + i + ")");
if (auth && authors_posted.indexOf(author.twitter) == -1) {
auth.find(".avatar-link").attr("href", author.url);
auth.find(".avatar").attr("src", author.avatar);
auth.find(".name-link").attr("href", author.url).html(author.name);
auth.find("p.job").html(author.job);
authors_posted.push(author.twitter);
auth.css("display", "block");
} else {
auth.css("display", "none");
}
}
$(".posts").css("visibility", "visible");
}
function valid_page(page_num) {
return (page_num > 0 && page_num <= Math.max(Math.ceil(data.posts.length / 5), 1))
}
});
{% endraw %}
{% endhighlight %}
Comments
The last thing I want to go over is comments. Comments are a really important part of our blog – without them, we'd be missing a huge amount of context and losing valuable content from our community. It was critical that we keep the capability and also migrate all the previous comments over. There was really only one common solution we found that was compatible: Disqus.
Now, I don't have many issues with Disqus as a product – I've seen it work really well on other platforms – but it meant we had to sacrifice a few things, like Stack Exchange login. Those were things we could reasonably live with. The main problem was importing comments from WordPress. We ran into issue after issue trying to get this import right:
- Using Disqus’ built-in WXR importer to take the XML export from WordPress and migrate (errors, didn’t work)
- Using their official plugin on WordPress, installing, and doing a sync (also didn’t work)
- Using their API to parse the XML, and create threads and posts into our account (didn’t work either, API is not very well documented and kept running into auth errors)
The worst part of this was how little support we got from the Disqus team. We waited on the order of weeks for support responses, and for over a month our issues went unresolved. Official support tickets, emails, and posts on their forum went unnoticed. Even tweeting didn't seem to work:
@disqushelp Trying to move the Stack Overflow blog to Disqus, but imports (even with all troubleshoot attempts) not working. Case #437292
— Jon Chan (@JonHMChan) May 18, 2015
@disqushelp Guys – any update on #437292? WP import is our last hurdle to launching.
— Jon Chan (@JonHMChan) May 21, 2015
@disqushelp Guys – I’ve had a support ticket open for over 2 weeks now #437292. Any progress on this?
— Jon Chan (@JonHMChan) May 29, 2015
In the end, we opted to do something else entirely: we did a manual import of the comments to date and statically generated JSON that would be lazy loaded and injected into individual posts, marked up simply as archived comments in HTML. You can see the heart of the Python script I wrote to generate that JSON here:
{% highlight python %}
import os, json, time

result = {}
for post in blog_data:
    # Map each post to its URL slug, e.g. /2015/05/title
    slug = post["url"].replace("http://blog.stackexchange.com", "")
    slug = slug[:-1] if slug[-1:] == "/" else slug
    result[slug] = []
    for comment in post["comments"]:
        # Strip private fields before publishing the archive
        del(comment["author_email"])
        del(comment["ip_address"])
        date_posted = time.strptime(comment["date"], "%Y-%m-%d %H:%M:%S")
        comment["date"] = time.strftime("%b %d, %Y", date_posted)
        result[slug].append(comment)
    json_result = json.dumps({"response": result[slug]})
    parts = slug.split("/")
    if len(parts) < 3:
        print slug
        print parts
    directory = "json/comments/" + parts[1] + "/" + parts[2]
    if not os.path.exists(directory):
        os.makedirs(directory)
    f = open("json/comments" + slug + ".json", "w")
    f.write(json_result)
    f.close()
{% endhighlight %}
For comments moving forward, we would use Disqus with a blank slate. If they ever got back to us or helped us with an import, we would backfill the archive then (we still want their help!). We simply couldn't keep delaying shipping the new blog, so we settled on this interim solution.
Final thoughts
This was a much larger project than I had originally anticipated, but I'm very happy with the results. Early feedback has been pretty positive, and while a migration this big inevitably comes with bugs, between Meta and the open sourced code, we're fixing them quickly. We've already received pull requests with changes and bug fixes.
In the end, I’m glad that there’s finally a single destination for our engineering team to reach out to the technical community we love being a part of. You’ll see much more content like this from our engineering team moving forward, and we’d love to hear your feedback. Until next time!
Jon is a developer and heads up evangelism efforts at Stack Exchange. You can follow him on Twitter.
45 Comments
Did you consider using something like Discourse and its ability to embed comments into posts? Using that you could keep the SE login options, right?
We did. When we were originally looking at solutions, we were also evaluating whether we could import previous comments – Discourse doesn’t have as much support for it as Disqus does: https://meta.discourse.org/t/is-it-possibile-to-import-existing-wordpress-comments-along-with-post-into-discourse/18157
That being said, we did end up shipping without a truly complete import. That leaves the door open again for a change, but if Disqus works for us moving forward, I don’t see too much reason to switch again.
FYI Disqus supports single sign on integration so SE login should work too.
I’m surprised that the lack of feedback from Disqus didn’t lure you away – it doesn’t really give you a great image of what you’re getting into.
Was it not possible to manually import comments into Discourse either?
Good writeup!
Looks like a solid basis for a lot of things, perhaps even community blogging on SO one day?
Actually, the feed for the Stack Exchange / Stack Overflow blog (in Digg Reader) first showed many old posts as unread after the change, and then going to the original post stopped working. I had to delete the old feed and add it anew.
Before I start, I know about Jekyll:
http://stackoverflow.com/tags/jekyll/topusers
Jekyll is of course hugely popular, but it is a nightmare to set up on Windows. Their "Get up and running in seconds" on the home page is bullshit, because Ruby doesn't even come preinstalled on Windows. I switched to Hugo. Hugo is open source, written in Go, and the download is a single exe. It has built-in support for pagination and loose requirements on folder structure. For example, you can put each post in its own folder if you want, with its accompanying images.
Also, because of its size, the old blog XML does not render on GitHub:
http://github.com/StackExchange/blog/blob/c285122/blog.xml
I appreciate the transparency, but no one is going to submit a Pull Request for that XML file, same for that image folder. So you have dumped an extra 65 MB on anyone who might want to contribute to the blog. Better solution: move binary files and files that were never a part of the current blog out.
Even as a largely .NET shop, we didn't have much trouble with Jekyll being less well-supported on Windows. As I mentioned in the post, Jekyll was something we had used sparingly for other projects, and many of our own devs switched to it for their personal blogs – it's something we were comfortable in.
As for the XML file, agreed and it’s been removed. For the images, it’s a bit harder. We need to find a better way to migrate that out of the repo and somewhere external without breaking references everywhere.
The target audience probably doesn’t have a windows-majority (as you would find in the general population), being rather technical – eg: markdown, git, etc, tend to be technologies that are picked up by techies, not the general population, so I don’t think issues with MSFT are really relevant to Jekyll.
Disregarding Windows as a development environment is just lazy programming. Jekyll is a Ruby gem; that means it is, and should be, cross platform, including on Windows. It appears every statement you have made is a generalization – do you have any data to back it up? If not, spare me your opinions.
I'm on Mac OS and I've had huge issues using it in my local development workflow as well. I love the idea behind it, and once it's running it's great, but I personally just spent way too much time setting it up. And unfortunately, the documentation is not good, imo.
https://github.com/jsturtevant/jekyll-vagrant
jekyll-vagrant worked really well for me on Windows. Installed Vagrant, enabled Hyper-V (Windows 8.1), cloned the repo, then vagrant up and I was off and running.
By the way the jekyll instructions specifically say that it won’t work well on Windows and that it isn’t officially supported, and if you’ve used Ruby on Windows for any length of time then you know damn well that ruby gems are not guaranteed to be cross platform.
When I open this blog post in Thunderbird, https://platform.twitter.com/jot.html pops up in my Firefox. It’s not the case with other blog posts. Interesting bug, not sure where exactly.
I would like to see the motivation to choose Jekyll vs other open source static engines like Hugo or Pretzel which present some advantages like speed or easy customization.
I can see Hugo, but is Pretzel even cross platform? They only have a Windows build available.
Yes, notoriously so, but SO has a strong investment in .NET, so it could have been a viable choice for them. My question is about the specific reason for SO to choose Jekyll among the hundreds of generators (https://staticsitegenerators.net/). I would be surprised if they did no evaluation and didn't even measure the tools to choose the one that suited best.
agreed. i would like to have seen what SO could have done with an investment in Pretzel.
I only mentioned it briefly by saying we “looked at static site generation in Go (Hugo) and even considered building our own engine,” but we did do quite a bit of evaluation here. After looking at the typical factors (performance, build times, platform support), the tradeoffs between each platform really came down to how robust and active a community behind it was (Jekyll won out here) and what we were familiar and comfortable building in (Jekyll was previously in some projects and many of our devs’ personal blogs are on it).
Thanks Jon for the insight. That's exactly the kind of evaluation that real people do on real projects: the technical merits of a solution are just one variable in the equation; organizational factors weigh just as much.
Typo:
“There are few examples of major migration efforts to Jekyll out there, and **THIS** could be one of them.”
I hate to say it, but the JavaScript here isn't of the best level. Some stuff isn't great, and some stuff like `$(".subheader a.category#" + channel).addClass("active");` should definitely be improved, in my humble opinion. I'd love a chance to improve it – would pull requests be entertained?
Good writeup otherwise.
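For instance, here’s a minimal sketch of one way that selector could be built more defensively (the `categorySelector` helper name is hypothetical, and I’m assuming `channel` is an arbitrary string taken from the page or URL):

```javascript
// Hypothetical helper: build the category selector without breaking on
// CSS metacharacters (e.g. a channel id containing "+" or ".").
function categorySelector(channel) {
  // Backslash-escape anything that is not a safe identifier character,
  // roughly mimicking what CSS.escape does in modern browsers.
  var escaped = String(channel).replace(/[^a-zA-Z0-9_-]/g, function (ch) {
    return "\\" + ch;
  });
  return ".subheader a.category#" + escaped;
}

// Usage with jQuery would then be:
//   $(categorySelector(channel)).addClass("active");
```

Alternatively, a data attribute with a `.filter()` call avoids selector-string concatenation entirely.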
Absolutely, that’s why we open sourced it. Go at it!
Just throwing it out there that your public WordPress dump discloses the names, e-mail addresses, IPs, etc. for anyone who ever commented on your posts. I believe most people commenting on WordPress blogs expect that information to remain private, especially since the default WordPress commenting form says the e-mail address will never be made public.
I suppose you could’ve gotten permission or had it in your site’s terms, but I’d consider this a fairly serious information-disclosure issue and would remove that file from the public repository if at all possible (assuming it hasn’t been forked yet)…
Noted and it’s been resolved.
Very nice article! Would be nice to know what the main reason against Hugo was (which wouldn’t have required writing pagination yourself)? Thanks!
Amazing & cool. However, I noticed a 404.
Screenshot: http://grab.by/IzKu
Nice! Would love to hear what the developers / bloggers think about it. What do your authors use to generate posts? Do you have a GUI like http://prose.io/, or do you just edit directly on GitHub? Do you have an approval process using pull requests? And how do you upload and structure images?
Kinda disturbing you got no support from Disqus…
Somewhat surprising they’ve not tried Discourse!
Discourse is a “forum”, not a commenting system.
Well… it *can* also be used as a comment system in static blog posts (see http://eviltrout.com/2014/01/22/embedding-discourse.html). It may have better support… and it’s built by StackExchange alumni.
Nice work Jon!
Our designer also took inspiration from Etsy’s engineering blog and created eatcodeplay.com. We’re still on WordPress but are hoping to move to Jekyll soon as well.
WordPress is quite heavy; you can apply lots of caching and optimization to it, but ultimately it will never perform as well as a static site. Static content is also a lot easier to distribute over a CDN.
I ditched WordPress a while back as well, for an engine called Snow, which is pretty much inspired by Jekyll: http://www.filipekberg.se/2014/05/21/goodbye-wordpress-hello-snow/
Built-in search is convenient in WordPress, how did replacing it with Google Custom Search work out so far?
Interesting post! We had a big discussion at our company about whether we should go with WordPress, Ghost, Jekyll, Middleman, or something else. Even though we’re a technology company (Code School), we ended up going with WordPress, which allows non-technical people to easily manage the blog. How do you foresee bringing new non-technical people on board with the blog?
Another option I’d wonder about for your situation is WordPress Multisite, which allows the same WordPress installation to independently host multiple sites (as far as I know – I haven’t tried it, so I’m a bit fuzzy on the details). Did you look into that option at all?
Oh I love Code School! (I’m the founder of Bento on the side)
For our non-techies, we actually think learning Markdown isn’t that hard, and the GitHub interface makes it pretty darn easy to edit posts, add new ones, and commit changes. There’s also http://prose.io/, which we’re exploring, that may make it even easier for non-developers. We think it’s actually a plus that our marketing team, and to some extent our community team, learn some of the basic technical skills needed to use the blog.
Would be happy to chat with you more about it (and about Code School!)
Prose is pretty great! Unfortunately it currently has 125 open issues and they’ve been looking for new maintainers for almost a year: https://github.com/prose/prose/issues/743
Hey Jon. Sorry to hear you’re still having trouble with your Disqus import. I just wanted to let you know that we’re still here to help. We identified the cause of the errors you emailed us about in case #437292. We’re not sure if you’ve been receiving our recent emails. I’ve added your Disqus email address, just in case you’re missing our replies. You can reply in that thread to let us know what outstanding issues you are hitting with the import process – we’re happy to assist further to make sure we get this resolved.
Also, sorry we missed you on Twitter. Our primary support channel is https://discussdisqus.disq.us. More details on our available support resources can be found here: https://disqus.com/support.
It’s awesome to see how much you all value community. Looking forward to hearing from you.
Hi, my name is Michael. Disqus is almost impossible to contact, and you’re the only support member I see. I post mainly on WYFF Channel 4 in Greenville, SC, and have phoned them a dozen times over my issue, and they refuse to help. First off, I have reported clones to your site and to WYFF that threaten terrorist acts and child molestation. Second, and more distressing, I have people who use my real name and my actual pictures in profiles to threaten terrorist acts on schools, claim to be child molesters, and say very horrible things. WYFF only says they let the site moderate itself. I have threatened legal action because it is libel, and WYFF and Disqus do nothing to block the offending IP addresses or profiles… If you look into it, the clone is of course under my name “Liberal Like Christ”; the one who uses my picture and real name is “Michael Xai”, and the one using a picture of me as a child is “Victim of Liberal Like Christ”. WYFF refuses to help in any way, refuses to give me contact information to reach you, and just says they take no responsibility for their site. Please help me with this, as people are committing crimes under my name and pictures, and it has me very upset.
RSS is a fickle fickle beast. You might want to peek at http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fblog.stackexchange.com%2Ffeed%2F.
Great post!
I have been going through my own Jekyll revelation and migration, in my case moving a lot of sites from Drupal.
I’ve been using Jekyll to create a website that also feeds the components behind a board game, so that involves PDF generation and the like; it’s been working quite well so far.
Migration has often been the biggest issue. For my next batch I have some Drupal content that goes back years, and I’m still not entirely sure how I’m going to accomplish it.
Great job, Jon, congratulations! I wonder how you created the authors feature in Jekyll?
Great post. When I find some time, I will take a deep look into this pagination workaround.
Really good blog, thank you very much for your time in writing the posts.
I’ve just created my first website(s) this weekend with Google Sites and GitHub Pages (which uses Jekyll). However, after you wrote this piece a new player entered the stage called Hugo, which uses compiled Go instead of the interpreted Ruby gems that Jekyll uses. As such, Hugo is 35 times faster. Hugo also works nicely with Bootstrap 5 (so they say), which gives a better mobile frontend (so they say).
All I can say is both Jekyll and Hugo excel at blogs, and I have no blogs – BUT I do have 88k points of upvoted Ask Ubuntu (on Stack Exchange) answers I’d like to convert into blog posts. Although I’ve processed all my answers in Bash (very painful), I’ve since learned Python and will rewrite that old Bash program. I’ve also since learned SQL, which makes data mining the answers in Ask Ubuntu (plus some in Stack Overflow and Linux & Unix) easier.
To make a long story short, has anyone done this already? Convert Stack Exchange Q&A into blog posts?