Ricky Kresslein

Atom Feed Generator

2024-05-25

I wanted to create an RSS/Atom feed for this blog, but had no idea how to go about it. Since this website is pure HTML, I can't just install a plugin that does it all automatically. After searching around, I found a basic template for an Atom feed. I didn't want to copy and paste everything manually every time I write a new post, so I decided to write a program that would generate the feed for me.

Since Rust is fast and I'm familiar with it, that's what I decided to write the program in. Here is my code. You won't be able to copy and paste it as-is because it is specific to the layout of my blog directory, but hopefully it will be a useful guide to get you started.

use scraper::{Html, Selector};
use std::fs;

fn main() {
    let mut feed: String = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<feed xmlns=\"http://www.w3.org/2005/Atom\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\">
\t<author><name>Ricky Kresslein</name></author>
\t<id>https://kressle.in/feed.xml</id>
\t<title>Ricky Kresslein's Blog</title>"
        .to_owned();
    let blog_page =
        fs::read_to_string("/location/kressle.in/blog")
            .expect("Failed to read file");
    let blog = Html::parse_document(&blog_page);
    let tr = Selector::parse("tr").unwrap();
    let td = Selector::parse("td").unwrap();

    for row in blog.select(&tr) {
        let mut cells = row.select(&td);
        let date = cells.next().unwrap().inner_html();
        let link_frag = Html::parse_fragment(&cells.next().unwrap().inner_html());
        let a = Selector::parse("a").unwrap();
        let link_text = link_frag.select(&a).next().unwrap();
        let link = link_text.value().attr("href").unwrap();
        let title = link_text.inner_html();
        let article_html = fs::read_to_string(&format!(
            "/location/kressle.in/{}",
            link
        ))
        .expect("Failed to read file");
        let article_content = Html::parse_document(&article_html);
        /*let desc_selector = Selector::parse(r#"meta[name="description"]"#).unwrap();
        let meta_description = article_content.select(&desc_selector).next().unwrap();
        let description = meta_description.value().attr("content").unwrap();*/

        // Post title, link, and date
        let mut article: String = "".to_owned();
        article.push_str(&format!("\n\t<entry>\n\t\t<title>{}</title>", title));
        article.push_str(&format!("\n\t\t<id>https://kressle.in/{}</id>", link));
        article.push_str(&format!(
            "\n\t\t<link href=\"https://kressle.in/{}\"/>",
            link
        ));
        article.push_str(&format!("\n\t\t<dc:date>{}T00:00:00+00:00</dc:date>", date));
        article.push_str(&format!("\n\t\t<updated>{}T00:00:00+00:00</updated>", date));
        // Summary shows in reader instead of post - don't use
        // article.push_str(&format!("\n\t\t<summary>{}</summary>", description));

        // Post Content
        article.push_str("\n\t\t<content type=\"html\">\n\t\t\t");
        let p = Selector::parse("p").unwrap();
        let mut paragraphs: Vec<_> = article_content.select(&p).collect();
        paragraphs.pop(); // Remove nav & email elements
        for paragraph in paragraphs {
            // Escape the paragraph's HTML so it is valid inside the XML feed.
            // `&` goes first so the entities added afterwards are not escaped twice.
            let paragraph_fixed = paragraph
                .html()
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("\n", "\n\t\t\t");
            article.push_str(&paragraph_fixed);
        }
        article.push_str("\n\t\t</content>\n\t</entry>");

        feed.push_str(&article);
    }
    feed.push_str("\n</feed>");

    // Write feed.xml
    fs::write("feed.xml", feed).expect("Failed to write feed.xml");
}
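
The escaping step in the content loop could also be pulled out into a small helper. This is only a minimal sketch and not part of the program above; `escape_xml` is a name I made up for illustration. It escapes `&` along with the other characters, which matters if the serialized HTML contains an entity such as `&nbsp;` that XML does not define on its own.

```rust
/// Escape a string so it can sit inside an XML element such as
/// <content type="html">. (Hypothetical helper, not in the original program.)
fn escape_xml(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    // A single pass over the characters, so nothing gets escaped twice.
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            _ => out.push(ch),
        }
    }
    out
}
```

With something like this in place, the loop body would shrink to a single call on `paragraph.html()`, followed by the indentation fix for newlines.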

If you do want to more-or-less copy and paste that code, your blog post index will need to be set up exactly like mine:

<table>
    <tr>
        <td>2024-05-24</td>
        <td><a href="dvorak">Typing Dvorak</a></td>
    </tr>
    <tr>
        <td>2024-05-21</td>
        <td><a href="txt">Text Notes</a></td>
    </tr>
</table>
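
To make the mapping concrete, here is a rough sketch of how one row of that table turns into a feed entry. The function name `entry_xml` and its `base_url` parameter are hypothetical and only there for illustration; the real program builds the same strings inline and also adds `<dc:date>` and `<content>`.

```rust
/// Build a bare-bones Atom <entry> from the values in one index row:
/// the date cell, the link's href, and the link text.
/// (Illustrative sketch; `base_url` is an assumed parameter.)
fn entry_xml(base_url: &str, date: &str, href: &str, title: &str) -> String {
    // The href in the index is relative, so join it onto the site URL.
    let url = format!("{}/{}", base_url.trim_end_matches('/'), href);
    // The index only stores a date, so midnight UTC is appended
    // to get a full timestamp for <updated>.
    let timestamp = format!("{}T00:00:00+00:00", date);
    format!(
        "\t<entry>\n\t\t<title>{title}</title>\n\t\t<id>{url}</id>\n\t\t<link href=\"{url}\"/>\n\t\t<updated>{timestamp}</updated>\n\t</entry>",
        title = title,
        url = url,
        timestamp = timestamp
    )
}
```

For the first row above, `entry_xml("https://kressle.in", "2024-05-24", "dvorak", "Typing Dvorak")` would produce an entry whose `<id>` and `<link>` both point at `https://kressle.in/dvorak`.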

There you have it. I just run that program, move the output feed.xml file to my website directory, and publish the changes. Anyone following this blog via RSS will then receive my latest post right in their reader, with very little effort on my part.