--post-date: 2023-01-06
--type: blog
    <article>
      <h1>How I Generate My RSS Feed</h1>
      <p>I only just now started supplying an RSS feed to you fine people! You
      can subscribe to it at <a href=
      "/blog/feed.rss">www.senders.io/blog/feed.rss</a>!</p>
      <p>Rather than generating the file contents by hand, I decided to hook
      into my existing publish scripts so they generate the RSS file for
      me.</p>
      <h2>Publishing blog posts - shell scripts ftw</h2>
      <p>In <a href="/blog/2022-11-06/">My Markdown -&gt; HTML Setup</a> I
      touch on how I publish my markdown files into HTML for this blog. But
      what I don’t <em>really</em> touch on is the shell scripts that tie the
      whole process together.</p>
      <p>What I have is two, now three, scripts that feed the whole
      process:</p>
      <ol>
        <li><code>publish-blog.sh</code> - the main script</li>
        <li><code>compile-md.sh</code> - generates the HTML output</li>
        <li><code>update-feed.sh</code> - generates/appends the RSS feed</li>
      </ol>
      <p>The <code>update-feed.sh</code> script is the new one I just
      added.</p>
      <p><code>publish-blog.sh</code> is the primary interface: I supply the
      date of the post and the path to the md file, and it calls the compile
      and update scripts to automate the entire process.</p>
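      <p>As a rough sketch of that flow (not the real script): the two
      functions below are stand-ins for <code>compile-md.sh</code> and
      <code>update-feed.sh</code>, and the output path layout and the "title
      is the first markdown heading" rule are assumptions made purely for
      illustration.</p>

```shell
# Stand-ins so the sketch runs on its own; the real work lives in
# compile-md.sh and update-feed.sh.
compile_md()  { echo "compile: $1 -> $2"; }
update_feed() { echo "feed: $1 | $2 | $3"; }

# publish_blog POST_DATE MD_FILE
publish_blog() {
  post_date=$1
  md_file=$2
  # Assumed layout: one directory per post date.
  html_file="www/blog/${post_date}/index.html"
  # Assumed convention: the title is the first "# " heading in the markdown.
  title=$(sed -n 's/^# //p' "$md_file" | head -n 1)
  compile_md "$md_file" "$html_file"
  update_feed "$post_date" "$title" "$html_file"
}

# Demo with a throwaway markdown file.
md=$(mktemp)
printf '%s\n' '# Example Post' '' 'Some text.' > "$md"
publish_blog 2023-01-06 "$md"
```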
      <p>Without going into TOO much detail you can view the latest versions of
      the scripts at <a rel="external noopener noreferrer"
         target="_blank"
         href=
         "https://git.senders.io/senders/senders-io/tree/">git.senders.io/senders/senders-io/tree/</a>.</p>
      <p>But the gist of the scripts is that I parse out the necessary
      details, find/replace some tokens in template files I have set up for
      headers and footers, and concat the results into the final HTML files,
      and now the RSS feed.</p>
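      <p>A minimal sketch of that token find/replace and concat idea. The
      <code>render_template</code> helper, the <code>{{TITLE}}</code> token
      and the file names are all made up for this example, and the helper
      assumes the token and value contain no <code>|</code>,
      <code>&amp;</code> or backslash.</p>

```shell
# Hypothetical helper: print TEMPLATE with every TOKEN swapped for VALUE.
# render_template TEMPLATE TOKEN VALUE
render_template() {
  sed -e "s|$2|$3|g" "$1"
}

# Throwaway template pieces standing in for header, body and footer.
workdir=$(mktemp -d)
printf '%s\n' 'Title: {{TITLE}}' > "$workdir/header.tpl"
printf '%s\n' 'Body text'        > "$workdir/body.txt"
printf '%s\n' 'The end'          > "$workdir/footer.tpl"

# Concatenate the filled header, the body and the footer into one output.
{
  render_template "$workdir/header.tpl" '{{TITLE}}' 'My Post'
  cat "$workdir/body.txt" "$workdir/footer.tpl"
} > "$workdir/out.txt"

cat "$workdir/out.txt"
```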
      <h3>update-feed.sh</h3>
      <p>Source File: <a rel="external noopener noreferrer"
         target="_blank"
         href=
         "https://git.senders.io/senders/senders-io/tree/update-feed.sh">git.senders.io/senders/senders-io/tree/update-feed.sh</a></p>
      <p>This script is pretty interesting. I didn’t want to deal with any XML
      parsers and libraries just to maintain a proper RSS file and push items
      into the tree. Rather, I just follow a similar setup to my markdown
      generation. I leverage some temporary files to hold the contents, a
      static temp file for the previously generated content, and at the end
      swap the temp file with the real file.</p>
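      <p>The temp-file shuffle described above can be sketched roughly like
      this. The file names and the one-line "header", "item" and "footer"
      contents are placeholders, not what the real script uses.</p>

```shell
workdir=$(mktemp -d)
items="$workdir/items.xml"   # stand-in for the persistent items list
feed="$workdir/feed.rss"     # stand-in for the published feed
printf '%s\n' 'older item' > "$items"

tmp_item=$(mktemp)           # scratch file for the new item
tmp_feed=$(mktemp)           # scratch file for the whole feed

# Prepend: new item first, existing items after, then overwrite the list.
printf '%s\n' 'newest item' > "$tmp_item"
cat -s "$items" >> "$tmp_item"
cat -s "$tmp_item" > "$items"

# Assemble header + items + footer in the temp feed, then publish it.
printf '%s\n' 'header' > "$tmp_feed"
cat -s "$items" >> "$tmp_feed"
printf '%s\n' 'footer' >> "$tmp_feed"
cat -s "$tmp_feed" > "$feed"

rm -f "$tmp_item" "$tmp_feed"
cat "$feed"
```

      <p>The real feed file is only touched by the last <code>cat</code>, so
      a failure earlier on leaves the published feed as it was.</p>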
      <p>The script takes three inputs: the publish date (this is the date
      from the publish script), the title, and the HTML file path. These are
      all already variables in the publish script, but I can also supply them
      manually if I need to publish an older article, or something I wrote
      directly in HTML.</p>
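      <p>For a sense of what handling those three inputs could look like,
      here is a hedged sketch. The <code>parse_args</code> function and its
      error messages are hypothetical, and <code>date -d ... -R</code> is GNU
      <code>date</code> specific.</p>

```shell
# Hypothetical argument check for a call shaped like:
#   update-feed.sh DATE TITLE HTML_PATH
parse_args() {
  if [ "$#" -ne 3 ]; then
    echo "usage: update-feed.sh DATE TITLE HTML_PATH" >&2
    return 1
  fi
  # GNU date: -d parses the input, -R prints an e-mail style date.
  if ! PUBDATE=$(date -d "$1" -R 2>/dev/null); then
    echo "invalid date: $1" >&2
    return 1
  fi
  TITLE=$2
  FILE_PATH=$3
  if [ ! -f "$FILE_PATH" ]; then
    echo "no such file: $FILE_PATH" >&2
    return 1
  fi
}

# Demo with a throwaway file standing in for the generated HTML.
demo_file=$(mktemp)
if parse_args "2023-01-06" "How I Generate My RSS Feed" "$demo_file"; then
  echo "$PUBDATE | $TITLE"
fi
```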
      <p>The core of the script is found here:</p>
      <pre><code>PUBDATE=$(date -d &quot;$1&quot; -R)
TITLE=$2
FILE_PATH=$3
PERMALINK=$(echo &quot;${FILE_PATH}&quot; | sed -e &quot;s,${TKN_URL_STRIP},${URL_PREFIX},g&quot;)
LINK=$(echo &quot;${PERMALINK}&quot; | sed -e &quot;s,${TKN_INDEX_STRIP},,g&quot;)

# Generate TMP FEED File Header

cat -s $FILE_RSS_HEADER &gt; $FILE_TMP_FEED
sed -i -E &quot;s/${TKN_BUILDDATE}/${BUILDDATE}/g&quot; $FILE_TMP_FEED
sed -i -E &quot;s/${TKN_PUBDATE}/${PUBDATE}/g&quot; $FILE_TMP_FEED

# Generate TMP Item File

cat -s $FILE_RSS_ITEM_HEADER &gt; $FILE_TMP_ITEM
sed -i -E &quot;s~${TKN_TITLE}~${TITLE}~g&quot; $FILE_TMP_ITEM
sed -i -E &quot;s/${TKN_PUBDATE}/${PUBDATE}/g&quot; $FILE_TMP_ITEM
sed -i -E &quot;s,${TKN_PERMALINK},${PERMALINK},g&quot; $FILE_TMP_ITEM
sed -i -E &quot;s,${TKN_LINK},${LINK},g&quot; $FILE_TMP_ITEM
sed -n &quot;/&lt;article&gt;/,/&lt;\/article&gt;/p&quot; $FILE_PATH &gt;&gt; $FILE_TMP_ITEM
cat -s $FILE_RSS_ITEM_FOOTER &gt;&gt; $FILE_TMP_ITEM

# Prepend Item to items list and overwrite items file w/ prepended item
## In order to &quot;prepend&quot; the item (so it&#39;s on top of the others)
## We need to concat the tmp item file with the existing list, then
## we can push the contents over the existing file
## We use cat -s to squeeze the blank lines
cat -s $FILE_ITEM_OUTPUT &gt;&gt; $FILE_TMP_ITEM
cat -s $FILE_TMP_ITEM &gt; $FILE_ITEM_OUTPUT

# Push items to TMP FEED
cat -s $FILE_ITEM_OUTPUT &gt;&gt; $FILE_TMP_FEED

# Push RSS footer to TMP FEED
cat -s $FILE_RSS_FOOTER &gt;&gt; $FILE_TMP_FEED
echo $FILE_TMP_FEED

# Publish feed
cat -s $FILE_TMP_FEED &gt; $FILE_RSS_OUTPUT

echo &quot;Finished generating feed&quot;
</code></pre>
      <p>Some key takeaways are:</p>
      <ol>
        <li><code>sed</code> lets you use delimiters that AREN’T
        <code>/</code>, so you can pick a character that should never actually
        show up in your pattern or replacement. For me that is
        <code>~</code>.</li>
        <li>I always forget you can use sed to extract between tokens - which
        is how I get the CDATA for the RSS: <code>sed -n
        &quot;/&lt;article&gt;/,/&lt;\/article&gt;/p&quot;</code></li>
        <li><code>mktemp</code> is really REALLY useful - and I feel it is
        underutilized in shell scripting</li>
      </ol>
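      <p>A small, self-contained demo of those three takeaways. The
      <code>BEGIN</code>/<code>END</code> markers stand in for the
      <code>&lt;article&gt;</code> tags, and <code>example.com</code> and the
      path are placeholders.</p>

```shell
page=$(mktemp)   # mktemp gives a unique throwaway file
printf '%s\n' 'site header' 'BEGIN' 'post body' 'END' 'site footer' > "$page"

# 1. A non-slash delimiter: with "," the slashes in a URL need no escaping.
link=$(echo 'www/blog/post/index.html' | sed -e 's,^www/,https://example.com/,')
echo "$link"

# 2. Range extraction: print only the lines from BEGIN through END.
body=$(sed -n '/BEGIN/,/END/p' "$page")
echo "$body"

rm -f "$page"
```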
      <p>The obvious cracks are:</p>
      <ol>
        <li>I rely SO much on <code>sed</code> that it’s almost certainly going
        to break</li>
        <li>I don’t have any flags to control partial generation - so if I
        need to start partway through, or stop before the full process
        finishes, I can’t.</li>
        <li>Sometimes things can break silently and the script will still go
        through; there is no verification or manual checking along the way
        before publishing the feed.rss</li>
      </ol>
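      <p>On the first crack: one way a <code>sed</code> substitution can
      break is a title that contains the delimiter or an
      <code>&amp;</code>. The sketch below escapes the replacement text
      first; the <code>escape_replacement</code> helper and the
      <code>{{TITLE}}</code> token are hypothetical, not something the real
      script has.</p>

```shell
# Hypothetical helper: escape the characters that are special on the
# replacement side of s~...~...~ (backslash, "&" and the "~" delimiter).
escape_replacement() {
  printf '%s' "$1" | sed -e 's/[\\&~]/\\&/g'
}

title='Tips ~ Tricks & More'
safe_title=$(escape_replacement "$title")

# Without the escaping step this substitution fails on the extra "~".
echo 'TITLE: {{TITLE}}' | sed -e "s~{{TITLE}}~${safe_title}~g"
```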
      <p>The final two can easily be managed by writing the feed to a location
      that isn’t a temp file, so I can check it over and then run the
      <code>cat -s $FILE_TMP_FEED &gt; www/blog/feed.rss</code> myself.</p>
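      <p>One possible shape for that manual check, as a sketch only: show a
      diff of the staged feed against the live one and copy it over only
      after a "y". The <code>publish_feed</code> function and the file names
      are assumptions for the example.</p>

```shell
# Hypothetical publish step.
# publish_feed STAGED_FILE LIVE_FILE
publish_feed() {
  # diff exits non-zero when the files differ, which is expected here.
  diff -u "$2" "$1" || true
  printf 'Publish %s? [y/N] ' "$2"
  read -r answer
  if [ "$answer" = "y" ]; then
    cat -s "$1" > "$2"
    echo "published"
  else
    echo "skipped; staged feed left at $1"
  fi
}

# Demo with throwaway files; the answer is piped in so nothing blocks.
workdir=$(mktemp -d)
printf '%s\n' 'old feed' > "$workdir/feed.rss"
printf '%s\n' 'new feed' > "$workdir/feed.staged.rss"
echo y | publish_feed "$workdir/feed.staged.rss" "$workdir/feed.rss"
```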
      <p>But for now I’ll see if I ever have to redo it. I don’t think anyone
      will actually sub to this so I don’t really need to care that much if I
      amend the feed.</p>
      <h2>Where to put the feed URL</h2>
      <p>I never intended to provide an RSS feed. I doubt anyone but me reads
      this, and in my previous experience with Gemini, feed generation was a
      bit of a headache.</p>
      <p>A quick aside: I really only decided to do it thanks to Mastodon. I
      was thinking during the Twitter meltdown “what if twitter but RSS” (I
      know, super unique idea). But basically like a true “microblog”, with
      some OSS tools to publish your blog. This got me reading the RSS spec
      and looking into it more - which then led me to using RSS readers more
      (in conjunction with Gemini, and the Cortex podcast talking about using
      RSS more).</p>
      <p>But I’ve decided to just put the RSS feed in the blog index, on my
      homepage, and that’s it. I don’t need it permanently in the header.</p>
      <h2>Conclusion</h2>
      <p>I didn’t have much to share here; it doesn’t make too much sense to
      write a big post on what can be explained better by just checking out the
      shell scripts in my git source. The code speaks better than I ever
      could.</p>
      <p>I really, really like shell scripting.</p>
    </article>