<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="generator"
content="HTML Tidy for HTML5 for Linux version 5.6.0">
<title>senders.io - How I Generate My RSS Feed</title>
<link rel='stylesheet'
type='text/css'
href='/index.css'>
<meta name="viewport"
content="width=device-width, initial-scale=1">
</head>
<body>
<div id='header'>
<a class='title'
href='/'>senders.io</a>
<nav>
<a href="/blog">blog</a> <a rel="external noopener noreferrer"
target="_blank"
href="https://github.com/s3nd3r5">github</a> <a rel=
"external noopener noreferrer"
target="_blank"
href="https://git.senders.io">cgit</a> <a rel=
"me external noopener noreferrer"
target="_blank"
href="https://tech.lgbt/@senders">mastodon</a>
</nav>
</div>
<div id="body">
<article>
<h1>How I Generate My RSS Feed</h1>
<p>I only just now started supplying an RSS feed to you fine people! You
can subscribe to it at <a href=
"/blog/feed.rss">www.senders.io/blog/feed.rss</a>!</p>
<p>Rather than writing the file contents by hand, I decided to hook into
my pre-existing publish scripts so they generate the RSS file for
me.</p>
<h2>Publishing blog posts - shell scripts ftw</h2>
<p>In <a href="/blog/2022-11-06/">My Markdown -> HTML Setup</a> I
touch on how I publish my markdown files into HTML for this blog. But
what I don’t <em>really</em> touch on is the shell scripts that tie the
whole process together.</p>
<p>What I have is two, now three, scripts that feed the whole
process:</p>
<ol>
<li><code>publish-blog.sh</code> - the main script</li>
<li><code>compile-md.sh</code> - generates the HTML output</li>
<li><code>update-feed.sh</code> - generates/appends the RSS feed</li>
</ol>
<p>The <code>update-feed.sh</code> script is the new one I just
added.</p>
<p><code>publish-blog.sh</code> is the primary interface: I supply the
date of the post and the path to the md file, and it calls the compile
and update scripts to automate the entire process.</p>
<p>Without going into TOO much detail, you can view the latest versions of
the scripts at <a rel="external noopener noreferrer"
target="_blank"
href=
"https://git.senders.io/senders/senders-io/tree/">git.senders.io/senders/senders-io/tree/</a>.</p>
<p>But the gist of the scripts is that I parse out the necessary details,
find/replace some tokens in template files I have set up for headers and
footers, and concat the results into the final HTML files, and now the
RSS feed.</p>
<h3>update-feed.sh</h3>
<p>Source File: <a rel="external noopener noreferrer"
target="_blank"
href=
"https://git.senders.io/senders/senders-io/tree/update-feed.sh">git.senders.io/senders/senders-io/tree/update-feed.sh</a></p>
<p>This script is pretty interesting. I didn’t want to deal with any XML
parsers and libraries just to maintain a proper RSS file and push items
into the tree. Rather, I just follow a similar setup to my markdown
generation. I leverage some temporary files to hold the contents, a
static temp file for the previously generated content, and at the end
swap the temp file with the real file.</p>
<p>I take in an input of the publish date (this is the date from the
publish script), the title, and the HTML file path. These are all already
variables in the publish script, but also something I can manually supply
if I need to publish an older article, or something I wrote directly in
HTML.</p>
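<p>As a minimal sketch of the date handling, assuming GNU
<code>date</code>: <code>-d</code> parses the supplied date and
<code>-R</code> prints it in the email-style format used for
<code>pubDate</code> in RSS. The <code>to_pubdate</code> helper and the
pinned <code>TZ=UTC</code> are assumptions for illustration, not part of
the real script, which uses the local timezone.</p>

```shell
# Hypothetical helper: convert a YYYY-MM-DD date into an RFC 5322 style timestamp.
# Requires GNU date; -d and -R are GNU options.
# TZ=UTC is pinned here only so the output is the same on every machine.
to_pubdate() {
  TZ=UTC date -d "$1" -R
}

to_pubdate "2023-01-06"   # prints: Fri, 06 Jan 2023 00:00:00 +0000
```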
<p>The core of the script is found here:</p>
<pre><code>PUBDATE=$(date -d "$1" -R)
TITLE=$2
FILE_PATH=$3
PERMALINK=$(echo "${FILE_PATH}" | sed -e "s,${TKN_URL_STRIP},${URL_PREFIX},g")
LINK=$(echo "${PERMALINK}" | sed -e "s,${TKN_INDEX_STRIP},,g")
# Generate TMP FEED File Header
cat -s $FILE_RSS_HEADER > $FILE_TMP_FEED
sed -i -E "s/${TKN_BUILDDATE}/${BUILDDATE}/g" $FILE_TMP_FEED
sed -i -E "s/${TKN_PUBDATE}/${PUBDATE}/g" $FILE_TMP_FEED
# Generate TMP Item File
cat -s $FILE_RSS_ITEM_HEADER > $FILE_TMP_ITEM
sed -i -E "s~${TKN_TITLE}~${TITLE}~g" $FILE_TMP_ITEM
sed -i -E "s/${TKN_PUBDATE}/${PUBDATE}/g" $FILE_TMP_ITEM
sed -i -E "s,${TKN_PERMALINK},${PERMALINK},g" $FILE_TMP_ITEM
sed -i -E "s,${TKN_LINK},${LINK},g" $FILE_TMP_ITEM
sed -n "/&lt;article&gt;/,/&lt;\/article&gt;/p" $FILE_PATH >> $FILE_TMP_ITEM
cat -s $FILE_RSS_ITEM_FOOTER >> $FILE_TMP_ITEM
# Prepend Item to items list and overwrite items file w/ prepended item
## In order to "prepend" the item (so it's on top of the others)
## We need to concat the tmp item file with the existing list, then
## we can push the contents over the existing file
## We use cat -s to squeeze the blank lines
cat -s $FILE_ITEM_OUTPUT >> $FILE_TMP_ITEM
cat -s $FILE_TMP_ITEM > $FILE_ITEM_OUTPUT
# Push items to TMP FEED
cat -s $FILE_ITEM_OUTPUT >> $FILE_TMP_FEED
# Push RSS footer to TMP FEED
cat -s $FILE_RSS_FOOTER >> $FILE_TMP_FEED
echo $FILE_TMP_FEED
# Publish feed
cat -s $FILE_TMP_FEED > $FILE_RSS_OUTPUT
echo "Finished generating feed"
</code></pre>
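<p>The prepend step is worth a closer look. Redirecting straight into the
items file (<code>cat item list > list</code>) would not work, because the
shell truncates <code>list</code> before <code>cat</code> ever reads it.
Here is a small standalone sketch of the same idea using a scratch file;
<code>prepend_item</code> and its arguments are hypothetical names, not
taken from the real script.</p>

```shell
# Hypothetical helper: put the contents of "$1" on top of the list file "$2".
prepend_item() {
  local item=$1 list=$2
  local tmp
  tmp=$(mktemp)                       # scratch file with a unique name
  cat -s "$item" "$list" > "$tmp"     # new item first, then the existing items
  cat -s "$tmp" > "$list"             # overwrite the list only once tmp is complete
  rm -f "$tmp"
}
```

<p>Usage would look like <code>prepend_item new-item.xml items.xml</code>,
where both file names are placeholders.</p>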
<p>Some key takeaways are:</p>
<ol>
<li>sed lets you do regex with delimiters that AREN’T <code>/</code> so
you can substitute something that shouldn’t actually ever show up in
your regex. For me that is <code>~</code>.</li>
<li>I always forget you can use sed to extract between tokens - which
is how I get the CDATA for the RSS: <code>sed -n
"/&lt;article&gt;/,/&lt;\/article&gt;/p"</code></li>
<li><code>mktemp</code> is really REALLY useful - and I feel it is
underutilized in shell scripting</li>
</ol>
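<p>The first two takeaways can be tried in isolation with a sketch like
this one. The <code>fill_token</code> and <code>extract_article</code>
functions, the <code>__LINK__</code> token, and the example URL are all
made up for illustration; the real script uses its own token
variables.</p>

```shell
# Hypothetical helper: replace a token on stdin with a value that may contain slashes.
# "~" is the delimiter, so "/" in the value needs no escaping.
fill_token() {
  local token=$1 value=$2
  sed -E "s~${token}~${value}~g"
}

# Hypothetical helper: print only the lines from <article> through </article>, inclusive.
extract_article() {
  sed -n '/<article>/,/<\/article>/p' "$1"
}

printf '%s\n' '<link>__LINK__</link>' | fill_token '__LINK__' 'https://example.com/blog/'
# prints: <link>https://example.com/blog/</link>
```

<p>One caveat ties into the cracks listed next: if the value itself contains
the delimiter or an <code>&</code>, sed treats it specially and the
substitution no longer does what you meant.</p>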
<p>The obvious cracks are:</p>
<ol>
<li>I rely SO much on <code>sed</code> that it’s almost certainly going
to break</li>
<li>I don’t have much other flag control to do partial generation - so
if I need to do something either starting partway through or not finish
the full process, I don’t have that.</li>
<li>Sometimes things can break silently and still go through. There is
no verification or manual checking along the way before publishing the
feed.rss</li>
</ol>
<p>The final two can easily be managed by writing the feed to a location
that isn’t a temp file, so I can check it over and then run <code>cat -s
$FILE_TMP_FEED > www/blog/feed.rss</code> myself.</p>
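<p>A guard along these lines could catch the most obvious silent failures
before anything is published. This is only a sketch under assumptions:
<code>publish_feed</code> is a hypothetical function, and the check is
deliberately crude (the staged file must be non-empty and contain a
closing <code>&lt;/rss&gt;</code> tag), so it is no substitute for reading
the feed over.</p>

```shell
# Hypothetical helper: copy a staged feed to its destination only if it looks complete.
publish_feed() {
  local staged=$1 dest=$2
  # -s: file exists and is non-empty; grep -q: closing root tag is present
  if [ -s "$staged" ] && grep -q '</rss>' "$staged"; then
    cat -s "$staged" > "$dest"
  else
    echo "refusing to publish: $staged looks incomplete" >&2
    return 1
  fi
}
```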
<p>But for now I’ll see if I ever have to redo it. I don’t think anyone
will actually sub to this so I don’t really need to care that much if I
amend the feed.</p>
<h2>Where to put the feed URL</h2>
<p>I never intended to provide an RSS feed. I doubt anyone but me reads
this, and in my previous experience with gemini, feed generation was a
bit of a headache.</p>
<p>A quick aside: I really only decided to thanks to Mastodon. I was
thinking during the Twitter meltdown “what if Twitter but RSS” (I know,
super unique idea). But basically like a true “microblog”, with some OSS
tools to publish your blog. This got me reading the RSS spec and looking
into it more - which then led me to using RSS readers more (in
conjunction with gemini, and the Cortex podcast talking about using RSS
more).</p>
<p>But I’ve decided to just put the RSS feed in the blog index, on my
homepage, and that’s it. I don’t need it permanently in the header.</p>
<h2>Conclusion</h2>
<p>I didn’t have much to share here. It doesn’t make too much sense to
write a big post on what can be explained better by just checking out the
shell scripts in my git source. The code speaks better than I ever
could.</p>
<p>I really, really like shell scripting.</p>
</article>
<div id="footer">
<i>January 06, 2023</i>
</div>
<div id='copyright'>
© 2023 senders dot io - <a rel="license external noopener noreferrer"
target="_blank"
href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA 4.0</a>
unless otherwise noted.
</div>
</div>
</body>
</html>