~erock/prose-official-blog

d80c5b7fc7c7051fedae413af614713712fb9fa7 — Eric Bower 3 months ago 8701d69
tweaks
1 files changed, 75 insertions(+), 21 deletions(-)

M rfc-rss.md
M rfc-rss.md => rfc-rss.md +75 -21
@@ 24,7 24,7 @@ Features:
- We could provide aggregate rss feeds for collections
- Web view for the feeds?

## what can we offer over the other readers?
# what can we offer over the other readers?

We would try to provide a great reading experience from the terminal.  No need
to install an RSS reader like newboat.  No need to sync a config file across


@@ 49,12 49,21 @@ is we might have to deal with content security policy and ensuring we could
render the html content consistently.  It definitely opens us open to a bunch
of edge cases.

## how it works
# how it works

A user would `scp` an `opml` file, a file containing a lists of rss feeds, or
`echo "<feed>" | ssh reads.sh`.  

We would dedupe the feeds and add them to our `posts` table.
We would dedupe the feeds and add them to our `posts` table.  Because an RSS
feed can contain a bunch of metadata about a feed, we should capture as much of
that as possible inside the `posts` table.  The downside is we use `posts` for
a lot of our services (e.g. lists, prose, and pastes) so we want to be careful
not to overload this table.  Having said that, I think an rss feed fits into
the post paradigm.  We just need to add a `data jsonb` column to `posts`.

```sql
ALTER TABLE posts ADD COLUMN data jsonb;
```

Triggers for fetching feeds:



@@ 77,24 86,61 @@ email.  They click the button in the email -> we delay shutdown for 30 days.
We would probably create a separate table for the feed results in order to
optimize storing an retrieval.

`feed_entry`

new table: `feed_entry`

```sql
CREATE TABLE IF NOT EXISTS feed_entry (
  id          uuid NOT NILL DEFAULT uuid_generate_v4(),
  post_id     uuid NOT NULL,
  author      character varying(250),
  category    character varying(250),
  published   timestamp without time zone NOT NULL DEFAULT NOW(),
  rights      character varying(2000),
  source      character varying(2000),
  content     text,
  contributor character varying(250),
  atom_id     character varying(250),
  link        character varying(2000),
  summary     text,
  title       character varying(250),
  created_at  timestamp without time zone NOT NULL DEFAULT NOW(),
  updated_at  timestamp without time zone NOT NULL DEFAULT NOW(),
  CONSTRAINT  entry_unique_atom_id UNIQUE (atom_id, post_id),
  CONSTRAINT  feed_entry_pkey PRIMARY KEY (id),
  CONSTRAINT  fk_entry_posts
    FOREIGN KEY(post_id)
  REFERENCES posts(id)
  ON DELETE CASCADE
  ON UPDATE CASCADE
);
```
id          uuid
post_id     uuid
author      varying char(250)
category    varying char(250)
published   datetime
rights      varying char(2000)
source      varying char(2000)
content     text
contributor varying char(250)
atom_id     varying char(250)
link        varying char(2000)
summary     text
title       varying char(250)
created_at  datetime
updated_at  datetime

## queue system

We will probably also want a queuing system.  I figured we could just build one
that fits our purposes inside our database.

new table: `app_queue`

```sql
CREATE TYPE JOB_STATUS AS ENUM ('in_progress', 'success', 'fail');
CREATE TYPE JOB_TYPE AS ENUM ('fetch_feed');

CREATE TABLE IF NOT EXISTS app_queue(
  id          uuid NOT NILL DEFAULT uuid_generate_v4(),
  post_id     uuid NOT NULL,
  status      JOB_STATUS,
  type        JOB_TYPE,
  input       jsonb # params needed to execute job
  output      jsonb # result of job
  created_at  timestamp without time zone NOT NULL DEFAULT NOW(),
  CONSTRAINT  queue_pkey PRIMARY KEY (id),
  CONSTRAINT  fk_queue_posts
    FOREIGN KEY(post_id)
  REFERENCES posts(id)
  ON DELETE CASCADE
  ON UPDATE CASCADE
);
```

I haven't figured out a great way for users to add metadata to their feeds.


@@ 106,7 152,7 @@ format seems like a good candidate.
I like the idea of storing the results in the database, but I could also see an
argument for using something more ephemeral like redis.

## is this a premium service?
# is this a premium service?

The litmus test for a premium service at pico is: does it cost us more
resources to keep the service up and running?


@@ 114,3 160,11 @@ resources to keep the service up and running?
I think creating, categorizing, and viewing the results of your feeds should be
free.  The premium service would be the email digest component.  We could also
offer an always fetching feature for premium users.

# risks

- RSS readers have been done before
- Syncing feeds can be costly in terms of compute resources
- Following atom entry links to the webpage puts us in the scraping category
  which opens us up to stability issues (e.g. some sites deny scrapers)
- Web view might run into content security policy issues