
Using Dev.to as CMS for my Next.js blog

23 January, 2021

My Next.js site fetches my published articles using the dev.to API. It generates all pages at build time, and then regenerates them individually as new requests come in. New articles can also appear on my blog without my having to build and deploy. Here's how it works.

I'll assume the reader is somewhat familiar with Data Fetching in Next.js.

There are two pages on my blog: one with a list of articles (/pages/blog.ts), and one for individual articles (/pages/[slug].ts), which uses dynamic routing.

For the full source code you can check this repository.

Diagram (without cache)

List of Articles

The blog.ts page exports a getStaticProps function. This fetches all of my articles from dev.to using the dev.to API. The list is then passed to the page component and rendered.
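
A minimal sketch of that function, reusing the getAllPosts helper that appears later in this post:

// pages/blog.ts (sketch)
export async function getStaticProps() {
  const posts = await getAllPosts() // fetches my articles from the dev.to API
  return { props: { posts } }
}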

Filtering Articles

For some reason, I don't want ALL my published dev.to articles to be listed in my own blog.

Another "problem" to solve is slugs. Dev.to auto-generates slugs using the title and adding some random characters. On my own site I want to be able to choose my own slugs.

I found one solution that may solve both problems: using the canonical_url frontmatter property to express both that I want the article on my blog, and what its slug should be.

For every dev.to article I want to publish on my own blog, I specify a canonical_url that starts with my own site URL. For example, my blog post at https://dev.to/juliang/react-state-management-in-2020-3c58 has the canonical_url https://juliangaramendy.dev/blog/react-state-management-2020.
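
In the dev.to editor, that article's frontmatter looks something like this:

---
title: React state management in 2020
published: true
canonical_url: https://juliangaramendy.dev/blog/react-state-management-2020
---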

After fetching the entire list, only the articles that are published and have a valid canonical_url pass the filter.
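
The check itself can be very small. A sketch (the constant and function names here are mine, for illustration):

const BLOG_PREFIX = 'https://juliangaramendy.dev/blog/'

function isOnMyBlog(article: Article): boolean {
  // keep only articles whose canonical_url points at my own site
  return Boolean(article.canonical_url && article.canonical_url.startsWith(BLOG_PREFIX))
}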

Another benefit is that dev.to will display a nice "Originally published at juliangaramendy.dev" message, with a link to the article on my site.

Article Page

For the individual articles, the /pages/[slug].ts page is generated. This module also exports a getStaticProps function that returns a single article, which is passed as props to the React component and rendered.

But I can't fetch the individual article from the dev.to API by slug, because the slugs are not the same, so the article would not be found:

on dev.to:  https://dev.to/juliang/react-state-management-in-2020-3c58
on my blog: https://juliangaramendy.dev/blog/react-state-management-2020

What I do is: fetch the entire list again, find the article with a canonical_url property matching the slug parameter, and return it.
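
A sketch of that lookup (the exact matching logic in my code may differ):

async function getPostBySlug(slug: string) {
  const posts = await getAllPosts()
  // my slug is the last segment of the article's canonical_url
  return posts.find((post) => post.canonical_url.endsWith(`/blog/${slug}`))
}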

The /[slug].ts module also exports a getStaticPaths function that returns a list of slugs to be pre-generated at build time.

For this, I fetch the entire list again and return all the slugs.

export async function getStaticPaths() {
  const posts = await getAllPosts()
  const paths = posts.map((post) => ({ params: { slug: post.slug } }))
  return { paths, fallback: true }
}

I set fallback: true so that new articles that didn't exist at build time can still be found on my blog when requested.
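
This means the page component must handle the fallback state while Next.js generates the page in the background. A sketch (JSX requires a .tsx extension; the markup here is illustrative):

import { useRouter } from 'next/router'

export default function BlogPostPage({ post }: { post: Article }) {
  const router = useRouter()
  if (router.isFallback) {
    // shown while Next.js generates a page that wasn't built ahead of time
    return <div>Loading…</div>
  }
  return (
    <article>
      <h1>{post.title}</h1>
      {/* ...the rest of the post */}
    </article>
  )
}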

Regenerating article pages

When returning the static props I indicate that the page may be regenerated after 1 second using the revalidate property:

export async function getStaticProps(context) {
  const slug = context.params?.slug
  const post = await getPostBySlug(`${slug}`)
  return post ? { props: { post }, revalidate: 1 } : { notFound: true }
}

This makes use of Next.js Incremental Static Regeneration.

Avoiding re-fetching every time

The above implementation is a bit naive.

The Next.js build process would call a function that fetches the entire list of articles multiple times in the span of a few seconds. This is both unnecessary and problematic, because the dev.to API has a usage limit.

Caching the network request in memory

A huge improvement is to wrap the fetch call in a static cache, so that subsequent calls are avoided and the cached value is returned instead.

Diagram (memory cache)

The cache has to expire at some point, otherwise new articles may not be "picked up" by the blog. I set it to 1 minute, but may extend it to 10 minutes in the future. I'm sure people can wait 10 minutes to read my posts.

Here's one implementation.

// cache.ts

type CacheEntry = {
  expiresAt: number
  value: unknown
}

const cacheMap: Record<string, CacheEntry> = {}

const EXPIRATION = 1000 * 60 * 1 // 1 minute

export async function getCached<V>(key: string, fn: () => Promise<V>): Promise<V> {
  if (shouldRevalidate(key)) {
    await revalidateKey(key, fn)
  }
  return cacheMap[key].value as V
}

function shouldRevalidate(key: string): boolean {
  return cacheMap[key] ? new Date().getTime() > cacheMap[key].expiresAt : true
}

async function revalidateKey<V>(key: string, fn: () => Promise<V>) {
  const response = await fn()
  cacheMap[key] = {
    value: response,
    expiresAt: new Date().getTime() + EXPIRATION,
  }
  return cacheMap[key].value as V
}

We use it like this:

function fetchAllDevArticles(): Promise<Array<Article>> {
  return fetch('https://dev.to/api/articles/me/published', {
    headers: { 'api-key': process.env.DEVTO_API_KEY || '' },
  }).then((r) => r.json())
}

async function getAllDevArticles() {
  const articles = await getCached('dev.to/articles', fetchAllDevArticles)
  return articles.filter(article => !!article.canonical_url)
}

Every time we call getAllDevArticles we either make the request or get the cached value. The first time, of course, the value is fetched; for the next minute the cache is used; and after the expiration time the next call results in an API request again.

But this isn't good enough. In my experience, it only works some of the time.

When I build locally on my computer, I can see 4 concurrent Node processes running. This results in 4 network requests and it's usually fine.

But when it's built on Vercel, there are usually 8 concurrent Node processes and the dev.to API fails with an error:

❌ 429 Too Many Requests

One way to solve this is by storing our cache on disk, so that more than one process can benefit from it.

Caching the network request on disk

Diagram (with disk cache)

This works because the Next.js build triggers one network request the first time, and the response is saved to disk.

The next time the request would be made, by any process, the disk cache is tried first. If an entry exists and hasn't expired, it's used; otherwise the request is made again and the response stored.

Here's one implementation:

import sha from 'sha-1'
import os from 'os'
import { join } from 'path'
import { readJsonFile, writeJsonFile } from './fs-read-write-json'

const getTempPath = (path: string) => join(os.tmpdir(), path)

type CacheEntry<V> =
  | {
      status: 'done'
      expiresAt: number
      value: V
    }
  | {
      status: 'pending'
      expiresAt: number
    }

export function getCached<V>(
  key: string,
  fn: () => Promise<V>,
  cacheDurationMs: number
) {
  const tempFilePath = getTempPath(`./jgdev-cache-${sha(key)}.json`)

  const getEntry = async () =>
    readJsonFile<CacheEntry<V> | undefined>(tempFilePath)

  // mark the key as pending so concurrent processes wait instead of
  // re-fetching; the entry expires after 10 seconds in case the fetch fails
  const writePendingEntry = () =>
    writeJsonFile(tempFilePath, {
      status: 'pending',
      expiresAt: new Date().getTime() + 10000,
    })

  const writeEntry = async (value: V) =>
    writeJsonFile(tempFilePath, {
      status: 'done',
      expiresAt: new Date().getTime() + cacheDurationMs,
      value,
    })

  const wait = (t: number) => new Promise((resolve) => setTimeout(resolve, t))

  const revalidate = async (): Promise<V> => {
    await writePendingEntry()
    const value = await fn()
    await writeEntry(value)
    return value
  }

  const hasExpired = (entry: CacheEntry<V>) =>
    new Date().getTime() > entry.expiresAt

  const getValue = async (): Promise<V> => {
    const entry = await getEntry()
    if (entry) {
      if (entry.status === 'done' && !hasExpired(entry)) {
        return entry.value
      } else if (entry.status === 'pending' && !hasExpired(entry)) {
        // another process is already fetching; wait briefly and check again
        await wait(500)
        return getValue()
      }
    }
    return revalidate()
  }

  return getValue()
}
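
We use it just like the in-memory version, now passing the cache duration explicitly:

const articles = await getCached('dev.to/articles', fetchAllDevArticles, 1000 * 60)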

Here's the read/write module if you're curious.
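
For reference, a minimal sketch of what such a module can look like (an approximation, not the exact code):

// fs-read-write-json.ts (sketch)
import { promises as fs } from 'fs'

export async function readJsonFile<T>(path: string): Promise<T | undefined> {
  try {
    return JSON.parse(await fs.readFile(path, 'utf8')) as T
  } catch {
    return undefined // a missing or unreadable file counts as a cache miss
  }
}

export async function writeJsonFile(path: string, value: unknown): Promise<void> {
  await fs.writeFile(path, JSON.stringify(value), 'utf8')
}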

It's important to use os.tmpdir() when deploying on Vercel, because we can't write to the file system anywhere else:

❌ Error: EROFS: read-only file system

For the full source code you can check this repository.

Please, let me know what you think in the comments. I wonder if there's a library I could use to replace my cache module.


Photo by Fitore F on Unsplash


Comment on dev.to: https://dev.to/juliang/using-dev-to-as-cms-for-my-next-js-blog-1aan