Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add collections to build #94

Merged
merged 1 commit into from
Apr 14, 2021
Merged

Add collections to build #94

merged 1 commit into from
Apr 14, 2021

Conversation

drwpow
Copy link
Member

@drwpow drwpow commented Apr 14, 2021

Changes

Screen Shot 2021-04-14 at 15 57 31

Adds collections to the build, so they expand to multiple .html files.

Please refer to the comments for thoughts.

Testing

  • Tests are passing
  • Tests updated where necessary

Docs

  • Docs / READMEs updated
  • Code comments added where helpful

return {
statusCode: 404,
error: new Error('Not Found'),
collectionInfo: additionalURLs.size ? { additionalURLs } : undefined,
Copy link
Member Author

@drwpow drwpow Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little bit of a hack to get around a chicken-or-egg problem.

In our blog example, we find a pages/$tag.astro collection. Great, so that generates /tag/index.html? Not so fast! It generates /tag/movie/index.html and /tag/television/index.html! But we don’t know what it generates before we load it. But we have to load it using its generated URLs. Ack!

In this scenario, we load /tag, which will 404. But because it’s a collection, it will return 404 but with the loaded data we need to complete the build step. We only got that 404 response because that collection loaded up its data, looked through everything, and at the end determined our request /tag didn’t match the expected format /tag/:tag. But since it had to do work to deliver that 404, it returned what it figured out up till that point so we know how to proceed.

Anyway, the hacky bit is returning collectionInfo from our runtime.load() response regardless of statusCode. But this was the least-disruptive way I could think to do this, without making a runtime.loadCollection() or some API that lives completely outside runtime.load()


/** Recursively build collection URLs */
async function loadCollection(url: string): Promise<LoadResult | undefined> {
if (builtURLs.has(url)) return; // this stops us from recursively building the same pages over and over
Copy link
Member Author

@drwpow drwpow Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the thing that keeps the hack from being too bad: in our process of collecting URLs, we keep track of ones we’ve built already and only build new ones.

Let’s look at the worst offender here:

  • Pass 1: load pages/$tag.astro, get back ['/tag/movie/', '/tag/television']. These are both new, so we start up a 2nd pass.
  • Pass 2: /tag/movie -> ['/tag/movie/1', '/tag/movie/2'] / /tag/television -> ['/tag/television/1']. These are new as well, so we move to a 3rd pass.
  • Pass 3: /tag/movie/1 -> ['/tag/movie/1, '/tag/movie/2'] / /tag/movie/2 -> ['/tag/movie/1, '/tag/movie/2'] / /tag/television/1 -> ['/tag/television/1']. All these are repeats, so we’re done.

That last step is where we could be rendering the same pages over-and-over again. The builtURLs only crawls new URLs.

Any collection can be fully built in 2 or at most 3 passes (3 passes if the first request was a “miss” as it was with /tag missing parameters). While this can probably be improved, 3 passes is still far more efficient than p * n passes (p = all possible parameter values × n = number of pages per parameter values).

const pages = await allPages(pageRoot);

try {
await Promise.all(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the addition of collections, pages now build in parallel rather than serially. This could have a big impact on build times.

@matthewp
Copy link
Contributor

lgtm

@drwpow drwpow merged commit f28cebc into main Apr 14, 2021
@drwpow drwpow deleted the collections-build branch April 14, 2021 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants