Add collections to build #94

Merged
merged 1 commit into from
Apr 14, 2021
131 changes: 106 additions & 25 deletions src/build.ts
@@ -1,6 +1,6 @@
import type { AstroConfig, RuntimeMode } from './@types/astro';
import type { LogOptions } from './logger';
import type { LoadResult } from './runtime';
import type { AstroRuntime, LoadResult } from './runtime';

import { existsSync, promises as fsPromises } from 'fs';
import { relative as pathRelative } from 'path';
@@ -13,6 +13,18 @@ import { collectStatics } from './build/static.js';

const { mkdir, readdir, readFile, stat, writeFile } = fsPromises;

interface PageBuildOptions {
astroRoot: URL;
dist: URL;
filepath: URL;
runtime: AstroRuntime;
statics: Set<string>;
}

interface PageResult {
statusCode: number;
}

const logging: LogOptions = {
level: 'debug',
dest: defaultLogDestination,
@@ -55,6 +67,78 @@ async function writeResult(result: LoadResult, outPath: URL, encoding: null | 'u
}
}

/** Collection utility */
function getPageType(filepath: URL): 'collection' | 'static' {
if (/\$[^.]+.astro$/.test(filepath.pathname)) return 'collection';
return 'static';
}

/** Build collection */
async function buildCollectionPage({ astroRoot, dist, filepath, runtime, statics }: PageBuildOptions): Promise<PageResult> {
const rel = pathRelative(fileURLToPath(astroRoot) + '/pages', fileURLToPath(filepath)); // pages/index.astro
const pagePath = `/${rel.replace(/\$([^.]+)\.astro$/, '$1')}`;
const builtURLs = new Set<string>(); // !important: internal cache that prevents building the same URLs

/** Recursively build collection URLs */
async function loadCollection(url: string): Promise<LoadResult | undefined> {
if (builtURLs.has(url)) return; // this stops us from recursively building the same pages over and over
@drwpow (Member, Author) commented on Apr 14, 2021:

This is what keeps the hack from being too bad: while collecting URLs, we keep track of the ones we’ve already built and only build new ones.

Let’s look at the worst offender here:

  • Pass 1: load pages/$tag.astro and get back ['/tag/movie', '/tag/television']. Both are new, so we start a 2nd pass.
  • Pass 2: /tag/movie -> ['/tag/movie/1', '/tag/movie/2'] and /tag/television -> ['/tag/television/1']. These are new as well, so we move on to a 3rd pass.
  • Pass 3: /tag/movie/1 -> ['/tag/movie/1', '/tag/movie/2'], /tag/movie/2 -> ['/tag/movie/1', '/tag/movie/2'], and /tag/television/1 -> ['/tag/television/1']. All of these are repeats, so we’re done.

That last step is where we could end up rendering the same pages over and over again. The builtURLs cache makes sure we only crawl new URLs.

Any collection can be fully built in 2 or at most 3 passes (3 if the first request was a “miss”, as it was with /tag missing its parameter). While this can probably be improved, 3 passes is still far more efficient than p × n passes (p = number of possible parameter values, n = number of pages per parameter value).
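
To make the pass counting above concrete, here is a minimal sketch of the same idea as a generalized loop. It is not the code in this PR (which hard-codes the levels); `fakeSite`, `FakeLoadResult`, and `crawl` are hypothetical names, and the lookup table stands in for `runtime.load()`.

```typescript
// Hypothetical stand-in for runtime.load(): each URL maps to a status code
// and the additional URLs that the collection reports for it.
type FakeLoadResult = { statusCode: 200 | 404; additionalURLs: string[] };

const fakeSite: Record<string, FakeLoadResult> = {
  '/tag': { statusCode: 404, additionalURLs: ['/tag/movie', '/tag/television'] },
  '/tag/movie': { statusCode: 200, additionalURLs: ['/tag/movie/1', '/tag/movie/2'] },
  '/tag/television': { statusCode: 200, additionalURLs: ['/tag/television/1'] },
  '/tag/movie/1': { statusCode: 200, additionalURLs: ['/tag/movie/1', '/tag/movie/2'] },
  '/tag/movie/2': { statusCode: 200, additionalURLs: ['/tag/movie/1', '/tag/movie/2'] },
  '/tag/television/1': { statusCode: 200, additionalURLs: ['/tag/television/1'] },
};

/** Crawl from `entry`, loading each URL at most once. */
function crawl(entry: string): { written: string[]; passes: number } {
  const builtURLs = new Set<string>(); // same role as builtURLs in the PR
  const written: string[] = []; // URLs that would be written to dist
  let queue: string[] = [entry];
  let passes = 0;

  while (queue.length) {
    passes++;
    const next: string[] = [];
    for (const url of queue) {
      if (builtURLs.has(url)) continue; // mirrors the guard in loadCollection()
      builtURLs.add(url);
      const result = fakeSite[url];
      if (!result) continue; // unknown URL: nothing to write or follow
      if (result.statusCode === 200) written.push(url);
      for (const u of result.additionalURLs) next.push(u);
    }
    // Only URLs that have not been built yet trigger another pass.
    queue = Array.from(new Set(next)).filter((u) => !builtURLs.has(u));
  }
  return { written, passes };
}
```

With this table, `crawl('/tag')` finishes after 3 passes and writes 5 pages: the initial `/tag` request is the 404 “miss”, and the third pass only reports URLs that are already in the cache.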

const result = await runtime.load(url);
builtURLs.add(url);
if (result.statusCode === 200) {
const outPath = new URL('./' + url + '/index.html', dist);
await writeResult(result, outPath, 'utf-8');
mergeSet(statics, collectStatics(result.contents.toString('utf-8')));
}
return result;
}

const result = (await loadCollection(pagePath)) as LoadResult;
if (result.statusCode === 200 && !result.collectionInfo) {
throw new Error(`[${rel}]: Collection page must export createCollection() function`);
}

// note: for pages that require params (/tag/:tag), we will get a 404 but will still get back collectionInfo that tell us what the URLs should be
if (result.collectionInfo) {
await Promise.all(
[...result.collectionInfo.additionalURLs].map(async (url) => {
// for the top set of additional URLs, we render every new URL generated
const addlResult = await loadCollection(url);
if (addlResult && addlResult.collectionInfo) {
// believe it or not, we may still have a few unbuilt pages left. this is our last crawl:
await Promise.all([...addlResult.collectionInfo.additionalURLs].map(async (url2) => loadCollection(url2)));
}
})
);
}

return {
statusCode: result.statusCode,
};
}

/** Build static page */
async function buildStaticPage({ astroRoot, dist, filepath, runtime, statics }: PageBuildOptions): Promise<PageResult> {
const rel = pathRelative(fileURLToPath(astroRoot) + '/pages', fileURLToPath(filepath)); // pages/index.astro
const pagePath = `/${rel.replace(/\.(astro|md)$/, '')}`;

let relPath = './' + rel.replace(/\.(astro|md)$/, '.html');
if (!relPath.endsWith('index.html')) {
relPath = relPath.replace(/\.html$/, '/index.html');
}

const outPath = new URL(relPath, dist);
const result = await runtime.load(pagePath);

await writeResult(result, outPath, 'utf-8');
if (result.statusCode === 200) {
mergeSet(statics, collectStatics(result.contents.toString('utf-8')));
}

return {
statusCode: result.statusCode,
};
}

/** The primary build action */
export async function build(astroConfig: AstroConfig): Promise<0 | 1> {
const { projectRoot, astroRoot } = astroConfig;
@@ -77,30 +161,27 @@ export async function build(astroConfig: AstroConfig): Promise<0 | 1> {
const statics = new Set<string>();
const collectImportsOptions = { astroConfig, logging, resolve, mode };

for (const pathname of await allPages(pageRoot)) {
const filepath = new URL(`file://${pathname}`);
const rel = pathRelative(astroRoot.pathname + '/pages', filepath.pathname); // pages/index.astro
const pagePath = `/${rel.replace(/\.(astro|md)/, '')}`;

try {
let relPath = './' + rel.replace(/\.(astro|md)$/, '.html');
if (!relPath.endsWith('index.html')) {
relPath = relPath.replace(/\.html$/, '/index.html');
}

const outPath = new URL(relPath, dist);
const result = await runtime.load(pagePath);

await writeResult(result, outPath, 'utf-8');
if (result.statusCode === 200) {
mergeSet(statics, collectStatics(result.contents.toString('utf-8')));
}
} catch (err) {
error(logging, 'generate', err);
return 1;
}

mergeSet(imports, await collectDynamicImports(filepath, collectImportsOptions));
const pages = await allPages(pageRoot);

try {
await Promise.all(
PR author (Member) commented:

With the addition of collections, pages now build in parallel rather than serially. This could have a big impact on build times.
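
As a rough illustration of the serial-versus-parallel difference, the sketch below counts how many page builds are in flight at once. `makeBuilder`, `buildSerially`, and `buildInParallel` are hypothetical helpers, and `await Promise.resolve()` stands in for the real asynchronous work done by `runtime.load()` and `writeResult()`.

```typescript
/** Create a fake page builder that records its peak concurrency. */
function makeBuilder() {
  let inFlight = 0;
  const stats = { peak: 0 };
  async function buildPage(name: string): Promise<string> {
    inFlight++;
    stats.peak = Math.max(stats.peak, inFlight);
    await Promise.resolve(); // stand-in for real async I/O
    inFlight--;
    return `${name}/index.html`;
  }
  return { buildPage, stats };
}

/** Old behavior: each page waits for the previous one to finish. */
async function buildSerially(pages: string[]): Promise<number> {
  const { buildPage, stats } = makeBuilder();
  for (const page of pages) await buildPage(page);
  return stats.peak;
}

/** New behavior: every page starts before any of them is awaited. */
async function buildInParallel(pages: string[]): Promise<number> {
  const { buildPage, stats } = makeBuilder();
  await Promise.all(pages.map((page) => buildPage(page)));
  return stats.peak;
}
```

For three pages, the serial version never has more than one build in flight, while the parallel version starts all three before the first one resolves. Note that `Promise.all` rejects as soon as any page rejects, which fits the single `try`/`catch` around it in the diff.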

pages.map(async (pathname) => {
const filepath = new URL(`file://${pathname}`);

const pageType = getPageType(filepath);
const pageOptions: PageBuildOptions = { astroRoot, dist, filepath, runtime, statics };
if (pageType === 'collection') {
await buildCollectionPage(pageOptions);
} else {
await buildStaticPage(pageOptions);
}

mergeSet(imports, await collectDynamicImports(filepath, collectImportsOptions));
})
);
} catch (err) {
error(logging, 'generate', err);
return 1;
}

for (const pathname of await allPages(componentRoot)) {
31 changes: 27 additions & 4 deletions src/runtime.ts
@@ -25,16 +25,19 @@ interface RuntimeConfig {
frontendSnowpackConfig: SnowpackConfig;
}

// info needed for collection generation
type CollectionInfo = { additionalURLs: Set<string> };

type LoadResultSuccess = {
statusCode: 200;
contents: string | Buffer;
contentType?: string | false;
};
type LoadResultNotFound = { statusCode: 404; error: Error };
type LoadResultRedirect = { statusCode: 301 | 302; location: string };
type LoadResultNotFound = { statusCode: 404; error: Error; collectionInfo?: CollectionInfo };
type LoadResultRedirect = { statusCode: 301 | 302; location: string; collectionInfo?: CollectionInfo };
type LoadResultError = { statusCode: 500 } & ({ type: 'parse-error'; error: CompileError } | { type: 'unknown'; error: Error });

export type LoadResult = LoadResultSuccess | LoadResultNotFound | LoadResultRedirect | LoadResultError;
export type LoadResult = (LoadResultSuccess | LoadResultNotFound | LoadResultRedirect | LoadResultError) & { collectionInfo?: CollectionInfo };

// Disable snowpack from writing to stdout/err.
snowpackLogger.level = 'silent';
@@ -82,6 +85,8 @@ async function load(config: RuntimeConfig, rawPathname: string | undefined): Pro

// handle collection
let collection = {} as CollectionResult;
let additionalURLs = new Set<string>();

if (mod.exports.createCollection) {
const createCollection: CreateCollection = await mod.exports.createCollection();
for (const key of Object.keys(createCollection)) {
@@ -100,6 +105,7 @@ async function load(config: RuntimeConfig, rawPathname: string | undefined): Pro
}
let requestedParams = routes.find((p) => {
const baseURL = (permalink as any)({ params: p });
additionalURLs.add(baseURL);
return baseURL === reqPath || `${baseURL}/${searchResult.currentPage || 1}` === reqPath;
});
if (requestedParams) {
@@ -135,22 +141,38 @@ async function load(config: RuntimeConfig, rawPathname: string | undefined): Pro
.replace(/\/1$/, ''); // if end is `/1`, then just omit
}

// from page 2 to the end, add all pages as additional URLs (needed for build)
for (let n = 1; n <= collection.page.last; n++) {
if (additionalURLs.size) {
// if this is a param-based collection, paginate all params
additionalURLs.forEach((url) => {
additionalURLs.add(url.replace(/(\/\d+)?$/, `/${n}`));
});
} else {
// if this has no params, simply add page
additionalURLs.add(reqPath.replace(/(\/\d+)?$/, `/${n}`));
}
}

data = data.slice(start, end);
} else if (createCollection.pageSize) {
// TODO: fix bug where redirect doesn’t happen
// This happens because a pageSize is set, but the user isn’t on a paginated route. Redirect:
return {
statusCode: 301,
location: reqPath + '/1',
collectionInfo: additionalURLs.size ? { additionalURLs } : undefined,
};
}

// if we’ve paginated too far, this is a 404
if (!data.length)
if (!data.length) {
return {
statusCode: 404,
error: new Error('Not Found'),
collectionInfo: additionalURLs.size ? { additionalURLs } : undefined,
@drwpow (Member, Author) commented on Apr 14, 2021:

This is a bit of a hack to get around a chicken-or-egg problem.

In our blog example, we find a pages/$tag.astro collection. Great, so that generates /tag/index.html? Not so fast! It generates /tag/movie/index.html and /tag/television/index.html! But we don’t know what it generates until we load it, and we can only load it through one of its generated URLs. Ack!

In this scenario, we load /tag, which will 404. But because it’s a collection, the 404 comes back with the loaded data we need to complete the build step. We only got that 404 because the collection loaded its data, looked through everything, and at the end determined that our request for /tag didn’t match the expected format /tag/:tag. Since it had to do that work to deliver the 404, it returns what it figured out up to that point so we know how to proceed.

Anyway, the hacky bit is returning collectionInfo from our runtime.load() response regardless of statusCode. But this was the least disruptive way I could think of to do this without adding a runtime.loadCollection() or some other API that lives completely outside runtime.load().
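
A simplified sketch of that shape, assuming a collection whose permalink is `/tag/:tag`. `loadTagCollection` and `SketchResult` are hypothetical names, not the real `runtime.load()`; unlike the real code, this version always walks every route, so the URL set is complete even on a match.

```typescript
type CollectionInfo = { additionalURLs: Set<string> };
type SketchResult = { statusCode: 200 | 404; collectionInfo?: CollectionInfo };

/** Hypothetical, simplified loader for a pages/$tag.astro collection. */
function loadTagCollection(reqPath: string, tags: string[]): SketchResult {
  const additionalURLs = new Set<string>();
  let matched = false;

  for (const tag of tags) {
    const baseURL = `/tag/${tag}`;
    additionalURLs.add(baseURL); // recorded whether or not the request matches
    if (baseURL === reqPath) matched = true;
  }

  // collectionInfo rides along with the response regardless of status code.
  const collectionInfo = additionalURLs.size ? { additionalURLs } : undefined;
  return { statusCode: matched ? 200 : 404, collectionInfo };
}
```

Requesting `/tag` with the tags `movie` and `television` returns a 404, but the response still carries `/tag/movie` and `/tag/television`, which is what the build step needs to continue.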

};
}

collection.data = data;
}
@@ -178,6 +200,7 @@ async function load(config: RuntimeConfig, rawPathname: string | undefined): Pro
statusCode: 200,
contentType: 'text/html; charset=utf-8',
contents: html,
collectionInfo: additionalURLs.size ? { additionalURLs } : undefined,
};
} catch (err) {
if (err.code === 'parse-error') {