
Sitemap Generation #49

Closed
gryphonmyers opened this issue Apr 29, 2021 · 15 comments
Labels
enhancement ✨ New feature or request

Comments

@gryphonmyers
Contributor

gryphonmyers commented Apr 29, 2021

Sitemaps are an extremely common need. Basically every application should have one for SEO purposes. Since this plugin is taking on responsibility for rendering out all the routes of an application, I feel like there should be a solution (or at least guidance) facilitating the generation of a sitemap file.

One possibility would be to simply treat a sitemap as any other page file. There may be issues around it showing up as a route alongside "real" application routes though, not sure. Sitemaps are also xml, generally placed at root as sitemap.xml, so I'm not sure if the prerendering logic could currently handle that.

Another option would be to automatically generate sitemaps for users based on their prerender routes. There would need to be some additional configurability in order to facilitate things like i18n, lastmod, etc. This also means users who didn't bother to fill in prerender hooks (perhaps they're just using straight SSR) won't get a sitemap. Otherwise though, it seems like a more user-friendly option.
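A minimal sketch of what generating the file from a list of prerendered URLs could look like, with optional `lastmod` support. The `SitemapEntry` shape and the `buildSitemapXml` name are hypothetical and not part of the plugin; only the XML layout follows the sitemaps.org protocol.

```typescript
// Hypothetical entry shape (an assumption for illustration).
// Only `loc` is required by the sitemap protocol.
interface SitemapEntry {
  loc: string; // absolute URL of the page
  lastmod?: string; // W3C date, e.g. "2021-04-29"
}

// Entity-encode the characters that are not allowed verbatim in sitemap XML.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Turn a list of entries into the text of a sitemap.xml file.
function buildSitemapXml(entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? `<lastmod>${escapeXml(entry.lastmod)}</lastmod>`
      : "";
    return `  <url><loc>${escapeXml(entry.loc)}</loc>${lastmod}</url>`;
  });
  const header = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
  ];
  return header.concat(urls, ["</urlset>"]).join("\n");
}
```

The resulting string would be written to `sitemap.xml` at the root of the client build output.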

Thoughts / ideas?

@brillout
Member

> Basically every application should have one for SEO purposes

How so? I thought that sitemaps are only useful to let crawlers know about unreachable/non-linked pages (or to make crawlers discover pages faster that are reachable only after going through a lot of links.)

If it requires only a minimal amount of changes to src/, I'd be fine with accommodating it. We should be careful about feature creep.

@chrisvariety
Contributor

next.js doesn't have this built-in. I'd say this is more of a guide if anything. This gets fairly complicated as you can see by https://github.com/iamvishnusankar/next-sitemap

@gryphonmyers
Contributor Author

> How so? I thought that sitemaps are only useful to let crawlers know about unreachable/non-linked pages (or to make crawlers discover pages faster that are reachable only after going through a lot of links.)

It's best practice to always have one for medium to large sites, then point Google at it. Google may be able to crawl all your routes, but it's imperfect. In more complex i18n scenarios where you're performing server-side geo redirects it is 100% necessary in order to inform Google about pages it can't reach.
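For the i18n case, a sitemap can list every locale version of a page as an `xhtml:link` alternate, which is the format Google documents for localized pages. A rough sketch, where the `LocalizedUrl` shape and the function name are assumptions made for illustration:

```typescript
// Hypothetical input: one entry per locale version of the same page.
interface LocalizedUrl {
  hreflang: string; // e.g. "en", "de", "en-GB"
  href: string; // absolute URL of that locale's version
}

// Entity-encode the characters that are not allowed verbatim in XML.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build one <url> element per locale version. Each element lists every
// version (including itself) as an alternate.
function buildLocalizedUrlElements(versions: LocalizedUrl[]): string[] {
  const links = versions
    .map(
      (v) =>
        `<xhtml:link rel="alternate" hreflang="${escapeXml(v.hreflang)}" href="${escapeXml(v.href)}"/>`
    )
    .join("");
  return versions.map((v) => `<url><loc>${escapeXml(v.href)}</loc>${links}</url>`);
}
```

For these elements to be valid, the enclosing `<urlset>` also has to declare `xmlns:xhtml="http://www.w3.org/1999/xhtml"`.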

> next.js doesn't have this built-in. I'd say this is more of a guide if anything. This gets fairly complicated as you can see by https://github.com/iamvishnusankar/next-sitemap

It is common to do it this way, yes. All I'm saying is since this project is an alternative to products like Next/Nuxt, we should have a solution in mind for this extremely common need. A separate package makes total sense.

Considering that I need this functionality, I'd be happy to work on a solution once we have consensus on what that solution should be.

@brillout
Member

How about:

  1. Expose all page routes at contextProps.pageRoutes.

  2. Expose all page URLs at contextProps.pageUrls. (Dynamic routes are included when doing pre-rendering.)

  3. addPage() API.

// Npm package @vite-plugin-ssr/sitemap

import { addPage } from 'vite-plugin-ssr/api'

export { addSitemap }

function addSitemap() {
  addPage({
    '.page.js': require.resolve('./path/to/sitemap.page.js'),
    '.page.route.js': require.resolve('./path/to/sitemap.page.route.js'),
    '.page.server.js': require.resolve('./path/to/sitemap.page.server.js'),
    // Empty `sitemap.page.client.js` for zero browser-side JavaScript
    '.page.client.js': require.resolve('./path/to/sitemap.page.client.js'),
  })
}

// vite.config.js

import ssr from 'vite-plugin-ssr/plugin'
import { addSitemap } from '@vite-plugin-ssr/sitemap'

addSitemap()

export default {
  plugins: [ssr()]
}

> next.js doesn't have this built-in. I'd say this is more of a guide if anything. This gets fairly complicated as you can see by https://github.com/iamvishnusankar/next-sitemap

I also care about keeping core lean & simple. The addPage() API would also enable automatic generation of a Terms of Service page, /manifest.json, etc.

@truumahn

truumahn commented Jun 30, 2022

For prerendering with sitemaps, I did the following workaround, using sitemap.js:

I created a prerender.ts file:

import { prerender } from 'vite-plugin-ssr/cli';
import { SitemapStream, streamToPromise } from 'sitemap';
import { Readable } from 'stream';
import { locationOrigin } from './env';
import fs from 'fs/promises';

// An array with your links
const urlList: string[] = [];
const stream = new SitemapStream({ hostname: locationOrigin });

prerender({ pageContextInit: { urlList } })
    // Serialize the URLs collected during prerendering into sitemap XML
    .then(() => streamToPromise(Readable.from(urlList).pipe(stream)))
    // Write the XML next to the prerendered pages
    .then((data) => fs.writeFile('./dist/client/sitemap.xml', data.toString()))
    .catch((err) => {
        console.error(err);
        process.exit(1);
    });

This passes down a variable named urlList that pages can populate with their URLs during prerendering.

Inside _default.page.server.ts I do the following:

import { locationOrigin } from '@env';

async function render(pageContext: PageContextBuiltIn & PageContext) {
    if (pageContext.urlList && pageContext.url !== '/fake-404-url') {
        pageContext.urlList.push(`${locationOrigin}${pageContext.url}`);
    }
    // ...
}

When building, I use ts-node and this self-tailored prerender file: vite build && vite build --ssr && ts-node ./prerender.ts

I'm sure this could be made more elegant, and the logic could be refactored so that both SSR and SSG use the same sitemap generation, but it does its job for the time being.
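One way such a refactor might look is a small pure helper that both an SSR server and the prerender script call with whatever paths they have collected. This is only a sketch: `toSitemapUrls` is a hypothetical name, and the `/fake-404-url` constant simply mirrors the check in the render() hook above.

```typescript
// Placeholder URL that the render() hook above skips.
const FAKE_404_URL = "/fake-404-url";

// Filter, normalize and deduplicate page paths, then prefix them with the
// site origin. Returns the absolute URLs in sorted order.
function toSitemapUrls(paths: string[], origin: string): string[] {
  const base = origin.replace(/\/+$/, ""); // drop trailing slashes from the origin
  const seen: { [url: string]: boolean } = {};
  const result: string[] = [];
  for (const path of paths) {
    if (path === FAKE_404_URL) continue; // not a real page
    const pathname = path.charAt(0) === "/" ? path : "/" + path;
    const url = base + pathname;
    if (seen[url]) continue; // already collected
    seen[url] = true;
    result.push(url);
  }
  return result.sort();
}
```

With this helper, pages would push plain paths (without the origin) into the shared list, and the origin is applied once at the end.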

@brillout brillout added the enhancement ✨ New feature or request label Apr 27, 2023
@brillout
Member

Automatic sitemap generation would be great and it's definitely on the radar to enable the ecosystem to build such extensions.

Actually, it may already be possible with https://vite-plugin-ssr.com/extends. (It will require reading private pageContext._* properties, but we can make them public/stable.) Contributions to try this out are very welcome.

Closing in the meantime as it's not a top priority for now. Also, it may be already possible (soon).

@brillout brillout closed this as not planned May 31, 2023
@schaschko

@brillout Can you please elaborate on how this works using extends?

(Currently I'm generating a sitemap.xml by using the information which pages are prerendered to dist/client.)
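A sketch of that approach, assuming the list of files under dist/client has already been collected (for example by walking the directory) and is passed in as relative paths. The function name is hypothetical, and skipping `404.html` is an assumption about how the error page is emitted.

```typescript
// Map file paths relative to dist/client (e.g. "about/index.html") to
// absolute page URLs. Non-HTML files such as assets are ignored.
function htmlFilesToUrls(relativeFiles: string[], origin: string): string[] {
  const base = origin.replace(/\/+$/, ""); // drop trailing slashes from the origin
  const urls: string[] = [];
  for (const file of relativeFiles) {
    const normalized = file.replace(/\\/g, "/"); // tolerate Windows separators
    if (!/\.html$/.test(normalized)) continue; // skip JS, CSS, images, ...
    if (normalized === "404.html") continue; // assumed error page, not a real route
    const pathname =
      "/" +
      normalized
        .replace(/(^|\/)index\.html$/, "") // "about/index.html" -> "about"
        .replace(/\.html$/, ""); // "blog/post.html" -> "blog/post"
    urls.push(base + pathname);
  }
  return urls;
}
```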

@brillout
Member

@schaschko Check the private pageContext._* properties, I believe they'll give you the information you need. Keep me updated: we can then turn the private properties public including proper documentation.

@schaschko

schaschko commented Jun 19, 2023

Not sure I'm on the right path here. This is what I could come up with using pageContext._allPageIds, shown as a diff against an app scaffolded from react-ts:

diff --git a/pages/about/index.page.server.tsx b/pages/about/index.page.server.tsx
new file mode 100644
index 0000000..6b12acf
--- /dev/null
+++ b/pages/about/index.page.server.tsx
@@ -0,0 +1,50 @@
+import ReactDOMServer from "react-dom/server";
+import { PageShell } from "./PageShell";
+import { escapeInject, dangerouslySkipEscape } from "vite-plugin-ssr/server";
+import logoUrl from "./logo.svg";
+import type { PageContextServer } from "./types";
+
+export { Page };
+
+function Page(pageProps) {
+  console.log(pageProps);
+  return <>
+    Sitemap constructed from pageProps (failed)
+    {/* <?xml version="1.0" encoding="UTF-8"?> */}
+    {/* <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> */}
+    {/* <url> */}
+    {/* <loc>https://www.example.com/foo.html</loc> */}
+    {/* <lastmod>2022-06-04</lastmod> */}
+    {/* </url> */}
+    {/* </urlset> */}
+  </>;
+}
+
+export { onBeforeRender };
+
+async function onBeforeRender(pageContext) {
+  return {
+    pageContext: {
+      pageProps: pageContext._allPageIds,
+    },
+  };
+}
+
+export { render };
+
+async function render(pageContext: PageContextServer) {
+  const { Page, pageProps } = pageContext;
+  // This render() hook only supports SSR, see https://vite-plugin-ssr.com/render-modes for how to modify render() to support SPA
+  if (!Page)
+    throw new Error("My render() hook expects pageContext.Page to be defined");
+  const pageHtml = ReactDOMServer.renderToString(<Page {...pageProps} />);
+
+  const documentHtml = escapeInject`${dangerouslySkipEscape(pageHtml)}`;
+
+  return {
+    documentHtml,
+    pageContext: {
+      // We can add some `pageContext` here, which is useful if we want to do page redirection https://vite-plugin-ssr.com/page-redirection
+    },
+  };
+}
diff --git a/pages/about/index.page.tsx b/pages/about/index.page.tsx
deleted file mode 100644
index 3cf7a11..0000000
--- a/pages/about/index.page.tsx
+++ /dev/null
@@ -1,14 +0,0 @@
-import './code.css'
-
-export { Page }
-
-function Page() {
-  return (
-    <>
-      <h1>About</h1>
-      <p>
-        Example of using <code>vite-plugin-ssr</code>.
-      </p>
-    </>
-  )
-}
diff --git a/vite.config.ts b/vite.config.ts
index f476887..f57b373 100644
--- a/vite.config.ts
+++ b/vite.config.ts
@@ -1,9 +1,15 @@
-import react from '@vitejs/plugin-react'
-import ssr from 'vite-plugin-ssr/plugin'
-import { UserConfig } from 'vite'
+import react from "@vitejs/plugin-react";
+import ssr from "vite-plugin-ssr/plugin";
+import { UserConfig } from "vite";

 const config: UserConfig = {
-  plugins: [react(), ssr()]
-}
+  plugins: [
+    react(),
+    ssr({
+      prerender: true,
+      includeAssetsImportedByServer: true,
+    }),
+  ],
+};

-export default config
+export default config;

I tried to somehow get a "blank" page into which I could inject the XML sitemap code. The most minimal output I got was the following, but it was not minimal enough:

<head><link rel="stylesheet" type="text/css" href="https://localhost:3000/assets/static/default.page.server.d4835ae9.css"></head>
Sitemap constructed from pageProps (failed)

I also tested truumahn's solution, which worked until I tried to await some data from the db in onBeforeRender; then it stopped working.

@brillout
Member

I'm not sure I understand your problem, but it seems like a userland problem.

I'm realizing that this data: https://github.com/brillout/vite-plugin-ssr/blob/70ab60b502a685e39e65417a011c134fed1b5bd5/vite-plugin-ssr/shared/route/loadPageRoutes.ts#L14-L21 isn't accessible, I can make it available over pageContext._pageRoutes if you believe you need that.

@brillout
Member

> I can make it available over pageContext._pageRoutes if you believe you need that.

Done. You can now access all the internal routing information over pageContext._pageRoutes.

If many use it then I'll make it a stable public API.

@briansunter

briansunter commented Nov 21, 2023

I noticed _pageRoutes doesn't include the pages from +onBeforePrerenderStart. Is there any way to make this work?

For example, it just shows

{
  pageId: '/pages/hello',
  comesFromV1PageConfig: true,
  routeFunction: [Function: route],
  routeDefinedAt: '/pages/hello/+route.ts',
  routeType: 'FUNCTION'
}

But not the routes for the individual hello pages from the demo

@brillout
Member

@briansunter Since you provide the list of URLs, you already have that information: you don't need it from Vike. That said, alternatively, you can use pageContext._prerenderContext.pageContexts. Keep in mind that it's internal, so make sure to pin Vike's version.

I'm curious: do other React/Vue/... frameworks provide features in that regard?

Also sponsoring welcome (you'll get a bump in feature request prioritization).

@briansunter

briansunter commented Nov 21, 2023

Hey @brillout, that's basically what I was thinking: merging the static pages from this property with the URLs I'm providing in pre-render.

Should be able to import the route functions from the different pages to build that list as well.

I don't think react has anything like this built in, I'm comparing it more to SSGs like eleventy that have knowledge of all static and dynamic routes.

Manually merging it will work, just wasn't sure if there was a better way, since I'm already providing those routes to vike.
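A rough sketch of the manual merge. `RouteEntry` is a hypothetical, simplified model of the entries printed above: only `pageId` and `routeType` appear in this thread, while the optional `routeString` field and treating `@` and `*` as markers of parameterized routes are assumptions about an internal, unstable structure.

```typescript
// Hypothetical, simplified model of one internal route entry.
// `routeString` is an assumption; function routes are expected to lack it.
interface RouteEntry {
  pageId: string;
  routeType: string;
  routeString?: string;
}

// Combine parameter-free static routes with the URLs supplied for
// prerendering, deduplicated and sorted.
function mergeSitemapPaths(routes: RouteEntry[], prerenderUrls: string[]): string[] {
  const staticPaths: string[] = [];
  for (const route of routes) {
    const r = route.routeString;
    // Keep only plain paths: skip route functions and parameterized routes.
    if (r !== undefined && r.indexOf("@") === -1 && r.indexOf("*") === -1) {
      staticPaths.push(r);
    }
  }
  const all = staticPaths.concat(prerenderUrls);
  // Keep the first occurrence of each path.
  return all.filter((p, i) => all.indexOf(p) === i).sort();
}
```

Because the real structure is internal, pinning the Vike version (as advised above) applies to anything built on it.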

That internal property looks exactly like what I need. Will try that out and post a reference implementation if anyone's interested.

And I'll definitely look into getting more involved after finishing my current project.

@brillout
Member

@briansunter Thanks for circling back on this. Also make sure to check https://vike.dev/markdown#page-list.
