🤖 New Script – AI Automated Content Audit #1983

Open
wants to merge 2 commits into
base: main
from

@isaaclombardssw isaaclombardssw commented Aug 1, 2024

To work through the large volume of content in the Tina docs efficiently, I've added a Python 3 content audit script that uses a local LLM to analyze each page against a configurable prompt. A readme file covering usage is included.

The implementation uses Ollama as the local LLM server and makes one HTTP request per file.
The URL and payload can be changed to target other local LLM servers, or other APIs that may give better results or offer models with more parameters.
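
For illustration, a minimal sketch of the per-file request might look like the following. It is not the script from this PR: the URL is Ollama's usual local default, the model name is a placeholder, and `build_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed defaults: Ollama's usual local address and a placeholder model name.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3"  # hypothetical; use whichever model is pulled locally


def build_request(prompt: str, document: str,
                  url: str = OLLAMA_URL, model: str = MODEL) -> urllib.request.Request:
    """Build one non-streaming generate request for a single file's content."""
    payload = {
        "model": model,
        "prompt": f"{prompt}\n\n{document}",
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, document: str) -> str:
    """Send the request and return the model's text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, document)) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Keeping the request construction separate from the network call means swapping in another server only requires changing the URL and the payload shape in one place.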

The script generates two files. One contains a list of all discovered markdown content.
The other, auditor-responses.md, lists each file and marks it either as "clean" or with the prompt response text.
For this filtering to work (marking files as clean), the prompt must tell the model to start its response with either yes or no.
The response file is written in markdown syntax.
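
As a rough sketch of the filter, the first word of each response can decide what gets written. This assumes the prompt is phrased so that "no" means nothing was flagged; the function name and the heading layout are hypothetical and may differ from what the script actually writes.

```python
def format_entry(path: str, response: str) -> str:
    """Return a markdown section for one audited file."""
    words = response.strip().lower().split(maxsplit=1)
    # Drop trailing punctuation so "No," and "No." both count as "no".
    verdict = words[0].strip(".,:;!") if words else ""
    if verdict == "no":
        # Assumption: "no" means the prompt's question found nothing to flag.
        body = "clean"
    else:
        # "yes", or anything unexpected, keeps the full response for review.
        body = response.strip()
    return f"## {path}\n\n{body}\n"
```

Treating any response that does not start with "no" as needing review errs on the side of showing too much rather than hiding a flagged page.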

General Contributing:

All New Content Submissions: (To be confirmed by reviewer)

  • Title is short & specific
  • Headers are logically ordered & consistent
  • Purpose of document is explained in the first paragraph
  • Procedures are tested and work
  • Any technical concepts are explained or linked to
  • Document follows structure from templates
  • All links work
  • The spelling and grammar checker has been run
  • Graphics and images are clear and useful
  • Any prerequisites and next steps are defined.

@isaaclombardssw isaaclombardssw requested a review from a team as a code owner August 1, 2024 00:57

vercel bot commented Aug 1, 2024

@isaaclombardssw is attempting to deploy a commit to the TinaCMS Team on Vercel.

A member of the Team first needs to authorize it.

Member

@bradystroud bradystroud left a comment

Hey Isaac, this is good. We should definitely have some SSW Rules that the readme links to.
Can you send a 'to myself' for that?

Comment on lines +7 to +9
QUESTION_FILE = "content-auditor-prompt.txt"
OUTPUT_FILE = "auditor-responses.md"
FOUND_FILES = "auditor-found-files.txt"

Maybe add a gitignore for these files; we don't want them pushed.

What do you think?

Comment on lines +5 to +6
The Tina documentation is exhaustive and contains mentions of planned future features.
To check these at once we've created this script that uses every document or blog file as content to be run against a LLM prompt.

Make this more generic

Suggested change
- The Tina documentation is exhaustive and contains mentions of planned future features.
- To check these at once we've created this script that uses every document or blog file as content to be run against a LLM prompt.
+ The Tina documentation is exhaustive. To make maintenance and spotting issues easy, we've created this script that runs a LLM prompt against every .mdx file it can find.
