Tau - The Autonomous, Understanding robot

This is Tau!
Tau is inspired by Pi.AI; if you haven't tried Pi yet, I strongly encourage you to.
Like Pi, Tau holds one continual conversation, unlike chat-based bots that split interaction into many conversations and threads.
This is by design - Tau has a single conversation, like speaking to a human.
It is also reflected in how Tau is consulted on decisions made during development: order of features, voice type, etc.

Tau is a personal fun project.
I opened it as open source for anyone to experiment with (fork) or just follow. (A star is appreciated!)
If you fork, delete the history and facts files to reset Tau's knowledge and embark on the journey anew!

Update status

  • System prompt: Defines the conversation structure as speech and actions.
  • Conversation loop: A continuous conversation with ongoing context.
  • Immediate memory: Reduce context by summarizing it to key points, and inject the summary into the system prompt (a minimal sketch follows this list).
  • Long term memory: Save the running memory to a vector database.
  • Speech: Voice-based conversation with hearing and speaking. (Whisper and OpenAI TTS)
  • Vision infra: Set up Hailo-8L as an internal vision webservice.
    • Set up Hailo-8L on the Raspberry Pi, validate the examples work.
    • Look for best practices and options for integrating Hailo in your application.
    • Find a suitable, working architecture to wrap Hailo as a service.
    • Implement and improve the wrapper.
    • Pending Hailo review and my fixes (update: will be integrated as community-examples, confirmed by Hailo).
    • Integrate into the system, allowing Tau to recognize faces.
    • Add support for more than one model used serially, or for different devices (Coral, Sony AI Camera x2, Jetson).
  • Long term fetching: Pull from long term memory into context.
  • Entity based memory: Add GraphRAG based memory.
    • Learn about GraphRAG, how to implement, etc.
    • Use or implement GraphRAG
  • Advanced voice: Move to ElevenLabs advanced voices.
  • Tool use
    • Add framework for actions:
    • Open live camera feed action
    • Snap a picture
  • Introspection: Add Introspection agent for active and background thinking and processing.
  • Growth: Add nightly finetuning, move to smaller model.
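To make the conversation-loop and immediate-memory items above concrete, here is a minimal sketch of a single continual conversation whose older turns are folded into key points injected into the system prompt. It is not the repository's tau.py: it assumes the OpenAI Python client, an arbitrary model name, and hypothetical helper names (summarize_to_key_points, conversation_loop).

```python
# Minimal sketch (not the actual tau.py): one continual conversation whose
# running context is periodically compressed into key points and injected
# into the system prompt. Names and model choice are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

BASE_PROMPT = "You are Tau, a robot holding one ongoing conversation."

def summarize_to_key_points(history: list[dict]) -> str:
    """Compress older turns into short key points (immediate memory)."""
    joined = "\n".join(f'{m["role"]}: {m["content"]}' for m in history)
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": "Summarize this conversation as key points:\n" + joined}],
    )
    return reply.choices[0].message.content

def conversation_loop() -> None:
    memory = ""              # key points carried across turns
    history: list[dict] = []
    while True:
        user_text = input("> ")
        system_prompt = BASE_PROMPT + ("\nKnown facts:\n" + memory if memory else "")
        history.append({"role": "user", "content": user_text})
        reply = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "system", "content": system_prompt}] + history,
        )
        answer = reply.choices[0].message.content
        print(answer)
        history.append({"role": "assistant", "content": answer})
        if len(history) > 12:  # keep context small: fold old turns into memory
            # a fuller version would merge the previous key points as well
            memory = summarize_to_key_points(history[:-4])
            history = history[-4:]

if __name__ == "__main__":
    conversation_loop()
```

The long-term-memory item would then persist these key points to the vector database instead of keeping them only in the prompt.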

Prerequisites

Tau should be able to run on any Linux machine with internet access, but it has only been tested on a Raspberry Pi 5 (8 GB) with the official 64-bit OS.
The Raspberry Pi AI Kit is needed for vision (it can be disabled in code; configuration support per request / in the future).

Keys

All needed keys are listed in .env_sample.
Copy it to .env and add your keys.
Currently, the main key is OpenAI (Chat, Speech, Whisper); VoyageAI and Pinecone are used for the vector database.

I plan on moving back to Anthropic (Claude 3.5 Sonnet only).

Groq was used for a fast action-understanding use case.

Installation

  1. Clone the Git repositories.

1.1. Clone this repository to your Raspberry Pi:

git clone https://github.com/OriNachum/autonomous-intelligence.git

1.2. Clone my fork of the Hailo examples repository to your Raspberry Pi:

git clone https://github.com/OriNachum/hailo-rpi5-examples.git

I have a pending PR to integrate this into the main repo:

https://github.com/hailo-ai/hailo-rpi5-examples/pull/50

If you use vision, set up your machine for the Hailo-8L chip per Hailo's instructions.

  2. Copy .env_sample to .env and add all keys (a sample .env is sketched below the list):
  • ANTHROPIC_API_KEY: Used for Claude-based text completion and vision. Currently unused.
  • OPENAI_API_KEY: Used for Speech, Whisper, vision, and text.
  • GROQ_API_KEY: Used for super-quick action understanding. May be replaced with embeddings.
  • VOYAGE_API_KEY: VoyageAI is recommended by Anthropic. They offered the best embeddings at the time I selected them, and a great option for innovators.
  • PINECONE_API_KEY: API key for Pinecone. Serverless is a great option.
  • PINECONE_DIMENSION: Dimension of the embeddings generated by Voyage. Used for the Pinecone setup.
  • PINECONE_INDEX_NAME: Name of the Pinecone index used for memory.
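For reference, a .env built from the keys above might look like the following. The placeholder values (and the dimension and index name) are hypothetical; the exact contents of .env_sample in the repo may differ.

```
ANTHROPIC_API_KEY=sk-ant-your-key-here
OPENAI_API_KEY=sk-your-key-here
GROQ_API_KEY=gsk_your-key-here
VOYAGE_API_KEY=pa-your-key-here
PINECONE_API_KEY=your-key-here
PINECONE_DIMENSION=1024
PINECONE_INDEX_NAME=tau-memory
```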

Usage

There are five programs, run in this order (a sample run is sketched after the list):

  1. hailo-rpi5-examples:
    • basic-pipelines/detection_service.py: runs the camera and emits events on detection changes.
  2. autonomous-intelligence:
    • services/face_service.py: starts the face app and reacts when speech occurs.
    • tau.py: the main LLM conversation loop.
    • tau_speech.py: consumes speech events and produces actual speech.
    • services/microphone_listener.py: listens to your speech and emits events to tau.py as input.
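Assuming each program is a standalone Python script started from its repository's root (the exact invocation, virtual environments, and any extra arguments may differ), a run could look like launching each in its own terminal:

```
# Terminal 1: detection service from the hailo-rpi5-examples clone
cd hailo-rpi5-examples
python basic-pipelines/detection_service.py

# Terminals 2-5: the Tau services from the autonomous-intelligence clone
cd autonomous-intelligence
python services/face_service.py
python tau.py
python tau_speech.py
python services/microphone_listener.py
```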

License

This project is licensed under the MIT License.
