A demo of neural search for audio data based on the VGGish model.
- To run this example, youtube-dl and ffmpeg must be available on your system. Please refer to their respective installation instructions.
- macOS users also need libmagic, which can be installed by running
brew install libmagic
- Install the Python libraries required for this example by running:
pip install -r requirements.txt
- In this example, we use the VGGish model to encode the sound files. You can find more details about the model at https://github.com/tensorflow/models/tree/master/research/audioset/vggish. To download the AudioSet data, we adapted the code from the runme.sh script at https://github.com/qiuqiangkong/audioset_tagging_cnn. Use the following commands to download the model and the data. The download_data.sh script downloads 10 audio files from the AudioSet dataset.
bash download_model.sh
bash download_data.sh
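Before running the download scripts, it can help to confirm that the command-line tools listed in the prerequisites are actually on your PATH. The following is a minimal sketch, not part of this example's code; the helper name `missing_tools` is hypothetical, and it assumes the executables are named `youtube-dl` and `ffmpeg` on your system.

```python
import shutil

# Tools named in the prerequisites; the executable names are an assumption
# about how they are installed on your system.
REQUIRED_TOOLS = ("youtube-dl", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

If the check reports a missing tool, install it first; the download scripts are likely to fail otherwise.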
After preparing the data, the folder looks like this:
.
├── Dockerfile
├── README.md
├── app.py
├── data
│   ├── metadata
│   │   └── eval_segments.csv
│   ├── YjmN-c5mDxfw.wav
│   ├── Yjo9lFbGXf_0.wav
│   └── Yjzij1UX73kU.wav
├── download_data.sh
├── download_model.sh
├── flows
│   ├── index.yml
│   └── query.yml
├── get_data.sh
├── models
│   ├── vggish_model.ckpt
│   └── vggish_pca_params.npz
├── pods
│   ├── chunk_merger.yml
│   ├── customized_executors.py
│   ├── doc.yml
│   ├── encode.yml
│   ├── rank.yml
│   ├── segment.yml
│   ├── vec.yml
│   └── vggish
│       ├── mel_features.py
│       ├── vggish_input.py
│       ├── vggish_params.py
│       ├── vggish_postprocess.py
│       └── vggish_slim.py
├── requirements.txt
└── tests
    ├── data
    │   ├── YjmN-c5mDxfw.wav
    │   ├── Yjo9lFbGXf_0.wav
    │   └── Yjzij1UX73kU.wav
    └── test_audio_search.py
- Alternatively, you can use your own .wav files. Make sure the files are under data/. For example, our get_data.sh script downloads a few Beethoven symphonies from Wikimedia Commons. This is a small dataset, so it indexes quickly. Just run:
sh ./get_data.sh
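If you bring your own files, a quick sanity check of the data folder before indexing can save a failed run. The sketch below is an illustration rather than part of this example: the function name `list_wav_files` is hypothetical, and it relies on Python's standard `wave` module, which only parses uncompressed PCM WAV files. A file skipped here is therefore not necessarily unusable by the example itself.

```python
import wave
from pathlib import Path


def list_wav_files(data_dir="data"):
    """Return (file name, duration in seconds) for each readable .wav file."""
    results = []
    for path in sorted(Path(data_dir).glob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav:
                duration = wav.getnframes() / wav.getframerate()
        except (wave.Error, EOFError):
            # Not a WAV file the standard library can parse; skip it.
            continue
        results.append((path.name, duration))
    return results


if __name__ == "__main__":
    for name, seconds in list_wav_files():
        print(f"{name}: {seconds:.1f} s")
```

An empty result suggests the files are not directly under data/ or are not in a format the `wave` module can read.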
Command | Description |
---|---|
python app.py -t index | To index files/data |
python app.py -t query | To run query on the index |
Then open https://jina.ai/jinabox.js/ for querying.
To mount the local directories and run indexing:
docker run -v "$(pwd)/data:/workspace/data" -v "$(pwd)/workspace:/workspace/workspace" jinaai/hub.app.audio-search:0.0.1 index
Run the following command, then open https://jina.ai/jinabox.js/ for querying:
docker run -p 65481:65481 -e "JINA_PORT=65481" -v "$(pwd)/workspace:/workspace/workspace" jinaai/hub.app.audio-search:0.0.1 search
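The container may take a moment to start, so the query page can fail to connect if opened too early. As a rough sketch (not part of this example), the helper below polls until something accepts TCP connections on the published port. The name `wait_for_port` is hypothetical, the host `localhost` is an assumption about where Docker publishes the port, and 65481 is taken from the command above.

```python
import socket
import time


def wait_for_port(host="localhost", port=65481, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait and retry until the deadline.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    if wait_for_port():
        print("Port is open; the query service appears to be up.")
    else:
        print("Timed out waiting for the port.")
```

A successful connection only shows that the port is open, not that the index has finished loading.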
The best way to learn Jina in depth is to read our documentation. Documentation is built on every push, merge, and release event of the master branch. For more details, check out:
- Jina command line interface arguments explained
- Jina Python API interface
- Jina YAML syntax for executor, driver and flow
- Jina Protobuf schema
- Environment variables used in Jina
- ... and more
- Slack channel - a communication platform for developers to discuss Jina
- Community newsletter - subscribe to the latest update, release and event news of Jina
- LinkedIn - get to know Jina AI as a company and find job opportunities
- Follow us and interact with us using the hashtag #JinaSearch
- Company - learn more about our company. We are fully committed to open source!
Copyright (c) 2020 Jina AI. All rights reserved.