VocalDoccano

VocalDoccano is a Python script for voice-controlled annotation in Doccano. It enables users to perform sequence labeling tasks using voice commands, offering an alternative to manual annotation via mouse and keyboard.

By using voice commands, users can potentially annotate text with less physical effort. This script is a little experiment to see if we can make the process a bit more engaging and, dare I say, fun, by using speech recognition...anyway..

Requirements

Running Doccano Instance: You need a Doccano instance up and running and accessible via its API. It doesn't matter how you installed Doccano (Docker, pip), as long as VocalDoccano can connect. I used Docker Compose installation. You'll need your Doccano URL, username, and password.
Vosk Speech Recognition Model: You'll need a speech recognition model from Vosk. Small models are usually fine and you can grab them from https://alphacephei.com/vosk/models. The vosk-model-small-en-us-0.15 model (which is mentioned in the config file) is a good starting point for English.

Installation

Clone the Repository:

git clone https://github.com/sgoettel/VocalDoccano/
cd VocalDoccano

Install Dependencies:
```
pip install -r requirements.txt
```

Configuration

You'll find all the settings in the config/config.ini file.

[Doccano]:
- url: The web address of your Doccano (like http://localhost/).
- username: Your Doccano login name.
- password: Your Doccano login password.
- project_id: The ID of the specific project in Doccano you want to work on.
[Startup]:
- document_id: The ID of the specific document to load automatically on startup.
[Vosk]: Settings for the speech recognition.
- model_path: Tell VocalDoccano where your downloaded Vosk model is. The path should be relative to the main folder of the VocalDoccano repository. For example: models/vosk-model-small-en-us-0.15.
[Audio]: Settings for the sound you hear when you tag something.
- sound_path: The location of the sound file. This is also relative to the main repository folder (e.g., assets/tag_notification.wav). (Sound effect provided by Universfield from Pixabay.)
[BatchSettings]: These settings control how often your annotations are sent to Doccano.
[RetrySettings]: If something goes wrong when sending annotations to Doccano, these settings tell the script how many times to try again and how long to wait.
[DisplaySettings]: Controls how many words you see when you use the show next command, and also how much text is shown initially when you load a document.

Usage

Go to the src directory:
```
cd src
```
Run the script:
```
python vocal_doccano.py
```
(python src/vocal_doccano.py from the main directory also works)

The script will try to connect to Doccano and load your document. Keep an eye on the terminal for messages.

Voice Commands

mark: Starts selecting text at the current word.
mark next: Expands the current selection to include the next word.
mark back: Shrinks the current selection by one word.
<label>: Say the name of a label (like "Person", "Location", etc.) to tag the currently selected text.
next: Moves the cursor forward.
back: Moves the cursor backward.
show next: Shows you the next bit of text in the console.
undo: Removes the last tag you created in the current batch (before sending to Doccano).
exit: Saves any annotations you've made but haven't sent yet and then quits the script.

(Want to change the commands? You'll find them in the command_map within the VoiceAnnotationTool class in src/vocal_doccano.py.)

Important: VocalDoccano uses the exact labels you've set up in your Doccano project as the voice commands for tagging. So, if you have a label called "Organization", you'd say "Organization".

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
assets		assets
config		config
model		model
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
todo		todo

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

VocalDoccano

Requirements

Installation

Configuration

Usage

Voice Commands

About

Uh oh!

Releases

Packages

Uh oh!

Languages

License

sgoettel/VocalDoccano

Folders and files

Latest commit

History

Repository files navigation

VocalDoccano

Requirements

Installation

Configuration

Usage

Voice Commands

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages