verbal_morality_statue [2017/10/22 15:51] – created deepnemo | projects:verbal_morality_statute [2024/01/05 21:34] (current) – Overhaul kratenko | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===== Verbal Morality | + | ====== Verbal Morality |
- | ==== General Function ==== | ||
- | The VMSE2000 is the newest iteration in a long-standing series of verbal hate crime prevention devices. It is able to detect language violations in all languages for the region of purchase and works in a range of up to 6 meters, while being able to work in conjunction with other instances of the VMSE2000 to cover your available space and keep you safe from dreaded language violations. | + | <WRAP column 47%> |
+ | {{: | ||
+ | </ | ||
+ | <WRAP column 47%> | ||
+ | {{: | ||
+ | </ | ||
- | ==== Detailed Specification ==== | + | The VMSE 2000 is the newest iteration in a long-standing series of verbal hate crime prevention devices. It is able to detect language violations in all languages for the region of purchase and works at a range of up to 6 meters, while being able to work in conjunction with other instances of the VMSE 2000 to cover your available space and keep you safe from dreaded language violations. |
- | * The VMSE2000 listens for speech input and detects language violations | + | Our analysts predict that the device will integrate into everyone' |
- | * Consequences | + | |
- | * The language violation is announced | + | |
- | ==== What needs to be done ==== | ||
- | Minimum Viable Product: | + | ===== Detailed Specification ===== |
- | * evaluate Kaldi | + | * The VMSE 2000 listens for speech input and detects verbal morality violations committed by maniacs currently in the device' |
- | * does it still have pre-trained models? | + | * The consequences of a detected language violation are a verbal notification and a printed receipt for a credit fine of a sum adequate to the severity of the individual violation. |
- | * does it run on a Raspi? | + | * The morality violation is announced with a sharp warning tone, guaranteeing attention to this crime from both the maniac himself and society in general. |
- | * find alternatives | + | * The public display of the maniac's moral misdemeanour will apply social pressure on the maniac, leading |
- | * data set of language violations | + | * The continuous enforcement |
- | * dict.cc | + | * The VMSE 2000 emits a modern aura of morality with aesthetics inspired by the best designers of San Angeles. |
- | * leo | + | |
- | * movie dataset? | + | |
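The specification above can be read as a small pipeline: a transcript comes in, violations are matched, and a fine matching the severity is issued. Below is a minimal sketch of that idea in Python. The word list `VIOLATIONS`, the severity levels and the one-credit base fine are hypothetical placeholders for illustration and are not taken from the actual firmware.

```python
import re
from dataclasses import dataclass

# Hypothetical word list: term -> severity (1 = mild, 3 = severe).
# The device's real vocabulary and severities are not documented on this page.
VIOLATIONS = {"darn": 1, "heck": 1, "blast": 2}

# Hypothetical base fine in credits per severity level.
CREDITS_PER_SEVERITY = 1


@dataclass(frozen=True)
class Violation:
    word: str
    severity: int

    @property
    def fine(self) -> int:
        """Fine in credits, scaled by the severity of the violation."""
        return self.severity * CREDITS_PER_SEVERITY


def detect_violations(transcript: str) -> list[Violation]:
    """Return one Violation per offending word, in spoken order."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return [Violation(w, VIOLATIONS[w]) for w in words if w in VIOLATIONS]


def total_fine(violations: list[Violation]) -> int:
    """Sum the fines of all detected violations."""
    return sum(v.fine for v in violations)
```

A plain dictionary lookup keeps the sketch easy to follow; a real device would likely need something more tolerant of transcription errors and inflected word forms.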
- | * there should | + | ===== Development ===== |
- | * ensure passable noise robustness | + | |
- | * two microphones | + | The VMSE 2000 is one of the most important projects by Deep Cyber, if not one of the most important efforts of our lifetime. While the physical parameters and technical details were finalised at an early stage of development back in 2017, the component required for reliable detection of violations proved to be much more complicated. The depth of Cyber necessary |
- | * two pairs of speakers | + | |
- | * amplifier | + | ===== Hardware ===== |
- | * two sound cards for Raspi (USB) | + | |
- | * suitable cases | + | * Raspberry Pi 5 (4 also successfully tested) |
+ | * Thermal Printer (compatible with `python-escpos`) | ||
+ | * USB Audio Adapter | ||
+ | * PlayStation Eye USB camera | ||
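As an illustration of how a fine receipt might reach the thermal printer, here is a hedged sketch in Python. It only assumes that the printer object offers `text()` and `cut()` methods, which python-escpos printer classes provide; the receipt layout, the function names and the `ConsolePrinter` stand-in are hypothetical and not taken from the project's firmware.

```python
from datetime import datetime


def format_receipt(violation: str, fine_credits: int, when: datetime) -> list[str]:
    """Build the receipt as plain text lines (hypothetical layout)."""
    return [
        "VERBAL MORALITY VIOLATION",
        when.strftime("%Y-%m-%d %H:%M"),
        f"Violation: {violation}",
        f"Fine: {fine_credits} credit(s)",
    ]


def print_receipt(printer, lines: list[str]) -> None:
    """Send the lines to any object that offers text() and cut()."""
    for line in lines:
        printer.text(line + "\n")
    printer.cut()


class ConsolePrinter:
    """Stand-in for a real printer so the sketch runs without hardware."""

    def __init__(self) -> None:
        self.output: list[str] = []

    def text(self, txt: str) -> None:
        self.output.append(txt)

    def cut(self) -> None:
        self.output.append("--- cut ---\n")


if __name__ == "__main__":
    printer = ConsolePrinter()
    print_receipt(printer, format_receipt("darn", 1, datetime.now()))
    print("".join(printer.output), end="")
```

With hardware attached, one would presumably pass a python-escpos printer such as `escpos.printer.Usb(vendor_id, product_id)` instead of `ConsolePrinter`; the USB IDs depend on the printer model.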
+ | |||
+ | ===== Software used ===== | ||
+ | |||
+ | * OpenAI Whisper model (base), prompted for your language of choice | ||
+ | * [[https:// | ||
+ | * [[https:// | ||
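For readers who want to try the speech recognition part on its own, the following Python sketch shows one way to run the Whisper base model on a recording. It assumes the `openai-whisper` package and reads "prompted for your language of choice" as setting the `language` option, optionally combined with an `initial_prompt`; that reading, the function names and the file name in the usage note are assumptions, not the project's actual code.

```python
def load_base_model():
    """Load the Whisper 'base' model (requires the openai-whisper package)."""
    import whisper  # imported lazily so transcribe() can be tested without it

    return whisper.load_model("base")


def transcribe(model, audio_path: str, language: str, prompt: str = "") -> str:
    """Transcribe one recording, steering Whisper towards the chosen language."""
    result = model.transcribe(
        audio_path,
        language=language,              # language code, e.g. "en" or "de"
        initial_prompt=prompt or None,  # optional text to bias the decoder
    )
    return result["text"].strip()
```

Typical use would be `transcribe(load_base_model(), "clip.wav", language="de")`, where `clip.wav` is a hypothetical recording. Passing the model in as an argument means it is loaded only once and can be replaced by a fake object in tests.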
+ | |||
+ | ===== Source Code ===== | ||
+ | |||
+ | * You can find the firmware / device glue at [[https:// | ||
+ | * The voice detection part can be found at [[https:// | ||
+ | |||
+ | ===== Previous Iterations ===== | ||
+ | |||
+ | There were a lot of iterations to get to this result (and a lot of non-development as well - this project started in 2017 after all). | ||
+ | We tested DeepSpeech, DeepSpeech V2, RNNs on DeepSpeech 2 feature extractors and binary classification RNNs trained from scratch. In the end, the simplest and most robust model was OpenAI Whisper. We suspect that the amount of data, its variance and the resulting robustness to noise (from the microphone as well as the background) are what make the difference. | ||
- | Bonus: | ||
- | * portable power (printer needs a lot of power) | ||