Differences

This shows you the differences between two versions of the page.

verbal_morality_statue [2017/10/22 15:51] – created deepnemo
projects:verbal_morality_statute [2024/01/05 21:20] – [Development] kratenko
Line 1: Line 1:
-===== Verbal Morality Statue Enforcer 2000 =====+====== Verbal Morality Statute Enforcer 2000 ======
  
-==== General Function ====+**Documentation is WIP @ 37c3 - please consult nemo / rey / kratenko for questions**
  
-The VMSE2000 is the newest iteration in a long standing series of verbal hate crime prevention devices. It is able to detect language violations in all languages for the region of purchase and works in a range of up to 6 meters, while being able to work in conjunction with other instances of the VMSE2000 to cover your available space and keep you safe from dreaded language violations.+The VMSE 2000 is the newest iteration in a long-standing series of verbal hate crime prevention devices. It detects language violations in all languages for the region of purchase at a range of up to 6 meters, and it can work in conjunction with other instances of the VMSE 2000 to cover your available space and keep you safe from dreaded language violations.
  
-==== Detailed Specification ====+Our analysts predict that the device will integrate into everyone's way of life and be an accepted part of society no later than 2032. We are happy to provide you with our adaptation of [[https://www.youtube.com/watch?v=dz4HEEiJuGo|what the future will look like]] through the influence of the VMSE 2000.
  
-  * The VMSE2000 listens for speech input and detects language violations 
-  * Consequences of a detected language violation is a verbal notification as well as a printed receipt for a credit fine of 1$ in BTC including a link to the web page (QR code 2000) 
-  * The language violation is announced with a sharp warning tone at the beginning for guaranteeing attention to this crime 
  
-==== What needs to be done ====+===== Detailed Specification =====
  
-Minimum Viable Product:+  * The VMSE 2000 listens for speech input and detects verbal morality violations committed by maniacs currently in the device's proximity.  
 +  * The consequence of a detected language violation is a verbal notification as well as a printed receipt for a credit fine of a sum adequate to the severity of the individual violation. 
 +  * The morality violation is announced with a sharp warning tone guaranteeing attention to this crime, from both the maniac himself and society in general. 
 +  * The public display of the maniac's moral misdemeanour will apply social pressure to the maniac, leading to an adjustment of the subject's moral values. 
 +  * The continuous enforcement of the verbal morality statute by the VMSE 2000 will result in a better society for everyone's benefit.
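
The step from a detected violation to a printed receipt could be sketched roughly as follows. This is a minimal illustration only, not the project's firmware: the `Violation` type, the severity scale, the `FINES` schedule and the receipt wording are all assumptions made up for this example, since the specification only says that the fine is adequate to the severity of the violation.

```python
from dataclasses import dataclass

# Hypothetical fine schedule (credits per severity level). The specification
# only states that the fine is "adequate to the severity" of the violation.
FINES = {1: 1, 2: 5, 3: 10}


@dataclass(frozen=True)
class Violation:
    phrase: str    # the offending phrase as detected
    severity: int  # hypothetical scale: 1 (mild) to 3 (severe)


def fine_for(violation: Violation) -> int:
    """Return the fine in credits; unknown severities get the highest fine."""
    return FINES.get(violation.severity, max(FINES.values()))


def receipt_lines(violation: Violation) -> list[str]:
    """Build the text lines of a receipt; printing is left to the caller."""
    return [
        "VERBAL MORALITY STATUTE",
        "VIOLATION DETECTED",
        f"Severity: {violation.severity}",
        f"Fine: {fine_for(violation)} credit(s)",
    ]
```

The returned lines would then be handed to whatever drives the thermal printer. Keeping the formatting separate from the printing makes the fine logic testable without hardware attached.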
  
-  * evaluate Kaldi  + 
-    * does it still have pre-trained models? +===== Development ===== 
-    does it run on a Raspi? + 
-  * find alternatives to Kaldi +The VMSE 2000 is one of the most important projects by Deep Cyber, if not one of the most important efforts of our lifetime. While the physical parameters and technical details were finalised at an early stage of development back in 2017, the component required for reliable detection of violations proved to be much more complicated. The depth of Cyber necessary to overcome each and every obstacle on the way to fulfilling our goal required a prolonged era of research. After six years of development we were finally able to present our fully functioning prototype to the public at the very end of 2023 at the perfect event for such a presentation: the Chaos Communication Congress, 37C3. 
-  * data set of language violations + 
-    dict.cc +====== Hardware ====== 
-    leo + 
-    movie dataset? +Raspberry Pi 5 (4 also successfully tested) 
-    there should be enough to do a proper train/val split +- Thermal Printer (compatible with `python-escpos`) 
-  ensure passable noise robustness +- USB Audio Adapter 
-  * two microphones +- PlayStation Eye USB camera for its microphone array (taped-over CCD) 
-  * two pairs of speakers + 
-  * amplifier for speakers+====== Software used ====== 
 + 
 +- OpenAI whisper model (base) prompted for your language of choice! 
 +- [[https://github.com/aarnphm/whispercpp|whispercpp python bindings]] 
 +- [[https://github.com/openvinotoolkit/openvino|OpenVINO]] for speeding up encoding 
 + 
 +====== Source Code ====== 
 + 
 +You can find the firmware / device glue at [[https://github.com/deepestcyber/vmse2000-firmware/]]. 
 +The voice detection part can be found at [[https://github.com/deepestcyber/vmse2000-detector/]]. 
 + 
 +===== Previous Iterations ===== 
 + 
 +There were a lot of iterations to get to this result (and a lot of non-development as well - this project started in 2017 after all). 
 +We tested DeepSpeech, DeepSpeech V2, RNNs on DeepSpeech 2 feature extractors, and binary classification RNNs trained from scratch. In the end, the simplest and most robust model was OpenAI Whisper. Our suspicion is that the amount of data, its variance, and the resulting robustness to noise (microphone as well as background) are what make the difference. 
 + 
 + 
 + 
 +===== Material needed ===== 
 + 
 +==== Data ==== 
 +  * [[https://tatoeba.org/|Tatoeba]] has data in various languages with [[https://tatoeba.org/eng/tags/show_sentences_with_tag/454|tags]] and also audio 
 +  * https://gist.github.com/jamiew/1112488 -- All the dirty words from Google's "what do you love" project: http://www.wdyl.com/ 
 +  * http://www.cs.cmu.edu/~biglou/resources/bad-words.txt -- another bad word list 
 +  * https://www.hatebase.org/ -- world's largest online repository of structured, multilingual, usage-based hate speech 
 +  * https://github.com/t-davidson/hate-speech-and-offensive-language -- a related paper (text-based) 
 +  * //Sentiment analyses of single words or short phrases// at https://www.crowdflower.com/data-for-everyone/ 
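
Word lists like the ones linked above could be matched against a transcript roughly as sketched below. This is only an illustration under stated assumptions: the one-entry-per-line file format and the helper names `load_word_list` and `find_violations` are hypothetical, and the sketch is not taken from the project's detector.

```python
import re


def load_word_list(text: str) -> set[str]:
    """Parse a word list with one entry per line (assumed format).

    Entries are lowercased; blank lines are skipped.
    """
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def find_violations(transcript: str, bad_words: set[str]) -> list[str]:
    """Return the listed words found in the transcript, in order of appearance."""
    # Tokenise into runs of letters, allowing one inner apostrophe ("don't").
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", transcript.lower())
    return [token for token in tokens if token in bad_words]
```

Note that this only matches single words. Multi-word entries in a list would need phrase matching on top of it.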
 +==== SW ==== 
 + 
 +  data! big! for deepest cyber! 
 +  ASR 
 +     https://github.com/PaddlePaddle/DeepSpeech 
 +     https://github.com/mozilla/DeepSpeech 
 + 
 +==== HW ==== 
 +  * two area microphones -> rey will ask around 
 +  * two pairs of speakers -> rey will ask around 
 +  * amplifier for speakers -> rey will ask around
   * two sound cards for Raspi (USB)   * two sound cards for Raspi (USB)
   * suitable cases   * suitable cases
Line 34: Line 72:
  
   * portable power (printer needs a lot of power)   * portable power (printer needs a lot of power)
 +
 +