A speaking camera using Pi3 and Tensorflow

Update – I’ve done something similar more recently.

Danbri made one of these and I was so impressed I had a go myself, with a couple of tweaks. It’s very easy to do. He did all the figuring out what needed to be done – there’s something similar here which did the rounds recently. Others have done the really heavy lifting – in particular, making tensorflow work on the Pi.

Barnoid has done lovely things with a similar but cloud-based setup for his Poetoid Lyricam – he used a captioner similar to this one that isn’t quite on a Pi with Python hooks yet (but nearly). (Barnoid update – he used Torch with neuraltalk2.)

The gap between taking a photo and it speaking is 3-4 seconds. Sometimes it seems to cache the last photo. It’s often wrong 🙂

I used USB audio and a powerfulish speaker. A DAC would also be a good idea.


Image the pi and configure

diskutil list
diskutil unmountDisk /dev/diskN
sudo dd bs=1m if=~/Downloads/2016-09-23-raspbian-jessie.img of=/dev/rdiskN

log in to the pi, expand file system, enable camera

sudo raspi-config

optionally, add in usb audio or a DAC



install pico2wave (I tried espeak but it was very indistinct)

sudo pico /etc/apt/sources.list

# Uncomment line below then 'apt-get update' to enable 'apt-get source'
deb-src http://archive.raspbian.org/raspbian/ jessie main contrib non-free rpi

sudo apt-get update
sudo apt-get install fakeroot debhelper libtool help2man libpopt-dev hardening-wrapper autoconf
sudo apt-get install automake1.11 # requires this version
mkdir pico_build
cd pico_build
apt-get source libttspico-utils
cd svox-1.0+git20130326 
dpkg-buildpackage -rfakeroot -us -uc
cd ..
sudo dpkg -i libttspico-data_1.0+git20130326-3_all.deb
sudo dpkg -i libttspico0_1.0+git20130326-3_armhf.deb
sudo dpkg -i libttspico-utils_1.0+git20130326-3_armhf.deb


sudo apt-get install mplayer
pico2wave -w test.wav "hello alan" && mplayer test.wav
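
A minimal sketch of the same test driven from Python, assuming pico2wave and mplayer are both on the PATH. The function names are made up for illustration. The two commands run one after the other, so playback only starts once the WAV file has been written.

```python
import subprocess


def speech_commands(text, wav_path="test.wav"):
    """Return two argv lists: synthesise text to a WAV file, then play it."""
    synth = ["pico2wave", "-w", wav_path, text]
    play = ["mplayer", wav_path]
    return synth, play


def speak(text, wav_path="test.wav", run=subprocess.check_call):
    """Run pico2wave and then mplayer, stopping if either one fails."""
    for cmd in speech_commands(text, wav_path):
        # check_call raises CalledProcessError on a non-zero exit status
        run(cmd)
```

On the Pi, speak("hello alan") should behave like the command line above. The run argument is only there so the function can be tried without the audio tools installed.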

install tensorflow on raspi

sudo apt-get install python-pip python-dev
wget https://github.com/samjabrahams/tensorflow-on-raspberry-pi/raw/master/bin/tensorflow-0.10.0-cp27-none-linux_armv7l.whl
sudo pip install tensorflow-0.10.0-cp27-none-linux_armv7l.whl

install prerequisites for classify_image.py

git clone https://github.com/tensorflow/tensorflow.git # takes ages
sudo pip install imutils picamera 
sudo apt-get install python-opencv


cd /home/pi/tensorflow/tensorflow/models/image/imagenet

install danbri / my hacked version of classify_image.py

mv classify_image.py classify_image.py.old
curl -O "https://gist.githubusercontent.com/libbymiller/afb715ac53dcc7b85cd153152f6cd75a/raw/2224179cfdc109edf2ce8408fe5e81ce5a265a6e/classify_image.py"


python classify_image.py
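
The stock classify_image.py prints each guess as a comma-separated list of synonyms followed by a score. Below is a rough sketch of how guesses like that could be turned into something worth speaking. It is not the code in the gist: the function name, the wording and the 0.3 cut-off are all assumptions made for illustration.

```python
def describe(predictions, min_score=0.3):
    """Build a sentence from (label, score) pairs.

    Labels are assumed to be ImageNet-style synonym lists such as
    "giant panda, panda, panda bear"; only the first name is spoken.
    min_score is an arbitrary cut-off chosen for this sketch.
    """
    if not predictions:
        return "I can't see anything"
    # Pick the highest-scoring guess
    label, score = max(predictions, key=lambda p: p[1])
    name = label.split(",")[0].strip()
    article = "an" if name[0].lower() in "aeiou" else "a"
    if score < min_score:
        return "I'm not sure, maybe %s %s" % (article, name)
    return "I think I see %s %s" % (article, name)
```

The returned string could then be handed to pico2wave. Hedging on low scores seems sensible given how often the classifier is wrong.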



Machine learning links

[work in progress – I’m updating it gradually]

Machine Learning

Google Apologizes After Photos App Autotags Black People as ‘Gorillas’ – a very upsetting and embarrassing misclassification. Flickr’s system did the same thing but in a less visible way.

How Vector Space Mathematics Reveals the Hidden Sexism in Language – very interesting work analysing Word2vec, and particularly their mechanisms for fixing the problem
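
A toy sketch of one way a fix like that can work, assuming the general idea is to find a "gender direction" in the vector space and remove it from words that should be neutral. The 3-d vectors are invented for illustration (real embeddings have hundreds of dimensions), and this is not claimed to be the exact procedure in the work linked above.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Scale v to length 1."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def neutralize(word_vec, direction):
    """Remove the component of word_vec that lies along direction."""
    g = unit(direction)
    scale = dot(word_vec, g)
    return [x - scale * gx for x, gx in zip(word_vec, g)]


# Made-up vectors, purely for illustration
he = [1.0, 0.2, 0.0]
she = [-1.0, 0.2, 0.0]
programmer = [0.4, 0.5, 0.7]

# The difference between a gendered pair gives a direction to project out
gender = [a - b for a, b in zip(he, she)]
fixed = neutralize(programmer, gender)
```

After the projection, fixed has no component along the he/she axis, so it sits equally close to both words in that direction.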

There is a blind spot in AI research – Kate Crawford and Ryan Calo, Nature, October 2016 – a call for “A practical and broadly applicable social-systems analysis thinks through all the possible effects of AI systems on all parties”

a ProPublica investigation in May 2016 found that the proprietary algorithms widely used by judges to help determine the risk of reoffending are almost twice as likely to mistakenly flag black defendants than white defendants

As a first step, researchers – across a range of disciplines, government departments and industry – need to start investigating how differences in communities’ access to information, wealth and basic services shape the data that AI systems train on.

Maciej Cegłowski – SASE Panel – Maciej on why not being able to understand the mechanisms by which ML systems come to their results is problematic, or as he puts it:

“Instead of relying on algorithms, which we can be accused of manipulating for our benefit, we have turned to machine learning, an ingenious way of disclaiming responsibility for anything. Machine learning is like money laundering for bias.”

All it takes to steal your face is a special pair of glasses – report on a paper experimentally tricking a commercial face recognition system into misidentifying people as specific individuals. The attack depends on a property of some DNNs whereby small perturbations in an image can produce misclassifications – as described in the next paper:

Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.
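
A toy illustration of the effect, under heavy assumptions: it uses a 100-input logistic model rather than a deep network, and a single gradient-sign step rather than the optimisation described in the quote. All the numbers are invented. The point is only that many tiny nudges, each in the direction that increases the error, add up to a flipped prediction.

```python
import math


def predict(weights, bias, x):
    """Probability of class 1 under a logistic model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def perturb(weights, x, true_label, eps):
    """Shift every input by eps in the direction that raises the loss.

    For a logistic model the gradient of the cross-entropy loss with
    respect to the input is (p - y) * w, so its sign is sign(w) when
    the true label is 0 and -sign(w) when it is 1.
    """
    direction = -1 if true_label == 1 else 1
    return [xi + eps * direction * sign(w) for w, xi in zip(weights, x)]


# Invented model and input: 100 features, alternating weights
weights = [1.0, -1.0] * 50
bias = 0.0
x = [0.02, -0.02] * 50

before = predict(weights, bias, x)       # above 0.5, so class 1
adversarial = perturb(weights, x, true_label=1, eps=0.05)
after = predict(weights, bias, adversarial)  # below 0.5, so class 0
```

No single input moves by more than 0.05, yet the predicted class changes.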


Filter bubbles

How the Internet Is Loosening Our Grip on the Truth –

In a recent Pew Research Center survey, 81 percent of respondents said that partisans not only differed about policies, but also about “basic facts.”


Psychologists and other social scientists have repeatedly shown that when confronted with diverse information choices, people rarely act like rational, civic-minded automatons. Instead, we are roiled by preconceptions and biases, and we usually do what feels easiest – we gorge on information that confirms our ideas, and we shun what does not.

The spreading of misinformation online. Del Vicario, Michela and Bessi, Alessandro and Zollo, Fabiana and Petroni, Fabio and Scala, Antonio and Caldarelli, Guido and Stanley, H. Eugene and Quattrociocchi, Walter. Proceedings of the National Academy of Sciences, 113 (3). pp. 554-559. ISSN 1091-6490 (2016)


Many mechanisms cause false information to gain acceptance, which in turn generate false beliefs that, once adopted by an individual, are highly resistant to correction.


Our findings show that users mostly tend to select and share content related to a specific narrative and to ignore the rest. In particular, we show that social homogeneity is the primary driver of content diffusion, and one frequent result is the formation of homogeneous, polarized clusters.

The End of the Echo Chamber – Feb 2012. Summary of Facebook’s large-scale experiments in 2010 with selective removal of links on EdgeRank (fb newsfeed display algo).

If an algorithm like EdgeRank favors information that you’d have seen anyway, it would make Facebook an echo chamber of your own beliefs. But if EdgeRank pushes novel information through the network, Facebook becomes a beneficial source of news rather than just a reflection of your own small world.


… it doesn’t address whether those stories differ ideologically from our own general worldview. If you’re a liberal but you don’t have time to follow political news very closely, then your weak ties may just be showing you lefty blog links that you agree with – even though, under Bakshy’s study, those links would have qualified as novel information

What’s wrong with Big Data – Some interesting examples – pharmacology and chess, but overall argument a bit unclear.


Artificial Intelligence Is Helping The Blind To Recognize Objects

UK Hospitals Are Feeding 1.6 Million Patients’ Health Records to Google’s AI

Speak, Memory – When her best friend died, she rebuilt him using artificial intelligence – a chatbot version of a real person based on chatlogs. You could probably do that with me…