A speaking camera using Pi3 and Tensorflow

Danbri made one of these and I was so impressed I had a go myself, with a couple of tweaks. It’s very easy to do. He figured out everything that needed to be done – there’s something similar here which did the rounds recently. Others have done the really heavy lifting – in particular, making TensorFlow work on the Pi.

Barnoid has done lovely things with a similar but cloud-based setup for his Poetoid Lyricam – he used a captioner similar to this one that isn’t quite on a Pi with Python hooks yet (but nearly). (Barnoid update – he used Torch with neuraltalk2.)

The gap between taking a photo and it speaking is 3-4 seconds. Sometimes it seems to cache the last photo. It’s often wrong 🙂

I used USB audio and a powerfulish speaker. A DAC would also be a good idea.

Instructions

Image the pi and configure

diskutil list
diskutil unmountDisk /dev/diskN
sudo dd bs=1m if=~/Downloads/2016-09-23-raspbian-jessie.img of=/dev/rdiskN

log in to the pi, expand file system, enable camera

sudo raspi-config

optionally, add in usb audio or a DAC

Test

speaker-test

install pico2wave (I tried espeak but it was very indistinct)

sudo pico /etc/apt/sources.list

# Uncomment line below then 'apt-get update' to enable 'apt-get source'
deb-src http://archive.raspbian.org/raspbian/ jessie main contrib non-free rpi

sudo apt-get update
sudo apt-get install fakeroot debhelper libtool help2man libpopt-dev hardening-wrapper autoconf
sudo apt-get install automake1.11 # requires this version
mkdir pico_build
cd pico_build
apt-get source libttspico-utils
cd svox-1.0+git20130326 
dpkg-buildpackage -rfakeroot -us -uc
cd ..
sudo dpkg -i libttspico-data_1.0+git20130326-3_all.deb
sudo dpkg -i libttspico0_1.0+git20130326-3_armhf.deb
sudo dpkg -i libttspico-utils_1.0+git20130326-3_armhf.deb

test

sudo apt-get install mplayer
pico2wave -w test.wav "hello alan" && mplayer test.wav

install tensorflow on raspi

sudo apt-get install python-pip python-dev
wget https://github.com/samjabrahams/tensorflow-on-raspberry-pi/raw/master/bin/tensorflow-0.10.0-cp27-none-linux_armv7l.whl
sudo pip install tensorflow-0.10.0-cp27-none-linux_armv7l.whl
install prerequisites for classify_image.py
git clone https://github.com/tensorflow/tensorflow.git # takes ages
sudo pip install imutils picamera 
sudo apt-get install python-opencv

test

cd /home/pi/tensorflow/tensorflow/models/image/imagenet

install danbri / my hacked version of classify_image.py

mv classify_image.py classify_image.py.old
curl -O "https://gist.githubusercontent.com/libbymiller/afb715ac53dcc7b85cd153152f6cd75a/raw/2224179cfdc109edf2ce8408fe5e81ce5a265a6e/classify_image.py"

run

python classify_image.py
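classify_image.py prints its top guesses one per line with a score. To give a flavour of how the speaking part hangs together, here is a minimal sketch – the `top_label` helper is mine, and the commented-out capture/speak lines are assumptions about the wiring rather than the actual code in the gist:

```shell
#!/bin/sh
# Sketch: extract the top ImageNet label from classify_image.py output.
# classify_image.py prints lines like:
#   giant panda, panda bear, coon bear (score = 0.89107)
# top_label keeps the first line and strips the score.
top_label() {
  head -n 1 | sed 's/ *(score.*$//'
}

# The real loop would be roughly this (needs the Pi camera, so commented out):
#   raspistill -o /tmp/snap.jpg -t 500
#   python classify_image.py --image_file /tmp/snap.jpg | top_label > /tmp/label.txt
#   pico2wave -w /tmp/say.wav "I see $(cat /tmp/label.txt)" && mplayer /tmp/say.wav

echo "giant panda, panda bear, coon bear (score = 0.89107)" | top_label
```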

done!

 

Machine learning links

[work in progress – I’m updating it gradually]

Machine Learning

Google Apologizes After Photos App Autotags Black People as ‘Gorillas’ – a very upsetting and embarrassing misclassification. Flickr’s system did the same thing but in a less visible way.

How Vector Space Mathematics Reveals the Hidden Sexism in Language – very interesting work analysing Word2vec, and particularly their mechanisms for fixing the problem

There is a blind spot in AI research – Kate Crawford and Ryan Calo, Nature, October 2016 – a call for “A practical and broadly applicable social-systems analysis thinks through all the possible effects of AI systems on all parties”

A ProPublica investigation in May 2016 found that the proprietary algorithms widely used by judges to help determine the risk of reoffending are almost twice as likely to mistakenly flag black defendants as white defendants.

As a first step, researchers — across a range of disciplines, government departments and industry — need to start investigating how differences in communities’ access to information, wealth and basic services shape the data that AI systems train on.

Maciej Cegłowski – SASE Panel – Maciej on why not being able to understand the mechanisms by which ML systems come to their results is problematic, or as he puts it:

“Instead of relying on algorithms, which we can be accused of manipulating for our benefit, we have turned to machine learning, an ingenious way of disclaiming responsibility for anything. Machine learning is like money laundering for bias.”

All it takes to steal your face is a special pair of glasses Рreport on a paper experimentally tricking a commercial face recognition system into misidentifying  people as specific individuals. Depends on a feature of some DNNs that means that small perturbations in an image can produce misclassifications Рas described in the next paper:

Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain hardly perceptible perturbation, which is found by maximizing the network’s prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.

 

Filter bubbles

How the Internet Is Loosening Our Grip on the Truth –

In a recent Pew Research Center survey, 81 percent of respondents said that partisans not only differed about policies, but also about “basic facts.”

[…]

Psychologists and other social scientists have repeatedly shown that when confronted with diverse information choices, people rarely act like rational, civic-minded automatons. Instead, we are roiled by preconceptions and biases, and we usually do what feels easiest — we gorge on information that confirms our ideas, and we shun what does not.

The spreading of misinformation online. Del Vicario, Michela; Bessi, Alessandro; Zollo, Fabiana; Petroni, Fabio; Scala, Antonio; Caldarelli, Guido; Stanley, H. Eugene; Quattrociocchi, Walter. Proceedings of the National Academy of Sciences, 113 (3), pp. 554-559. ISSN 1091-6490 (2016)

 

Many mechanisms cause false information to gain acceptance, which in turn generate false beliefs that, once adopted by an individual, are highly resistant to correction.

[…]

Our findings show that users mostly tend to select and share content related to a specific narrative and to ignore the rest. In particular, we show that social homogeneity is the primary driver of content diffusion, and one frequent result is the formation of homogeneous, polarized clusters.

The End of the Echo Chamber – Feb 2012. Summary of Facebook’s large-scale experiments in 2010 with selective removal of links in EdgeRank (Facebook’s newsfeed ranking algorithm).

If an algorithm like EdgeRank favors information that you’d have seen anyway, it would make Facebook an echo chamber of your own beliefs. But if EdgeRank pushes novel information through the network, Facebook becomes a beneficial source of news rather than just a reflection of your own small world.

[…]

… it doesn’t address whether those stories differ ideologically from our own general worldview. If you’re a liberal but you don’t have time to follow political news very closely, then your weak ties may just be showing you lefty blog links that you agree with — even though, under Bakshy’s study, those links would have qualified as novel information

What’s wrong with Big Data – some interesting examples – pharmacology and chess – but the overall argument is a bit unclear.

Applications

Artificial Intelligence Is Helping The Blind To Recognize Objects

UK Hospitals Are Feeding 1.6 Million Patients’ Health Records to Google’s AI

Speak, Memory – When her best friend died, she rebuilt him using artificial intelligence – a chatbot version of a real person based on chatlogs. You could probably do that with me…

 

A presence robot with Chromium, WebRTC, Raspberry Pi 3 and EasyRTC

Here’s how to make a presence robot with Chromium 51, WebRTC, Raspberry Pi 3 and EasyRTC. It’s actually very easy, especially now that Chromium 51 comes with Raspbian Jessie, although it’s taken me a long time to find the exact incantation.

If you’re going to use it for real, I’d suggest using the Jabra 410 speaker/mic. I find that audio is always the most important part of a presence robot, and the Jabra provides excellent sound for a meeting of 5-8 people and will work for larger groups too. I’ve had the most reliable results using a separate power supply for the Jabra, via a powered hub. The whole thing still occasionally fails, so this is a work in progress. You’ll need someone at the other end to plug it in for you.

I’ve had fair success with a “portal” type setup with the Raspberry Pi touchscreen, but it’s hard to combine the Jabra and the screen in a useful box.

[photo: the assembled presence robot]

As you can see, the current container needs work:

[photo: the current container]

Next things for me will be some sort of expressivity and/or movement. Tristan suggests emoji. Tim suggests pipe-cleaner arms. Henry’s interested more generally in emotion expressed via movement. I want to be able to rotate. All can be done via the WebRTC data channel, I think.

You will need

  • Raspberry Pi 3 + SD card + 2.5A power supply
  • Jabra mic
  • Powered USB hub (I like this one)
  • A Pi camera – I’ve only tested it with a V1
  • A screen (e.g. this TFT)
  • A server, e.g. a Linode, running Ubuntu 16 LTS. I’ve had trouble with AWS for some reason, possibly a ports issue.

Instructions

Set up the Pi

(don’t use jessie-lite, use jessie)

diskutil list
diskutil unmountDisk /dev/diskN
sudo dd bs=1m if=~/Downloads/2016-09-23-raspbian-jessie.img of=/dev/rdiskN

Log in.

sudo raspi-config

expand file system, enable camera (and spi if using a TFT) and boot to desktop, logged in

Update everything

sudo apt-get update && sudo apt-get upgrade

Set up wifi

 sudo pico /etc/wpa_supplicant/wpa_supplicant.conf
 
 network={
   ssid="foo"
   psk="bar"
 }
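If the Pi moves between locations, wpa_supplicant can hold several network blocks and will prefer the one with the highest `priority`. A sketch – the ssids and psks here are placeholders:

```
network={
  ssid="home"
  psk="homepassword"
  priority=2
}

network={
  ssid="office"
  psk="officepassword"
  priority=1
}
```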

Add drivers

sudo pico /etc/modules
i2c-dev
snd-bcm2835
bcm2835-v4l2

Add V4L2 video drivers (so Chromium picks up the camera): argh

sudo nano /etc/modprobe.d/bcm2835-v4l2.conf
options bcm2835-v4l2 gst_v4l2src_is_broken=1

Argh: USB audio

sudo pico /boot/config.txt 

#dtparam=audio=on ## comment this out
sudo pico /lib/modprobe.d/aliases.conf
#options snd-usb-audio index=-2 # comment this out
sudo pico ~/.asoundrc
defaults.pcm.card 1;
defaults.ctl.card 0;

Add mini tft screen (see http://www.spotpear.com/learn/EN/raspberry-pi/Raspberry-Pi-LCD/Drive-the-LCD.html )

curl -O http://www.spotpear.com/download/diver24-5/LCD-show-160811.tar.gz
tar -zxvf LCD-show-160811.tar.gz
cd LCD-show/
sudo ./LCD35-show

Rename the bot

sudo pico /etc/hostname
sudo pico /etc/hosts

You may need to enable camera again via sudo raspi-config

Add autostart

pico ~/.config/lxsession/LXDE-pi/autostart
@lxpanel --profile LXDE-pi
@pcmanfm --desktop --profile LXDE-pi
@xscreensaver -no-splash
@xset s off
@xset -dpms
@xset s noblank
#@v4l2-ctl --set-ctrl=rotate=270 # if you need to rotate the camera picture
@/bin/bash /home/pi/start_chromium.sh
pico start_chromium.sh
#!/bin/bash
myrandom=$RANDOM
#@rm -rf /home/pi/.config/chromium/
/usr/bin/chromium-browser --kiosk --disable-infobars --disable-session-crashed-bubble --no-first-run https://your-server:8443/bot.html#$myrandom &
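The `#$myrandom` fragment appends a random number to the kiosk URL, presumably so each boot loads a unique address rather than a stale cached page (that reading is my assumption – it isn’t documented in the script). A quick demo of what it produces:

```shell
# $RANDOM is a bash builtin returning an integer between 0 and 32767,
# so every boot gets a URL like https://your-server:8443/bot.html#12345
myrandom=$RANDOM
echo "https://your-server:8443/bot.html#$myrandom"
```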

Assemble everything:

  • Connect the USB hub to the Raspberry Pi
  • Connect the Jabra to the USB hub
  • Attach the camera and TFT screen

On the server

Add keys for login

mkdir ~/.ssh
chmod 700 ~/.ssh
pico ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

Install and configure Apache (I used this guide for letsencrypt)

sudo apt-get install apache2
sudo mkdir -p /var/www/your-server/public_html
sudo chown -R $USER:$USER /var/www/your-server/public_html
sudo chmod -R 755 /var/www
nano /var/www/your-server/public_html/index.html
sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/your-server.conf
sudo nano /etc/apache2/sites-available/your-server.conf
   
<VirtualHost *:80>     
        ServerAdmin webmaster@localhost
        ServerName your-server
        ServerAlias your-server
        ErrorLog ${APACHE_LOG_DIR}/your-server_error.log
        CustomLog ${APACHE_LOG_DIR}/your-server_access.log combined
RewriteEngine on
RewriteCond %{SERVER_NAME} = your-server
RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,QSA,R=permanent]
</VirtualHost>
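The vhost above only handles port 80 and redirects everything to https; the letsencrypt Apache plugin (run below under “Add certs”) generates the matching :443 vhost for you. Roughly what that ends up looking like – a sketch, with certificate paths assumed from letsencrypt’s defaults:

```
<VirtualHost *:443>
        ServerName your-server
        DocumentRoot /var/www/your-server/public_html
        SSLEngine on
        SSLCertificateFile /etc/letsencrypt/live/your-server/fullchain.pem
        SSLCertificateKeyFile /etc/letsencrypt/live/your-server/privkey.pem
</VirtualHost>
```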
sudo a2enmod rewrite # the RewriteRule above needs mod_rewrite enabled
sudo a2ensite your-server.conf
sudo service apache2 restart

Add certs

You can’t skip this part – Chrome and Chromium won’t allow camera and microphone access without https

sudo apt-get install git
sudo git clone https://github.com/letsencrypt/letsencrypt /opt/letsencrypt
cd /opt/letsencrypt
./letsencrypt-auto --apache -d your-server
sudo mkdir -p /var/log/lets-encrypt

Auto-renew certs

sudo /opt/letsencrypt/letsencrypt-auto renew >> /var/log/lets-encrypt/le-renew.log
crontab -e
# m h  dom mon dow   command
30 2 * * 1 /opt/letsencrypt/letsencrypt-auto renew >> /var/log/lets-encrypt/le-renew.log

Get and install the EasyRTC code

Install node

curl -sL https://deb.nodesource.com/setup | sudo bash -

sudo apt-get install -y nodejs

Install the easyrtc api

cd /var/www/your-server/
git clone https://github.com/priologic/easyrtc

Replace the server part with my version

cd server
rm -r *
git clone https://github.com/libbymiller/libbybot.git
cd ..
sudo npm install

Run the node server

nohup node server.js &
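nohup gets it running, but the server won’t come back after a crash or a reboot. A systemd unit is one alternative – this is only a sketch, and the unit name, WorkingDirectory and paths are assumptions you’d adjust for your layout:

```
# /etc/systemd/system/easyrtc.service (sketch)
[Unit]
Description=EasyRTC signalling server for the presence robot
After=network.target

[Service]
WorkingDirectory=/var/www/your-server/easyrtc
ExecStart=/usr/bin/node server.js
Restart=always

[Install]
WantedBy=multi-user.target
```

Then `sudo systemctl enable easyrtc && sudo systemctl start easyrtc` replaces the nohup line.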

Finally

Boot up the pi, and on your other machine go to

https://your-server:8443/remote.html

in Chrome.

When the Pi boots up it should go into full-screen Chromium at https://your-server:8443/bot.html – there should be a prompt to accept the audio and video on the Pi – you need to accept that once and then it’ll work.

Troubleshooting

Camera light doesn’t go on

Re-enable the camera using

sudo raspi-config

No video

WebRTC needs a lot of ports open. With this config we’re just using some default STUN and TURN ports. On most wifi networks it should work, but on some restricted or corporate networks you may have trouble. I’ve not tried running my own TURN servers, which in theory would help with this.

No audio

I find Linux audio incredibly confusing. The config above is based around this answer. YMMV, especially if you have other devices attached.

Working from home

A colleague asked me about my experiences working from home, so I’ve made a few notes here.

I’m unusual in my department in that I work from home three or four days a week, and one or two in London, or very occasionally Salford. I started off in this job on an EU-funded project where everyone was remote, so it made little difference where I was physically as long as we synced up regularly. Since then I’ve worked on multiple other projects where the other participants are mostly in one place and I’m elsewhere. That’s made it more difficult, but also, sometimes, better.

A buddy

Where everyone else is in one place, the main thing I need to function well is one or more buddies who are physically there, who remember to call me in for meetings and let me know about anything significant that I’m missing because I’m not physically there. The first of these is the most important. Being remote, you are easily forgettable. Without Andrew, Dan, Joanne, Tristan, and now Henry and Tim, I’d sometimes be left out.

IRC or slack

I’ve used IRC for years for various remote things (we used to do “scheduled topic chats” 15 years ago on freenode for various Semantic Web topics), with the various bots that keep you informed and help you share information easily – loggers and @Edd’s “chump” in particular, but also #swhack bots of many interesting kinds. I learned a huge amount from friends at W3C, who are mostly remote from each other and have made lots of tools and bots for helping them manage conference calls over many years.

Recently our team has started using Slack as well as IRC, so now I’m on both. Slack means that a much more diverse set of people are happy to participate, which is great. It can be very boring working on your own, and these channels make for a sense of community, as well as being useful for specific, timely exchanges of information.

Lots of time on organisation

I spend a lot of time figuring out where I need to be and making decisions about what’s most important: what needs to be face to face and what can be a call. Also: trying to figure out how annoying I’m going to be to the other people in a meeting, and whether I’m going to be able to contribute successfully, or whether it’s best to skip it. I’ve had to learn to ignore the FOMO.

I have a text-based todo list, which can get a little out of control, but in general it has high-level goals for this week and next, goals for the day, and specific tasks that need to be done on a particular day or at a particular time. I spend a little time each morning figuring these out and making sure I have a good sense of my calendar (Dan Connolly taught me to do this!). In general, juggling urgent, project-managery work and less-urgent exploratory work is difficult, and I probably don’t do enough of the latter (and I probably don’t look far enough ahead, either). I sometimes schedule my day quite concretely, with tasks at specific times, to make sure I devote thinking time to specific problems, or when I have a ton to do or a lot of task switching.

Making an effort not to work

Working at home means I could work any time, and having an interesting job means I’d probably quite enjoy it, too. There’s a temptation to do the boring admin stuff during work hours and leave the fun stuff until things are quieter in the evenings or at the weekend. But I make an effort not to do this, and it helps that the team I work in don’t work late or at weekends. This is a good thing: we need downtime or we get depleted (I did in my last job, a startup, where I also worked at home most of the time, and where we were spread across multiple timezones).

Weekends are fairly easy not to work in; evenings are harder, so I schedule other things where possible (Bristol Hackspace, cinema, watching something specific on TV, other personal technical projects).

Sometimes you just have to be there

I’m pretty good at doing meetings remotely, but we do a lot of workshops which involve getting up and doing things, writing on whiteboards and so on. I also chair a regular meeting that I feel works better if I’m there. When I need to be there for a few days, I’m lucky enough to be able to stay with some lovely friends, which makes it a pleasure rather than annoying and boring not to be at home.

What I miss and downsides

What I miss is the unscheduled time working or just hanging out with people. When I’m in London my time is usually completely scheduled, which is pretty knackering. Socialising gets crammed into short trips to the pub. The commute means I lose my evening at least once a week and sometimes arrive at work filled with train-rage (I guess the latter is normal for anyone who commutes by rail).

Not being in the same place as everyone day to day means that I miss some of the upsides and downsides of being physically there, which are mostly about spontaneity: I never get included in ad-hoc meetings, so I have more time to concentrate but also miss some interesting things; I don’t get distracted by fun or not-fun things, including bad moods in the organisation and gossip, but also impromptu games, fun trips out, etc.

And finally…

For me, working from home in various capacities has given me opportunities I’d never have had, and I’m very lucky to be able to do it in my current role.

Wifi-connect – quick wifi access point to tell a Raspberry Pi about a wifi network

This is all Andrew Nicolaou’s work. I’m just making a note of it here so others can have a go.

An important part of Radiodan is the way it simplifies connecting a device to a wifi network. The pattern is more common now for screenless devices РChromecast uses it and ESPs have code patterns for it.

The idea is that if it can’t find a known wifi network, the device creates its own access point; you connect to that from another device such as a phone or laptop, it pops up a web page, and you fill in the details of the nearby wifi network that you want it to connect to.

Andrew, Dan Nuttall and Chris Lowis wrote the original code – which I wrote up here – and then recently Andrew investigated Resin’s approach, which seems to be more reliable. Resin uses their own platform and Docker images, which we’re not using, so Andrew un-dockerised it, and has recently rolled it into the new iteration of Radiodan that we’re working on.

If you want to use it in your own project without Radiodan, here are some instructions. It uses a branch of the Radiodan provisioning code, but just installs the relevant pieces and doesn’t delete anything.

First make sure you have a wifi card with the right chipset – or a Pi 3 (scroll down for the special Pi 3 instructions). Then:

Provision an SD card (this is on Mac OS X)

diskutil list
diskutil unmountDisk /dev/disk2
sudo dd bs=1m if=~/Downloads/2016-02-09-raspbian-jessie.img of=/dev/rdisk2

Put it in the Pi, login, expand the filesystem, reboot and login again.

Check out the Radiodan code and provision the relevant parts.

sudo apt-get update -y && sudo apt-get upgrade -y
git clone https://github.com/radiodan/provision.git
cd provision
git fetch origin
git checkout -b minimal origin/minimal
sudo ./provision iptables node wifi-connect

Reboot. Wait a minute or two and you’ll see a wifi access point called “radiodan-configuration”. Connect to it and a browser window will pop up. Select the wifi network you want the Pi to connect to, add the password, and save. Connect back to the wifi network you selected and you should be able to ssh to the Pi at pi@raspberrypi.local

For a Raspberry Pi 3, you’ll need to tweak a couple of things to enable the built-in wifi:

sudo apt-get install raspi-config
sudo BRANCH=next rpi-update