A friend of mine bought a cheap action camera on Amazon a few days ago. He asked me to take a look at the mobile app to see if it was possible to stream the live preview to a PC or a Raspberry Pi and use it as a cheap surveillance camera. The camera runs Android, and it’s possible to connect to ADB through USB.
The first step was to make it easier to connect to the camera. When using the mobile application, you have to be connected to the camera’s own WiFi access point. To be usable as a network surveillance camera, the camera has to connect to an existing WiFi network.
Using adb shell, I started to explore the camera and was greeted with this prompt:
root@camdroid:/ #
The ADB daemon runs as root by default. This made things easy: I pushed a new wpa_supplicant.conf file containing my WiFi configuration into /data/misc/wifi/:
update_config=1
ctrl_interface=DIR=/data/system/wpa_supplicant GROUP=wifi
eapol_version=1
ap_scan=1
fast_reauth=1
network={
ssid="MySSID"
psk="MyPSK"
key_mgmt=WPA-PSK
priority=241
}
For wpa_supplicant to start properly, I had to create the /data/system folder, and that already did the trick. The camera successfully associated with my WiFi network; it was just missing an IP address. Running dhcpcd wlan0 on the camera solved this quickly, and the camera could be reached through my local network.
The next step was to record the communication between the camera and the mobile application in order to reverse engineer the protocol and the commands used to start the preview stream. On the camera, a control daemon listens for connections on TCP port 6666. The client application connects to this daemon and sends a login packet to the server. The packet contains the username and the password required to access the camera. Neither is user-configurable inside the application. The username is admin, the password is 12345.
The basic packet structure looks like this:
The length field is an unsigned 16-bit little-endian integer, and its value excludes the header length (8 bytes). I only reverse engineered some of the commands:
Command              Description
0x00 0x00 0x01 0x10  Login
0x00 0x00 0x01 0x11  Login Accepted
0x00 0x00 0x01 0xFF  Start Preview Stream
0x00 0x00 0x11 0x12  Alive Request
0x00 0x00 0x11 0x13  Alive Response
0x00 0x00 0xA0 0x34  Request Firmware Information
0x00 0x00 0xA0 0x35  Firmware Information Response
The Login packet carries the username and password in its payload. Both are encoded as 64-byte ASCII strings with zero padding. While connected, the camera periodically pings the client using Alive Request packets. The client has to respond to these messages, otherwise it is considered disconnected. To start streaming the video preview, the client sends the 0x01 0xFF command with 8 bytes of empty payload (0x00).
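As a rough sketch of the Login payload described above, the two 64-byte zero-padded ASCII fields could be built like this in Node.js. The helper names fixedAscii and buildLoginPayload are made up for illustration, and the field order (username first, then password) is an assumption. The 8-byte packet header is deliberately left out.

```javascript
// Sketch: build the 128-byte Login payload (two 64-byte ASCII fields).
// Assumption: the username comes first, followed by the password.
function fixedAscii(text, size) {
  const field = Buffer.alloc(size); // zero-filled, so the padding is implicit
  const written = field.write(text, 0, size, 'ascii');
  if (written < text.length) {
    throw new RangeError('value does not fit into ' + size + ' bytes');
  }
  return field;
}

function buildLoginPayload(username, password) {
  return Buffer.concat([fixedAscii(username, 64), fixedAscii(password, 64)]);
}

const payload = buildLoginPayload('admin', '12345');
console.log(payload.length); // 128
```

The resulting length (128) is the value that would go into the header's length field, since that field excludes the header itself.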
When the client requests the camera to start streaming the preview, the video stream is sent via UDP to port 6669 of the client. The video stream, however, is wrapped in another protocol. The video protocol has fields for a sequence number, a message type, and the message length.
The sequence number starts at zero and increases monotonically with every packet. The message type can be 0x00 0x01 or 0x00 0x02. The first message type contains raw H.264 video data; the second contains a counter of the elapsed milliseconds of video stream and is sent after every cluster of video packets. The first test was to dump the complete payload of every type 0x01 packet into a file.
This worked right out of the box: ffplay is able to play this (headless) H.264 file flawlessly! The goal, however, was to send the video stream to VLC or any other RTP client.
I had a look at the RTP-message structure and how to transmit H.264 using RTP and created a crude and naive RTP implementation in JavaScript:
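// 12-byte RTP header: version 2, no padding/extension/CSRC, payload type 99
var rtpPacket = Buffer.alloc(12);
rtpPacket.writeUInt16BE(0x8063, 0);
// Mask the counter so it wraps at 16 bits instead of throwing a RangeError
rtpPacket.writeUInt16BE(sequenceNumber & 0xFFFF, 2);
// Elapsed milliseconds -> 90 kHz clock ticks, wrapped to 32 bits
rtpPacket.writeUInt32BE((elapsed * 90) % 0x100000000, 4);
// SSRC identifier, left at zero
rtpPacket.writeUInt32BE(0, 8);
rtpPacket = Buffer.concat([rtpPacket, frameBuffer]);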
The above code allocates a Buffer for an RTP header, writes the required header information and a dynamic payload type (99) into the header, and appends the frame buffer. The frame buffer is a Buffer filled with the concatenated contents of all type 0x01 packets between two type 0x02 packets (usually 5). The sequence number is a monotonically incremented counter; the value for elapsed is taken from the type 0x02 packet. The multiplication by 90 comes from the RTP payload format for H.264 (RFC 6184), which requires a 90 kHz RTP clock rate. The elapsed timer counts in milliseconds, so multiplying by 90 gives the number of elapsed clock ticks. The last field inside the RTP header is the SSRC identifier, which I just zeroed.
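To illustrate how the frame buffer could be collected, here is a minimal sketch that groups type 0x01 payloads until a type 0x02 packet closes the cluster. The FrameAssembler class and the { type, payload, elapsed } objects are hypothetical: they stand in for packets that have already been parsed from the UDP stream, and this is not necessarily how the code on GitHub does it.

```javascript
// Sketch: group type 0x01 payloads into one frame per type 0x02 marker.
// The input objects are a hypothetical, already parsed form of the UDP
// packets; the wire layout of the video protocol is not reproduced here.
class FrameAssembler {
  constructor() {
    this.chunks = [];
  }

  // Returns { elapsed, frame } when a type 0x02 packet closes a cluster,
  // otherwise null.
  push(packet) {
    if (packet.type === 0x01) {
      this.chunks.push(packet.payload);
      return null;
    }
    if (packet.type === 0x02 && this.chunks.length > 0) {
      const frame = Buffer.concat(this.chunks);
      this.chunks = [];
      return { elapsed: packet.elapsed, frame: frame };
    }
    return null;
  }
}

const assembler = new FrameAssembler();
assembler.push({ type: 0x01, payload: Buffer.from([0, 0, 0, 1]) });
assembler.push({ type: 0x01, payload: Buffer.from([0x67]) });
const result = assembler.push({ type: 0x02, elapsed: 40 });
console.log(result.frame.length, result.elapsed * 90); // 5 3600
```

In this example, 40 ms of elapsed video correspond to 40 * 90 = 3600 ticks of the 90 kHz RTP clock.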
To get VLC to play a raw RTP stream, you need an SDP description. This is a minimal (working) example:
m=video 8888 RTP/AVP 99
a=rtpmap:99 H264/90000
When opening this file, VLC will listen for an H.264 stream on UDP port 8888. For some weird reason, VLC is not able to play an already running stream, so VLC has to be started first. Once you start sending the generated RTP data to VLC, you will see the preview image on your computer.
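Sending the generated packets to VLC could look roughly like the following sketch, which uses Node's dgram module. The createRtpSender helper is a made-up name, and the host 127.0.0.1 assumes VLC runs on the same machine; port 8888 matches the SDP file above.

```javascript
const dgram = require('dgram');

// Sketch: thin wrapper around a UDP socket that forwards RTP packets.
// createRtpSender is a hypothetical helper, not part of any library.
function createRtpSender(port, host) {
  const socket = dgram.createSocket('udp4');
  return {
    // packet is a Buffer holding the RTP header plus the frame data
    send: function (packet) {
      socket.send(packet, port, host);
    },
    close: function () {
      socket.close();
    }
  };
}

// Usage (assumes VLC has already opened the SDP file on this machine):
//   const sender = createRtpSender(8888, '127.0.0.1');
//   sender.send(rtpPacket);
```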
So far there has not been much information about these cameras on the internet. My code is far from perfect, but it works as a proof of concept. Feel free to play with it and try to fix the stuttering issue. The complete JavaScript code is available on GitHub.