Revisiting a Video Chat Application with modern JavaScript and Elixir/Phoenix
In a previous post we talked about implementing a simple video chat with WebRTC and Elixir. This update will touch on some of the API changes that have happened since.
NOTE: I will also be using JavaScript's async/await, closures, and modules.
For starters, the newest version of Elixir/Phoenix will not be compatible with the code from the previous post due to changes in the dependencies. I'm writing this for Elixir v1.9.2 and Phoenix 1.4.10. So let's jump right in! If you've already followed the installation guide for Phoenix, then we should be able to start with the following command:
mix phx.new phoenix_webrtc
Be sure to hit y to have the dependencies installed for the front-end.
NOTE: In my code on GitHub, you'll notice that I used phoenix_webrtc_revisited as the project name.
This will create our project directory and set up webpack and the front-end, which will make building the simple version of our video chat much easier.
Since the project is pretty well bootstrapped, we can skip creating the controllers and handling the response with our index page. Instead, we can jump straight to getting our Channel set up for the traffic we expect.
This means un-commenting the channels setup, and changing the name of the topic that we expect for the handler:
# lib/phoenix_webrtc_web/channels/user_socket.ex
# ...
# Channels
channel "video:*", PhoenixWebrtcWeb.VideoChannel
Now that we have that set up, we need to create the handler for the channel.
Create a file in the same directory called video_channel.ex and put this in it:
# lib/phoenix_webrtc_web/channels/video_channel.ex
defmodule PhoenixWebrtcWeb.VideoChannel do
  use Phoenix.Channel

  def join("video:peer2peer", _message, socket) do
    {:ok, socket}
  end

  def handle_in("peer-message", %{"body" => body}, socket) do
    broadcast_from!(socket, "peer-message", %{body: body})
    {:noreply, socket}
  end
end
What we're doing here is setting up the sub-topic join with "video:peer2peer". We also want to handle sending the negotiation messages back out, so we need to create a handle_in function to deal with that. Using the broadcast_from! function will broadcast to all except the socket that sent the originating message. This is nice since we don't have to make that distinction on the client side.
That's largely all the back-end setup that is required to get a basic example working. We'll have quite a bit more to do on the front-end, however, since the WebRTC APIs have changed more than the server side since the last post. Let's start with some plain HTML.
<!-- lib/phoenix_webrtc_web/templates/page/index.html.eex -->
<label for="local-stream">Local Video Stream</label>
<video id="local-stream" autoplay muted></video>
<label for="remote-stream">Remote Video Stream</label>
<video id="remote-stream" autoplay></video>
<button id="connect">Connect</button>
<button id="call">Call</button>
<button id="disconnect">Disconnect</button>
A pretty simple layout with two video tags to handle the streams. We have a connect button so that we can start the requests when we're ready (i.e. once we've got the console open in both tabs), as well as call and disconnect buttons. Make sure you add the autoplay attribute so that the video starts as soon as you assign the stream to the element. Muting the local video will also help prevent audio feedback.
Moving on to the actual client-side code, we'll update the socket.js file to appear as follows:
// assets/js/socket.js
import { Socket } from 'phoenix';

let socket = new Socket('/socket', { params: { token: window.userToken } });
socket.connect();

let channel = socket.channel('video:peer2peer', {});
channel
  .join()
  .receive('ok', resp => {
    console.log('Joined successfully', resp);
  })
  .receive('error', resp => {
    console.error('Unable to join', resp);
  });

export default channel;
There are a lot of comments generated by phx.new, and we can go ahead and delete them. Additionally, we need to update the topic and subtopic of our channel connection. I personally like to use console.error to report errors, since it makes them easier to filter in the dev-tools. We're also only going to export the channel, since we're not going to be connecting to any other topics.
Now we can wire up our buttons and capture our initial application state:
// assets/js/app.js
import css from '../css/app.css';
import 'phoenix_html';
import channel from './socket';

const connectButton = document.getElementById('connect');
const callButton = document.getElementById('call');
const disconnectButton = document.getElementById('disconnect');
const remoteVideo = document.getElementById('remote-stream');
const localVideo = document.getElementById('local-stream');

// let, not const: we replace the remote stream on disconnect
let remoteStream = new MediaStream();
setVideoStream(remoteVideo, remoteStream);

let peerConnection;

disconnectButton.disabled = true;
callButton.disabled = true;

connectButton.onclick = connect;
callButton.onclick = call;
disconnectButton.onclick = disconnect;
Two things to note here:

1. We're creating an empty MediaStream for the remote video element and setting it as the stream right off. We can do this because we only expect to have the one remote video source, and it's pretty easy to handle this way.
2. We're declaring a module-level variable to hold the peer connection. This is a convenience that lets us easily access it from within the handlers. There are other approaches that might be better, but this quick-and-dirty approach allows us to write a simpler example.
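As one alternative to the bare module-level variable, a sketch of my own (not code from this project) is to hold the connection in a closure behind a small, explicit interface; this is one of the places the closures mentioned in the opening NOTE come in handy:

```javascript
// Hypothetical alternative: wrap the peer connection in a closure so the
// rest of the code goes through an explicit interface. Names are illustrative.
function makeConnectionRef() {
  let current = null; // the RTCPeerConnection, once one exists
  return {
    set(pc) { current = pc; },
    get() { return current; },
    clear() { current = null; },
  };
}

// Usage sketch:
const connectionRef = makeConnectionRef();
// connectionRef.set(new RTCPeerConnection(...)); later: connectionRef.clear();
```

We'll stick with the plain variable here to keep the example short.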
As a bit of an aside, since you're going to see these invoked in examples, here's the definition of my logging helpers:
// assets/js/app.js
const reportError = where => error => {
  console.error(where, error);
};

function log() {
  console.log(...arguments);
}
I just find them helpful and terse enough to add to small projects easily. The reportError function is curried because that makes it easier to pass in as a handler method to catch; reporting the error alone without context is sometimes confusing.
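To make the currying concrete, here's a quick usage sketch (the 'createOffer' context string is just an example of mine):

```javascript
// The curried helper pre-binds the "where" context:
const reportError = where => error => {
  console.error(where, error);
};

// reportError('createOffer') evaluates to a one-argument function, so it
// can be handed straight to a promise's .catch:
const handler = reportError('createOffer');
Promise.reject(new Error('boom')).catch(handler);
```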
This is where we're going to diverge from the original. In the first post we relied on callbacks in the API, but these are now deprecated in favor of promises. Given the nature of some of those promises, I think async/await syntax makes for much more readable code. This will require a little configuration change, so we'll need to update our assets/.babelrc as such:
{
  "presets": [
    [
      "@babel/preset-env",
      {
        "useBuiltIns": "usage"
      }
    ]
  ]
}
This just tells babel to supply the polyfills for our async/await syntax based on whether we use it. It may produce a warning about core-js, but this can be ignored for now.
Now we can start writing in our app.js, so we'll go ahead and make a pair of helpers to set/unset the video stream object:
function setVideoStream(videoElement, stream) {
  videoElement.srcObject = stream;
}

function unsetVideoStream(videoElement) {
  if (videoElement.srcObject) {
    videoElement.srcObject.getTracks().forEach(track => track.stop());
  }
  videoElement.removeAttribute('src');
  // srcObject is a property, not a content attribute, so clear it directly
  videoElement.srcObject = null;
}
Setting the video stream to an element is easy enough, but unsetting it requires us to loop through the tracks and stop them prior to removing them. This just tells the device that the media is no longer required, instead of waiting for the object to be garbage collected.
We're going to call mediaDevices.getUserMedia to get a promise for the media stream object. This promise resolves once the user allows the page access to their device. If they never respond to the prompt, the promise simply never settles; if they deny access, it rejects. Since we can't really do any video chatting without the stream, we can safely hinge the program on an await of this promise. So we'll set up the local stream in the connect() function like so:
async function connect() {
  connectButton.disabled = true;
  disconnectButton.disabled = false;
  callButton.disabled = false;
  const localStream = await navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true,
  });
  setVideoStream(localVideo, localStream);
}
Feel free to separate out the media constraints object passed into the getUserMedia method.
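For instance, a named constraints object might look like this; the resolution hints are illustrative values of my own, not something the original code sets:

```javascript
// Hypothetical: pull the constraints into a named object so they're easy
// to tweak in one place. The ideal width/height are example values.
const mediaConstraints = {
  audio: true,
  video: {
    width: { ideal: 1280 },
    height: { ideal: 720 },
  },
};

// Then the call becomes:
// const localStream = await navigator.mediaDevices.getUserMedia(mediaConstraints);
```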
We're also setting the buttons' disabled state to reflect the fact that we are going to be connected. We can start fleshing out the disconnect button at this point:
function disconnect() {
  connectButton.disabled = false;
  disconnectButton.disabled = true;
  callButton.disabled = true;
  unsetVideoStream(localVideo);
  unsetVideoStream(remoteVideo);
  remoteStream = new MediaStream();
  setVideoStream(remoteVideo, remoteStream);
}
You should be able to start up the server and see the page rendering out your video streams. If you click the connect button you'll be prompted for permissions, and if you allow it, your camera will start streaming to the local video element!
But we don't really want to just stare ourselves down, so we need to set up the actual RTC portion with our peer. This will require that we create an RTCPeerConnection object and set up the correct event handlers on it:
async function connect() {
  // ...
  peerConnection = createPeerConnection(localStream);
}

// ...

function createPeerConnection(stream) {
  let pc = new RTCPeerConnection({
    iceServers: [
      // Information about ICE servers - Use your own!
      {
        urls: 'stun:stun.stunprotocol.org',
      },
    ],
  });
  pc.ontrack = handleOnTrack;
  pc.onicecandidate = handleIceCandidate;
  stream.getTracks().forEach(track => pc.addTrack(track));
  return pc;
}
In order to keep the connect() function fairly clean, we split out the creation logic for the connection object. We'll need to flesh out those event handlers though. It's also worth noting that there are several more handlers than this; however, the addstream handler from the previous post is now deprecated.
The createPeerConnection helper takes a MediaStream as an argument. This allows us to add the MediaStreamTracks for the local stream to the peer connection. Essentially this ensures that once we connect with a peer, they'll be receiving the stream from our machine to theirs.
We should also go ahead and close the connection and nullify the module level variable when we disconnect:
function disconnect() {
  connectButton.disabled = false;
  disconnectButton.disabled = true;
  callButton.disabled = true;
  unsetVideoStream(localVideo);
  unsetVideoStream(remoteVideo);
  remoteStream = new MediaStream();
  setVideoStream(remoteVideo, remoteStream);
  peerConnection.close();
  peerConnection = null;
}
NOTE: In an actual application, you might want to clear the event handlers on the connection prior to closing it to avoid errors while it's being shut down.
Once we have our RTCPeerConnection, we can use it to create an offer for connection. This is essentially calling another peer, so we'll put this into the call function. The createOffer method will return a promise that resolves to an RTCSessionDescription. This description is what we need to send over the signal server. We can await this description in the call function as well, and then set the localDescription of our peer connection as follows:
async function call() {
  let offer = await peerConnection.createOffer();
  // setLocalDescription also returns a promise, so await it before pushing
  await peerConnection.setLocalDescription(offer);
  channel.push('peer-message', {
    body: JSON.stringify({
      type: 'video-offer',
      content: offer,
    }),
  });
}
Because we have several types of messages that require specific methods to be called in response, we add a type attribute to our message body. Essentially we'll have four types: 'video-offer', 'video-answer', 'ice-candidate', and 'disconnect'. The 'ice-candidate' type refers to Interactive Connectivity Establishment (ICE).
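Since these four strings have to match on both ends of the channel, it can help to keep them in one place. A small sketch (my own helper, not part of the original code):

```javascript
// Hypothetical: the four message types the client understands, in one list
const MESSAGE_TYPES = ['video-offer', 'video-answer', 'ice-candidate', 'disconnect'];

// Checks whether an incoming message uses a type we know how to handle
function isKnownPeerMessage(message) {
  return MESSAGE_TYPES.includes(message.type);
}
```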
The logic for sending a message will get pretty repetitive with four message types, so we can go ahead and refactor the call function with a new pushPeerMessage helper:
// ...
function pushPeerMessage(type, content) {
  channel.push('peer-message', {
    body: JSON.stringify({
      type,
      content,
    }),
  });
}

// ...

async function call() {
  // ... all the code we wrote before channel.push
  pushPeerMessage('video-offer', offer);
}
This will make it quite a bit easier to chat with the server without needing to type out the structure of the message, and will save us from typos in the message string. Not that that's ever been a problem for me...
Let's flesh out the event handlers that we haven't defined yet, handleOnTrack and handleIceCandidate. These two functions are critical for negotiating the peer connection through the signal server. For now though, we can just inspect their arguments to get an idea of how the program flows:
function handleOnTrack(event) {
  log(event);
}

function handleIceCandidate(event) {
  log(event);
}
Go ahead and run the application to see what gets logged out when you connect. You should see some number of icecandidate events. If you inspect those, you'll see that they have a candidate member. This is what we want to communicate to the other peer for our ICE. Once the peers settle on a candidate, they can establish the p2p connection and start sending packets!
So we'll just flesh out the handler to send that message to our peer:
function handleIceCandidate(event) {
  if (event.candidate) {
    pushPeerMessage('ice-candidate', event.candidate);
  }
}
We're guarding here against a null candidate, which is an indication that the candidate-gathering process is done. There are other ways of handling this, but they would complicate the simplicity of this example.
With that, we can go ahead and start thinking about receiving the messages:
channel.on('peer-message', payload => {
  const message = JSON.parse(payload.body);
  switch (message.type) {
    case 'video-offer':
      log('offered: ', message.content);
      break;
    case 'video-answer':
      log('answered: ', message.content);
      break;
    case 'ice-candidate':
      log('candidate: ', message.content);
      break;
    case 'disconnect':
      disconnect();
      break;
    default:
      reportError('unhandled message type')(message.type);
  }
});
If you run the application and connect then call, you should see the messages hit the server, but nothing is output to the logs. This confirms that we aren't broadcasting the message back to ourselves.
If you open two tabs and click connect in both, then call in tab 1, you should see the logs on tab 2. They should start with the initial 'video-offer', and then follow with the 'ice-candidate's.
We're not going to implement anything to allow the user to accept/decline an incoming call, so we'll just write a helper to push back an answer:
async function answerCall(offer) {
  let remoteDescription = new RTCSessionDescription(offer);
  peerConnection.setRemoteDescription(remoteDescription);
  let answer = await peerConnection.createAnswer();
  peerConnection
    .setLocalDescription(answer)
    .then(() =>
      pushPeerMessage('video-answer', peerConnection.localDescription)
    );
}
We can just call that from the 'video-offer' case with our message content:
channel.on('peer-message', payload => {
  const message = JSON.parse(payload.body);
  switch (message.type) {
    // ...
    case 'video-offer':
      log('offered: ', message.content);
      answerCall(message.content);
      break;
    // ...
  }
});
Now you should be able to connect between two tabs, and when you call from one you'll see the answer logged out (along with some new 'ice-candidates'). At this point though, you'll get an error about the ICE negotiation failing. This is because we aren't actually adding the ICE candidates to our peer connection yet. This is simple enough though:
// ...
case 'ice-candidate':
  log('candidate: ', message.content);
  let candidate = new RTCIceCandidate(message.content);
  // reportError is curried, so call it with a context string first
  peerConnection.addIceCandidate(candidate).catch(reportError('addIceCandidate'));
  break;
// ...
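One wrinkle worth knowing about, even though this example ignores it: an 'ice-candidate' message can arrive before setRemoteDescription has run, in which case addIceCandidate will reject. A minimal buffering sketch, with names of my own invention rather than code from this project:

```javascript
// Hypothetical sketch: queue candidates that arrive early, and flush them
// once the remote description has been set.
function makeCandidateQueue(addCandidate) {
  const pending = [];
  let ready = false;
  return {
    push(candidate) {
      if (ready) {
        addCandidate(candidate); // remote description is set; add directly
      } else {
        pending.push(candidate); // too early; hold on to it
      }
    },
    flush() {
      ready = true;
      while (pending.length) {
        addCandidate(pending.shift());
      }
    },
  };
}
```

You'd call flush() right after setting the remote description, and route incoming candidates through push() instead of calling addIceCandidate directly.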
The last thing to do is to handle receiving an answer from the remote client. This is where we finally see the fruits of our labor! We're going to need to set the remote description again here, so we can reduce some duplication in the answerCall function by breaking out the following:
function receiveRemote(offer) {
  let remoteDescription = new RTCSessionDescription(offer);
  peerConnection.setRemoteDescription(remoteDescription);
}

async function answerCall(offer) {
  receiveRemote(offer);
  let answer = await peerConnection.createAnswer();
  peerConnection
    .setLocalDescription(answer)
    .then(() =>
      pushPeerMessage('video-answer', peerConnection.localDescription)
    );
}
Now we can receive the video answer easily enough:
case 'video-answer':
  log('answered: ', message.content);
  receiveRemote(message.content);
  break;
Simply setting up the remote isn't enough though. If you try it out, you'll quickly notice that the remote video isn't actually showing us anything. This is because the stream's tracks aren't ever handled, so we need to jump back into handleOnTrack:
function handleOnTrack(event) {
  remoteStream.addTrack(event.track);
}
Finally, we just need to make sure that our disconnect is being communicated to the peer:
function disconnect() {
  connectButton.disabled = false;
  disconnectButton.disabled = true;
  callButton.disabled = true;
  unsetVideoStream(localVideo);
  unsetVideoStream(remoteVideo);
  remoteStream = new MediaStream();
  setVideoStream(remoteVideo, remoteStream);
  // Only close and notify when a connection exists, so two peers don't
  // bounce 'disconnect' messages back and forth forever.
  if (peerConnection) {
    peerConnection.close();
    peerConnection = null;
    pushPeerMessage('disconnect', {});
  }
}
And with that, this example is done. You should now be able to deploy this and connect between peers.
NOTE: If your connection doesn't work, it's possible that the peer that you wish to connect to is behind a symmetric NAT or some other restrictive firewall. You'll need to set up a TURN server as a relay for the peer-to-peer connection. This is outside the scope of this post though.
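For reference, a TURN entry slots into the same iceServers array we passed to RTCPeerConnection earlier. The host and credentials below are placeholders of mine, not a working server:

```javascript
// Hypothetical configuration: STUN plus a TURN relay. Replace the TURN
// urls/username/credential with values from your own server.
const rtcConfiguration = {
  iceServers: [
    { urls: 'stun:stun.stunprotocol.org' },
    {
      urls: 'turn:turn.example.com:3478',
      username: 'webrtc-user',
      credential: 'webrtc-password',
    },
  ],
};

// new RTCPeerConnection(rtcConfiguration);
```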
The source code for this exercise can be found on GitHub.
It's worth noting that there are quite a few features that we didn't explore that would be pretty critical to getting a fully serviceable application running. However, this is a fun way to get your feet wet with WebRTC, and thanks to Phoenix it's super easy.
If you have any questions/comments/concerns, feel free to reach out to me on GitHub. Otherwise, happy hacking!