In Episode 3 of Demuxed, Matt, Steve and Phil are joined by SSB BART Group Lead Accessibility Consultant Owen Edwards. The group discusses the role that tools such as captions and screen readers have played in the past, as well as how accessibility standards will be part of the way we consume content in the future.
About the Guests
Matt McClure: Hey, everybody. Welcome to Demuxed. If you've been paying attention recently, you've probably seen some news about UC Berkeley and their free online course content potentially going away. Today we want to talk about what's going on, what's the news, what we see as the larger trends here and what we can take away as people working in the video industry.
To really help us dig in here, we've got Owen Edwards from the SSB BART Group. We've known Owen for a while here, as a contributor to Video.js, a world-renowned Demuxed speaker, and amongst other things, just generally an all-around great dude with a lot of background in accessibility. So, do you want to tell us a little about your background, Owen?
Owen Edwards: Sure. I started out in engineering in the Bay Area during the big dot-com boom the first time round, and decided to take a break from that for a little while and go and spend some time in the mountains. I went up to Colorado to teach snowboarding and fell in with an adaptive program there where I got to see the effect that inclusion of people with disabilities has on, not just them, but on everybody.
The idea of standing next to somebody who's blind, who's about to go down a black run, the idea of a young man in a sit-ski, who's lost his legs in combat, being challenged to a race by a 60-year-old who lost his legs 40 years ago. It just changes your whole perspective.
That idea of people with disabilities, it's easy to see them as them versus us, or a group, and then people who are able-bodied. But it's so much about inclusion. It's so much about universal design, about bringing people in.
The challenge of designing a system that works for people with disabilities just broadens our ability to make systems that work for everybody.
So, since then, as you mentioned, I work at SSB BART Group. We do accessibility testing, auditing and consulting for online digital products: both software, web content, documents, and help people comply with the various regulations.
Matt: Great. If you haven't seen it yet, by the way, his talk from Demuxed 2015 is really great. It ends with a plea that'll have you cutting onions in the kitchen.
I wanted to start off really quickly by talking about kind of the background here, before we dive into what's going on at UC Berkeley. And by the way, this isn't the first time that this kind of free-course content and accessibility has made the news. Harvard and MIT had a similar thing happen in 2015.
But I think it's helpful for us to talk about why this is a problem. What's difficult about doing these things online? And what makes it a challenge to make things accessible online? Let's start off with captions. What is a caption? What is a subtitle?
Steve Heffernan: I remember that was a question early on. The answer eluded me for a long time. But the way I understand it now, and you can correct me if I'm wrong, Owen, but captions are specifically for the hard of hearing, whereas subtitles are for translations into other languages. Is that right?
Owen: Generally speaking, yep, that's right. It depends where you're speaking from. If you're in England, they use the terminology differently. So things with subtitles and subtitles for the deaf and hard of hearing, the term captions isn't really used. But over here, you're right. Captions are specifically targeted at people who are deaf or hard of hearing. Subtitles are usually a translation.
Steve: Okay, that's interesting to know, that it's U.S.-specific. In general, captions have been kind of a challenge on the web. At least, they first started, you had them in Flash, to a degree. They've never been really prominent everywhere, but you started having them in Flash.
Then when HTML5 came along, we got a new format called WebVTT, and that helped captions push forward. But there's a general challenge around just creating the captions, making sure they're available. Any thoughts on the complexities there, around actually making captions available?
Owen: I think part of the complexity comes from, as with any technology, the multiple different standards. There were various different standards, and as you mentioned, one of them, SRT, kind of evolved into WebVTT, which is a relatively simple text format and gives very easy editing. But there's also content coming from, say, DVD, which is much more of a bitmapped image, and then broadcast standards, which are much more complex.
So translating, transcoding between them causes complexity. I think that's caused a lot of the problems, that Flash players tended to support some of the more basic formats like SRT and timed text.
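As an illustration of the simplest case Owen mentions, the step from SRT to WebVTT can be sketched in a few lines. This is a minimal sketch, not a full converter: the helper name `srt_to_webvtt` is hypothetical, and it assumes plain, well-formed SRT input with no styling or positioning data.

```python
import re

# SRT timestamps use a comma before the milliseconds; WebVTT uses a period.
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_webvtt(srt_text: str) -> str:
    """Convert simple SRT text to WebVTT (hypothetical helper, sketch only)."""
    # Every WebVTT file starts with the "WEBVTT" header and a blank line.
    lines = ["WEBVTT", ""]
    for line in srt_text.strip().splitlines():
        # Only rewrite timing lines, so commas in the caption text survive.
        if "-->" in line:
            line = _SRT_TIME.sub(r"\1.\2", line)
        # SRT cue numbers are kept; WebVTT accepts them as cue identifiers.
        lines.append(line.rstrip())
    return "\n".join(lines) + "\n"
```

Anything beyond timing and text (color, position, bitmap subtitles from DVD) is outside what this sketch handles, which is where the real conversion complexity lives.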
Matt: We'll go more into the rules around once you take broadcast content and try to put it online. But WebVTT is kind of what I've always known as, getting started with Video.js and working with online video, honestly that's the primary thing that I've ever worked with, has been WebVTT.
I remember that being the first topic that I really cut my teeth on at FOMS, was talking about WebVTT. But Phil, can you tell us at all about, at the BBC, did you guys have any specific workflows for transitioning from, I mean this might have been a little bit after your time with the rulings there, I don't know if they affected the BBC at all, but what was that like?
Phil Cluff: I ended up well in the pain places of subtitles at the BBC. For a long time, I think I wrote three different subtitling delivery systems at the BBC over the years for my sins. I don't know how it always somehow landed back in my lap, but we certainly had a lot of challenges around bringing in subtitle sources from multiple places.
Generally, you come from a broadcast world, you're either picking up live subtitles or you're picking up VOD subtitles, ready-produced for playout, or you're picking up some live subtitles feed as it goes out live on air. And there's a lot of challenges for each of those.
Obviously they're coming in in very different formats. Usually the live subtitles come in as proprietary formats, and that'll have to be converted. The BBC, as it happens, is actually not an SRT or a WebVTT outfit; it's actually entirely a TTML, timed text outfit. So whenever you see subtitles from the BBC, they've come from a timed text workflow. But a lot of the time they actually come originally as STL, which is probably something no one's ever seen here. It's a very old kind of binary format.
Certainly a lot of the workflows we had to build were around, "Hey, let's take the subtitle file and convert it from A to B to C, hopefully without losing all the information that's in there."
The interesting thing is there's way more information, usually, than people ever consider.
There's coloring information, there's position information, and some of those people think, "Well, as long as you've got the text, what does it matter?" Well, it kind of matters if your subtitles are over a critical piece of the content, right? If your subtitle's being overlaid on top of the question in a quiz show, then your subtitles are useless and they're not relevant any more.
So a lot of the time was spent really converting between formats, and the tool chains we built to do that varied from Perl scripts that existed for five years through building custom applications to deal with it, as well as using a lot of open-source tools that were out there to deal with converting subtitles.
By the end of it, we had this tool chain that was, "Hey, pass this incoming file through four tools," and then actually you've got something that might work at the end of it. Maintaining all of the quality and all of the nuances through that is often difficult.
There was a point where we realized we had, for example, characters that were not generally supported by a lot of the players, and one of these was actually the pound sign in the UK. So we actually ended up realizing that a lot of subtitles didn't have a pound sign in correctly, and the "pound sign" in this context is the currency, not the hash sign, as I would call it in America. But yeah, a lot of the time there's so much complexity that we can lose there.
Matt: Yeah, an important thing to note here, too, for any listeners that maybe haven't dug into the caption/subtitles world, an important distinction to make here is the difference between burned-in captions versus what we're talking about, which are these sidecar formats.
So, like a WebVTT file, web-video time text, is that it? Things like WebVTT are these text files, that Owen said earlier, that sit alongside your video so you deliver these multiple text tracks. These can be different languages and they just exist in your HTML5 player or whatever it is. And then the player then knows how to translate the timecodes that are in this text file with the content.
That's a different level of complexity from the player perspective than, say, if you just have the video file itself having captions hard coded. And if you've ever pirated video online, which I'm sure nobody listening ever has, and found hard-coded Chinese captions, like that's not what we're talking about here, for the most part. We're talking about these sidecar formats that can provide different types of captions for the end user.
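To make the sidecar idea concrete, here is a rough sketch of the matching a player has to do: read the cue timings out of a WebVTT file and pick the cues that apply at the current playback time. The names `Cue`, `parse_cues` and `active_cues` are hypothetical, and the parser assumes a simple file with only a header and cues (no regions, styles, notes or cue settings).

```python
import re
from dataclasses import dataclass

# A WebVTT timing line: "[hh:]mm:ss.ttt --> [hh:]mm:ss.ttt" (hours optional).
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def _seconds(h, m, s, ms):
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(vtt_text: str) -> list:
    """Extract cues from simple WebVTT text (sketch, not a spec-complete parser)."""
    cues = []
    # Blocks are separated by blank lines; the header block has no timing line.
    blocks = re.split(r"\n\s*\n", vtt_text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                g = match.groups()
                text = "\n".join(lines[i + 1:])
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues


def active_cues(cues, playback_time: float) -> list:
    """Return the cues that should be on screen at playback_time (seconds)."""
    return [c for c in cues if c.start <= playback_time < c.end]
```

A real player would run a lookup like `active_cues` whenever the playback position changes, and would also have to handle layout, which is where much of the complexity discussed in this episode comes from.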
Steve: Speaking to the complexity of that, Video.js was actually the first open-source player that supported the WebVTT format, and I was pretty proud of that. But then the spec for WebVTT just started to balloon up and require more and more specific features that we didn't have the ability to keep up with.
At one point I remember talking to the Opera developers about what they were building in order to support the spec. Essentially what they were building was an entire collision-detection engine to understand exactly where subtitles and captions were being overlaid, in order to support Japanese subtitles that might be vertical text and alongside of horizontal text.
It was just insane to hear what they were having to go through in order to get these things to show up properly. It's a lot more complex than you might expect at first glance.
Phil: And of course subtitles always go left to right, right? They never go right to left... Oh, no, wait... hang on...
Owen: Matt, the specific terms you were clarifying there is open caption versus closed caption, right? Open caption is ones you can't turn off, they're burnt in. Closed caption, which is the term most people understand now, they don't realize that's the sidecar format, so you can turn it on or off.
Matt: That is the CC button on your player. Gotcha. Moving away from just captions itself, because that's not the end of accessibility online.
Steve: It might be interesting just to hear, from a higher level, what are the different types of disabilities? Captions are obviously for the hearing impaired, right? But there's so many different types of disabilities.
I think that was one of the things that surprised me in talking with people like Owen, is the wide range of potential disabilities that are out there and why we have these different technologies.
Owen: Obviously there's a huge spectrum of different kinds of disabilities. And I think a lot of people are, particularly around autism, starting to understand the concept of spectrum, the concept that a disability is not black-and-white. It's not on or off.
But the major groupings that we look at the most are blindness and low vision, color-blindness, which is sometimes surprising that it's considered a disability since it's so common, so broadly-experienced. But that can tie in with some similar effects of low vision.
Steve: What are some specifics around the difference between "blind" and "low vision"? How does that manifest itself?
Owen: It depends where you're talking, in terms of what counts as blind from a disability level. Legally blind is actually not completely blind, it's not what people think of as completely blind. It's below 20/200, so it's not able to drive because they're blind. It's that kind of blindness.
As opposed to, you can't see even light and dark. So you get a lot of gradations of blindness, from "Can they perceive light and dark? Can you count fingers if they're held up in front of your face?" to actually testing your acuity, that 20/200 versus 20/20 that we're so familiar with.
And then in low-vision there's all sorts of different things that can be central vision loss where there's a small area in the retina that doesn't work. There can be peripheral vision loss, there can be blurring, there can be differences in color perception. There's so many different things. There's a lot of different strategies for coping with that.
Matt: I hadn't really picked up on the color blindness issues until I had a co-worker that was colorblind, shout out to Chris Warren in Minneapolis. But there'd be times when I was giving him crap because his color scheme in his text editor was just atrocious, it was the ugliest shades of whatever colors were thrown together. And I was making fun of him for how heinous it was, and he was like, "Oh, but I can actually see the difference between these. You're just an asshole." And I was like, "Oh, great. Well, I... hm, yes."
And so you actually see what these things mean. If the only differentiator that you have between two buttons, the button that you should not click and the button that you should click, is red and green, and you can't tell the difference between red and green, then your warning versus, "let's go" signal there is just totally lost.
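One rough way to sanity-check a pair of button colors is to compare their lightness rather than their hue, using the WCAG-style relative luminance and contrast ratio formulas. The sketch below is only a proxy: it does not simulate color blindness, and the function names and example colors are illustrative assumptions.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """Contrast ratio between two colors, from 1 (identical) to 21 (black/white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Illustrative colors only: a pure red and a mid green are close in lightness,
# so a viewer who cannot separate the hues has very little left to go on.
stop_vs_go = contrast_ratio("#FF0000", "#008000")
```

A low ratio between the two buttons is a hint that color is carrying the whole message, and that a text label, icon or shape difference is needed as well.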
There's actually some interesting projects out there that'll help you enable modes in your browsers that'll do low-vision testing so you can kind of see what it looks like from the perspective of somebody with low vision or color blindness, or something like that.
Owen: All the way down to simulating total blindness, where it just switches off the screen. It's fantastic.
Phil: I'd pay for that service.
Matt: Okay, so that's the visual ones. And then you move onto the hearing facilities.
Owen: Again, there's a range. Deafness and hard of hearing. Total deafness, obviously, any sound needs to be represented in a different form. Hard of hearing is maybe they have, a person who has trouble separating speech from background noise, speech from a soundtrack in a movie. Or it might be that they have deafness in one ear.
So, for example, the iPhone has a feature where it can reroute all of the audio into one ear or the other, rather than splitting it between both. There's areas where, again, it's not black-and-white. It's people who have some kind of impairment. How do you deal with that?
Steve: That's really interesting. Okay, so, that's hearing. And then what is manual?
Owen: The other large area is physical disabilities which affect people's ability to manipulate content, manipulate the devices that allow them to access the content. Particularly it's the mouse we're thinking of. It could be that they're missing limbs. It could be some kind of muscular impediment, something like Parkinson's, cerebral palsy, that just makes coordination very difficult.
So it's anything in those areas all the way through to quadriplegics, who may be unable to move any limbs, and use some kind of alternative input device, something head-mounted or a sip/puff switch which allows them to communicate with a computer or control a computer.
Steve: Otherwise we might be talking about speech control or something along those lines.
Owen: Exactly, that's another big area: speech control.
Matt: Interesting. Anything not covered in those three categories?
Owen: There certainly are. There are broader issues with cognitive and learning disabilities. That's a large one that hasn't really been adequately covered in some of the accessibility technologies but is certainly an area in education. There's a lot of people that are very concerned about how to keep the attention of somebody with ADD, ADHD. How do you allow somebody to simplify content on a page?
I was talking to a friend recently who was talking about somebody with a traumatic brain injury who was saying that websites can be very tough if there's a lot of content on there. And they look away. If there's a noise in the house, and they look away and they look back, they're totally lost in the page.
So what they're looking for is the ability to simplify a page, to take out the content that isn't strictly necessary. Particularly things that are flashing, that are blinking are distracting.
Phil: Adverts, you know, that sort of thing?
Matt: So what you're saying is, the marquee and blink tags were probably not big losses.
Owen: Right. And naturally, one specific area that is flagged in existing regulations, is things that might cause epileptic seizures, particularly flashing content. And there's specific definitions of what is likely to trigger that.
Phil: Certain episodes of Pokemon spring to mind.
Steve: There certainly is quite a wide variety there, and it's going to take us a lot of work to get all of that supported.
Owen: Just to touch on a couple of others, there are areas where it's a combination of disabilities. People who are deaf-blind are a difficult population to serve, but they're certainly out there. Often they're Braille readers, so they have the tactile reading in Braille. It's where you really have to consider all the different alternative modes of information delivery.
One thing I also always think about when we're talking about those core sets of disabilities, vision, hearing loss and mobility or coordination loss, is that, unfortunately, we're all getting older. And until we can solve that, we're all going to move towards having less ability.
We may not call it disability, but if we don't design a world for the future that we can access, we're going to be the people off on an island somewhere, the old people's island.
Steve: That is a really interesting way to think about it.
Phil: You know, design the world that we're going to use, please. I might get there sooner than you guys, but you'll get there eventually.
Matt: I'm banking on modern medicine, I'm going to be 28 forever. So, briefly let's talk through some of these different technologies that we would use, especially focusing on people that are building video experiences on the web.
A lot of these are things that at least Owen, Steve and I have dealt with a lot with Video.js. In some of these instances, it has been issues on an issue tracker that have spanned quite literally years. Things like Flash versus HTML alone.
Steve: Because iOS goes to full-screen, you can't overlay any text on top of iOS when it's in full-screen. So you're kind of out of luck there.
Matt: Or if the browser actually did support WebVTT, as they soon after did, if we'd left it in there altogether, then you'd end up with double captions, which was unfortunate. And so we couldn't just let the browser take over entirely once support came in more, because then we had to also support Flash.
As you can see, it's this crazy game of give and take that I don't know if we ever totally solved. But it feels like we're in a good place now, at least, thanks to Gary Katsevman from the Video.js team and Owen. Between the two of them they put in a lot of work, especially in the last eight months, really solidifying VTT support, moving to VTT.js, the awesome open-source project from Mozilla, which is, if I recall correctly, the captions engine in Firefox itself?
Matt: Anything I'm missing there? Because that's a really common one. Less so now than approximately three months, four months, when Chrome kills Flash entirely from the face of the earth? Is there anything else I'm missing there? Because that's also a platform thing, not just Flash versus HTML5.
Steve: Honestly, I can't speak much to the Flash specifics around captioning, because we always took the approach of just overlaying HTML on top of a Flash player. So there may be other complexities to Flash itself, but I think everybody's kind of moving forward from that set of things.
Matt: Is that what you're seeing, too, as well?
Owen: Yeah. I mean, one of the big things with Flash was the accessibility barriers that people experienced around it, and trying to address those. It's not that they couldn't be addressed, they just didn't seem to be consistently made available.
So there were a lot of barriers, a lot of people who couldn't play online video that was Flash-based.
Matt: And this world of Flash versus HTML5, Video.js in particular, not to toot our own horn too hard here, but the thing that we got around a lot of these differences between Flash and HTML5 was because we had the same interface. So, things like key accessibility, like tab order, whether or not we would use divs versus buttons. And we ended up using divs because it made some things easier.
But then we had to add all this crazy ARIA syntax everywhere, and ARIA syntax, if you're not familiar, is tooling that allows you to make elements act like other elements. So a screen-reader can appropriately interact, is that the right way of describing that, Owen?
Owen: Yeah, it identifies them out to a screen reader in a particular way. It doesn't necessarily change how they act, but it identifies them out to the screen reader so that the screen reader understands they're more than just a group of divs.
Matt: Got it.
Phil: That's something that actually really surprised me the first time I did it, was actually trying to view your own stuff in a screen reader. It's something you never think about, but it was a fascinating experience for me the first time around. Even just picking personal website projects, sticking it in a screen reader and realizing it totally doesn't work if you can't see the content. I mean, really encourage people to even try that.
Owen: Right, and even without a screenreader, just to turn the CSS off, will often give them the first idea of that, "Oh, this is just several lines of text. A line that says 'pause' and a line that says 'stop.'"
Matt: The divs-versus-button thing transitions nicely into focus styles. This is a really common thing that we see: people that appreciate good design frequently just turn off the focus styles. Because there's this big, ugly gradient border thing around your elements, and so people just immediately kind of focus on none.
I feel like that's started to get less common, because there's been a ton of articles written about how you should never do this because it just ruins the accessibility. And it's such an easy piece of accessibility to not kill. Right?
It's just an extra thing that you have to take from a design perspective. But implementation-wise, that's not that bad.
Owen: I think there is a way to actually turn off the mouse pointer with an HTML element. I feel like we should do that a little bit more, to just have certain elements that turn off the mouse pointer. Because that's the equivalent of turning off the visual focus; just turn off the mouse pointer and try to click a button. I don't know where it is.
Matt: We can talk about this more as we go into what the accessibility rules are. But we start talking about changing the content itself. There's things like audio descriptions where, instead of just a text track, it'll actually like describe what's going on in the video... Oh, I guess a text track wouldn't make any sense for what an audio description does, if you can't read the text track.
Owen: There's some discussion around that.
Matt: Oh really?
Owen: There's some discussion about having a text track, which is then available to the screen reader.
Matt: Oh, right.
Owen: So, if the person's going to have a screen reader anyway, can the screen reader read out a text track? The timing of it gets very difficult.
Matt: Audio descriptions in particular, this kind of goes into when we talk about changing content for disabilities. If you have content that's specifically meant for disabilities, where you have larger pauses, you'll hold off on going to a cut scene so you have more time to do audio description.
This is just to call it out one more time, another great thing that you would probably benefit from watching Owen's talk: he shows two videos side-by-side, one of which where there's audio descriptions added, and the content itself is actually different.
You see a person walking down the street for five seconds longer because it gives a little bit more time to describe what's going on in the video, rather than trying to just pause the video and shove more content in or have somebody speak really, really quickly over the content itself.
A change in the content itself has its own rat's nest of troubles that come from there, right? If you have an hour-long piece of content and then you pause things automatically to finish your screen reading, what does that do to your actual content length? What does your player show? Things like that are interesting troubles that you wouldn't necessarily think of unless you're actually implementing it.
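The content-length question can be sketched with a little arithmetic. Assuming a player that pauses the video at fixed points while a description is read out, the viewing time is the content length plus every pause. The names `DescriptionPause`, `total_runtime` and `wall_clock_position` are hypothetical, and the numbers in the usage note are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DescriptionPause:
    at: float        # content time (seconds) where playback pauses
    duration: float  # how long the description runs while paused (seconds)


def total_runtime(content_length: float, pauses) -> float:
    """Viewing time for the whole piece once every pause is included."""
    return content_length + sum(p.duration for p in pauses)


def wall_clock_position(content_time: float, pauses) -> float:
    """Elapsed viewing time once playback has moved past content_time."""
    # Only pauses that have already finished add to the elapsed time.
    return content_time + sum(p.duration for p in pauses if p.at < content_time)
```

With an hour of content (3600 seconds) and two pauses of 20 and 45 seconds, `total_runtime` gives 3665 seconds, so the player has to decide whether its progress bar shows the content time or the longer viewing time.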
Owen: Right. And we've kind of skipped over or out to the more complicated side of audio description. Just to be clear what the basic idea is, is to allow blind people to enjoy a video and to get the full content from a video without being able to see it. And it isn't just blind people. It can be low-vision people, or it could be somebody who is unable to pay attention or watching a video on their watch.
Steve: It's us in 20 years.
Owen: It's us in 20 years, that's right. Listen, there was a guy, a baseball commentator, who just retired yesterday. He's just this incredible, and I don't know his name, because I don't follow baseball.
Phil: Is it like rounders?
Owen: Is it like cricket? I don't know. You take these people who are these incredible radio announcers who knew how to tell a story and then there are movie producers who know how to tell an incredible story visually, but the two don't have to be exclusive.
That idea, that you can have an audio track that parallels a video, that conveys the content, in the same way you can have a sports game and have somebody commentating all through it and knowing when to be quiet and let people hear the roar of the crowd.
And so then, Matt, what you're talking about is those situations where there isn't enough space, when the production doesn't allow for putting in description in the available time and what do you do about that? And that's when you get into that idea of a longer edit or pausing the content.
Matt: So, moving on to legal milestones involving accessibility, particularly as it relates to web video: we've got the FCC ruling. This was the one that I was the most familiar with coming into WebVTT and captions: the FCC ruling that anything that is broadcast over traditional mediums with captions also needed to be made available online with captions. Did I butcher that too badly?
Owen: That's pretty much it, yeah. That's part of CVAA, the 21st Century Communications and Video Accessibility Act. Thank God they shortened that one. It was a requirement that if there are captions when you broadcast something, when you then take it online, those captions have to go with it. There's not really an excuse to say, "Well, it's a little hard to carry them over." You've already got them, come on. You just need Phil's translation programs.
Steve: All five of them right, Phil?
Phil: Just to say your name. Just pick one off the shelf.
Owen: Right. Might be a little bit dusty, but just dust it off and feed your content through it. And I think that was the first round of it and some of it was precipitated by the idea that when there was the switch to digital television, there were FCC requirements before that, that analog TVs, anything over 13 inches, had to have captions.
People switched over to digital TV and the captions were still being broadcast, but there was a converter box and it wasn't clear what handled the captions or how they got parsed. And so there were situations where the captions were getting lost.
And so this was a, "Hey, we need to have standards in place and requirements, regulations that specify if you've got them. They've got to be carried."
Matt: Interesting. I wonder what the effect of that was, as we started to move forward into more IPTV scenarios. Because I think with the X1 that I've got now, it's all IPTV. I don't think I do any traditional broadcast.
Phil: Yeah, you can still do a transport stream with subtitles in there, right? Even though it's IPTV, it's not really fundamentally changing the delivery technology.
Matt: Right. My thought there, I wonder if that's going to change more as that progresses.
Owen: I think that's where it's got complicated, because the FCC has authority over broadcast TV, but the question was whether it had authority over IPTV.
Steve: I can say, from the Video.js perspective, we saw a huge influx of requests around accessibility after this point. Before that, it was always, unfortunately, kind of a secondary priority for the project because, in an open-source project, you tend to prioritize things based on the number of people who are screaming about a problem and the number of people who need a feature.
So, just looking at browser usage and things like that, accessibility, unfortunately, kind of never bubbled to the top of either of those things. But when this legal ruling came out, that was a huge push for, I think, every player out there to start actually paying specific attention to these things and beefing up their support for accessibility. So that was really cool to see.
Owen: Right, and that's the reason that that kind of legislation has to go through.
Unfortunately, accessibility doesn't bubble to the top. It's rarely a large user group.
And it's a shame, because there's an often-cited statistic that the largest user of captions are sports bars in airports. That's not people with disabilities. It's not the deaf people turning it on. It's that, as soon as you mandate that it has to be on every TV, people find out how useful it is.
Phil: Right. It's interesting for me in particular. Overwhelmingly, everything I watch on Netflix is with the subtitles turned on. Not because I need them, but my girlfriend finds it easier to read English and hear it a lot of the times, if there's accents or things like that.
I think what that often highlights to you, when you are a native speaker of the language and then you watch the subtitles, how bad a lot of the situations can be. How often you find subtitles where they're lazily subbed or well, it doesn't quite fit. Just stuff it in, maybe lose some of the meaning of it, and that inevitably happens.
It definitely isn't the case that the primary use is for people with hearing issues anymore.
Matt: Well, this feels like a good segue into, giving a little background on what the UC Berkeley situation is, or looking back in time, what the MIT, Harvard situations were.
If you haven't been following along, if somehow you've missed this on Facebook and Twitter, what happened was UC Berkeley ended up taking down a lot of free online course content because the Federal Court said they needed to be following ADA with their online course content.
They basically said, "This is free content that we provide online. We don't have the funding or the means to actually follow all of these rules, so we're just going to make it unavailable to everyone," was kind of the gist. Is that roughly what you guys saw out of the situation?
Owen: Right, it was a lawsuit brought by the Department of Justice. It's their Office of Civil Rights, and it was specifically about ADA. It hasn't, as I understand it, gone to court yet. It was just a lawsuit that has been brought. The Department of Justice does these quite often, and often what they end up with is a consent decree, which is an agreement that the problems will be fixed without any finding of fault. So all the money and effort goes into fixing the problems; the Department of Justice isn't looking to get some big settlement.
I shouldn't speak for them. It doesn't seem like they are. It doesn't seem like that's the point, and so when a lot of people get concerned about these lawsuits, I think that they should be clear about that. What the Department of Justice has done in pursuing ADA lawsuits is to say, "You have a problem. You could have fixed it, you haven't, now there's going to be consequences if you don't within a certain amount of time."
Matt: They always say "never read the comments," and that's almost always true, but it's been interesting to see what people have taken away from this ruling. Half of the comments you see are from people that are like, "This makes total sense. This is free online course content, why would we shut out a meaningful portion of the population, especially with something that's paid for with public money?"
And the other half of people are saying, like, "Well, you've got hundreds of thousands of people that can be helped by this free online course content, but we're going to rip it out of their hands because of a few needy people." I'm painting an evil character of the people that are against it, but if people say it reasonably, it doesn't sound that bad.
Like, "Oh, you've got millions of people that can access, and then a few thousand...? The ends don't justify the means. A lot of people are missing out on content just because of a very small number of people." If you're not painting somebody as the bad guy, it can sound more reasonable.
Steve: One way I heard it put is, basically, saying if you're standing at the top of stairs giving out free money, should you be required to build a wheelchair ramp? Or stop giving away the free money?
Phil: Where did you find that one?
Steve: It's Hacker News, of course. Also, where are these stairs?
Matt: I guess it would be useful to actually talk about the root of this hubbub, the specific challenges for Berkeley they've mentioned, and what's the reasoning behind them saying it's too expensive for them to move on? So, the first thing that I saw mentioned was, sorry, in the complaint, the first thing mentioned was, many of the videos do not have captions.
So, at first blush, that one feels pretty reasonable, except that number four said that some of the videos' automatically generated captions were inaccurate and incomplete.
So it feels like you've got an instance where some videos had no captions, and the videos that did have captions were probably machine-generated. I know how good the speech-to-text transcription on my Google Voice voicemails is... I don't know what they could possibly be talking about there, about "inaccurate"...
Let's start there, with captioning. Clearly, machine-generated just isn't quite there yet. Is there a mix in your work? What have you seen there, in terms of that being prohibitively expensive? I assume you're going to have to have a human get involved in some way.
Owen: Right. There are a number of different ways to try to generate captions. You can do them manually, but there are a lot of caption providers out there. Companies where you can outsource that work and have them do it, and they follow FCC guidelines, which require a certain level of accuracy, and not just accuracy of the words but accuracy of timing, correct placement, accuracy in punctuation, those kinds of things.
There's a broad definition of what accuracy is.
We've seen instances of clients using automatically generated captions where they were so wrong that in one case, it used a word that we didn't think that client wanted on their video. And in another case, it took us a while to work out that the platform had mis-recognized what language the subtitles were in, or the audio was in. It had auto-captioned them in Dutch even though they were spoken in English, and then also translated them into English.
Steve: Oh, my. Wow.
Owen: You can imagine that was a little far from ideal. What you come to is, "What's the cost? What's good enough?" You've got to have some line because otherwise you have no line at all.
You've got to have something of, "This is what's required from a publicly-funded university providing educational content," and a university that's well respected for its involvement with people with disabilities. Berkeley as a city and also as a university has had such a tie to the disability community for such a long time that the idea of not catering to those people seems disappointing.
Phil: Just out of interest, do you have a finger in the air, thought on how much money we're talking about here? What's the cost? Say I go out there and get a compliant subtitler company to do an hour of content, pretty simple English, spoken language, not complex stuff. What's the actual feel of how much that costs?
Owen: I knew you'd ask that question, and no, I don't have a specific answer on that one.
I can point you to the right people who can answer that. We don't propose any one particular subtitling company, but there are a number of very reputable ones out there, and they can give you an idea. The thing is, it's not as simple as it being a straight minute-by-minute, you know? There's all the overhead. Is it five hours of two-minute videos, or is it five hours of two-hour videos?
Steve: I think the bottom line there is it does cost more money, because there's nothing free out there. And so if you didn't plan to have that expense, it's all additional.
Matt: The caption side of things feels pretty straightforward to me. Captions: it's content. It should be pretty easy just to ship that off to somebody and have them do speech-to-text. We've literally been doing this for the entire time that digital content has been available.
The ones that, I think, might feel a little bit more complex are that many videos lack, quote, "an alternative way to access images or visual information." So this is charts, graphs, animations, URLs and slides. And many documents, quote, "associated with online courses were inaccessible to individuals with vision disabilities who use screen readers because the document was not formatted properly." So, those two feel related to me, but I'm not even sure. If I had a graph, how do I make that graph more accessible?
Owen: Right. And the thing is, I think a lot of people who've done website development, who have the first inkling around accessibility, understand that any image needs an alternate text description.
You guys are nodding your head. I mean, that's the basic point of accessibility. You have an image, and it used to be that if you had a slow connection, it wouldn't even download the images, right? It would just have the little text description.
The image is there to convey a piece of information, and as you start thinking about it this way, the text can convey that.
And I would say that in most cases, the image is there to back up that description. In fact, if I put together a PowerPoint slide, I hope that that slide only backs up what I'm saying. Whether I'm fully describing what's on the slide, "Here's a line graph that goes from seven to five to seven..." or whether I'm saying, "The numbers bear out that this is more popular," the information should be there to back up what's happening in the speech or what you're talking about.
There are a lot of very good educators out there that are making good use of explaining things and making sure that that information is coming across. But we've all seen lecturers who will just throw up a slide and not describe it, and at that point you're thinking, "Well, why didn't I just get the notes? Why did I even come?" So, for educational content, good teachers should be doing that already.
They're already describing what's in there. And so those things get flagged when the content isn't fully described. And then you're talking about that post-production issue of adding description, which is more time consuming than creating captions, which is more complicated, but it's really about pushing it further up the production chain.
Matt: So, really, it's just, "Be better teachers," is what you're saying?
Steve: Oh, geez.
Matt: I'm just kidding. If anyone from Berkeley heard that, I was just kidding. I'm sure you're all very wonderful.
Let's talk a little bit about what this means for the world at large. UC Berkeley, as we've already talked about, isn't the only one that's run into this. Again, we had MIT and Harvard do this last year. I'm sure those are not the only people putting out free content that maybe aren't captioning it the way that the ADA would say they should.
How would you say this is going to affect universities and putting content out? To be clear, I don't think Berkeley's going to leave this stuff offline. I don't think they're just going to be like, "No, fine," and take their ball and go home. I don't think that's their intent at all here.
I have to assume that there is a knee-jerk reaction to the federal government getting involved and then pulling back so that they could survey the landscape and then come back in with a solution. I have to assume that this stuff isn't just getting ripped out of every course content online. But is that the impression you get? What do you see this doing to other universities and their course content?
Owen: Yes, I think the initial reaction from Berkeley, in terms of saying they're just going to pull their content, is a little surprising. But it's actually not, if you look at the long history of accessibility. For many years, advocates of accessibility have pushed for accessibility to be included from the get-go in the production process, in the construction of websites, in the construction of content.
Organizations have pushed back and said it's too expensive, and then it gets pushed further down the line.
You're retrofitting accessibility, putting a ramp to a building that already exists, and it's much more expensive.
And lawsuits come and people throw their hands up and say, "This is way too expensive," and that's unfortunate when these organizations have often been approached in advance to start thinking about it in advance.
Matt: I think I was preparing for my joke too early and kind of missed out on some of the original meaning there, but that clicked a little bit better. So, what you're saying is, teach the course content in person, as if you had those same people with disabilities already sitting in the room with you. So, that way, when you're actually preparing that content, there isn't as much additional work when it comes to actually providing it online.
I'd be interested to see what that is, what the difference is for content that does have students in the class that are already like that. What the disparity in costs there is between making that content accessible versus seeing the kid with obviously low vision in the front of the class. How do you change the way you teach to accommodate for them?
Phil: I actually had this when I was in university, one of my friends, a very short-sighted guy. That came out wrong, hang on. One of my friends has some eyesight issues, and one of the ways the university helped him was that they actually gave him a laptop with a camera with a big zoom lens attached to it.
This is 100% true, and I'm not going to name my university. It's pretty easy to find out, though. But, yeah, they would give him a camera with a zoom lens for lectures where they hadn't prepared content or, for example, where they were doing, say, programming live in the lecture, places you're not going to be able to preemptively prepare it. They just gave him a camera and said, "Well, zoom in on the screen, and that's what you're going to get."
Matt: Interesting. I'd say that a lot of these same learnings probably translate to other things such as edX, for example. UC Berkeley had their content on edX, so I assume that they just kind of flow with whatever the university is doing.
But things like Udacity, Khan Academy, Coursera, I have to assume that, because this stuff's already being prepared for consumption, especially paid content, it feels like they would kind of already be incentivized to make sure they had as big of an audience as possible.
Steve: Is that true, though? How well would you say that accessibility has proliferated across all the different websites, as opposed to just in the public institutions that are being required to support it?
Owen: Well, it really depends. Some organizations are very forward thinking and recognize that that's necessary. Some of them know that; they have a policy as an organization, as a university, as a non-profit, that they will be accessible. And many don't, and so they do put themselves at this risk of legal action, which is always going to be much more expensive.
Phil: So, if it's paid content, does the same rule apply? Are you open to being sued if it's paid content?
Owen: Paid content in terms of...?
Phil: Say I pay for a course on one of these websites, am I entitled to an accessible copy of that content?
Owen: One of the things that Berkeley highlighted in their response to this lawsuit was only about free online material. The department's findings, I'm reading a direct quote here from Department of Justice, "The department's findings do not implicate the accessibility of educational opportunities provided to our enrolled students."
What they're saying is, Berkeley has already done a fantastic job of making their content for their enrolled students accessible, and they've done that because those are the people paying. And perhaps they thought that, by releasing free video, they weren't expected to be as compliant.
But, yeah, absolutely, for the paid content, they are doing it, and they are required to make that accessible.
Phil: This all loops back into the huge amount of people who now don't get the content at all, right? And that's where it becomes a problem.
Owen: As I was saying, we've seen this a lot over the years in accessibility, where the cost of it gets kicked down the road and just snowballs. It's kind of like a technical debt. It's accessibility debt. That it just keeps building up and building up and then you get hit with a lawsuit. And guess what? This is going to be expensive.
It's not uncommon that, in situations like this, organizations have turned around and said that the cost is prohibitive. It is possible that Berkeley is putting that out so that they can then go back to the Department of Justice and negotiate and say, either, "Can we make this less of a requirement? Maybe only for certain content, or give us a longer timeline so that we can spread the cost over multiple years." That's something we see sometimes. A settlement agrees a longer timescale.
Steve: I want to say that, from my perspective, it feels like a good thing that they actually took it down and made that statement. Just from seeing what you're talking about. The history of ADA and everything, it's never going to become something that everybody does until drastic things happen around accessibility, right? Until it's actually not accessible.
And the more people continue to run into things like this, the more we're going to develop the technology that they need, because they're going to be demanding it. The more that we're going to be thinking about these things ahead of time, when we're producing the video, but without significant events like this, it's not going to come to the forefront of the conversation.
Owen: Exactly. That's how it's a good thing from the accessibility field's point of view, that it does push up its priority and get people thinking about it earlier.
Steve: I have one question. When it comes to international audience, because when we're talking about web video, we're not just talking about the U.S. and the U.S. regulations, but we're talking about accessibility for everyone around the world. Is the U.S. ahead or behind when it comes to accessibility of web video?
Owen: As far as web video goes, I don't know. I think so. The UK is ahead, and I'm sure Phil can speak to that.
In terms of general web accessibility, a number of countries, Australia, Canada, are somewhat ahead of the U.S.
I don't know about videos specifically, but I think it's probably that other countries are ahead. One thing of note is that a regulation in Canada, the Accessibility for Ontarians with Disabilities Act, AODA, actually specifically delayed the requirements for audio description and captioning of live content by an extra four years from any other requirement. And that's for government-produced websites and content. For industry-produced websites and content, they also look to go to WCAG AA, but leave out captioning of live content and description.
That kind of recognizes that it has been an area that hasn't had a lot of focus and that there is cost involved.
Matt: Yeah, totally.
Phil: Interesting that some of the live content in particular, this actually came up as a pitch in the UK a while ago, it was proposed that all TV channels would have an enforced delay to improve the quality of live subtitling. And it got completely thrown out, but in terms of improving quality, it would have been a massive step forward. But no, people want their TV live.
Matt: As we move forward what we're seeing is a lot more content, on the UGC front, lots of open-source projects where people are focusing on accessibility. We've already talked about the open-source side of things, but how do you see this affecting UGC?
Owen: In terms of UGC, there are a few projects that specifically demonstrate the possibilities of using user-generated content to make things more accessible. Amara is a website that allows people to contribute captions and subtitles to existing video, like YouTube video.
One of the very core examples they have is translating the State of the Union speech almost immediately afterwards so that people across the world can view the State of the Union.
There are some great examples of how you can crowdsource that kind of accessibility.
Another project that I was involved with, based out of the Smith-Kettlewell Research Institute here in San Francisco, was called YouDescribe. It was one of the first reasons that I approached you guys about Video.js, and that is user-generated audio description, where people can record their own description of a YouTube video and then share it with everybody else.
That's an incredible opportunity for educators, parents, people who regularly run up against the problems of accessibility for blind children, for blind relatives, to say, "I can make this accessible for you. I already have to do that anyway. I have to sit down at home and just walk you through it, because that's the only way that exists. And I can share that." It's then just the availability of platforms, the availability of online systems, that will support that distribution.
Phil: Also from a UGC perspective, when does it become an issue? We've got millions of hours of YouTube video going up every month. When does it become a requirement for PewDiePie, when does it become a requirement for that sort of thing, to have any accessibility on their content?
Like we said earlier, YouTube's attempts at auto-captioning stuff are generally mediocre at best. When does it become a requirement for those content creators to actually start thinking about accessibility of their content?
Owen: I don't know that that's necessarily an answered question at this point, but with all that's happening with CVAA impacting video that's originally broadcast on TV and then moved over to online, I think it's coming.
I think once platforms like YouTube put in the infrastructure to support it, then there's going to be a question of, "Well, why not?" There's this constant breaking down of the barriers. It doesn't take so much work to get any particular thing done, and things become more accessible. Yay.
Matt: I think that's a great segue into wrapping things up. I'm going to put you on the spot here, Owen, because I've seen Owen. He's spoken at the SF Video Technology meetups. He spoke at Demuxed last year, and both times, again, as I mentioned earlier, the conclusions of, "Why we should actually care about these things beyond just ticking boxes?" were pretty moving. Make us all cry here, Owen. Why should I care beyond just ticking boxes, technically?
Owen: As I presented at the video meetup, there's a number of reasons. There's really three reasons, I think, that motivate us to do this. One is my boss said so. The second one is that a lot of the people that are in this field, a lot of you guys that I've had the opportunity to meet, are incredibly smart people who are, it seems, driven by addressing problems, fixing problems, and there's a lot of different ways that those problem-solving skills can be used to address social good, and this is one.
I gave the example of a blind snowboarder. I have video of him, and if you watch that, you might be inspired to do something that isn't easy. Because that's what he was doing.
Phil: Is the video subtitled? Is it audio-described?
Steve: Thanks for ruining the moment.
Matt: Nice, Phil.
Owen: No, there's no speech in it. It's purely audio. The on-the-spot moment is, look around when you're out on the street, when you are in any setting. There are a lot of people around you who have a disability. Some of them are obvious and some of them aren't, and it's all too easy to think of them as "them versus us."
But if you talk to a person with a disability, get past that awkwardness of, "You're blind" or "I don't know how to talk to somebody who's deaf," because I struggle with talking to somebody who's deaf through a sign language interpreter.
Once you get past that barrier and find out they're a person and that they know what struggle is and that they don't want to be some inspirational person, they just want to be a person, maybe they want to sit down with their family and watch Orange Is The New Black or Daredevil because it has an inspirational blind person in it.
Watch Daredevil on Netflix, turn on the description that goes along with it, close your eyes and just see how different an experience it is.
Matt: And then imagine yourself on your iPhone 30 as a 110-year-old person wishing that you had an experience equal to the one you had when you were 30 years old and could actually see.
I actually think that's a legitimate argument for all of this. As engineers, we might not have any of these problems right now; we're all young. Selfishly, we should be solving these problems for ourselves because it's coming. Hearing and vision loss is in each and every one of our futures.
Steve: What was that?
Phil: Not so much if you don't look at your iPhone so much.
Matt: There you go. Well, thanks again, Owen. Thanks to SSB BART Group for letting you come and talk to us. We really appreciate it, and this is probably going to get released after Demuxed 2016, but hopefully that content is available shortly after the conference. That's all we have for today.
Owen: Thanks, man.
Phil: Thanks, Owen. See you at Demuxed.