Transcript: Video Meets AI – A Briefing Webinar

Posted by Larry

[NOTE: This transcript was created automatically using Adobe Premiere Pro, then lightly edited for clarity and punctuation. You can view the original webinar here.]

Larry Jordan: Welcome to this PowerUp webinar, taking a look at “Video Meets AI – A Briefing.” My name is Larry Jordan and I am delighted to have you with us.

Artificial intelligence, which we call AI, is causing seismic tremors of fear and loathing throughout the entire creative community. So this week I decided, rather than just cower in despair, we’d try to find out what’s real and what’s hype and what we can do about it.

To do that, I want to introduce my guest, Sam Bogoch. Sam is the founder and CEO of Axle AI. I first met Sam more than ten years ago, as he was trying to explain the concept of media asset management to me. I finally got the picture and bought their software for myself back in 2015. And in fact, I use Axle to this very day.

(pause) Sorry, I was just reflecting on what Axle looks like. Anyway, last Monday, we announced that Axle AI had purchased all of my training assets, webinars and tutorials. My company is now an independent division of Axle AI. We have editorial freedom and our only mandate is to continue doing what we’re doing – just grow.

Today is not a product demo. It’s not a commercial. It’s really a chance for us to figure out what AI and machine learning is all about.

This came out of a conversation that occurred last week. I was in Cleveland speaking at the University Film and Video Association conference. I was talking about the challenges of teaching technology to college kids. That evening, Sam and I had a chance to sit down and get into a long debate about what AI was and was not, what was hype and what was real. I found the conversation so useful that I invited Sam to join us today to give us a similar briefing.

Sam, welcome. Glad to have you with us.

Sam Bogoch: Thanks so much, Larry. Great to be here.

Larry Jordan: To get ourselves started, how would you describe your background in AI?

Sam Bogoch: All right. Well, it goes way back, but maybe focusing on the last decade or so. When I left Avid, where I’d been responsible for media asset management products, I started Axle. We knew that a lot of people were going to be using video and that a lot more people would need some tools for searching and managing it.

Then, around 2018, this kind of tidal wave of modern AI techniques began to hit video. At that time, we actually added AI to our name and decided to really focus the company on the potential of this. So what we do is basically help people search their video, audio and image content. I would say I’ve spent a few decades now, initially working in AI, then I was at Avid, then working in images and then back into video.

So, the latest developments are really exciting, but also a little bit overwhelming compared to what came before.

Larry Jordan: I should mention that you got your start playing with neural networks long before you got to Avid. If I remember correctly, your background was in neural networks and research, was it not?

Sam Bogoch: It was to some degree. And in the late 80s and early 90s, I was actually designing and building multiprocessor accelerators for neural network algorithms. But they were so primitive back then – you know, they’ve come a long, long way since.

Larry Jordan: Well, yeah, but you get credit for playing with it before the rest of us had even heard about it. So I just wanted to mention that. How would you define AI today and what’s the difference between AI and machine learning?

Sam Bogoch: So machine learning is literally what it sounds like. It’s teaching machines about different patterns and content, both in text and in other media. So machine learning is really just how you train computer systems to recognize and respond to inputs. AI is a very broad term that means kind of nothing and everything at the same time.

Because, you know, the vision of artificial intelligence, or AGI (Artificial General Intelligence), is still quite a ways out. But everything, all the steps and building blocks that have been deployed up until now, is also counted under that. So it’s kind of more of a historical trend line than it is any given thing.

But it’s taking computer technology and making it work very hard to try and simulate human thought processes and biological networks, let’s say.

Larry Jordan: As you know, there’s a lot of fear that AI is going to destroy the creative industry, including media. How worried should we be?

Sam Bogoch: I think it’s reasonable to be worried. I think there is a real threat here, and a huge opportunity. So we all have to be mindful of both of those. The threat is, of course, that big swaths of jobs could disappear or kind of be engineered out of existence. The opportunity is that all this cool stuff kind of lands in our lap, and we can do cool stuff with it. But whether the number of people after this transformation who work in this industry will be the same, less or more than they were at the beginning, is still to be determined. And we’re really on year 1 or 2 of a decades-long journey.

Larry Jordan: One of the things that puzzles me is: is it possible to use AI without stealing creative work, without impinging on copyrights? In other words, is there any application where AI can enable us to get our work done, as opposed to replace us?

Sam Bogoch: Absolutely. There are many such applications. Again, we focus on one of those, which is that our software typically runs on private networks. People point it at their own content, not anybody else’s. They train it on their own content, and this yields useful results in terms of searching and repurposing that content. So this is a harm-free use of AI.

There are other harm-free uses of AI. Everything from artificial lighting to, you know, generative AI for virtual sets. As long as your generative AI wasn’t trained on somebody else’s virtual sets or somebody else’s content. Now that’s getting trickier and trickier to police… but the bottom line is that if things are built in a reasonably ethical way with consideration for IP and who owns what, then there are thousands of applications of this stuff.

There are also the big hyperscale or cloud applications that, at least in recent years, have been hoovering up a lot of other people’s data – so-called scraping – training on it and then regurgitating it in ways that, in many cases, violate those people’s copyrights and ownership. So, it’s a whole spectrum of possibilities.

Larry Jordan: What does the word “training” actually mean? What’s going on?

Sam Bogoch: What’s happening is you create one of these architectures – these large language models, for instance – and, like human beings who have eyes and ears, it has inputs, and you feed it tons and tons of stuff, much as you might an infant or a child. You just kind of show it a bunch of stuff, you know, hand it books, put it in front of the TV screen, see what sticks.

That’s what’s going on. There’s a lot more to it, of course, because these are not infants. They’re not completely self-regulated. You have to apply techniques for feedback and for checking after the fact. You might have heard of the acronym RAG (Retrieval-Augmented Generation), which essentially provides for that feedback where the incoming content and the learnings from it have to be checked against known good sources.

But, essentially, all of this stuff is just being poured into different AI engines, and then they’re being judged on the quality of their responses. It’s kind of an arms race right now between OpenAI, Meta, Google, and Mistral, and Apple just came out with a model. There are hundreds of these models, and they’re all scrambling to be fractions of a percentage point more accurate, more responsive, or better performing than the others.

Larry Jordan: Is it a true statement that AI can only copy, that it can’t create? And is it likely to continue that way?

Sam Bogoch: No, actually, I believe it can create already, even though it falls way short of what we would consider human intelligence or something that you could get comfortable with coming from another person. But the point is, it can absolutely synthesize new things. Absolutely. And if you look at the output of things like Sora, for instance, or some of the other generative AI models, a lot of that is really new.

It’s only when the stuff passes straight through without much processing, so that on the output side you’re getting pretty much what was fed in (which can easily happen), that there’s no creating at all happening. And that’s been the basis of a number of lawsuits.

Larry Jordan: One of the things that you’ve said is that training is essential to being able to create an AI system. AI doesn’t live in a vacuum. It has to be trained. Is it possible to train an AI system without violating copyrights or stealing the creative work of others?

Sam Bogoch: Sure, you can form agreements, and this is what’s happening right now with some hyperscalers, where they’re actually creating deals with content owners: we can pay you, whatever, $10 million to rummage through your stuff and train our stuff with it. I do think that it’s a little bit of a short-term phenomenon, because as these systems get more sophisticated, the training will probably evolve to the point where there isn’t much risk of the output being an immediate derivative.

At that point, for instance, nobody charges you beyond a cable bill to park your kid in front of a TV and have them watch old movies. That’s pretty much what’s happening with these AI engines. But it’s creating an uproar because it’s not training a person. It’s training software of a kind. So we will reach a point where I think, well before these things have legal status or whatever, that people will just realize, okay – sure. Train them on whatever you’ve got. Because what comes out hopefully is not a direct link to what came in. And much as you can think about nickel and diming people for every article they read, it may reach a point where there’s kind of a generally accepted approach to doing this, but today they’re having to do direct bilateral deals.

Larry Jordan: I am not a fan of AI, which comes as a shock to nobody. Why can’t I just not use AI tools?

Sam Bogoch: You always have that choice, at least for the near future. The longer-term problem is that it’s being baked into almost every digital product you use. So the latest wave of laptops and desktops and tablets and phones, they all have significant AI processing capability. And all of the software vendors are taking advantage of the NPUs (neural processing units).

You know, there are still authors out there who use WordPerfect because they never quite got accustomed to Microsoft Word. You might be like one of those guys, but you probably don’t want to be that person. So the challenge is how to take advantage of this stuff in a selective and sensible way, to your advantage.

You know, a lot of people talk about Luddites like it’s a slander: “Oh, you’re just a Luddite!” But actually, the Luddites had a lot of interesting stuff to say. They were resisting the early Industrial Revolution. They were looking at factories and coal and child labor and raw sewage, and they were like, you know what? I’m not interested in any of this. I’m going to live in the country. I’m going to resist a big train coming to my town. They even resorted to terrorist acts, like spiking the train tracks and stuff. I guess the point is, they recognized, nearly 200 years ago, the threat the technology posed to traditional ways of life.

And we’ve had multiple threats to our “traditional ways of life” since then. AI is just the latest one.

Larry Jordan: A comment was made at the 9:00 AM session that AI is… or machine learning is… I should probably be more accurate; machine learning is very much like word processing. In the old days, before word processing, offices were filled with secretaries who did nothing but type. And now with word processing, all those secretarial jobs are gone because we can do it on our own laptop or our own computer.

Is that a good analogy to what we’re seeing with machine learning? The potential of machine learning?

Sam Bogoch: It’s an analogy. I will say that if you’re in a creative industry, as I imagine most, if not all, of the people on this session are, it goes a little deeper than that, right? This isn’t just replacement. I mean, that would be more like saying this is like when non-linear editing replaced tape-based editing. That would be the kind of step that would be analogous here.

We’re really talking about reshaping the nature of the medium and how you can conceive of and deliver video content. It’s going to really shake it to its core. And many of the steps that people follow are going to get kind of rearranged, because you have all these options late in the process.

Larry Jordan: A question from Tom. He asks, what is the best method or practice to leverage AI to our advantage? What do you see as the new avenues to explore?

Sam Bogoch: My advice is just to really stay on top of the trends and look closely at whatever product or service you’re considering using. Decide for yourself what its impact is on you and maybe, what its impact is on the broader society. You know, we have to make these decisions every day anyway. Right now, things like social media have already proved to be both very valuable and very insidious.

And yet people wake up in the morning and continue to use different services and benefit from that. But we have to be really on guard for the bad stuff. And that’s not just things like deepfakes, but it’s also situations where you’re consuming content or using tools that may in fact have grabbed their skill set from other people’s content and other people’s tools.

It’s a blurred version of copyright infringement. So, because this is a blurred version of technology, you have to be extra vigilant.

Larry Jordan: Building on Tom’s question, last year the actors and the writers went on strike because, among many things, they wanted better protections against the encroachment of AI into their writing and into their acting. NPR announced two days ago that the video game actors have gone on strike for exactly the same reason.

They don’t want to be replaced with synthesized video or audio. But these are unions, where a lot of people are all gathered together as a group, and most editors are single practitioners or work in small groups. What can we do, if anything, to guard against the encroachment of AI, or, as you say, do we just have to roll with the punches?

Sam Bogoch: It’s a tough one. There are real threats and real opportunities here, and the pace is only going to continue. One thing that probably bears mentioning is that we’ve been through some technology waves in the last decade that turned out to be blips. Right? You know, blockchain, or Web 3.0, or the metaverse – I mean, everyone was talking about them and talking about them.

Then, the next thing you know, they kind of fizzled. AI is not going to be one of those. There will be peaks and troughs. And I think we’re actually in the peak right now. So, the current wave of AI technology has done about all it could do and we’re seeing diminishing returns on some of that.

But then the next wave of this stuff is just kicking off, and I think as individual practitioners, whatever you do for a living, and especially if you’re in a creative field, you need to look at this stuff as empowering you, because it will, there’s no doubt. And it’s a race. There are several races happening.

One is on the technology supply side. There are all these cool new technologies. They can either help you do your job better, or they can replace you, or anywhere in between. And then on the demand side, what’s really interesting to me is that the world has already changed significantly in the last decade, in favor of many more sources and destinations for content.

When I worked at Avid, I was there for five years. We sold to about 1,000 customers worldwide for my products, which were essentially Media Central and Interplay. Those customers are the big broadcasters. There were only so many of them in the world. We sold to about half of them, GV (Grass Valley) sold to the other half, and it never really changed much from year to year. The budgets didn’t change.

You can go back and look at those same broadcasters today. Their budgets are probably the same or slightly lower, adjusted for inflation, than they were then. Meanwhile, all these other sources of content have been bubbling up, right? You’ve got sports, churches, universities, governments, corporates. That’s where all the content is. And YouTubers, creators, all this content is coming from these folks.

There wasn’t anything like today’s level of activity ten years ago. So this is where the growth is. And it’s multiplying as it grows, because everything’s decentralizing. More and more people are getting this capability.

So the race is on. On the supply side, you’ve got to stay ahead of the curve. You’ve got to learn the technology and make it work for you. And then on the demand side, you need to be looking out for who your next customers are, because there will probably be more of them. They’ll probably be smaller. But, you know, instead of waiting for that one editing job on that big movie, you might have five editing jobs for different corporate customers that need promos posted on social media. And, to the extent that AI can help you do that better, it’s your friend.

Larry Jordan: Andrew’s asking, what are the opportunities for the large corporates but not the self-employed? Oh, I think you’re saying “AI provides opportunities for the large corporates but not the self-employed editors. Our clients will edit their own content and won’t need experienced creative editors because, at the end of the day, it comes down to the cost. They won’t hire us when they can do it themselves via easy editing using AI.”

It seems to me that there are a lot of very simple productions – the CEO speaking to a training group, or something where they need video but it doesn’t require a craft editor to put it together. That currently has been a craft editing function. It seems to me that’s going to become increasingly scarce or, more importantly, be done by a non-craft editor, because the selects can be done by AI and put together automatically.

Sam Bogoch: Yep. And that means we all have to be pickier about where our skills are really needed. Right? That same company that’s doing routine communications may also need high-quality ads to be done with more production value. And because of the proliferation of social platforms, people want a vertical, they want a horizontal, they want a large screen, a small screen, different bitrates.

It’s about creating high-quality content that is viewable in all these contexts. But it is also true that the vast majority of content – low-value corporate communications – is either going to be auto-edited or edited by the people who captured it on their phones, you know, just using the simple phone editing app.

And that’s it. I wish I could offer reassurance that it’s all coming our way and this is only going to get better, but it actually is kind of a scramble.

Larry Jordan: Well, the good news is it isn’t your fault. So that’s a good thing. Tom asks a question which is above both our pay grades, but I’m going to mention it anyway and give you my opinion. Tom is asking, who should be holding the reins and guiding AI? And I think we can categorically state that it’s neither Sam nor myself that are going to have that responsibility.

But I also think it’s not going to be the industry. I think the industry is going to continue competing and releasing stuff. I mean, we’ve seen this traditionally over the last 20 years: they’ll release it without worrying about the societal impact if there’s money to be made. I think regulation of AI has got to be done through legislation or other governmental regulation of some sort, whether that’s through the EU or the United States.

I don’t see that the industry has ever succeeded at regulating itself where society is involved. But that’s just my opinion. Sam, you’re welcome to disagree.

Sam Bogoch: And in a field that’s changing this fast, it’s such a moving target. I mean, I actually respect what the EU has done with its regulations. But already they’re being kind of scapegoated for it, because there are no EU big winners in tech, right? So people are saying, well, this is exactly the kind of attitude that’s prevented a European Google or a European Meta or something like that, and there never will be one if they keep going like this.

The other point of view is that the American example has gotten out of control and that you have forces at work that the government is powerless to stop. So, I don’t have a strong position on this, but, like you said, we’re not in charge of that. I think all we can do is hope that at some point reasonable choices get made.

Larry Jordan: A bunch of questions coming in. The problem, Andrew says, is that the way most editors get started is doing low-value work to get experience. So, Andrew continues, it looks like machine learning will destroy the food chain, because an editor doesn’t edit high-end stuff straight away.

In other words, it’s going to be hard for the new person to get started.

Sam Bogoch: I actually have a very contrary point of view. I think it’s easier than ever for a new person to get started in this industry, because they’ll start by editing social posts. AI will do the generic ones, but new editors will do some really cool ones. Those will attract the attention of somebody in the marketing department or somebody who’s got a budget.

Those people may not be established editors, but I think actually there’s a bigger threat from this groundswell of people coming up from social media and kind of redefining what editing is. I will say that I’ve seen some amazing, high-quality, super-analytical kinds of editing going on at social media companies.

I can’t name names and so forth, but just very detailed, team-oriented, multicam, super-thoughtful editing that rivals in many ways what you see in the biggest movies. And those people mostly bubbled up from social media. They’re often young, they’re not unionized, and, again, they’re riding some of these waves.

They’re using some of these tools. So I actually think more, rather than fewer, people are coming up in our industry, but not in the traditional parts of it.

Larry Jordan: I don’t want Robert to feel I’m ignoring his question, but neither Sam nor I can answer it: How can we promote awareness of AI servers and scraping impacting energy and water, cooling costs, and climate change migration? Can you answer that in 20 words or less?

Sam Bogoch: Yeah. It’s terrible. I think we should all be shouting about this. You know, AI is already sucking down more power than some countries. The good news is that this is a very inefficient early phase of the technology. Actually, it’s better than Bitcoin, right? Bitcoin started that way and it’s still that way.

Within a year of ChatGPT 3.5 coming out, they were already figuring out ways to deliver it on a laptop or a modestly powered server. And yes, you can spend billions of dollars and set up these huge farms, and maybe those still will be the leading edge, but it will be R&D for mainstream AI that is hopefully much more energy efficient.

It can’t go on like this. There isn’t enough money on Wall Street to fund all the crazy server farms that people want to build. And that’s as it should be. I’m totally opposed to the idea of these crazy, wasteful setups. And it’s not clear that a whole lot more benefit is accrued by them.

I think if you spend a thousand times as much on energy to get a 1% better result from your AI engine, unless that engine is solving some huge problem, then you just wasted a lot of money. So I think we’re coming to a more sensible place.

Larry Jordan: We have unleashed the forces of philosophical and political debate. Pablo asks about scriptwriting: I can use AI to help my writing. Being smart enough, one can get help in developing and refining scenes, as would be done with other writers. It looks like I can replace some of the other co-writers. The question is whether the final product or script can be considered the result of my own creative process, and how could a producer recognize the contribution of AI?

I think that is a huge question, which we were talking about earlier: the idea of labeling AI content, and how much is AI versus how much is contributed by the creative writer. The answer right now, at least for the books that I write, is what my publishers have told me: I cannot use any AI-generated images and I cannot use any AI-generated text, because they can’t legally get copyright for it.

So, I am contractually obligated not to use AI in any of the creative work that I do for a publishing company. Because copyright is not resolved. They’re taking a very conservative approach. Sam, have you had other experiences in that regard?

Sam Bogoch: Well, I think they’re happening all around us. I don’t have any special insights, except to say that I think we’re going to end up in a more nuanced stance on that, because there are clearly uses of AI in a creative loop: you’ve got something, you hand it to the AI, it gives it back to you with some permutations, and you edit it further.

I don’t think that should be out of bounds; that’s a reasonable use of tools. But obviously, if you just take something handed over wholesale, like the kid turning in a term paper that he or she didn’t write, that’s clearly not good. So, I don’t know, but I think we’re going to end up in a more nuanced place.

You know, when every word processing application has built-in AI, it’s going to be harder and harder for your publisher to insist on that.

Larry Jordan: Brandon writes, all of us who have copyrighted material on YouTube are powerless to prevent this copyright theft, but there should be a recognition that it’s occurring. I don’t think there’s any question. Everybody recognizes it’s occurring. What is not happening is figuring out a resolution to how this is occurring.

Adobe made an announcement at NAB in April that they were now flagging every piece of AI-generated art in an Adobe application, and set up a standard so that other companies can subscribe to it to label AI-generated work.

But the problem is that it is voluntary participation. It’s not required. So some companies may label it and other companies will not. I think there’s industry-wide recognition that some sort of AI labeling needs to occur, but there is no consensus within the industry about how that’s going to be done. Personally, I have limited expectation that voluntary industry collaboration will solve it.

I think it’s going to have to be forced upon the industry.

Sam Bogoch: Great comments, everybody. I’m just trying to keep up with them coming in fast and furious. But it’s worth noting, by the way, that OpenAI just declined to put in a watermarking technology.

Larry Jordan: Declined putting it in!?

Sam Bogoch: They declined. Their thinking was: we have it, we were ready to do it, but then we did a customer survey and found that 15% fewer customers would use our service if we watermarked the stuff. So we’re not going to do it.

Larry Jordan: Oh my goodness!

Sam Bogoch: So I was like, okay, well, we tried! Like you said, these things are going to be voluntary. For every vendor or service that tries to be rigorous about it, there are going to be a dozen that are kind of the Wild West.

Larry Jordan: Paul makes a really good comment. He says these high-quality social media productions that Sam mentions are productions; they are completely dependent upon how they are shot, not directly on how they are edited. Editing, like AI, depends upon what you feed into it. In point of fact, whether we’re editing a Hollywood film, a social media post, a newscast, or a special social media event, we are using Avid, we’re using Premiere, we’re using Resolve, or we’re using Final Cut.

Those tools are the same regardless of what the output destination is. So in point of fact, it really is what is fed into it, not the editing tool that we use. Would you disagree?

Sam Bogoch: Oh well, I mean, all the tools that you mentioned are professional-grade editors. I think a lot of folks who are editing for social media are also using CapCut. There’s a variety of these very lightweight, either phone-based or browser-based editors. So that’s what’s new here; the traditional editing space has not changed that much.

Though even they show signs of morphing pretty quickly with all this technology. It’s also worth mentioning that the ways people shoot are changing. In some cases, at the high end of social media, I’ve seen dozens – literally dozens – of cameras on a single scene. You know, they’ll be taping GoPros to the legs of tables and things, and then they’ll have more multicam than you’ve ever seen on a screen at one time, and they’re saying, oh, I like number 17, that’s got the right feel to it.

I mean, you would never see that on a movie shoot because they’ve got big, expensive cameras. They point a couple of them at the scene and that’s it. Here it’s like, oh, I just put my phone up over here. I got some GoPros over there. I got seven cameras and we’ll see what comes out of it.

Which actually generates more editorial work, because you have to choose; whereas if you’re intentional about it, you never have that problem. The same kind of thing happens in reality TV, where they literally have cameras all over the house and then decide later what to use. So again, it’s editorial work, often supplemented by AI for automatic transcription and things like that.

But it’s a different kind of work than traditional craft editing.

Larry Jordan: Just thinking about that, I was reading an article this morning that the audio mixers for the Olympics used 28 mics on the pommel horse – 28, just on the pommel horse. Oh my goodness.

Sam Bogoch: I gotta get that squeak just right.

Larry Jordan: That’s a new challenge in mixing. Now, here’s a really good question, which we spent all last week discussing at the UFVA conference. They asked, how will AI affect education, teaching, and learning? Will AI technology trickle down to schools or grassroots end users, and how will textbook publishers use AI? First, the publishers don’t know how they’re going to use it either.

They’re debating it now; I was talking with them last week. But educators are really worried about the impact of AI on the classroom. Because if the administration of the school views teaching as simply transferring knowledge from the “sage on the stage” – the teacher – to the student, that can be done using AI. If, on the other hand, teaching is enabling students to become the best possible people they can be, that requires human intervention.

AI is going to be cheaper because you don’t have to pay the salaries of teachers, but will it be as effective? And one of the questions that was being debated constantly with no resolution is what is the impact going to be? Because AI can be delivered a lot more cheaply, but will it deliver an education?

What students are going to be able to get from it is a huge, untested question.

Sam Bogoch: I think I should toss in something on that note. I have a lot of data points, because my wife is a college professor and my daughter is applying to colleges – she's going into her senior year in high school. I will say, if there was ever an area that was ripe for disruption by AI, it's higher education.

You know, colleges cost a ridiculous amount. Tuition is just crazy expensive. And it's because it's one of the last parts of the economy that has not experienced any modernization or any economies of scale. You know, a teacher still needs to walk into a classroom where maybe 50 kids watch, and you're still paying for that retail interaction whether you want to or not.

Meanwhile all these things have sprung up. Video. The internet. Those things ought to be helping people learn. Then the teachers could be doing high level work analogous to what we’re saying with editing, where what you get paid for is your judgment, not your brute force work. And I think society would be a better place.

You know, I don’t want to say teachers don’t deserve their jobs, but their jobs have to shift. And you know, you can’t have college tuitions that approach $100,000 a year and put big chunks of our society in debt unnaturally; then have to have government edicts wiping out all the student debt.

It’s like – wait a minute! Couldn’t there be more efficient ways to teach people? So I’m hoping that that works for the better in that part of the world. I don’t think video editing is as dramatically in need of an overhaul as the college education system. But I think it’s got to be healthy if some of this makes its way into that space.

Larry Jordan: James has a comment. Let me read it. James writes: I agree with Sam that AI can make it easier for new editors like me to edit our productions. For example, it has always been difficult for me, a lawyer with 30 years of experience, to convey to a video editor or website designer who isn't a lawyer the reasons I want a production to be edited in a certain manner.

Now that I’ve learned NLEs, with AI’s assistance, I can further harness AI to help me edit my own productions as I wish, because I know how my intended audience will understand my work. This is where I see AI enabling the new user to get started. Which is to your point, it allows people to get started that have less initial knowledge because AI takes over some of the fundamental steps.

Sam Bogoch: Yeah. And it demystifies the space for everyone. You know, just think back a few years, people were not conversant with video the way they are today. And I mean, everybody. Today people are very conversant with video and they know how to make simple cuts and they know how to shoot video on their phone, usually badly and usually in a vertical aspect ratio.

But, give them time to figure it out. So that awareness is creating an audience for good editing, but it’s also raising the bar for commodity editing because people can do that.

Larry Jordan: James continues, so much of the way we humans communicate with each other is filled with nuance, tone of voice, paralanguage, messages, etc. Sam, he writes, do you think it will take quite a while before any AI product can capture the nuances of human communication?

Sam Bogoch: I think it’s happening sooner than you think. And the capabilities of these systems are pretty staggering. If folks are interested, there was this This American Life episode last weekend. They spent about 30 minutes on this topic, and they had some really jaded industry experts commenting on some of the results they were seeing. I think we are going to continue to be both pleasantly and unpleasantly surprised at how powerful these systems are over the coming years, particularly as we’re getting into media centric AI, right?

What’s called “multimodal.” Last year, the big thing was text. This year, and going forward, it’s all going to be about video, audio, photo and text. And it’s going to be increasingly hard to distinguish what these things output – and how they respond to input – from what a person would do. So, again, it’s already a little bit troubling for some of the close observers of these things, but we’re going to have to wrestle with it because it’s definitely happening. And we’re only a year or two into this.

Larry Jordan: James continues: I taught law school students as an adjunct professor of law. While I can help educate our students, I don't think AI can share wisdom, judgment, and decision-making capabilities with students. I share my 30 years of legal practice with my students. So what he's suggesting is that AI is missing the experience that we have; it can track the facts, but not the experience.

Sam Bogoch: Yes. Well, exactly. I mean, there is no experience in most of these cases. What will be interesting is that, as the training process for these systems becomes more lifelike, they may start to accumulate experience. And again, I think it's naive to rule out chunks of this and say, oh, you know, it's just a glorified Turing test and it fooled us again.

But I don’t think for a minute that there’s intelligence back there. The hardest challenge over the next decade is going to be trying to figure out some kind of threshold, within which you say, well, that that’s not intelligent, because it’s going to keep challenging and pushing those borders.

Larry Jordan: I want to take us back to the beginning of our conversation, where you were differentiating between an AI function that creates, like generative AI, and something that doesn't create – a classic example is speech-to-text – where it's not necessarily creative, but what it does is recognize the patterns in the waveform and translate those into text.

This brings us to Brandon’s question. Transcription as machine learning is replacing the somewhat mechanical work of human transcribers, while Runway and Sora are copying the art and judgment of practiced cinematographers. Is there a difference between the machine learning of transcription and the generative AI of runway and sentence and Sora?

Sam Bogoch: They are orders of magnitude more complicated to deal with; you know, a complex visual problem over time, as opposed to just an audio stream that needs to be parsed. There are now free speech-to-text engines, like Whisper from OpenAI, that do an excellent job. And that's a part of life now, so almost any audio can be transcribed. There are commercial ones, like Speechmatics, that do a better job on a wider range of languages with more features, probably, but you can just assume that speech is now transcribed.
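A quick sketch of why transcription makes footage searchable: speech-to-text engines like Whisper typically return timestamped segments, and once you have those, finding a moment in an interview is a simple text search. This is a minimal illustration only – the transcription call itself is omitted, and the `{"start", "end", "text"}` segment shape here is an assumption modeled on Whisper-style output, not any specific product's API.

```python
# Minimal sketch: searching Whisper-style transcript segments by keyword.
# The segment shape ({"start", "end", "text"}) is assumed to mirror what
# speech-to-text engines emit; the actual transcription call is omitted.

def find_in_transcript(segments, keyword):
    """Return (start, end, text) for every segment mentioning the keyword."""
    hits = []
    for seg in segments:
        if keyword.lower() in seg["text"].lower():
            hits.append((seg["start"], seg["end"], seg["text"]))
    return hits

# Hypothetical output from a transcription pass:
segments = [
    {"start": 0.0, "end": 4.2, "text": "Welcome to the show."},
    {"start": 4.2, "end": 9.8, "text": "Today we talk about the environment."},
    {"start": 9.8, "end": 14.0, "text": "And later, the budget."},
]

print(find_in_transcript(segments, "environment"))
# → [(4.2, 9.8, 'Today we talk about the environment.')]
```

The point is the workflow, not the code: once audio becomes timestamped text, "find the part where she mentions the environment" becomes instant instead of a scrubbing job.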

Meanwhile, video content is not understandable by engines that simple, but it’s beginning to be – currently video content is created, but kind of in a monkeys-at-a-typewriter sort of way.

You can prompt it, you know; the prompts get longer and longer and more and more specific. People start putting in f-stops and saying, I want it to start like this, I want it to end like that. I thought it was interesting that Runway just added that end-frame capability. So you can create a path from this beginning frame to that end frame.

And then of course, you can make another 10-second segment that latches onto that, with the previous ending becoming the new beginning, and so forth. But I don't anticipate anything in the near future where people are not in the loop on that process. And it's incredibly empowering. I read a post by an editor the other day discussing all the possibilities he's using in post-production to adjust things after a shoot, so they don't have to go back and reshoot.
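The chaining idea Sam describes can be sketched in a few lines: each new clip begins on the previous clip's end frame, so a sequence of keyframes becomes a sequence of connected segments. This is a conceptual sketch, not real API code – `generate_segment` is a hypothetical stand-in for a Runway-style start-frame/end-frame generation call.

```python
# Sketch of chaining short generated clips: each new segment begins on the
# previous segment's end frame, so consecutive clips join seamlessly.

def generate_segment(start_frame, end_frame, seconds=10):
    # Hypothetical placeholder for a Runway-style generation call: a real
    # service would return rendered video; here we just record the request.
    return {"start": start_frame, "end": end_frame, "seconds": seconds}

def chain_segments(keyframes):
    """Build consecutive segments from an ordered list of keyframes."""
    clips = []
    for start, end in zip(keyframes, keyframes[1:]):
        clips.append(generate_segment(start, end))
    return clips

clips = chain_segments(["frame_A", "frame_B", "frame_C"])
# Each clip's end frame is the next clip's start frame.
```

Three keyframes yield two 10-second clips, with `frame_B` serving as both the first clip's ending and the second clip's beginning – the "previous ending becomes the new beginning" pattern.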

For example, if there was an expression an actor had on their face that kind of ruined the scene, but they only noticed it in post, they can fix that now. For creatives, I think it's a mistake to latch on to all the problems with this. Instead, I think focusing on the creative possibilities means saying: this whole field is going to be unrecognizable in five years, and even more so in ten.

We can be the masters of this because we already understand the fundamentals and we're already good at them, and we already know how to make stuff that looks good and tells a great story. So you can either ride the wave or be crushed by the wave. But it's also clear that these are complex problems, and that's a good argument for keeping humans in the loop; without them, you're just going to get endless amounts of junk video.

Larry Jordan: Philip asks: I’m an editor managing a small group of editors on a large archival project. I’ll keep those editors employed because they bring nuance and human insight to the work that is essential. But I’m being asked to accelerate the editing process using AI. I know this isn’t a product session, but I’m just curious how actual AI or other tools might assist with large archival assets.

We’re currently using Adobe teams to collaborate. I can think of several ways, but Sam, I’ll let you answer that.

Sam Bogoch: Yeah, sure. The biggest benefit we see is just that people spend less time combing through material and trying to tag it in some way, and much more time being creative. They can handle quick requests: you know, I need any footage of two people sitting at a dinner table, or I need B-roll, or I need footage of my candidate talking about the environment. We've had some great political use cases for our stuff.

And, you know, instead of spending a day and a half going through the archive to find that, it’s a second and a half for the search, and then they’re actually editing and going back and saying, what else, you know, going down through the search results. Maybe I’ll try it with that clip.

So it feeds creativity and it feeds higher quality output; searchability is kind of everything. And we think the numbers are, if anything, understated. You hear a number like 30 or 40% of an editor's time is spent just combing through material and looking for stuff, and I think that's probably conservative. So think about getting all that time back, being able to decide what really needs to be done, and then maybe using other AI techniques to implement the things that you need.

Larry Jordan: I use Axle for managing over 200,000 photographs from the 1830s to the modern day, and being able to say, I need a photograph from roughly 1905 that features the following people, and have it go into a database of that many photographs and find every one that matches those criteria. A friend of mine who was working for Object Matrix coined the phrase "If you can't find it, you don't have it," and I think that's true, especially in historical documentary or archive projects. You know that somewhere in there is a shot of a Model T breaking down on the highway. But if you can't find it, you can't use it in your show.

What Axle provides is the ability to find stuff, keep track of it, store hundreds of thousands of clips, stills, video or audio, and be able to track it all down by metadata. It's a powerful tool.
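The kind of query Larry describes – "a photograph from roughly 1905 featuring these people" – boils down to filtering records by metadata fields. Here is a minimal, self-contained sketch of that idea; the field names and record shape are illustrative assumptions, not Axle AI's actual schema or API.

```python
# Minimal sketch of metadata search over a photo archive: filter records by
# an approximate year and a set of required people. Field names ("year",
# "people") are illustrative, not any real product's schema.

def search_photos(records, year, people, tolerance=5):
    """Return records within `tolerance` years of `year` that include all `people`."""
    wanted = set(people)
    return [
        r for r in records
        if abs(r["year"] - year) <= tolerance and wanted <= set(r["people"])
    ]

# A tiny hypothetical archive:
archive = [
    {"id": 1, "year": 1904, "people": ["J. Smith", "A. Jones"]},
    {"id": 2, "year": 1930, "people": ["J. Smith"]},
    {"id": 3, "year": 1907, "people": ["A. Jones"]},
]

matches = search_photos(archive, 1905, ["A. Jones"])
# → records 1 and 3 (both near 1905 and featuring A. Jones)
```

Real asset managers layer AI-generated tags, full-text search and indexing on top, but the principle is the same: "if you can't find it, you don't have it" is solved by making every asset queryable on its metadata.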

Sam Bogoch: Someone asked me about educational resources, about AI, and, I mean, we have some on our site, but it’s such a new field and it’s changing so quickly that I think overall there’s a lack of those. It’s partly why we teamed up with Larry. I mean, he has such a great perspective on the editing space.

I think as AI techniques kind of infiltrate that, Larry, you’re ideally placed to tell more people about how this is all happening.

Larry Jordan: James has another question. It’s lengthy, so I’m going to summarize it. We’re familiar with the phrase garbage in, garbage out, where we can only deal with the material the computer is fed. James wants to know whether you see AI transcending the limitations of the creators. Can it create something which is greater than the information that it was fed in?

And my sense is the answer is yes.

Sam Bogoch: Definitely yes. But it will also generate thousands of things, and much of that will be garbage. So you have to be a very good sifter. You have to be like those guys that went out in the gold rush with their big sifting pans, looking for the nuggets of gold in there.

You know, it’s almost like a random number generator. I keep coming up with analogies, but, you know, random number generation or sifting for gold. So there will be really good stuff in there. And there will be lots and lots of junk.

Larry Jordan: Gary asks: I’m involved in supporting kids with special needs. Recently I’ve seen examples of very challenged kids using AI to create content that they otherwise would have no possible access to. These kids are already using devices to speak for them, and so now to be able to convey their visual ideas is great. As Sam suggested, the teaching team’s focus needs to shift to a place of guidance. I think that picks up on the theme you were saying before.

Sam Bogoch: Yeah. A lot of these technologies really help disabled or partially-abled people. And I think a lot of the boom in tagging has come from very low-cost labor in places where there's a huge groundswell of available workers who might not qualify for other jobs but can absolutely sit in front of a screen and tag content. Their work is being used to train the AIs that are now doing the tagging, at which point they'll hopefully work their way up that particular food chain over time as well.

Larry Jordan: Paul asks a follow-up question: My photo database includes a lot of custom-added metadata, so it's not just having it automatically recognized by AI; we're typing in descriptions and physically adding copyright information and photographers' names. So we can do that. Paul continues: Here's a legitimate question for Sam (as opposed to a bunch of illegitimate questions, I might add).

Since this is what’s happening, does Sam have any advice for what we can go to educate ourselves in using these AI tools, the different apps or platforms or even institutions that may be teaching us how to integrate AI into our knowledge?

Now, before you answer that, Sam, I’ll let you think for a second. Future Media Concepts two days ago announced a series of eight AI bootcamps that cover broadcast television, broadcast audio, editing, visual effects, and audio mixing. So that’s one – Future Media Concepts. They have been doing training at NAB for many years.

But are there other places that people can go to learn what tools are available?

Sam Bogoch: Right now, I don't think there's a really good go-to source, even on platforms like YouTube. It's very scattershot and, like I was saying, that's a big part of why I think you have increased relevance. As long as the world wasn't changing that much, you had your Final Cut and your Resolve and your Premiere; I mean, new releases were coming out over the last few years, but overall it was a kind of fixed set of tools.

Now, the tool set is just exploding with a lot of AI powered alternatives, browser and mobile alternatives. So it’s actually a great time to be a video editing and creating expert because I think you’re going to have a much busier job, probably.

Larry Jordan: Sam, we need to wrap this up. But I want to end with this question because I liked the answer you gave at nine. Let’s see if we can come close to it now: We know that we can’t tell what’s going to happen five years out. We’re lucky if we can tell what’s happening next year. But what’s your advice for the near term?

Sam Bogoch: Read up on this stuff. Subscribe to relevant sources on your social feeds. I’ve actually just started an AI for Media group on LinkedIn. It’s called “AI for Media.” There are others. Then just take in as much as you can and spend, you know, 10% of your time – if you can set aside a little time every day – to get up to speed and see some of the new developments. I probably spend more of my time on that.

Of course I’m in the industry and I find I can barely keep up! But to the extent that you are up to speed on it, you will get to know more than almost anybody you’re likely to talk to and then can be a resource for them. They’ll say, wow! that guy knows what’s going on; I’m going to use them on my next editing job, or I’m going to have him on my next shoot or whatever.

So, I’d stay on the side of knowing more and try to make your own judgments about which technologies you want to use. A lot of them are open source and they’re becoming much more affordable and accessible.

So you become an expert. I think everybody who’s on this call can absolutely become an expert over time. It’s just a matter of ongoing commitment. And it’s fascinating stuff!

Larry Jordan: I think that’s as good a way to end as any. I want to thank Sam Bogoch, the founder and CEO of Axle AI, for joining us today. And thank you for joining me on this Power Up webinar.

