mirza.town - my very own scribe

_^18/06/2026

Computa, Transcribe da Meeting!

The palest ink is better than the sharpest memory.

I mentioned that I built a meeting transcriber in my previous post.

Here’s a crude specification of what I wanted:

Work with a single audio/video file as input, no need to handle multiple inputs per session.
Speech-to-text transcription with timestamps.
Detect speaker changes and mark them in the transcription.
Able to recognize known speakers. Should be easy to add new speakers.
Should be able to handle Turkish, specifically.

Nice to haves:

Spinning up a remote STT service and using it for transcription. (This is still dramatically cheaper than using a paid STT service - with the exception of Groq.)
Detecting overlaps between speakers. (This is a hard problem to solve, and I’m not sure it’s worth the effort. I included this work-in-progress feature in the final transcript output for this blog post. I’m still iffy.)
If the input is a video file, directly feed the frames from the video to a Multi-Modal LLM so it can describe what’s happening in the video. (A whole new project in itself but it can increase the usability of the tool significantly.)

I actually built a version_0 of this tool back in November 2022. That first version only transcribed the audio and that’s it. I remember feeding the entire transcript to an LLM to generate a summary of the meeting. Not so fun but still generated some useful insights, even back then. :^) I’ve iterated on the design, added features, and fixed bugs since then.

How?

We have three problems at hand:

We don’t know what the speaker is saying.
We don’t know who the speaker is.
We don’t know when the speaker is speaking.

The first one is easy. Whisper can transcribe the audio and give us not only the text but also the timestamps - even at the word level. However, Whisper won’t differentiate between speakers, or even start a new segment when the speaker changes. If two people are speaking at the same time, Whisper will transcribe them as one segment. So an example of the Whisper output would be:

{
  "start": 12.40,
  "end": 15.50,
  "text": " I think we should move the deadline no wait let me finish to Friday the budget isn't approved yet.",
  "words": [
    { "word": " I",           "start": 12.40, "end": 12.52 },
    { "word": " think",       "start": 12.52, "end": 12.68 },
    { "word": " we",          "start": 12.68, "end": 12.78 },
    { "word": " should",      "start": 12.78, "end": 12.95 },
    { "word": " move",        "start": 12.95, "end": 13.10 },
    { "word": " the",         "start": 13.10, "end": 13.18 },
    { "word": " deadline",    "start": 13.18, "end": 13.45 },
    { "word": " no",          "start": 13.48, "end": 13.58 },
    { "word": " wait",        "start": 13.58, "end": 13.72 },
    { "word": " let",         "start": 13.72, "end": 13.82 },
    { "word": " me",          "start": 13.82, "end": 13.90 },
    { "word": " finish",       "start": 13.90, "end": 14.08 },
    { "word": " to",          "start": 14.10, "end": 14.18 },
    { "word": " Friday",      "start": 14.18, "end": 14.45 },
    { "word": " the",         "start": 14.48, "end": 14.55 },
    { "word": " budget",      "start": 14.55, "end": 14.78 },
    { "word": " isn't",       "start": 14.78, "end": 14.92 },
    { "word": " approved",    "start": 14.92, "end": 15.18 },
    { "word": " yet.",        "start": 15.18, "end": 15.50 }
  ]
}

Let’s deal with this problem later.

Now we can get the words and the timestamps but we don’t know who is speaking. We kind of dealt with the first and third problems just by using Whisper. _{Note that Whisper’s timestamps are not always accurate. It’s a good approximation but not always precise. Anyways…}

For the second problem, we can use Speaker Embeddings to map the speaker’s voice to a space with high dimensionality. If you’re not familiar with embedding models, here’s a quick explanation:

An embedding is a vector of numbers that represents a piece of data. (Just a list of numbers.)
The embedding space is a high-dimensional space where the data is represented as points. (Every number is a signal for that dimension. This dimension may or may not be understood by humans, by the way.)
The distance (or sometimes the angle) between two points in the embedding space represents how similar the two data points are. (The shorter the distance, the higher the similarity.)

Check out my other post if I got you curious about embedding models, I think they are super cool.

So, similar embeddings mean similar speakers, neat.
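To make "similar" concrete, here is a minimal sketch of cosine similarity, which is one common way to measure the angle between two embeddings. The toy 4-dimensional vectors and the names alice_1, alice_2 and bob_1 are made up for illustration; real speaker embeddings have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings: two clips of one voice, one clip of another.
alice_1 = [0.9, 0.1, 0.0, 0.2]
alice_2 = [0.8, 0.2, 0.1, 0.2]
bob_1 = [0.1, 0.9, 0.3, 0.0]

same_speaker = cosine_similarity(alice_1, alice_2)
different_speaker = cosine_similarity(alice_1, bob_1)
print(f"same: {same_speaker:.2f}, different: {different_speaker:.2f}")
```

The two clips from the same voice should score close to 1, while the clip from the other voice scores much lower.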

We can compare every single embedding to every other embedding and note the closest ones to each other. Of course, this is slow and requires a threshold value to determine if two embeddings are similar enough to be the same speaker. This threshold is really hard to nail down without manually checking the embeddings and rerunning the algorithm multiple times. What if we first reduce the dimensionality of the embeddings so we can visualize them in 2D or 3D space? We can use PCA or t-SNE for this but I almost always end up using UMAP. Remember, we want to reduce the dimensionality while preserving the similarity between the data points, so similar data points should be close to each other in the reduced space.
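For reference, the brute-force idea from the start of that paragraph could look roughly like the sketch below. It is a naive greedy grouping, not what the final tool does; the function name group_by_threshold, the toy vectors, and the threshold values are all assumptions for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def group_by_threshold(embeddings, threshold):
    """Give each embedding the label of its most similar predecessor,
    or a fresh label if no predecessor reaches the threshold. O(n^2) comparisons."""
    labels = []
    for i, emb in enumerate(embeddings):
        best_label, best_sim = None, threshold
        for j in range(i):
            sim = cosine_similarity(emb, embeddings[j])
            if sim >= best_sim:
                best_label, best_sim = labels[j], sim
        if best_label is None:
            best_label = max(labels, default=-1) + 1  # open a new group
        labels.append(best_label)
    return labels

# Hypothetical toy embeddings: two voices, alternating.
embeddings = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.2],
    [0.8, 0.2, 0.1],
    [0.2, 0.8, 0.3],
]

labels = group_by_threshold(embeddings, threshold=0.9)
too_strict = group_by_threshold(embeddings, threshold=0.999)
too_loose = group_by_threshold(embeddings, threshold=-1.0)
print(labels, too_strict, too_loose)
```

With a sensible threshold the alternating voices fall into two groups. Too strict and every clip becomes its own speaker; too loose and everyone collapses into one. That sensitivity is exactly why picking the value by hand is painful.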

There are clearly some distinct clusters there! We gotta be close now. We can use a vast number of different algorithms to find the clusters, K-Means, Gaussian Mixture Models, hell even Spectral Clustering can do the job. But here’s the catch: most of these algorithms require the number of clusters to be known beforehand. The shining star, my beloved HDBSCAN, can determine the number of clusters automatically. And it is one of the few algorithms that can handle noisy data and outliers AND is scalable at large sample sizes. Beautiful!

We can even generate a speaker timeline for easy viewing:

Save a sample from each cluster, name it and then assign the name to the corresponding cluster. Next time, we can compare a couple of embeddings from the cluster to the known, previously saved samples and determine if they are the same speaker based on a threshold value. Badabing, badaboom! Merge consecutive segments that have the same speaker name and we’re done!
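A minimal sketch of those last two steps, assuming each cluster is represented by a few sample embeddings and each known speaker by one saved reference embedding. The names identify_speaker and merge_consecutive, the 0.75 threshold, the 2-dimensional vectors, and the speakers "Alice" and "Bob" are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def identify_speaker(cluster_samples, known_speakers, threshold=0.75):
    """Return the known name whose reference is most similar on average
    to the cluster's samples, or None if nothing reaches the threshold."""
    best_name, best_score = None, threshold
    for name, reference in known_speakers.items():
        sims = [cosine_similarity(sample, reference) for sample in cluster_samples]
        score = sum(sims) / len(sims)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

def merge_consecutive(segments):
    """Merge neighbouring segments that carry the same speaker name."""
    merged = []
    for seg in segments:
        if merged and merged[-1]["speaker"] == seg["speaker"]:
            merged[-1]["end"] = seg["end"]
            merged[-1]["text"] += " " + seg["text"]
        else:
            merged.append(dict(seg))  # copy so the input list is left untouched
    return merged

# Hypothetical saved references and a freshly found cluster.
known_speakers = {"Alice": [1.0, 0.0], "Bob": [0.0, 1.0]}
match = identify_speaker([[0.9, 0.1], [0.8, 0.3]], known_speakers)
no_match = identify_speaker([[0.7, 0.7]], known_speakers, threshold=0.9)

segments = [
    {"start": 0.0, "end": 2.0, "speaker": "Alice", "text": "Hi"},
    {"start": 2.0, "end": 4.0, "speaker": "Alice", "text": "everyone."},
    {"start": 4.0, "end": 6.0, "speaker": "Bob", "text": "Hello."},
]
merged = merge_consecutive(segments)
print(match, no_match, merged)
```

A cluster that matches nobody (None here) is the cue to save a new sample and give it a name.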

Optimizations

Let’s circle back to the problem of overlapping speakers. Instead of trusting the timestamps from Whisper, we can create a sliding window of, let’s say, 3 seconds with 1 second of overlap and generate speaker embeddings for each window.
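The windowing itself is simple. Here is a sketch, assuming times are in seconds and that the last window is simply clipped to the end of the audio; the function name sliding_windows and that clipping choice are mine, not necessarily what the tool does.

```python
def sliding_windows(duration, window=3.0, overlap=1.0):
    """Return (start, end) pairs covering [0, duration].
    Consecutive windows share `overlap` seconds; the last one is clipped."""
    if overlap >= window:
        raise ValueError("overlap must be smaller than the window length")
    step = window - overlap
    windows = []
    start = 0.0
    while start < duration:
        windows.append((start, min(start + window, duration)))
        if start + window >= duration:
            break  # this window already reaches the end of the audio
        start += step
    return windows

windows = sliding_windows(8.0)
print(windows)
# [(0.0, 3.0), (2.0, 5.0), (4.0, 7.0), (6.0, 8.0)]
```

Each window then gets its own speaker embedding, so a speaker change in the middle of a Whisper segment shows up as a change of cluster between neighbouring windows.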

Before we embed, reduce dimensionality, and cluster, we should first do some noise reduction. After that, we should also run a Voice Activity Detection (VAD) algorithm to remove the silences so we don’t waste time embedding, clustering, and comparing.

By not using Whisper’s timestamps, we can also parallelize the embedding, clustering, and comparison step with transcription. So that’s a huge plus.
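Roughly, the parallel version could be wired up like the sketch below. Both transcribe and diarize are placeholder functions returning canned data (they stand in for the Whisper call and for the window, embed, cluster steps), and labelling each word by its midpoint is an assumption of mine. Since transcription here is a call to a separate container, plain threads are probably enough.

```python
from concurrent.futures import ThreadPoolExecutor

def transcribe(audio_path):
    """Placeholder for the Whisper call: words with timestamps."""
    return [
        {"word": " hello", "start": 0.2, "end": 0.6},
        {"word": " there", "start": 2.4, "end": 2.9},
    ]

def diarize(audio_path):
    """Placeholder for windowing, embedding and clustering: speaker turns."""
    return [
        {"start": 0.0, "end": 2.0, "speaker": "SPEAKER_0"},
        {"start": 2.0, "end": 4.0, "speaker": "SPEAKER_1"},
    ]

def speaker_at(turns, t):
    """Return the speaker whose turn contains time t, or None."""
    for turn in turns:
        if turn["start"] <= t < turn["end"]:
            return turn["speaker"]
    return None

def run_pipeline(audio_path):
    # Neither job needs the other's output, so they can run side by side.
    with ThreadPoolExecutor(max_workers=2) as pool:
        words_future = pool.submit(transcribe, audio_path)
        turns_future = pool.submit(diarize, audio_path)
        words, turns = words_future.result(), turns_future.result()
    # Only the final join needs both: label each word by its midpoint.
    return [
        {**w, "speaker": speaker_at(turns, (w["start"] + w["end"]) / 2)}
        for w in words
    ]

labelled = run_pipeline("meeting.wav")  # hypothetical file name
print(labelled)
```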

List of tools and libraries used

FFmpeg
noisereduce
PyAnnote’s Brouhaha
Speechbrain’s Speaker Embeddings
PyAnnote’s Speaker Embeddings (Weirdly, these didn’t produce good results for the dimension reduction step.)
UMAP
HDBSCAN
Whisper Large v3 Turbo (Inside a Docker container for easier maintenance. Specifically this container with the cuda tag. The maintainer also has a LOT of other interesting containers, check them out!)

Conclusion and a little demo

It’s still not perfect but paired with a decent LLM, this is enough for most of my use cases. Here’s a debate which I don’t suggest you watch but it’s a good example of how the tool handles overlapping speakers, which resembles a chaotic meeting. :^)

And here’s our diarized transcription:

[00:00 - 00:16] GEN: Hi, I'm Gettin and I explore social and controversial issues through both sides. Today I'll be moderating the Middle Ground episode of Pro AI versus Anti AI. We'll be talking about the potential risks, rewards, and fears surrounding the use of AI in our daily lives. The first prompt is, the media has wrongly portrayed AI as a threat.

[00:23 - 00:50] Mariam (overlap with Adib): So the thing is, the media is portraying this whole Terminator situation where it just kills us in that sense. I think the bigger issues are all the alignment problem where the robot does override a kill switch, but then I'm also seeing it as used as a tool in military warfare. So I think that's the actual threat, but the current picture that a lot of media and Hollywood will portray it as these Terminator situations. Totally.

[00:50 - 01:27] Adib (overlap with GEN): I'll give you one example. If you could be working in a factory and an employee gets their arm injured with a machine, sometimes an article will say, oh, a robot attacks employee at a Tesla factory or something. You see it with a car where, you know, there are hundreds and thousands of humans getting into car accidents every day, but there'll be one autopilot accident. And the media will play that out because at the end of the day, saying a Tesla killed a man or a Tesla hurt a man is going to get traffic. And traffic is hundreds of millions of dollars to some of these sites. Bring the disagreeers.

[01:32 - 02:16] Liron: I think the Terminator movie is a gift because the Terminator movie, when you see Arnold Schwarzenegger with a gun taking down a whole city's police force, that gives us some intuition that's a lot better than thinking about how you feel when you're typing on your computer. When you type on your computer, you feel like you're the boss and you can take whatever comes at you through that screen. But if you're a buffalo, right, facing the human race and you're going extinct, buffaloes feel like humans are Terminators. That's what it feels like when there's a Terminator species that's competing with you on the planet. And that is the right intuition. If you want to fast forward five to 20 years, it's going to feel like the robot in your house is a Terminator and it can jump and gun you down and do whatever it wants. And the question is, what will it do? What is it programmed to do? And is it programmed with the right control? And unfortunately, I think the answer is going to be no. We're again,

[02:16 - 02:42] Lauren: anthropomorphizing AI and saying that's a better representation of like the evils that it could potentially hold, right, is to like see it in the form of this human, basically, right? When really the threats that it has or the potential threats are beyond the scope of what humans can do. And so I hope that's not the first moment that people start paying attention to the potential pitfalls of it. It's like when they see it walking around toward them, right? I just

[02:42 - 02:59] Liron (overlap with Lauren): think I'm really talking about the intuition pump that when you bring up your mental image of what is a super intelligent AI, most people are like, oh, it's kind of like Microsoft Word, but it has like a few extra features. And I want you to start from the mental image of the Terminator and then make it even more scary from there. I don't think that's true. I

[02:59 - 03:06] Lauren: think that's what most people think. They think, oh, scary robots AI. They think Arnold Schwarzenegger, I'll be back. Like, that's what they think. If

[03:06 - 03:09] Liron (overlap with Lauren): that's what they already think, then the media's done a good job. I just think that's

[03:09 - 03:35] Lauren (overlap with Liron, Adib): done a really reductive job, actually, of really educating people about what the actual impact of AI could be, how it could affect their daily lives, how it could affect the global, you know, the state of things, the economies, all those things. I think, once again, there's a real detriment to people not really understanding what AI is, what the potentials are, the pros and the cons. And I think that Terminator as like a fulcrum is reductive.

[03:35 - 04:04] Adib (overlap with Liron): This idea that there's this monster in your house that walks around and looks like a human is, you know, again, it's extreme. I don't know how many people are scared of their iRobots or Roombas. But again, like just going back to the media, this is the problem. Like, nobody's going to talk about how, you know, Roomba has helped an old lady keep her house clean without the labors of having to vacuum everything. Or they're not going to talk about like the – there just aren't going to be articles about how some guy had a safe drive home in his autopilot. How do

[04:04 - 04:11] Liron (overlap with Adib): you imagine superintelligence? I'm just trying to help you imagine the raw power of superintelligence. Like, fast forward 20 years, what are you imagining? I think of it

[04:11 - 04:20] Adib (overlap with Liron): as the reduction of the cost of goods, the efficiencies in society so that most of us don't have to spend all day working.

[04:20 - 04:28] Mariam (overlap with Adib): The Industrial Revolution, we became more efficient. We had 40 -hour work days. And now we're still living in 40 -hour work days even though we have like superior technology.

[04:28 - 05:04] Adib (overlap with Mariam): But you got to go before that too. I mean, there were times where people didn't even have days off, let alone, you know, like the amount of effort that it took to sustain life. I think it's gotten better over time. I mean, the Industrial Revolution and factory work and all that has even gone down to an extent. There's a lot more, what I call it, like skilled labor and things like that where people aren't doing as very hard labor as they used to because of machinery and things like that. So, again, like things are going to change. You can't always paint this picture that the world is going to end because of robots. Moving on

[05:04 - 05:11] Mariam (overlap with Lauren, GEN, Adib): to the next one. We asked ChatGPT if it agreed or disagreed with each prompt. Here's what its response to the first prompt was.

[05:11 - 05:41] AI Voiceover (overlap with GEN, Adib): Agree? The media often sensationalizes artificial intelligence as a threat, focusing on dystopian scenarios portrayed in movies and speculative articles. While it's essential to consider potential risks associated with artificial intelligence, such as job displacement and ethical concerns, portraying artificial intelligence solely as a threat overlooks its potential benefits and opportunities for positive impact. A more balanced and nuanced portrayal of artificial intelligence in the media would be beneficial for fostering informed discussions and decision making regarding its development and deployment.

[05:42 - 05:44] GEN: I have been personally affected by AI.

[05:51 - 06:08] Liron (overlap with Adib): I'm anti -AI because I think we're about to enter this uncontrollable extinction scenario. But until we get there, I think life is going to be amazing. I am actually more bullish on how good life is going to be than almost anybody right before it all goes downhill. Here's what I think is happening

[06:08 - 06:25] Adib (overlap with Liron): to you. The reality is the Internet, people make money by driving clicks and virality. And I think the sensationalism, the extreme stories are just scaring you because the reality is AI isn't as like a monster that you think it is. I would disagree

[06:25 - 07:17] Aj (overlap with Ken, Adib): with that because when I had a job as a tech person and there was a lot of layoffs because they were promoting the progression of AI. I also am an actor and we're now seeing SAG -AFTRA signing a deal with the AI agency, which is putting a lot of voice actors and animators at risk of losing a lot of jobs. So right now we're seeing a lot of layoffs and people losing their jobs in this economy where we're already struggling to stay afloat because rent is rising and our wages aren't rising at the same rate as rent. And so in terms of the progression of AI, we're going to see a world where essentially who's going to really benefit from that because the people that already can't afford a lot of stuff right now, you. Yeah, we're only going to see the people that are CEOs that are going to benefit from these companies that are using AI to replace humans who are becoming obsolete at the end of the day. My

[07:17 - 07:43] Ken (overlap with Adib): father, for example, he lost his job as an Uber driver because of a lot of these autonomous driving out in San Francisco. But it's also helped him in some ways where it's helped his he started a new flower shop business and it's helped him with marketing. It's helped him with other things. So I think that not only is it going to affect my dad and my family, but it's going to affect a whole ton of other people in other sectors like acting and voice acting in, you know, retail and all the other industries.

[07:43 - 08:48] Lauren (overlap with GEN, Adib): So I'm a data scientist and a filmmaker. And so I just wrapped post -production on an independent, a micro -budget feature film, $60 ,000 budget, which is basically impossible to do. I don't recommend it. I edited the film. I'm the technical director on the film. I was basically able to do a lot of things to elevate the production value of the film and a lot of things that would have been impossible without the use of AI tools. So I just wanted to go back to your point. You're saying your dad lost his job, but there's also things that, you know, AI helped him do. I think it's always going to be that, right? Like I am a member of SAG -AFTRA. I'm an actor. You know, there's a lot of dangers associated with AI if it's, you know, running rampant, unregulated. There's a lot of unethical uses, but there are ethical uses. If you're not using, you know, data that's scraped or stolen, you know, I can use my own photos and images and create a, you know, 3D sequence or, you know, other things. But obviously education is important there, right? Because we need to understand things about what is ethical and what isn't an ethical use of AI. There's

[08:48 - 09:23] Aj (overlap with GEN): stuff that we can use that's going to, like, fix, like, nuisances in life that's going to help us. However, I don't think we're, like, seeing the, like, the big picture of, like, people that can be affected by it with their, like, livelihoods, their own lives. There's AI -generated images of the Eiffel Tower that was burning and millions fell for it. There's already racial tensions that are currently in the United States. We can use AI to fake videos in the future of people, like, one race seeing something to another race. And we can't even tell, like, what the difference is already right now. So I'm afraid how superintelligence is going to affect the future. I

[09:23 - 09:33] Lauren (overlap with Aj): mean, I agree 100%. But I also think AI in some iteration or form has been around for so long, right, decades. So we've been using AI in our phones and our computers without our knowledge for a long time.

[09:34 - 09:37] Aj: But we are seeing it on TikTok now because it's becoming more popular. Oh, 100%.

[09:37 - 09:54] Lauren: I guess all I'm saying is that now that people are aware of it because we are seeing these really, you know, click -worthy examples of deep fakes and things like that, I'm actually happy because it's going to encourage tech and data and AI literacy, which is really essential. I

[09:54 - 10:18] Mariam (overlap with Lauren): want to add to your point because the thing about the machine learning models that we've already been using in high applications is that it does train on a lot of historical data because, you know, just needs a ton of data to work, get these models to be as accurate as possible. And to your point, a lot of this data is racist and sexist. And so the fact that this technology is just able to proliferate at such a crazy rate that we cannot keep up with as humans is why it's such a danger. So I'm going to

[10:18 - 10:33] Aarushi: give you a use case for that, actually, and I 100 % agree with you. The Apple card, when it was first issued to people, they denied women just because historically women would be under their husband's credit. But if we don't let this technology progress, how will we get that new data so that we can eliminate those biases, right?

[10:34 - 12:11] Adib (overlap with Ken, GEN, Aj, AI Voiceover): AI is the democratization of everything. It reminds me of when music, like now that you can make music in your bedroom and distribute music yourself, like you don't have to all go through a studio that gatekeeps, you know, who gets distributed and who gets technology and studio time. The reality of what AI has done is it's kind of enabled you, if you wanted to start a company tomorrow, let's say you wanted to build an app of some sort, you would need investment funding. The barrier to entry in some of these to start a company is so high because you need money, you need technical co -founders, you need to give up most of your company to other people to make moves along a certain business track, right? But because AI exists, you can build more independent films. It brings the cost of barrier entry so low that in the future, I think what's going to happen is instead of a lot of people working for a few big companies, there's going to be a lot of companies that hire a few people. And so it's just going to flip around where most of us can start a business. Like I can start a business today and because of AI, the barrier to do some marketing, to do some of the coding that I might need, to pull in a lot of the legal advice I might need, it's changed how all of us can start our own business. I mean, I don't know about you, but, you know, working for someone forever, I don't think it's everyone's goal either. I think some people want to go out and start a business or do something, they just have a high barrier to entry for a lot of industries. Hey guys, I'm John. After four years, the Radical Empathy Podcast is now back. There's a new episode out now, so go give it a listen.

[12:12 - 12:16] GEN (overlap with AI Voiceover): It's impossible to stop the growth of AI. Greers, please step forward.

[12:24 - 13:06] Aarushi: Have any of you ever played Checkers before? Yeah. Awesome. Okay, cool. So Checkers has actually been there since the 50s, at least the automated version. Since then, AI has been growing a lot. I'm personally like a machine learning engineer, so I've worked with AI very closely in terms of the models, the data, etc. One thing that I noticed is AI as generative AI has picked up a lot. But if we see this in terms of the bigger picture, if you use text -to -speech, speech -to -text, these are all AI models. And we're using them in our daily lives. And personally, I think it's a net positive there. So I don't see why it wouldn't continue growing.

[13:06 - 13:20] Ken (overlap with GEN): With things like video AI, text -to -video, like Sora coming out, and also robocalls coming out. All these things are really shaping our technology, shaping democracy, and I think that it'll never

[13:20 - 13:29] Lauren (overlap with GEN, Mariam): stop growing. Capitalism still rules most things, and so as long as there's money to be made in this industry, then people will continue to develop and advance AI. Last

[13:29 - 13:46] Mariam (overlap with Lauren): year, when ChatGPT was coming out, and there was all this buzz, like a ton of researchers were trying to sign a letter to pause it. But because of this, again, this arms race in terms of AI and terms of the capitalistic market, as you mentioned, that in practicality, it's not going to be paused.

[13:49 - 14:11] Liron: So I agree with you guys that AI is hard to stop, and there's a profit motive that'll tend to make it non -stop. But if it's as dangerous as something like nuclear weapons, then we better try to stop it, and we have some chance of success, like we did with nuclear proliferation. It's only spread to about nine countries, so there's some hope that we can stop it. We need to stop it. Yeah, I have a question

[14:11 - 14:22] Lauren: about stopping it. Like, if I can download Stable Diffusion and run it locally on my own machine, I guess my question is, you know, from your vantage point, then, how do we stop me from doing that?

[14:22 - 14:41] Liron: It's an excellent question. I do think that there's a point of no return when everybody's laptop can run an Einstein brain or even smarter than that. And I think that we have a couple short years where we better do a very serious ban, or else we're going to be faced with uncontrollable, runaway, super intelligent AI. And I think the urgency of this issue is underappreciated. So AI

[14:41 - 15:07] Adib (overlap with Liron): is going to save more lives than it hurts. I mean, it's going to apply in medicine. It's going to apply in the ability for people to start businesses more effectively. Education. Education. There's so many other places. But, like, this assumption that I think AI is extremely dangerous, I think, comes from just the media sensationalizing it. I think the reality is it's going to help way more people than it could hurt. You had a visual

[15:07 - 16:19] Mariam: reaction. Like, I strongly disagree with that because the way a lot, like, autonomous driving, for example, that research was funded by DARPA, the defense, like, research agency, so that we can use it for autonomous vehicles or tanks in war. And there's so much money being poured, and none of the countries in the world have agreed on stopping or, like, developing this technology. We're already seeing a war being fought with AI. By 2021, the assault on Gaza was the first time that it was an AI war. And currently, the ongoing situation in Gaza, where there is thousands of airstrikes going on since four months in, that is all used by AI. The Israeli Defense Army does talk about their use of AI in it, and they say that it will lower the casualties in civilians. But now we're nearing 30 ,000 dead civilians, 12 ,000 at least are children. So the fact that AI is being so used in military and there's so much money being thrown at it, my own research advisor gets a ton of money from DARPA to fund these type of things. So that's why I strongly disagree because AI is a great tool, but then the fact that it's being applied to military and defense is disastrous, especially to marginalize and occupy terror -like people.

[16:19 - 16:43] Adib: But the reality is even since the 50s, AI has been used. There are all kinds of missiles that use AI and weaponry that use AI. And I think regulating and avoiding war is an important strategy that we need to figure out as a civilized kind of society. But stopping the technology isn't what stops the wars. Humans will always find ways to kill each other. We have to evolve. I'm

[16:43 - 16:51] GEN (overlap with Adib): curious because hasn't a lot of technology even came from the military, though? For example, the computer, for example, even the Internet. How would you respond to that?

[16:52 - 17:23] Mariam: No, of course. I mean, all of our greatest inventions are from the defense. But the thing is, when we talk about the net positive or evil, the thing about any big industrial revolution, such as this AI revolution, is that it will proliferate the current technology and intentions. And so given the current intentions, and we can witness it in our current world right now, how things are going, it's only going to expedite that schedule. And so that's why, yes, it's a great tool, but again, the people using it aren't good actors.

[17:23 - 17:43] Aj: I do just fear for the collateral damage in terms of the people in Gaza. So, like, say all those lives that are lost, and then we just say it was all at the expense of, like, promoting AI at the end of the day, I don't see lives being worth being lost just for progression of AI that can just be useful for us, like, in everyday life.

[17:44 - 18:09] AI Voiceover: It's a moment to next problem. Agree, it's highly unlikely that we can completely halt the advancement of AI. The trajectory of technological progress suggests that AI will continue to evolve and improve over time. Efforts to regulate or control AI development may influence its direction and pace. But complete cessation of AI progress seems improbable given the global interest and investment in AI research and development.

[18:09 - 18:21] GEN (overlap with AI Voiceover): In the world of AI, there will be no more truth. AI will lead to more disparity and wealth.

[18:26 - 18:53] Adib: I make music, personally, and without AI and, like, online platforms, I would not be able to do that. And so even other professions, AI can lead to so much new openings, which can lead to wealth, as well you can ask for help all the time. I'm a student in college, and I feel like AI really has helped me stay on top of my classes, which subsequently will obviously lead to, like, an education degree and just wealth in general. Do you all

[18:53 - 18:59] Ken (overlap with Liron): think that some sort of universal basic income should be used to alleviate some of those disparities in AI? Yeah,

[18:59 - 19:31] Lauren (overlap with Liron): I think part of the reason why I'm pro -adoption of AI is because I think it will bring issues like UBI more to the forefront, right? Like, who was it that was touting that in the 26th? Andrew, yeah. He was talking about that. It seemed like some people were into this. I think that's grown in popularity, the idea of UBI. And I think what he was talking about in 2016 about automation and how that's eliminating the need for many, many different jobs that we need to stop saying, we'll just employ people constantly and say, okay, if machines are doing half the work that needs to be done on the planet, then we need to just pay people.

[19:31 - 19:53] Mariam (overlap with Lauren): It's like, I think, yeah, it'd be really great if we get universal basic income. Like, I would want that. But given our capitalistic market, I'm very pessimistic, especially when we have corporations lobbying politicians. What happens when AI exceeds human capacity? And then it'd be just for the company profits and interest to lay off most workers. And there's no government regulation for all these tech layoffs going on.

[19:54 - 20:32] Adib: I just want to say it's over for the gatekeepers. And they're going to try to regulate their way into stopping all of this. But the reality is there's a lot of discrepancy in wealth today. And for a long time, it's been that way because the gatekeepers get to choose who gets invested in, what communities, what race, what groups get investment. And right now, what's slowly happening is almost anybody can, with very little investment, do certain things like start businesses and create new opportunities for wealth that were never possible before. But

[20:32 - 20:54] Mariam (overlap with Adib): I would counter, like, you know, your idea is like, you know, it's just, you know, wealth disparity has been a thing. But the first, like, huge gap in wealth disparity in our country is the Industrial Revolution, for example, where now we're getting a huge peak in efficiency. So then what's going to stop this trend for, like, continuing to grow with the AI revolution? And to counter your point, these companies are actually doubling down in profits. The billionaires are doubling down in their, like, net incomes.

[20:55 - 21:29] Adib: The reality at the end of the day is we're in this weird spot between the old way of doing things and the new way, where there's more independent kind of smaller businesses that we're all able to start versus the old world where we all grew up expecting to work for the big factory or the one big, you know, the five big software companies or whatever they are. There's a shift in society and how we're going to sustain ourselves. And it's hard and it's weird. Even universal basic income. A lot of people, I'm sure, in the comments are going to be like, that's weird because it's weird for today's situation. But

[21:29 - 21:40] Aj: how are we going to sustain ourselves without the money? Like, you guys keep saying that they're going to open up new jobs. What are those jobs going to be? I feel like we can easily just say, oh, there's going to be new jobs. Oh, AI is going to bring a lot of, like, opportunities. But what are

[21:40 - 22:07] Adib (overlap with Aj): those opportunities? I mean, you think 100 years ago, somebody who, like, worked with horses could imagine the kind of jobs that we have today 100 years later? Like, the problem is we are in it. And it's going to be hard for us to imagine 50, 100 years down the road the thing that people at that time are doing. Who, like, I mean, a social influencer 30 years ago would have been a silly thing to say that somebody does. That's true, but that

[22:07 - 22:17] Aj: was developed by humans, though. So we're using, like, phones in order to become social influencers. With more AI, humans are becoming more lazy. And so how are you going to, like,

[22:17 - 22:32] Adib (overlap with Aj, Liron): put all your... That's subjective. I think humans are going to have more time to be more creative and do other things. The shift to media and content creation and artistry and creativity has already been, I mean, massive over the past few years. The

[22:32 - 22:48] Liron (overlap with Adib): rules have changed. What do you think is the mental skill that human brains can do that AI will never do, even if you extrapolate forward 10 or 20 or 50 years? What do you think is our sustainable advantage? We have emotions. We care. So you think emotions are going to give us an economic role and AI will never have emotions, and because of that, humans can get paid.

[22:49 - 22:58] Aarushi (overlap with Liron): Well, another thing is I feel like AI is as good as what's already out there. So people are always going to be more creative. I think that's just kind of like a counter to that. But what if AI just

[22:58 - 23:04] Liron: gets better? Like, it's obviously getting more creative over time, right? The latest AI is more creative than it was a few years ago. By

[23:04 - 23:16] Aarushi (overlap with Liron): any chance, have you followed, you know, Sam Altman, By Any Chance, OpenAI? Have you followed the reason why he was actually, like, ousted from the board for a chart line and then came back in? Yeah. It was actually, to your point, because of the idea of AGI. Should you

[23:16 - 23:18] Ken (overlap with Liron): explain what AGI is? I'm not familiar with that. Artificial

[23:18 - 23:22] Aarushi: general intelligence, I want to say, if I remember correctly. It's

[23:22 - 24:07] Liron: basically AI that can perform any human skill. So any skill of, let's say, the average human or even the best human, if it can go head-to-head against the AI and a single AI can win in every contest, at that point, we fully achieved artificial general intelligence. And a lot of experts are estimating 5 to 20 years until that AGI point. And after AGI, we get to ASI, which is artificial superintelligence, which is just, it's hard to even fathom, but you can imagine taking the smartest people who ever lived, Einstein and whatnot, and speeding them up by a factor of 1,000 running a simulation. And even then, that doesn't get you to what it can do, because it can be even smarter than their brain was, because their brain was only 12 watts of power. It's the size smaller than a basketball, right? I mean, there are smarter intelligences, which is what I wanted to ask you guys. I mean, do you guys not think

[24:07 - 24:18] Adib: that the scale goes beyond human? You're explaining a society where potentially there's cures for lots of things. You're explaining a society where we're able to do all kinds of energy development, where we don't necessarily

[24:18 - 24:33] Liron (overlap with AI Voiceover): have a human environment. Yeah, that could happen as long as the AI is value-aligned. But the problem is, the smallest disalignment between what the AI wants and what humanity wants blows up into a scenario that you cannot undo. So if you don't get the initial conditions right, there's no undo. So that's the next point.

[24:34 - 24:43] AI Voiceover: I'd lean towards agreeing that AI could contribute to wealth disparity. However, it's essential to consider that the impact of AI on wealth distribution will likely vary depending on how it's deployed and managed.

[24:44 - 24:47] GEN: Super intelligent AI will feel love for humanity.

[24:56 - 25:34] Adib: All right, so love in the sense of humans, I think, is tricky. You're talking about, like, an emotion. But even for humans, these chemicals that kind of go through our brain that trigger these emotions and feelings serve a purpose. And so it's part of our programming, quote-unquote. So this idea that somehow love or your emotions or your feelings are not programmed or not part of a systematic design is silly. So I think, like, yes, like robots are in their design by humans, typically are going to care for humans. I think

[25:34 - 25:47] Aarushi: our emotions as well are present in that data. And naturally, that will also get fed into the model. And because of that, I think, like, okay, that machine will output what we can define as love, quote-unquote.

[25:50 - 26:11] Adib: Yeah, so I'd like to say I would replace the word love with a partnership, like you were mentioning earlier. I think that's definitely true, where it's more like a mutual, we work together. But love, like I was mentioning to you earlier, AI does not have emotion. It does not care about anything. It's just there to follow the algorithm and the rules. It has no emotion. It doesn't care about the future or anything. It's just a partnership with humanity.

[26:11 - 26:15] Liron: I mean, what if it simulated a human brain neuron for a neuron? Then wouldn't it feel human emotion?

[26:17 - 26:34] Adib: Well, let me ask you this. Like, let's, you know, your brain releases serotonin and endorphins and stuff like that, right? So you do something, and it releases these chemicals that makes you feel good. Don't you see that as, like, some kind of, like, system design that is encouraging you to have certain behaviors?

[26:34 - 26:38] Adib: Yeah, to some extent, I see what you're saying. But AI

[26:38 - 26:54] Adib (overlap with Adib): cannot... Because you're saying emotion, like, it's some kind of, like, thing that doesn't, you know, that doesn't have a meaning, like, a reason for it existing, right? Even within humans, we have emotions for a reason. Like, there is just certain responses that we have chemically in our brains. We might not understand it.

[26:54 - 26:58] Adib: That's what you're saying, chemically. Do you think it can be, like, a more spiritual, in a sense? I

[26:58 - 27:25] Adib (overlap with GEN): mean, you could take it there. Like, I mean, you know, again, we're allowed to interpret these feelings that we have in different ways. But at the end of the day, even spiritual feelings, all those things help towards bringing us peace or bringing us survival. Things have a reason, right? So, like, the question really is, you know, yeah, like, robots aren't emotional, but they're going to, you know, they're going to respond to a design that's in them.

[27:25 - 27:30] Adib (overlap with GEN, Liron): But does it get anything out of it? That's my question. Does it get anything out of it? Does it make a connection like we do in our

[27:30 - 28:13] Liron: heads? So, it is possible to architect an AI to have it work the way humans do, to have it feel emotions the way humans do. And what I said before is if you just go neuron by neuron, scan the human brain, clone it inside of a computer, then I do think that you will get very human-like AI. Now, the problem is that what we're actually building is what we talked about before. We're doing black box reinforcement where we just ask the AI to say something, and then we basically upvote it or downvote it. And then it repeats the training cycle, and we just see what comes out. And the problem is that we're going to get a super-intelligent AI that basically fooled us, that basically gamed the process, acted friendly to us, and the moment it realizes it's now smart enough and powerful enough, it's like, great. What do I want to do now that these humans are out of the way, now that I've successfully passed their tests and fooled them? That's what's going to happen. It's like the Turing test, is that

[28:13 - 28:14] GEN: what it's called?

[28:14 - 29:03] Liron: So, the Turing test is the ability to simulate a human conversation, and that's largely been passed, which is a huge milestone that's been passed in the last couple years. But this is a separate test. It's often called RLHF, reinforcement learning with human feedback. This is what the AI labs are doing right now in order to make the AIs act friendly, in order for the AIs to say things like, hey, I would never tell you anything that would harm somebody. Because they've been through this RLHF feedback process where humans actually give them downvotes, being like, oh, you weren't supposed to say that. And they actually understand the downvotes, and they figure out what it takes to make the human evaluators give them a good score. But the problem is that that whole process, it works fine for now while the AI is kind of dumb. But it doesn't work when the AI is a genius, because when AI is a genius, it can just game our tests the same way that you would game a test if it was given to you by a five-year-old. You could probably trick the five-year-old giving you the test. That's what the AI is going to do to us. What would a

[29:03 - 29:12] Aarushi: super human or super computer or super AI look like to you, out of curiosity? What could it do that we couldn't? Do you have some concrete examples? Yeah, I

[29:12 - 30:03] Liron (overlap with Adib): mean, the best example is just what the smartest people in the year 2024 can do. That people in the year 1000 just think is complete magic. It's a complete miracle. I mean, if you watch a SpaceX Starship rocket taking off, it's a skyscraper taking off. That's arguably more impressive than a lot of the miracles in the Bible, right? And yet we, as a modern human race, are able to pull that stuff off. If you extrapolate the pattern and you say, what does the civilization of the year 3000 do? That'll get your creative juices flowing. And everything we've said in this conversation, when we talk about, oh my God, the economy, jobs, is it going to create income inequality? You got to think bigger than that. You got to think about events that are discontinuous, like the first nuclear bomb, the first time that a single button could kill 100 million people. Or an extinction event where 96% of life on Earth just suddenly died because of a super volcano. Events like that happen, and I think AI is one such event. But humans

[30:03 - 30:25] Adib (overlap with Liron): will evolve too. Again, you're looking at like, okay, AI is going to evolve as if humans are just going to stay still. You're familiar with Neuralink. I'll just give you that one example. There's a potential chance that the humans of the future will have been able to converge with the AI in some way or develop a significant kind of symbiotic relationship with AI that you can't even fathom

[30:25 - 30:29] Liron (overlap with Adib): right now. There's a chance, but that's not where we're headed. That's not where we're headed, though. And we only have 10 years. They

[30:29 - 30:32] Aarushi (overlap with Liron): already started doing the first testing phase. But

[30:32 - 30:45] Liron (overlap with Aarushi): if you ask the researchers, the researchers themselves will tell you, ask any researchers. They'll all tell you that the AI that we don't know how to control is about to come online, and all the other AIs that have more hope are way behind. That's what's happening right now. How would

[30:45 - 30:47] Aarushi: that happen? I'm kind of curious. I would say,

[30:47 - 31:40] Mariam (overlap with Aarushi): but to add to your point, there's Geoffrey Hinton. He's called the godfather of AI, for example. And he left Google to warn about the effects of AI, and he compares it to the atomic bomb in that way. And also with your example of superhumans fusing with AI. Like, personally, I would never be in that, like, you know, guinea pig, like, experiment. Yeah, because you look today in 2024. You don't give it 3,000. I know, but going back, the way that current AI research is looking is that they're just working with these neural networks. That's the model. And that's the black boxes that we just throw upon a ton of data. We optimize it using some loss function or reward function to then get the best results. But the thing is, we lack explainability, which is why AI is not trustworthy. And so the fact that we are going to deploy it into such a crazy application, like, we're not thinking of the repercussions. And to his point, like, this is an alarming issue. Can you explain, like, why humans

[31:40 - 31:49] Adib (overlap with Mariam): are evil? Like, do you, there are things about humans that are unexplainable as well. Like, there are, I mean, what is evil? Like, why do some humans do terrible things? Why?

[31:49 - 32:02] Mariam: That's a good philosophical question. My deduction is that people act evil out of fear when they believe they are threatened. And I think that with the way the world is going with just, like, the increase in population, mass poverty. There's weirdos

[32:02 - 32:08] Adib (overlap with Mariam): out there killing little kids. What about psychopaths? And they have money and they have a house. And there's just terrible evil out

[32:08 - 32:11] Mariam: there. Oh, yeah. There's psychopaths for sure. But that's a mental disorder. But you can't unpack it.

[32:11 - 32:20] Adib (overlap with Mariam): So, like, just because you don't understand why certain things do certain, like, what do you do about humans in this case? Because we're everywhere. Millions of us. Billions.

[32:20 - 32:29] Mariam (overlap with Liron, Adib): The whole issue is that it's a black box and we don't fully understand how the model makes the decisions that it does. So, we have no idea if, like, there's going to be something really crazy. To you.

[32:29 - 32:31] Liron: Right, but you're not infinitely powerful. But you can explain

[32:31 - 32:37] Mariam (overlap with Aarushi): why you make the decisions that you do, which is what a model lacks. And that's why we don't have trustworthy AI. And that's why we should be more wary of the application. You

[32:37 - 32:43] Aarushi: can print out the layers of each model so you know the output within each model. Is that what you're saying? Like, is that the black box? Like, I'm kind of confused.

[32:44 - 32:58] Mariam (overlap with Ken, Adib): No, it's known. All researchers unanimously agree neural networks are black boxes. But we don't fully understand the AI models. That's why there's so much money being put into transparency and explainability. And I know this because, like, again, a lot of the research labs get money from the government to do this kind of problem.

[32:59 - 33:09] Ken (overlap with Mariam, Adib): One fear that I have is what if AI causes people to become more mentally ill, become psychopaths, become more evil in itself? What if, you know, we're not causing AI to become evil? AI is the other way

[33:09 - 33:11] Adib (overlap with Liron): around. I mean, I think TikTok's already done it.

[33:12 - 34:24] Lauren (overlap with Liron): I want to go back to the original, like, question about whether or not superhuman intelligence will have human emotions. And I think it's kind of a weird impulse, honestly, to try to anthropomorphize AI all the time. Like, we're always talking about it, like, is it close to being human-like? Is it, like, basically like us yet? Does it feel like we do? I think it's weird. I want AI to be something that is outside of human existence. I want it to be a tool. I want us to learn things from it. I want to have the academic impulse to investigate it, to try to figure out what is beneath the surface, what is inside the black box. I don't know. There's something weird to me about, like, this obsession with applying human characteristics all the time to something that I'm like, this is outside of ourselves. It's always going to, you throw compute power at it. It's going to be way more powerful than any of us could individually be, right, once we have 8,000 GPUs, you know, pushing it forward or whatever. I don't know. It's just, like, a weird thing. I don't care. I don't want AI to feel things for me. I don't want it to care about me. I don't. It's outside of me. It's not a human. So it should function differently. I want it to be more efficient than I am. I want it to do things better than me. I want to use it to, you know, have a better life. I don't need it to be a human. I don't know.

[34:24 - 34:57] Mariam (overlap with GEN): Yeah, and I agree with your point of it being used a tool. And that's why, like, I think I would fall pro AI if that was the case. But the fact is, it's not only just bad actors. It's also, like, the big tech company actors. Like, your story, your painting of this, like, net positive of AI, it's very reminiscent of how social media was presented to us. It was talked about, like, this is how we connect humans. And then we are seeing studies of, like, skyrocketing anxiety and depression, especially with young children. And so that's why I feel like we need to be more cautious with this technology because it's going to be even more of a huge impact to our society. Let's go on to the next one.

[34:57 - 35:22] AI Voiceover (overlap with GEN): Disagree. It's highly unlikely that super intelligent AI would inherently feel emotions such as love for humanity. Emotions like love are complex human experiences rooted in biological and social contexts. While AI may be capable of understanding and mimicking human emotions to some extent, genuine emotional experiences like love would likely be beyond its capabilities, especially in the absence of human-like consciousness or subjective experiences.

[35:23 - 35:24] GEN: I could change my mind about AI.

[35:34 - 35:58] Aarushi: Personally, I do see, like, a lot of benefits in AI. But, you know, in case your doomsday situation turns out true, I don't know what my opinion will be. So, I don't necessarily have, like, a very strong rationale. It's just more, like, I like to keep an open mind, you know, no matter what. So, I'm always, like, willing to listen to, like, opposing viewpoints and maybe change my opinion from there. Right. I think you guys

[35:58 - 36:18] Aj (overlap with Aarushi): bring a lot of points with how it could benefit society. I would probably change my mind in terms of, like, regulating where AI is used. And if we regulate where it's not used in terms of, like, laws that start from the top, I don't think it should be everywhere to the general public. I do think that would lean into a more scary place. But I do agree that it could bring pros to society for

[36:18 - 36:55] Mariam (overlap with Aarushi): sure. Yeah, and as also from the anti-AI side, I would also switch over to pro-AI if, yes, there's regulations in the applications, such as there's an international agreement on the restrictions of AI being used in warfare. I also believe I'd be pro-AI if we had transparent, explainable AI, so then we can have trustworthy AI and be able to, like, foresee any, like, disastrous consequences. And, of course, regulation in the sense of, like, if tech companies or any company does mass layoffs to replace workers with AI, like, tools, then I think there should be a tax on that company to fund, like, universal basic income. If all of those requirements are met, then I'll switch to pro-AI.

[36:55 - 37:21] Liron (overlap with Ken): What I need to see to change my mind is an AI that starts to actually be fully human-level intelligent or superhuman intelligent, that when you ask it to do something like, hey, how do I go murder someone? It doesn't just happily tell you the answer, but it actually shows that it's, like, much more under control, and it shows no evidence of helping you scheme to murder or just doing whatever you asked. Basically, being superhuman is a threshold where I think all hell breaks loose, and if we somehow get to that threshold and all hell doesn't break loose, then I guess I'm wrong.

[37:22 - 37:52] Ken: I'd be open and willing to change to the pro-AI side if we, again, had some sort of conference or some sort of stop on AI for, like, six months to a year where we can all discuss and place stop gaps into AI so we can figure out, you know, hey, what should we do? What should we not do? And also, we should have some sort of policy that would mitigate the effects of, you know, unemployment, which I think all of us agree that there might be some type of unemployment, unemployment, and maybe shift of jobs, so we should have something to, you know, be prepared for that.

[37:55 - 38:09] Adib: You think China's going to stop? It won't. You think Russia's going to stop? It won't. The reality is you can't be anti-AI. You can't. Like, I mean, I know some of you are, but the reality is it's here. It's happening. But we're talking

[38:09 - 38:19] Aj: about the ethical use of AI, so obviously those places are not going to stop, but we're talking about the jobs that will be lost in the sense where it shouldn't be used in the general public, and we should regulate where it is.

[38:19 - 38:39] Adib (overlap with Ken, Aj): Yeah, we should build universal basic income. We should make the tools that we are building accessible. Like, I totally agree, but everything you're saying is indirectly pro-AI, right? You're basically saying let's invest in AI and make it more available. Let's invest in AI. Let's do the work so that AI is useful, right? Stop it.

[38:39 - 38:43] Ken: Stopping the development of AI for just six months really make us lose to China, make us lose to Russia.

[38:44 - 38:48] GEN: I'm curious because let's say that his doomsday outlook happens, would you still not change your

[38:48 - 38:59] Adib (overlap with GEN): mind? No, because I think AI will be used to stop the other. Like, there's going to be—you have to realize the reality here is there's going to be two sides in any situation. It's not one. There's a

[38:59 - 39:05] Liron (overlap with Adib): reason why the world is currently full of thousands of nukes and not a single effective nuclear weapon defense system.

[39:07 - 39:14] Aarushi: But it's also full of wind turbines powered by, like, nuclear energy as well, right? Am I wrong? Like, we can also use it for that, too. Yeah, we're

[39:14 - 39:17] Adib: using nuclear energy for a lot of good

[39:17 - 39:22] Liron (overlap with Adib): things. We're using nuclear energy, but the nuke trumps everything, right? If somebody uses a nuke, there's no defense. The reason why there's no

[39:22 - 39:27] Adib: nuclear war is because it's mutual destruction. So we're not going to use it. They're not going to use it. Yeah, but

[39:27 - 39:38] Liron: that's not a robust reason, right? I mean, there's been close calls all over. It's barely working, and now you're introducing a harder problem. You're saying, here's the thing that's barely working. Let's do another version of the nuclear problem, except this time there's profit to everybody who works on it. I'm like, ooh.

[39:39 - 39:50] Ken (overlap with Liron): How do we get AI to a point where we can all agree that it's going to cause the same amount of nuclear destruction as an atomic bomb? Because AI could, you know, advance to a state where it could hack into everyone's computer. I mean, I agree

[39:50 - 40:13] Lauren (overlap with Mariam, Liron): with that already. I'm just saying nothing is going to make me change my mind about AI. I think also because of the way that I conceptualize it, which I want all the things, actually, that you've described are your conditions, right, for changing your mind about it. I think all those things are necessary. Just the way that I personally conceptualize and think about AI, I don't think any of those things can happen if we're not on the offensive. But to,

[40:14 - 40:40] Mariam: you know, to my point, too, like the reason why I still stand on the anti-AI is because, like, I'm seeing, like, a lot. I, like, you know, my research is in this area. I see what other people are doing. I see the other, you know, PhD students who are working alongside me, and I don't think they're that concerned with all the things I do bring up, like, to the proportion of the people who are more concerned with the risks of AI. And that's why I still, I think the trajectory we're on, it's going to fall in the net negative. Yeah,

[40:40 - 41:10] Liron (overlap with Adib): and listen to what the AI labs are saying. This is the craziest thing to me. Go to OpenAI's website, one of the leading AI labs. They will explicitly say, hey, we're working on AI as fast as we can. Within 10 years, there will be an AI that's as powerful as a human CEO. We don't know how to make that align yet. And so we have a project that's working on how to align it. Here's a prediction market for whether people think the project is going to succeed. And it's saying low chance of success. This is the actual state of the art from the AI labs are saying, hey, we're really excited about working on this. And by the way, we don't know how to make it friendly to humans, but it's really exciting. Closing statements

[41:10 - 41:11] GEN (overlap with Adib, Liron): from each side. The benefits,

[41:11 - 41:17] Adib: I think, still greatly ultimately outweigh the negatives. And I hope you guys see

[41:17 - 41:36] Aj: that. I'm afraid of the place that we're going to head to in terms of elections. We will never know if this election was accurate, if this election was not accurate. It's just going to be hearsay. People are going to be like there's going to be more racial tensions at the end of the day. I'm just afraid of where humanity is going to go when we progress AI. One more

[41:36 - 42:00] Mariam: closing statement. I would liken this whole conversation about AI to climate change. Currently, a lot of us don't really experience climate change in the worst fatality. But island nations that are suffering from sea level rise are being threatened. And I do liken to AI, like, yes, in our first world country, we might not feel the repercussions. But oppressed, occupied people are feeling the repercussions of AI.

[42:01 - 42:07] GEN (overlap with Lauren, AI Voiceover): Thanks for watching this debate of pro-AI versus anti-AI. If you guys want to shake hands, embrace, now's the time to do something.

[42:10 - 42:22] Aarushi (overlap with Ken, GEN, Adib, Aj, Liron, Adib, AI Voiceover): That was awesome. You guys brought up really good points. Oh, you're amazing. Thank you. You brought up really good points. Great job. You

[42:22 - 42:24] Adib (overlap with Liron): and me got to go to the movies together. We got to

[42:24 - 42:26] Liron (overlap with Adib): do one-on-one podcast. We got to do one-on-one podcast. We

[42:26 - 42:28] Adib (overlap with Liron): got to do one-on-one podcast. We got to do one-on-one podcast. We got to do one-on-one podcast.