Lingo Bingo at the India AI Summit w/ Karen Hao, Joan Kinyua, Chenai Chair, and Rafael Grohmann
Show Notes
The AI Impact Summit in India is just a couple of days away and we are ready to drown in vague terms that kinda describe AI, and definitely obscure power. Let’s talk about how to reframe those terms…
More like this: The Vaporstate: All Hail Scale at the AI India Summit
We’ve partnered with the AI Now Institute and Aapti Institute to conduct twelve interviews based around the biggest and baddest terms we feel have been co-opted by global summits such as this one. This week we have Karen Hao discussing what it means to be ‘data rich’; Rafael Grohmann on the word ‘sovereignty’ and how it has a hundred definitions; Joan Kinyua on ‘human capital’, a key part of any AI development supply chain; and Chenai Chair, who will discuss ‘linguistic diversity’ — what it is, and what it isn’t.
These are just the best parts of the interviews — if you want to go deep and see each of these interviews in full, head to our YouTube channel now.
Further reading & resources:
- More about Rafael Grohmann — Assistant Professor of Media Studies with focus on Critical Platform and Data Studies at the University of Toronto
- More about Karen Hao — investigative journalist and author of Empire of AI
- More about Chenai Chair — director of the Masakhane African Languages Hub
- More about Joan Kinyua — president of the Data Labellers Association
- More on the Due Diligence Act
- More about the amendment to the Business Laws Act 2024
- What does the notion of “sovereignty” mean when referring to the digital? — Stephane Couture and Sophie Toupin
- Buy The Oracle for Transfeminist Technologies by Sasha Costanza-Chock, Joana Varon, and Clara Juliano
- Watch this week’s interviews in full on YouTube (link to playlist of interviews)
**Subscribe to our newsletter to get more stuff than just a podcast — we run events and do other work that you will definitely be interested in!**
Post Production by Sarah Myles | Pre Production by Georgia Iacovou
Transcript
Alix: Hey there. Welcome to Computer Says Maybe. This is your host, Alix Dunn. And we are about to start another little series. This is a three-parter that's in partnership with AI Now and the Aapti Institute. If you will recall, last week we wrapped up the Vaporstate, which was a mini series about all the ways governments around the world are rolling out gigantic technical projects, to lots of different effects in different places, and how all of that is important context to carry into a big AI summit that is coming up in India.
So just quickly before I dive into what this episode is gonna bring to you, I don't often like to get feedback from my wife on the podcast because oftentimes I can't do anything about her suggestions, and they're usually really good. But this one I have an opportunity to make a difference because we're talking about this summit over and over again, but she pointed out that not everyone knows what it is.
So, in Delhi, in India, there is going to be, I think, 40,000 people descending on the city. Tons of tech leaders, tons of heads of state, tons of civil society all over the world, coming to have a set of conversations, and then a set of side conversations, and then a set of closed door conversations. And it is following on a kind of annual tradition that started in 2023 when the UK government hosted the AI Safety Summit. And then rather than be outdone, South Korea and then France decided to host their summits. And the French summit was the Action Summit, which I guess was supposed to be taking a further step from safety into action.
And then this one is gonna be the Impact Summit. It's being hosted in India. Not particularly a country well known for democratic political environments, let's say. And what happens in these summits, as is very common, and you can also tell in the way that [00:02:00] they get kind of cheekily titled these things, is that there is a use of language that oftentimes, um, they use real words, but because it's this very high-level set of conversations, it's oftentimes very superficial. And so those words can be co-opted and used in ways that are sometimes like the opposite of how they're intended to be used. So rather than just kind of sit alongside this summit and complain about the misuse of language, with Amba Kak and Astha Kapoor, who we recorded an episode with last week, we figured we would take back 12 of these good words. Um, and instead of trying to define them ourselves, um, we sought out 12 experts who can speak to each of them and can sort of give them life before the summit begins. Uh, so you can find the full interviews with each of the guests that we will have on in the show, this episode and the following two episodes, on our YouTube channel, which, if you haven't noticed, has been popping [00:03:00] off.
So highly recommend you check that out if you haven't, um, and we'll link to it in the show notes. But if audio is more your speed, which I would understand, or if you are traveling to the summit and trying to steel yourself for the linguistic violence that will happen in these spaces, this is what you should be listening to.
So welcome. We took all of those 12 interviews, um, which were all super interesting and in full form on YouTube, and took the best bits of them across these three episodes. So this is the first, and I'm gonna introduce the guests as we move through the episode, just to give you a sense. We are talking with Karen Hao about what we mean when we say 'data rich', Chenai Chair about linguistic diversity,
Joan Kinyua about human capital. And finally we're gonna be talking about sovereignty with Rafael Grohmann, who is an assistant professor of media studies at the University of Toronto and focuses a lot on platform and data studies. He's gonna kick us off by telling us about how sovereignty made its way into [00:04:00] Canadian tech discourse as early as the 1970s.
Rafael: There is a kind of playbook for policymakers around the world, and they are preferring to read and to follow this playbook, looking at their respective contexts, before designing policies. So this is one of the reasons sovereignty became a key word in these specific places. And if you think about sovereignty, this is not a new word. I was discovering that, like, Canada in the seventies was talking about tech sovereignty at that time. Because the seventies was a time where, in countries like in Latin America, there was the Marxist theory of dependency, or a lot of theories of dependency, and sovereignty is the [00:05:00] opposite, or one of the opposites, of dependency. Being a dependent country in relation to others in the global north in terms of technology means that you are not sovereign in that area. And it's so interesting because nowadays, even in the academic sense, we are living in a kind of academic Stranger Things nostalgia, like Stranger Things in the eighties, reviving or recovering these old words like imperialism, colonialism, and sovereignty, because we didn't address those issues properly in the past. So the monsters and the zombies are back, because we're seeing, uh, the imperialism and the colonialism and all these issues in their worst form. So sovereignty also came back with this renewed aspect. So in the beginning, sovereignty, in these theoretical terms, is something good that refers [00:06:00] to both, like, territories, like states, or even in the popular forms, like here in Canada it's so common to talk about indigenous sovereignty, or in Latin America talking about popular sovereignty, or sovereignty of the people, or how people have to take control and autonomy over their lives, over their technologies, and so on and so on. But what happened was, someone in the policy world decided that sovereignty was a word to start conversations.
So probably, like, international labor organizations, the United Nations, they started to talk about tech sovereignty and digital sovereignty seven or eight years ago. One of the good, strong, foundational papers on digital sovereignty is from 2019, from two Canadian colleagues of mine [00:07:00] called Stéphane Couture and Sophie Toupin, and at that time they already said that sovereignty is not only one thing. Even in the political sense, what the indigenous communities and peoples are talking about when they say sovereignty is not the same thing the state, or like nation states, are saying about sovereignty. They have different interests, they have different meanings, and what we mean by digital sovereignty, tech sovereignty, can mean different things. So before big tech entered this debate, this notion of digital sovereignty, the meanings, were already under struggle and in dispute among different actors. Oh, this is my preferred notion of sovereignty, this is your preferred notion of sovereignty, because it's so, like, an umbrella term. And the notion of digital sovereignty appeared in a time where, especially, digital platforms or platform [00:08:00] companies started to overtake the role of the state and the role of civil society. They started to play as a state, especially, like, from 2016, when Trump won for the first time. So sovereignty is a way to address tech dependency, and also to address the way some platforms are trying to circumvent rules in countries like Brazil, or even in regions like the European Union.
But what some platform companies learned is the way they always do it: with a lot of words. Because you can see the way digital labor platforms like Uber or DoorDash or something, they already co-opted the term autonomy, saying that the workers have autonomy over their labor when they are working [00:09:00] for apps like Uber, and other companies co-opted terms like partners: you are not a worker, you are my partner, you are my collaborator. So the way companies, especially tech companies within this Californian Ideology or Silicon Valley ideology, they are always trying to co-opt the discourse in order to empty the political meanings, or to reappropriate the meanings in their own ways. This is not something really new, but this is what happened with this notion of sovereignty, in reaction to the proposals, especially in the European Union, trying to address regulation, and strong regulation, for platforms and for AI. And this was a kind of communicative or discursive response from the companies when, two, three years ago, Meta, Alphabet, and [00:10:00] Amazon launched digital sovereignty programs. And you can say, like, is it a joke? Why is Amazon, uh, the leader in infrastructure services and cloud services around the world, launching a digital sovereignty program, and what does that mean? They are selling sovereignty as a service, and they are reappropriating terms like data sovereignty as something individual, as something you can claim, you can buy, and especially directed to governments. So if you are a government, especially in the global south, and you want to be sovereign, but you don't have the infrastructure or the money to be sovereign in a proper way: we can sell sovereignty to you. And what that means is they install a local cloud in the country, saying, you are sovereign, because [00:11:00] now the cloud and the data centers and so on are in your own territory, but they are owned by us. And so this is a way, in both a material sense and a discursive sense, to sell sovereignty as a service. And this is the sovereignty claimed by big tech, or big tech sovereignty.
Alix: We'll hear a little more of Rafael at the end of the episode, but I wanna move on to Karen Hao, a great investigative journalist who you probably know best as the author of Empire of AI. She's gonna share more about the term 'data rich', and, to connect with Rafael's last point, as Karen will explain here, achieving quote-unquote sovereignty is also a massive challenge for a lot of nations 'cause they're still fighting historical colonial powers or empires.
Karen: I've heard this also from, like, Chilean policymakers and from people in Kenya that there's just this, I don't wanna say cynical understanding, but a pragmatic understanding that [00:12:00] they want to have a seat at the table, and the way to have a seat at the table is to find the role that they can play and to kind of make themselves indispensable in some kind of way to the global supply chain of AI development. It is really hard to figure out, when you don't have capital, like, what to offer instead, and the easiest thing at that point is to offer your people, your minerals, and your data. That said, I absolutely recognize and I sympathize with the position that they're in, because one of the reasons why they're only able to offer these things is the history of colonialism dispossessing them of the strong economic growth that is experienced by global North countries today. And so within this particular framework, they feel a bit handcuffed in what they can actually do to get into the room and have that access. And at the same time, I wish that there was just a more expansive [00:13:00] idea in general about how all countries should be brought to the table, not just based on them having to offer themselves up to extraction and exploitation.
What we're seeing today with the way that Silicon Valley is orchestrating the development of AI systems is pretty much a match with how empires of old operated. They're consolidating an extraordinary amount of economic and political power by dispossessing the majority of things like their resources, their land, their labor, their data; by exploiting that labor in ways where, even as that labor is contributing to the expansion of the empire and accruing more value to the empire, they're not seeing it themselves. Also, empires engage in control of information flows, where they try, either through soft or hard ways, to essentially make the narrative such that [00:14:00] they can continue to do what they want, and censor inconvenient truths that undermine their imperial agenda. And we see the AI industry engage in that as well, by how they control what science is ultimately produced about the fundamental capabilities or limitations of AI systems. And we just don't have a clear picture of that anymore, in the same way that we wouldn't have a clear picture if most climate scientists in the world were bankrolled by fossil fuel companies. And I think the last important dimension to recognize is that there's a kind of quasi-religious element to a lot of the push for the AI empire, in the same way that religion was a really crucial part of colonial empires as well, where there is both this narrative that these companies are the good empire on a civilizing mission to bring progress and modernity to humanity. So there's this moralizing tenet that drives what they're doing, but it's also undergirded by these, like, fears around AI [00:15:00] potentially going rogue and devastating humanity, and that is why they need to have supreme control over the development of this technology, because if it falls into the hands of the bad actor, that could be total obliteration of the human race. One of the things I feel really strongly about is, like, narrative is a huge pillar that props up the empire, and their narrative only works when there's a vacuum of other information about what the actual reality of the impact of their technologies is, because their narratives are detached from reality.
They talk about abundance, they talk about utopia, they talk about theoretical risks like everyone dying. And this is just, like, not rooted in the reality of what is actually happening today with real people, average people, all around the world. And so communities don't even have to, like, be organizing an intense protest action like Moosa had; simply just voicing, [00:16:00] documenting, and raising awareness, to either journalists or directly through their own communication channels, about what they are seeing within their communities, helps to fill in that vacuum and weaken the ability of these companies to manipulate the narrative and then capture nation-state governments and other, like, more powerful elite actors, who are often only being talked to by the corporations themselves.
Alix: And now, to talk about filling that knowledge vacuum, we have Joan Kinyua, who's the president of the Data Labelers Association, an organizing group for data workers to fight for fair pay and better working conditions. And we're gonna dig into the term human capital. She describes what it's like having your work invisibilized when you're told you're gonna be a part of something huge and important, but actually you're just subject to a business tactic that's essential for propping up empires.
Joan: Ever since we started having the conversation about the role of data labelers is when people started to actually know, [00:17:00] like, we are in existence. So it is our voices that made us known, and this is for the minority of the people who have had the chance to listen to us, or who have interacted with most of the content that we've been putting out in the media, or online on the social media pages. But a majority of the people have been fed this narrative that AI is doing it on its own. Most of the people think, like, AI is magic, and I do not know why we are being invisibilized. For example, when I started this job, um, you are told, like, you are going to be recognized, you're part of something big, and, like, your hopes are very up. You submit quality after quality. I can tell you for free, the quality that you were supposed to give was 98% and above. So when you're promised that you're going to be recognized for the self-driving cars in the next maybe 10 [00:18:00] years, you're so pumped about everything and you just want to submit quality jobs. So that was among the things that were really helping make us give quality jobs. But fast forward 10 years, and in fact the situation got worse. Where once we were having conversations with these people, now you do not have anybody you're speaking to, because they've already gotten what they want from you. So we are completely not known, completely not recognized, and completely, like, sidelined out of even the conversations. So at the Data Labelers Association, we represent the people who are powering artificial intelligence.
So it's the data labelers, it's the content moderators, it's the people doing transcription tasks, and recently it's the chat moderation that we've just learned of, like the erotica GPT. It's basically the whole landscape of AI. So why we came into existence is because we faced a lot of challenges, and we had no [00:19:00] recourse. We had nobody to report the challenges to. You are not paid: you've worked for a full month, you're not paid, and you do not have anywhere to report that to. You do not have any social protection. You have family, you have children. And most of the time, when it comes to maybe working in a BPO, it's set up in a suburb area where you have to use, like, two vehicles, and, like, the salary you're getting, you're just recycling it to just power the same system, right?
Alix: Mm-hmm. Really quickly: BPO, can you break that down? What does that acronym stand for?
Joan: So the BPO is business process outsourcing. It's a model that is used by big tech to outsource jobs here in Kenya. So basically, what they do is just an office. The major decision making is done in the countries where the BPOs are headquartered. So the business process outsourcing is just the facility that is here in Kenya, that is offering the space and the people, but the operations, the major operations, the costs, and everything else come from the other side. [00:20:00] So, like, there are no mental health safeguards when it comes to the workers. You find either it's the content or the work that you're doing that is really harming you; it's the nature of the work, because, like, you're always walking on eggshells when you are working in these spaces; it's the pressure; it's the monthly salary that does not make sense when you have a lot of things that the salary is supposed to sustain. So we came up because of these systemic challenges, and after several conversations, and after, like, just being in the space for the one year that we have been in existence, we have come to realize it's sort of intentional to be like that: not to be enforced, not to be paid well. It's the system that is also allowing this to happen, right?
So we took it upon ourselves to fill the gap, like the knowledge gap that is there, of who the data labelers are, because there's this blanket definition of platform workers, and it usually speaks [00:21:00] about the delivery drivers, the Ubers, but it did not include us. Let me give you an example of Kenya, where the majority of the people are young, and almost everybody is very well educated. We all have university degrees, we can all speak English very well, and there's good internet penetration when it comes to connectivity. We are all very tech-savvy people, so that's a very ripe market for big tech, right? Um, also, there are no jobs. Most of us, like, you find, like, somebody is certified in a profession, but there are no jobs in the field that they're trained for. So what is next is the work that is available. So you find that it's cheap labor, cheap quality labor, and a ready [00:22:00] market. We are very ready: you just give us a platform, promise us you're going to pay, and then we are ready to submit tasks. So I think that's just the main reason why they're concentrated in areas where there are so many young people and unemployment rates are very high. It's a very good space for them to use the people that are there.
Alix: I wanna take a step back, thinking a little bit about how AI conversations around labor are framed in things like summits, when lots of nation states get together, lots of heads of state wanna talk about AI transforming workforces, et cetera, and they talk a lot about upskilling or workforce transformation, which obviously touches on some aspects of data labeling and some aspects of the organizing that you're a part of. How do you feel about that framing when it comes to AI? How do you see that connecting, or not connecting, to the [00:23:00] work that the people represented in the association are doing?
Joan: First of all, when you're speaking about reskilling and upskilling, most definitely they're not speaking about data labeling. They're speaking about the engineers and all those big, big titles. Personally, I've been in this space for more than five years, but I found it very hard to transition to any other field, because the skills that I had learned in this space were just drawing boxes, like shapes around things, and recognizing different things and stuff like that. It's very hard for me to have a conversation with another person, sort of, like, tell them I was working in this field; they're going to ask me, for what purpose? So I feel like we're concentrating on, like, the major fields. I know why they do this: it's because this has always been termed as an emerging thing, or the future of work or something. But how is it the future, while it has been in existence for the last 10 years? So I feel like it's being treated like an entry-level job, or a transition job. It's [00:24:00] not being treated as a job, yet we have been working in this space for the longest time, and people are continuing to work in this space. Yeah. So I feel like it's the notion and the narrative of people continuing to say that it's an entry-level job, or it's a place where you just pass time. Even when you read the ads, it's: you're a stay-at-home mom and you need, uh, something that will keep you going. Like, it's not framed as something very serious, but it's something very serious. Even when it comes to contracts, you find that somebody's giving you a monthly contract, a renewable monthly contract, but you are going to renew that contract for five years.
So it's the whole system that is really, really messed up.
Alix: I wanna turn to my conversation with Chenai Chair, director of the Masakhane African Languages Hub, which is an organization that works to preserve evolving or dying African languages with AI. And she [00:25:00] explains really well how this is done in a thoughtful and equitable way. So the term for this interview was linguistic diversity, and as Chenai will explain, people have already been thinking about this for years.
Chenai: In essence, what we are seeing is people catching up. A lot of people who do not speak the most recognized languages, like your European languages and your American languages, and I mean American like popular languages, have always wanted to have their own languages recognized. That's why there were efforts where people were, kind of like, really fighting to think about how they could create datasets of their own languages. And our argument five years ago was that tech companies have the data, they just simply don't care for the markets. Now the market is opening up, so what we are seeing is that reaction of, like, no person left behind, as we saw with the internet-for-all and the mobile-phones-for-development movements. Now we're here in the next phase of development, which is about creating tools and resources and technologies that everyone can [00:26:00] actually make use of, beyond being able to speak one of the most dominant languages, which is either English, French, Portuguese, or Spanish. Potential political pitfalls often involve thinking about how we are moving towards another level of data, which is language, and language is also personal identity. So to what extent are we thinking about the safeguards involved with actually curating this data, and what's the point of value extraction that's happening for the people who are actually then providing these language datasets in their own languages? Are they the ones who are actually then building the technologies?
And then also thinking about the issue of: are we going to have this rush to have these datasets that may actually heighten already existing political tensions in those countries when certain languages are prioritized over others? And that's why we've seen, sort of like, the success of small community entities actually seeing their languages digitized, because maybe the governments wouldn't have invested in them, but they wanted to see their own languages [00:27:00] digitized. Hence that importance and value of ensuring that there are multiple players at the table who are actually doing the work. And then another thing, often, is, like I mentioned, some people started doing this work a long time ago. There is the oversight of those communities now in the decision making process and the governance of the work. They may not have access to these rooms because they may not be able to get a visa to come in time, they may not be able to afford the flight. So it's also that question of who actually is being included, for the people who've always been doing the work, and who's now taking up that center stage.
So that, I think, has been the significant, um, issue around language creation. And then, of course, we exist in contexts where there may be increased harms, such as increased surveillance, because now people can, like, better understand how people are talking in their languages, and that's a key issue. And then also, on regulatory safeguards, we're still existing in a context where some people have not refined their data protection laws or their access-to-information acts. So people are [00:28:00] having this data collected, but they actually are not sure what it's going to be used for yet. There is value if people are voluntarily, with consent, participating in the design of the systems, because then they can communicate with their devices in the language of their choice. That's one of the reasons why the Masakhane African Languages Hub was set up, and the Masakhane African Languages Hub does come from the wider Masakhane community of people who are doing things by the bootstraps, because they were rejected in rooms, because the people were like, what do you mean, African languages have some form of intelligence and can be written about? So that was, like, the response and the call to action. So that stands as an example: you just have to look at global majority people, who are often excluded in spaces, and the tenacity to actually say, hey, I want to hack the system and I want to see my language represented. And so what we see there is more a community-led, grassroots approach to creating these datasets.
The significant challenge there, of course, is that it's not sustainable to always keep everything at the community level. [00:29:00] So there's a need to have multiple players in the room. There are over 2,000 languages on the African continent, and they are evolving. Some are dying off, but they're dying off and evolving. And so then the significant thing is that it cannot be done by one entity. Um, so you have to think about how many players can be involved in the space, who we need to think about having in a multi-stakeholder approach: having linguists, sociologists, the actual community of speakers, entities interested in education, for example, because not a lot of countries actually use their own mother-tongue languages for education. So we've already got use cases ready in play, but then it's a matter of there needing to be multiple players contributing to it. It's actually being able to evaluate the language AIs that are created by the people who are hoovering up all of this data, to see if they really work. Because the significant thing is that sometimes they always miss the cultural nuances. So you'll have hallucinations, and you're being told, you know, this name sounds African, and then you're like, [00:30:00] what language is it? And then you realize there's no language, it's just a bunch of syllables put together. And this was an experience I recently had, um, making use of a gen AI platform. And I laughed so hard, because it sounded familiar in my language, and then, when I asked where it's from, it was just, like, it sounded like it's African. So those are the realities of, like, bringing everyone to the table. I have seen examples of some people, I'm just trying to remember which language it was, where they didn't want to have the language digitized.
And so then that actually creates the nuance of intergenerational struggles and cultural issues that actually need to be navigated. This is the part where we take out big tech and we actually think about the social norms at hand. So you may find that, for example, younger people are interested in collecting their languages, because they're like, this is the way that I'll connect to my family, and then you may have family members, older members of the language community, actually saying, absolutely not. I remember doing this one [00:31:00] project where someone was like, if you take my voice, it can be used for witchcraft. And that's a valid point, right? 'Cause that's a cultural understanding and an issue. So in those instances, we have often said that, for languages that are created, if there's a community that actually says they don't want their data to be digitized, it has to be recorded and it has to be clear. It does create a vacuum of governance, because you'll have the push of, it's important for the language to be digitized, and then you have people saying, but I'd rather not, it's also a means of protection of our community and of self. So, I don't have the answer, which is to say that this is something that we're constantly in conversation and awareness of, because we often ask ourselves, who actually does own the language? Like, who gives us the permission? And that's why there's a need to have the state-recognized language councils as part of the conversations, but also actually going into communities and figuring out, you know, what is the accepted means of getting people's buy-in, [00:32:00] and also doing a lot of the work of explaining why it's of value, if people are curious about it. But if people are not interested, then you have to acknowledge that that's not gonna happen.
Alix: I wanna revisit my conversation with Karen now, because she shared a fantastic story about the Māori, who managed to revitalize their language in a way that was careful, consensual, and equitable from beginning to end.
What a concept. Here's Karen to say more.
Karen: These are, like, things that are the actual practical ways that we can make dents in not just the quality of life for people, but also these much bigger challenges. And then the other example I often give is the Māori one in the epilogue of my book, where there's this organization called Te Hiku Media in New Zealand. They're a te reo Māori radio station nonprofit.
And so they broadcast in the language of the Māori people, and have for decades, as part of the broader movement to try to [00:33:00] revitalize the te reo Māori language, which was almost lost because of colonization policies. They thought that they could open up this rich archive for te reo Māori learners, so that they could actually listen to the sounds of their elders, especially the sounds of their elders who predated colonial distortions of the language.
And so they were thinking through ways of making the archive more accessible, and they felt that they needed to transcribe the archive such that people could listen, but also see the text, and then be able to click on the text and get definitions of it and have it be a more interactive learning experience.
But that then required there to be people transcribing the audio, and there just are not that many te reo Māori speakers that have that level of advanced language skill and have the time to do that work. So it ended up being the perfect example of a moment [00:34:00] in which AI could be really useful. And so they developed a speech recognition tool for te reo Māori, and in going about developing it, they took a fundamentally different approach from the norms in the tech industry. They first went to the community and asked their consent for developing the tool at all, which is, what a concept, a process completely skipped over these days, right? Like, it's sad that that's radical, but it is radical.
And once they actually got the permission, then they moved on to educating the community, saying, like: in order to develop this tool, we will need data. This is the kind of data we would need. This is why we would need that data. This is how we would store and protect that data once we collect it for this project, to make sure that it doesn't end up getting used in other ways that you didn't actually sign off to.
They started a public campaign drive, basically, where they got people into, like, [00:35:00] a fun competition to participate, consensually, in recording and transcribing and donating their data. And within days, because they had the entire community on board, they were actually able to get enough of the data to train their first performant model.
And then, with that model, transcribe some of the archival audio, clean up those transcriptions, and then create a positive, virtuous feedback loop of training to improve the model more and more and more. Te reo Māori is unique in that there are more te reo Māori speakers than speakers of other indigenous languages, and there are also some similarities between te reo Māori and other Pacific indigenous languages, and so they've actually opened up their base model to other Pacific languages to, interestingly, enable them to actually, like, replicate the same thing, but without nearly the same amount of data resources. Because some of these communities, you know, have just [00:36:00] a few thousand speakers or a couple hundred speakers. You know, like, I've learned of some initiatives where the indigenous language only has, like, four speakers left, and so they're working with, like, a really, really, really limited sample.
That entire lifecycle of how they approached the development of this tool is exactly what I wish we would see all around the world, because they were constantly in communication with their community to make sure that the technology was wanted, that they were designing it in ways that were actually delivering benefit.
They were also preserving the values of the community. The te reo Māori community really, really cares about their sovereignty. Like many indigenous communities, they really care about their sovereignty. And that was a huge pillar of the project: regardless of what we do in this project, we will never undermine the sovereignty of our community by somehow, you [00:37:00] know, then giving that data to big tech or something, you know. And those values might be different for a different community, but, like, that community should then decide what they want and what they wanna uphold. So this, like, one-size-fits-all model of AI development is inherently so problematic and colonial. If we can shift away from that to thinking about a multitude of small, specialized, localized models that are much more controlled and governed by each individual community, I think we will be in a much more democratic place with AI development.
Alix: Let's briefly hear from Rafael again, because I think so many of these conversations are about ownership, and sovereignty is obviously a big part of that. And what Rafael touches on here is that words like sovereignty have become so elastic that anyone can kind of take them and use them for their own purposes.
So here's Rafael now.
Rafael: It's the context. Like, in the nineties we lived [00:38:00] in the context of globalization, and we are not living in a context of globalization anymore, for a lot of reasons. And sovereignty can be plastic enough to be used by Trump, or by China, or by the European Union, or Brazil, with different meanings.
So they can use the same word with such different meanings and participate in the same conversation, because they are not the same thing. When Trump is claiming sovereignty, it's not the same word as when the Homeless Workers' Movement claims it. I don't know why all of them are using, uh, the same word. What I know is that both in the policy debates, but also in academia, there is some hype language, depending on the context. I can say, from academia, how 'decolonial' became a trend, even in [00:39:00] very colonial spaces, saying 'decolonizing the curriculum,' 'decolonizing,' like, comic books, and also co-opting them. And it's about power relations, and how, if you create an alternative, it's: let's bring these alternatives to us.
I'll finish with one example. When I was living and working in Brazil, I organized a public seminar on platform cooperatives and public policies, and the main delivery platform in the country attended the conference as spies. They didn't tell people they were staff; they are, like, employees of this big delivery platform in the country.
And people discovered they were not, like, ordinary attendees; they were, like, spying on the event for the delivery [00:40:00] platform. And they said to someone, we want to create cooperatives as well. So when you try to create an alternative, the power, the discursive power, will try to avoid this alternative. And discourse is one of these issues.
There is another hype word, which is governance. And governance is really related to sovereignty, and if I titled my talk 'AI, governance, sovereignty, and the decolonial,' I would be the best-attended at the AI summit, in the majority world as well. Governance sometimes is framed as something where, oh, workers have a seat, government has a seat, and everybody has a voice in this so-called governance.
But really it's class struggle. It's not something equal. It's profoundly [00:41:00] unequal and unfair, because what you call AI governance is someone having more voice and more participation than others. So it's a kind of fake word to name the real politics.
Alix: I wanna end on a really powerful metaphor, anecdote, something, from Joan Kinyua, which I'll let her explain now.
Joan: In Kenya, usually when you're speaking about good things, you refer to them as cake. So for me, I am the cake for the president; he's the owner of the cake. So instead of selling this cake, he's giving it away. For example, here, there's this bill that was passed the other day. It's called the Business Laws (Amendment) Act 2024, and there's this part that was touched, um, where they say that big tech cannot be sued in Kenya; we can only sue the [00:42:00] BPO. So how can you sue, like, a building? Because the major decision-making and everything else, even, like, the tasks, are not here; they're on the other side. So when you see that, you really feel very discouraged.
In fact, like, I felt like I wanted to stop this activism job, because you are supposed to be our biggest supporter. Because in Kenya, I can say, for now in Kenya, I think AI is the biggest employer of the young guys, through the organizations we know and those that we don't know. These jobs are the biggest employers. So instead of banking on the young guys and selling them at the highest rate, he's just short-changing them, for his own benefit, to be in the good books of Mark Zuckerberg and in the good books of Elon Musk. It's just that they want to be identified with the big people. They want to be part of the [00:43:00] conversation, because, again, big tech, they're in bed with the politicians, so there's nothing you can expect from the president.
There's nothing you can expect from anybody in government when it comes to banking on us as workers. And that's why we took it upon ourselves to even speak about the role, the work we are doing. And that is why we are coming up with things of our own. Because if we wait on other people, nobody's going to do it.
They really care about being in the good books of others; like, they just care about, I dunno, if it's the money they're being given, or their reputation, or they just want to be seen. At times it's not even beneficial in any way; they just want to have a conversation with them.
Alix: They wanna be bringing the cake to the party.
Karen: Yeah.
Joan: Yeah.
Alix: Thank you to Karen, Rafael, Chenai, and Joan. The full interviews are available on YouTube, and you get to see their beautiful faces also, which is a plus. Uh, the link is in the show [00:44:00] notes. If you're headed to Delhi next week, good luck with your travels. Um, do let us know if you hear these terms used on the ground in ways that you wanna share back with us.
We've been thinking about a buzzword bingo. Probably won't put it together, but very curious how it all goes. Next episode features Abeba Birhane, Audrey Tang, Meredith Whittaker, and Usha Ramanathan, so some real heavy hitters. Aren't you lucky? Um, thank you to Georgia Iacovou and Sarah Myles for producing this episode, and we will see you next week.
