Workers throughout the AI life cycle.
A growing body of research shows the
crucial yet frequently forgotten role of human labour in AI. Each stage of an AI product life cycle, from development and production
to maintenance, relies on human labour,
often through digital platforms and business
process outsourcing companies dispersed
around the world. An AI life cycle requires human labour
at three stages, namely, data preparation,
modelling and evaluation (figure II.5). Data
preparation and AI evaluation may require
different levels of content-specific expertise,
while modelling generally requires higher
competences in computer science.
The initial stage, data preparation, involves
data collection and annotation. Despite the growth of unsupervised learning from unstructured data, AI systems still rely on human annotation to label and mark data in order to add meaning. Computer vision models, for
example, rely on semantic segmentation,
a time-consuming process requiring each
pixel in an image to be assigned a relevant
label. Similarly, autonomous vehicles rely
on databases of images annotated by
humans through classification, object
tagging and landmark detection.
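As a minimal sketch of what pixel-level annotation produces, the snippet below represents a tiny "image" as a grid in which every pixel carries a class label. The class names and the grid are illustrative only, not drawn from any real dataset.

```python
# Semantic segmentation output, in miniature: one class label per pixel.
# The label set and the 4x6 "image" below are invented for illustration.

LABELS = {0: "sky", 1: "road", 2: "car"}

mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 2, 2, 0, 0],
    [1, 1, 2, 2, 1, 1],
    [1, 1, 1, 1, 1, 1],
]

def label_counts(mask):
    """Count how many pixels carry each class label."""
    counts = {}
    for row in mask:
        for pixel in row:
            name = LABELS[pixel]
            counts[name] = counts.get(name, 0) + 1
    return counts

print(label_counts(mask))  # 10 "sky", 10 "road" and 4 "car" pixels
```

A human annotator must, in effect, supply every entry of such a grid for every image, which is what makes the task so time-consuming.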
One source of such annotation is the use of a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart).
While some aspects of data preparation
can be automated, many tasks still require
human judgment. For ChatGPT, for example,
the initial model training involved human
trainers who engaged in conversations,
posing as both users and AI assistants.
To optimize its performance, the model’s
parameters and settings often need to be
adjusted by machine-learning experts.
Creating training data for specialized
fields such as translation or transcription
requires workers with high levels of skill. Medical systems require
professionally trained workers to label
and tag images and videos; common
annotation tasks include the pixel-level
segmentation of surgical images, bounding
box annotations around organs and the
plotting of characteristics within data. Such
tasks can be time-consuming; an hour of
video footage may require approximately
800 hours of human annotation.
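A bounding-box annotation of the kind described above can be sketched as a simple structured record. The field names follow a generic COCO-style layout and are an assumption for illustration, not the schema of any particular annotation tool.

```python
# Illustrative record for a bounding-box annotation around an organ in a
# medical image. Field names are assumed (COCO-like), not a real tool's schema.

def make_box_annotation(image_id, label, x, y, width, height, annotator):
    """Build one annotation: a labelled box given by its top-left corner
    (x, y) and its size in pixels."""
    if width <= 0 or height <= 0:
        raise ValueError("bounding box must have positive area")
    return {
        "image_id": image_id,
        "label": label,
        "bbox": [x, y, width, height],
        "area": width * height,
        "annotator": annotator,
    }

# Hypothetical example: a worker boxes a liver in one video frame.
ann = make_box_annotation("frame_0001", "liver", 120, 80, 64, 48, "worker_17")
print(ann["area"])  # 3072 pixels
```

Each frame of footage may need many such records, which is why an hour of video can translate into hundreds of hours of annotation work.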
The second stage, modelling, is more
complex and technical and requires
significant human expertise and decision-making. Developers and data scientists need to select the appropriate model
architecture and algorithms and therefore
require an understanding of the advantages
and limitations of different models and
algorithms, as well as expertise in a
particular domain, such as medicine or
transportation. During the model training,
when an AI model learns patterns from data,
human operators manage, optimize and
guide the process. Engineers, for example,
need to troubleshoot model errors or issues,
check for signs of overfitting or underfitting and adjust the model’s hyperparameters. Overfitting and underfitting are common problems in statistics and machine learning. Overfitting occurs
when a model is too complex, fitting the training data too closely and failing to generalize well to new data.
Underfitting occurs when a model is too simple to capture the underlying patterns, leading to poor performance even on the training data. One study showed that human judgment remains crucial, since “algorithms cannot always tell the difference
between terrorist propaganda and human rights footage or hate speech and provocative comedy”.
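The overfitting/underfitting distinction described above can be illustrated with a toy k-nearest-neighbour regression: with k = 1 the model memorizes the training data (zero training error but poor generalization), while with k equal to the whole training set it predicts one constant everywhere (underfitting). The data points are invented for the sketch.

```python
# Overfitting vs. underfitting with k-nearest-neighbour regression
# on invented 1-D data.

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mean_sq_error(train, data, k):
    """Mean squared prediction error of k-NN over a dataset."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)

# Roughly linear trend with a little noise.
train = [(0, 0.1), (1, 0.9), (2, 2.2), (3, 2.8), (4, 4.1), (5, 5.0)]
test = [(0.5, 0.5), (2.5, 2.5), (4.5, 4.5)]

print(mean_sq_error(train, train, k=1))           # 0.0: memorizes the training set
print(mean_sq_error(train, test, k=len(train)))   # large: one constant prediction
print(mean_sq_error(train, test, k=2))            # small: a better-balanced choice
```

Part of the human work in the modelling stage is exactly this kind of judgment: comparing training and test error and adjusting hyperparameters (here, k) accordingly.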
In the final stage, evaluation, humans need
to review the outputs in order to maintain
quality control and feed information
back into further model training. With
regard to translation, for example, human
experts assess the accuracy of machine
translations and diagnose errors, providing
feedback for improvement.
This interplay between humans and
machines extends to large language models
such as ChatGPT. Humans are needed
to evaluate performance both qualitatively
and quantitatively and to ensure a model
meets quality standards and avoids biases
related to gender, race, religion or other
attributes. Human labellers rank model answers from best to worst; these rankings feed into reinforcement learning from human feedback, a process that helps align systems with human values and preferences and match complex measures of human quality more closely.
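The first data-handling step in reinforcement learning from human feedback can be sketched as follows: a labeller's ranking of model answers (best first) is expanded into (preferred, rejected) pairs, which are then typically used to train a reward model. The answer strings below are placeholders.

```python
# Turn one labeller's ranking (best first) into preference pairs, the raw
# material for reward-model training in RLHF. Answers are placeholders.

def ranking_to_pairs(ranked_answers):
    """Every answer is preferred over every answer ranked below it."""
    pairs = []
    for i, better in enumerate(ranked_answers):
        for worse in ranked_answers[i + 1:]:
            pairs.append((better, worse))
    return pairs

ranking = ["answer A", "answer B", "answer C"]  # labeller's order, best first
print(ranking_to_pairs(ranking))
# [('answer A', 'answer B'), ('answer A', 'answer C'), ('answer B', 'answer C')]
```

A ranking of n answers yields n(n - 1)/2 such pairs, so each act of human judgment is leveraged several times over.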
In monitoring and labelling content online, workers may be exposed to disturbing or objectionable material that could negatively affect their mental health.
There is also a risk of deskilling and
dissatisfaction due to mismatches
between qualifications and tasks. Workers
annotating or deleting images, that is,
carrying out repetitive low-skill tasks, may
be highly educated. In India and Kenya, for
example, a survey conducted in 2022 on
microtask platforms and business process
outsourcing companies showed that
highly educated workers, with graduate
degrees or specialized education in science, technology, engineering or mathematics, were often relegated to
relatively low-skill tasks such as text and
image annotation and content moderation.
Such significant waste of human capital
may be exacerbated in increasingly
connected job markets, in which tasks are
outsourced globally.
AI systems require continuous adaptation
and, as they are employed to address
new challenges, the demand for workers
for their development will likely persist. AI systems can thus provide new forms of
employment, but this is not necessarily
“decent” work. In the data preparation
stage, for example, employment can
involve exploitative, often-precarious
working conditions. Data annotators in developing countries, for example in Kenya and Uganda, often face workdays of up to 10 hours at wages of less than $2 per hour, repetitive tasks and limited opportunities for career advancement.
With regard to content moderation
(e.g. of social media posts), algorithms
or machine-learning systems can help
flag data for human attention. This
process may be harmful for workers.