From the course: How to Measure Anything in AI: Quantitative Techniques for Decision-Making
Treating AI as knowledge work
- At first glance, measuring AI seems like a brand-new measurement problem that has never been addressed before, but it actually has precedent in business measurement. To see this, let's look at another measurement problem that organizations have been dealing with since long before AI was available: the productivity of knowledge workers. Knowledge workers don't produce a part in a factory. Their output is structured information and judgments. They produce advice, computer code, analysis, and presentations, and sometimes those contribute to sales or to completing an assigned mission. Let's consider three such knowledge workers who work in the same company performing the same roles: John, Aziz, and Maria. How would you measure their performance and compare them? You might want to evaluate who is faster at a given task, or who produces objectively higher-quality work on that task. Or who is the most cost-effective resource for the task once you account for its true cost: the speed of the work combined with the cost of rework and fixing errors. These are common trade-offs that businesses and organizations weigh all the time. For a given task, we know how to measure who finished faster. If their labor costs differ, we can determine who did it more cheaply. To know whose work is better, we can measure error rates in the output. Combining error rates with the time spent fixing errors and the cost of each employee, we can determine who is actually more economical overall for the task. For a task involving something like writing quality, we want to go beyond mere error rates. We might instead rely on qualitative grading of the reports. As a control to ensure objectivity, we would hide who produced each report from the reviewers. 
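The "more economical overall" comparison above can be sketched as simple arithmetic: a task's true cost is the initial labor plus the cost of fixing its errors. This is a minimal illustration, and every figure below (hourly rates, hours, error counts) is invented for the example, not data from the course.

```python
# Hypothetical sketch of the true-cost comparison described above.
# All numbers are made-up illustrations, not real measurements.

def true_cost_per_task(hourly_rate, hours_to_complete,
                       errors_per_task, rework_hours_per_error):
    """Total economic cost of one task: initial labor plus rework."""
    initial_labor = hourly_rate * hours_to_complete
    rework = hourly_rate * errors_per_task * rework_hours_per_error
    return initial_labor + rework

# Three hypothetical workers in the same role, as in the transcript.
workers = {
    "John":  dict(hourly_rate=60, hours_to_complete=4.0,
                  errors_per_task=2, rework_hours_per_error=0.5),
    "Aziz":  dict(hourly_rate=80, hours_to_complete=2.5,
                  errors_per_task=1, rework_hours_per_error=0.5),
    "Maria": dict(hourly_rate=50, hours_to_complete=5.0,
                  errors_per_task=4, rework_hours_per_error=0.5),
}

for name, profile in workers.items():
    print(f"{name}: ${true_cost_per_task(**profile):.2f} per task")
```

Note how, with these invented numbers, the worker with the highest hourly rate can still be the cheapest per task once speed and rework are folded in; that is exactly the trade-off the passage describes.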
In other cases, we might rely on customer feedback. If the work is more directly tied to sales, we can look at who sells more. These are fundamental components of measuring knowledge work in humans. Now suppose one of those workers, John, was actually an AI. Would the work output be evaluated any differently? What if Aziz was using AI to support his work? Would the work be evaluated any differently in that case? The answer to both questions is no. The output metrics are still the time taken to complete a task, the number of errors, and the true economic cost and benefit, which combines the time and the quality of the work. The methods for measuring all aspects of knowledge worker performance are already well developed. Research on this topic, which we will discuss further in this course, demonstrates that if you compare humans, AIs, and human-AI hybrids, you can directly compare the quality of the work product, error rates, and overall cost. So when determining the value of AI, you would evaluate it based on how much faster, better, or cheaper it made the same task, just as you would if you were directly comparing three workers. This relates to one of the measurement maxims we will introduce later in this course: it has probably been measured before. In the next lessons, you will learn how common misconceptions about measurement in general get in the way of measuring AI, and we'll cover some simple ideas to start measuring AI in your organization.
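The point that the same metrics apply to a human, an AI, or a human-AI hybrid can be sketched as a single record type scored on the same three axes: faster (time), better (errors), cheaper (total cost including rework). This is a hedged sketch; the worker profiles and every number in them are assumptions invented for illustration.

```python
# Hedged sketch: identical "faster, better, cheaper" metrics for a human,
# an AI, and a human working with AI. All numbers are illustrative only.

from dataclasses import dataclass

@dataclass
class TaskRecord:
    worker: str                  # "human", "ai", or "human_plus_ai"
    hours_to_complete: float     # faster?
    errors_per_task: float       # better?
    direct_cost_per_task: float  # labor cost, or API/licensing cost for an AI
    cost_per_error_fix: float    # rework cost, whoever does the fixing

    @property
    def total_cost(self) -> float:
        """Cheaper? Direct cost plus the cost of fixing the errors."""
        return (self.direct_cost_per_task
                + self.errors_per_task * self.cost_per_error_fix)

records = [
    TaskRecord("human",         4.0, 1.0, 240.0, 40.0),
    TaskRecord("ai",            0.1, 3.0,   2.0, 40.0),
    TaskRecord("human_plus_ai", 2.0, 1.0, 120.0, 40.0),
]

# Rank all three candidates by true economic cost, exactly as we would
# rank three human workers.
for r in sorted(records, key=lambda r: r.total_cost):
    print(f"{r.worker}: {r.hours_to_complete}h, "
          f"{r.errors_per_task} errors, ${r.total_cost:.2f} total")
```

The design point is that nothing in the scoring changes when a row happens to be an AI; only the inputs (direct cost, error rate, speed) differ, which is the transcript's argument in miniature.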