TL;DR
This newsletter provides a simple explanation of algorithms. It looks at what they are from both a technical and a sociological point of view.
Here are the key takeaways:
On its face, an algorithm is an instruction about how to process given inputs that is interpreted and executed by a computer to produce an output.
Traditional algorithms take an input and apply processing rules to produce an output. Machine learning algorithms take inputs and outputs and work out the processing rules to turn those inputs into the desired outputs.
However, such explanations of algorithms only explore their technical functionality. Algorithms are about much more than their lines of code.
The existence of algorithms relies on two interconnected ideas:
Algorithms are created by people.
Algorithms therefore reflect the assumptions, biases and values of their creators.
So no matter the algorithm in question, as well as looking at its lines of code to determine functionality, we should also be asking:
Who developed the algorithm this way and why?
Who implemented the algorithm into the wider process or system and why?
The answers to such questions help reveal the underlying assumptions, biases and values behind the algorithm. This is what algorithms are really about.
What is an algorithm?
To put it simply, an algorithm is an instruction about how to process given inputs that is interpreted and executed by a computer to produce an output.
This means that algorithms are about three things:
Input. What data are being used by the algorithm?
Processing. What is the algorithm doing with the data it is given?
Output. What does the algorithm produce using that data?
You can see a very simple example of an algorithm below:
def times_two(x):
    y = x * 2
    return y
The above Python script shows an algorithm called times_two that takes an input, does something with it, and produces an output:
Input. The first line shows the data fed into the algorithm. In this case, the input is an integer (a number) which is defined as x.
Processing. The second line shows how the algorithm processes the input, with the result defined as y. It takes the integer and multiplies it by 2, which is shown as x * 2.
Output. The third line shows what the algorithm produces. Here, return y means that the algorithm outputs whatever y is, which in this case is the product of x * 2.
So if we take the times_two algorithm and give it an input of 2, we can expect the output to be 4. The instruction given to the computer on how to process the input to produce the output is very clear.
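As a quick check of the walkthrough above, the function can be called directly. This is only a minimal sketch: the variable name result is an assumption added for illustration and is not part of the original script.

```python
def times_two(x):
    # Processing: multiply the input by 2
    y = x * 2
    # Output: hand the result back to the caller
    return y


# Input: the value passed in when the function is called
result = times_two(2)
print(result)  # prints 4
```

Calling the function again with the same input gives the same output every time, which is what makes the instruction so clear.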
Below is a more complicated example of an algorithm, inspired by Safi Bahcall's book Loonshots, which is about innovation cultures within organisations.
Bahcall defines 'loonshots' as breakthrough ideas that are widely dismissed but end up producing remarkable results. His book explores various such loonshots, from technologies that won wars to drugs that cured diseases.
In exploring how these loonshots come about, Bahcall argues that individual incentives within an organisation play a significant role. These incentives change as the size of the group changes.
When the group grows beyond a certain size, individuals shift their focus from loonshots to career progression ('the politics of promotion'). This shift then causes less innovation to happen within an organisation.
In other words, when a group is smaller, the success of loonshots will have a greater impact on the progress of the group and be more valuable to the individuals within that group. But when the group is bigger, the impact of loonshots may be less and therefore not as valuable to individuals on average.
Bahcall lists the following factors that determine when an organisation shifts from loonshots to politics:
Salary growth. As an organisation grows, the salary increases that come with promotions will also likely increase. The bigger the salary growth, the greater the incentive to focus on career progression.
Management span. Bigger organisations usually come with structures where the gap between junior levels and senior levels is wider. It also means narrower management spans, where the number of reports a manager has decreases.
Equity fraction. This is about compensation being closely tied to the quality of work. If you do good work, your stake in the future income generated by the company grows, incentivising greater focus on loonshots rather than individual career progression.
Organisational fitness. High fitness organisations match employee skills and roles very well. Low fitness organisations do not do this well and politics strongly influences promotion decisions.
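Bahcall combines these factors together in the following formula:
M = (E × S² × F) / G
Here, M is the group size at which the shift happens, E is equity fraction, S is management span, F is organisational fitness and G is salary growth.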
This formula can be used to create a loonshots algorithm that calculates the size an organisation can reach before incentives shift from loonshots to politics:
def loonshots(E, S, F, G):
    M = (E * (S**2) * F) / G
    return M
To break down the loonshots algorithm:
Input. As shown in the first line, the algorithm takes the inputs E (equity fraction), S (management span), F (organisational fitness) and G (salary growth). These inputs are either numbers or percentages.
Processing. The second line shows how the algorithm processes the different inputs. It executes a formula that combines the different factors outlined above which influence innovation cultures within organisations.
Output. In the third line, the product of the algorithm is represented as M, which is the result of the formula executed in the second line. This output is the size that a group can reach before it shifts from nurturing loonshots to encouraging politics.
So imagine that the inputs are as follows:
Equity fraction is 50, meaning that half of an employee's compensation is determined by income generated by the organisation.
Management span is 6, meaning that managers within an organisation have around 6 reports each.
Organisational fitness is 1, meaning that skill and politics ratios are equal.
Salary growth is 12, meaning that each promotion comes with a 12% increase in salary.
These inputs would produce the number 150. This is the size that an organisation can grow to before innovation culture decreases, given the inputs we provided.
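The arithmetic behind that result can be checked by running the algorithm with the inputs listed above. This is a minimal sketch; passing the inputs as keyword arguments is a choice made here for readability.

```python
def loonshots(E, S, F, G):
    # E: equity fraction, S: management span,
    # F: organisational fitness, G: salary growth
    M = (E * (S**2) * F) / G
    return M


# Inputs from the worked example in the text
M = loonshots(E=50, S=6, F=1, G=12)
print(M)  # prints 150.0 (Python's / operator returns a float)
```

Step by step: 6 squared is 36, 50 × 36 × 1 is 1,800, and 1,800 divided by 12 is 150.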
What about algorithms in AI?
Algorithms also feature in AI development, though there is a key difference in how they work.
In the traditional form of programming, algorithms are given both the input and the processing rules to produce a particular output. This makes such algorithms deterministic, meaning that the algorithm will always process the input in the same way to produce the output.
Contrastingly, machine learning (ML) algorithms are provided with inputs and outputs. The purpose of these algorithms is therefore to identify the processing rules necessary to produce the desired outputs from the inputs.
This is shown in the diagram below which comes from François Chollet's Deep Learning with Python.
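The difference can also be sketched in code. The example below is an illustration only, not taken from Chollet's book: the function names traditional_times_two and learn_multiplier are hypothetical, and the 'learning' is a deliberately tiny least-squares estimate of a single multiplier.

```python
def traditional_times_two(x):
    # Traditional programming: the rule (multiply by 2) is written by hand
    return x * 2


def learn_multiplier(inputs, outputs):
    # Machine learning in miniature: estimate the multiplier w in y = w * x
    # from example pairs, using least squares for a line through the origin
    numerator = sum(x * y for x, y in zip(inputs, outputs))
    denominator = sum(x * x for x in inputs)
    return numerator / denominator


example_inputs = [1, 2, 3, 4]
example_outputs = [2, 4, 6, 8]  # the desired outputs for those inputs

w = learn_multiplier(example_inputs, example_outputs)
print(w)      # the rule recovered from the examples
print(w * 5)  # applying the learned rule to an input it has not seen
```

In the first function the rule is given to the computer. In the second, only examples are given and the rule is worked out from them.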
An example of an ML algorithm is multiple linear regression. This algorithm uses two or more independent variables (input data) to predict the value of a dependent variable (output).
Multiple linear regression can be used to predict the price of a property based on pieces of information about that property. That information could be its size, the age of the building or its proximity to a train station.
The Python script below shows what such an algorithm would look like:
# LinearRegression is assumed here to be scikit-learn's implementation
from sklearn.linear_model import LinearRegression

# size_sq_feet, age, prox_station and rent are placeholders for the training data
example_inputs = [size_sq_feet, age, prox_station]
example_outputs = [rent]
mlr_property_prices = LinearRegression()
mlr_property_prices.fit(example_inputs, example_outputs)
To break down this algorithm:
Input. The example_inputs and example_outputs lines show the data that the algorithm is being trained on. This consists of the size (size_sq_feet), age (age) and distance to the train station (prox_station) of different properties and their corresponding rent prices (rent).
Processing. The algorithm will take the property information and corresponding rent prices and work out the relationship between these variables, as shown by mlr_property_prices.fit(example_inputs, example_outputs). In doing so, it will estimate how strongly each piece of property information is associated with the price.
Output. The algorithm will use the relationships between the variables that it has 'learned' and apply this to new data (i.e., properties it has not seen before). This is how the algorithm can be used to make price predictions on other properties.
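To make these three steps concrete, here is a runnable sketch that assumes scikit-learn is installed. All of the numbers are made up for illustration: the rents are generated from a hypothetical rule (rent = 2 × size − 5 × age − 100 × distance + 500) so that the result is easy to check by hand. Real property data would be far noisier.

```python
from sklearn.linear_model import LinearRegression

# Made-up training data: one row per property,
# columns are [size_sq_feet, age, prox_station]
example_inputs = [
    [500, 10, 1.0],
    [750, 5, 0.5],
    [1000, 20, 2.0],
    [600, 30, 0.2],
    [900, 2, 1.5],
    [1200, 15, 0.8],
]
# Made-up rents, generated from the hypothetical rule
# rent = 2 * size_sq_feet - 5 * age - 100 * prox_station + 500
example_outputs = [1350, 1925, 2200, 1530, 2140, 2745]

mlr_property_prices = LinearRegression()
mlr_property_prices.fit(example_inputs, example_outputs)

# The learned "processing rules": one coefficient per input, plus an intercept
print(mlr_property_prices.coef_)
print(mlr_property_prices.intercept_)

# Output: a prediction for a property the model has not seen
new_property = [[800, 12, 1.0]]
predicted_rent = mlr_property_prices.predict(new_property)[0]
print(predicted_rent)
```

Because the made-up rents follow the hypothetical rule exactly, the fitted coefficients should come out very close to 2, −5 and −100, and the prediction for the new property should be very close to 1,940.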
What are algorithms actually?
While they can be explained through lines of code, algorithms represent much more than this.
This is something that is explored in Taina Bucher's book If...Then: Algorithmic Power and Politics. Her argument is that algorithms cannot merely be defined by what they do or how they work.
Instead, we must also consider the relevant circumstances of their creation. Bucher explains why:
...it is important to acknowledge that while we start with the question of how software and algorithms shape sociality by looking at materiality in the more conventional sense as "properties of a technology," the answer cannot be found in these properties alone, but rather the ways in which programmed sociality is realized as a function of code, people, and context.1
The argument made by Bucher can be condensed as follows:
Algorithms are created by people.
Algorithms therefore reflect the assumptions, biases and values of their creators.
Reconsider the loonshots algorithm explored earlier. That algorithm is a manifestation of the ideas presented by Bahcall on the factors he believes are important for innovation cultures within organisations.
He has determined that group size is the critical variable for identifying when an organisation shifts from innovation to stagnation. And he has also determined the factors that influence the size at which this shift happens.
These ideas can be condensed into a formula to systematically determine when an organisation may shift away from innovation. And this formula can be codified into an algorithm that can be executed by a computer.
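One way to see how the choices built into the formula shape its output is to change one input at a time. The sketch below reuses the loonshots algorithm from earlier. The variable names baseline, wider_span and higher_growth, and the changed input values, are assumptions chosen purely for illustration.

```python
def loonshots(E, S, F, G):
    # Same formula as the loonshots algorithm in the text
    return (E * (S**2) * F) / G


# Inputs from the earlier worked example
baseline = loonshots(E=50, S=6, F=1, G=12)

# Doubling management span alone quadruples the result, because S is squared
wider_span = loonshots(E=50, S=12, F=1, G=12)

# Doubling salary growth alone halves it, because G is the divisor
higher_growth = loonshots(E=50, S=6, F=1, G=24)

print(baseline, wider_span, higher_growth)
```

The decision to square management span, or to divide by salary growth, is a judgement about how organisations work. The code simply carries that judgement out.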
If a company decided to use this algorithm to make decisions about its strategy, this could have downstream impacts on its employees. This could range from how much they are paid to the roles they are given.
But when we think about these impacts, our analysis cannot merely be limited to looking at the algorithm's code to learn how it produced the relevant output. This would only tell us about its technical functionality.
We ought to go further and also determine:
Who decided that size is the best way to measure innovation cultures within organisations and why?
Who decided that equity fraction, management span, organisational fitness and salary growth are the only factors to consider and why?
Who decided to use the algorithm to help make decisions about the company's strategy and why?
This approach can also be used for the ML-based property prices algorithm explored earlier. We can ask:
Who decided that rent is the dependent variable we should be predicting and why?
Who decided that information like the property's size, age and proximity to the train station are the independent variables that should be used to make the predictions and why?
Who decided we should use multiple linear regression to predict the price of a property and why?
So no matter the algorithm in question, as well as looking at its lines of code to determine functionality, we should also be asking:
Who developed the algorithm this way and why?
Who implemented the algorithm into the wider process or system and why?
By asking these questions, we can reveal the underlying assumptions, biases and values behind the algorithm. This in turn helps to reveal who or what is the ultimate source of its effects, and therefore who or what may be responsible.2
This is what algorithms are really about.
1. Taina Bucher, If…Then: Algorithmic Power and Politics (OUP 2018), 8.
2. Taina Bucher, If…Then: Algorithmic Power and Politics (OUP 2018), 3.