Automate or Die
November 2024.
One way to read the history of technology is to look for things that are no longer crushingly expensive. As an ≈80-year-old example: during World War II, a lot of American soldiers were stationed in the UK, where they met and married British women. The prohibitive cost of travel made this bittersweet for the girls’ families:
Parents [of British daughters] often opposed such wartime romances because they were likely to mean that their daughters would emigrate to the United States at the end of the war. In the days before frequent air travel, a single flight across the Atlantic cost around £175, a sea passage between £40 and £65, while the average weekly wage in Britain in 1940 was £4.10s. Many feared they would never see their daughters again — and to the daughters ‘going to America seemed like going to the moon’. [From Overpaid, Oversexed, and Over Here]
Today, the average weekly wage in Britain is around £690, so the equivalent cost from a fraction-of-wages perspective would be about £27,000 for a flight and £6,000 to £10,000 for a boat. You can actually fly from the UK to the US for as little as £300, so things are about 90x better. Travel was crushingly expensive in the past, but no longer.
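A quick back-of-the-envelope check on those numbers, in Python; the wage and price figures are just the rough ones quoted above, so only the orders of magnitude matter:

```python
# Scale 1940s prices by the ratio of weekly wages, then compare to a cheap flight today.
wage_1940 = 4.5        # £4 10s per week
wage_today = 690       # £ per week

def todays_equivalent(price_then):
    """Express a 1940s price as the same fraction of today's weekly wage."""
    return price_then / wage_1940 * wage_today

print(round(todays_equivalent(175)))   # flight: ~26,800, i.e. "about £27,000"
print(round(todays_equivalent(40)), round(todays_equivalent(65)))   # boat: ~6,100 to ~10,000
print(round(todays_equivalent(175) / 300))   # ~89, i.e. "about 90x better"
```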
As another example, here’s the price of lighting over time. Note that this has already been adjusted for inflation, so in 1800 the cost for a million lumen-hours was £10,000 in year-2000-pounds.
One lumen is very dim; a single 60W light bulb produces around 800 lumens. Running this amount of light — a single 60W bulb equivalent — for a year would have cost about £70,000 in 1800 (year 2000 money).
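For concreteness, here is the arithmetic behind that £70,000 figure, using the rough numbers above; the around-the-clock usage assumption is mine:

```python
# Cost of running one 60W-bulb-equivalent of light for a year in 1800,
# at ~£10,000 per million lumen-hours (in year-2000 pounds).
lumens = 800                  # rough output of a 60W incandescent bulb
hours_per_year = 24 * 365     # assume it burns around the clock
lumen_hours = lumens * hours_per_year              # ~7.0 million lumen-hours
print(round(lumen_hours / 1_000_000 * 10_000))     # ~£70,000
```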
Some color from the high and low ends of the income distribution:
Mary Johnson, in her book on household management of 1775, suggested a family of the middling sort would need to buy some two and a half pounds of [probably tallow] candles per week on average. [Note: this is about 130 hours of single-candle-flame light.] More in the winter, of course, when days in Britain are short; fewer in the summer. The cost of this she estimated as 1s 3d: i.e. about sixpence per pound of candles. To pay for that amount of lighting would be well beyond the reach of labourers or poorer artisans. They would need to economize on light and cost by using rushlights whenever they could. [Rushlights are reeds soaked in animal fat.]
A grand dinner, a ball or an assembly might cost far more in candles than you paid for food and drink. The yearly wage for a housemaid [in the 1700s] was some £3.00, but in the 1760s, the Duke of Newcastle paid £25.00 every month for wax candles to light his London house. On one occasion in 1731, the first Prime Minister, Sir Robert Walpole, had 130 candles lit in the hallway at his grand mansion, Houghton Hall in Norfolk, with another 50 in the saloon. The amazed guests counted them to be sure! The overall cost for his single night of extravagance was £15.00, yet that was comparatively meagre compared with others. In 1712, the Duchess of Montague was supposed to have paid £200.00 for candles for an assembly lasting one night; and it was claimed the Duke of Bedford illuminated an event of his with 1,000 wax candles, at a cost of £603.00!
If we analogize the housemaid salary to $30k in the USA today, then the “middling family” is spending about $32,500 per year on lighting. The Duke of Newcastle is paying $250k/month on lighting for his London house, and the Duke of Bedford paid $6M for a single night. This is all for a bunch of candles.
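The conversion is a single scale factor: peg the housemaid’s £3 per year to $30k per year and multiply everything by the resulting 10,000x. A quick sketch:

```python
scale = 30_000 / 3    # peg £3/year (housemaid) to $30k/year: ~10,000 dollars per 1700s pound

candles_middling = 1.25 / 20 * 52    # 1s 3d = 1.25 shillings/week, 20 shillings to the pound
print(round(candles_middling * scale))   # ~$32,500 per year for the "middling family"
print(round(25 * scale))                 # Duke of Newcastle: $250,000 per month
print(round(603 * scale))                # Duke of Bedford: ~$6,030,000 for one night
```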
Light was crushingly expensive in the past, but no longer.
Our Poverty
In the 1940s we were poor in transportation. In the 1700s we were poor in light. What are we poor in now?
We are poor in intelligence.
Up until the last few years, intelligence was extremely expensive — and you need it for so many things. You need it to drive to the airport. You need it to code up a simple CRUD app. You need it to file your taxes, to respond to your emails, to make your slide deck. Etc, etc.
It’s not just the cost. Up until today, computing has been grossly, painfully time-bottlenecked by human intelligence. You can measure this in crude but telling ways. My computer can write chars to disk at a gigabyte per second, but my meatspace human mind can only write chars at about 10 bytes per second; I am slowing things down by a factor of roughly 100,000,000.
When a human salesperson gets off a call and has to update their CRM, it might take them 2 or 3 minutes (maybe 10-15 minutes if they’re using Salesforce). Transmitting that data over the internet and updating the database takes, maybe, a second. Again, the human is the bottleneck, and they’re slowing things down by 100x.
Amortized human latency is even worse. I get an email while I’m sleeping; now the delay isn’t just the 2 or 3 minutes it takes me to write a response, it’s the 8 hours it takes me to sleep. People are busy with other things. Imagine a simple job that requires 5 different humans to do 5 different things, in order, each gated on the person before them in the sequence. Even in high-performing orgs, this probably takes several hours at best.
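To put rough numbers on these bottlenecks, here is a back-of-the-envelope sketch using the figures above; the exact values are mine and only the orders of magnitude matter:

```python
# Crude bottleneck ratios from the figures above.

# Throughput: machine disk writes vs. a human typing.
print(1_000_000_000 / 10)            # 100,000,000x

# Latency: updating the CRM after a sales call.
human_s, machine_s = 2.5 * 60, 1     # ~2-3 minutes vs. ~1 second
print(human_s / machine_s)           # ~150x, i.e. "about 100x"

# Amortized latency: a reply that waits 8 hours for me to wake up.
work_s, sleep_s = 2.5 * 60, 8 * 3600
print((work_s + sleep_s) / work_s)   # ~190x worse than the writing time alone
```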
And we’re not counting the “capex” cost of human inference. If you need a lawyer to draft a doc for you, or a Rust developer to write a program, you have to find a human who’s spent hours and hours training so that they can perform their inference task (relatively) quickly. You’re not going to a smart 18-year-old and saying “Hey, please go study corporate law for a couple of months, so that you can write this complex contract for me at the same level as Wachtell Lipton. I’ll wait until you’re done.”
For “knowledge-worker”, stuff-you-can-do-on-your-laptop tasks that an AI can perform just as well as a human, the time cost premium of using a human is probably something like 100x. The dollar cost premium depends a lot on the task, but is probably also in the ballpark of 10x-100x per minute. In situations where execution speed matters at least as much as inference cost, taking the human out of the loop might be a 1000x+ improvement.
That is big. In comparison, farming productivity has roughly 100x-ed since 1800. (One person today can produce as much food as 100 people in 1800.) The promise of today’s AI systems is a world where software development productivity 1000x-es, lawyer productivity 1000x-es, digital artist productivity 1000x-es, financial analyst productivity 1000x-es, etc. Roughly, jobs that are mostly about reading or writing bits might be 10x-100x cheaper and 100x faster.
Code Gen
Grandiose, long-arc predictions tend to be fuzzy and non-specific. I think it’s interesting to try to imagine a 100x improvement in something tangible. Let’s go with software engineering.
In 2020, the average software engineering workflow looked something like this. Somebody sits down with a laptop and fires up VSCode. If they’re starting a new codebase, they begin writing from scratch, character-by-character. They benefit a lot from pre-existing frameworks (don’t roll your own web framework, just use Express / Django / Rails / whatever), libraries (use pytorch, use pandas, use whatever), and APIs (don’t implement web payments, pay for Stripe), but they’re still writing a lot of code from scratch. They use Stack Overflow to ask questions and to get help fixing bugs.
In 2024, things already look a lot better. Instead of starting from scratch, you might ask ChatGPT or Claude to write you some scaffolding code to get you started. Then you’re using Copilot/Cursor/Supermaven/Zed/etc. to help you as you go. No more looking things up on Stack Overflow and slowly adapting them to your program. No more poring over API docs. You can move somewhere between 2x and 10x faster than you could pre-LLMs. The “agent” demos are getting closer to reality. Prompt-to-something-that-works will get better.
What could things look like next? I’ll hazard that one near-term advance is that code generation mistakes which are “obviously wrong” will be fixed automatically before the human user ever sees the output. Through some mix of static analysis and just-run-the-code, the AI system will fix the obvious bugs; no more copy/pasting error stacks from the terminal. Going one step further, future systems will probably create some basic mocks and unit tests, and make sure the program handles them correctly.
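One way to picture that loop, as a minimal sketch: generate code, run it, and feed any crash output straight back to the model before a human ever looks at it. Here `generate_code` and `ask_model_to_fix` are hypothetical placeholders for model calls, not a real API.

```python
import subprocess
import sys
import tempfile

def generate_code(prompt: str) -> str:
    """Placeholder for a call to a code-generation model."""
    raise NotImplementedError

def ask_model_to_fix(code: str, error: str) -> str:
    """Placeholder: ask the model to repair the code, given the error output."""
    raise NotImplementedError

def generate_and_self_correct(prompt: str, max_attempts: int = 5) -> str:
    """Generate code, run it, and retry on obvious failures before showing a human."""
    code = generate_code(prompt)
    for _ in range(max_attempts):
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True, timeout=30)
        if result.returncode == 0:
            return code                                   # no obvious errors left
        code = ask_model_to_fix(code, result.stderr)      # crash log goes back to the model
    return code
```

The same skeleton extends naturally to the mocks-and-unit-tests step: generate the tests, run them, and loop until they pass.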
Human feedback will shift away from “that’s obviously wrong” and towards “that’s not what I meant”. You will prompt “make me a todo list app” and you’ll get a working todo list app, but it probably won’t be what you wanted. Maybe it’s a web app and you wanted something that you could just run locally as personal software. Maybe you don’t like the UI. Etc. You’ll spend some time going back and forth with the system to get what you want, even if each generation is “technically correct” in the sense that it does everything that your (vague, under-specified, possibly contradictory) prompt asks for.
Perhaps the next iteration will be to automate as much of this back-and-forth feedback as possible. The system might do some planning, and figure out how to determine what I mean by “make me a todo list app” with the least amount of my time/effort possible. Maybe it asks me some questions (“did you mean X, or Y”). Maybe it just builds 10 versions, shows me all of them, and asks me which one I like most. Obviously, it keeps a total history of all feedback I’ve ever given on any code gen project, and tries to guess what I, personally, will like most. Obviously, the deployment process is just as automated as the code generation process, and I don’t have to write any Terraform or suffer through configuring AWS IP rules. Obviously, the in-prod maintenance is similarly automated, and I’m not going to get paged in the middle of the night because my Postgres instance is out of memory or whatever.
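And a minimal sketch of the “build 10 versions and ask” idea, conditioning each attempt on my accumulated feedback; `generate_variant` is again a hypothetical placeholder for the underlying model call.

```python
def generate_variant(prompt: str, seed: int) -> str:
    """Placeholder for one independently sampled build of the app."""
    raise NotImplementedError

def build_to_preference(prompt: str, feedback_history: list[str], n_variants: int = 10) -> str:
    """Sample several candidate builds, let the user pick one, and remember the choice."""
    conditioned = "\n".join(feedback_history + [prompt])
    variants = [generate_variant(conditioned, seed=i) for i in range(n_variants)]
    for i, variant in enumerate(variants):
        print(f"--- option {i} ---\n{variant[:200]}\n")    # show a preview of each candidate
    choice = int(input("Which option do you like most? "))
    feedback_history.append(f"preferred option {choice} for: {prompt}")
    return variants[choice]
```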
I’m happy to stop speculating here. When I contrast the above with how software development actually works today, I think we’re already close to a ≈100x improvement.
Not Your Father’s Technological Revolution
People’s intuitions about AI are heavily influenced by the last technological revolution, which was the Internet. The analogy is misleading. The Internet was about communication, but AI is about automation. Many of the biggest Internet companies are about payments (PayPal, Stripe, Square, etc.), marketplaces (eBay, Airbnb, DoorDash, Uber, etc.), media distribution (Netflix), user-generated content (LinkedIn, Facebook, Instagram), or information retrieval (Google). These are all flavors of communication technology.
But AI is not about helping humans communicate with each other. It’s about automating human actions. The Internet is not a good historical analogy.
A better analogy is farming mechanization:
Picking was hard work. The cotton bolls were at waist height, so you had to work either stooped over or crawling on your knees. Every soft puff of cotton was attached to a thorny stem, and the thorns pierced your hands as you picked — unless your entire hand was callused, as most full-time pickers’ were. You put the cotton you picked into a long sack that was on a strap around your shoulder; the sack could hold seventy-five pounds, so much of the day you were dragging a considerable weight as you moved down the rows. The picking day was long, sunup to sundown with half an hour off for lunch.
In an hour, a good field hand could pick twenty pounds of cotton; each mechanical picker, in an hour, picked as much as a thousand pounds — two bales. In one day, Hopson’s eight machines could pick all the cotton in C-3, which on October 2, 1944, was sixty-two bales. The unusually precise cost accounting system that Hopson had developed showed that picking a bale of cotton by machine cost him $5.26, and picking it by hand cost him $39.41. Each machine did the work of fifty people. [Emphasis added.] [From Nicholas Lemann’s The Promised Land]
Or textile automation:
The capable [manual, pre-industrial] weaver could turn out three or four yards of cloth per day if diligent, and his loom devoured in weft and warp the product of several spinsters. He was paid from six to twenty cents per yard for his work “according to the cloth.” The [manual, pre-industrial] spinster’s stint averaged “a skein,” perhaps two pounds, of coarse yarn per day. Nowadays her expert, but much less strenuous granddaughter manages from one thousand to twelve hundred spindles, running ten sides of spinning frames for fifty-eight hours weekly. She earns about one dollar and a quarter per ten-hour day, and produces thirty-nine hanks, one and a half pounds, of fine thread per spindle, or 1500 pounds in all, several hundred times the possible output of the old-fashioned wheel. A skillful weaver of the modern type, managing five high-speeded power-looms, produces in ten hours from three hundred to three hundred and fifty yards of staple ginghams, twenty-seven inches wide, earning about six-tenths of a cent per yard; or managing eight or ten looms in a Fall River mill, turns out from four hundred and fifty to six hundred yards of common sheeting, seven-eighths yard wide, in a day, and is paid less than one-half cent per yard. [From Nourse’s “Some Notes upon the Genesis of the Power Loom in Worcester County”]
Moving bits is easier than moving atoms. Knowledge-worker jobs might be even more amenable to automation than weaving or farming. And the adoption curve will probably be faster. It’s not a story of ≈100 years for the loom, it’s probably more like ≈10 years.
Predicting this doesn’t require any spooky beliefs about summoning an ASI god. We’ll simply automate away most of what knowledge workers currently do, just as we’ve already automated away the jobs of 1700s weavers, 1800s farmers, or 1900s factory workers. From a long-term perspective, this can’t happen soon enough because today we are paupers. We’re as poor today in knowledge-worker automation as our ancestors were in cloth, in food, in light, etc. Soon, no longer.