Breaking Down Problems: What Is First Principles Thinking?

 

What is First Principles Thinking?

First principles thinking is one of the best ways to discover new solutions.

Sometimes called “reasoning from first principles,” it’s a tool to help break down complicated problems by separating what we know is absolutely true from anything that is an assumption. What remains are the essentials. If you know the first principles of something, you can build the rest of your knowledge around them to produce something new.

While you could take this way of thinking down to an atomic level, a lot of value is gained by simply going a level or two deeper than most people. Solutions are based on what you see. Different answers reveal themselves at different levels.

If I hand you a house made from Lego blocks, you know it’s possible to make a house. Thinking at the first layer, you might move a few blocks around and, in the process, slightly improve the house. Most people stop here. They are presented with something that already exists and they endeavor to make it slightly better. Going a layer deeper and breaking the Lego house into individual pieces opens the door to possibility: not only can you build a better house, you can build something entirely different.

Everything that exists is effectively a set of Lego blocks, assembled in a certain way, that can be taken apart and reassembled. A bike is just a seat, chain, body, handlebars, etc. Breaking the bike down into its parts allows you to reassemble the parts into something new. However, you can also go deeper, melting the parts into their core metals and making a shield, sword, or anything else, limited only by material and imagination.

I don’t know what’s the matter with people: they don’t learn by understanding; they learn by some other way—by rote or something. Their knowledge is so fragile!

Richard Feynman

The Basics

The idea of building knowledge from first principles has a long tradition in philosophy. In the Western canon, it goes back to Plato, with significant contributions from Aristotle and Descartes. Essentially, these thinkers sought foundational knowledge that would remain constant and serve as a basis for building everything else, from our ethical systems to our social structures.

First Principles

First principles thinking doesn’t have to be quite so grand. When we do it, we aren’t necessarily looking for absolute truths—millennia of epistemological inquiry have shown us that these are hard to come by, and the scientific method has demonstrated that knowledge can be built only when we are actively trying to falsify it. Rather, first principles thinking identifies the elements that are, in the context of any given situation, irreducible.

First principles do not provide a checklist of things that will always be true; our understanding of first principles evolves as we gain more knowledge. They are the foundation on which we must build, and thus will be different in every situation—but the more we know, the more we can challenge. For example, if we are considering how to improve the energy efficiency of a refrigerator, the laws of thermodynamics can be taken as first principles. However, a theoretical chemist or physicist might want to explore entropy, and thus further break the second law of thermodynamics into its underlying principles and the assumptions that were made because of them.

“To understand is to know what to do.”

— Wittgenstein

Techniques for Establishing First Principles

If we never learn to take something apart, test our assumptions about it, and reconstruct it, we end up bound by what other people tell us is possible. We end up trapped in the way things have always been done. When the environment changes, we just continue as if things were the same, making costly mistakes along the way.

Some of us are naturally skeptical of what we’re told: Maybe it doesn’t match up to our experiences. Maybe it’s something that used to be true but isn’t true anymore. Or maybe we just think differently about something. When it comes down to it, everything that is not a law of nature is just a shared belief. Money is a shared belief. So is a border. So is Bitcoin. So is love. The list goes on.

There are two techniques we can use to change the level where we are looking at a situation, identify the first principles, and cut through the dogma and shared belief: Socratic questioning and the Five Whys.

Socratic Questioning

Socratic questioning can be used to establish first principles through stringent analysis. This is a disciplined questioning process, used to establish truths, reveal underlying assumptions, and separate knowledge from ignorance. The key distinction between Socratic questioning and normal discussions is that the former seeks to draw out first principles in a systematic manner. Socratic questioning generally follows this process:

  1. Clarifying your thinking and explaining the origins of your ideas (Why do I think this? What exactly do I think?)
  2. Challenging assumptions (How do I know this is true? What if I thought the opposite?)
  3. Looking for evidence (How can I back this up? What are the sources?)
  4. Considering alternative perspectives (What might others think? How do I know I am correct?)
  5. Examining consequences and implications (What if I am wrong? What are the consequences if I am?)
  6. Questioning the original questions (Why did I think that? Was I correct? What conclusions can I draw from the reasoning process?)

This process stops you from relying on your gut and limits strong emotional responses. This process helps you build something that lasts.

“Because I Said So” or “The Five Whys”

Children instinctively think in first principles. Just like us, they want to understand what’s happening in the world. To do so, they intuitively break through the fog with a game some parents have come to hate.

“Why?”

“Why?”

“Why?”

Here’s an example that has played out numerous times at my house:

“It’s time to brush our teeth and get ready for bed.”

“Why?”

“Because we need to take care of our bodies, and that means we need sleep.”

“Why do we need sleep?”

“Because we’d die if we never slept.”

“Why would that make us die?”

“I don’t know; let’s go look it up.”

Kids are just trying to understand why adults are saying something or why they want them to do something.

The first time your kid plays this game, it’s cute, but for most teachers and parents, it eventually becomes annoying. Then the answer becomes what my mom used to tell me: “Because I said so!” (Love you, Mom.)

Of course, I’m not always that patient with the kids. For example, I get testy when we’re late for school, or we’ve been travelling for 12 hours, or I’m trying to fit too much into the time we have. Still, I try never to say “Because I said so.”

People hate the “because I said so” response for two reasons, both of which play out in the corporate world as well. The first reason we hate the game is that we feel like it slows us down. We know what we want to accomplish, and that response creates unnecessary drag. The second reason we hate this game is that after one or two questions, we are often lost. We actually don’t know why. Confronted with our own ignorance, we resort to self-defense.

I remember being in meetings and asking people why we were doing something this way or why they thought something was true. At first, there was a mild tolerance for this approach. After three “whys,” though, you often find yourself on the other end of some version of “we can take this offline.”

Can you imagine how that would play out with Elon Musk? Richard Feynman? Charlie Munger? Musk would build a billion-dollar business to prove you wrong, Feynman would think you’re an idiot, and Munger would profit based on your inability to think through a problem.

“Science is a way of thinking much more than it is a body of knowledge.”

— Carl Sagan

Examples of First Principles in Action

To better understand how first-principles reasoning works, let’s examine some examples.

Elon Musk and SpaceX

Perhaps no one embodies first-principles thinking more than Elon Musk. He is one of the most audacious entrepreneurs the world has ever seen. My kids (in grades 3 and 2) refer to him as a real-life Tony Stark, thereby providing a convenient opportunity for me to remind them that by fourth grade, Musk was reading the Encyclopedia Britannica, not Pokémon.

What’s most interesting about Musk is not what he thinks but how he thinks:

I think people’s thinking process is too bound by convention or analogy to prior experiences. It’s rare that people try to think of something on a first principles basis. They’ll say, “We’ll do that because it’s always been done that way.” Or they’ll not do it because “Well, nobody’s ever done that, so it must not be good.” But that’s just a ridiculous way to think. You have to build up the reasoning from the ground up; “from the first principles” is the phrase that’s used in physics. You look at the fundamentals and construct your reasoning from that, and then you see if you have a conclusion that works or doesn’t work, and it may or may not be different from what people have done in the past.[4]

His approach to understanding reality is to begin with what is true, rather than relying on his intuition. The problem is that we don’t know as much as we think we do, so our intuition isn’t very good. We trick ourselves into thinking we know what’s possible and what’s not.

Musk’s approach is quite different.

He starts out with something he wants to achieve, like building a rocket. Then he starts with the first principles of the problem. Running through how Musk would think, Larry Page said in an interview, “What are the physics of it? How much time will it take? How much will it cost? How much cheaper can I make it? There’s this level of engineering and physics that you need to make judgments about what’s possible and interesting. Elon is unusual in that he knows that, and he also knows business and organization and leadership and governmental issues.”[5]

Rockets are absurdly expensive, which is a problem because Musk wants to send people to Mars. And to send people to Mars, you need cheaper rockets. So he asked himself, “What is a rocket made of? Aerospace-grade aluminum alloys, plus some titanium, copper, and carbon fiber. And … what is the value of those materials on the commodity market? It turned out that the materials cost of a rocket was around two percent of the typical price.”[6]

Why, then, is it so expensive to get a rocket into space? Musk, a notorious self-learner with degrees in both economics and physics, literally taught himself rocket science. He figured that the only reason getting a rocket into space is so expensive is that people are stuck in a mindset that doesn’t hold up to first principles. With that, Musk decided to create SpaceX and see if he could build rockets from scratch.

In an interview with Kevin Rose, Musk summarized his approach:

I think it’s important to reason from first principles rather than by analogy. So the normal way we conduct our lives is, we reason by analogy. We are doing this because it’s like something else that was done, or it is like what other people are doing… with slight iterations on a theme. And it’s … mentally easier to reason by analogy rather than from first principles. First principles is kind of a physics way of looking at the world, and what that really means is, you … boil things down to the most fundamental truths and say, “okay, what are we sure is true?” … and then reason up from there. That takes a lot more mental energy.[7]

Musk then gave an example of how first principles can be used to innovate at low prices:

Somebody could say — and in fact people do — that battery packs are really expensive and that’s just the way they will always be because that’s the way they have been in the past. … Well, no, that’s pretty dumb… Because if you applied that reasoning to anything new, then you wouldn’t be able to ever get to that new thing…. you can’t say, … “oh, nobody wants a car because horses are great, and we’re used to them and they can eat grass and there’s lots of grass all over the place and … there’s no gasoline that people can buy….”

He then gives a fascinating example about battery packs:

… they would say, “historically, it costs $600 per kilowatt-hour. And so it’s not going to be much better than that in the future.” … So the first principles would be, … what are the material constituents of the batteries? What is the spot market value of the material constituents? … It’s got cobalt, nickel, aluminum, carbon, and some polymers for separation, and a steel can. So break that down on a material basis; if we bought that on a London Metal Exchange, what would each of these things cost? Oh, jeez, it’s … $80 per kilowatt-hour. So, clearly, you just need to think of clever ways to take those materials and combine them into the shape of a battery cell, and you can have batteries that are much, much cheaper than anyone realizes.

BuzzFeed

After studying the psychology of virality, Jonah Peretti founded BuzzFeed in 2006. The site quickly grew to be one of the most popular on the internet, with hundreds of employees and substantial revenue.

Peretti figured out early on the first principle of a successful website: wide distribution. Rather than publishing articles people should read, BuzzFeed focuses on publishing those that people want to read. This means aiming to garner maximum social shares to put distribution in the hands of readers.

Peretti recognized the first principles of online popularity and used them to take a new approach to journalism. He also ignored SEO, saying, “Instead of making content robots like, it was more satisfying to make content humans want to share.”[8] Unfortunately for us, we share a lot of cat videos.

A common aphorism in the field of viral marketing is, “content might be king, but distribution is queen, and she wears the pants” (or “and she has the dragons”; pick your metaphor). BuzzFeed’s distribution-based approach is based on obsessive measurement, using A/B testing and analytics.

Jon Steinberg, president of BuzzFeed, explains the first principles of virality:

Keep it short. Ensure [that] the story has a human aspect. Give people the chance to engage. And let them react. People mustn’t feel awkward sharing it. It must feel authentic. Images and lists work. The headline must be persuasive and direct.

Derek Sivers and CD Baby

When Derek Sivers founded his company CD Baby, he reduced the concept down to first principles. Sivers asked, What does a successful business need? His answer was happy customers.

Instead of focusing on garnering investors or having large offices, fancy systems, or huge numbers of staff, Sivers focused on making each of his customers happy. An example of this is his famous order confirmation email, part of which reads:

Your CD has been gently taken from our CD Baby shelves with sterilized contamination-free gloves and placed onto a satin pillow. A team of 50 employees inspected your CD and polished it to make sure it was in the best possible condition before mailing. Our packing specialist from Japan lit a candle and a hush fell over the crowd as he put your CD into the finest gold-lined box money can buy.

By ignoring unnecessary details that cause many businesses to expend large amounts of money and time, Sivers was able to rapidly grow the company to $4 million in monthly revenue. In Anything You Want, Sivers wrote:

Having no funding was a huge advantage for me.
A year after I started CD Baby, the dot-com boom happened. Anyone with a little hot air and a vague plan was given millions of dollars by investors. It was ridiculous. …
Even years later, the desks were just planks of wood on cinder blocks from the hardware store. I made the office computers myself from parts. My well-funded friends would spend $100,000 to buy something I made myself for $1,000. They did it saying, “We need the very best,” but it didn’t improve anything for their customers. …
It’s counterintuitive, but the way to grow your business is to focus entirely on your existing customers. Just thrill them, and they’ll tell everyone.

To survive as a business, you need to treat your customers well. And yet so few of us master this principle.

Employing First Principles in Your Daily Life

Most of us have no problem thinking about what we want to achieve in life, at least when we’re young. We’re full of big dreams, big ideas, and boundless energy. The problem is that we let others tell us what’s possible, not only when it comes to our dreams but also when it comes to how we go after them. And when we let other people tell us what’s possible or what the best way to do something is, we outsource our thinking to someone else.

The real power of first-principles thinking is moving away from incremental improvement and into possibility. Letting others think for us means that we’re using their analogies, their conventions, and their possibilities. It means we’ve inherited a world that conforms to what they think. This is incremental thinking.

When we take what already exists and improve on it, we are in the shadow of others. It’s only when we step back, ask ourselves what’s possible, and cut through the flawed analogies that we see what is possible. Analogies are beneficial; they make complex problems easier to communicate and increase understanding. Using them, however, is not without a cost. They limit our beliefs about what’s possible and allow people to argue without ever exposing our (faulty) thinking. Analogies move us to see the problem in the same way that someone else sees the problem.

The gulf between what people currently see because their thinking is framed by someone else and what is physically possible is filled by the people who use first principles to think through problems.

First-principles thinking clears the clutter of what we’ve told ourselves and allows us to rebuild from the ground up. Sure, it’s a lot of work, but that’s why so few people are willing to do it. It’s also why the rewards for filling the chasm between possible and incremental improvement tend to be non-linear.

Let’s take a look at a few of the limiting beliefs that we tell ourselves.

“I don’t have a good memory.” [10]
People have far better memories than they think they do. Saying you don’t have a good memory is just a convenient excuse to let you forget. Taking a first-principles approach means asking how much information we can physically store in our minds. The answer is “a lot more than you think.” Now that we know it’s possible to put more into our brains, we can reframe the problem as finding the best way to store information in our brains.

“There is too much information out there.”
A lot of professional investors read Farnam Street. When I meet these people and ask how they consume information, they usually fall into one of two categories. The differences between the two apply to all of us. The first type of investor says there is too much information to consume. They spend their days reading every press release, article, and blogger commenting on a position they hold. They wonder what they are missing. The second type of investor realizes that reading everything is unsustainable and stressful and makes them prone to overvaluing information they’ve spent a great amount of time consuming. These investors, instead, seek to understand the variables that will affect their investments. While there might be hundreds, there are usually three to five variables that will really move the needle. The investors don’t have to read everything; they just pay attention to these variables.

“All the good ideas are taken.”
A common way that people limit what’s possible is to tell themselves that all the good ideas are taken. Yet, people have been saying this for hundreds of years — literally — and companies keep starting and competing with different ideas, variations, and strategies.

“We need to move first.”
I’ve heard this in boardrooms for years. The answer isn’t as black and white as this statement. The iPhone wasn’t first; it was better. Microsoft wasn’t the first to sell operating systems; it just had a better business model. There is a lot of evidence showing that first movers in business are more likely to fail than latecomers. Yet this myth about the need to move first continues to exist.

Sometimes the early bird gets the worm and sometimes the first mouse gets killed. You have to break each situation down into its component parts and see what’s possible. That is the work of first-principles thinking.

“I can’t do that; it’s never been done before.”
People like Elon Musk are constantly doing things that have never been done before. This type of thinking is analogous to looking back at history and building, say, floodwalls, based on the worst flood that has happened before. A better bet is to look at what could happen and plan for that.

“As to methods, there may be a million and then some, but principles are few. The man who grasps principles can successfully select his own methods. The man who tries methods, ignoring principles, is sure to have trouble.”

— Harrington Emerson

Key Takeaways

First principles thinking is the art of breaking down complex problems into their most fundamental truths.

By reasoning from first principles, we identify root causes, strip away layers of complexity, and focus on the most effective solutions. It allows us to step outside the way things have always been done and, instead, see what is possible.

First principles thinking is not easy. It requires a willingness to challenge the status quo. That’s why it’s often the domain of rebels and misfits who believe there must be a better way. It’s the mindset of those willing to start from scratch and build from the ground up.

In a world focused on incremental improvement, first-principles thinking offers a competitive advantage because it is not widely practiced.

The OCR Service to extract the Text Data

Optical character recognition, or OCR, is a key tool for people who want to build or collect text data. OCR uses machine learning to extract words and lines of text from scans and images, which can then be used to perform quantitative text analysis or natural language processing. Here at the Urban Institute, we’ve used OCR for tasks such as automated text extraction of hundreds of state zoning codes and maps and the collection of text from nonprofit Form 990 annual reports.

A plethora of OCR services exist, but we didn’t know which were the most accurate for Urban projects. OCR services can vary by cost, ease of use, confidentiality, and ability to handle other types of data, such as text appearing in tables or forms, so accuracy is just one dimension to consider. Although we haven’t tested every OCR service, we chose four representative examples that vary across these dimensions. Below, we provide a thorough comparison, as well as the code to replicate our accuracy competition yourself:

1. Amazon Web Services (AWS) Textract, which is fully integrated with other AWS cloud-computing offerings

2. ExtractTable, a cloud-based option that specializes in tabular data

3. Tesseract, a long-standing, open-source option sponsored by Google

4. Adobe Acrobat DC, a popular desktop app for viewing, managing, and editing PDFs

Accuracy

The best way to improve OCR accuracy is through data preprocessing. Enhancing scan resolution, rotating pages and images, and properly cropping scans are all methods to create high-quality document scans that most OCR offerings can handle. But practically speaking, many scans and images are askew, rotated, blurry, handwritten, or obscurely formatted, and data cleaning can be too time-consuming to be feasible. We wanted to test the four OCR candidates against the messiness of real-world OCR tasks, so we compared how each tool handled three poor-quality documents.

We converted all 12 pieces of output (4 OCR offerings x 3 documents) into text files for nontabular text and CSV files for tabular text, and we compared them against “ground truth” text, which was typed by a human.

For each document and OCR service, we computed a text similarity score using the Levenshtein distance, which calculates how many edits are necessary to change one sequence of text into another. Because common errors made by OCR software occur at the character level (such as mistaking an “r” for an “n”), this framework made sense for evaluating accuracy.
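To make the scoring concrete, here is a minimal Python sketch of a Levenshtein-based similarity score. It is an illustration, not the code from our competition: the function names are hypothetical, and the 0-to-100 normalization (one minus the distance divided by the length of the longer string) is an assumption, since library implementations may normalize differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            substitution_cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,                      # delete ch_a
                current[j - 1] + 1,                   # insert ch_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the edit distance to a 0-100 score (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are identical
    return 100.0 * (1 - levenshtein(a, b) / longest)
```

With this sketch, an OCR slip that turns “corner” into “conner” (an “r” read as an “n”) counts as a single edit, so `levenshtein("corner", "conner")` returns 1.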

Extracted text is not always output in the same order across OCR offerings (especially in cases of multicolumn formatting, where some services may first read down one column and others may start by looking across the columns). This variability motivated us to use the token sort ratio developed by SeatGeek, which is agnostic of text order. Token sort splits a sequence of text into individual tokens, sorts them alphabetically, rejoins them, and then calculates the Levenshtein distance as described above, meaning “cat in the hat” and “hat in the cat” would be considered a perfect match.
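The token sorting step can be sketched in a few lines of Python. This is a simplified illustration with hypothetical function names, not SeatGeek's implementation: it only lowercases and splits on whitespace, and it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure instead of a Levenshtein-based ratio.

```python
from difflib import SequenceMatcher


def token_sort(text: str) -> str:
    """Lowercase the text, split it on whitespace, sort the tokens
    alphabetically, and rejoin them with single spaces."""
    return " ".join(sorted(text.lower().split()))


def token_sort_score(a: str, b: str) -> int:
    """Order-agnostic similarity on a 0-100 scale.

    SequenceMatcher is a stdlib stand-in here; it is not the
    Levenshtein-based ratio described in the text."""
    ratio = SequenceMatcher(None, token_sort(a), token_sort(b)).ratio()
    return round(100 * ratio)
```

Both “cat in the hat” and “hat in the cat” sort to the same token string, so the two inputs compare as identical regardless of the order in which an OCR service read them.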

From our comparison, we found that Textract and ExtractTable lead the way, with Tesseract close behind and Adobe performing poorly. All four struggled with scan 3, which contained handwritten text, but the high performers handled askew and blurry documents without major issue.

Cloud-Based OCR Offerings Outperformed Competitors across All Three Document Types


The scores from this “fuzzy matching” procedure generally indicate which OCR offering processed the most text correctly, but a single number can’t reliably tell the whole story. First, the scores are rounded to the nearest whole number, so there is some granularity lost in the comparison. Second, not all errors are created equal. If OCR software interprets the word “neighbor” as “nejghbor,” then token sort scoring will count one incorrect character, but the lexical understanding of that word is not greatly affected. But if the software mistakes “quality” for “duality,” that would totally change the meaning of the word yet yield a similar score.

These scores can serve as useful rules of thumb for OCR accuracy, but they are no substitute for a deeper dive into the output text itself. To allow for this deeper comparison, we published these results, including the original scans, all code, and output documents, to this public GitHub repository.

We also include an Excel file with the tabular output from Textract and ExtractTable alongside benchmark tables for comparison. The table extraction performance looks comparable between the two services, except for a pair of rows that ExtractTable mistakenly merges. (ExtractTable’s Python library does include a function for making corrections to improperly merged cells to remedy this issue.)

Cost

Open-source options like Tesseract are the most cost-effective choice, but by how much depends on the size of the input and desired output (information from text, tables, and/or forms). AWS Textract charges $1.50 for every 1,000 pages, although it costs more to additionally extract text from tables ($15 per 1,000 pages), forms ($50 per 1,000 pages), or both ($65 per 1,000 pages). The user specifies up front which kinds of text to extract. ExtractTable users purchase credits up front (1 credit = 1 page) and pay on a sliding scale. The price per 1,000 pages to extract tabular data ranges from $26 to $40, and it costs slightly more to extract nontabular text (ranging from about $30 to $45 per 1,000 pages). For jobs that don’t require pulling key-value pairs from forms, Textract is the cheaper of the two cloud-based options, though ExtractTable uniquely offers refunds on bad and failed extractions. Finally, Adobe Acrobat DC requires an annual subscription that charges $14.99 per month (or a month-by-month plan costing $24.99 per month), which includes unlimited use of OCR and other PDF services.
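As a rough illustration of how these prices compare at different volumes, the sketch below encodes the per-1,000-page rates quoted above. The function and constant names are hypothetical, the rates are the ones cited in this post and may have changed since, and we assume each Textract rate is the full price for that extraction mode.

```python
# Rates in US dollars per 1,000 pages, as quoted in the text above.
TEXTRACT_RATE_PER_1000 = {
    "text": 1.50,
    "tables": 15.00,
    "forms": 50.00,
    "tables_and_forms": 65.00,
}
EXTRACTTABLE_TABULAR_RANGE_PER_1000 = (26.00, 40.00)  # sliding scale
ADOBE_ANNUAL_PLAN_PER_MONTH = 14.99


def textract_cost(pages: int, mode: str = "text") -> float:
    """Estimated Textract cost for a job, given the extraction mode."""
    return pages / 1000 * TEXTRACT_RATE_PER_1000[mode]


def extracttable_tabular_cost_range(pages: int) -> tuple[float, float]:
    """Lowest and highest estimated ExtractTable cost for tabular data."""
    low, high = EXTRACTTABLE_TABULAR_RANGE_PER_1000
    return pages / 1000 * low, pages / 1000 * high


def adobe_annual_cost() -> float:
    """Yearly cost of the annual Adobe plan (unlimited OCR)."""
    return 12 * ADOBE_ANNUAL_PLAN_PER_MONTH
```

For example, `textract_cost(10_000, "tables")` and `extracttable_tabular_cost_range(10_000)` give the two cloud estimates for a 10,000-page tabular job, which can then be weighed against a flat Adobe subscription or the free Tesseract option.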

Confidentiality

Although the documents in this competition all consist of nonsensitive text, natural language processing and quantitative text analysis can involve confidential data with personally identifiable information or trade secrets. ExtractTable explicitly guarantees that none of the data generated through purchased credits are saved on its servers, which is the gold standard here. AWS stores no personal information generated from Textract, though it does store the input and log files. Users can also opt out of having AWS use data stored on its servers to improve its AI services. Tesseract has no built-in confidentiality mechanism and depends entirely on the systems you use to integrate the open-source software.

Ease of Use and Output

Each OCR user will have a different use case in terms of the output required and a different level of comfort with code-based implementation, so ease of use is an important dimension for each of these offerings. For users looking for a no-code option, Adobe can perform OCR by simply right-clicking on the document in the desktop app. Although Adobe can process documents in batches, the output will be a searchable PDF, which is great for finding text within scanned documents but not for collecting data for text analysis. Converting the searchable PDF to text files is possible, but we find that some of the resulting text can be unintelligible.

We used the tesseract package in R, which provides R bindings for Tesseract. (A Python library is also available here.) Using tesseract is quite simple, and the output can be either a string of text (easily exported to a .txt file) or an R dataframe with one word per row. The creators of the tesseract package also recommend using the magick package in R first to preprocess images and enhance their quality. To keep the playing field level, we did not do that above, but it could lead to improved results for Tesseract users.

ExtractTable’s API and Python library similarly make it possible to process image files and PDFs in just a few lines of code, outputting tabular text in CSV files and nontabular text in text files. ExtractTable also has a Google Sheets plug-in.

The Textract API is less user-friendly, as it entails uploading documents to an Amazon S3 bucket before running a document analysis to extract text in nested JSON format. We use the boto3 package in Python to run the analysis and various pandas functions to wrangle the data into a workable format. Outputting tabular data in CSV format also requires a separate Python script.
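To give a sense of the wrangling involved, here is a minimal sketch that pulls lines of text out of a Textract-style response. It is not our production script: the function name is hypothetical, the call to AWS is omitted so the example runs offline, and `sample_response` is a hand-written dictionary that only mimics the general shape of the output (a `Blocks` list whose entries carry a `BlockType` and, for lines and words, a `Text` field).

```python
from typing import Any


def lines_by_page(response: dict[str, Any]) -> dict[int, list[str]]:
    """Group the text of LINE blocks by page number.

    Blocks without a "Page" key are assumed to belong to page 1."""
    pages: dict[int, list[str]] = {}
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE, WORD, and other block types
        page = block.get("Page", 1)
        pages.setdefault(page, []).append(block.get("Text", ""))
    return pages


# Hand-written sample for illustration only; not real Textract output.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Page": 1},
        {"BlockType": "LINE", "Page": 1, "Text": "Zoning Code"},
        {"BlockType": "WORD", "Page": 1, "Text": "Zoning"},
        {"BlockType": "WORD", "Page": 1, "Text": "Code"},
        {"BlockType": "LINE", "Page": 1, "Text": "Section 4"},
        {"BlockType": "PAGE", "Page": 2},
        {"BlockType": "LINE", "Page": 2, "Text": "Table A"},
    ]
}

for page, lines in lines_by_page(sample_response).items():
    print(page, lines)
```

Keeping only the `LINE` blocks avoids double counting, because each word also appears in the response as its own `WORD` block.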

All offerings support PDF, JPEG, and PNG input, and Tesseract and Textract can handle TIFF files as well. Adobe will convert other image files to PDF before parsing text, but Tesseract will do the opposite, creating image files whenever the input document is a PDF before running OCR on the new file.

Lastly, if the use case involves extracting text from tables, both Textract and ExtractTable can parse the text and preserve the layout of tabular data. And Textract is the only one of the four options that supports extracting key-value pairs from documents such as forms or invoices.

Conclusions

Ultimately, the right OCR offering will depend on the use case. Adobe is an excellent tool for converting scans to easily searchable PDFs, but it probably doesn’t fit very well into a pipeline for batch text analysis. Tesseract is free and easy to use, and if high accuracy isn’t as important or your documents are high quality, then the open-source, low-hassle model may suit some users perfectly well.

Perhaps unsurprisingly, the paid, cloud-based offerings win the competition, and each offers certain advantages at the margins. Many downstream natural language processing tasks require cloud-computing infrastructure, so if your organization already uses a cloud service provider, offerings such as Textract can plug into existing pipelines and be quite cost-effective, especially at scale. On the other hand, ExtractTable may appeal to individual researchers for its impressive performance, low barrier to entry, and other unique benefits, such as confidentiality guarantees and refunds for bad output.

In part because Urban already uses AWS for our cloud computing, we found Textract best suited to large batches of text extraction because of its low cost and integration with other AWS services. But for smaller operations, we found ExtractTable to be a sleeker, more user-friendly alternative that we also recommend to our researchers.


