Primer Chapter 6: How We Study What We Can't See


This chapter is part of the companion primer to The Inhabited Body. It introduces the major techniques scientists use to study microbial communities — from the earliest microscopes to modern DNA sequencing and beyond. The goal is not to make you a microbiologist, but to give you the vocabulary and conceptual framework you need to follow the evidence presented in the main book — and to understand its limitations.


The Invisible Majority

Everything we have covered in this primer so far — the three domains of life, the architecture of cells, the code written in DNA, the strange biology of fungi and viruses — has been building toward a practical question: how do scientists actually study these things?

For most of human history, the answer was: they could not. Bacteria are typically between 0.5 and 5 micrometres long. A micrometre is one-thousandth of a millimetre. Line up a thousand bacteria end to end, and they might stretch across the head of a pin. Viruses are smaller still — most are between 20 and 300 nanometres, which is to say, between twenty and three hundred billionths of a metre. You cannot see a bacterium with the naked eye, and you cannot see a virus even with a standard laboratory microscope. For the overwhelming majority of our species' existence, the microbial world was not merely unknown — it was unknowable.

The history of microbiome science is, fundamentally, a history of tools. Every major advance in our understanding of the microbial world has followed an advance in our ability to observe it. This chapter walks through those tools — not in exhaustive technical detail, but in enough depth that you understand how the claims in the main book are actually evidenced, and where those methods have blind spots.


Seeing the Unseen: Microscopy

The story begins with glass.

In the 1670s, a Dutch cloth merchant named Antonie van Leeuwenhoek began grinding tiny glass lenses — smaller than a raindrop, but polished to extraordinary precision — and mounting them in handheld brass frames. Using these simple, single-lens instruments, he achieved magnifications of up to 270 times and became the first human being to see bacteria. He called them animalcules — "little animals" — and reported their existence, with meticulous drawings, in a series of letters to the Royal Society of London [lane2015].

Van Leeuwenhoek's microscopes were light microscopes (also called optical microscopes): they used visible light, focused through glass lenses, to magnify small objects. The principle remains the foundation of microscopy today, though modern light microscopes are vastly more sophisticated. A good research-grade light microscope can magnify up to about 1,000 times and resolve structures down to roughly 0.2 micrometres — enough to see individual bacteria and to distinguish their basic shapes (rods, spheres, spirals), but not enough to see their internal structures in any detail.
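That 0.2-micrometre figure is not arbitrary: it follows from the diffraction limit of visible light, often written as d = λ / (2 × NA), where λ is the wavelength of the light and NA is the numerical aperture of the objective lens. A quick back-of-envelope check, using illustrative values for green light and a high-quality oil-immersion objective, recovers the limit quoted above:

```python
# Abbe diffraction limit: smallest resolvable distance d = wavelength / (2 * NA).
# The two input values below are illustrative, not tied to any specific instrument.
wavelength_nm = 550        # green light, roughly mid-visible spectrum
numerical_aperture = 1.4   # typical of a high-end oil-immersion objective

d_nm = wavelength_nm / (2 * numerical_aperture)
print(f"Resolution limit: {d_nm:.0f} nm (~{d_nm / 1000:.1f} micrometres)")
# -> Resolution limit: 196 nm (~0.2 micrometres)
```

No choice of glass or manufacturing quality can beat this limit for an ordinary light microscope; that is why electron beams, with far shorter wavelengths, were needed to go further.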

Think of it like binoculars. Binoculars let you see a bird on a distant branch — its shape, its colour, perhaps whether it has a crest or a long tail. But they will not show you the structure of its feathers, the pattern of blood vessels in its wing, or the food in its stomach. Light microscopy gives you the equivalent of a bird's silhouette. You can tell that something is there, and you can make out its rough shape, but the fine details remain hidden.

To see finer structures — the internal architecture of cells, the components of viruses — scientists use electron microscopy, which replaces light with beams of electrons. Because electrons have much shorter wavelengths than visible light, electron microscopes can achieve resolutions thousands of times higher. Transmission electron microscopy (TEM) fires electrons through an ultra-thin slice of a specimen and can resolve structures down to a few nanometres — enough to see individual proteins, the layers of a cell wall, or the intricate injection apparatus of a bacteriophage. Scanning electron microscopy (SEM) bounces electrons off the surface of a specimen, producing dramatic three-dimensional images — the iconic photographs of bacteria clinging to intestinal villi, or phages perched on the surface of a bacterial cell like lunar landers, are almost always SEM images.

The trade-off is that electron microscopy generally requires specimens to be dead, dehydrated, and coated in metal. You are looking at a snapshot, not a living process. And while microscopy can show you that microbes are present, it usually cannot tell you which microbes they are. A rod-shaped bacterium viewed under a microscope could be any one of thousands of species. To put names to faces, scientists needed different tools.

> Where This Matters: Chapter 4 of The Inhabited Body discusses how modern imaging techniques, including fluorescence microscopy with species-specific probes (a technique called FISH, for fluorescence in situ hybridisation), are now being used to visualise the spatial organisation of microbial communities — revealing, for example, that the gut microbiome is not a random soup but a structured landscape where different species occupy distinct niches.


Growing the Invisible: Culture

If microscopy was the first breakthrough, the second was learning to grow microbes outside the body.

In the late nineteenth century, the German physician Robert Koch and his colleagues developed a set of techniques that would define microbiology for the next hundred years. They invented solid nutrient media — the familiar agar plate, a gel-filled dish on which bacteria can grow as visible colonies — and a rigorous experimental framework, now known as Koch's postulates, for proving that a specific microbe causes a specific disease [blevins2010].

The logic of culturing is straightforward. You take a sample — a swab from a patient's throat, a drop of blood, a smear of soil — and you spread it on a nutrient medium. Each individual bacterium that can grow on that medium divides, again and again, until it forms a visible clump: a colony. Each colony is a clone — millions of genetically identical cells descended from a single ancestor. You can then pick that colony, transfer it to a fresh plate, and study it in isolation: stain it, look at it under a microscope, test which antibiotics kill it, analyse its metabolic capabilities.

This approach was spectacularly successful for identifying pathogens. The bacteria that cause tuberculosis, cholera, plague, diphtheria, and dozens of other infectious diseases were all identified using culture-based methods. Clinical microbiology laboratories in hospitals around the world — including, quite likely, the one that processes your doctor's samples — still rely on culture as a cornerstone of diagnosis.

But for studying communities of microbes — the diverse ecosystems that make up the microbiome — culture had a devastating blind spot.

The problem is that most microbes cannot be grown using standard laboratory techniques. A bacterium will only form a colony if the medium, the temperature, the atmosphere, and the incubation time are all within its tolerance range. Many gut bacteria are obligate anaerobes — they are killed by oxygen. Others require nutrients that standard media do not provide, or depend on chemical signals from neighbouring species, or grow so slowly that they are overgrown by faster-dividing competitors before they become visible.

In 1985, the microbiologists James Staley and Allan Konopka gave this problem a name: the great plate count anomaly [staley1985]. When you examined an environmental sample under a microscope, you could count the cells directly. When you tried to grow them, only a tiny fraction — typically less than one per cent — would form colonies. The rest were invisible to culture.

The great plate count anomaly meant that, for most of the twentieth century, microbiologists could only study the minority of microbial species that happened to grow well in their laboratories. The majority of microbial diversity — including the majority of the species living inside the human body — was hidden. Not merely uncharacterised. Unknown.

The solution to this problem would come not from better culture media, but from a completely different approach: reading DNA.

> Where This Matters: Chapters 4 and 19 of The Inhabited Body explore the culture problem in detail — including the recent revival of culture through a technique called culturomics, which we introduce at the end of this chapter.


Molecular Barcodes: 16S rRNA Sequencing

In the late 1970s, Carl Woese and George Fox made a discovery that would reshape both our understanding of life and our ability to study it. By comparing the sequences of a gene called 16S ribosomal RNA (16S rRNA) across a wide range of organisms, they discovered the three-domain tree of life we introduced in Primer Chapter 1 [woese1977].

But the 16S rRNA gene turned out to have a second, even more practical significance. Every bacterium and archaeon possesses this gene, and it combines two properties that make it ideal for identification: some stretches (the conserved regions) are nearly identical across all species, so a single laboratory method can read the gene from any organism, while other stretches (the variable regions) differ enough to tell organisms apart. Together, these properties make the 16S rRNA gene a molecular barcode — a universal identifier for prokaryotic life.

Think of it like an ISBN on a book. Every published book has an ISBN — a standardised code that identifies it uniquely. If someone handed you a torn page from an unknown book, you could not identify it. But if that page happened to contain the ISBN, you could look it up immediately. The 16S rRNA gene is the microbial world's ISBN. If you can read that gene from an organism — even from an organism you have never seen, never grown, and know nothing else about — you can identify it.

The technique works as follows. You take a sample — a gram of stool, a swab from the skin — and extract all the DNA it contains. You then use a laboratory method called the polymerase chain reaction (PCR) to make millions of copies of just the 16S rRNA genes in the sample, using short synthetic DNA sequences called primers that bind to the conserved (unchanging) regions of the gene. Finally, you sequence those copied genes — read out their nucleotide letters — and compare each sequence against a database of known 16S sequences to identify which organisms were present.
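For readers who think in code, the identification step can be caricatured in a few lines. Everything below is invented for illustration: the primer sequences, the eight-letter "variable regions," and the two-entry "database" are toy stand-ins (real pipelines process millions of reads against curated reference databases, and real primers and variable regions are much longer):

```python
# Toy sketch of 16S identification: conserved regions flank a variable region,
# and the variable region is looked up in a reference database.
# All sequences here are made up for illustration.

FORWARD_PRIMER = "AGAGTTTGATC"   # stands in for a conserved primer-binding region
REVERSE_PRIMER = "GGTTACCTTGT"   # likewise hypothetical

# Hypothetical "database" mapping variable-region sequences to genus names.
REFERENCE_DB = {
    "GGTTCAAC": "Bacteroides",
    "CCATGGAA": "Faecalibacterium",
}

def identify(read: str) -> str:
    """Extract the variable region between the two primers and look it up."""
    start = read.find(FORWARD_PRIMER)
    end = read.find(REVERSE_PRIMER)
    if start == -1 or end == -1:
        # The primer-bias problem: a mismatch means the organism is missed.
        return "primer mismatch - organism missed"
    variable = read[start + len(FORWARD_PRIMER):end]
    return REFERENCE_DB.get(variable, "unknown organism")

read = "AGAGTTTGATC" + "GGTTCAAC" + "GGTTACCTTGT"
print(identify(read))   # -> Bacteroides
```

Note how the sketch also encodes the limitation discussed below: a read whose primer-binding regions do not match is simply never identified, which is exactly how primer bias silently removes species from a survey.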

This is 16S rRNA gene sequencing, also known as amplicon sequencing, and it revolutionised microbiology in the 2000s. For the first time, scientists could survey entire microbial communities without growing a single organism. The Human Microbiome Project — the landmark study we will encounter in Chapter 1 of the main book — used this technique to map the microbial communities at up to 18 body sites (15 in men, 18 in women) in 242 healthy adults, producing the first comprehensive atlas of the healthy human microbiome [hmp2012].

But 16S sequencing has important limitations. It identifies organisms primarily at the genus level — it can tell you that a sample contains Bacteroides, but often cannot distinguish between closely related species. It only detects bacteria and archaea — not viruses (which have no ribosomes) or most fungi (whose ribosomal genes require different primers). And because it relies on PCR amplification, it can introduce biases: species whose 16S genes happen to amplify efficiently with the chosen primers will be overrepresented, while those with mismatches in the primer-binding sites may be underrepresented or missed entirely.

Most importantly, 16S sequencing tells you who is present but not what they are doing. Knowing that a sample contains a particular species of Faecalibacterium tells you nothing about which genes that species is expressing, which metabolites it is producing, or how it is interacting with its neighbours. For that, scientists needed to go beyond the barcode.

> Where This Matters: Chapter 4 of The Inhabited Body covers 16S sequencing in technical detail, including the shift from operational taxonomic units (OTUs) to amplicon sequence variants (ASVs), and the critical issue of how DNA extraction methods can bias community profiles.


Reading All the Genes: Metagenomics

If 16S sequencing reads a single gene from each organism in a sample, metagenomics reads everything.

Instead of targeting one specific gene, metagenomic sequencing takes all the DNA in a sample — from every bacterium, archaeon, fungus, virus, and human cell present — and chops it into millions of short fragments. These fragments are then sequenced and computationally reassembled, or at least matched against databases of known organisms and genes.

The analogy commonly used is a paper shredder. Imagine taking a library of a thousand different books — some in English, some in Mandarin, some in Arabic, some in languages you have never seen — shredding them all together, and then trying to figure out which books were in the pile and what they were about, using only the resulting strips of text. Metagenomic analysis does something similar, except the "books" are genomes and the "strips" are sequence reads of 150 to 300 DNA letters each.
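One common computational strategy for the matching step works by chopping both the reads and the reference genomes into short overlapping "words" called k-mers, then assigning each read to the reference it shares the most words with. Here is a miniature sketch of that idea; the "genomes" and the read are invented toy strings (real genomes run to millions of letters, and production classifiers use far more sophisticated indexes):

```python
# Toy k-mer classification: index reference genomes by their k-mers,
# then assign each read to the reference sharing the most k-mers.
# The reference sequences below are made up for illustration.

def kmers(seq: str, k: int = 5) -> set:
    """All overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

references = {
    "Species A": "ATGGCGTACGTTAGCAATGCCGTA",
    "Species B": "TTACGGCATCAGGATCCGATTACA",
}
index = {name: kmers(genome) for name, genome in references.items()}

def classify(read: str) -> str:
    """Return the reference whose k-mer set best overlaps the read's."""
    scores = {name: len(kmers(read) & ks) for name, ks in index.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified - possibly novel"

print(classify("GCGTACGTTAGC"))   # a fragment of Species A -> Species A
```

The "unclassified" branch is where metagenomics earns its keep: reads that match nothing in the database are candidates for genuinely new organisms or genes.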

The power of metagenomics is that it provides not just taxonomic identification but functional information. You can identify which metabolic genes are present in a community, which antibiotic resistance genes are circulating (the resistome), and which virulence factors are encoded. You can detect organisms at the species or even strain level. And you can find things that 16S sequencing misses entirely — viruses, novel organisms with no close relatives in databases, and genes of unknown function that may represent entirely new biology.

The landmark MetaHIT study, published in 2010, demonstrated this power dramatically. By sequencing all the DNA in stool samples from 124 European adults, Junjie Qin and colleagues assembled a catalogue of 3.3 million microbial genes — roughly 150 times the number of genes in the entire human genome [qin2010].

The trade-offs are cost, complexity, and noise. Metagenomic sequencing generates enormous datasets that require significant computational infrastructure to analyse. Samples from the human body inevitably contain human DNA alongside microbial DNA — in a nasal swab, over 90 per cent of the DNA may be from the host, meaning you must sequence vastly more to capture the microbial fraction. And, like 16S sequencing, standard metagenomics reveals which genes are present but not necessarily which ones are active.


Beyond DNA: Metabolomics and Multi-Omics

DNA tells you what an organism could do. To understand what it is doing, you need to look at its outputs.

This insight has driven the development of several complementary approaches, collectively known as multi-omics.

Metatranscriptomics extracts and sequences the RNA — specifically, messenger RNA (mRNA) — from a microbial community. As we discussed in Primer Chapter 3, mRNA is produced only when a gene is being actively used. The metatranscriptome is therefore a snapshot of gene expression: not the parts list, but the set of instructions currently being executed. It has revealed, for example, that gene expression in the gut microbiome shifts dramatically in response to meals, medications, and even the time of day — even when the underlying DNA composition of the community barely changes.

Metabolomics takes an entirely different approach. Rather than reading nucleic acids, it uses analytical chemistry — typically mass spectrometry or nuclear magnetic resonance (NMR) spectroscopy — to identify and quantify the small molecules (metabolites) present in a sample. These metabolites are, in a sense, the final output of all the genomic activity: the short-chain fatty acids, bile acid derivatives, vitamins, and signalling molecules that the microbiome actually produces and that interact with the host body.

When you read in the main book that the gut microbiome "communicates with the brain" or "influences cardiovascular risk," it is largely through metabolites that this communication occurs. Butyrate, a short-chain fatty acid produced by bacterial fermentation of dietary fibre. Trimethylamine, produced from dietary choline and converted in the liver to trimethylamine N-oxide (TMAO), a compound linked to heart disease. Indole derivatives, which modulate intestinal barrier function. Metabolomics is the tool that identifies and measures these molecules.

The most powerful modern studies combine multiple layers — metagenomics and metabolomics, for instance, or metatranscriptomics and proteomics — to build integrated pictures of what a microbial community is and what it is doing. This multi-omics approach is computationally demanding but is beginning to reveal how microbial communities function as coordinated systems rather than as collections of unrelated individual species [knight2018].

> Where This Matters: Multi-omics approaches are central to many findings discussed throughout The Inhabited Body, particularly in Chapters 7 (diet and the microbiome), 10 (the gut-brain axis), and 13 (microbiome and metabolism).


The Return of Culture: Culturomics

Given everything we have said about the limitations of culture, it might seem surprising that one of the most exciting recent developments in microbiome science is... better culturing.

In 2012, a team led by the French microbiologist Didier Raoult introduced an approach they called culturomics [lagier2016]. The strategy is conceptually simple but operationally ambitious: instead of growing a sample on one or two standard media under a single set of conditions, you use dozens or even hundreds of different culture conditions — different nutrient media, different atmospheric compositions (aerobic, anaerobic, microaerobic), different temperatures, different incubation times (days, weeks, sometimes months) — to coax into growth the many species that standard methods miss.

Every colony that appears is then identified, not by traditional biochemical tests, but by MALDI-TOF mass spectrometry — a rapid technique that identifies organisms by the unique pattern of proteins they contain, producing a result in minutes rather than the days required by classical identification. Colonies that cannot be matched to any known species are flagged as potentially novel and subjected to genome sequencing.

Think of it as fishing with every kind of net, bait, and lure you can find, in every part of the lake, at every time of day — instead of casting the same hook from the same spot and concluding that the lake only contains one kind of fish.

The results have been remarkable. Culturomics studies have isolated hundreds of bacterial species from the human gut that had never previously been grown in culture — including some that were known only from their DNA sequences in metagenomic databases, and others that were entirely new to science [lagier2016]. By 2018, the approach had added more than 200 new species to the catalogue of human-associated bacteria.

Culturomics and metagenomics are complementary. Metagenomics tells you who is there; culturomics lets you get a living specimen. A living culture can be characterised in far more detail than a sequence: you can test its antibiotic susceptibility, study its metabolism, observe its behaviour under controlled conditions, and — critically — use it in animal experiments to test its effects on the host. You cannot do any of these things with a sequence alone.

> Where This Matters: Chapter 4 of The Inhabited Body covers culturomics in detail. Later chapters, particularly those on probiotics (Chapter 20) and faecal microbiota transplantation (Chapter 21), depend heavily on having living cultures of gut bacteria — something that culturomics has made far more feasible.


Germ-Free Animals: Testing Cause and Effect

The tools described above — sequencing, metabolomics, culturomics — can reveal correlations: that a particular species is more abundant in people with a certain disease, for instance, or that a particular metabolite is elevated after a dietary change. But correlation is not causation. To test whether a microbe actually causes an effect, you need an experiment. And the most powerful experiment in microbiome science involves an animal with no microbes at all.

Gnotobiotic animals (from the Greek gnotos, "known," and bios, "life") are raised from birth in completely sterile environments — sealed isolators with filtered air and sterilised food and water. The most commonly used are germ-free mice: animals that harbour no bacteria, no viruses, no fungi — nothing. Their microbiome is a blank slate.

This blank slate is extraordinarily useful. If you want to know whether a specific bacterium affects body weight, you can colonise a group of germ-free mice with that species — and only that species — and compare their weight gain to that of mice that remain germ-free. If you want to know whether the gut microbiome from an obese human can transfer obesity to a mouse, you can transplant the entire faecal community from an obese donor into germ-free mice and watch what happens. (It can, and it does — a landmark experiment we discuss in Chapter 8.)

Germ-free animals have revealed that the microbiome influences immune system development, brain chemistry and behaviour, bone density, fat storage, and dozens of other physiological processes. Without this experimental model, we would know that the microbiome correlates with many aspects of health. With it, we can begin to demonstrate causation.

> Where This Matters: Gnotobiotic animal studies are referenced throughout The Inhabited Body, especially in Chapters 8 (obesity), 10 (gut-brain axis), 11 (immune development), and 14 (autoimmune disease). Chapter 4 discusses the model's limitations and ethical considerations.


How to Read Microbiome Science: A Sceptic's Checklist

Understanding the tools is important not just for appreciating what scientists have discovered, but for evaluating the claims you encounter — in this book, in the news, and in the marketing materials of probiotic companies.

Here is a brief checklist of questions worth asking when you encounter a microbiome study:

What method was used? A 16S study can tell you who is present but cannot identify viruses or provide functional information. A metagenomic study is more comprehensive but more expensive and computationally complex. A study based only on culture may have missed the majority of species. The method shapes what can — and cannot — be concluded.

How big was the study? Early microbiome studies often involved fewer than twenty participants. Modern cohort studies can include thousands. Small studies may detect real patterns, but they are also more susceptible to false positives — apparent associations that do not replicate in larger samples.

Was it correlational or causal? A study showing that people with disease X have more of bacterium Y does not prove that bacterium Y causes disease X. It might be a consequence of the disease, a side effect of its treatment, or a coincidence. Causal claims require interventional experiments — ideally in gnotobiotic animals or in controlled human trials.

Were confounders controlled? Diet, medication, age, geography, and dozens of other factors influence the microbiome. A study comparing the microbiomes of healthy people and people with a disease must account for these variables, or any observed differences might reflect the confounders rather than the disease itself.

Can it be replicated? A finding reported in one study, from one laboratory, using one set of methods, is preliminary. It becomes robust only when independent teams, using different methods and different populations, find the same result.
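The false-positive risk raised in the checklist above is easy to demonstrate for yourself. The small simulation below compares two groups of "subjects" drawn from the same distribution, so there is no real effect to find, across a hundred imaginary taxa; some taxa will nonetheless look "significant" by chance alone. All numbers are synthetic, and the simple permutation test stands in for the more elaborate statistics real studies use:

```python
# Simulate the multiple-comparisons trap: with no real group difference,
# testing many taxa at p < 0.05 still yields spurious "hits".
# All data here are randomly generated; nothing is real microbiome data.
import random

random.seed(1)

def perm_pvalue(a, b, n_perm=200):
    """Permutation test: how often shuffled labels beat the observed gap."""
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        x, y = pooled[:len(a)], pooled[len(a):]
        if abs(sum(x) / len(x) - sum(y) / len(y)) >= observed:
            hits += 1
    return hits / n_perm

n_taxa = 100
false_hits = 0
for _ in range(n_taxa):
    # "Patients" and "controls" are drawn from the SAME distribution.
    patients = [random.gauss(0, 1) for _ in range(10)]
    controls = [random.gauss(0, 1) for _ in range(10)]
    if perm_pvalue(patients, controls) < 0.05:
        false_hits += 1

print(f"{false_hits} of {n_taxa} taxa look 'significant' despite no real effect")
```

On average, about five of the hundred taxa will cross the p < 0.05 threshold purely by chance, which is exactly why a single small study reporting that "bacterium Y is associated with disease X" should be treated as provisional until it replicates.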

This is not a counsel of despair. The microbiome field has produced a remarkable body of rigorous, replicated findings that have genuinely advanced our understanding of human biology. But it has also produced a considerable volume of preliminary, overhyped, or poorly designed research — and an even larger volume of commercial claims that far outstrip the evidence. A little healthy scepticism serves the reader well.

> Where This Matters: Chapter 4, Section 4.11 of The Inhabited Body provides a more detailed guide to reading microbiome science critically, including common statistical pitfalls and the problem of "p-hacking" in large omics datasets.


The Tools That Shaped the Story

The technologies we have surveyed in this chapter — microscopy, culture, 16S sequencing, shotgun metagenomics, metabolomics, culturomics, germ-free animal models — are not merely background information. They are the reason the story in this book can be told at all.

Every claim about the microbiome's role in health and disease rests on one or more of these tools. When you read that certain gut bacteria produce molecules that influence mood, that claim was established through metabolomics. When you read that the infant gut is colonised in a specific sequence, that finding came from 16S or metagenomic sequencing of serial stool samples. When you read that a faecal transplant can cure a recurrent infection, that was demonstrated in clinical trials using culture and molecular diagnostics to track outcomes.

Understanding the tools also helps you understand the gaps. The virome is less well characterised than the bacteriome because viruses lack the universal barcoding gene that 16S provides. The mycobiome has historically been overlooked because standard sequencing protocols targeted bacteria. The functional activity of the microbiome is harder to study than its composition because metabolomics and metatranscriptomics are technically demanding and expensive.

If Primer Chapters 1 through 5 gave you the biological vocabulary you need to understand the microbiome, this chapter gives you the methodological vocabulary — the ability to understand not just what scientists have found, but how they found it, and how confident you should be in the findings.

You are now ready for the main book.


Chapter References