It is often easy to sort people into broad geographic categories simply by looking at them. Recognising parental and sibling relationships may be equally easy, but determining more distant or tenuous connections demands detailed analysis, which nowadays can come from looking at people's genes. Genes are the "instructions" we all have for making each of us just the way we are - instructions not only for major features like sex, arms and legs, but also for all our more subtle characteristics. Although in the laboratory the analysis is complicated, retrieving a gene sample from an individual is simplicity itself: just gently scrape the inside of the cheek with a clean swab. The scientists do the rest.
In each of us, in almost every one of our cells, the genes form part of tiny structures called "chromosomes". There are 23 pairs of them, one member of each pair being contributed by the mother in her egg, the other by the father via his sperm. From the original fertilised egg, the chromosomes are copied over and over again so that as we grow each new cell in our bodies has a full set. Or nearly every cell: apart from red blood cells, which lose their genes in the course of development, the exceptions are the sperm and eggs which contribute to the next generation when we have our own children. In sperm and egg cells, only one member of each pair of chromosomes is present so that when the two come together in the act of fertilisation, the full set is restored.
In 22 out of the 23 pairs, the two chromosomes (one originally from the mother, the other from the father), while not identical, are so similar that from time to time they actually interchange bits between themselves; eventually, individual chromosomes come to represent both maternal and paternal lineages. But the 23rd "pair", called "X" and "Y", are very different and almost no exchange can take place. Females possess two Xs in the 23rd pair (which can indeed undergo exchange) and their eggs therefore have just one X. Males, however, carry one X plus one Y; as sperm have only one or the other, about half of a man's sperm are X and half are Y. If an egg is fertilised by an X sperm the resulting individual is XX and female; if by a Y sperm, the combination is XY and hence male. That is how boys and girls result from the chance meeting of an egg with one or other type of sperm.
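The arithmetic of that last step can be set out as a small sketch in Python. It is a deliberately simplified model that assumes every egg carries an X and that X and Y sperm are exactly equally likely to fertilise it; the names EGG_TYPES, SPERM_TYPES, offspring_sex and sex_ratio are invented here purely for illustration.

```python
from collections import Counter
from itertools import product

# Simplified assumption: every egg carries an X chromosome,
# and sperm carry either an X or a Y in equal numbers.
EGG_TYPES = ["X"]
SPERM_TYPES = ["X", "Y"]


def offspring_sex(egg, sperm):
    """Return 'male' if the combined pair contains a Y (XY), else 'female' (XX)."""
    return "male" if "Y" in egg + sperm else "female"


def sex_ratio():
    """Enumerate every egg/sperm combination and return the share of each sex."""
    counts = Counter(
        offspring_sex(egg, sperm)
        for egg, sperm in product(EGG_TYPES, SPERM_TYPES)
    )
    total = sum(counts.values())
    return {sex: n / total for sex, n in counts.items()}


if __name__ == "__main__":
    print(sex_ratio())
```

Under these assumptions the two outcomes come out in equal shares, which is the "about half and half" described above.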
Because the Y chromosome cannot readily exchange any of its substance with X, the Y of each living man directly resembles that of his father, his grandfather, great grandfather and so back into history. Analysis of Y chromosomes is therefore potentially a very powerful way of determining historical male lineages and the relationships between contemporary living males. That particular type of record does not hold for females because of the interchange between the two X chromosomes of the 23rd pair (one from the mother, the other from the father).
The genetic instructions on the chromosomes are encoded ("written", one might say) in the form of a sequence of certain chemicals; we could liken them to letters, just as in a book. However, no more than 2-3% of these letter messages are actually meaningful; the rest appear to have no function and are often called "junk" (though whether or not they are really junk nobody knows for sure). As the genetic material, DNA, is copied from cell to cell and from generation to generation it is, of course, very important not to make mistakes because then the instructions would change and offspring might suffer some damaging or even fatal defect. Mistakes nevertheless do occur by accidents of chemistry: they may take the form of putting in the wrong letter, adding extra letters, deleting some of them or repeating whole sections. Very often, mistakes result in the death of the cell or organism in which they occur, or reduce its ability to breed, and so are quickly eliminated. But changes in the 97% or so of the chromosomes which are junk don't matter because junk is junk and is not used for coding. Junk changes are therefore handed down willy-nilly from generation to generation (father to son in our Y chromosome example): what we do is look for them. Some have already been identified by other scientists so we know where to look; others we will have to find for ourselves. Using as many of these junk changes as we can, we test the sample recovered from each donor to establish the state of affairs at each junk site: what form does it have? Clearly, the more similar any two individuals are at these sites, the more closely they are related; in other words, the more recently they shared a common ancestor. The more different they are, the further back in time was their common ancestor. In this way paternal family trees can be constructed.
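As a toy illustration only, and not the laboratory procedure itself, the comparison just described can be sketched in Python. The donors, the five tested sites and the letters recorded at them are all invented for the example, and a plain count of differing sites is assumed as a crude stand-in for relatedness; a real study would use many more sites and would take account of how often each type of change arises.

```python
from itertools import combinations

# Hypothetical data: the letter observed at each of five tested sites
# on the Y chromosome of three imaginary donors.
SAMPLES = {
    "donor_A": "AGTCA",
    "donor_B": "AGTCG",
    "donor_C": "TGACG",
}


def count_differences(a, b):
    """Count the tested sites at which two donors show different forms."""
    if len(a) != len(b):
        raise ValueError("both donors must be tested at the same sites")
    return sum(1 for x, y in zip(a, b) if x != y)


def pairwise_differences(samples):
    """Return the number of differing sites for every pair of donors."""
    return {
        (first, second): count_differences(samples[first], samples[second])
        for first, second in combinations(sorted(samples), 2)
    }


def closest_pair(samples):
    """Return the pair with the fewest differences (assumed most closely related)."""
    differences = pairwise_differences(samples)
    return min(differences, key=differences.get)


if __name__ == "__main__":
    for pair, n in pairwise_differences(SAMPLES).items():
        print(pair, n)
    print("closest:", closest_pair(SAMPLES))
```

In this invented data set, donors A and B differ at a single site while donor C differs from each of them at more, so A and B would be grouped together first when sketching a tree, with C joining further back.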
These types of analyses are not confined to samples taken from living people. It is often possible to recover DNA from the bones of people (and plants and animals) long since dead. Many such archaeological samples are available with well-established provenances. Whether or not a particular specimen of ancient DNA is actually useful in providing information depends on its state of preservation: sometimes one can go back thousands of years and in one (animal) example, the mammoth discovered frozen in Siberia, informative samples were recovered which had lain buried in ice for at least 50,000 years and perhaps much longer.
Our group, based at University College London, with links to the Hadassah Hospital in Jerusalem and elsewhere, is using Y chromosome analysis to explore the relationships between populations. We can ask how peoples are related. In the cases of migrations and colonisations we can try to find out if all the colonists were men and whether they married local women or brought their own womenfolk with them. (One can do that because there are also genetic methods for exploring the maternal lineage.) We can attempt to trace migratory routes, settlements and intermixing between the migrants and the indigenous population.
One of our particular areas of interest concerns the movements and relationships of the peoples of the ancient and modern Middle East. They include, of course, the Jews, both of the mainstream Ashkenazi and Sephardi communities, and of the more peripheral groups in Africa, the Caucasus and Central Asia. Within the Jewish communities we can look at the relationships between subgroups such as the Cohens, the Levites and the rest; you may have seen the newspaper reports earlier this year supporting the traditional view that the Cohens, the "Jewish priests", have indeed maintained their oral continuity from father to son apparently for thousands of years. Our historical work is also directed to the colonisation in ancient times of the Mediterranean basin by the Phoenicians and the genetic diseases they may have taken with them.
Another fascinating study is the origins of the Eastern European Jews. You may know that there are at least three main propositions. The first is that they derive in principle from the ancient Israelite population, part of which migrated in Greek, Roman and later times to Eastern France and Western Germany, and early in this millennium to Poland and other areas. Their Yiddish language was a form of Old German with many later Slavic and Hebrew borrowings. A variant of this concept is that the migration was via Italy to Switzerland, Bavaria and Austria, with a postulated later migration east along the Danube valley to Romania and outwards from there. Yiddish shares many words and expressions with the southern form of German. I am not sure for the moment how much inward conversion to Judaism these two hypotheses suppose. The third, and in some ways the most intriguing idea, depending heavily as it does on the syntax and specific vocabulary of Yiddish, is that the Eastern European Jews, in addition to their descent from the ancient Jewish population, have a significant part of their ancestry derived from Slavic converts (Sorbs, Balkans and others) plus a minor Turkic input from further east. The proponents of this theory designate Yiddish syntactically as a Slavic language with a mainly German lexicon. This is just the sort of proposition it might be possible to sort out genetically.
There are few questions more intriguing to all of us than our own histories and how we fit into the scheme of things: who are we, who were our ancestors and to whom are we related? Genetic anthropology is already providing fascinating new insights and we are confident that our own work will help to widen our area of understanding.