photographs - which may itself be sufficient but may not quite justify the effort spent in making them! Why this insistence on typing up indexes? Partly because it may help others use your material at some indefinite time in the future, but more importantly because it is a great help to you: rather than having to keep in your head some vague notion that on a particular tape there was a conversation that is somehow relevant to the problem at hand, you can look it up. This is why the literacy argument is important and finds a place in a practical manual. By externalising thoughts and mental constructs - such as where we have stored a particular bit of information - we are freed (so runs the pious hope) to think of better things. It is easier to analyse well-indexed research material, and hence we may expect more and better analysis from the indexers. In addition, the work of indexing forces a familiarity with the whole, and such a pass over the entire work may not otherwise occur. Indeed, this is why I try to index my notes in the field: the discipline of so doing reveals holes in what I have learnt, inconsistencies and loose ends, which can be sorted out easily while I am still there.

There is a downside to this. An anthropologist who refuses to index may rely more on memory, which may encourage analytic insight, and riffling through fieldnotes/photos/tapes may facilitate serendipitous discoveries. Yet when I look something up I always find myself drawn to adjacent pages, so I am not convinced that not indexing really increases serendipity. Nor am I convinced by the 'mental index' view, which I believe to be self-serving and to encourage the creation of structures independent of the available evidence.

Case study: Newspaper cuttings from Pakistan

In the course of doctoral research in Pakistan a student assembled a large collection of newspaper cuttings on the subjects that concerned her - Islam and politics. These were topical during the period of her research, which ended up focusing on the public representations of these topics, so the press coverage and what her informants made of it were very important to her. To cope with the collection she first scanned the cuttings, then used an Optical Character Recognition (henceforth OCR) program for data entry. This has the same end result as having the cuttings copy-typed - a set of word processor files. OCR is good for bad typists who have
A good example would be a database of plant names where for every specimen you may wish to record a local name, the scientific name, a sample number, notes about habitat and information about local uses in food, ritual and medicine. In short, there is a predictable set of categories of information, and implementing these as a database helps you to record the information consistently. Once this has been achieved, the same structure can be of great help in finding information. It is conceptually no different from using a set of file cards on which the same sort of information is always recorded in the same place on the card: in a computer database set up for the plant names, information about the scientific name belongs in the 'field' (i.e. category) for scientific names.

Illustration of file card
sample number | 1 |
local name | kembar |
scientific name | Ceiba pentandra |
habitat | forest |
notes | kapok collected at end of dry season and spun, was used in mattresses |
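
The file card can be sketched as a record type in Python. This is only an illustration of the idea of fields, not the layout of any particular database program: the class name `PlantRecord`, the helper `find` and the second record are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class PlantRecord:
    """One 'file card'; each attribute plays the role of a database field."""
    sample_number: int
    local_name: str
    scientific_name: str
    habitat: str
    notes: str = ""


# The first record mirrors the illustrated card. The second is invented
# purely so that the search below has something to exclude.
records = [
    PlantRecord(1, "kembar", "Ceiba pentandra", "forest",
                "kapok collected at end of dry season and spun"),
    PlantRecord(2, "hypothetical-name", "Zea mays", "garden",
                "invented example"),
]


def find(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [r for r in records
            if all(getattr(r, name) == value
                   for name, value in criteria.items())]


forest_plants = find(records, habitat="forest")
print([r.local_name for r in forest_plants])
```

Because every record keeps the same kind of information in the same named field, a search such as `find(records, habitat="forest")` needs no knowledge of how any individual card was written.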
Another important type of systematic information that all academics have to deal with is the bibliography, and this is where databases really come into their own. For, in the course of a couple of years of research activity, you are likely to have looked at many books and even more scattered journal articles in a variety of different libraries. To return to a particular article you need to know not only which library holds it and where, but the page numbers as well! This is a classic task for a database. And there is more: if you wish to use a reference in a bibliography then there is the additional chore of formatting. Particular universities and journals lay down different formatting styles, and it is extremely tedious having to change titles which were in italics to plain text within inverted commas, and so on. There are now a number of special bibliography programs available that not only act as databases, allowing the easy manipulation of the references that you collect in the course of your research, but also allow you to print out the references in a variety of different formats - so the same collection of references may be printed first in the Chicago style for a university thesis, then reformatted in JRAI or AA format for submission to a journal. Once a style template has been set up (and the main programs come with a large set of pre-established style templates) the output format can be changed swiftly with little or no further editing required. I write with the enthusiasm of a
convert: having started to use one of these programs more than five years ago I can scarcely bear to contemplate the hours wasted in the past wrestling with bibliography formats. And I bow my head in shame when I hear of new research students being told by their supervisors to keep bibliographic references on file cards. If ever there was a simple task which computers can greatly facilitate then bibliographies are it!

Everyone is kin

Genealogies are much, much harder, and the extraordinary thing is that in 1997 there is still no good program available that is suitable for anthropologists to use. CSAC is working on this [1]. The commercial market is dominated by those doing family histories in Europe and America, and the programs reflect their interests and cultural biases, making them unusable for a wide variety of non-western family structures (quite apart from the different graphic representations used).

It is worth reflecting upon why the task of managing bibliographies is straightforward but that of managing genealogies is not. A bibliographic record is an autonomous, independent entity, containing information such as author, date, title, pages, location, journal title etc. So a bibliographic database consists of a list of entries and is thus conceptually very like a stack of file cards. But a genealogy is very different because there are relationships between the entries, and these make the problem much more complex. In essence, a genealogy comprises a list of individuals (who have attributes such as name, sex, date of birth, place of birth, residence, languages spoken etc.) and other lists of marriages or of children. But among the elements of these lists are references to individual people. So one list refers to another - we are dealing with a relational database. The references between these lists are the essence of the relationships of interest to anthropologists. One irony is that the genealogical tree is an extremely efficient way of representing this information - for all the problems of drawing the trees and the ideology behind them (Bouquet 1993).

There are two separate aspects to a genealogical database which help demonstrate both the utility and the problems concerned: data entry and access (browsing or examining the data that have already been entered).

[1] See the work included in the Experience Rich Anthropology project, which has an online drawing programme.
When typing in the data it is important to minimise errors and to reduce repetition to a minimum. It should not be necessary to type repeatedly the names of mother and father for the ten children resulting from a particular marriage. Attempts to replicate on computer the sheer efficiency of genealogical trees have not been particularly successful. The genealogical tree is an incredibly efficient and simple device for recording relationships. Trees are good for a few tens of people, but can quickly become cluttered and unmanageable once more than a hundred or so individuals are involved. And once multiple marriages occur, which may involve unions between kin and across generations, the diagrams become very hard to maintain on paper. Automatically generated diagrams can be selected to represent particular views, perhaps omitting categories of people to make the diagram clearer. But that is to move to a discussion of browsing rather than data entry.

I take it as basic that genealogical information includes data not easily or neatly represented on tree diagrams, which are good for relationships but not for gossip or more mundane information such as dates. Data entry, then, must allow basic information (where known) about an individual to be easily typed in. This typically includes categories such as:
Unique Id
Sex
Name(s)
Date of Birth
Place of Birth
Date of Death
Place of Death
Current Residence
Time at current residence
Education
Religion
Economic Data (may be several different fields)
Genetic/Medical Data (may be several different fields)
Gossip/Other Information
Every item in this list concerns one particular individual and no one else. How can we enter data about parents without having to repeat the same information for siblings? The neatest solution seems to be to have a second set of data entries that record relationships, and which use the unique individual id numbers to keep track of those relationships.
There are two solutions which are more or less equivalent, and which serve to demonstrate the type of approach which can be developed. In the first solution we add to the data on individuals just one field, which records the Marriage (or union) Id of that individual's parents. If nothing is known of either parent then it may be left blank. A new data type is then created which contains the following types of field:
Marriage Id | Note this also appears in the records of individuals and thus connects individuals to their parents |
Husband Id | Note this and the following Id numbers connect individuals with their spouses (legitimate and illegitimate) |
Wife Id |
Date of start |
Date of end |
Other information | e.g. place of marriage, dates of prenuptial events, divorce etc. |
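
The first solution can be sketched in Python as two linked collections of records. This is a minimal illustration under stated assumptions, not a description of any existing genealogy program: the class and function names, the id numbers and the placeholder names "A" to "D" are all invented, and only a few of the fields listed above are included.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Individual:
    id: int
    name: str
    sex: str
    # Marriage (or union) Id of this person's parents; None when unknown.
    parents_marriage_id: Optional[int] = None


@dataclass
class Marriage:
    id: int
    husband_id: int
    wife_id: int
    other_information: str = ""


# Invented sample data: one union (Marriage Id 10) with two children.
individuals = {
    1: Individual(1, "A", "M"),
    2: Individual(2, "B", "F"),
    3: Individual(3, "C", "F", parents_marriage_id=10),
    4: Individual(4, "D", "M", parents_marriage_id=10),
}
marriages = {10: Marriage(10, husband_id=1, wife_id=2)}


def parents_of(person_id):
    """Follow the Marriage Id on an individual's record to both parents."""
    marriage_id = individuals[person_id].parents_marriage_id
    if marriage_id is None:
        return []
    m = marriages[marriage_id]
    return [individuals[m.husband_id].name, individuals[m.wife_id].name]


def siblings_of(person_id):
    """Full siblings are the other individuals with the same Marriage Id."""
    marriage_id = individuals[person_id].parents_marriage_id
    if marriage_id is None:
        return []
    return [p.name for p in individuals.values()
            if p.parents_marriage_id == marriage_id and p.id != person_id]


print(parents_of(3))
print(siblings_of(3))
```

The parents' details are typed once, on their own records; each child carries only the Marriage Id.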
This solution does generate a little repetitive typing, but not much: the Marriage Id must be added to each sibling. Note that if someone remarries or is part of a polygamous marriage, a new marriage record must be made for each union. So a man with ten wives and an illegitimate child would have ten marriage records associated with the official wives, and another one for the extra-marital union. The database structure is neutral with respect to legitimacy. Extra fields may have to be added if issues such as legitimacy are important.

In the second solution the individual record contains no information about parents and no reference to a marriage record. Instead the marriage records contain references to the individuals that the union produces. A typical format would be:
Marriage Id
Husband
Wife
Date of start
Date of end
Offspring (set of Individual Id nos...)
Gossip/Other Information
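
The second solution can be sketched in the same way. Again this is only an illustration: the names, id numbers and helper functions are invented, and the sample data assume one man with two separate unions in order to show one record per union.

```python
from dataclasses import dataclass, field


@dataclass
class Marriage:
    id: int
    husband: int                      # Individual Id
    wife: int                         # Individual Id
    offspring: set = field(default_factory=set)  # Individual Ids


# Invented sample data. Individual records hold no parental reference
# here, so a plain id-to-name table stands in for them.
names = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E", 6: "F"}
marriages = [
    Marriage(10, husband=1, wife=2, offspring={3, 4}),
    Marriage(11, husband=1, wife=5, offspring={6}),
]


def parents_of(person_id):
    """Search the marriage records for the union listing this person."""
    for m in marriages:
        if person_id in m.offspring:
            return (names[m.husband], names[m.wife])
    return None


def children_of(person_id):
    """Collect offspring from every union in which the person is a spouse."""
    kids = set()
    for m in marriages:
        if person_id in (m.husband, m.wife):
            kids |= m.offspring
    return sorted(names[k] for k in kids)


print(parents_of(6))
print(children_of(1))
```

The two layouts hold the same information. The difference is where the cross-reference is typed: once per child on the individual record in the first solution, once per child on the marriage record in the second.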
This type of cross-referencing may sound onerous, but a well-designed system can mask the chore of having to copy-type random id numbers: the system can display a list of the individuals and the marriages that have already been entered. The user can select a particular individual then click on a special button or switch on screen which will add the id of that