Sunday 6 December 2009

Will Google Books destroy local libraries?

In the long term the answer is "Yes". It's not just Google Books that will do this, it's other digital developments as well. History tells us that digital markets converge into a few large players, and that is precisely what will happen in this "market place". Libraries may wish to think of themselves as a public service, but ultimately they are a part of this digital "information" market place.

The question is, what should local libraries and archives do to combat this trend? And how long do they have?

The answer to the last question is easier than the first. History tells us that it will take longer than you might imagine for the digitisation of "everything" to occur. Digital developments generally take longer than the pundits/experts would have you believe. So, there is time for these local services to do something.

The question is, in a digital world where all books are free to access via the internet, what is the role of a local library or archive? If everyone can find what they want on line, free of charge, from the comfort of their own home, why should they bother travelling to a library?

The answer can only be that local libraries offer something which the "net" does not, in some form of added value. It will probably not be any one thing.

Certainly, the advice and assistance of knowledgeable librarians and archivists will always be valued in terms of educating the public and new librarians, and also in assisting the public with their research. Retaining experienced and knowledgeable staff is thus a critical success factor for libraries. Currently, their cost-cutting initiatives seem to lead to some of their best staff seeking early retirement or redundancy. This appears to me to be the wrong direction of travel. Councils would be better advised to divert some of the funds they apply away from headline-grabbing initiatives like building new libraries, and towards investment in people, services and books.

In addition, libraries should also be trying to add value to the holdings they have. My own web site shows what can be done to add value to printed archival material by digitisation (i.e. producing digital copies, and using optical character recognition software to make it searchable). Where local libraries and archives have local holdings which are unique, I think they should be doing what I am doing. That would at least give them some advantage in the digital marketplace.

Libraries and archives already have extensive catalogues and indexes which could be used to build local "knowledge bases" to answer research questions. In addition, every time a member of staff answers a query from a member of the public, the response could be added to that knowledge base. Over time, a substantial body of material would be developed specifically aimed at addressing local research questions, and this would become a valuable resource. Many companies already do this - IBM has for decades had a database of engineering problems and solutions to assist its field engineers; Microsoft has its own "knowledge base" for developers.

There must be many other responses that local libraries and archives could make.

You might think that the government would have an answer. They don't. They utter platitudes about "needs based services" and have a strategy of fewer bigger archives. That means centralisation, inevitably with a concentration on London. It all points to the marginalisation of local libraries and archives.

My own web site is therefore just an exemplar of one of the responses that local libraries and archives could make to the impending digital threat. My plea is that they should do something other than sleepwalk to oblivion - please.

Saturday 5 December 2009

Do look ups break copyright?

Many customers of my web site do look ups for other people, and one of you has asked whether this breaches the copyright of the site. The straight answer is "Yes".

The reason is that information on the web site is for private research, and not for publishing - whether for gain or not. Doing look ups, and passing the results to a third party, is "publishing".

I guess the question is "Does this matter?" After all, it is quite impractical for me as an individual to pursue every perceived breach through the courts. If anyone were daft enough to offer one of my CDs for sale, say via eBay, then I probably would take action. Apart from that case, anyone who breaches the copyright in this way is unlikely to face any penalty.

So, "Does it matter?" The trouble is that this sort of breach of copyright does damage (by reducing funding) to what I am trying to do - which is to improve access to printed local and family history sources from the Midlands. Local libraries cannot afford to do what I am doing, and are unlikely to be able to do so in the foreseeable future because of budget cuts caused by the current economic circumstances.

I could react in two ways. One is to pursue people who breach copyright through the courts, but this is costly and, in any event, is no way to treat customers. The other is to change what I do to limit the risks, and I have already made a few changes. For example, because of the risk of piracy I do not offer the electoral roll on CD. I probably should also increase security and require everyone to accept a long-winded set of "Terms and Conditions", but in truth, I can't be bothered. Anyone who is of a mind to steal the information will do so. Perhaps I should increase prices to compensate for "leakage". None of which is in line with my aim to offer exemplary customer service.

The real risk is to the future of the web site. If revenue is reduced through copyright breaches then the opportunities for future investment are reduced, and ultimately the site will fail. I think that would be a shame - and actually, anyone who is doing look ups tacitly agrees, since the implication of this activity is that the information is useful.

Consequently, I'd like to ask anyone who is involved in doing look ups for others on my web site not to do so - please. It may seem harmless, and a service to others, but it isn't.

Monday 27 April 2009

What's in the digital library?

I have included printed material from several sources in the digital library that is generally not available on-line elsewhere. This includes local history books, directories, military histories, electoral roll and a few indexes produced by family history societies and others.

What is the value of these items for family and local history research? I shall be exploring that in more detail in future blogs. But in summary, these sources will help to fill in the details and provide more background information for research.

Take directories as an example. They list residents and businesses, and often contain descriptions of a town, its notable local events and its societies and associations. As such they are a valuable insight into local life, culture and commerce.

They started in the 17th century, initially as listings of businesses. By the mid 19th century companies like Kelly's were providing more comprehensive coverage, including some individuals, but the evidence is that coverage was best in the urban centres, and rather sparse in the rural areas. Indeed, Kelly's started in London and owed its existence to work done by the Post Office to list postal addresses for London residents. Frederick Kelly, who was at the time chief inspector of inland letter carriers for the Post Office, bought the copyright for the directories in 1835 and developed his business from there.

Initially, Kelly used Post Office employees to gather information on residents and businesses, but this led to complaints of unfair competition from other directory publishers. Latterly he employed his own canvassers who provided the information.

As more local directories were produced, and the electoral register expanded to cover a greater proportion of the population, so the register was used as an important information source for directories.

What does all this mean for historical research? It means that directories are a useful source of information if you can find what you want, but that they are not complete information sources. Some research has been done on this subject, and this shows that, for example in 1890, coverage may only have been about 60% of local residents in comparison with the census of the same period.

I hope to provide more insights into the quality of the resources in my digital library to assist historical research.

Friday 24 April 2009

What's in the name Midlands Historical Data?

The name Midlands Historical Data - of this blog and of my web site - was chosen to reflect my project aims.

Firstly, Midlands. I only plan to include in the digital library local history books and directories from the local area of the West Midlands, by which I mean the "old" counties of Staffordshire, Shropshire, Warwickshire and Worcestershire. I want to provide depth rather than skate across the surface, as many other projects are obliged to do for budgetary reasons.

Secondly, Historical. The books I have scanned so far are mostly history books which are out of copyright and relatively rare. By producing digital copies and putting them on the web they can be shared between locations.

Thirdly, Data. Finding what you want from the 650 books on the site requires that they be turned into useable data. We are used to the miracle of Google, but someone somewhere has to create the original data. What I have done is use existing technologies to convert images of books into searchable text.

The computer process that achieves that conversion is at best 99% accurate - and for old texts it can be as low as 70% accurate. However, I took the view that some index is better than none, and discovered that where a series of books can be scanned and indexed (for example, an annual series of local directories), the chances of finding a piece of research data - say, a person or a place - increase. Also, where the format of the data is known, as it is in an Electoral Register, software can be developed to improve the quality of the resulting index.
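As a rough illustration of why a series of volumes helps, here is a minimal sketch in Python. It rests on assumptions that are not measurements from this project: that the accuracy figure can be read as the chance a given name is recognised correctly in one volume, that the person appears in every volume of the series, and that recognition errors in different volumes are independent. The function names are hypothetical, and the fuzzy matching uses Python's standard difflib module as one possible way of tolerating recognition errors, not necessarily the method used on the site.

```python
from difflib import get_close_matches


def chance_of_at_least_one_hit(per_volume_accuracy: float, volumes: int) -> float:
    """Chance that a name is indexed correctly in at least one volume.

    Assumes the name appears in every volume and that recognition
    errors are independent between volumes (a simplification).
    """
    return 1 - (1 - per_volume_accuracy) ** volumes


def fuzzy_lookup(name, indexed_words, cutoff=0.6):
    """Return indexed words that resemble `name`, to catch misread letters.

    `cutoff` is an illustrative similarity threshold between 0 and 1.
    """
    return get_close_matches(name, indexed_words, n=5, cutoff=cutoff)


if __name__ == "__main__":
    # With 70% accuracy per volume, more volumes give more chances of a hit.
    for volumes in (1, 3, 5):
        chance = chance_of_at_least_one_hit(0.70, volumes)
        print(f"{volumes} volume(s): {chance:.3f}")

    # "Srnith" is a typical misreading of "Smith" ("m" read as "rn").
    print(fuzzy_lookup("Smith", ["Srnith", "Jones"]))
```

Under these assumptions a single volume at 70% accuracy gives a 70% chance of finding the name, while three volumes give about 97%. Real errors are unlikely to be fully independent (a faded typeface affects every page of a book), so the true gain will be smaller.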

The project aims can thus be summarised as adding value to the original text by making it accessible and searchable - and by having enough books in the library to provide the detail that history researchers need.

Why I'm blogging

I decided to start this blog to create a forum to discuss the digital library I have created. I hope it's a regional resource which adds to the national web sites providing information for family and local historians.

I started scanning local history books and directories in March 2003, and to date have scanned about 750 books. All are on my web site as searchable images.

About two years ago I started on the Birmingham Electoral Roll. I have now completed the Electoral registers for 1920, 1925, 1930, 1935, 1939 and 1945. I have also created an index which is searchable from my web site.

In parallel, I have worked with Family History Societies who wanted to scan material and create and publish indexes.

Where now? Is it useful? Time will tell.
