Thursday, September 25, 2008

Constructing a Search Strategy #2 - Boolean Operators

Computers are basically giant calculators. Databases and search engines can't think of words the same way we do - as signs representing meanings. They have to think of a word as a thing that can be used in sums.

An illustration I like using is to think about a deck of cards. You can tell the computer to find all of the black cards and it will. You can tell it to find all of the hearts, and it will. You can tell it to find all of the Jacks, and it will - but you can't tell it to find a royal flush, because then it has to think about how the cards relate to each other. Unless you tell the computer exactly which cards are in a royal flush, it won't be able to find one for you.
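To make the card illustration concrete, here is a rough sketch in Python. The deck, the variable names and the way a royal flush is spelled out are all my own assumptions for the sake of the example, not something a real database does. The point is that each simple question is a plain filter, while the royal flush only becomes findable once you list exactly which cards count.

```python
from itertools import product

# Build an ordinary 52-card deck as (rank, suit) pairs.
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10",
         "Jack", "Queen", "King", "Ace"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(RANKS, SUITS))

# Simple questions: each one checks a single property of a single card.
black_cards = [card for card in deck if card[1] in ("clubs", "spades")]
hearts = [card for card in deck if card[1] == "hearts"]
jacks = [card for card in deck if card[0] == "Jack"]

# "Find a royal flush" means nothing to the computer until we say
# exactly which cards are in one (here, the royal flush in hearts).
ROYAL_RANKS = {"10", "Jack", "Queen", "King", "Ace"}
royal_flush_in_hearts = [
    card for card in deck
    if card[1] == "hearts" and card[0] in ROYAL_RANKS
]

print(len(black_cards), len(hearts), len(jacks))  # 26 13 4
print(royal_flush_in_hearts)
```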

This is where Boolean Operators come in. They act as commands telling the computer how the words relate to each other and what should be done about them.

The most common Boolean operators are AND, OR and NOT.

AND tells the database to look specifically for documents that have both of the terms you are searching for. For example, if you ran a search for "rotator cuff", you might get several hundred results. If you ran a search for "injury", you might get several thousand. Most of those articles would be completely useless for an assignment about rotator cuff injuries. However, if you searched for "rotator cuff" AND "injury", the database would only return search results with both terms. It would not find any articles about rotator cuffs that did not include the word "injury", nor would it return any articles about injuries that did not mention "rotator cuff". This could cut the results down from thousands to only a hundred or so. Any other terms you added with an AND would cut the number of results even further.
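If it helps to see AND as one of the 'sums' the computer does, here is a minimal sketch in Python. The four article titles and the matching() helper are invented for illustration, and a real database will match terms in a more sophisticated way, but the idea is the same: each term gives a set of results, and AND keeps only what is in both sets.

```python
# A tiny, made-up collection of article titles keyed by an id number.
articles = {
    1: "Rotator cuff injury rates in swimmers",
    2: "Anatomy of the rotator cuff",
    3: "Knee injury prevention in runners",
    4: "Shoulder injury in cricket bowlers",
}

def matching(term):
    """Return the ids of articles whose title contains the term."""
    return {key for key, title in articles.items()
            if term.lower() in title.lower()}

rotator_cuff = matching("rotator cuff")  # {1, 2}
injury = matching("injury")              # {1, 3, 4}

# AND is a set intersection: only articles found by BOTH searches survive.
both = rotator_cuff & injury
print(both)  # {1}
```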

But what if there was a brilliant journal article that consistently used the word "shoulder" instead of "rotator cuff"? The search above wouldn't find it, because the article would be missing one of the essential search terms.

This is where OR comes in.

You use OR to tell the database or search engine that these words are interchangeable, and you are happy to accept any article with either (or both). So, ("rotator cuff" OR "shoulder") AND "injury" should pick up any articles that use "shoulder" instead of "rotator cuff", and make sure the word "injury" is included. The brackets in this case are just like the brackets in a mathematical problem - they help the computer to know which 'sums' to do first in the 'equation'.
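Here is a small sketch in Python of why the brackets matter. The article titles and the matching() helper are made up for the example. In Python, | plays the part of OR and & plays the part of AND on sets. Python happens to do & before |, and a real database may have its own order, which is one more reason to spell it out with brackets.

```python
# A tiny, made-up collection of article titles keyed by an id number.
articles = {
    1: "Rotator cuff injury in swimmers",
    2: "Shoulder injury rehabilitation",
    3: "Anatomy of the rotator cuff",
    4: "Knee injury in runners",
}

def matching(term):
    """Return the ids of articles whose title contains the term."""
    return {key for key, title in articles.items()
            if term.lower() in title.lower()}

rotator_cuff = matching("rotator cuff")  # {1, 3}
shoulder = matching("shoulder")          # {2}
injury = matching("injury")              # {1, 2, 4}

# ("rotator cuff" OR "shoulder") AND "injury"
with_brackets = (rotator_cuff | shoulder) & injury
print(with_brackets)  # {1, 2}

# Without brackets Python reads this as
# "rotator cuff" OR ("shoulder" AND "injury"),
# so the anatomy article (3) sneaks in with no mention of injury.
without_brackets = rotator_cuff | shoulder & injury
print(without_brackets)  # {1, 2, 3}
```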

OR can also help with controlling your search. Say you particularly wanted to know about rotator cuff injuries in sports that involve over-arm bowling. The most obvious ones are cricket and baseball, so you could run a search that looked like this: ("rotator cuff" OR "shoulder") AND ("cricket" OR "baseball") AND "injury".

You will probably find the thousands of results you had to start with have now been whittled down to twenty or so.

Now, if you found you were constantly getting articles about lacrosse, and you wanted to get rid of them from your search results, you would think about using NOT.

NOT basically tells the database or search engine to shut out any articles containing those terms. So, adding NOT "lacrosse" to your search would get rid of any articles containing the word "lacrosse".

Of course, there might be a really good article that uses the word "lacrosse" in one sentence, and the database will reject it just like all of the other lacrosse-related articles. You should think carefully about using NOT in your searches. It can be very useful for getting rid of 'noise', but it can have its downside.
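The downside of NOT is easy to see in a small sketch. Again, this is Python with an invented handful of articles and a made-up matching() helper, not how any particular database works internally. NOT behaves like taking one set of results away from another, and it takes away everything that mentions the word, however briefly.

```python
# A tiny, made-up collection of article descriptions keyed by an id number.
articles = {
    1: "Shoulder injury in baseball pitchers",
    2: "Lacrosse and baseball shoulder injury rates compared",
    3: "Rotator cuff injury in cricket bowlers, with one sentence on lacrosse",
    4: "Knee injury in cricket",
}

def matching(term):
    """Return the ids of articles whose text contains the term."""
    return {key for key, text in articles.items()
            if term.lower() in text.lower()}

# ("rotator cuff" OR "shoulder") AND ("cricket" OR "baseball") AND "injury"
results = (
    (matching("rotator cuff") | matching("shoulder"))
    & (matching("cricket") | matching("baseball"))
    & matching("injury")
)
print(results)  # {1, 2, 3}

# NOT "lacrosse" is a set difference: remove anything mentioning lacrosse.
filtered = results - matching("lacrosse")
print(filtered)  # {1}

# Article 3 is really about cricket, but one passing mention of
# lacrosse was enough to throw it out along with article 2.
```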

There are other Boolean Operators that you can use in different databases and search engines, but these three are standard for almost all of them. You should check the database guides for the particular database you want to use to see what other operators it offers you.