Searching for Information: Concepts and Tools
Whether you seek information for research or for practical purposes, you need to be sure that you have the most applicable and correct data possible. To achieve those goals, it is essential that you understand how to search for the data and how to evaluate it once it has been found.
It is also important to know when it is appropriate to search textbooks, journal articles, websites, or a combination thereof. To construct a well-defined search query within any of these sources, you should first understand the concepts underlying classification systems, searching terminology and tools, and the critical evaluation of search results.
RESOURCE CLASSIFICATION
ELECTRONIC SEARCHING
Terminology
Subject searching (Controlled vocabularies)
Keyword searching
Tools
Search Operators (Boolean Operators)
Positional Operators
Truncation
Nesting
Qualification
Limiting
Stop Words
Evaluation
FINAL CONSIDERATIONS
RESOURCE CLASSIFICATION
Many academic libraries, including the Christopher Center, use the Library of Congress (LC) classification scheme to assign call (or location) numbers in their collections. A general understanding of how the LC classification scheme is arranged is helpful in locating information resources. The LC classification designates twenty-one categories for different subject areas. Broad subject categories begin with a single letter chosen from A-Z (except for I, O, W, X, and Y). One or two additional letters and a set of numbers are added for sub-categorization. Many newer books also display the publication year as the last segment of the call number.
In Christopher Center, the LC scheme is applied to all general stacks and Reference Collection resources. The Government Documents resources are classified under a different system that is detailed within that part of the library.
Knowing the call number of an item can help in browsing that section of the collection for further resources not initially identified. For example, religion resources are located within the BL - BX shelves. Further knowledge of what subjects the numbers connote can lead to more detailed browsing, such as knowing that the English Bible texts section is BS 135 - BS 198, while critical writings about the Bible are located in the BS 500 - BS 534 shelves. Specific call numbers for books are found through searching GALILEO, the University Libraries' catalog.
Even so, information on a specific topic can be scattered throughout the collection. This is when a broad understanding of how the classification system works can be especially useful. For example, in researching the topic of ethical practices and how they relate to medicine, it is useful to start by browsing three areas: "R 723-726" Medical Philosophy, Medical Ethics section; "KF 3821-3829" Law of the United States, Medical Legislation section; and "QH 332" General Biology, Methods of Research, Techniques section.
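The browsing ranges quoted above can be sketched as a small lookup. This is only an illustration: the helper names (`parse_call_number`, `browsing_areas`) are hypothetical, the parser handles just the leading letters and number of a call number, and the table contains only the ranges mentioned in this section.

```python
import re

# Ranges quoted in the text above: (class letters, low number, high number).
BROWSING_RANGES = {
    "English Bible texts": ("BS", 135, 198),
    "Critical writings about the Bible": ("BS", 500, 534),
    "Medical philosophy and ethics": ("R", 723, 726),
    "Medical legislation (U.S. law)": ("KF", 3821, 3829),
    "General biology, research techniques": ("QH", 332, 332),
}

def parse_call_number(call_number):
    """Split a call number such as 'BS 185' into ('BS', 185.0).

    Simplified assumption: only the leading letters and the first
    number are read; anything after them is ignored.
    """
    match = re.match(r"\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)", call_number.upper())
    if match is None:
        raise ValueError(f"not a recognizable call number: {call_number!r}")
    return match.group(1), float(match.group(2))

def browsing_areas(call_number):
    """Return the labels of every quoted range that contains the call number."""
    letters, number = parse_call_number(call_number)
    return [
        label
        for label, (cls, low, high) in BROWSING_RANGES.items()
        if letters == cls and low <= number <= high
    ]

print(browsing_areas("BS 185"))   # ['English Bible texts']
print(browsing_areas("KF 3825"))  # ['Medical legislation (U.S. law)']
print(browsing_areas("QA 76"))    # []
```

A real shelf list would need the full LC schedules; the point here is only that the letters select the subject class and the number narrows it.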
ELECTRONIC SEARCHING
Not all databases are created equal...or even by the same rule book!
The search process can be frustrating or overwhelming when your search strategy retrieves records that are too few, too many, or not in your desired subject matter. Understanding which "vocabularies" you are working within and which tools will help you further define your strategies can make a world of difference in successful record retrieval.
There are two basic types of search queries that most databases use: Controlled Vocabulary and Keyword. Each has its own benefits and constraints, but often, the key step in successful searching is knowing (or choosing) which type of query you are working within.
Subject Searching (Controlled Vocabularies)
Controlled vocabularies are standardized, hierarchical lists which have been designated by a database to represent the major subject concepts and conditions contained within that database. They can change from database to database. The hierarchical nature of the lists benefits search strategies by allowing broad concepts to be narrowed in a manner that stays consistent within that framework.
Before an item is added to a database or catalog, its subject matter (main and peripheral) is determined. Specific terms that apply to those determined subjects will be chosen from a pre-determined list, no matter what terminology the author used within the item. This way, there is a consistent method for retrieving the same information concepts even though different terminology has been used. The listing is standardized and somewhat predictable. For example, the term "heart attack" is always listed as "myocardial infarction" within a controlled vocabulary structure.
This type of searching is called "subject searching" and, unless noted otherwise, will display only those records that match exactly to those terms you have entered within the manner that the database has been set up. For example, GALILEO searches only the "subject" portion of each citation record in its database. The trick is to be able to determine within that database what terms have been chosen to represent the subject matter you want to find. That's when knowing more about the different systems can be of great help.
Library of Congress Subject Headings
Many academic and health science libraries (including Christopher Center) use the Library of Congress Subject Headings. This is a controlled vocabulary system whose terms are listed in the large red books that are close to many information and/or reference desks (Christopher Center's copies are located in the Reference Area). The terms can also be seen in the "Subject" area of any record that is retrieved. By finding an item that matches your search need, checking the "Subject" area of the record, and then using that exact wording and punctuation to enter a new subject search, you can find many relevant records much more quickly. Using GALILEO, pick a topic you are interested in, do a keyword search, then examine the "Subject" area within the citation record to see an example of this.
Many other databases used by Christopher Center also have their own controlled vocabularies, contained in books in the Reference Area. Some of them are as follows:
CINAHL (nursing database) - CINAHL Subject Headings
EI Compendex (engineering database) - EI Thesaurus
ERIC (education database) - Thesaurus of ERIC Descriptors
MEDLINE (medical database) - MeSH: Medical Subject Headings
PsycInfo (psychology database) - Thesaurus of Psychological Terms
ATLA Religion Index (religion database) - Religion Indexes Thesaurus
Some databases include their thesauri online within the database. Other databases use the option "Word Lists" in addition to, or in place of, thesauri. They are not the same tool.
Keyword
Keyword searching, also called textword searching in some databases, allows you to enter a search term that you believe best describes the term as it is used in information source records. While this search strategy will retrieve what you've entered, you also need to search using synonyms and variations of the search term to make sure that you have retrieved all of the relevant records.
For example, if you are looking for information on "heart attack" using a textword search, you also need to search using the terms "heart attacks," "myocardial infarction," "myocardial infarctions" and so on.
Keyword searches are often best when searching full text in addition to searching the information source record. In GALILEO, keyword searching looks at the title, content notes, and corporate author areas of each citation record.
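The difference between the two query types can be illustrated with a toy catalog. Everything here is invented for the illustration: the records, the field names, and the two helper functions. The keyword search uses simple substring matching, which is more forgiving than most real databases because it happens to catch plurals.

```python
# Toy records: the title keeps the author's wording, while the subject
# field holds the controlled term chosen by the indexer (invented data).
RECORDS = [
    {"title": "Recovering from a heart attack", "subject": "myocardial infarction"},
    {"title": "Myocardial infarctions in young adults", "subject": "myocardial infarction"},
    {"title": "Coronary events and diet", "subject": "myocardial infarction"},
    {"title": "Jazz and the blues tradition", "subject": "jazz"},
]

def subject_search(term):
    """Exact match against the controlled subject field."""
    return [r["title"] for r in RECORDS if r["subject"] == term.lower()]

def keyword_search(*phrases):
    """Match any of the given phrases anywhere in the title."""
    wanted = [p.lower() for p in phrases]
    return [r["title"] for r in RECORDS
            if any(p in r["title"].lower() for p in wanted)]

# One controlled term retrieves every record on the concept.
print(len(subject_search("myocardial infarction")))                   # 3
# A single keyword retrieves only titles that use that wording.
print(len(keyword_search("heart attack")))                            # 1
# Adding a synonym helps, but a differently worded title is still missed.
print(len(keyword_search("heart attack", "myocardial infarction")))   # 2
```

The third title never uses either phrase, so only the subject search finds it, which is the practical argument for checking the controlled vocabulary.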
Database Differences
Some sophisticated natural language interfaces use an algorithm to compare your entered search terms with common variations (such as plurals) within the entire database. This is termed "concept matching."
As previously mentioned, different databases have different vocabulary structures that can necessitate changes in your search strategies. Look in the HELP files of each database to determine how it is structured. For example, a search conducted in the PsycInfo database may use a different subject term than a search conducted in the MEDLINE database, even though both are science-related databases. If you carry the first database's term over unchanged, your search "fails" and you are mistakenly led to believe that the second database contains no information that applies to your search query.
Tools
Used primarily with keyword searches, search tools are used to refine search strategies. While their concepts remain constant across various applications and resources, some electronic databases don't support all of them or their symbols may be slightly different. To make the best use of powerful search strategies, it is essential to check the database's HELP files to make sure the tools are available and that the chosen symbols are entered correctly.
Search Operators, aka Boolean Operators: AND, OR, NOT
Search operators are words that specify the relationship between two or more search terms. Search terms can be linked in a number of ways:
AND
This narrows a search. Both search terms must be found somewhere in the record, though not necessarily in the same place, such as part of an author's name and part of a title. GALILEO's default operator between search terms is "and".
In GALILEO, try a keyword search of "jazz and blues" to see what records are retrieved.
OR
This broadens a search. At least one of the search terms must be found somewhere in the record. This is especially helpful when searching synonyms or words with various forms. Most Internet search engines' default operator between search terms is "or"... a primary reason a search can retrieve so many irrelevant records.
In GALILEO, try a keyword search of "jazz or blues" to see what records are retrieved.
NOT
This also narrows a search. A record can be retrieved only if the first search term is present and the second search term is not present.
In GALILEO, try a keyword search of "jazz not blues" to see what records are retrieved.
Some database software works best for these strategies when you first do a search for each search term to get the "mapping" strategy in place. Following this, you can then use a combination of the search sets retrieved with your search operator(s) of choice to further delineate your search.
Search operators can also be used with Internet search engines. It is essential to check the HELP files of your search engine of choice for its specifics concerning search operators: some require that the words be in all capital letters, some accept two of the operators but not the third, and some require the combination "and not" for the "not" operator.
Most will honor the use of the plus " + " and minus " - " signs as shortcuts for the words "and" and "not." Precede a required word or phrase in the search query with a " + " and a prohibited word or phrase with a " - " . The use of these symbols in combination within one search query is also acceptable.
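The three operators correspond to set intersection, union, and difference. The sketch below shows this on four invented records; the `ids_containing` helper is hypothetical and matches whole words only, which is a simplification of what a real catalog does.

```python
# Invented records, keyed by a record number.
RECORDS = {
    1: "jazz piano standards",
    2: "delta blues guitar",
    3: "jazz and blues of new orleans",
    4: "baroque chamber music",
}

def ids_containing(word):
    """Return the set of record numbers whose text contains the word."""
    return {rid for rid, text in RECORDS.items() if word in text.split()}

jazz = ids_containing("jazz")    # {1, 3}
blues = ids_containing("blues")  # {2, 3}

print(sorted(jazz & blues))  # AND: both terms present      -> [3]
print(sorted(jazz | blues))  # OR: at least one term present -> [1, 2, 3]
print(sorted(jazz - blues))  # NOT: first term without second -> [1]
```

Building one set per term first and then combining the sets mirrors the approach described above, where each term is searched on its own and the resulting sets are combined afterwards.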
Positional Operators: ADJ, NEAR, SAME
Positional operators are used to specify the proximity of search terms to one another. The most commonly used are "ADJ" (adjacent), "NEAR," and "SAME." Usually, ADJ and NEAR can be grouped with a number to specify how many words can appear between the search terms. SAME designates that the search terms are found in the same part of the information record (e.g. the title or the abstract). Order is not designated with SAME, though, and the search terms won't necessarily be next to one another.
Currently, GALILEO does not have this capability. Some Internet search engines, such as AltaVista, support the "near" proximity option in their advanced search strategies.
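A proximity test can be sketched by comparing word positions. The `near` function below is a hypothetical illustration, not any database's implementation. It assumes that `max_between` counts the words allowed between the two terms and that an adjacency-style search also fixes their order; actual databases differ on both points, so check the HELP files.

```python
def positions(words, term):
    """Return every index at which the term occurs in the word list."""
    return [i for i, w in enumerate(words) if w == term]

def near(text, first, second, max_between=0, ordered=False):
    """True if the two terms occur with at most max_between words between them.

    With ordered=True, the first term must also come before the second.
    """
    words = text.lower().split()
    for i in positions(words, first):
        for j in positions(words, second):
            if i == j:
                continue
            gap = abs(j - i) - 1  # number of words between the two terms
            if gap <= max_between and (j > i or not ordered):
                return True
    return False

title = "abuse of alcohol among students"
print(near(title, "alcohol", "abuse"))                               # False
print(near(title, "alcohol", "abuse", max_between=1))                # True
print(near(title, "alcohol", "abuse", max_between=1, ordered=True))  # False
```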
Truncation
Truncation retrieves records containing search terms that share a common root. In each database, a symbol of some sort (such as a colon, a question mark, an asterisk, a dollar sign or a pound sign) is placed at the end of the group of letters forming the root search term. Check the HELP files of the chosen database to see which symbol is used. Use the longest root you can to increase the accuracy of the search.
Within GALILEO, the truncation symbol is the asterisk " * " . Try a keyword search using "psych*" to see how those records are displayed and what suggestions you are given.
Other databases may also use "wildcards" - symbols which take the place of one or more letters in a search term. Check the database's HELP section to see if and how wildcards are used.
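Truncation amounts to a prefix match on the root. The sketch below assumes the asterisk as the truncation symbol and uses an invented word list; `truncation_search` is a hypothetical helper, not a catalog function.

```python
def truncation_search(query, vocabulary):
    """Return the words matching the query.

    A trailing '*' means 'any word starting with this root';
    without it, only an exact match counts.
    """
    query = query.lower()
    if query.endswith("*"):
        root = query[:-1]
        return sorted(w for w in vocabulary if w.lower().startswith(root))
    return sorted(w for w in vocabulary if w.lower() == query)

WORDS = ["psyche", "psychiatry", "psychology", "psychic", "physics", "psalm"]

print(truncation_search("psych*", WORDS))
# ['psyche', 'psychiatry', 'psychic', 'psychology']
print(truncation_search("psychol*", WORDS))
# ['psychology']
```

The second query shows why a longer root is more accurate: it drops the unrelated words that the shorter root pulled in.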
Nesting
When a search strategy contains two or more operator tools, parentheses are used to tell the database which search terms are grouped together. This is called nesting. As with elementary algebra problems, the database will first process the information within the parentheses and then apply that result in processing the information outside the parentheses.
In GALILEO, try the keyword search query: "(drug or alcohol) and abuse" to see how the results display.
Another form of nesting, the use of quotation marks around a phrase or proper noun, allows the database to search for the term/s exactly as entered. While GALILEO doesn't currently support this option, it is becoming a useful option with many other databases and Internet search tools.
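The effect of the parentheses can be seen by evaluating the same three terms with two different groupings. The records and the `ids_containing` helper are invented for this illustration.

```python
# Invented records, keyed by a record number.
RECORDS = {
    1: "drug abuse prevention",
    2: "alcohol abuse among students",
    3: "drug interactions handbook",
    4: "child abuse reporting",
}

def ids_containing(word):
    """Return the set of record numbers whose text contains the word."""
    return {rid for rid, text in RECORDS.items() if word in text.split()}

drug = ids_containing("drug")        # {1, 3}
alcohol = ids_containing("alcohol")  # {2}
abuse = ids_containing("abuse")      # {1, 2, 4}

# (drug or alcohol) and abuse
print(sorted((drug | alcohol) & abuse))  # [1, 2]

# drug or (alcohol and abuse)
print(sorted(drug | (alcohol & abuse)))  # [1, 2, 3]
```

Only the first grouping excludes the handbook on drug interactions, which has nothing to do with abuse.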
Qualification
Qualification allows a search strategy to designate where the search terms are to be found in the record. The most common qualifiers are author (au), title (ti), subject (su), publisher (varies), and publication date/year (varies).
The placement of punctuation can be important when qualifying search terms, so check each system to learn its preferred command format.
GALILEO allows you to qualify much of your searching at its main menu screen. Other databases may require you to choose their advanced search method to use these options.
Limiting
Within GALILEO, and many other databases, the software interface allows you to apply limits to the fields that are searched within the document, thereby filtering unwanted documents from a set that has already been retrieved. Limits most often are additional search terms, material language, material/publication type, and publication year/date. Limiting will create a new set, drawn from the original set, that now has more clearly defined criteria. GALILEO's software allows you to apply consecutive limits, which further narrows each subset.
In GALILEO, construct a keyword search query using "shakespeare". Then choose "Limit this Search." Now choose "Words in the subject" and type "criticism". If you'd like to limit even further, once again choose the "Words in the subject" limit and type "poetry".
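Consecutive limits can be pictured as filters applied one after another, each working on the set left by the previous step. The records, field names, and helper functions below are invented for the illustration and do not reflect how GALILEO stores its data.

```python
# Invented records with a title and a list of subject words.
RECORDS = [
    {"title": "Shakespeare's sonnets: critical essays", "subject": "poetry criticism"},
    {"title": "Shakespeare on film", "subject": "film adaptations"},
    {"title": "Shakespeare and his critics", "subject": "drama criticism"},
    {"title": "Modern poetry", "subject": "poetry criticism"},
]

def keyword(records, word):
    """Initial search: keep records whose title contains the word."""
    return [r for r in records if word.lower() in r["title"].lower()]

def limit(records, field, word):
    """Limit an existing set: keep records whose field contains the word."""
    return [r for r in records if word.lower() in r[field].lower().split()]

first_set = keyword(RECORDS, "shakespeare")           # 3 records
second_set = limit(first_set, "subject", "criticism")  # 2 records
third_set = limit(second_set, "subject", "poetry")     # 1 record

print(len(first_set), len(second_set), len(third_set))  # 3 2 1
print(third_set[0]["title"])  # Shakespeare's sonnets: critical essays
```

Each set is drawn from the one before it, so "Modern poetry" never reappears even though it matches both subject limits.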
Stop Words
Stop words are words that are so commonly used in records that they are a hindrance to accurate record retrieval. These words are usually articles, conjunctions, prepositions, and pronouns. Most databases show "failed search" results of some nature when stop words are used in a keyword search strategy. Title searches ignore initial articles (such as those listed here as stop words) as well as their foreign equivalents, such as "der," "das," and "les," to name a few. While these designated words can vary from database to database, the most common ones to avoid are: a, an, and, for, in, of, the, this, to.
Each database will have its own list of stop words, which is accessible through the HELP file. The lists can be extensive, but they are very useful to know about when searching electronically.
Within GALILEO, try both a keyword search and a title search for the book "For Better, For Worse" to see how the different search queries are treated.
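The two treatments described in this section can be sketched as follows. The stop word list is the one quoted above; the list of initial articles and both function names are assumptions made for the illustration, and a real database's lists will be longer.

```python
import re

# The common stop words quoted in the text above.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "this", "to"}

# Assumed sample of initial articles that a title search skips.
INITIAL_ARTICLES = {"a", "an", "the", "der", "das", "les"}

def keyword_terms(query):
    """Return the query words that remain once stop words are dropped."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    return [w for w in words if w not in STOP_WORDS]

def title_key(title):
    """Drop a single leading article, as a title search would."""
    words = title.split()
    if words and words[0].lower() in INITIAL_ARTICLES:
        words = words[1:]
    return " ".join(words)

print(keyword_terms("For Better, For Worse"))  # ['better', 'worse']
print(keyword_terms("to the"))                 # []
print(title_key("The Tempest"))                # Tempest
print(title_key("For Better, For Worse"))      # For Better, For Worse
```

A query made up entirely of stop words leaves nothing to search on, which is one way a "failed search" can arise. The title search keeps "For" because it is not an article.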
Evaluation
Measurements of Search Results
Along with the application of search tools, the measurements of recall and precision provide conceptual baselines for refining your search strategies. The successful integration of these measurements into your search strategies can result in retrieved records that match more closely the desired focus and number that you feel are appropriate.
Recall
When you enter a search term, the concept of recall describes the broad "catch" of retrieved records. Those records, usually high in number, will display a wide range of what has been classified as related records. Recall can also be termed "sensitivity" in some databases.
Recall is a strategy best used when the information you are seeking is uniquely detailed, a new topic or procedure, or hasn't been widely written about in the literature. By gathering a large number of records, you can have some assurance that the information you seek is somewhere within them, even though you may have many records that aren't as relevant.
Precision
When a search term displays a precise retrieval, the records are usually fewer in number but more narrowly match the entered search terms. Precision can also be termed "specificity" in some databases.
This search strategy is best to use initially when the information you are seeking has been written about in a number of authoritative sources by a number of knowledgeable people. You can narrow your search without undue concern over the loss of some relevant records. It is likely that you have already gathered that information somewhere in your retrieved records.
Another Approach
A good technique, especially when your search strategy skills are not yet finely tuned, is to combine these concepts. Begin with a recall strategy, which will retrieve many records, and then use precision to narrow the search strategy within those initially recalled records.
Please note: within MEDLINE and other health-related databases, sensitivity and specificity are measurements within epidemiological searches and may have different connotations than noted here.
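Recall and precision are described informally in this section. In the standard information-retrieval definitions, which are an addition here rather than something this guide spells out, recall is the share of all relevant records that a search retrieved, and precision is the share of retrieved records that are relevant. The record numbers below are invented.

```python
def recall(retrieved, relevant):
    """Share of the relevant records that the search retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

def precision(retrieved, relevant):
    """Share of the retrieved records that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

relevant = {1, 2, 3, 4}                    # records that truly fit the topic
broad_search = {1, 2, 3, 4, 5, 6, 7, 8}    # wide catch, many extras
narrow_search = {1, 2}                     # tight match, some records missed

print(recall(broad_search, relevant), precision(broad_search, relevant))
# 1.0 0.5
print(recall(narrow_search, relevant), precision(narrow_search, relevant))
# 0.5 1.0
```

The broad search finds everything relevant at the cost of extra records to sift through, while the narrow search returns only relevant records but misses half of them. Starting broad and then narrowing within the retrieved set is the combined approach suggested above.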
Critical Analysis of Search Results
Once you have gathered information that you feel is applicable to your subject, it's important to remember that you aren't finished yet. Many people quit at this point, satisfied that the information they have gathered will "somehow" work within their needs.
Make sure that your gathered information meets evaluative source criteria such as knowledgeable authorship, respected publisher and publication, and a current date of publication (if applicable). Then evaluate the content, judging it on criteria such as intended audience, depth of coverage, objectivity, and related reviews. For an excellent, in-depth resource on evaluative criteria, see Christopher Center's Evaluating the Quality of World Wide Web Resources.
Also, go back to your original focus and search strategy, making sure that this information meets those established components. If it doesn't, figure out why that change has occurred and then decide if you need either to redo your search strategy or to change your original focus based on your critically evaluated findings.
FINAL CONSIDERATIONS
* Consider the time you will spend using electronic resources when traditional print materials may be easier to access and to use.
* Know the parameters of each database that you will be using. Use the HELP sections to learn about, and to keep up with, the changes that these resources undergo.
* Get familiar with, and use, multiple databases for all of your research strategies. No matter what the advertising says, each database contains only a small portion of our accumulated information. Make your research as comprehensive as possible.
Useful Sites for Internet Use and Evaluation
- Spider's Apprentice - Search Engine Tips, Information, Strategies
http://www.monash.com/spidap.html
- BARE BONES 101: A Basic Tutorial on Searching the Web
http://www.sc.edu/beaufort/library/pages/bones/bones.shtml
- Evaluating the Quality of World Wide Web Resources
http://www.valpo.edu/library/evaluation.html
