If you ever want to screw over a library, just walk up to any shelf,pick up any book, and put it on another shelf where it doesn’t belong.
Eventually a librarian will stumble across it,see that it’s out of place, and pull it off the shelf.Until then, that book is hopelessly lost.It might as well be on the surface of the moon.Finding a needle in a haystack is easy.When it comes down to it, needles are not hay.Finding one book lost among a million other books?That’s hard.
The only reason such a thing as a library is possible is that it isa gigantic, life-sized, walk-in data structure, tuned for fast lookup.
This post is about searching and sorting,two fundamental aspects of data processing,and what the library has to teach us about them.
Suppose you have a sorted array,
books = [ "001.4 Graphs", "001.56 Symbols", ...millions of strings... "998.9 Antarctica" ]
and you want to find the position of a given value in it.
How do you do that? Well, there’s a method for that:
But it checks every item in the array, one by one. That could take a while. Is there a better way?
Binary search is the right answer.
And this algorithm is proven optimal.
But if the sorted array looks like this
and the item you’re looking for is something like this
then it doesn’t feel optimal.
The orange path is the path you would take doing binary search. The purple path is what you want to do.
What’s going on here? Why is this optimal algorithm so bad in real life?
Is it because humans are dumb and slow and make mistakes, whereas computers are smart and fast and dependable, so what works for a computer won’t necessarily work for a human?
I don’t think so, because if I were programming a robot to go get books off the shelves, I wouldn’t have the robot do a binary search either. A robot doing binary search to find a book would look just as dumb as I would. It really is the wrong algorithm somehow.
Is it because we’re not actually searching a sorted array using comparisons? Well, no, that’s exactly what we’re doing.
So, I thought about this, and the best answer I could come up with is that the cost model is wrong.
Binary search does an optimal number of comparisons. It’s only the best algorithm if comparisons are the most significant cost.
In a library, the books are physically spread out. Most of the time it takes to find a book is not comparisons. It’s walking. And this is exactly why binary search feels so dumb: the algorithm makes no effort to minimize your walking.
There is another factor, which is that librarians do something very clever. They cheat. Or rather, they change the problem.
They put up these signs.
It’s just a little extra information. You can look at these signs and before you even start walking, you know which aisle to go to. So before you even begin, you’re already doing better than binary search.
Well, real-life programs do this exact thing in software, and for exactly the same reason.
Here’s a picture of a B+ tree. Filesystems and databases basically all use this data structure or something very similar. You can see there’s a hierarchy. The actual sorted data is across the bottom; all this other stuff is just a set of signs telling what’s stored where. And all of that is a fraction of the size of the actual data. When you create an index in a database, that tree is what gets created on disk.
To find something, you start at the top, follow the signs, and in a very small number of hops, you can get to the data you’re looking for.
Now, each hop involves looking at a chunk of data and determining which way to go next. It’s some code. And if you just ignored the B+ tree and did binary search on the data itself, that would be simpler. It’d be less work, fewer CPU instructions. But it touches more pages of memory. Or disk. More walking around.
It’s easy to forget, but reading data from memory is one of the slowest things a CPU does. A modern CPU can execute hundreds of instructions in the time it takes to load anything from main memory. Going to disk is thousands of times worse.
Everyone knows how to sort an array, right? There’s a method for that.
Unfortunately, it doesn’t work on real-world books.
>>> ().sort() TypeError
So the library must know how to sort. I mean, they’re not programmers there. But institutionally, the sorting algorithm must be there somewhere.
The standard sort methods are mostly (souped-up) merge sorts. Merge sort is O(n log n), which is optimal. So to within a small constant factor, on average, if the input is random, merge sort can’t be beat.
But if you’re sorting books, don’t use merge sort. It’s hopeless.
What the library does is a lot faster.
When books are returned to the Main library, they take them to a back room and put them on a set of shelves. There are separate shelves for popular fiction, for nonfiction, for children’s storybooks, and so on.
Every once in a while, they take all the kids’ books, put them on carts, and take them upstairs to the children’s department, where they have another set of shelves.
Same deal, only more fine-grained this time. These are the nonfiction shelves; you can’t see, but they’re labeled with Dewey Decimal numbers: 100s, 200s, and so on.
If you squint a little bit, you can see the beginning of a sorting algorithm here. Divide the data set into piles based on the first letter or first digit.
It turns out that there’s a standard name for this algorithm. It’s called bucket sort. It is in fact faster than merge sort, because—well, let’s put it this way. It’s almost exactly merge sort. But without all the annoying merging.
|merge sort||bucket sort|
|1. divide (in half)||1. divide (in bins)|
|2. sort halves||2. sort bins|
|3. merge||3. |
Instead of dividing the items arbitrarily, you’re dividing them into groups, so that when you get to the end of step 2— the 100s are sorted, the 200s are sorted, and so on— you’re done.
How is it possible for bucket sort to beat a provably optimal algorithm? Well, merge sort is as good as you can do using only comparisons. Bucket sort is cheating, because in step 1, it’s partially sorting the data without doing any comparisons at all. It can do that because the key we’re sorting on happens to be made of letters or digits. Bucket sort exploits this fact; merge sort doesn’t.
I should point out that bucket sort has pretty awful worst-case performance. You can get all the way through step 1 only to realize that all your keys started with the same letter, and so they all ended up in the same bucket, so you basically haven’t accomplished anything! This doesn’t happen to the library in practice, though.
Of course we’ve ignored the remaining problem—how do you sort the bins? And the answer is, however you want. You could bucket sort them. It could be bucket sorts all the way down.
But of course the library doesn’t do that. Because they have something even faster!
How do ordinary people sort stuff? Most of us are not terribly systematic about it, which is fine; we use a few different techniques sort of at random. But one of the main techniques that you’ll see people use again and again is what’s called insertion sort. If you’re like me, you probably learned about insertion sort in school and then promptly forgot about it because it’s O(n2). Which means it’s incredibly slow, right? Yeah, about that. Insertion sort is actually really fast. Under the right conditions.
This is a movie of me doing insertion sort on some books. I’m sorting them by color, so in the end we’ll get a nice rainbow. It’s really simple. You just put all the unsorted items in a pile on the right, and you take them one by one and put them into the output array on the left. Just insert each item in the place where it belongs so that you keep the output array sorted as you build it.
This is the fastest sorting algorithm for small arrays, up to maybe 20, 30 elements. Even in software! The only reason this sort is slow for larger arrays is that when a computer inserts an item into an array, it has to move all the other items to the right, one by one, to make room.
But you can see, inserting a book into a spot on a shelf is pretty fast. You just stick your fingers in there and slide everything over at once.
So not only is insertion sort the fastest algorithm for sorting small arrays, it’s even better suited to sorting physical books. Again, due to costs.
The library’s doing something kind of cool here. I never really thought about it before, but these divide-and-conquer algorithms: they’re pluggable.
The library is taking advantage of this. They use bucket sort, but not obsessively. Once they reach the point of diminishing returns, they switch over to insertion sort, which is faster for sorting small arrays.
Standard library sort routines do exactly the same thing. Most of them use merge sort, which in its purest form looks something like this:
But once they subdivide the problem down to pieces of, say, 20 items or less, they use insertion sort for those.
Some libraries go one step further, and if the array is truly unmanageably enormous, they’ll start out using quicksort, which is better for huge arrays. The problem with merge sort for big data is that it needs some temp space to work efficiently. Allocating gigabytes of memory can be bad. Quicksort can sort an array in place.
Quicksort is a divide-and-conquer algorithm too. Once it divides the data into manageable pieces, it can switch to merge sort, which will keep breaking up the data, and eventually switch to insertion sort.
If there’s a theme here, it is that the library is doing things that are recognizable as algorithms from computer science. But not the obvious textbook algorithms! I guess when an algorithm is going to take up a human being’s time, it tends to get optimized. These algorithms seem to be tuned very well indeed. But perhaps if you looked inside any organization you’d find something similar.
I wanted to share this because thinking about these things gave me just a slightly different perspective on programming. I just gave the library two hours a week this summer, as a volunteer, putting books on the shelves, and it turned out to be unexpectedly rewarding.
And humbling. When I started at the library, there was this moment, embarrassing now in hindsight, when I thought, hey, I know information systems. I could teach the library a thing or two! But of course, predictably, it turned out the library had something to teach me.
I don’t feel too bad about it though. Librarians have been in the IT game for a long, long time.
One more little thing. The Nashville Public Library has a great volunteer program. And they have interesting projects for web developers. They could use your talents.
Or you could just put books on the shelves. Like me. It’s about as exciting as it sounds, but I promise you this: you will read a book that you wouldn’t read otherwise.
If you’d like to give this talk, feel free to use the slides (PDF).
Why do we need to do literature searching? ›
Literature search is a key step in performing good authentic research. It helps in formulating a research question and planning the study. The available published data are enormous; therefore, choosing the appropriate articles relevant to your study in question is an art.What are the challenges and problems of libraries? ›
- Funding. It's rare that funding isn't an issue, but with budget freezes, staff furloughs, and increased sanitation resources needed, there is a deep concern over the financial situation for libraries. ...
- New Safety Protocol. ...
- Upset Patrons.
Understanding the differences between the library and the Internet and knowing where your research comes from is crucial in the process of research writing because research that is available from libraries (either in print of electronic form) is generally considered more reliable and credible than research available ...How would you behave in the library? ›
- Never talk at full volume. Whisper – it's a no-brainer. ...
- Don't forget to plug in your headphones. ...
- Do build a book tower beside you… ...
- 4. … ...
- Try not to stare at other people. ...
- Avoid crunchy snacks. ...
- Keep the small talk to a minimum.
Narrow and refine your search results by:
document or source type (e.g. article, review or book) subject or keyword (for relevance). Try repeating your search using the 'subject' headings or 'keywords' field to focus your search.
- Develop a research question in a specific subject area.
- Make a list of relevant databases and texts you will search.
- Make a list of relevant keywords and phrases.
- Start searching and make notes from each database to keep track of your search.
- Great Libraries Foster Communication. ...
- Great Libraries Showcase History and Information. ...
- Great Libraries Build Capacity for Local Businesses. ...
- Great Libraries Become Public Gathering Places. ...
- Great Libraries Boost Local Retail and Public Markets. ...
- Great Libraries Offer Easy Access.
- Erosion of faith in objective information. ...
- The decline in civility and civic engagement. ...
- The disappearing middle class. ...
- Tax revolt and the tyranny of ROI. ...
- The decline of attention span. ...
- The decline in reading. ...
- Lack of diversity. ...
- Lack of recognition.
The author discusses seven measurable criteria that he accepts as defining a “great library”: Great libraries provide measurably superior service; have great funding; train and retrain their staffs; integrate their virtual, place and outreach services marketing; serve both the weakest and the strongest among their ...Do you think libraries are still important? ›
Libraries offer free educational resources
They provide countless resources, such as educational materials, trainings, courses, scientific publications, etc. to visitors. Public libraries provide their services not only face-to-face, but some of them have also integrated e-learning.
What are the advantages of library research? ›
3 Advantages of Library Research
Getting a librarian's assistance can mean having access to resources you never would have known about. And library collections are heavily vetted. Students can be sure that the majority of items in a library are reliable sources.
Today's libraries still lend books, he says. But they also provide other services to communities, such as free access to computers and Wi-Fi, story times to children, language classes to immigrants and technology training to everyone. "Public libraries are arguably more important today than ever before," Marx says.How do you dress for a library? ›
Dress in layers.
To combat this, wear layers! Tank tops, cardigans, and scarves will help you stay stylish and able to focus on your work no matter what the temperature is.
- 1 Stay focused. Stay focused. ...
- 2 Come to class well-prepared. Come to class well-prepared. ...
- 3 Remain during class. Remain quiet during class unless your teacher gives you permission to talk. ...
- 4 Shut off your cell phone during class. ...
- 5 Stay in your seat during class. ...
- 6 Show your teacher respect.
Answer: In most cases you probably won't have to pay to replace the book. Most libraries tape together torn pages all the time - there is even a high-strength book tape you can buy to mend pages.What is a library search? ›
Library Search is your search engine for searching across all resources rather than limited to a particular type of resource e.g. book. Library Search results can include: Books, e-books, Journals, Full-text articles.How do you search data? ›
- Define the type of data you need. Consider what/who is being measured, where is it collected, when, and how often. ...
- Determine who collects the type of data you are looking for. Think of who has a stake in collecting this data. ...
- Start searching for data.
Eight key stages were determined relating specifically to literature searching in systematic reviews. They were: who should literature search, aims and purpose of literature searching, preparation, the search strategy, searching databases, supplementary searching, managing references and reporting the search process.What is the first step of a successful search? ›
Step One: Identify a Topic
Selecting a topic is the most important component of a successful search.
- Choosing search terms.
- Searching with keywords.
- Searching for exact phrases.
- Using truncated and wildcard searches.
- Searching with subject headings.
- Using Boolean logic.
- Citation searching.
What are the 4 importance of research? ›
Answer: My hub provides several reasons as to why doing research is essential in general, including (1) to build knowledge and facilitate efficient learning, (2) to understand various issues, (3) to know the truth and prove lies, and (4) to seek opportunities, among others.What is one way we could improve the school library? ›
Discard large collection of books with low circulation Automated library management system enable academic libraries to configure and customize rules for circulation. Librarians can simplify circulation and assign tasks to issue books, magazines journals and make a check out.How do you organize a school library? ›
- Organize by Reading Level.
- Determine the reading level for each book.
- Sort your books.
- Place the lowest reading level books.
- Organize by Genre.
- Determine the genre for each book.
- Sort them by these genres.
- Dedicate a shelf to each genre.
Listen thoughtfully and respectfully. Try to elicit the specific reason for their concern, whether they have read the entire work or only parts, and the specific action they would like the library to take. Do not promise to act or appear to agree with the individual.What are the problems of school library? ›
These challenges include; poor staffing practices, poor funding, lack of a library policy, poor ICT infrastructure, poor library facilities, and lack of awareness of the importance of school libraries. The findings in this paper are based on literature review and author's own experience/observations.Is library a profession or job explain? ›
Librarianship is a people' profession. A librarian's job is to connect people with the information they are seeking in whatever format it is available. All library related jobs have one central purpose, i.e., to help people access and use information. It can be for education, work, or for pleasure.What are the uses of the library? ›
Libraries often provide quiet areas for studying, and they also often offer common areas to facilitate group study and collaboration. Libraries often provide public facilities for access to their electronic resources and the Internet.Why do people use libraries? ›
Libraries are community hubs. They connect people to information and connect people to people. They are safe havens for kids, providing after-school homework help, games, and book clubs. They offer computer classes, allowing older adults to stay engaged in a digital world.What is the importance of library in students life? ›
School libraries help students to get authentic information through the books written by reputed scholars who come from different parts of the world. A library plays an important role in creating a school culture, which helps every student to grow on their individual basis as well.How do library helps students? ›
Librarians help students learn the best ways to access and use quality information and resources, help them to enhance their study and research skills and explain how to use the latest technologies to enhance their learning.
Why libraries are important for students? ›
“Research consistently shows that when children have access to good libraries with plenty of good books and with adequate staffing, they read more, and thus do better on reading tests. For children of poverty, libraries are typically the only possible source of reading material.”What is the role of library in learning and research? ›
Libraries offer space for students to learn and provide excellent environment for research. Libraries have staff that can help students in locating the information that a researcher might need. In addition to this, most libraries today have systematic digitized information.What is the purpose of a research library? ›
A library's fundamental purpose has always been to support the process of research and education by helping users find information and ascertain its value.What is library research method? ›
Library research is a technique of collecting data by learning and understanding data which has close relation with the problems from books, theories, notes, and documents. It is a general or specialized library that collects materials for use in intensive research projects (Mary George, 2008) .Why is library important essay? ›
Libraries play a vital role in providing people with reliable content. They encourage and promote the process of learning and grasping knowledge. The book worms can get loads of books to read from and enhance their knowledge. Moreover, the variety is so wide-ranging that one mostly gets what they are looking for.Why is the library an important part of the community? ›
They help local people figure out the complexities of life, from navigating the health system to helping those with housing needs. This “go-to” role has influenced library programming and events, with libraries providing advice and connections to health, housing, literacy, and other areas.What is the role of the public library to the society? ›
The public libraries have very important role in the society. It builds Citizen, educate individual, and fosters thoughtful communities. The free flowing nature of public libraries supports the literacy programme tied with cultural values for community development.Can librarians wear jeans? ›
Pages have to bend, stretch, climb up, and crawl on the ground – jeans, neat t-shirts, and gym shoes are considered perfectly acceptable. For clerks and librarians, however, jeans aren't always allowed.How do you sit in class confidently? ›
- Turn off the little voice. ...
- Realize you're not alone. ...
- Take something you're good at. ...
- Start small. ...
- Reward achievements. ...
- Make all the classes. ...
- Take a small class. ...
- Get feedback early.
- Step 1: Prepare the book. "If the water damage occurred recently, we want to first make sure the wet pages are separated so that they don't stick together when dry. ...
- Step 2: Use fan to dry book. ...
- Step 3: Place book under heavyweight. ...
- Step 4: Ask a Conservator.
Is it OK to write in a library book? ›
Please do not write in library books. By refraining from doing so, you are respecting your fellow library users and helping us preserve the cultural record of the past.What could lead to books to be damaged in the library? ›
Damage to library materials is caused by natural elements such as temperature and humidity extremes, light, pollutants in the air, mold, and pests. Their detrimental effects are usually gradual, cumulative, and irreversible.What does it mean to do a literature search? ›
A literature search is a systematic, thorough search of all types of published literature to identify a breadth of good quality references relevant to a specific topic, and is a fundamental element of the methodology of any research project.Why is it important for a researcher to review the literature? ›
The purpose of a literature review is to:
Provide foundation of knowledge on topic. Identify areas of prior scholarship to prevent duplication and give credit to other researchers. Identify inconstancies: gaps in research, conflicts in previous studies, open questions left from other research.
A literature search is a considered and organised search to find key literature on a topic. To complete a thorough literature search you should: define what you are searching for. decide where to search. develop a search strategy.Why does a researcher conduct a literature review? ›
The purpose of a literature review is to gain an understanding of the existing research and debates relevant to a particular topic or area of study, and to present that knowledge in the form of a written report. Conducting a literature review helps you build your knowledge in your field.What are the 4 importance of research? ›
Answer: My hub provides several reasons as to why doing research is essential in general, including (1) to build knowledge and facilitate efficient learning, (2) to understand various issues, (3) to know the truth and prove lies, and (4) to seek opportunities, among others.What is the first step of a successful search? ›
Step One: Identify a Topic
Selecting a topic is the most important component of a successful search.
Eight key stages were determined relating specifically to literature searching in systematic reviews. They were: who should literature search, aims and purpose of literature searching, preparation, the search strategy, searching databases, supplementary searching, managing references and reporting the search process.When looking at the purpose of a source which of the following should you consider? ›
Evaluating a source by purpose & objectivity
When considering the purpose & objectivity of a source, ask yourself the following questions: What point of view does the author represent? Is the source arguing for or against something? Does the source contain mostly factual information or is it opinion-based?
What are the purposes and goals of the research? ›
The purpose of research is to enhance society by advancing knowledge through scientific theories, concepts and ideas. A research purpose is met through forming hypotheses, collecting data, analysing, etc.What makes a good literature review? ›
Qualities of A Good Lit Review
A good literature review is NOT simply a list describing or summarizing several articles; a literature review is discursive prose which proceeds to a conclusion by reason or argument. A good literature review shows signs of synthesis and understanding of the topic.
A search strategy is an organised structure of key terms used to search a database. The search strategy combines the key concepts of your search question in order to retrieve accurate results. Your search strategy will account for all: possible search terms. keywords and phrases.How do you use search terms in research? ›
Using your research question, identify the most important 2 - 4 words from your research question. These are your key concepts. For each key concept, make a list of synonyms or related words. Use a thesaurus to find synonyms, and/or list any specific examples of your concepts.
By searching multiple databases, you can maximize available data and consider all relevant literature. This will give you better insights for your study and help you get a more comprehensive understanding of the topic.What is one of the important skills that you need to learn in doing a literature review? ›
A literature review requires several skills: The ability to search for and access publications on your research topic. Reading and analyzing sources on your topic. Evaluating data and publications to determine which literature makes a noteworthy contribution to the scholarship on your topic.
The purpose of a literature review is to:
Place each work in the context of its contribution to understanding the research problem being studied. Describe the relationship of each work to the others under consideration. Identify new ways to interpret prior research. Reveal any gaps that exist in the literature.
Researchers need to be cautious of some material, particularly material found online because the quality is unknown.