
Lately we’ve focused an awful lot on Summon when talking about all things resource discovery. Summon is the new kid on the block, after all, and we’ve needed to try to get a handle on it. It’s changing the way people use our resources, and there are still a lot of open questions about how it’s used and how it should be used.

But what about the catalog? We haven’t really been talking much about the catalog. Not that there hasn’t been any interest from my colleagues: on the contrary, I’ve heard loud and clear the desire for a catalog that can do faceted browsing and the discussions about possibly loading our catalog data into Summon. Unfortunately, as a III shop, we’ve always been at a disadvantage in this area. We’ve never had an easy time of getting our catalog data out of the catalog because our system hasn’t really supported it.

What if I told you that had changed? What if we suddenly had a way to load data from our ILS into other systems of our choosing in data formats of our choosing? And what if we could automate it, to the point that a record saved in the ILS would get extracted minutes later with no muss, no fuss into the formats and systems that we specify? If we had this, what sorts of things could we build? What new discovery tools and methods could we support?

My big project over the past five months has been exactly this. It’s all still in development and probably will be for a few more months—but it’s creeping ever closer to being production-ready. The problem—my problem—is that this thing I’m building is infrastructure. It’s invisible. There’s not really an end-user interface that I can point you to so you can look and get a feel for what it does. But it will enable things that we couldn’t have done before. It will enable us to do the things we’ve planned in our RDS Phase 2 Action Plan—and much more.

So this is the first in a series of blog posts that I wanted to write to introduce this thing in a way that might make it interesting and accessible. I’ll start out by laying some of the groundwork, examining why having control of our data is so important for the things we want to do.

 

The Importance of Data

Astute readers might now be thinking, “Wasn’t moving to Sierra supposed to help open up our data?” Yes—yes it was. With Sierra, III promised much. III billed the system as “open”—built on open technologies using a “service-oriented architecture.” Listening to their marketing, Sierra sounded almost as though it would provide an extensible ILS framework that we would be free to build upon. We were promised open APIs that would allow us to build new software components that would consume data from Sierra and be able to interact directly with Sierra.

But wait, let’s take a step back. I know we throw around terminology like open architecture and open data and API, but these are all kind of buzzword-y, aren’t they? We talk about these things like they’re important—but we’ve never stopped to explain why.

So What Use is Data, Anyway?

At its most basic, a computer program comprises a set of instructions that a computer follows to perform some task. Useful programs contain generalized instructions that can be reused in different contexts. Data is what a program uses to do stuff.

Say I want to program something that displays a formatted list of titles and authors for a set of books that I own. One approach would be to write a separate command to display each individual piece of information—each title and each author, like this (where “\n” is a new line):

    print "Title: Hamlet\n"
    print "Author: William Shakespeare\n"
    print "\n"
    print "Title: The Sun Also Rises\n"
    print "Author: Ernest Hemingway\n"
    print "\n"
    print "Title: The Great Gatsby\n"
    print "Author: F. Scott Fitzgerald\n"
    print "\n"

I hope you see that this approach is pretty useless. It displays exactly what we told it to display and nothing more. What happens if we want to change the label “Title” to “The Title?” What happens if we want to change what displays between each book? We have to go through and change each instance in our code. This is clearly a very inefficient way to program, fraught with potential errors.

But we can write a more generalized program that does the same thing using fewer actual instructions. All we need to do is to store the title/author information for our book list in some sort of data construct. Then we can code a loop that will repeat one set of instructions for each item in the data.

    my_data = [{
        "t": "Hamlet",
        "a": "William Shakespeare"
    }, {
        "t": "The Sun Also Rises",
        "a": "Ernest Hemingway"
    }, {
        "t": "The Great Gatsby",
        "a": "F. Scott Fitzgerald"
    }]

    for book in my_data:
        print "Title: " + book['t'] + "\n"
        print "Author: " + book['a'] + "\n"
        print "\n"

What’s happening here is that we’re storing all of the authors and titles in a nested data structure that we’re putting in a variable called my_data. Our program loops over each element in that data structure, temporarily assigns it to the variable book, and then displays the title (book['t']) and author (book['a']). This is much better than the last version of our code. If we want to change how anything is displayed, we only have one place to change it.

In addition, now that the data is actually defined in a data structure, we can reuse it later in our program and do other things with it besides display it. We could write code to let us search it, for example, which we couldn’t do at all with the first version.
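
For instance, here’s a quick sketch of what a simple title search over that same data structure might look like:

    # A simple (hypothetical) title search over the same my_data structure
    query = "sun"
    for book in my_data:
        if query.lower() in book['t'].lower():
            print book['t'] + " by " + book['a']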

But this still isn’t as good as we can make it. We’re still storing our data structure inside our program, and it’s still a little bit cumbersome to have to get the brackets and curly braces and the formatting all right when we’re editing the data. The next step might be to store the data in a separate file that’s a little more compact and a little easier to edit—say, a comma-delimited (CSV) file, like this:

    title,author
    Hamlet,William Shakespeare
    The Sun Also Rises,Ernest Hemingway
    The Great Gatsby,F. Scott Fitzgerald

Now to use this, we’d have to code instructions in our program first to access the file, parse the contents, create a data structure like the one in the last version, and load the data into memory. But if the resulting data structure is identical, the rest of the code works without modification.
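
As a minimal sketch, that loading step might look something like this, using Python’s built-in csv module (the file name here is made up):

    import csv

    my_data = []
    with open("books.csv") as f:
        # DictReader uses the header row to name each column
        for row in csv.DictReader(f):
            my_data.append({"t": row["title"], "a": row["author"]})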

Show Me the Data!

Once I have my data stored somewhere, like in a CSV file that I’ve saved on my computer, I can write programs all day long that use it to do different things, and I don’t have to re-enter it anywhere. In a nutshell, that is one of the primary uses of opening up data: it allows anyone that has the necessary skills to write programs that can do new things with it.

In contrast, our data in Millennium was locked away—it was both physically inaccessible to us (except what we could access through the Millennium user interface) and stored in a data format that was Millennium-specific, that we couldn’t make sense of even if we could access it. To be fair, we could export data in a few ways: as MARC via Data Exchange, as MARC via Z39.50, or as delimited data via Create Lists. But these methods were too limited or too manual to be of much real use.

With Sierra, although the reality has unsurprisingly fallen somewhat short of the marketing promises, III has cracked open the door for us. Sierra is built on open-source database and indexing software, unlike Millennium. III has given us direct read access to the database. With this little bit of extra freedom that we didn’t have with Millennium, they have opened the door to allow us to take control of our data.

But is direct database access enough?

 

Storing, Using, and Extracting Data

Turning back to our earlier example: as we use this program we’ve created, there might be more and more that we want it to be able to do. Say we want to be able to track the date that we added a book, the number of times each person in our family has read a book, the amount we spent on a book, and the person in the family that bought a book. Say we notice that we have multiple books by the same author and we want to start recording information about authors somewhere so that we don’t have to replicate it for each book. As we want to do more with our program, we need the data that supports that functionality, and we need our program to be able to read that data in order to work with it. And, as our data grows—both in size and in complexity—the format in which the data is physically stored begins to impact how smoothly and efficiently the program runs.

How a program stores data and what it stores are highly individual to that program. It has to be able to read stored data into variables as appropriate, use those variables to carry out its tasks, and then update the data in storage as needed. It’s best to store data in a way that requires as little processing as possible to translate to internal data structures—if your data access methods are inefficient, they can slow down your program and make your code harder to understand and extend.

The upshot is that one system’s internal data store isn’t going to translate 1:1 to any other system or program—nor should it.

Let’s consider the ILS again and what I want to be able to accomplish. I don’t necessarily want to use the exact data as it’s stored in the ILS database. Each of the applications I want to write needs a different subset of data from the ILS; for each application I may need to store similar (or the same!) data quite differently.

At the moment Sierra allows us direct access to its internal database using SQL. SQL is the standard language used for querying relational databases: most programming languages have functionality that will let you query a database using SQL and pull data into internal programming structures with relative ease. With this, we can actually write programs that can read Sierra’s internal data.
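
For example, here’s a minimal sketch in Python using the psycopg2 driver (Sierra’s database is PostgreSQL-based); the connection details and the exact view and column names here are illustrative, not authoritative:

    import psycopg2  # PostgreSQL driver

    # Connection details below are placeholders, not a real server
    conn = psycopg2.connect(host="sierra.example.edu", port=1032,
                            dbname="iii", user="reporting_user",
                            password="secret")
    cur = conn.cursor()
    # Pull a few bib record numbers and titles into plain Python tuples
    cur.execute("SELECT record_num, title FROM sierra_view.bib_view LIMIT 10")
    for record_num, title in cur.fetchall():
        print record_num, title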

But you know what? As nice as that is, it isn’t good enough. Sierra’s database is designed to support the Sierra ILS—it wasn’t written to support my applications. III made design decisions when it built the database that make it work okay for some purposes and not others. Let me give you an example that illustrates what I mean.

An Example: A Shelflist Application

One of the applications I started working on back in August is a simple shelflist builder to help with doing inventory. It lets you enter a location code and a call number range and pull back a list of items, sorted by call number, that you can then browse. You can search for call numbers and barcodes. You can mark items as on or not on the shelf.

The shelflist itself is a table of items, like this:

| Row | Call Number | Volume | Copy | Status | Barcode | Suppressed? | Marked |
|-----|-----------------|--------|------|-----------|------------|-------------|--------------|
| 1 | M128 .A016 2002 | | 1 | AVAILABLE | 1002285505 | false | On Shelf |
| 2 | M128 .A05 2005 | | 1 | AVAILABLE | 1002189053 | false | Not On Shelf |
| 3 | M128 .A09 2007 | | 1 | AVAILABLE | 1002420293 | false | |

When you click on a row, it expands to show you the title, author, bib and item record number, and a link to the record in the WebPAC.

This data of course all comes from Sierra initially. When you first create a shelflist, my application submits an SQL query to Sierra, gets the information, and then stores it as a flat file so that my code can access the data quickly and efficiently. But the data as stored in Sierra is anything but flat. The database structure is quite complex, with item information and bib information being stored in separate tables and variable length fields stored in yet a separate table. Once I convert the data to the flat structure, access is instantaneous—but the initial query to build the shelflist takes 4 to 5 minutes, and sometimes even longer, to run. The SQL query is as efficient as I can make it—but, based on the database structure and what fields III has and has not indexed, the query still takes forever. And waiting 4 to 5 minutes for this query to run is unacceptable.
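
In greatly simplified form, the idea looks something like this. This is a sketch, not the actual application code: the function names and column list are made up, and it reuses a database connection like the one in the earlier psycopg2 sketch.

    import csv

    def build_shelflist(conn, query, params, out_path):
        # One slow SQL query against Sierra's complex, normalized tables...
        cur = conn.cursor()
        cur.execute(query, params)
        # ...flattened once into a simple file that we control.
        with open(out_path, "wb") as f:
            writer = csv.writer(f)
            writer.writerow(["call_number", "volume", "copy",
                             "status", "barcode"])
            writer.writerows(cur.fetchall())

    def read_shelflist(path):
        # Later reads hit the flat file, not Sierra: effectively instant.
        with open(path, "rb") as f:
            return list(csv.DictReader(f))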

So even though we have direct access to the Sierra database, we still don’t have control of our data. To have control, we need to be able to pull the data out of Sierra. We need to be able to extract it and put it into other storage mediums that we do control, ones that we can configure to support fast, efficient access to the data that our applications need. The direct access we have to the database allows us to do this, but we’ve had to build the functionality ourselves.

 

What About APIs?

When one system shares data with other systems, there are at least two potential gotchas. One we saw in the last section: a system’s internal data won’t necessarily translate well to a system for which it wasn’t designed and optimized. Another is that a system has to maintain the integrity of its data. If we allow multiple programs—even ones that we’ve written—to access and write to the same data store, we risk that one of the programs might write data that renders the store (or parts of it) unreadable or invalid for other programs. This is why even open systems don’t generally allow write access to their internal data.

One way to take care of these problems is to enable access to your system via an API, or Application Programming Interface. If someone has created an API for their system, it means that they have defined a set of commands that I, for example, could use in my own programs that allows mine to interact with theirs in predefined ways. Maybe this allows my program to submit a query to their system to get back data that it can then use to do something. Or maybe it submits a call to their API with a data value and their API performs some calculation on the value and returns the result. Having access to APIs helps me extend the capabilities of my program so I don’t have to reinvent something that someone else has done. And APIs allow my programs to interact with other systems in a controlled, predetermined way—i.e., so that they don’t have to give me carte blanche access to everything in their systems and I don’t have to care about their internals.

APIs can be read-only or read-write. A read-only API only allows other programs to send commands for reading data, whereas a read-write API allows other programs to send commands that write data to the system as well.
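
As a purely hypothetical illustration, here’s what a read-only API call might look like from a Python program; the URL and the response fields are invented for the example:

    import json
    import urllib2

    # Ask a (hypothetical) API for data about one book; get JSON back
    response = urllib2.urlopen("http://api.example.edu/v1/books/12345")
    book = json.loads(response.read())
    print "Title: " + book["title"]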

In a way, you can think of an API as being a counterpart to a UI (User Interface). A UI allows a person to input commands, enter data, and work with a system to accomplish some task. An API is the same, but for a computer program. And it’s just as important to design an API well, so that it’s easy to understand and use, for the same reasons it’s important to design a UI well.

The work I’ve been doing the past five months initially began as a desire to create an API for our catalog data. In fact, that’s still the major component. Having a “programming interface” built on top of our catalog data that can serve up different views of that data on request will make application development easier. (And I will come back to this in a later post.) 

At the same time, you may have seen recent announcements from III about their own “Sierra API,” which has been in the works for a long time. They’ve finally announced that they will be releasing the initial version of their API in April.

So you might be wondering: how does III’s API affect mine? Have I just wasted five months of work building something that’s now redundant?

  1. Consider my earlier point about why having direct access to Sierra’s database isn’t good enough. Just like the Sierra database, III has designed their APIs for some purpose. I haven’t seen them yet, so I can’t yet say how well-designed they are or how applicable they will be to the types of things we want to do. But they’re still their APIs. They are in control of what data they serve up and how that data is modeled. I want our applications to be built on top of our own APIs, because I want the option to serve data to our applications in a way that’s custom-tailored to what we need. The applications I build this way will be more fully functional and will perform better.
  2. Speaking of performance—if API access is slow, all the applications that use that API will be slow. I can ensure that our APIs are fast and responsive; I can’t ensure that III’s will be.
  3. Let’s think of our catalog data as separate from Sierra. III is building an API for Sierra. I’m building an API for our catalog data. Now that we can extract data from Sierra, and as we build more and more of our own applications, that distinction will become more and more meaningful. For instance: data in Sierra isn’t FRBRized. The data extraction process I’ve built would allow us to write an extraction routine that would analyze and FRBRize our catalog data as it gets extracted, loading it into a system as separate Work, Expression, Manifestation, and Item entities (see the sketch after this list). Our API would then serve up this data as such to applications we build to use it. Sierra’s API couldn’t do this, because that’s not how Sierra stores data.
  4. I’m almost certain that there will be ways that my API will be able to use III’s. For instance, one thing mine won’t be able to do on its own is write data to Sierra, whereas some of III’s APIs will be read-write. The systems underlying my APIs could make calls to III’s to enable us to write to Sierra while still letting us retain all the advantages of controlling our own APIs.
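
To make point 3 concrete, here’s a hypothetical sketch of what one FRBRized record might look like as nested entities (all field names are invented for illustration):

    # Hypothetical: one Work with nested Expressions, Manifestations, Items
    work = {
        "title": "Hamlet",
        "creator": "William Shakespeare",
        "expressions": [{
            "language": "eng",
            "manifestations": [{
                "publisher": "Penguin", "year": 2002,
                "items": [{"barcode": "1002285505", "location": "w4"}]
            }]
        }]
    }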

Ultimately, building an API lets you give other programs access to data using standardized queries and commands. From the consuming program’s perspective, it’s not too different from storing data on the filesystem or in a database and using commands to read it into an in-memory data structure. Just as with every method for storing and accessing data, there are practical implications and considerations that affect how you build and design your API depending on the needs of the application that uses the data. Building and maintaining our own API for catalog data means that these factors are under our control: it means we can build better applications and generally do more than we could otherwise.

Hopefully it’s now a little bit clearer why we’re building what we’re building. But I still haven’t given you a great idea of what exactly it is we’re building. In the next installment we’ll take a closer look, specifically at how the extraction process I’ve developed works and how it will run in a production environment. There are a lot of practical implications—like, how do we keep extracted data fresh and in synch with Sierra? Will it require any extra work from library staff to maintain? Stay tuned!


User Interfaces is continually working on improving resource discovery interfaces, and of course an integral part of improvement is always evaluation. One major element we want to evaluate is our changes’ impact on patrons’ library use. Near-term plans for discovery hinge on Summon, so we’re especially interested in how Summon and our Find Articles search have affected e-resource use. But this is really tough to characterize and quantify. For one thing, the difficulty of getting meaningful usage stats for library electronic resources is notorious. Our e-resources are spread out among many different databases from many different vendors, where each vendor manages its own interfaces and its own content and collects its own stats its own way. We have meta-interfaces that are supposed to help funnel our users to the correct resources in the correct databases, and these systems collect stats as well. Although we can compile stats from each of these sources, little of it is unified. Stats from different sources measure different things differently.

Summon landed right smack in the middle of an already complex situation when we launched the beta in 2012. According to Karen Harker, the library’s Collection Assessment Librarian (and e-resource stats guru), Summon has played havoc with the stats we get from database vendors. Some vendors have shown a steep increase in usage while others have shown a steep decrease, and it’s impossible to tell from the data what traffic comes from Summon and what doesn’t.

However, one set of statistics we have that has remained pretty consistent over the years is what we get from Serials Solutions about our “360” services we have through them: 360 Core (our e-resource knowledgebase), 360 Link (our link resolver), and our e-journal portal. This includes “click-through” stats, which measure users’ requests to get resources’ full text.

What’s convenient for our purposes is that:

  1. Summon uses the link resolver to link out to many (but not all) full-text resources, so the 360 stats include some Summon usage.
  2. Our 360 services predate Summon.
  3. Summon implementation did not change anything about our 360 services that would affect click-through stats other than the effect Summon itself has on full-text downloads.

So, if nothing else, the 360 click-through stats seem to provide a good way to compare pre-Summon e-resource usage to post-Summon e-resource usage. Although they can’t give us the whole picture, they can help us determine whether or not Summon is having an impact and maybe partially characterize that impact.

Click-through Stats, 2006-2013

In truth, I’ve been keeping an eye on these stats for a while, as they are quick and easy to obtain. Last spring I put together a visualization that shows a comparison of the years 2006 through the present, and I have been updating the graph with new data on a regular basis. Last week, in preparation for a Liaison’s meeting on the topic, I updated the graph with complete data for Fall 2013. Now that our Find Articles service has been live for over a year, I thought this would be a good time to share what we’re seeing.

The visualization I created is located here. Note that I built this using the D3.js JavaScript library, so it works best in Chrome and Firefox.

Here’s a screenshot showing the relevant data.

 

We implemented Summon as a “beta” at the very end of January 2012, so the post-Summon lines are the red and green ones, and the pre-Summon ones are the brown, blue, and light grey ones. Here are the features I want to point out.

  • Pre-Summon lines are grouped together at the bottom and are startlingly consistent in terms of data values and line shape.
  • In the first month of our Summon beta, click-throughs were almost double the previous February peak from 2011. The months following consistently show much higher usage compared with the corresponding months in previous years. (Except July/August 2013, which I’ll discuss in a minute.)
  • September 2012 is when we launched Summon as “live” in conjunction with the launch of our new website, and Spring 2013 (live) shows a nice increase compared to Spring 2012 (beta).
  • Fall 2012 and Fall 2013 both show “live” usage and are pretty consistent.
  • Summers are still low, which is to be expected. But most striking is Summer 2013. In 2013, the differences between summer and spring and between summer and fall look proportionally much greater than the corresponding summer-to-spring/fall differences during pre-Summon years.

So based on this data, what conclusions might we draw about Summon and the impact that Summon has had on e-resource use?

First, I think we can safely say that the leap in numbers of click-throughs that we see from 2011 to 2012 and 2013 was caused directly by Summon. The data corresponds perfectly with our Summon implementation timeline. Nothing else happened in 2012 and 2013 to explain the change any other way. We always have fluctuations in enrollment and database subscriptions, and, despite these, 2006-2011 numbers are relatively consistent.

Second, the magnitude of the leap in click-throughs after Summon implementation and the levels that are being sustained suggests that Summon is well-used and that people are continuing to find it at least somewhat useful. If people were not finding what they needed through Summon, I’d expect the click-through rates to drop off. (Of course, the question is: well-used and useful compared to what?)

Third, there’s the matter of Summer 2013. I checked the figures and found that there’s an average difference of -61.3% in 2013 between summer and spring/fall. Comparatively, 2010 and 2011 only have average summer and spring/fall differences of -39.1% and -35.1%, respectively. I also checked enrollment figures and found that there was a comparatively larger dip in enrollment between summer and spring/fall in 2013 than previous years—but still not large enough to explain the large difference in click-throughs by itself. My guess? I think this data may show what we have heard anecdotally about Summon: that students use it, while faculty (overall) tend to stick with their tried-and-true methods of research. If Summon’s impact on click-throughs is disproportionally weighted toward times of the year when students are around, then it seems reasonable to assume students are using it more than faculty.
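
(As an aside, here’s one way a difference figure like that can be computed; the numbers below are made up for illustration, not our real data:)

    # Sketch of the summer vs. spring/fall comparison (made-up numbers)
    def pct_diff(summer, semester):
        return (summer - semester) / float(semester) * 100

    spring, summer, fall = 30000.0, 11600.0, 30000.0  # hypothetical click-throughs
    avg = (pct_diff(summer, spring) + pct_diff(summer, fall)) / 2
    print "%.1f%%" % avg  # prints -61.3% with these invented figures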

What about Summer 2012? Click-throughs were a bit higher than in 2013. Unfortunately 2012 is a little difficult to generalize about, since we were in beta in spring and summer and then went live in the fall. It’s possible that a lot of faculty members were giving Summon a test run during the summer and decided it was not too useful for them, boosting its stats in 2012 and leading to a drop-off in 2013. It’s tempting to think that enrollment numbers probably play a role (15743 in 2012 and 13866 in 2013), but if we look at Summer 2010 and Summer 2011, we had a big drop in enrollment (17259 in 2010 and 15909 in 2011) while the overall number of click-throughs was higher in 2011. The forces at work here may simply be more complex than we can determine based on the numbers we have available. It will be interesting to see what happens this summer.

To summarize, here’s what I think we can say based on these statistics.

  • People are using Summon to get access to the library’s full-text e-resources.
  • With Summon, we’ve seen a big increase in requests being passed through the link resolver.
  • People continue to find Summon useful and continue to use Summon to access full-text e-resources.
  • We think that, overall, students are using Summon more than faculty members and that most faculty members are sticking with their tried-and-true methods of research.

And here are a few things that we most assuredly can’t say anything about based solely on these statistics.

  • If or to what extent Summon use is cannibalizing direct database usage.
  • If Summon is just shifting usage of e-resources or if it’s actually increasing use.
  • Summon’s effect on use of A&I resources.
  • If Summon is causing a decline in use of resources not indexed in Summon.
  • If people are finding better, more relevant resources through Summon compared to other sources (such as direct database searches).
  • What impact our Summon setup and the defaults we use for our Find Online Articles search has on e-resource use.


Where Things Stand

Everybody loves to talk about the importance of “mobile,” and we’ve had conversations with various folks for years now about the appropriate level of support for mobile sites and services at the UNT Libraries. The User Interfaces Unit currently has a staff of two front-end programmers, supporting nearly all public-facing websites and services. Concurrently, our Library Technology staff checks out iPads and Kindles for students’ long-term study needs. So what does the future of mobile support look like at a university library like ours? That’s what this post is about. Stick with me, hopefully I won’t ramble too much, and towards the end you’ll find my plea for help. Apologies ahead of time for any geek-speak. There’s a lot to cover.

This week I reviewed our online usage statistics and found wide-ranging variations in real-world mobile use, with hits in the 2-7% range for the library catalog, website, and related UNT-centric systems, and an astronomical 21-32% for the Gateway to Oklahoma and the Portal to Texas History. At first glance this would suggest that students and professors are using our services on their mobile devices with much less frequency than patrons outside of our academic bubble. So what does our mobile footprint look like right now?

  1. The catalog has some support for mobile devices.  We run a dedicated mobile search interface here. Since we have some ability to modify the main catalog’s look and feel, we also have some CSS in place that hides non-essentials when on smaller devices and also offer a QR code on the desktop display of item records (example record). This latter feature will get a person with a smart phone camera to a clean record to walk into the stacks with.  This set of solutions is a couple of years old at this point and could use some updates. Note that Jason is working on an API and playing with Sierra’s internals and I predict good things happening in this space’s future.
  2. Our “Find Articles” search (powered by Summon), LibGuides, and a number of the other services we subscribe to offer mobile versions on their sites. There’s not a lot we can do here beyond what the vendors supply.
  3. The library website has a few essential pages redirected into a mobile-optimized skin. Again, we only offer a limited amount of content using this methodology, so we could definitely up our game here as well.
  4. Aside from some newly developed sites, that’s basically it. Everything else is a full-on desktop version: fixed-width, non-responsive. Our sites “work” on really good smartphones and tablets, but a lot of them could be sooooo much better.

A Snapshot on Web Development Testing

As one of our developer goals this year, UI is going to be revisiting the various sites we manage and figuring out what can be done to improve the mobile experience. One of the most exasperating problems in doing this is the wide-ranging variability of the experience a user might get, based purely on their device.

Testing a Few Years Ago

In the good old days we just had to worry about:

  • Which flavor of Windows was the user on: XP/7/8
  • Which other operating system were they using (Mac, Linux – haha)
  • Which version of Internet Explorer were they using: 6/7/8/9/10
  • Which alternative Browser were they using: Firefox, Chrome, Safari, Opera (and occasionally, which version number!)
  • What was the browser window width (800, 1024, 1200, etc.)

We could, of course, pretty much assume the person was on a high-speed connection of some sort, too.

There were dozens of potential issues we could/would run into when developing a site, and so every project had to hack in small bits of special code that target a particular bug in, say, IE 7. To be sure, there is a lot of variability here, but it’s fairly workable with only a couple of physical machines. Most of this came to be very well documented over the years, and plenty of standard fixes were available to us.

Testing Today

And while testing against most of those use cases is still ongoing, we increasingly have a whole other kind of insanity to consider too. What, pray tell?

  • Which other operating systems are we seeing traffic from (Android, iOS, Symbian, BlackBerry, etc.)
  • Which company made some sort of change to the OS (AT&T, Amazon, Samsung, LG)
  • Which version of that operating system (Ice Cream Sandwiches and Jelly Beans, v.5/6/7)
  • What device width? < 300px, 500, 700, 1024, 1200, 1920, and literally every pixel value in between
  • What orientation is the device in? Landscape or portrait, which of course changes the device width!
  • Which browser? Safari, Chrome, Dolphin, Opera, Firefox

And now we need to worry about page ‘weight’ since it affects both download speed and data quotas. Everybody hates slow-loading pages on their phones, and loading a series of poorly optimized 2MB photos on a 200MB data plan is just infuriating, right? While this is a mild inconvenience to those of us in the States, consider what this means for visitors to flagships like the UNT Digital Library and the Portal to Texas History, on lower-quality, limited-data devices in developing nations (yes, it happens).

Finally, many of the operating systems and browsers listed above have quite different support for newer HTML elements, JavaScript, and CSS, making cross-browser rendering of our sites a tightrope balancing act, and thus testing for this kind of variability is… difficult. While there are some online tools to help out, real devices can often be far more helpful (you can’t truly touch a computer screen to interact with buttons), and thus this story’s emotional plea.

We are Building a Device Lab

User Interfaces is building a mobile device resource lab using funds from a TSLAC Award. Over the course of this year we’ll be buying a variety of devices and development tools to test the most common scenarios we encounter, and to design new and cool things. Along with the physical hardware and software tools, we are also building up a small library of mobile-related reading materials, and some really cool usability and wireframing supplies.

At some point we hope to open this up as a service for students, TexShare members, and maybe the public to use in their own projects as well.  This is modeled after the Open Device Lab concept (see the resources section, particularly), a growing movement for developers in the US and Europe.  If all goes well and someone else doesn’t beat us out, we will become the first such lab in Texas, and the only such lab in the south/southwest!

What Else Could be in the Works?

Uses of mobile technology in a library setting are only going to increase with time. What might we be looking forward to?

  1. In the coming months the UNT Libraries will be unveiling some new online exhibits using a new Drupal-based system I developed during the spring of 2013. The new site is built using responsive web design techniques, will be capable of hosting images, videos, audio, and textual materials to highlight our collections, and is optimized to work on tablets and phones. Because many exhibits have physical components, imagine the possibilities of being able to walk through the Judge Sarah T. Hughes Reading Room, the Edna Mae Sandborn Room, or one of our other exhibit spaces, snapping a picture of a QR code next to an object using a tablet’s camera and being able to read a narrative about the cuneiform tablet, or hearing an audio sampler of the score in front of you!
  2. One of the year’s goals is to get more of our sites “responsive.” The Open Access Symposium site already is, and the forthcoming launches of a new Oral History site and a brochures site will be as well. If all goes well, by the end of summer the main website and a few others will be too.
  3. A number of items in the Portal to Texas History are starting to get geographic coordinates entered into their metadata records. Phones have a good sense of where they are in space. What if you could get a proximity search for real-world items based on your location? At some point in the future we will update this and other digital library sites’ HTML structure to bring them in line with more modern presentation techniques, but there are so many other cogs here that such updates will come as part of a longer-term project.
  4. Maybe we need to consider revisiting the concept of roving reference, now with tablets in hand, each smartly loaded with a plethora of electronic resources divvied up so a librarian, GLA, or student worker could provide help to information-starving undergrads! And as an added bonus, having nice place-tracking tools in tow, like SUMA, would grant us much more precise knowledge of which areas of the library were getting used, when, and how.
  5. What might wearable technology do for us in the library world eventually? Horizon technology like Google Glass is already showing potential for warehouses and inventory control before it hits the public.  What would this mean for keeping track of our millions of books, or immersing yourself in an exhibit, or browsing the stacks?
  6. The speculative list is simply endless.

How Can I Help: Donate a Device!

There are hundreds of manufacturers and thousands of existing devices out on the market, and the procurement of used items at a state institution is tough. So if you happen to have (contract-free, with wifi):

  • an older smart phone,
  • a phablet,
  • a web-capable media player,
  • an e-reader,
  • a tablet, or
  • other web-connected devices like gaming systems

that is sleeping quietly inside a drawer somewhere because you’ve upgraded to something better, please consider donating it to us (email interest here first, please), or…

How Can I Help: Make a Cash Donation!

Devices and the infrastructure we will need to make this work aren’t cheap, and as time marches on, new devices come online and web browsers change how they do things internally. As a reminder, the price commonly advertised for cell phones (typically from $0.00-300.00) is the “subsidized” price that service providers use to entice users into two-year contracts. The actual cost of new devices is always several hundred dollars more, and we will also need to think about such things as storage, security, and adequate power for a large array of devices.

Our TSLAC funds must be spent by September of 2014 and we would be overjoyed to have funds to buy new equipment, attend or receive relevant training, and update our list of tools after that date. If you are feeling generous, please consider donating with this secure giving form.  If you do, remember to designate the libraries, and clearly indicate in the notes that you would like to direct your gift to “User Interfaces Mobile Development”.

Other Suggestions

Thanks for sticking with me through this long post. If you have other suggestions related to our mobile sites and services, feel free to get in touch (email me).

Credits

The image of devices used at the top of this post appeared in Smashing Magazine’s “Open Device Labs, Why You Should Care.”


Sequels to successful Hollywood movies (not to mention prequels) are notoriously hit and miss. For every The Empire Strikes Back, Aliens, and Wrath of Khan, you have The Phantom Menace, Alien Resurrection, and The Final Frontier. (Yes, I do watch a lot of sci-fi. Why do you ask?)

User Interfaces has just produced our very own sequel, a companion to our thrilling 2011 debut, Resource Discovery Systems at the UNT Libraries. We call this one: Resource Discovery Systems at the UNT Libraries: Phase Two Action Plan.

Think of it as more of a continuation than a full-blown sequel. The original outlined a grand, multi-phased RDS implementation vision and then detailed an action plan for just the first phase. After Phase One, we said that we would revisit our vision, update it based on our experiences, and develop a concrete action plan for Phase Two. As astute readers may have already guessed, this is that plan. We hope that you find it worthy of its predecessor: something like Part 2: The Users Strike Back, rather than Part 2: Discovery Boogaloo.

  • Download the Phase Two Action Plan Now!

  • Go ahead: download it, read it! It’s only about 10 pages of actual content, with some pictures. Or, at least read the Executive Summary if you’re short on time.
  • Although we are pleased with it and feel comfortable presenting it to you as a finished product, this is still in draft form! Before finalizing it, we wanted to put it forward to library employees for comment. So after you’ve read through it, please tell us your thoughts in the comments. (I’ve started off the discussion by addressing a few comments and questions that we’ve already gotten.)
  • If you haven’t read the original RDS Report, or if you just want to brush up on it before tackling Phase Two, you can get it here. (We have a publicly accessible version here, but it omits the Phase One Action Plan at the end.)


1)      Will blog authors be allowed to create their own tags?

Absolutely! By default, all authors will have the ability to freely create two types of tags via Drupal’s Taxonomy option:

  • Tags (free keyword tagging)
  • Categories (a dropdown list of controlled vocabulary)

2)      Will moderated comments from readers of the blog be allowed?

By default, the comments function for each blog is disabled. The comment function can be turned on at the request of the blog author(s). It is important to point out that:

  • Blog author(s) will be solely responsible for moderating and approving pending comments and for deleting inappropriate comments
  • All incoming comments will be held in a pending state, and will need to be moderated by the blog owner(s) before they are made available to the public
  • Spam prevention measures have been implemented to help prevent spam. However, UI and Lib-Taco cannot guarantee that pending comments will be spam-free.
  • To reduce user-management overhead, we will not require visitors to create a user account before a comment can be submitted. However, a reCAPTCHA form and an email address will need to be filled out before a comment can be submitted. Please see this example: http://blogs.library.unt.edu/example/post/1

3)      How will the statistics on the number of individuals viewing the blog be calculated?

Our libraries have chosen Piwik (http://piwik.org/) to replace Google Analytics (due to an SLA issue with Google) as the software for collecting usage statistics for the libraries’ websites.

Each blog automatically comes with usage statistics tracking installed, and the stats will be made available for all blog authors to view via https://pw.library.unt.edu.

4)      Will statistics on the blog usage be accessible to the blog authors?  If not, what would the expected turnaround to get those numbers from UI be?

Yes. The statistics for UNTL blogs will be made available to all blog authors via https://pw.library.unt.edu after the requested blog is approved and installed.

5)      Can the blog author have a sidebar profile section with their picture and a brief bio?

The author profile feature (with picture and bio) will be made available by default with each blog installation.

The author’s full name (not EUID) and a link to view the author’s profile will be accessible through each post. Please see the example here: http://blogs.library.unt.edu/example/post/3. You can see the full name of the post’s author, and access to the profile is right beneath the post title. Each author will be able to create and update his/her own bio and profile picture freely.

6)      Will the blogs be indexed by Google?

Yes. In addition, UI will actively help each established blog with marketing via the library website, and will promote discoverability through all means it deems necessary.

7)      Can we allow guest writers not affiliated with UNT?

At this point, it is beyond the scope of this blog service.

8)      Can we customize the side bars?

The majority of typical blog features, such as archives, search, related links, latest posts, and social network sharing, are already made available by default at installation. To keep branding consistent and provide familiar user interfaces across the blogs, blog authors/owners will not, by default, be able to customize the sidebar. However, each blog will be ready to use as installed, without the need for any customization. Blog authors/owners will be able to:

  • Edit “About” page
  • Edit “Contact” Page
  • Create/update Posts
  • Add/Maintain Related links
  • Create taxonomy (free tagging, and controlled vocabulary)

9)      The web form mentions storage limits are subject to change.  What are the storage limits currently?

The image/document upload size limit is set at 2 MB per upload, and so far there is no limit on how many files can be uploaded per author for each blog.

Lastly, here is an example blog we have set up for everybody who is interested in requesting a blog to preview: https://blogs.library.unt.edu/example. This is pretty much what each blog author/owner should expect from the default blog installation.

Please contact Ui.library@unt.edu to acquire guest login access for a test drive.

 


It has been a year since we revealed the Summon web-scale discovery service as our libraries’ designated search tool for discovering online full-text articles. 

How is Summon faring with the public so far?

The folks in the User Interfaces Unit are firm believers in data speaking for itself. Let us show you what the usage data we have collected is telling us.

Please visit our online presentation here: http://tinyurl.com/c5clafc


For background on Sierra and the migration project, please see this PowerPoint presentation file.

  1. What system downtime will be associated with the migration, and what exactly will the “downtime” entail?
  2. How will logins and authorizations in Sierra work? Can some areas retain generic logins (e.g., generic Circ logins for students) that can then be overridden with specific authorizations?
  3. Diacritics in Millennium aren’t easy to type in. We have to type them in by their codes, but we can’t see what the codes are. Is this improved in Sierra?
  4. Will we be getting the Sierra Dashboard product? (Is it a separate piece?)
  5. How will Sierra affect printing templates/options?
  6. Will SQL mean faster Create List completion?
  7. Will there be formal training/tutorials available when we launch?
  8. For how long will we still be able to use the Millennium client on top of Sierra after Sierra goes live? What about the telnet version of III?
  9. Is there a date by which we need to stop making changes to Millennium?
  10. Will patron loads work the same with Sierra?
  11. Will the Music Special Collections database that contains, e.g., the WFAA/WBAP collections continue to work in Sierra?
  12. Will the Output Vouchers and PVE system still work in Sierra?
  13. Will LDAP / single sign-on features still work in the WebPAC in Sierra?
  14. Will the self-checkout stations continue to work in Sierra?

Q: What system downtime will be associated with the migration, and what exactly will the “downtime” entail?

A: There are three dates that will involve system downtime:

  • November 8th, ~8AM to 4 or 5PM, for installation of two new servers.
  • November 18th, ~1-4 hours at the most in the late morning, for upgrading Millennium to version 2011.
  • January 23rd, ~10AM to 6 or 7PM, for switchover to Sierra.

“Downtime” means that the staff-side Millennium client is unavailable, the telnet application is unavailable, and the WebPAC is unavailable. For the first two dates, this will have no effect on electronic resources–the IR Services page and all databases will be available even though the catalog is down. For January 23rd, since we will have migrated the electronic resource front-end into the catalog by that time, the databases/e-journals A-Z lists and search interface will be down, but individual databases and e-journals will not be affected (e.g., if you have them bookmarked).

During the downtime, circulation functions will be performed–depending on the area–either using the Millennium offline circulation program or a paper process. If you work in a circulation capacity and aren’t sure what your area will be doing, please check with your Circulation Work Group representative to find out.

Back to top

Q: How will logins and authorizations in Sierra work? Can some areas retain generic logins (e.g., generic Circ logins for students) that can then be overridden with specific authorizations?

A: The situation with logins in Millennium is a little complex. In Millennium, there are two separate components: logins and initials. Certain settings are tied to your login, while authorizations are tied to your initials. Some areas have group logins and group initials; other areas have individual logins and individual initials; still other areas have some mixture of group and individual logins and initials.

Sierra combines logins and initials into one single “user.” Once you log into Sierra, you inherit whatever authorizations are tied to that user. It will be possible to set up “context users,” which will function similarly to Millennium logins–you log in with the context user’s name/password, then you log in with your username/password to set your permissions. It will also be possible to override permissions if you’ve logged in with a user that has a more restrictive set of permissions and need to perform a particular function that the regular user is not authorized to perform. And our system could use both single-user and context-user access.

For more information, please see the Sierra Knowledgebase article, Migrating Users from your Millennium System.

Back to top

Q: Diacritics in Millennium aren’t easy to type in. Is this improved in Sierra?

A: Unfortunately diacritics in Sierra appear to behave exactly the same way as in Millennium.

Back to top

Q: Will we be getting the Sierra Dashboard product? (Is it a separate piece?)

A: We will be getting the Sierra Dashboard product–it is not something we have to purchase separately.

For more information about the Sierra Dashboard, please see the CS Direct Knowledgebase FAQ on the topic.

Back to top

Q: How will Sierra affect printing templates/options?

A: Not at all. Print templates and options will be the same in Sierra.

Back to top

Q: Will SQL mean faster Create List completion?

A: Yes. Create List jobs will just be running SQL queries in the background, so they should run very quickly in Sierra.

Back to top

Q: Will there be formal training/tutorials available when we launch?

A: Yes. The Sierra page on CS Direct has the Sierra Knowledgebase which has video tutorials and articles. You can also access the Sierra manual (Sierra Web Help).

Back to top

Q: For how long will we still be able to use the Millennium client on top of Sierra after Sierra goes live? What about the telnet version of III?

A: Officially there is no date yet by which the Millennium client will stop working with Sierra. However, we would like to phase out Millennium entirely by June 1st.

In Sierra, there is still a significantly stripped-down version of the telnet interface (called the “Admin Corner”) that contains the functionality that they just haven’t yet moved into Sierra–things like system options and certain circulation settings. They will begin moving those things into Sierra starting with phase 2 of Sierra development with the intent to eventually completely remove the telnet interface.

But, there will still be a full working version of the telnet interface tied to the Millennium client.

Back to top

Q: Is there a date by which we need to stop making changes to Millennium?

A: No. Even though the Sierra database will start building on December 14th, it will remain in synch with Millennium until the time Millennium is taken down on January 23rd for the final switchover.

Back to top

Q: Will patron loads work the same with Sierra?

A: Yes. We’ll need to start using Data Exchange instead of telnet for patron loads, but there’s not much difference.

Back to top

Q: Will the Music Special Collections database that contains, e.g., the WFAA/WBAP collections continue to work in Sierra?

A: It will still be available as it is now for the foreseeable future. The web interface will still be up on port 81. The telnet interface will still be available via the Millennium client that runs on top of Sierra. We will just need to make sure there is a workstation in music that’s set up with the Millennium client so it can be kept up-to-date.

Back to top

Q: Will the Output Vouchers and PVE system still work in Sierra?

A: The “Output vouchers” feature will also transfer to Sierra as-is, so the PVE system should work the same way in Sierra. We will discuss this with Lib TACO before the migration to make sure.

Back to top

Q: Will LDAP / single sign-on features still work in the WebPAC in Sierra?

A: Yes. Users will still be able to log into their accounts in the WebPAC using their EUIDs via LDAP.

Back to top

Q: Will the self-checkout stations continue to work in Sierra?

A: Yes. There will be a small configuration change needed, but they will continue to work.

Back to top


Just in time for the new semester, the UNT Libraries’ website has a brand new look! 

This new design aims to improve the user experience by putting the needs of UNT students, faculty, and staff front and center.

We welcome you to take a close look at some of the new and improved features, and let us know your thoughts and feedback by leaving your comments here.

Highlights:

  • Improved search options and search box locations
  • Added today’s building hours, and improved locations and hours display
  • Added a collapsible “Find” tab throughout the site that provides quick access to frequently used search options, features, and services
  • Improved news and events display to help promote the Libraries’ collections, happenings, and services
  • Improved service-oriented and context-driven information architecture to make discovering and using library services and resources more straightforward
  • Aligned with other UNT schools and colleges in adopting the new UNT branding


From October to December 2011, the User Interfaces Unit (UI) conducted a series of studies asking stakeholders and end-users about their experiences using the now five-year-old UNT Libraries’ website. So what did our users tell us?

Brief Summary:

Our internal stakeholders told us that the following areas of the website could use some improvements: 

  • Marketing features and eye-catching visuals are lacking
  • Locating division/department-oriented information is sometimes difficult
  • Services and policies content provided/maintained by various groups could be integrated more cohesively
  • Access to hours, locations, maps, and contact information could be designed more efficiently
  • Representation of the library’s new organizational structure is lacking
  • The Staff Directory could use some improvement to make it easier to use

Our end-users told us that they: 

  • Tend to leave the site as soon as they’ve found what they need; they have no time to explore the site.
  • Prefer a set of prominently located tabbed search boxes for finding resources from the library homepage.
  • Are not in favor of the Single Search Box that only offers combined search options. They need to know the search options and what they are searching in.
  • Believe “Ask Us” is an important feature to have; however, fewer than 25% indicated that they had used “Ask Us” before.
  • Like features that show today’s hours and/or upcoming hours on the library website homepage.

To find out more, please feel free to check out our Tech Talk presentation.

What are our actions in response?

Based on our study findings, the User Interfaces Unit went on to completely redesign our website’s user interface. Here is a list of the visual mockups we developed to demonstrate how those study results influenced the interface redesign of our website.

[Visual mockups of the redesigned website interface]


Summon

  1. Why did we purchase Summon? What is its purpose?
  2. Can patrons limit their searches to select databases in Summon?
  3. How much of UNT’s Electronic Resource material is available for searching in Summon? Will that number improve in the future?
  4. Can remote users get access to all of the same material in Summon as on-campus users? Are there any issues with remote access?
  5. Where does Summon’s data come from?
  6. Will we be indexing our catalog data in Summon as well?
  7. How does Summon decide what resources to rank as more relevant than others?

Resource Discovery

  1. What exactly do you mean when you say “resource discovery?”
  2. Improving resource discovery means making our discovery experience more like Google’s. Right?
  3. What are you working toward? What’s the plan? What will it look like when it’s done?

 

Summon

Why did we purchase Summon? What is its purpose?

Summon is a “Web-scale discovery system,” which is a relatively new sort of tool for helping patrons find (or “discover”) library resources in ways that were difficult–or flat out impossible–with our traditional tools. We purchased and implemented Summon mainly to fill a gap that existed in our resource discovery infrastructure: quick, comprehensive discovery of full-text articles that doesn’t require any knowledge about specific databases. Summon’s index combines article-level metadata and full text for the majority of our subscription e-resources together in a single bucket, allowing users to search for articles just as they would within a single database, but without having to know which database(s) to use.

By implementing Summon, we intended to fulfill an unmet need–not to replace any of the traditional components of our online discovery infrastructure. Individual databases and e-journals are still available and accessible from their respective interfaces and the catalog. For deep research, we should certainly continue to point people to the appropriate databases and other resources for their subject area.

But not all research tasks require depth. Patrons may not want to start their research with a particular database, or they may wish to search broadly before they search deeply. For these situations, Summon offers some new possibilities.

  • Sometimes people may need to research a topic or subject area with which they’re not very familiar. We might prefer that they take the time to read a pertinent subject guide, learn the appropriate databases to use, and go about their research that way. But, unless they’re interested or highly motivated, most people will not do more work than is required. This tool helps these people find some (hopefully) relevant resources where they might otherwise stubbornly search the wrong place and not find anything useful. For example:
    • New students (especially undergraduates) who are not familiar with the library or with the subject that they need to research.
    • Experienced researchers who are researching a subject area that is new to them–perhaps for interdisciplinary research.
  • Going straight to a particular database is not always the best search strategy. Tackling a research task by starting as broadly as possible and then narrowing as you go is perfectly valid, but it’s not a strategy that existing library tools support well. On the other hand, search engines on the Web and on retailer websites not only support this method, they actively encourage it. For people who prefer the broad-to-narrow approach, this tool helps fill that void. For people who prefer more precise, more methodical approaches, this tool might not be the right fit.
  • Even if you’re a serious, experienced researcher who knows your subject area well, knows what databases to use, and is generally happy with how existing library tools support these activities, Summon can help you discover new things and broaden your horizons. Since Summon doesn’t limit you to one or even a handful of databases, it lets you cast a very wide net. And, since searches are quick and easy, you don’t have to worry about investing a lot of time in going down bunny trails. It’s relatively low risk and high reward.

Like any other tool, Summon has its place–but it’s not going to satisfy every need or be perfect for every user. It’s one piece of the puzzle that helps us to provide a more satisfying, more complete resource discovery experience to our patrons. UI’s task is to fit this piece together with all of the others so that our patrons can get a complete picture and select the right tool for the job at hand.

Back to top

 

Can patrons limit their searches to select databases in Summon?

Effectively, the answer is no–there is no way to limit searches by database. People who wish to search particular databases should go to the databases directly (e.g., via the Database A-Z list).

There is a parameter that you can send in the URL that will limit by resource, but to use it you have to (1) know that the parameter exists and what it is, and (2) know what the Serials Solutions codes are for the resources in question. I think it’s intended to be used for testing purposes only.
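For illustration only, here’s a minimal sketch of how such a URL might be constructed. To be clear, this is an assumption-laden example: the hostname, the s.q and s.fids parameter names, and the database codes below are stand-ins for whatever Serials Solutions actually uses, so don’t treat them as authoritative.

# Minimal sketch (Python): build a Summon search URL limited to specific
# databases. The host, parameter names, and database codes are assumptions
# for illustration; check Serials Solutions' documentation for real values.
from urllib.parse import urlencode

BASE_URL = "http://example.summon.serialssolutions.com/search"  # hypothetical host

def summon_search_url(query, database_codes):
    """Return a search URL restricted to the given (hypothetical) database codes."""
    params = {
        "s.q": query,                        # the user's search terms
        "s.fids": ",".join(database_codes),  # comma-separated database codes
    }
    return BASE_URL + "?" + urlencode(params)

print(summon_search_url("digital libraries", ["DB123", "DB456"]))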

Back to top

 

How much of UNT’s Electronic Resource material is available for searching in Summon? Will that number improve in the future?

Based on the initial analysis of UNT Libraries’ electronic resource content done by Serials Solutions in October 2011, Summon (at that time) covered ~92% of our e-resources. Summon coverage continually improves as new content partners sign up and are added to the index, so the 92% figure should increase as time goes by. Serials Solutions’ coverage analysis document states, “Serials Solutions works closely with our Summon clients to prioritize acquiring new content whether it is from paid sources or openly available sources. We welcome libraries suggesting additional open access content as this can usually be added to the index quite easily. Paid content requires negotiation with vendors, but is also added at a rapid pace.”

UPDATE 04/2013: At the time that we were implementing Summon (i.e., January 2012), Serials Solutions estimated that they had content from 6,800+ publishers and 94,000+ individual journal titles. Now their data shows they have content from 7,250+ publishers and 136,000+ individual titles. That’s a 6.6% increase in the number of publishers and a 44.7% increase in the number of titles indexed in 15 months.

If users cannot find what they’re looking for in Summon, they can still access databases or e-journals individually via the database and e-journal A-Z lists.

You can view the content analysis that Serials Solutions provided us here: https://ui.library.unt.edu/project-manager/documents-sharing/1303 (note that you will have to log in first to view the document). If you would like to see a more thorough analysis showing exactly what journal titles UNT subscribes to that aren’t covered in Summon, please send an email request to UI. We can only share the list with people affiliated with UNT.

To see more information about Summon’s content and coverage, please visit Serials Solutions’ website.

Back to top

 

Can remote users get access to all of the same material in Summon as on-campus users? Are there any issues with remote access?

Remote users can access most of the same material in Summon that on-campus users can access; they just have to authenticate through the library’s proxy server first. Because our proxy sets a cookie on their machine, they only have to sign in once per session to access full-text materials. The only difference that off-campus users of Summon should experience is that they will not see Web of Science results or citation counts embedded in Summon search results.

When we first released Summon in February 2012 we had lots of issues with remote access, especially for Internet Explorer users. Those issues were resolved within a couple of weeks. If you know or hear about off-campus users who are still having trouble accessing full-text resources via Summon, please ask them to submit a ticket so that we can troubleshoot with them.

Back to top

 

Where does Summon’s data come from?

Data in the Summon index comes directly from content providers (7,250+ publishers and 136,000+ journal and periodical titles)–not from database providers or other content aggregators like EBSCO. This insulates Serials Solutions and Summon against problems like this.

Back to top

 

Will we be indexing our catalog data in Summon as well?

No, there are no immediate plans to do so. Our RDS Report describes why.

  1. Web-Scale Discovery Systems (like Summon) are, by nature, proprietary. We should be careful about coupling all of our content and our entire RDS strategy to a single proprietary system.

    From the RDS Report, Literature Review, Observations, Page 17:

    At this point in time, using such a system [a proprietary Web-Scale Discovery product] to serve as a single access point might very well be putting all of our eggs into one basket, but—if used as one component within a larger resource discovery framework—it would give our users much-needed article-search capabilities without tying our entire discovery strategy to one system. It would give us the flexibility to continue working toward making a genuine single-access-point search a reality without being beholden to what one vendor will or will not allow. 
     
  2. The ultimate goal of indexing our catalog in Summon would be to use it as the single access point for our library’s materials. But, even if we indexed the catalog in Summon, there would still be materials that we could not index there. Any single search based entirely on Summon would be incomplete, and the single results set presented by Summon would make it difficult (if not impossible) for users to understand what might be missing.

    From the RDS report, Institutional Data, Data Analysis, Page 19:

    Because RDSes only have partial coverage of library resources, what a particular RDS searches—and how to present that information to users—becomes a big issue. Although a single-access-point RDS for libraries sounds great on paper, in practice it requires additional qualification about what’s being searched as well as supplemental access points (e.g., to databases and e-journals) to shore up the weaknesses. We haven’t seen any user studies that address this, but we would guess that this reduces the effectiveness of the single-access-point search. Web-Scale Discovery Systems are a big step forward from the information silos of libraries’ past—but they are not yet able to provide a single-search experience on par with Google. 
     
  3. The phased model we developed and outlined in our RDS Report deliberately keeps our catalog and Web-Scale Discovery System indexes separate. The plan is to integrate our discovery systems at the interface layer rather than to combine the indexes. This helps us quarantine, as much as possible, the systems that are entirely vendor-controlled and retain control over how we present our data to users.

    From the RDS report, Recommendations, Our Vision: The RDS Implementation Model, Page 26:

    The first step—phase one—will have us deal with the weakest component of the existing framework: the electronic-resources search. Current-generation Web-Scale Discovery Systems could actually do what an electronic-resources search implies: search across a wide array of individual articles. Although such a system—both the application and the data—would be closed-source and vendor-controlled, the functionality that it would provide out-of-the-box would justify incorporating it into our model. Furthermore, at this stage we would lessen the effect of that issue in two ways. First, we would select a system that provides a fully-functional API that would give us flexibility in the future, at least at the application layer. Second, we would refrain from incorporating our catalog data into the system. Though this would prevent us from offering a single-search solution at this point, we contend that such solutions are not yet tenable. They do not actually offer a single search of all resources; they obscure too much from end-users; and they would place us on a path of putting our data into systems in which a vendor controls the content and the system.

    And Page 28:

    In phase three, we begin our own development at the application layer. It may be unlikely that vendors of Web-Scale Discovery Systems would ever allow third parties direct access to their data, but a good API would allow us to incorporate the system’s functionality more fully into our existing applications. Hooking the Web-Scale Discovery System and the Discovery Layer applications together would, for instance, allow us to provide a high degree of consistency to the end user, even if we retain separate Books and Articles searches.

When we originally wrote our RDS Report, although we recommended against loading our catalog into Summon, internally we were still entertaining the idea of experimenting with it just to see if there would be benefits—if we could do it easily. But, since writing our report, the landscape—externally and internally—has continued to evolve, and much of this evolution has supported our initial findings and reasoning on this topic. A growing body of evidence from usability testing suggests that results combining article-level items and catalog items are confusing—that users prefer these two basic types of things to be kept separate in our interfaces. Our own testing during development of our new website showed this very clearly.

There has been growing interest in what has been termed the “bento box” style of search results interface, where the top results from a variety of sources are combined at the interface layer on-the-fly and presented in separate boxes, showing, e.g., Articles results, Catalog results, Database results, etc. in different (clearly-labeled) areas on one screen. There is a growing consensus that this is the current best-of-breed approach to providing a search box that searches all library resources, and it wouldn’t require indexing our catalog in Summon.
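To make the “bento box” idea more concrete, here is a minimal sketch of combining results at the interface layer. The three search functions are hypothetical stand-ins for real backends (our catalog, Summon, our database A–Z list); only the overall shape of the approach is the point.

# Minimal sketch (Python) of a "bento box" results page: query several
# sources in parallel and present the top hits from each in its own
# labeled box. The search functions are stand-ins, not real backends.
from concurrent.futures import ThreadPoolExecutor

def search_catalog(query):
    """Stand-in for a real catalog search backend."""
    return ["catalog result 1", "catalog result 2", "catalog result 3"]

def search_articles(query):
    """Stand-in for a real Summon/article search backend."""
    return ["article result 1", "article result 2"]

def search_databases(query):
    """Stand-in for a database A-Z search backend."""
    return ["database result 1"]

SOURCES = {
    "Catalog": search_catalog,
    "Articles": search_articles,
    "Databases": search_databases,
}

def bento_search(query, top_n=3):
    """Query every source in parallel; return the top hits per labeled box."""
    with ThreadPoolExecutor() as pool:
        futures = {label: pool.submit(fn, query) for label, fn in SOURCES.items()}
        return {label: future.result()[:top_n] for label, future in futures.items()}

for label, hits in bento_search("climate change").items():
    print(label)  # each source gets its own clearly-labeled box
    for hit in hits:
        print("  -", hit)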

(For more discussion about the “combined library search” idea, see the FAQ question, Improving resource discovery means making our discovery experience more like Google’s. Right?)

Back to top

 

How does Summon decide what resources to rank as more relevant than others?

Summon uses a relevance-ranking algorithm developed by Serials Solutions. Full-text items receive a static rank based on content type, publication date, scholarly/peer-review status, and whether or not an item is in the local collection. Items that are more recent and peer-reviewed are favored over those that are not, as are items held in the local collection.

When a user searches Summon, a dynamic rank is generated–search results are compared against a user’s query and ranked based on term frequency, field weighting, term stemming, and stop-word processing. A combination of the dynamic rank and the static rank determines the final ranking.
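As an illustration only (this is not Serials Solutions’ actual algorithm; the fields, weights, and scoring functions below are invented), combining a static, index-time score with a dynamic, query-time score might look something like this:

# Illustrative sketch (Python) of static + dynamic relevance ranking.
# This is NOT Serials Solutions' actual algorithm; fields, weights, and
# scoring functions are invented to show the general two-part structure.
from datetime import date

def static_rank(item):
    """Query-independent score, computable once at index time."""
    score = 0.0
    if item.get("peer_reviewed"):
        score += 1.0  # favor peer-reviewed items
    if item.get("in_local_collection"):
        score += 1.0  # favor items in the local collection
    years_old = date.today().year - item.get("pub_year", date.today().year)
    score += max(0.0, 1.0 - years_old * 0.1)  # favor more recent items
    return score

def dynamic_rank(item, query_terms):
    """Query-dependent score: weighted term frequency by field."""
    field_weights = {"title": 3.0, "abstract": 1.0}  # field weighting
    score = 0.0
    for field, weight in field_weights.items():
        text = item.get(field, "").lower()
        for term in query_terms:
            score += text.count(term.lower()) * weight  # term frequency
    return score

def final_rank(item, query_terms):
    """The final ranking combines the static and dynamic scores."""
    return static_rank(item) + dynamic_rank(item, query_terms)

item = {"title": "Library discovery systems", "abstract": "A survey.",
        "pub_year": 2012, "peer_reviewed": True, "in_local_collection": True}
print(final_rank(item, ["discovery", "library"]))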

Back to top

 

Resource Discovery

What exactly do you mean when you say “resource discovery?”

Ranganathan’s third law: Every book its reader. Resource discovery is foundational to library science. When we consider how best to organize our resources to help our patrons find what they need, we are considering the issue of resource discovery.

In the print universe, the catalog was one of the central systems that enabled this–you could be reasonably sure that you’d searched the entirety of the library by checking the catalog and maybe a handful of other sources. But, as more content has moved online–and as more of the content libraries obtain and make available has moved online–the number of systems for searching that content has increased in kind. For technical reasons, intellectual property reasons, practical reasons, and many other reasons, the content that a library makes available to its patrons has come to exist in many different systems. Each system has its own interface for searching the content that it holds. This greatly complicates how people find what they need. To use this smorgasbord of systems appropriately and effectively, you need to have a better understanding of how libraries work than most people are willing to obtain.

On the flip side, non-library entities have grown to deliver much better online resource discovery experiences. Amazon and other online retailers make it easy to navigate their product catalogs. Google, of course, makes it dead simple to find something that is relevant to just about any query. Using the library is comparatively difficult.

Over the past 10 years, technologically literate folks working in the Library and Information Science profession have been working toward making library resources easier to find and use. One of the fruits of this labor is the “Resource Discovery System,” aka “Next Generation Catalog,” aka “Web Scale Discovery System.” This type of system uses a central index to store content from a variety of sources and allows use of a single interface to search/discover that content. The Summon system that we just purchased and implemented is one example.

But–it’s very important to keep in mind that any system that lets people search for resources could rightfully be called a resource discovery system, and systems like Summon are not the be-all, end-all for improving discovery of library resources. Like any other type of system, they have their positives and negatives, and they have to be deliberately and intelligently incorporated into the overall discovery experience (e.g., the library website) in order to be effective.

So when we–the User Interfaces Department–talk about, e.g., improving resource discovery (or resource discovery interfaces) at UNT Libraries, we are talking about both Resource Discovery Systems in particular and about resource discovery systems in general. We’re talking about the discovery experience as a whole. When a patron comes to our library website, how do they get to the resources that they need, no matter what they are and what system they’re in? That’s what we mean when we say “resource discovery.”

For a more complete picture, please see our RDS Report, especially the Introduction and the Literature Review sections.

Back to top

 

Improving resource discovery means making our discovery experience more like Google’s. Right?

Yes and no. It depends on what you mean by “more like Google.” If you mean that we need to continue simplifying the discovery experience, construct the right tools for the right contexts, customize our tools based on user data and user feedback, and continuously adjust them based on changing user needs (based in turn on user data and feedback)–then yes, absolutely we need our discovery experience to be more like Google’s.

On the other hand, if you mean that we need to mimic Google’s search functionality–i.e., just provide a single search box that searches a single system containing everything we own and returns one results set for each query–then the answer is a very qualified “no.” Or, at least, not necessarily.

Over the past 7 or 8 years, libraries and related organizations have gathered lots and lots of data showing that users prefer starting their research on sites like Google and Wikipedia. Plenty of focus group and user survey data shows that users say they’d like the library search experience to be more like Google’s. Based on this information, it’s easy to assume that all we need to do to make our patrons happy is to implement a single, Google-like search box. But–until very recently–providing any search experience that crosses the majority of library resources hasn’t been possible, so this assumption is based mostly on preference data: people telling us what they think they want. With the advent of Web-Scale Discovery Systems, as libraries actually implement their single, Google-like search boxes, it’s now possible to test the assumption. Although it’s still very early, some of the user data that’s been published recently contradicts–or at least qualifies–the idea that users just want a single, Google-like search. (See the More Resources section, below, for supporting examples.)

We do have to tread carefully here and make sure we check our assumptions. There are myriad reasons why users are having mixed reactions to libraries’ single-search implementations. First, there are obviously practical reasons, which include interfaces with usability issues and an underlying infrastructure that still can’t quite provide a totally seamless experience. In short–part of the problem is that the technology is still new, and–although it’s improving quickly–it just isn’t yet able to match the sort of expectations set by Web search engines.

But what’s interesting is that there might actually be conceptual problems with putting all library materials into a single bucket. As Dana McKay’s paper [3] points out, there are distinct differences between how users use books and how they use online articles. These differences are strong enough that they may lead to confusion when users get search results that mix the two together. Our own user studies conducted during November and December of 2011 support these findings: when examining the home pages of different libraries with different types of search boxes, users actually showed a strong preference against library websites that employed a single search box. They liked search boxes that instead presented options, usually in the form of labeled tabs, because that gave them an idea about what they were searching. Of course, this is still preference data–but it’s preference data based on concrete examples.

Something else to keep in mind is that Google Web search and the Web itself go hand in hand. Library search tools don’t search the Web, so expecting them to work similarly to Web search engines–even at a conceptual level–is perhaps a little unrealistic. One of the earliest metaphors that came into widespread use for browsing the Web was “surfing”–which is apt (if silly). But can you imagine anyone “surfing” library resources? The Web is a complex network of interlinked documents and files. It’s vast. It’s open. Although much of its data is not very well-structured, it does at least share a common structure (HTML, XML) and a common infrastructure. You can write a program that crawls from document to document on the Web and automatically gleans lots of contextual information based on what links to what, the text in which the link is embedded, and lots of other contextual clues. The contextual data might not be 100% accurate, but it’s incredibly rich. Library data, on the other hand, consists mostly of various separate pools of records/resources that (1) have little (if any) contextual data, (2) are not linked together in any meaningful way (not universally and not with unambiguous, machine-readable links), (3) do not share a common structure, (4) do not share a common infrastructure, and (5) are generally not freely/openly available. So much of what Google has leveraged to make Web search work well is simply not part of library data. Even attempting to normalize library data/metadata and pool it all into the same index does not give you the Web–or anything really very close to it.

Going forward, it’s clear that continuing to work toward consolidating the number of discovery interfaces and pools of library data will help improve overall discovery. We just want to make sure we’re proceeding in such a way that we’re not setting users up for confusion or disappointment. Lown, Sierra, and Boyer [2] put it well: “Although libraries may be inclined to design their home pages around a single search box to give an impression of simplicity, inadequate functionality and resource coverage may frustrate users and hide significant portions of library resources and services.”

Back to top

 

What are you working toward? What’s the plan? What will it look like when it’s done?

The last section of our RDS Report (Our Vision: The RDS Implementation Model) addresses this question broadly and shows one possible end-game scenario. The take-away from that, however, isn’t that particular scenario. The following points summarize what we’re ultimately trying to accomplish.

  1. We’re working toward providing a more unified interface for our users to search/find library resources, no matter what those resources are or what system they live in natively. As much as possible, we would like to provide a “one-stop-shop,” even if, at the end of the day, that shop is divided into different departments.
  2. Where possible, we’re working toward consolidation of resources, at least for the purpose of resource discovery. It’s easier to provide a unified interface when your data is interoperable–e.g., in the same index. But–
  3. we’re also working toward having ultimate flexibility with our data and our user interface. Although consolidation of resources is important, we don’t want to compromise control over our local data and how our users interact with it. We want to incorporate the discovery experience into our website–we do not want to have to shoehorn things into a proprietary system or interface where they don’t belong just because it’s our only option. This means we want systems that provide us with API access to our data so that we can query it, retrieve it, and then mold it to fit our interface (see the sketch after this list).
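As a minimal sketch of what that flexibility looks like in practice, here is one way we might consume a discovery system’s API and render the results with our own markup. The endpoint URL, parameters, and response fields below are hypothetical stand-ins, not any real vendor’s API.

# Minimal sketch (Python): pull search results from a discovery system's
# JSON API and render them with our own markup. The endpoint, parameters,
# and response fields are hypothetical stand-ins, not a real vendor API.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.example-discovery.com/search"  # hypothetical endpoint

def search(query, page=1):
    """Fetch a page of results from the (hypothetical) discovery API."""
    response = urlopen(API_URL + "?" + urlencode({"q": query, "page": page}))
    return json.load(response)["results"]  # assumed response shape

def render_result(item):
    """Mold the vendor's data into our own site's markup."""
    return '<li><a href="%s">%s</a> (%s)</li>' % (
        item["link"], item["title"], item["type"])

for item in search("resource discovery"):
    print(render_result(item))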

To help give you a more concrete idea, here are a couple of library websites whose resource discovery interfaces have inspired us throughout our investigations and planning.

UPDATE 02/2014: BYU and Villanova have changed their websites. BYU’s is completely different, although still interesting. Villanova uses the same basic model as is discussed below.

  1. North Carolina State University Libraries. They offer a tabbed search box where users can choose to search books, articles, or the website. But their default search is a combined search–which makes sense if, as Teague-Rector and Ghaphery found [5], users just use the default search most (~60.5%) of the time. And their combined search is interesting–it isn’t really Google-like, since the results don’t combine everything into a single bucket. It keeps results for different types of things separate and helps guide users to the particular bucket that they’re actually interested in. Note that this is a locally developed tool that would require some development work on our end.
  2. Brigham Young University’s Harold B. Lee Library. Another example of a discovery tool that has both a consistent interface and is well-integrated into the website. In this case the combined search does combine articles, books, etc. into one set of results. Note that, during the user studies we conducted in November and December of 2011, users preferred this style of search box (of the options we gave them).
  3. Villanova University’s Falvey Memorial Library. This is a slightly different approach, but it provides some food for thought. Note how closely this approaches the “single search-box” interface, and yet it’s actually much more like an Amazon search than a Google one. Based on existing user data, they’ve done a good job separating things that should be separate and setting default options that make sense in particular contexts. (The home page search defaults to “library website,” while the Search page search defaults to “library catalog.”) Their “catalog” search is actually a combined search, but it presents results for books and articles separately, so it presumably avoids the problem of mixing together results for things that users keep separate in their minds. Most importantly, the website interface and search tools are tightly integrated and seem to function based on a well-thought-out high-level model. As you navigate the site and use their discovery tools, it never appears that you leave their website and enter separate systems. Brown University Library and the University at Buffalo Libraries have discovery tools that function similarly to Villanova’s, but Villanova’s is still better unified/integrated.

In some ways, the journey is going to dictate the destination. Although we have our ideas, we don’t know exactly what it will look like when it’s done. This is why we have planned a number of phases to help move us forward. At each step of the way, we’ll be collecting data–search data, usage stats, and user feedback. We’ll also reevaluate what we’re doing after each phase is complete. When other institutions that are a step or two ahead of us release data about what they’ve done, we can learn from that and adjust our own model so that we’re always working from the best, most recent data. Yes, this means that our vision will probably change along the way. But that’s just part of being responsive to an environment that’s constantly evolving.

Back to top

 

More Resources

  1. Howard, D., & Wiebrads, C. (2011). Culture shock: Librarians’ response to web scale search. Retrieved from http://ro.ecu.edu.au/cgi/viewcontent.cgi?article=7208&context=ecuworks
  2. Lown, C., Sierra, T., & Boyer, J. (2012). How users search the library from a single search box. College & Research Libraries. Retrieved from http://crl.acrl.org/content/early/2012/01/09/crl-321.full.pdf+html
  3. McKay, D. (2011). Gotta keep ’em separated: Why the single search box may not be right for libraries. Hamilton, New Zealand: ACM. Retrieved from http://dl.acm.org/citation.cfm?id=2000772
  4. Swanson, T. A., & Green, J. (2011). Why we are not Google: Lessons from a library web site usability study. The Journal of Academic Librarianship, 37(3), 222-229. doi:10.1016/j.acalib.2011.02.014. Retrieved from http://dx.doi.org/10.1016/j.acalib.2011.02.014
  5. Teague-Rector, S., & Ghaphery, J. (2008). Designing search: Effective search interfaces for academic library web sites. Journal of Web Librarianship, 2(4), 479-492. doi:10.1080/19322900802473944. Retrieved from http://dx.doi.org/10.1080/19322900802473944
  6. Thoburn, J., Coates, A., & Stone, G. (2010). Simplifying resource discovery and access in academic libraries: Implementing and evaluating Summon at Huddersfield and Northumbria universities (Project Report). Newcastle: Northumbria University/University of Huddersfield. Retrieved from http://eprints.hud.ac.uk/9921/
Back to top