
Tuesday 10 September 2013

Linked Data: from Aleph/Primo to the Dictionary of Luxembourgish Authors

Roxana Popistasu, IT staff, Bibliotheque nationale de Luxembourg

Project started last year and went live simultaneously with Primo. The NLL also manages Aleph & Primo for the network of libraries in Luxembourg. There is also a partnership with the Centre national de litterature, which manages the content of Autorenlexikon (dictionnaire des auteurs luxembourgeois). The idea is to link all this info. Other partner: Magic Moving Pixel, an IT management company for Autorenlexikon. The goal of the project was to evaluate the work involved.

Questions to be answered:
- How to create a link between authors? String matching or IDs?
- How to deal with identical names?
- How to deal with the authority records?
- How to find (and save?) the matches?

The initial results were unsatisfactory. Connecting the authors between AutorenL and the bib database based on IDs produced a low number of matches (60%), and this was even lower with the authority database (35%), so we had to use string matching. But even 60% was better than nothing, so we used that first. A link was added in the bib record.

The actual project for setting up the linking started in March 2013. We started by adjusting the matching algorithm and creating a database of matches. Then came the need to create a web service to be used for the display in the catalogue and in AutorenL, then to display matches in the Aleph OPAC and to create the validation service. Matches were made on author and title.

The algorithm: we created normalisation rules, e.g. eliminating certain characters, upper/lower case differences etc. We also worked on standardising the cataloguing rules, which differed between Aleph (MARC21) and AutorenL. The levels of matching needed to be checked. We created a database of matches, regularly and automatically updated using exports from Aleph imported into that DB.
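Purely as an illustration of the kind of normalisation and author/title matching described (this is not the library's actual code, and the record layout below is a made-up assumption), a minimal Python sketch could look like this:

```python
import re
import unicodedata

def normalise(text: str) -> str:
    """Normalise a heading for matching: strip accents, lower-case,
    drop punctuation and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^\w\s]", " ", ascii_only).lower()
    return re.sub(r"\s+", " ", cleaned).strip()

def match_records(aleph_records, autorenl_records):
    """Return (aleph_id, autorenl_id) pairs whose normalised author and
    title agree. Both inputs are lists of (record_id, author, title)
    tuples -- a hypothetical structure, not the actual export format."""
    index = {}
    for rec_id, author, title in autorenl_records:
        index[(normalise(author), normalise(title))] = rec_id
    matches = []
    for rec_id, author, title in aleph_records:
        key = (normalise(author), normalise(title))
        if key in index:
            matches.append((rec_id, index[key]))
    return matches
```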

Choices were made for the matches, e.g. using pseudonyms and alternative names, so looking at those enabled us to do the matching if one person was represented differently in the different databases. A validation service was set up for the National centre for literature, to assist them in doing the matching on their side. This was based on levels of accuracy. They could find the Aleph system ID where relevant. This has also helped them find small mistakes in their data which they wouldn't otherwise have found.

Phase 2 of the project is how it displays in Primo and making links between the authority database and AutorenL, because at the moment this is only done with the Aleph bib database. We are also going to work with VIAF to publish our authority data, but first we need to improve it. This project can be part of that process. We will investigate further how to link on IDs and see how to integrate with other systems, such as DigiTool.

Ex Libris General Questions & Answers

[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

Q: Will read-only access to the Alma configuration screens be available?
A: First let me explain what this is: it refers to the process during implementation. Ex Libris performs the configuration in the first instance, but once the customer has received the Certification Training, Ex Libris doesn't do this anymore. At the moment this happens near the end of the implementation, just before Alma goes live. We've received important feedback that people want more transparency and want to understand the configuration options better during implementation. By the end of the year we will have a read-only option for the configuration. This will first be at the fulfilment level. This will be incorporated as part of the permissions based on roles.

Q: Data quality and how to re-use cataloging already done is important. What is happening with the plans for the Alma community zone where we expect we should be able to find most records instead of searching them via external sources?
A: It's part of our key goals to improve efficiencies, with a focus on enabling streamlined sharing. We've started by modelling this; we're at the stage where we work with the policies and cataloguing advisory group and expect to get results towards the end of the year. We presented the suggested model now and you will recognise real efficiencies in processes. We are finalising the development of those recommendations.

Q: Plans to share EzProxy configurations in the community zone?
A: The community zone isn't planned for that, but it's a good question and we should think about that.

Q: When can we expect to see license information in Alma's uResolver (obligation to display copyright information for e-resources)? Will we be able to rename fields or add local terms to the uResolver copyright information when it is finally built out?
A: The option to show the licensing info is planned for the 2nd quarter of 2014. We will allow you to adjust the wording of those fields to be more user friendly. This is in our roadmap.

Q: Cloud safety - What if anything happens to a Data Center? How will Ex Libris restore services to customers whose data and applications are housed in the Cloud?
A: We're aware that our Cloud is a lot of responsibility. We've looked at all the details for hardware, using the best hosting that we can find. We have six internet providers serving the site, multiple firewalls, and applications designed and tested specifically for this. We've upgraded our backup: we used to do it on tape but now we do it on disc offsite. The ability to restore from there to another location would be much quicker than in the past. We have multiple ways to resolve such issues. It's easier to restore on site, but we can do it differently. Restoring is not a simple process. In a catastrophic situation, it would be less than a year... but more like a week...

Q: When can we finally expect daily analytics for Alma?
A: We've just announced that this is near release. It's been rolled out in 10 institutions in the US as part of the testing and it should be available to all customers by mid-October. The short answer is that it's already there.

Q: WorldShare Management Services is OCLC's response to Alma. Although Alma is more mature, given WorldCat as the primary cataloguing database of WMS, do you really think it makes sense to build a second (semi-)global bib library within Alma?
A: We want to build efficiencies in processes, not to re-build WorldCat, which was designed a few decades ago. We're looking into the future. We want to integrate any existing service that is useful to you. Such environments include WorldCat, and this can be included in Alma. Whether you're a standalone institution or part of a consortium, integration is possible. Others who don't work with WorldCat may find new efficiencies in Alma.

Q: OCLC has recently published info about the WorldCat Metadata API. What about Ex Libris, is it planning to integrate this in Alma (it looks like a good replacement for Z39.50)?
A: We've started discussions with OCLC about this; it's on its way.

Q: UStat is the first SaaS component from Ex Libris. How long will it remain a separate product for SFX users? Will its functionalities be integrated in Alma?
A: Alma analytics will have greater functionalities.

Q: Does Salesforce offer a Cloud Status page to indicate when the servers hosting Salesforce CRM are down?
A: We will post the status of our services, including Salesforce, on our Customer portal page.

Q: The Ex Libris Customer Center platform is not easy to search. After developing such a good product as Primo, can we expect the same for the Customer Centre?
A: Priority is on services to users but we will look to improve the search engine for the Customer Centre next year

Q: With OA, this changes how content is made available. In SFX, with effect in Primo, we work at journal level but OA is at article level; how will Ex Libris manage this?
A: Part of our next linking project is to provide this info to the link resolvers

Q: Are e-book loans scheduled in any ExL software?
A: This is more of a question for vendors and we'd be interested to see vendors' models

Q: How far has the implementation of RDA in combination with Aleph & Alma proceeded?
A: The Voyager team works closely with the Library of Congress team, Aleph didn't need any configuration work, and Alma cataloguing support will be finalised in the next few months. Next steps: all eyes are on BIBFRAME, but they've not agreed to implement FRBR, they're working on a new model, and they also don't think MARC21 should be the standard, so it's too soon to say how we'll move forward on this question.

Q: Primo was launched as part of the strategy to decouple the frontend from the backend. Many Primo services are available only to Alma users and the front end of Alma *is* Primo; could you expand on this strategic shift to a monolithic (or symbiotic) couple?
A: It's tactical/pragmatic rather than strategic. The decoupling between front end and back end is still relevant; both Primo and Alma can enable 3rd-party components. The next-gen framework is still relatively young, we develop a lot of use cases but we still don't have enough best practice etc., but it will come. The tight interaction between Alma & Primo is based on interfaces.

Q: A marketplace for metadata requires cooperation between all parties. What about collaboration with EBSCO? Other discovery tools seem to be able to agree with each other to share data. Why not Ex Libris?
A: The good news is that it is becoming more apparent what's going on and that EBSCO doesn't enable its subscription-based index for discovery services such as Primo. You have to buy EDS to access EBSCO content, so you are told to buy another discovery system. Otherwise it would be through an API. Indexing is the best way to do it, but they offer their service through an API. This is a problem especially for those of you who pay for this content. We find it hard to find a way around it. In terms of other deals, when it comes to content and aggregators such as EBSCO, their content is not unique and we are making good progress in providing alternatives. We sign deals with publishers.

Primo - product update

[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

Gilal Gal, Director of Product Management, Ex Libris

In Primo, we focus first on the "engine" to make it functional. It is pure engineering, ensuring movement in the production. With Primo Central we need to ensure it works 24hrs a day, it is quite an achievement but takes a lot of focus. We are the software vendor, we don't sell content, but we can't talk about Primo without talking about the content because we have to handle it correctly. Primo is the tool through which you give services to your customers and we take into consideration comments from users and yourselves, as administrators.

Breadth of development in Primo:
We develop OPAC-type functions, such as browsing. We rely on bX, which is contextualisation of searching based on preferences. We listen to the community's requirements, such as a FRBR presentation, added functionalities such as direct requests for photocopies/digitisation etc., and the ability to update the password. We continue to invest in the mobile interface.
We were asked to invest more in the administrator service (i.e. not via the command line) to load files etc., so we have improved those services. We will also continue to invest in security; we've achieved ISO 27001 certification for information security management. We are using big data infrastructure (Cassandra and Hadoop) for data processing. We will move to the multiple SaaS infrastructure in a future release. We are not abandoning Oracle, but for more sophisticated things some of the data structures will move to Cassandra.
We have a Primo Central resource activation to facilitate management of resources. We will continue to improve the performance of specific things, such as getting results faster so that the search experience for the users is improved. One of the things we want to develop is the browse virtual shelf functionality.

In terms of content we are adding abstracting and indexing collections. We are including published research because that's what people want most, but we are also including things such as theses, technical reports (unpublished scientific research), raw material for research (Mintel, Data-Planet) etc. Primo Central enhancement with phantom collections: activate collections via alternative coverage.

Access to the Primo Back Office is made easier. It is done in three easy steps: submit your information through a web form, receive an email, and receive your completion email.

There is also Syndetics-type information for a search carried out. This is presented as a pop-up box on the side of the screen, providing basic information from reference material based on the search, including links to other relevant sources of information.

Tamar Sadeh, Director of Marketing
Scholar Rank

See the presentation given last year. We have done some evaluation and have used qualitative and quantitative methods. We looked at usage, KPIs based on sessions/GetIt (= concrete interest in a specific item), location of the selected record, time until selection, use of facets and navigation to the next page etc. Globally we see that there is an improvement in all areas. Personalised ranking can be on or off.

Open Access is another topic giving new opportunities. Publishers take part in this movement, especially in Gold Open Access. The process of publishing is the same but the money comes from the researchers, not the reader. The Green model is valuable for institutional repositories. What we do in Primo is highlight what's open access. From the point of view of readers, it doesn't matter where an article comes from. The interesting thing is with hybrid journals: those are based on subscriptions but some articles are free. We encourage institutions to promote their work via Primo. It exposes the content but also links to other relevant work (same author, same subject etc.).

Question and Answers

Q: Will there be a more elaborate way of depositing institutional repositories, i.e. more information than just author, title etc.?
A: We aren't limited to any kind of information and we would be happy to extend this to anything you need if you tell us about it.

Q: Installing Hadoop and Cassandra for SaaS customers, what about those that have local installations? How will you ensure that the performance will be acceptable to both groups?
A: It should be seamless to our customers, the software will identify if it should use Oracle as it does today or if it should use Cassandra/Hadoop

Q: Phantom collections: will we find the name of the collection in the facets?
A: To create a facet we need to indicate which collection it belongs to, and in this case it would be collections (plural), so I'm not saying no to the question but we need to work on it.

Q: Primo Central and Primo Local are separate indexes that can't be merged. When there is a blended search, it is not possible to deduplicate and FRBRise across both indexes. Also, ScholarRank and Personalised Ranking only apply to Primo Central. Are there any plans to improve integration of the two indexes?
A: Yes, we have plans to have ScholarRank in Primo Local. We are looking into the FRBRisation. Until now the collection wasn't that big, but this is changing and we will be looking into this.

Q: The subjects and author names are copied "as is" from data sources into the index, which makes it impossible to search on specific subjects or variations of names. Are there any plans to normalise this (authorities etc.)?
A: We put a lot of effort into normalisation but it is difficult to be perfect. We monitor our work in this area; we are starting a large-scale pilot of normalising this data, using a software infrastructure dedicated to that. We did a small pilot and are now going to start a big pilot, also working with publishers.

Q: A WorldCat adapter is available in order to search directly from the Primo UI. This could be very useful if the description was based on records with more information than slim DC and if facets were created. As it is now, it's hardly interesting for our users.
A: There is a difference between the API and the search engine. WorldCat doesn't provide facets and answers the query in rank order. We can't provide a facets option based on that. We have the capability but we need to review how we will do that.

Monday 9 September 2013

The shift to electronic - managing e-resources in Alma

Roger Brisson, Boston University
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

There are different kinds of e-resources: e-journals, e-books, databases, mixed packages etc. There are also different ways of ordering e-material. We were very early adopters of Alma and at that time Alma was still undergoing a lot of development. This gave us an opportunity to shape Alma. Over the past year, we've seen new functionalities every month, which has helped to refine workflows. The first thing we wanted to do when going live was to stabilise. We were also looking at how to automate things based on our core values. The library system used influences the reality of the services we provide, and with our old system designed for print, we were creating workarounds which were becoming unsustainable. When we started working with Alma it was really good to see how everything could be integrated together. I also monitored tasks that we were duplicating and saw how this could be improved. With the discovery system, we can do things with our data.

As libraries, we need to re-define ourselves and we have to be more efficient with the management of e-resources. We must also change the way we do our cataloguing and share more. Our old system was too flat, not dynamic enough. Because of the print-based design, there was no flexibility and nothing more we could add. We now have more or less the same number of electronic resources as print and it is likely that next year our e-book collection will be larger. We use 90% of our budget on e-resources.

Alma can enrich our records and make our discovery system better. E.g. there is a special box for 856 links in Primo and we are using this more and more. As early adopters, we've been setting up automatic workflows in Alma. We use existing standards (RDA and MARC) to optimise the new environment that we're working in, cutting down things we don't need anymore. An e-book, for example, doesn't have any physicality. So we work visually to understand what is happening and how to design a system that works. Is an electronic copy a variant? What is it? With an e-book there are additional things: article links, videos, table of contents etc.

Examples linked to resources for a new MOOC that we are setting up: cataloguing an e-book is not that easy. We have a PDF file and an e-book reader. The numbering of pages is not the same. How do you describe this? We scrutinise these kinds of questions. Or if you have a subscription, a closed package that you make available on the net: the package has 96 resources, it's a subscription, it's very dynamic because new e-books are added all the time. Managing a dynamic package is much more challenging than a static one. Or we have an e-book database. It's designed to be read and used in the database. We won't treat this in the same way as an academic paper. Traditionally we would have pre-selected books for our readers, but now we are exposing lots of books that the user has to choose from. We are expecting a lot from Alma to manage this new type of model.

In terms of unified resource management and automation we want to take advantage of the central knowledge base in Primo. SFX has been mostly ingested in the central KB. We have to have a means of pulling data out of Primo and, at the same time, new books are pushed in. There needs to be a trusted interaction between the publisher/vendor and Ex Libris so that when I activate a subscription, it's reliable, because I can't check 96 books every month. The confidence has to be built. That's taken care of by the CKB. We initially had an issue because the data wasn't reliable, the records weren't good enough, and we've worked to change that. We are loading vendor records that we trust every month. We're not quite where we would like to be yet. Those are mostly bib records; we have separate records for acquisitions, licenses etc. We can work at a package level, either doing a simple import or an OAI.

If we consider the bib record in the institution zone, we have a record in the community zone, which is the discovery record = work (FRBR). When we have an e-version, it is part of the electronic inventory = portfolio. The manifestation should be in the inventory, so based on the FRBR model we should first see a unique record. In terms of the logic of Alma, it would make a lot of sense, because at the moment a bib record contains a lot of things that are not relevant. We should really clean this up. Alma shows clearly which records come from the community zone. Activating one should include all related records in the portfolio. When migrating to Alma, part of the process is changing print records, and using the P2E cleaner helps. A lot of cleaning was still required afterwards. Having all of your inventory in one system allows you to think of efficiencies, such as having only one record for both print and electronic and having all of the data together.

Q: How do you manage conditions of use at e-book level?
A: You set up your licensing information, because Alma's functionality is vendor-centric and it's through the vendor record that we set up the licences etc. Depending on whether it's single or multiple usage, we can manage that through the portfolios. The actual usage is controlled by the platform itself. If the licences are separate, you may want to have different records and have them all display differently in Primo. This might be confusing to users, but those are 3 copies of a book and we want all 3 links to be viewable and active.

Ex Libris A: when there is a title available through different vendors, you can specify which ones you want to show by setting up "preferences". In terms of management, we are developing overlap analysis, so you will know in advance what duplications you have in your system between packages.

Alma - early adopters' experience

Notes from IGeLU 2013 Conference, Berlin

University of Bolzano Library

Reasons to adopt Alma: a future-oriented solution for whole integration of all resources in every phase of the life-cycle, support for library processes (core business) including analytics, networking & best-practice exchange with similar libraries etc. Joined at an early stage, so needed flexibility & change management. Took the opportunity to simplify rules & policies in the data migration process.

The timeline went from April 2012 (Alma training) to January 2013 (Alma live). Started at the circulation/fulfillment stage; even if it sounds strange, it was good for managing staff anxiety, followed by cataloguing and the rest of the migration (electronic resources etc.). In the first months, work on workflow revisions, configuration revisions, changing opinions about things decided before moving to Alma, troubleshooting, optimisation etc. Resource management area to clean up records etc. More recently, work on analytics.

Positive points and changes: overall positive experience, good cooperation despite time pressure, solution-oriented and pragmatic approach. Lessons learned are that Cloud computing requires a new approach to software: monthly updates and no traditional test environment, so all change is made in production. Occurring problems have to be notified "to the Cloud" instead of to the IT department. Reporting and controlling are improved, with a positive impact on decision-making in the organisation. Standardised workflows. Cleaning-up work requires manual input.

Open and critical issues: activation of electronic resources is very complex (some databases are still missing), lack of functionality in the Alma uResolver, developments still required for analytics, batch updates and set creation not always easy, migration of subscriptions and entering of acquisitions data, and customisation in workflows, cataloguing or display of data still needed; these are the major points.

Northeastern University Boston

Why Alma? System that can handle electronic resources, better analytics, no local hardware maintenance, financial incentive to become an early adopter.

Timeline and challenges: April to July 2013 for the actual implementation, within a 9-month overall period, quite challenging! Mostly because there was no on-site visit, missing context in WebEx training, missing and still-developing data migration documentation, and confusing project management plans. The other main challenge was that configuring Alma without it going live wasn't possible.

Two months live... Optimistic about the future of Alma, happy to have hot fixes and a monthly release cycle, it forces us to look at workflows, already impressed with analytics. The biggest disappointment is in e-resource management: in particular, the uResolver still needs lots of customisation, there is a lack of clarity about the knowledge base (same as SFX?), portfolio management is difficult and time-consuming, and stability and flexibility in licensing are missing. In addition, missing flexibility & customisation options in user roles & permissions, order records, licensing and notices (which can only be customised at the institutional, not library, level). The software is not intuitive and some workflows are clunky. However, still very optimistic! Happy to work with Ex Libris to improve it and to build a strong community.

Discovery and Discovery Solutions

David Beychok, VP, Discovery and Delivery Business Unit, Ex Libris
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

What all discovery services have in common is that they have a better understanding of the various systems. There is a vote of confidence towards Primo, with many new large consortia joining us recently. Primo is the frontend for Alma, which doesn't have an OPAC, so there is a strong integration to bring you these solutions. We use big data tools, including Hadoop and Cassandra, which increase the speed of indexing and availability. These are good tools for massive data processing and storage.

The Primo implementation process includes stages 1,2 and 3, i.e. Primo is set up in an automated process, with default settings and then customisations can be added. It is done smoothly and fairly quickly. Open Access is important to the researchers and Primo makes discoverability of OA repositories or hybrid journals' content easier. Primo has a functionality for institutions to register their repository in the Primo central index.

We continue sfx development and releases. We also continue to support regular service packs for MetaLib, for fast search, integrated with Primo. Primo has a usage data service and this helps for ranking articles. We have a pilot project for branding and are looking for interested parties.


Next generation Library Services - Emerging roles and opportunities


Oren Beit-Arie, Chief Strategy Officer Ex Libris
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

The context in which we operate today is significant: an increase in scale and diversity of content types, an increase in cost (especially serials), economic challenges in many institutions, an increase in scholarly research and learning (e.g. MOOCs), increased pressure and competition for library services etc. There is also greater attention to the value of libraries. We are also moving increasingly to Cloud delivery, so this is our framework.

We started developing it about 6-7 years ago, including SFX, the knowledge base, Primo and Alma, so we supported both the front end and the back end. Collections in the way that libraries built them in the past are changing: we're moving away from "bigger is better" and from ownership, and we're going more towards access. Many libraries are paying attention to fulfilment rather than selection, which means that it's more geared towards the information need of the user. That also means moving from "just-in-case" to "just-in-time". The economy around research activities is also changing: there are more cases in which access would be free, but there is more attention on the economy of the creation of content, i.e. who will pay for the collections? Is it the reader, the researcher, the library?

Data services: the goal is to maximise the benefits of sharing and to optimise management and discoverability. The two areas of focus are the Alma community zone and next-generation linking, of which I won't talk very much today, except that there are paradigms of linking, for example the SFX knowledge base, which is about pre-computed stored links at the article level. The notion of the Alma community zone was introduced a few years ago because there was a need to create more efficient management tools, and this needs to be done in the library context, so the community catalogue in the community zone enables libraries to add shared content and information for the user community, to add or use vendor information etc., so that bibliographic control is improved. By creating a community environment, we support collaboration.

All this information that is shared is made readily available for discovery services in a streamlined way, so that you shouldn't have to work hard to enable this. It is a work in progress, especially for something like e-books. We work with the advisory group, which is composed of a broad number of representatives of users from around the world, and this group is helping us to pin down the model that works best for all (see yesterday's session).

Linked data has the potential to extend the level of sharing of knowledge between libraries and external sources. In libraries we are very collaborative within our own island, but linked data has a greater promise. We are interested in this; we think this concept is tremendously important for the content that libraries have created and for outreach to external content. For example, an activity we're involved in is DM2E, which stands for Digitised Manuscripts to Europeana, and we are testing this; it's an important activity that involves big players such as Google. Other examples include the NISO bibliographic framework project, the W3C schema bib extend community group (Shlomo), the LC BIBFRAME initiative etc.

Another thing we are involved in is evaluating trends and needs in education and trying to build support for those needs by creating new platforms. Users perceive (e-)resources as free web resources ("found it in Google"), the value of access to electronic content is not recognised by users, and in this realm there is a lack of recognition of the role of libraries. It is paradoxical because there is something good about this: you want the user to get to content as soon and as easily as possible, but libraries are getting pushed aside because users are not acknowledging their involvement, and this could impact funding for libraries. Our project enables libraries to brand those products according to their institutional needs, so they are customised by your library and build awareness of your library. It is still at pilot stage; we haven't yet talked about it publicly. (See Academic Libraries' existence at risk, suggested by @chrpr)

We are also taking initiatives for open access because we are aware of the significant drive toward OA around the world. Publishers are starting to respond, maybe not always in the way we'd like, but there is a lot of activity around OA. We are following the debates, especially between green and gold OA, because we need to be aware of what is going on. CHORUS is a publisher initiative that is about enforcing the OA mandate by publishers; is this how we want the world to be modelled in the future, is it an opportunity for libraries? We want to discuss this with you and I would encourage you to come and speak to us about this. We are in a transition period. Primo is enabling OA, and when we look at this it raises an important question. Some content is totally open with unrestricted access and discovery, some has unrestricted discovery but restricted access. There is some bad stuff out there: there is closed access and closed discovery, and this is not just about us. Do academics/researchers who create content understand that some of their output is blocked out? So there are lots of opportunities to create better management of OA. When we talk of different models for OA, we see that many libraries are engaging in this.


Sunday 8 September 2013

Alma analytics

Asaf Kline, Alma Product Manager, Ex Libris
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

The roadmap goes from descriptive analytics to predictive analytics (= what will happen) to prescriptive analytics (= how should we take action). We're still at the descriptive stage. This should help give a deeper understanding of what's going on in your library. Alma is optimised for analytics, e.g. cost per use etc. We are doing a lot of work behind the scenes to make this happen. The model is a star schema: in the middle there is a fact table and around it there are different dimensions. It's optimised for reporting. There is a shared platform and institutions can share their analytics data. It is also role based, so any report or dashboard created is sensitive to an individual's role.

There is functionality to schedule and distribute reports. Any user can subscribe to the reports that they want to receive. Analytics works with APIs: for any report created you get an XML representation of the report and it can be sent to the programme of your choice. The analytics provide a history and that's how it can be predictive, because based on previous years we can see what may happen (e.g. funds burn-down).
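As a hedged sketch only: the session simply said that each report has an XML representation that other programs can consume. Something along the following lines could fetch and walk such a report; the endpoint URL, the apikey parameter, the report path and the XML element name are hypothetical placeholders, not a documented Ex Libris interface.

```python
import requests
import xml.etree.ElementTree as ET

# Hypothetical endpoint and parameters; check the real Ex Libris
# documentation for the actual interface and authentication.
BASE_URL = "https://example.com/analytics/reports"
PARAMS = {"path": "/shared/Example/Fund Burn Down", "apikey": "SECRET"}

response = requests.get(BASE_URL, params=PARAMS, timeout=30)
response.raise_for_status()

# The body is assumed to be XML with one element per result row;
# "{*}Row" matches a Row element in any namespace (Python 3.8+).
root = ET.fromstring(response.content)
for row in root.iter("{*}Row"):
    cells = [(cell.text or "") for cell in row]
    print("\t".join(cells))
```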

Usage and cost per use: we take the COUNTER info from the vendor and provide you with information from the subject area and from within Alma. We use UStat for loading vendor usage via a spreadsheet or SUSHI. When we know what's in your inventory and how much you pay for it, we can provide cost-per-use data. Usage is at title level, but when libraries buy packages, we roll the title data up to the package level and give data for the package. The analogy is like TV: we pay for lots of channels and may only watch two. What's important is the cost per use, not so much what we watch. (Question here: we often raise purchase orders at package level but get invoiced at title level, so that could be a problem?...)
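The roll-up arithmetic described above is simple; purely as an illustration (made-up data and structures, not Ex Libris code), title-level usage can be summed per package and the package cost divided by it:

```python
from collections import defaultdict

# Hypothetical inputs: COUNTER-style title usage, the package each
# title belongs to, and what was paid per package.
title_usage = {"Journal A": 120, "Journal B": 3, "Journal C": 7}
title_to_package = {"Journal A": "Pkg1", "Journal B": "Pkg1", "Journal C": "Pkg2"}
package_cost = {"Pkg1": 5000.0, "Pkg2": 1200.0}

# Roll title usage up to the package level.
package_usage = defaultdict(int)
for title, uses in title_usage.items():
    package_usage[title_to_package[title]] += uses

# Cost per use at the package level (guard against zero usage).
for package, cost in package_cost.items():
    uses = package_usage.get(package, 0)
    cost_per_use = cost / uses if uses else None
    print(package, uses, cost_per_use)
```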

This has been a central activity in 2013, now being rolled out to Data Centers with a continuing effort to ensure scalability of infrastructure. There are additional subjects that we're working on:
- Usage (Alma generated)
- Requests and booking
- Vendors and vendor accounts (more analytic in nature)
- BIB (bib data) - unfortunately analytics and MARC don't work well together, so we will have tools to search for and process info that we have in the catalogue; at the moment there is the option to choose 5 local fields to get data from

Usage data is Alma generated, captured via the Alma Resolver, it will answer questions such as:
- What services were provided?
- What queries ended without any services?
- What were the sources/ip ranges that accessed the system?

We want to provide a set of tools to gain insight into the structure and use of your collection, e.g. print or electronic inventory and usage, but we're missing overlap analysis, e.g. comparing titles from a package, so it's combining a system job that we run together with analytics. We want to take our tools to collection level, e.g. shelfmark/classification ranges, and embed it/make it actionable so that it becomes part of the purchasing workflow, based on facts and analysis. We also want to start generating KPIs (performance measures) to help evaluate the success or failure of a particular activity, e.g. how much time it takes to process a purchase order or a vendor supply, the average % of items not picked up by users, request processing time etc.
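KPIs like these are just simple arithmetic over event data. A hypothetical sketch with made-up numbers (not an Alma report definition) of the two measures mentioned, average purchase-order processing time and the share of requested items never picked up:

```python
from datetime import datetime
from statistics import mean

# Hypothetical purchase orders: (created, sent-to-vendor) timestamps.
orders = [
    (datetime(2013, 9, 1, 9, 0), datetime(2013, 9, 3, 15, 0)),
    (datetime(2013, 9, 2, 10, 0), datetime(2013, 9, 2, 17, 30)),
]
avg_days = mean((sent - created).total_seconds() / 86400
                for created, sent in orders)

# Hypothetical hold-shelf data: requests placed vs. never picked up.
requests_placed, not_picked_up = 250, 40
pct_not_picked_up = 100 * not_picked_up / requests_placed

print(f"Average PO processing time: {avg_days:.1f} days")
print(f"Requests not picked up: {pct_not_picked_up:.1f}%")
```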

The data will also be made available to be viewed on mobile devices. We want to bring it up to network level, i.e. cross-institution reporting, using benchmark analytics (comparing to others "like me") and network analysis (members of a consortium), so this is more about disclosing strengths/weaknesses/overlaps and how collaboration can be improved.

Ex Libris Alma Community Zone data strategy - from concept to reality

Dana Sharvit, Product Manager, on behalf of Sharona Sagi, Director of Data Services Strategy, Ex Libris
Alma Community Zone and next-generation linking initiatives
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]


The Alma institution has its own zone, the local catalogue and the inventory. The local catalogue includes the Library's collections, in all formats. There is also an inventory to help manage the collections' holdings. The workflow for managing acquisition of resources is the same for all formats.

When it comes to managing electronic resources, we see that there is much effort from each individual library and we want to leverage this effort for the community. The metadata should be available in the Community Zone. The CZ is built out of 3 things: a portion of the community catalogue, the central KB and the authority catalogue.

The Authority Catalogue contains copies of authority files from major providers, in particular the Library of Congress Subject Headings, the LoC NAF (Name Authority File) and MeSH, as well as GND Names & Subjects as of October. Those are updated regularly, so Ex Libris runs a central service instead of each library loading and continually updating authority files locally.

The Central Knowledge Base includes all the offerings of the different vendors. We also have all the linking information that will enable linking to articles. So the central KB is a resource that describes packages of electronic resources offered by a wide variety of vendors, including the titles that are part of each package and the linking information for individual titles and articles in those packages.
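To make the shape of that data concrete, a knowledge-base package could be modelled very roughly like this; an illustrative assumption for explanation only, not the actual CKB schema or field names:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Portfolio:
    """One title within a package, with its linking information."""
    title: str
    issn: str
    coverage: str        # e.g. "2000-present"
    link_template: str   # article-level linking pattern (placeholder)

@dataclass
class Package:
    """A vendor offering as described by a central knowledge base."""
    vendor: str
    name: str
    portfolios: List[Portfolio] = field(default_factory=list)

# Example usage with invented data.
pkg = Package(vendor="Example Vendor", name="Example Journals Collection")
pkg.portfolios.append(Portfolio(
    title="Example Journal of Testing", issn="1234-5678",
    coverage="1995-present",
    link_template="https://platform.example.com/{issn}/{volume}/{issue}"))
```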

The Community Catalogue holds all the metadata records. The sources come from different publishers and aggregators, as well as the Library of Congress, the British Library etc. We have a dedicated data services team that manages this and goes over the quality of the records to ensure they fit all the needs of the various workflows. This will start with e-resources and we'll add print later. The metadata that we load needs to fit a lot of purposes and a diversity of workflows.

We want the catalogue to be for and by the community and we want the community to share relevant metadata, so that it can be as full, correct and comprehensive as possible. The Advisory Group is composed of the core group, including representatives from different parts of the world, and the aim is to listen and to understand the global needs of the Community Zone for the Community Catalogue. The full group is much wider. Its focus areas are:
- The Community Zone data model - matching/merging records from different sources, issues relating to various metadata schemas, governance and policies relating to local annotation
- Contribution to the Community Catalogue by individual Alma libraries
- Workflows - what are the most streamlined workflows for working with Community Zone records
- Licensing implications - as we are assuming a CC0 license, are there implications for any of the above?

A shared record becomes part of the library inventory. Inventory information is handled locally at the institution level, including linking information, provider data, public notes, access notes, PDA programmes, license information etc. This will be published through the discovery service (Primo) and the record can be enhanced with book reviews, book covers, reading lists etc.

Every change that is made in the Community Zone will be beneficial for others because it is transferred to the local zone, as both are linked. The Community Catalogue is part of a collaborative network. There are different member institutions that create the Community Catalogue. This is the Community Zone. Each member can use their own catalogue or the Community Catalogue. It is very flexible and the Community Zone can be used in different ways.

In the next few months we are planning to add more records to the Community Catalogue. We'll be working with the Advisory Group and implementing their recommendations. By mid-2014, we plan to implement the shared record model, then support community contribution by the end of the year. In 2015, after establishing a robust catalogue for electronic resources, we'll start adding support for print records.

Libraries will gain through the maximised sharing for maintaining and managing the records. The central authority control will help with the management of those records. There is functionality making the importing process more efficient.

Question and answers

Q: One record fits all, but what if we take a record home and make some changes etc.? How does this impact the shared catalogue?
A: There will be functionality in place to help you contribute to the records managed via the Central Catalogue. Today this is not yet the case, because if you want to add information you need a local copy of the record, which defeats the purpose of the Community Catalogue, but this is going to change. So the inventory is where it will be possible to manage this.

Q: It is good that records will be CC0, are there any plans for people not part of the Community to access this data?
A: Those records can be used by all. This is not, however, all in our hands; we have to negotiate with data publishers etc. We are not yet allowed to completely open up the Community Zone, but we also want to maintain some value for our customers, so whenever possible we will open up the data, but it may not be possible everywhere.

Q: How to integrate other catalogues?
A: We are looking into this

Q: Catalogues that have a high update frequency, how will that work?
A: E.g. authority records are updated on a regular basis automatically. If a specific catalogue managed in the Community Zone is updated often, we would have to look at it on a case-by-case basis.

Q: In different countries the authority files are connected to different standards, how will you handle these different connections in one catalogue entry?
A: That will be one question that the Advisory Group will have to look into

Alma Product Update and Roadmap

Bar Veinstein, Corporate Vice President, Resource Management Solutions
[live-blog IGeLU Conference 2013, Berlin]

This session is about Alma. If you'd asked me 3 years ago I wouldn't have thought we would have reached 33 live institutions today, which is quite amazing. There are an additional 52 in the implementation phase.

Success factors: customer trust (reliability, security), ease of implementation, openness & interoperability, depth of functionality & efficiency, innovation/cutting edge. People talk a lot about interoperability, but it was not until I joined this industry that I grasped what this really means. Banks compete with each other; they don't share APIs etc.

Multi-tenancy: why should you care? A lot of customers don't care or prefer the hosted environment because they think it means a dedicated environment for them. Multi-tenancy is valuable for the vendor, but actually it is for the customer as well. This is because:
- One version: no customer is left behind when the software is updated. This means there is one single version for everyone and it doesn't matter anymore which version we're on. It also means that new features are made available much more quickly because we don't need to maintain older versions
- Painless upgrades: vendor-managed updates, automatic
- Disaster recovery: a matter of economies of scale, with quicker response and resolution
- Vendor is responsible: customisations & integrations must work in the new version

One of the main concerns of customers is security. We need a multi-dimensional approach. We take this very seriously: we have ISO certification, we have an external company that monitors our firewalls 24/7, and we've implemented a lot of capabilities around business continuity etc. This is valid for all products, but it is mostly because of more and more work in the Cloud that we've put the 24/7 monitoring in place. With regards to Alma, it is not possible to connect to it without HTTPS. For cases where an institution's protocol is not encrypted, we've developed a solution that uses SSL as a wrapper, to ensure that the communication is secure and encrypted.
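The "SSL as a wrapper" idea is essentially tunnelling an unencrypted protocol inside TLS. As a generic illustration only (not the Ex Libris solution; the host, port and payload are placeholders), the client side of such a wrapper can be sketched with Python's standard library:

```python
import socket
import ssl

# Generic illustration of wrapping a plaintext protocol in TLS;
# hostname, port and the request bytes are made-up placeholders.
context = ssl.create_default_context()

with socket.create_connection(("legacy.example.org", 6443)) as raw_sock:
    with context.wrap_socket(raw_sock,
                             server_hostname="legacy.example.org") as tls_sock:
        # The legacy, unencrypted protocol now travels inside the TLS tunnel.
        tls_sock.sendall(b"PLAINTEXT-PROTOCOL-REQUEST\r\n")
        reply = tls_sock.recv(4096)
        print(reply.decode(errors="replace"))
```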

Ex Libris can monitor user experience per transaction, such as end-user time, health of the transaction, errors, calls per minute etc., so we look at the performance of all the instances across the world. This is something new, developed specifically with Alma. Alma deploys in months, not in years; this includes ERM and the link resolver, and in some cases discovery. This is what customers have asked for. We know it's not perfect yet, we know some people are complaining and we need to work on better training. But we've created a methodology around this. We are also working on migration from a diversity of other products.

We have a strong vision for developer collaboration. We understand that moving to the Cloud with SQL access is not feasible. We have to take the responsibility of developing the APIs, so we have a developer platform where we are committing to deliver these APIs and services and extensions to apply them. Until now, we've developed them but haven't yet done all the delivery. But we are going to embed the API Management Infrastructure, to allow us to have much stronger capabilities for the delivery of APIs. The API Console will be available as part of the portal so people can learn and test APIs before implementing. This will be released in the second half of 2014.

Dvir Hoffman, director, Marketing and Product Management
Roadmap planning - some highlights

Collaboration - maximise cooperation, integration and sharing between institutions while supporting each institution's particular workflows and standards
Efficiencies - reduce costs by streamlining processes and redirecting staff to other more important tasks
Digital - Continue to enhance unified resource management and provide increased consolidation options

We are also working hard on analytics solutions. The community zone includes more authority records; we've created a working group to bring a community catalogue into the community zone. We are also working on unified digital resource management.

Next year we will focus on emerging electronic licensing and purchasing models. The community catalogue will be licence free, i.e. CC0, as we are committed to openness. We want our Knowledge Base to expand. Alma should be a multi-format solution and was built from the ground up with this multi-format model in view. We want more predictive analytics.

We have a master plan but we maintain the flexibility to adapt and allow for changes in the roadmap. Initial plans for the next release focus on customer value:
- Overlap analysis = the ability to run overlap analysis on packages or between vendors: we plan to push analytics into the workflows, firstly by helping to save costs in acquisitions
- Next year we want to provide you with benchmark analytics by developing KPIs - you can define your own and they can be compared with the Alma community
- We continue our investment in digital resources, by providing enhanced collection and metadata management for existing digital collections - no need to migrate files or ingest processes, thereby simplifying the process for staff. Provide ongoing OAI-PMH based harvesting of resources into Alma (a minimal harvesting sketch follows this list)
- Resource sharing in ILL, i.e. Resource Sharing Driven Acquisitions
- Copyright control: embed licensing information and monitor digitisation requests in order to make sure there are no copyright violations
- Patron Requests for Purchase - better service for patrons
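The OAI-PMH harvesting mentioned in the digital-resources item above is a standard protocol operation rather than anything Alma-specific. A minimal sketch of paging through ListRecords against a hypothetical repository endpoint (the base URL is a placeholder; the namespaces are the standard OAI-PMH and Dublin Core ones):

```python
import requests
import xml.etree.ElementTree as ET

# Hypothetical repository base URL; oai_dc is the standard metadata prefix.
BASE_URL = "https://repository.example.edu/oai"
params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}

ns = {"oai": "http://www.openarchives.org/OAI/2.0/",
      "dc": "http://purl.org/dc/elements/1.1/"}

while True:
    response = requests.get(BASE_URL, params=params, timeout=60)
    response.raise_for_status()
    root = ET.fromstring(response.content)
    for record in root.findall(".//oai:record", ns):
        identifier = record.findtext(".//oai:identifier", namespaces=ns)
        title = record.findtext(".//dc:title", namespaces=ns)
        print(identifier, title)
    # OAI-PMH pages its results: follow the resumptionToken until empty.
    token = root.findtext(".//oai:resumptionToken", namespaces=ns)
    if not token:
        break
    params = {"verb": "ListRecords", "resumptionToken": token}
```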

Customers can choose their networks, so you can be part of a consortium but there is flexibility, such as shared catalogue, acquisitions, resource sharing (ILL) etc. Other things we are working on are:
- New purchasing options with vendors: this should come in the next few months, we are working with vendors to streamline the processes, reduce duplication etc.
- We've also engaged actively with some vendors in terms of security, some don't use secure protocols and are not Cloud ready, so we are working on that
- New acquisitions models: vendors themselves need to make decisions, especially for electronic resources
- Bibframe: we are taking an active part

Questions & answers

Q: What about e-book acquisitions workflows and platforms, e.g. ebrary etc.
A: We started working with a few vendors to streamline acquisition processes, including e-books, so using Alma acquisitions tools. The customers using Alma today have e-book purchasing. It is not yet very efficient. So we are working with the vendors to integrate different selection processes and loading the metadata. It's not easy, some vendors are very protective because they make money on their metadata.

Q: New Developer Platform and API Console: how are you going to communicate with us, the developers? What is your strategy?
A: We haven't engaged yet because we had to finalise the vendor we will use first. Once the contract is signed (end 2013) we will start engaging with the user group at the latest at the beginning of 2014.

Q: Electronic resource management: are you talking to local initiatives, e.g. in the UK we have KB+ and are you talking to JISC etc.
A: We are aware of KB+ in the context of licensing, so we've started active engagement; we are looking into ways to integrate or link to external products. This is not just with Alma, but also concerns SFX. We believe in working at the national level; in Germany there are other similar programmes and we work together. So it's not because we work closely with vendors that we forget national initiatives.

Q: Alma puts a lot of emphasis on opening up and using CC0, will you do the same with your own Knowledge Base?
A: Good question! At the moment there are no such plans but we should discuss this. We are not a non-profit organisation, so we make money on Alma, and because we are profitable we were able to develop a good product. We should raise this question in front of the management team.

Q: In which way are you thinking of public libraries? We are especially interested in circulation and e-book readers.
A: Most customers of Ex Libris are academic, but there are lots of consortia that include more and more other types of libraries and we know we have to focus on this. In terms of e-book readers we are talking to some vendors. This is on our roadmap.

History of the World Library 2040-2090

Key note speech to the IGeLU Conference in Berlin
Michael von Cotta-Schonberg, Deputy Director General, the Royal Library, University of Copenhagen
[Note: this is live-blogged and is an imperfect representation of the excellent talk - hopefully it will give some sense of what has been presented]

The major governing idea of this presentation is that modern technology will eventually make the traditional library obsolete. Literature will totally migrate to the e-format and a new library culture will develop to handle this situation.

This is a 50 year Jubilee address of the President of the World Library (WL) to the members of the board, whom you represent. You will have to vote on a crucial issue.

My part is to give a brief overview of the Jubilee in 2090. Before the establishment of the WL in 2040, there were institutions called libraries, from the Latin, or Bibliothek, from the Greek. Let's have a brief look at them.

Prehistory of the WL: we leave that to her Holiness Popess Benedicta the 18th. It was recognised that words also have to be written and read, not only spoken. They had to invent something to write with and to write on. The oldest texts in the world were written on clay. The first known library was Ashurbanipal's palace in Nineveh. For thousands of years they used the skin of calves; then they invented printing, which is a process to reproduce text and images on a support. That was to be short lived, lasting about 500 years. During that period literature was a scarce commodity, due to the difficulty of access to books. The solution to the problem is almost as ancient as books themselves, and that is the library. Libraries as a solution to the scarcity problem were so efficient that they proliferated. There were libraries everywhere in the Middle Ages. When books became more of a mass product, libraries were everywhere, including in universities, and even nations had their own libraries. But as book production became cheaper, more and more people bought books and had their own library. Mostly, books would never be read or never read again - not very efficient...

So within one generation the whole of world literature was digitised. E-books had their own distribution systems. E-book publishers kept their books on their own servers because they wanted to protect their own income and services. But Google made them cheaply available on the internet. There was a period of transition at the beginning of the 21st century, but people weren't aware of it. A great part of non-fiction literature was produced with public money, sold to private companies who would then sell it back at an even higher price. The cost of this double system was enormous and wouldn't allow the new e-based system to develop. Academics tried to invent new systems with the support of national governments. But academics didn't have the strength to break the monopoly of the great repository institutions, and that meant that sometimes double costs were paid for content that universities had themselves produced.

Google was the solution! In 2035 it made an offer to the UN to turn over all its assets to the Library, under one condition: that the new library would be called the World Library. This idea was much older; it had been thought of by H.G. Wells, who called it the World Brain, but it's more than a microfilm. Of course Google's offer raised opposition, but by that time Europeana and other counterparts had grown so important that they had enough say. Opposition was overcome and the WL was approved by the UN in 2040.

Mrs Pippa was the first president and the first to receive a rejuvenation procedure, as those became available in the 30s. The WL shouldn't be a centre of texts, but a networked structure giving free and easy online access to texts of all kinds stored on servers belonging to their owners; it would provide free access to the content of institutions, libraries and organisations that decided to join the WL. Those were given contributing partnerships and voting rights. At the outset the WL didn't own the collection of literature, except the collection belonging to Google, but was giving access. It was one global network where each collection partner was giving access to the whole world. It was a daunting task. Legal and formalist problems had to be solved. The financial issue was fine because Google gave a huge grant for 25 years.

By 2050 the technological and organisational network was in place. But then crisis struck. A small village in Paraguay lived in splendid isolation from the world at large, but the community grew and proclaimed the overthrow of a civilisation that was corrupt, and this included the WL. The group was led by Raoul. They appeared to be healthy and happy so the government left them alone, but this was a mistake. On 1 January 2051 they launched an attack on the system of the WL, a virus that crashed the whole system, and it was closed for 2 months, causing absolute havoc. One of the most contributing universities' digitised content was lost, as the person who owned the key to access it had died. It was therefore necessary to take backups of all the content of the WL. In the end it was decided that copies should be deposited by all members. It was costing a fortune but was more efficient than using unsafe procedures. By 2060, the WL had ceased to be a collecting network and had become a central library, not for use but for security.

The hacker problem was solved effectively. Every year the WL organised a competition inviting hackers to break into the library. The best one would get a very well paid job with the WL. Only twice was the library hacked, and due to the safety copying it wasn't disastrous. Gradually the WL regained its credibility. It would also comprise commercially produced literature. Print-on-demand copies could be ordered and this gave a special niche industry to the publishers. Legal deposit of printed literature had been replaced by legal deposit of e-literature. However, this was only allowed in the member libraries' own reading rooms. Eventually it dawned on the publishing industry that this was obsolete, so a petition was made that national libraries had to make this collection of e-books available at a price determined by the publishers. The authors would get some contribution. The market would be managed by the publishers. The whole procedure was fairly simple and by 2068 the WL was a seller of world literature.

Profits to publishers rose significantly. The movie and music industries wanted to be included too, and by 2075 the WL had gained such momentum that they were integrated. The LOE was created: the Library of Everything. Merging the global web with the WL was an idea that was raised. The discovery department of the WL is today the largest and employs lots of people cleaning the metadata and enabling full-text searching; it is expected to come out with something workable in the next decade. The global publishing system is complicated. For fiction, it is easy for anyone to enter the market, and independent publishers flourished as never before. Self-publishers overloaded the market, but the public got tired of badly written novels, and small-scale publishing has by now become a viable industry, though only if you're not in it for big profits. About 100 years ago many thought this century would see the death of the book. Untrue: it had a fabulous take-off together with the age of the computer. There are different publishing models. The gold open access system made it tempting to publish more and more articles paid for by the authors themselves. This functioned reasonably well but was unsustainable because of the sheer volume of publications. A new system was developed in which pre-publication peer review comprised only the first part, the assessment of methodological soundness; the assessment of scholarly importance was left to new forms of post-publication peer review. This made publishing run more smoothly, but the post-publication peer reviewing didn't work so well, which meant an increasing number of scholarly articles were methodologically sound but otherwise insignificant.

Something had to happen. In 2081 the board of the WL received a request to develop an administrative system for all new scholarly publications covering the fields of the Nobel Prize, grading each as A (eminent scientific contribution), B (valuable scientific contribution) or C (acceptable scientific contribution). There would only be a small fee per scholarly publication. The WL created a list of quality publications.

In a very short time the first thing you looked at was the profile of quality certifications obtained for a publication, rather than the author's work itself. Quality assurance was taken over by the scholarly associations that had initially introduced it years ago. This was the fourth crisis of the WL, and the most controversial. The proposal now on the table is that world citizens should have a direct connection with the WL for retrieval of data directly into the brain, using technologies and an interface developed by the Kulu organisation. The technology has been proven: a three-dimensional spider in the mind, allowing you to look in all directions at once, with text, pictures and sound all together. The programme can inject new knowledge into the brain. The voting options are:

1. The world needs to know about the WL's efforts to make all knowledge available without profit. Voting yes means giving access to knowledge in its fundamental sense. The empires of the future are the empires of the mind, so opening our brains to the sum of human knowledge is the best way forward. By accepting direct, unmediated knowledge you will support a new knowledge order and thus a new world based on the right to knowledge. Individuals will achieve a higher level of intelligence, and you can steer in any direction you choose. Please vote yes.

2. The risks outweigh the benefits, for two reasons. First, the risk of two-way communication between the library and human brains: we cannot guarantee that we can avoid partial publication from brain to library, and the consequences of a massively intelligent computer system are unclear. The second concern is security. Library computers can be breached, and this would give hackers the ability to upload data to human brains. This could be used to force you to buy new products or to program autonomous functions in the brain, which could lead to the complete collapse of human civilisation.

Now it is for the members of the board of the WL to vote. The motion has been defeated! God save us all. Maybe librarians were wondering whether libraries had a future; they had a great one. It led up to the fulfilment of the great library, but the defining moment was the transition to the ubiquitous digital library. National libraries have survived to preserve and make national and historical collections available to the community. After the WL came into being, most public libraries were integrated into cultural services under many different names. The profession of librarian is glorious. We are excellent researchers.

The past is an ever-changing theatre of interpretation; the future is a stormy sea of potentialities; only the present stands firm, but it just lasts a second... Fortunately!

Opening session

[This is live-blogged from IGeLU 2013 Conference in Berlin, please forgive typos etc.]

Welcome by Jiri Kende, chair

Updates on the Steering Committee's situation, with a special invitation for members to vote. Results will be announced at the closing session on Tuesday afternoon. Next year, the IGeLU conference will be held in Oxford, hosted by the Bodleian Libraries.

Welcome by Matti Shem Tov, President and CEO, Ex Libris Group

Unfortunately I can't join you in person as I broke my leg three weeks ago in a rather inglorious way... [nice pic of Matti with crutches!]. What we've done since last year: we have 280 new institutions and are now at over 5,500, we employ 530 people, run 3 data centers (Chicago, Amsterdam, Singapore), have a revenue of $95m, and have taken Alma live. We have a new owner: Golden Gate Capital, a San Francisco-based private equity firm with $12 billion in capital under management. Ex Libris remains an independent business, with the same executive team, roadmap and operations.

We are working on the transition to SaaS. There are a few tedious issues, but more and more applications are being developed as SaaS. We are developing our technology as well as our business model and operational structure; it is a priority for us. We want our company to be in the Cloud, although we will continue to support local installations for years to come. In 2008 only 11% of our customers were in the Cloud; today this has risen to 75%.

This past year has been very good for Primo. It now serves 1,916 institutions worldwide. We focus on particular aspects, especially personalised ranking and tighter integration with Aleph, Voyager and Alma. We have moved to agile releases, which deliver new functionality bi-monthly. We have introduced Big Data technologies (Cassandra and Hadoop) and use Oracle. We continue to enhance the product and have introduced large-scale multi-tenancy.

It has also been a very good year for Alma: it is live at 33 institutions worldwide, with an additional 52 institutions in the implementation phase.

Our other products, such as Aleph and Voyager, still serve more than 4,000 customers and we are continuing their development. Rosetta is a unique product for massive amounts of digital content, and we have more and more customers using it around the world.

The Singapore data center is relatively new and currently supports 4 customers, a number that is going to increase. We have achieved ISO 27001 certification as a security standard. We take a lot of community initiatives and support causes such as open access. We collaborate closely with our customers via the community zone advisory group and the voting process for enhancements and OPAC functionality in Primo. We also collaborate with the product working groups, have testing teams for the new CRM system, and work with joint focus groups.

Koby Rosenthal, Corporate VP, General Manager Europe, Ex Libris

This kind of partnership within a community is very rare, because the market is usually highly competitive. We should be proud of it because it is really unique, and I find it impressive. Updates for Europe since the last IGeLU: more and more institutions are using Primo, and we are adding more experienced people to our organisation. We try to bring more people from headquarters to Europe. The UK early adopters programme is very successful. In many places Alma is now live, which doesn't mean there aren't still issues to resolve, but we can learn from what is done in various places. We have a very experienced team who speak the language of the countries they work with.

We are moving forward with Alma and are working in close collaboration with the institutions that have gone live as well as those showing an interest. We are transferring a lot of knowledge to our staff in Europe as well as to institutions. We are building a new team that will look at the local specificities and needs in various parts of Europe, with specific tailoring for the different languages. We want to support current and future Alma customers, and we are adding knowledge and experience to our organisation.

We have appointed a European strategy director and are planning enhanced cooperation by increasing the number of solution days and establishing regional directors' meetings. There is an Alma early adopter programme in the German-speaking countries, including the University of Mannheim as a large German library, which will kick off in October. We are in the process of implementing similar programmes in other countries such as France, Italy, etc.