
Tuesday 10 September 2013

Linked Data: from Aleph/Primo to the Dictionary of Luxembourgish Authors

Roxana Popistasu, IT staff, Bibliotheque nationale de Luxembourg

The project started last year and went live simultaneously with Primo. The NLL also manages Aleph & Primo for the network of libraries in Luxembourg, and has a partnership with the Centre national de litterature, which manages the content of Autorenlexikon (dictionnaire des auteurs luxembourgeois). The idea is to link all this information together. Other partner: Magic Moving Pixel, an IT management company working on Autorenlexikon. The goal of the project was to evaluate the work involved.

Questions to be answered:
- How to create a link between authors? String matching or IDs?
- How to deal with identical names?
- How to deal with the authority records?
- How to find (and save?) the matches

The initial results were unsatisfactory. Connecting the authors between AutorenL and the bib database based on IDs produced a low number of matches (60%), and this was even lower with the authority database (35%), so string matching had to be used as well. But even 60% was better than nothing, so that was used first. The link was added in the bib record.

The actual project for setting up the linking started in March 2013. It began with adjusting the matching algorithm and creating a database of matches. Then came the need to create a web service to be used for the display in the catalogue and in AutorenL, then to display matches in the Aleph OPAC, and to create a validation service. Matches were made on author and title.

The algorithm: normalisation rules were created, e.g. eliminating special characters, upper/lower case differences etc. Work was also done on standardising the cataloguing rules, which were different in Aleph (MARC21) and AutorenL. Levels of matching needed to be checked. A database of matches was created, regularly and automatically updated using exports from Aleph imported into that DB.
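As an illustration of the kind of normalisation-plus-matching step described above, here is a minimal Python sketch. The field layout, the example IDs and the exact normalisation rules are assumptions for illustration, not the NLL's actual implementation.

```python
import re
import unicodedata
from typing import Dict, List, Optional

def normalise(name: str) -> str:
    """Normalise an author heading: strip diacritics and punctuation,
    lower-case and collapse whitespace so variant spellings compare equal."""
    name = unicodedata.normalize("NFKD", name)
    name = "".join(c for c in name if not unicodedata.combining(c))
    name = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(name.split())

def match_author(aleph_heading: str,
                 autorenlexikon: Dict[str, List[str]]) -> Optional[str]:
    """Return the Autorenlexikon ID whose main or alternative name matches
    the normalised Aleph heading, or None if there is no match."""
    key = normalise(aleph_heading)
    for author_id, names in autorenlexikon.items():
        if any(normalise(n) == key for n in names):
            return author_id
    return None

# Example: a variant spelling in the Aleph heading still matches the entry.
authors = {"AL-0042": ["Weber, Anne", "A. Weber"]}
print(match_author("WEBER, Anne.", authors))  # -> AL-0042
```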

Choices were made for the matches, e.g. using pseudonyms and alternative names, so looking at those made it possible to match when one person was represented differently in the two databases. A validation service was set up for the National Centre for Literature to assist them in doing the matching on their side. This was based on levels of accuracy. They could find the Aleph system ID where relevant. This has also helped them to find small mistakes in their data which they wouldn't otherwise have found.

Phase 2 of the project is how it displays in Primo and creating links between the authority database and AutorenL, because at the moment it is only done with the Aleph bib database. We are also going to work with VIAF to publish our authority data, but first we need to improve it; this project can be part of that process. We will investigate further how to link on IDs and see how to integrate with other systems, such as DigiTool.

Ex Libris General Questions & Answers

[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

Q: Will read-only access to the Alma configuration screens be available?
A: First let me explain what this is: it refers to the process during implementation. Ex Libris performs the configuration in the first instance, but once the customer has received the Certification Training, Ex Libris doesn't do this anymore. At the moment this happens near the end of the implementation, just before Alma goes live. We've received important feedback that people want more transparency and a better understanding of the configuration options during implementation. By the end of the year we will have a read-only option for the configuration. This will first be at the fulfilment level and will be incorporated as part of the role-based permissions.

Q: Data quality and how to re-use cataloguing already done is important. What is happening with the plans for the Alma community zone, where we expect to be able to find most records instead of searching for them via external sources?
A: It's part of our key goals to improve efficiencies and to focus on enabling streamlined sharing. We've started by modelling this; we're at the stage where we work with the policies and cataloguing advisory group and expect to get results towards the end of the year. We presented the suggested model now and you will recognise real efficiencies in processes. We are finalising the development of those recommendations.

Q: Plans to share EzProxy configurations in the community zone?
A: The community zone isn't planned for that, but it's a good question and we should think about that.

Q: When can we expect to see license information in Alma's uResolver (obligation to display copyright information for e-resources)? Will we be able to rename fields or add local terms to the uResolver copyright information when it is finally built out?
A: The option to show the licensing info is planned for the 2nd quarter of 2014. We will allow the wording of those fields to be adjusted to be more user friendly. This is on our roadmap.

Q: Cloud safety - What if anything happens to a Data Center? How will Ex Libris restore services to customers whose data and applications are housed in the Cloud?
A: We're aware that our Cloud is a lot of responsibility. We've looked at all the details for hardware, using the best hosting that we can find. We have 6 internet providers serving the site, multiple firewalls, and applications designed and tested specifically for that environment. We've upgraded our backup: we used to do it on tape, but now we do it on disc offsite. The ability to restore from there to another location would be much quicker than in the past. We have multiple ways to resolve such issues. It's easier to restore on-site, but we can do it differently. Restoring is not a simple process. In a catastrophic situation, it would be less than a year... but more like a week...

Q: When can we finally expect daily analytics for Alma?
A: We've just announced that this is near release. It's been rolled out to 10 institutions in the US as part of the testing and it should be available to all customers by mid-October. The short answer is that it's already there.

Q: WorldShare Management Services is OCLC's response to Alma. Although Alma is more mature, given WorldCat as the primary cataloguing database of WMS, do you really think it makes sense to build a second (semi-)global bib library within Alma?
A: We want to build efficiencies in processes, not to re-build WorldCat, which was designed a few decades ago. We're looking into the future. We want to integrate any existing service that is useful to you. Such environments include WorldCat, and this can be included in Alma. Whether you're a standalone institution or part of a consortium, integration is possible. Others who don't work with WorldCat may find new efficiencies in Alma.

Q: OCLC has recently published info about the WorldCat Metadata API. What about Ex Libris: is there a plan to integrate it in Alma (it looks like a good replacement for Z39.50)?
A: We've started discussions with OCLC about this; it's on its way.

Q: UStat is the first SaaS component from Ex Libris. How long will it remain a separate product for SFX users? Will its functionality be integrated into Alma?
A: Alma Analytics will have greater functionality.

Q: Does Salesforce offer a Cloud Status page to indicate when the servers hosting Salesforce CRM are down?
A: We will post the status of our services, including Salesforce, on our Customer portal page.

Q: The Ex Libris Customer Center platform is not easy to search. After developing such a good product as Primo, can we expect the same for the Customer Center?
A: Priority is on services to users, but we will look to improve the search engine for the Customer Center next year.

Q: Open Access changes how content is made available. In SFX, with effect in Primo, we work at journal level, but OA is at article level; how will Ex Libris manage this?
A: Part of our next-generation linking project is to provide this info to the link resolvers.

Q: Are e-book loans scheduled in any ExL software?
A: This is more of a question for vendors and we'd be interested to see vendors' models

Q: How far has the implementation of RDA in combination with Aleph & Alma progressed?
A: The Voyager team works closely with the Library of Congress team, Aleph didn't need any configuration work, and Alma cataloguing support will be finalised in the next few months. Next steps: all eyes are on BIBFRAME, but they've not agreed to implement FRBR, they're working on a new model, and they also don't think MARC21 should be the standard, so it's too soon to say how we'll move forward on this question.

Q: Primo was launched as part of the strategy to decouple the front end from the back end. Many Primo services are available only to Alma users and the front end of Alma *is* Primo. Could you expand on this strategic shift to a monolithic (or symbiotic) couple?
A: It's tactical/pragmatic rather than strategic. The decoupling of front and back end is still relevant; both Primo and Alma can enable 3rd-party components. The next-gen framework is still relatively young, we develop a lot of use cases but we still don't have enough best practice etc., but it will come. The tightly connected interaction between Alma & Primo is based on interfaces.

Q: A marketplace for metadata requires cooperation between all parties. What about collaboration with EBSCO? Other discovery tools seem to be able to agree with each other to share data. Why not Ex Libris?
A: The good news is that it is becoming more apparent what's going on: EBSCO doesn't enable its subscription-based index for discovery services such as Primo. You have to buy EDS to access EBSCO content, so you are told to buy another discovery system; otherwise it would be through an API. Indexing is the best way to do it, but they offer their service through an API. This is a problem especially for those of you who pay for this content, and we find it hard to find a way around it. In terms of other deals, when it comes to content and aggregators such as EBSCO, their content is not unique and we are making good progress in providing alternatives. We sign deals with publishers.

Primo - product update

[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

Gilad Gal, Director of Product Management, Ex Libris

In Primo, we focus first on the "engine" to make it functional. It is pure engineering, ensuring movement in the production. With Primo Central we need to ensure it works 24hrs a day, it is quite an achievement but takes a lot of focus. We are the software vendor, we don't sell content, but we can't talk about Primo without talking about the content because we have to handle it correctly. Primo is the tool through which you give services to your customers and we take into consideration comments from users and yourselves, as administrators.

Breadth of development in Primo:
We develop OPAC-type functions, such as browsing. We rely on bX, which contextualises searching based on preferences. We listen to the community's requirements, such as a FRBR presentation, added functionalities such as direct requests for photocopies/digitisation etc., and the ability to update the password. We continue to invest in the mobile interface.
We were asked to invest more in the administrator services (i.e. not via the command line) to load files etc., so we have improved those services. We will also continue to invest in security; we've achieved ISO 27001 certification for information security management. We are using big data infrastructure (Cassandra and Hadoop) for data processing and will move to the multi-tenant SaaS infrastructure in a future release. We are not abandoning Oracle, but for more sophisticated things, some of the data structures will move to Cassandra.
We have Primo Central resource activation to facilitate the management of resources. We will continue to improve the performance of specific things, such as getting results faster so that the search experience for users is improved. One of the things we want to develop is the browse virtual shelf functionality.

In terms of content, we are adding abstracting & indexing collections. We are including published research because that's what people want most, but we are also including things such as theses, technical reports (unpublished scientific research), raw material for research (Mintel, Data-Planet) etc. Primo Central is enhanced with phantom collections: activating collections via alternative coverage.

Access to the Primo Back Office is made easier. It is done in three easy steps: submit your information through a web form, get an email, and receive your completion email.

There is also Syndetics-type information for a search carried out. This is presented as a pop-up box on the side of the screen, providing basic information from reference material based on the search, including links to other relevant sources of information.

Tamar Sadeh, Director of Marketing
ScholarRank

See the presentation given last year. We have done some evaluation, using qualitative and quantitative methods. We looked at usage, KPIs based on sessions/GetIt (= concrete interest in a specific item), location of the selected record, time until selection, use of facets and navigation to the next page etc. Globally we see that there is an improvement in all areas. Personalised ranking can be on or off.

Open Access is another topic giving new opportunities. Publishers take part in this movement, especially in Gold Open Access: the publishing process is the same, but the money comes from the researchers, not the reader. The Green model is valuable for institutional repositories. What we do in Primo is highlight what's open access. From the point of view of readers, it doesn't matter where an article comes from. The interesting thing is with hybrid journals: those are based on subscriptions, but some articles are free. We encourage institutions to promote their work via Primo. It exposes the content but also links to other relevant work (same author, same subject etc.).

Questions and Answers

Q: Will there be a more elaborate way of depositing into institutional repositories, i.e. more information than just author, title etc.?
A: We aren't limited to any kind of information and we would be happy to extend this to anything you need if you tell us about it.

Q: Installing Hadoop and Cassandra for SaaS customers, what about those that have local installations? How will you ensure that the performance will be acceptable to both groups?
A: It should be seamless to our customers; the software will identify whether it should use Oracle as it does today or Cassandra/Hadoop.

Q: Phantom collections: will we find the name of the collection in the facets?
A: To create a facet we need to indicate which collection it belongs to, and in this case it would be collections (plural), so I'm not saying no to the question, but we need to work on it.

Q: Primo Central and Primo Local are separate indexes that can't be merged. When there is a blended search, it is not possible to deduplicate and FRBRise across both indexes. Also, ScholarRank and Personalised Ranking only apply to Primo Central. Are there any plans to improve the integration of both indexes?
A: Yes, we have plans to have ScholarRank in Primo Local. We are looking into the FRBRisation. Until now the collection wasn't that big, but this is changing and we will be looking into this.

Q: The subjects and author names are copied "as is" from data sources into the index; this makes it impossible to search on specific subjects or variations of names. Are there any plans to normalise this (authorities etc.)?
A: We put a lot of effort into normalisation, but it is difficult to be perfect. We monitor our work in this area: we did a small pilot of normalising this data, using a software infrastructure dedicated to that, and are now starting a large-scale pilot, also working with publishers.

Q: A WorldCat adapter is available in order to search directly from the Primo UI. This could be very useful if the description were based on records with more information than slim DC and if facets were created. As it is now, it's hardly interesting for our users.
A: There is a difference between the API and the search engine. WorldCat doesn't provide facets and answers the query in rank order, so we can't provide a facets option based on that. We have the capability, but we need to review how we will do that.

Monday 9 September 2013

The shift to electronic - managing e-resources in Alma

Roger Brisson, Boston University
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

There are different kinds of e-resources: e-journals, e-books, databases, mixed packages etc. There are also different ways of ordering e-material. We were very early adopters of Alma, and at that time Alma was still undergoing lots of development. This gave us an opportunity to shape Alma. Over the past year, we've seen new functionalities every month, which has helped to refine workflows. The first thing we wanted to do when going live was to stabilise. We were also looking at how to automate things based on our core values. The library system used influences the reality of the services we provide, and with our old system designed for print, we were creating workarounds which were becoming unsustainable. When we started working with Alma it was really good to see how everything could be integrated together. I also monitored tasks that we were duplicating and looked at how this could be improved. With the discovery system, we can do things with our data.

As libraries, we need to re-define ourselves and we have to be more efficient with the management of e-resources. We must also change the way we do our cataloguing and share more. Our old system was too flat, not dynamic enough; because of the print-based design there was no flexibility and nothing more we could add. We now have more or less the same number of electronic resources as print, and it is likely that next year our e-book collection will be larger. We use 90% of our budget on e-resources.

Alma can enrich our records and make our discovery system better. E.g. there is a special box for 856 links in Primo and we are using this more and more. As early adopters, we've been setting up automatic workflows in Alma. We use existing standards (RDA and MARC) to optimise the new environment that we're working in and to cut down on things we don't need anymore. An e-book, for example, doesn't have any physicality. So we work visually to understand what is happening and how to design a system that works. Is an electronic copy a variant? What is it? With an e-book there are additional things: article links, videos, table of contents etc.

Examples linked to resources for a new MOOC that we are setting up: cataloguing an e-book is not that easy. We have a PDF file and an e-book reader; the numbering of pages is not the same. How do you describe this? We scrutinise these kinds of questions. Or take a subscription, a closed package that you make available on the net. The package has 96 resources, it's a subscription, and it's very dynamic because new e-books are added all the time. Managing a dynamic package is much more challenging than a static one. Or we have an e-book database, designed to be read and used within the database. We won't treat this the same way as an academic paper. Traditionally we would have pre-selected books for our readers, but now we are exposing lots of books that the user has to choose from. We are expecting a lot from Alma to manage this new type of model.

In terms of unified resource management and automation we want to take advantage of the central knowledge base. SFX has mostly been ingested into the central KB. We have to have a means of pulling data out of it, and at the same time new books are pushed in. There needs to be a trusted interaction between the publisher/vendor and Ex Libris so that when I activate a subscription, it's reliable, because I can't check 96 books every month. The confidence has to be built; that's taken care of by the CKB. We initially had an issue because the data wasn't reliable, the records weren't good enough, and we've worked to change that. We are loading vendor records that we trust every month. We're not quite where we would like to be yet. Those are mostly bib records; we have separate records for acquisitions, licenses etc. We can work at package level, doing either a simple import or OAI.

If we consider the bib record in the institution zone, we have a record in the community zone, which is the discovery record = work (FRBR). When we have an e-version, it is part of the electronic inventory = portfolio. The manifestation should be in the inventory, so based on the FRBR model we should first see a unique record. In terms of the logic of Alma it would make a lot of sense, because at the moment a bib record contains a lot of things that are not relevant; we should really clean this up. Alma shows clearly which records come from the community zone. Activating one should include all related records in the portfolio. When migrating to Alma, part of the process is changing print records, and using the P2E (print-to-electronic) process helps, but a lot of cleaning was still required afterwards. Having all of your inventory in one system allows you to think of efficiencies, such as having only one record for both print and electronic and having all of the data together.
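One rough way to picture the layering described here (shared descriptive record vs. institution-level portfolio) is sketched below. The class and field names are invented for illustration and are not Alma's internal data model.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CommunityBibRecord:
    """Shared descriptive record in the community zone (roughly the FRBR work level)."""
    record_id: str
    title: str
    creators: List[str]

@dataclass
class Portfolio:
    """Institution-level electronic inventory entry that points at a shared record."""
    bib_record: CommunityBibRecord
    package: str           # vendor package the title is activated from
    url: str               # access URL managed locally
    license_ref: str = ""  # pointer to local licence information
    public_note: str = ""

# Activating a package would, conceptually, create one portfolio per title
# while the descriptive record itself stays shared in the community zone.
rec = CommunityBibRecord("CZ-123", "Example Handbook", ["Doe, Jane"])
pf = Portfolio(rec, package="Vendor E-Book Collection",
               url="https://example.org/ebook/123")
print(pf.bib_record.title, "->", pf.package)
```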

Q: How do you manage conditions of use at e-book level?
A: You set up your licensing information; Alma's functionality is vendor-centric and it's through the vendor record that we set up the licences etc. Depending on whether it's single or multiple usage, we can manage that through the portfolios. The actual usage is controlled by the platform itself. If the licences are separate, you may want to have different records and have them all display differently in Primo. This might be confusing to users, but those are 3 copies of a book and we want all 3 links to be viewable and active.

ExL A: When there is a title available through different vendors, you can specify which ones you want to show by setting up "preferences". In terms of management, we are developing overlap analysis, so you will know in advance what duplication you have in your system between packages.

Alma - early adopters' experience

Notes from IGeLU 2013 Conference, Berlin

University of Bolzano Library

Reasons to adopt Alma: a future-oriented solution for integration of all resources in every phase of the life-cycle, support for library processes (core business) including analytics, networking & best-practice exchange with similar libraries etc. Joined at an early stage, so needed flexibility & change management. Took the opportunity to simplify rules & policies in the data migration process.

The timeline went from April 2012 (Alma training) to Jan 2013 (Alma live). Started at the circulation/fulfilment stage; even if it sounds strange, it was good for managing staff anxiety, followed by cataloguing and the rest of the migration (electronic resources etc.). In the first months, work on workflow revisions, configuration revisions, changing opinions about things decided before moving to Alma, troubleshooting, optimisation etc. Work in the resource management area to clean up records etc. More recently, work on analytics.

Positive points and changes: overall a positive experience, good cooperation despite time pressure, a solution-oriented and pragmatic approach. Lessons learned are that cloud computing requires a new approach to software: monthly updates, and no traditional test environment so all change is made in production. Problems that occur have to be reported "to the Cloud" instead of to the IT department. Reporting and controlling are improved, with a positive impact on organisational decision-making. Standardised workflows. Clean-up work requires manual input.

Open and critical issues: activation of electronic resources is very complex (some databases are still missing), lack of functionality in the Alma uResolver, developments still required for analytics, batch updates and set creation not always easy, migration of subscriptions and entering of acquisitions data, and customisation of workflows, cataloguing or display of data still needed; these are the major points.

Northeastern University Boston

Why Alma? A system that can handle electronic resources, better analytics, no local hardware maintenance, and a financial incentive to become an early adopter.

Timeline and challenges: April to July 2013 for the actual implementation, within a 9-month overall period, quite challenging! Mostly because there was no on-site visit, missing context in WebEx training, missing and still-developing data migration documentation, and confusing project management plans. The other main challenge was that configuring Alma before it went live wasn't possible.

Two months live... Optimistic about the future of Alma, happy to have hot fixes and the monthly release cycle, it forces us to look at workflows, and already impressed with analytics. The biggest disappointment is in e-resource management: in particular, the uResolver still needs lots of customisation, there is a lack of clarity about the knowledge base (same as SFX?), portfolio management is difficult and time-consuming, and stability and flexibility in licensing are missing. In addition, flexibility & customisation options are missing in user roles & permissions, order records, licensing and notices (which can only be customised at the institutional, not library, level). The software is not intuitive and some workflows are clunky. However, still very optimistic! Happy to work with Ex Libris to improve it and to build a strong community.

Discovery and Discovery Solutions

David Beychok, VP, Discovery & Delivery Business Unit, Ex Libris
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

What all discovery services have in common is that they have a better understanding of the various systems. There is a vote of confidence in Primo, with many new large consortia joining us recently. Primo is the front end for Alma, which doesn't have an OPAC, so there is a strong integration to bring you these solutions. We use big data tools, including Hadoop and Cassandra, which increase the speed of indexing and availability; they are good tools for massive data processing and storing.

The Primo implementation process includes stages 1, 2 and 3: Primo is set up in an automated process with default settings, and then customisations can be added. It is done smoothly and fairly quickly. Open Access is important to researchers, and Primo makes discoverability of OA repositories or hybrid journals' content easier. Primo has a functionality for institutions to register their repository in the Primo Central index.

We continue SFX development and releases. We also continue to support regular service packs for MetaLib, for fast search integrated with Primo. Primo has a usage data service and this helps with ranking articles. We have a pilot project for branding and are looking for interested parties.


Next generation Library Services - Emerging roles and opportunities


Oren Beit-Arie, Chief Strategy Officer Ex Libris
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

The context in which we operate today is significant: an increase in the scale and diversity of content types, increasing costs especially for serials, economic challenges in many institutions, changes in scholarly research and learning (e.g. MOOCs), increased pressure on and competition for library services etc. There is also greater attention to the value of libraries. We are also moving increasingly to Cloud delivery, so this is our framework.

We started developing it about 6-7 years ago, including SFX, the knowledge base, Primo and Alma, so we supported both the front end and the back end. The way libraries built collections in the past is changing: we're moving away from "bigger is better" and from ownership, and going more towards access. Many libraries are paying attention to fulfilment rather than selection, which means it's more geared towards the information need of the user. That also means moving from "just-in-case" to "just-in-time". The economy around research activities is also changing: there are more cases in which access is free, but there is more attention on the economy of the creation of content, i.e. who will pay for the collections? Is it the reader, the researcher, the library?

Data services: the goal is to maximise the benefits of sharing and to optimise management and discoverability. The two areas of focus are the Alma community zone and next-generation linking, of which I won't talk very much today, except to say that there are paradigms of linking, for example the SFX knowledge base, which is about pre-computed stored links at the article level. The notion of the Alma community zone was introduced a few years ago because there was a need to create more efficient management tools, and this needs to be done in the library context, so the community catalogue in the community zone enables libraries to add to shared content, add information for the user community, add or use vendor information etc., so that bibliographic control is improved. By creating a community environment, we support collaboration.

All this shared information is made readily available for discovery services in a streamlined way, so that you shouldn't have to work hard to enable this. It is a work in progress, especially for something like e-books. We work with the advisory group, which is composed of a broad number of representatives of users from around the world, and this group is helping us to pin down the model that works best for all (see yesterday's session).

Linked data has the potential to extend the level of sharing of knowledge between libraries and external sources. In libraries we are very collaborative within our own island, but linked data holds a greater promise. We are interested in this; we think this concept is tremendously important for the content that libraries have created and for outreach to external content. For example, an activity we're involved in is DM2E, which stands for Digitised Manuscripts to Europeana, and we are testing this; it's an important activity that involves big players such as Google. Other examples include the NISO bibliographic framework project, the W3C Schema Bib Extend community group (Shlomo), the LC BIBFRAME initiative etc.

Another thing we are involved in is trying to evaluate trends and needs in education and building support for those needs by creating new platforms. Users perceive (e-)resources as free web resources ("found it in Google"), the value of access to electronic content is not recognised by users, and in this realm there is a lack of recognition of the role of libraries. It is paradoxical because there is something good about this: you want the user to get to content as quickly and easily as possible, but libraries are getting pushed aside because users are not acknowledging their involvement, and this could impact funding for libraries. Our project enables libraries to brand those products to their institutional needs, so they are customised by your library and build awareness of your library. It is still at the pilot stage; we haven't yet talked about it publicly. (See "Academic Libraries' existence at risk", suggested by @chrpr.)

We are also taking initiatives for open access because we are aware of the significant drive toward OA around the world. Publishers are starting to respond, maybe not always in the way we'd like, but there is a lot of activity around OA. We are following the debates, especially between green and gold OA, because we need to be aware of what is going on. CHORUS is a publisher initiative about enforcing the OA mandate through publishers: is this how we want the world to be modelled in the future, and is it an opportunity for libraries? We want to discuss this with you and I would encourage you to come and speak to us about it. We are in a transition period. Primo is enabling OA, and when we look at this, it raises an important question. Some content is totally open with unrestricted access and discovery; other content has unrestricted discovery but restricted access. There is some bad stuff out there: there is closed access and closed discovery, and this is not just about us. Do academics/researchers who create content understand that some of their output is blocked out? So there are lots of opportunities to create better management of OA. When we talk of different models for OA, we see that many libraries are engaging in this.


Sunday 8 September 2013

Alma analytics

Asaf Kline, Alma Product Manager, Ex Libris
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]

The roadmap goes from descriptive analytics to predictive analytics (= what will happen) to prescriptive analytics (= how should we take action). We're still at the descriptive stage. This should help give a deeper understanding of what's going on in your library. Alma is optimised for analytics, e.g. cost per use etc. We are doing a lot of work behind the scenes to make this happen. The model is a star schema: a fact table in the middle with dimensions branching out from it, optimised for reporting. There is a shared platform and institutions can share their analytics data. It is also role based, so any report or dashboard created is sensitive to an individual's role.

There is functionality to schedule and distribute reports; any user can subscribe to the reports that they want to receive. Analytics works with APIs: for any report created you get an XML representation of the report, and it can be sent to the programme of your choice. The analytics provide a history, and that's how it can become predictive, because based on previous years we can see what may happen (e.g. funds burn-down).
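Since any report is retrievable as XML, the consuming side can be as simple as the sketch below. The payload shape is purely illustrative: the real Alma Analytics XML schema and retrieval mechanism are not described in these notes.

```python
import xml.etree.ElementTree as ET

# Purely illustrative payload; the real report schema is not documented here.
report_xml = """
<report name="Funds burn-down">
  <row fund="Monographs" allocated="50000" spent="31250"/>
  <row fund="Serials" allocated="120000" spent="98000"/>
</report>
"""

root = ET.fromstring(report_xml)
for row in root.findall("row"):
    spent = float(row.get("spent"))
    allocated = float(row.get("allocated"))
    print(f"{row.get('fund')}: {spent / allocated:.0%} of allocation spent")
```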

Usage and cost per use: we take the COUNTER info from the vendor and provide information as a subject area within Alma. We use UStat for loading vendor usage via a spreadsheet or SUSHI. When we know what's in your inventory and how much you pay for it, we can provide cost-per-use data. Usage is at title level, but when libraries buy packages, we bring the title data up to package level and give data for the package. The analogy is like TV: we pay for lots of channels and may only watch two. What's important is the cost per use, not so much what we watch. (Question here: we often raise purchase orders at package level but get invoiced at title level, so that could be a problem?...)
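The package-level cost-per-use calculation being described comes down to simple arithmetic; a toy example with made-up figures:

```python
# Hypothetical figures: one annual package price, COUNTER-style usage per title.
package_cost = 12000.00
title_usage = {"Journal A": 430, "Journal B": 55, "Journal C": 0}

total_usage = sum(title_usage.values())           # usage rolled up to package level
cost_per_use = package_cost / total_usage if total_usage else float("inf")
print(f"Package cost per use: {cost_per_use:.2f}")  # 12000 / 485 ≈ 24.74
```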

This has been a central activity in 2013, now being rolled out to Data Centers with a continuing effort to ensure scalability of infrastructure. There are additional subjects that we're working on:
- Usage (Alma generated)
- Requests and booking
- Vendors and vendor accounts (more analytic in nature)
- BIB (bib data) - unfortunately analytics and MARC don't work well together, so we will have tools to search for and process info that we have in the catalogue; at the moment there is the option to choose 5 local fields to get data from

Usage data is Alma-generated, captured via the Alma uResolver; it will answer questions such as:
- What services were provided?
- What queries ended without any services?
- What were the sources/ip ranges that accessed the system?

We want to provide a set of tools to gain insight into the structure and use of your collection, e.g. print or electronic inventory and usage, but we're missing overlap analysis, e.g. comparing titles from a package, so it's a system job that we run combined with analytics. We want to take our tools to collection level, e.g. shelfmark/classification ranges, and embed them/make them actionable so that they become part of the purchasing workflow, based on facts and analysis. We also want to start generating KPIs (performance measures) to help evaluate the success or failure of a particular activity, e.g. how much time it takes to process a purchase order or a vendor supply, the average % of items not picked up by users, request processing time etc.
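The overlap analysis mentioned above is, at its core, a set comparison of package holdings; a toy version is sketched below (matching on ISSN only, whereas a real job would also consider coverage dates and other identifiers).

```python
# Hypothetical package holdings keyed by ISSN.
package_a = {"1234-5678", "2345-6789", "3456-7890"}
package_b = {"2345-6789", "3456-7890", "4567-8901"}

overlap = package_a & package_b
unique_to_a = package_a - package_b
print(f"{len(overlap)} overlapping titles, {len(unique_to_a)} unique to package A")
```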

The data will also be made available to be viewed on mobile devices. We want to bring it up to network level, i.e. cross-institution reporting, using benchmark analytics (comparing to others "like me") and network analysis (members of a consortium), so this is more about disclosing strengths/weaknesses/overlap and how collaboration can be improved.

Ex Libris Alma Community Zone data strategy - from concept to reality

Dana Sharvit, Product Manager, on behalf of Sharona Sagi, Director of Data Services Strategy, Ex Libris
Alma Community Zone and next-generation linking initiatives
[live-blog IGeLU Conference 2013, Berlin - please forgive typos etc.]


The Alma institution has its own zone, the local catalogue and the inventory. The local catalogue includes the Library's collections, in all formats. There is also an inventory to help manage the collections' holdings. The workflow for managing acquisition of resources is the same for all formats.

When it comes to managing electronic resources, we see that there is much effort from each individual library and we want to leverage this effort for the community; the metadata should be available in the Community Zone. The CZ is built out of 3 things: a portion of the community catalogue, the central KB and the authority catalogue.

The Authority Catalogue contains copies of authority files from major providers, in particular the Library of Congress Subject Headings, the LoC NAF (Name Authority File) and MeSH, as well as GND Names & Subjects as of October. Those are updated regularly, so Ex Libris runs a central service instead of each library loading and updating authority files locally.

The Central Knowledge Base includes all the offerings of the different vendors. We also have all the linking information that will enable linking to articles. So the central KB is a resource that describes packages of electronic resources offered by a wide variety of vendors, including the titles that are part of each package and linking information for individual titles and articles in those packages.

The Community Catalogue holds all the metadata records. The sources come from different publishers and aggregators, as well as the Library of Congress, the British Library etc. We have a dedicated data services team that manages this and goes over the quality of the records to ensure they fit all the needs of the various workflows. This will start with e-resources and we'll add print later. The metadata that we load needs to fit a lot of purposes and a diversity of workflows.

We want the catalogue to be for and by the community, and we want the community to share relevant metadata information so that it can be as full, correct and comprehensible as possible. The Advisory Group is composed of a core group, including representatives from different parts of the world, and the aim is to listen and understand the global needs of the Community Zone for the Community Catalogue. The full group is much wider. Its focus areas are:
- The Community Zone data model - matching/merging records from different sources, issues relating to various metadata schemas, governance and policies relating to local annotation
- Contribution to the Community Catalogue by individual Alma libraries
- Workflows - what are the most streamlined workflows for working with Community Zone records?
- Licensing implications - as we are assuming a CC0 license, are there implications for any of the above?

A shared record becomes part of the library inventory. Inventory information is handled locally at the institution level, including linking information, provider data, public notes, access notes, PDA programmes, license information etc. This will be published through the discovery service (Primo) and the record can be enhanced with book reviews, book covers, reading lists etc.

Every change that is made in the Community Zone will benefit others because it is transferred to the local zone, as both are linked. The Community Catalogue is part of a collaborative network: different member institutions create the Community Catalogue. This is the Community Zone. Each member can use their own catalogue or the Community Catalogue. It is very flexible and the Community Zone can be used in different ways.

In the next few months we are planning to add more records to the Community Catalogue. We'll be working with the Advisory Group and implementing their recommendations. By mid-2014 we plan to implement the shared record model, then support community contribution by the end of the year. In 2015, after establishing a robust catalogue for electronic resources, we'll start adding support for print records.

Libraries will gain through maximised sharing for maintaining and managing the records. The central authority control will help with the management of those records. There is functionality making the importing process more efficient.

Questions and answers

Q: One record fits all, but what if we take a record home and make some changes etc.? How does this impact the shared catalogue?
A: There will be functionality in place to help you contribute to the records managed via the Central Catalogue. Today this is not yet the case, because if you want to add information you need a local copy of the record, which defeats the purpose of the Community Catalogue, but this is going to change. The inventory is where it will be possible to manage this.

Q: It is good that records will be CC0; are there any plans for people not part of the community to access this data?
A: Those records can be used by all. This is not, however, all in our hands; we have to negotiate with data publishers etc. We are not allowed yet to completely open up the Community Zone, but we want to maintain some value for our customers, so whenever possible we will open up the data, although it may not be possible everywhere.

Q: How to integrate other catalogues?
A: We are looking into this

Q: Catalogues that have a high update frequency, how will that work?
A: E.g. authority records are updated automatically on a regular basis. If a specific catalogue managed in the Community Zone is often updated, we would have to look at it on a case-by-case basis.

Q: In different countries the authority files are connected to different standards, how will you handle these different connections in one catalogue entry?
A: That will be one question that the Advisory Group will have to look into

Alma Product Update and Roadmap

Bar Veinstein, Corporate Vice President, Resource Management Solutions
[live-blog IGeLU Conference 2013, Berlin]

This session is about Alma. If you'd asked me 3 years ago, I wouldn't have thought we would have reached 33 live institutions today, which is quite amazing. There are an additional 52 in the implementation phase.

Success factors: customer trust (reliability, security), ease of implementation, openness & interoperability, depth of functionality & efficiency, innovation/cutting edge. People talk a lot about interoperability, but it was not until I joined this industry that I grasped what it really means. Banks compete with each other; they don't share APIs etc.

Multi-tenancy, why should you care? A lot of customers don't care or prefer the hosted environment because they think it means a dedicated environment for them. Multi-tenancy is valuable for the vendor, but actually it is for the customer as well. This is because:
- One version: no customer is left behind when the software is updated. This means there is one single version for everyone and it doesn't matter anymore which version we're on. It also means that new features are made available much more quickly because we don't need to maintain older versions
- Painless upgrades: vendor-managed updates, automatic
- Disaster recovery: a matter of economies of scale, with quicker response and resolution
- Vendor is responsible: customisations & integrations must work in the new version

One of the main concerns of customers is security. We need a multi-dimensional approach. We take this very seriously: we have ISO certification, we have an external company that monitors our firewalls 24/7, and we've implemented a lot of capabilities for business continuity etc. This is valid for all products, but it is mostly because of more and more work in the Cloud that we've put the 24/7 monitoring in place. With regards to Alma, it is not possible to connect to it without HTTPS. For cases where an institution's protocol is not encrypted, we've developed a solution that uses SSL as a wrapper, to ensure that the communication is secure and encrypted.

Ex Libris can monitor user experience per transaction, such as end-user time, health of the transaction, errors, calls per minute, etc., so we look at the performance of all the instances across the world. This is something new, specifically developed with Alma. Alma deploys in months, not years; this includes ERM and the link resolver and, in some cases, discovery. This is what customers have asked for. We know it's not perfect yet, we know some people are complaining and we need to work on better training, but we've created a methodology around this. We are also working on migration from a diversity of other products.

We have a strong vision for developer collaboration. We understand that moving to the Cloud with SQL access is not feasible. We have to take the responsibility of developing the APIs, so we have a developer platform where we are committing to deliver these APIs and services and extensions to apply them. Until now, we've developed them but haven't yet done all the delivery. But we are going to embed the API Management Infrastructure, to allow us to have much stronger capabilities for the delivery of APIs. The API Console will be available as part of the portal so people can learn and test APIs before implementing. This will be released in the second half of 2014.

Dvir Hoffman, director, Marketing and Product Management
Roadmap planning - some highlights

Collaboration - maximise cooperation, integration and sharing between institutions while supporting each institution's particular workflows and standards
Efficiencies - reduce costs by streamlining processes and redirecting staff to other more important tasks
Digital - continue to enhance unified resource management and provide increased consolidation options

We are also working hard on analytics solutions. The community zone includes more authority records, and we've created a working group to bring a community catalogue into the community zone. We are also working on unified digital resource management.

Next year we will focus on emerging electronic licensing and purchasing models. The community catalogue will be licence free, i.e. CC0, as we are committed to openness. We want our Knowledge Base to expand. Alma should be a multi-format solution and was built from the ground up with this multi-format model in view. We want more predictive analytics.

We have a master plan, but we maintain the flexibility to adapt and allow for changes in the roadmap. Initial plans for the next release focus on customer value:
- Overlap analysis = ability to run overlap analysis across packages or between vendors: we plan to push the analytics into the workflows, firstly by helping to save costs for acquisitions
- Next year we want to provide benchmark analytics by developing KPIs - you can define your own and they can be compared with the Alma community
- We continue our investment in digital resources, by providing enhanced collection and metadata management for existing digital collections - no need to migrate files or ingest processes, thereby simplifying the process for staff. Provide ongoing OAI-PMH based harvesting of resources into Alma (see the sketch after this list)
- Resource sharing in ILL, i.e. Resource Sharing Driven Acquisitions
- Copyright control: embed licensing information and monitor digitisation requests in order to make sure there are no copyright violations
- Patron Requests for Purchase - better service for patrons
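OAI-PMH itself is a standard protocol, so the harvesting step mentioned in the list above can be sketched as below; the repository URL is hypothetical and this is not a description of Alma's actual import mechanism.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Hypothetical repository endpoint; the OAI-PMH verbs and namespaces are standard.
BASE_URL = "https://repository.example.edu/oai"
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

url = f"{BASE_URL}?verb=ListRecords&metadataPrefix=oai_dc"
with urlopen(url) as response:
    tree = ET.parse(response)

for record in tree.iter(f"{OAI}record"):
    identifier = record.findtext(f"{OAI}header/{OAI}identifier")
    title = record.findtext(f".//{DC}title", default="(no title)")
    print(identifier, "-", title)
```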

Customers can choose their networks, so you can be part of a consortium but there is flexibility, such as shared catalogue, acquisitions, resource sharing (ILL) etc. Other things we are working on are:
- New purchasing options with vendors: this should come in the next few months, we are working with vendors to streamline the processes, reduce duplication etc.
- We've also engaged actively with some vendors in terms of security, some don't use secure protocols and are not Cloud ready, so we are working on that
- New acquisitions models: vendors themselves need to make decisions, especially for electronic resources
- Bibframe: we are taking an active part

Questions & answers

Q: What about e-book acquisitions workflows and platforms, e.g. ebrary etc.
A: We started working with a few vendors to streamline acquisition processes, including e-books, so using Alma acquisitions tools. The customers using Alma today have e-book purchasing. It is not yet very efficient. So we are working with the vendors to integrate different selection processes and loading the metadata. It's not easy, some vendors are very protective because they make money on their metadata.

Q: New Developer Platform and API Console: how are you going to communicate with us, the developers? What is your strategy?
A: We haven't engaged yet because we had to finalise the vendor we will use first. Once the contract is signed (end 2013) we will start engaging with the user group at the latest at the beginning of 2014.

Q: Electronic resource management: are you talking to local initiatives? E.g. in the UK we have KB+; are you talking to JISC etc.?
A: We are aware of KB+ in the context of licensing, so we've started active engagement and are looking into ways to integrate or link to external products. This is not just with Alma, but also concerns SFX. We believe in working at the national level; in Germany there are other similar programmes and we work together. So it's not because we work closely with vendors that we forget national initiatives.

Q: Alma puts a lot of emphasis on opening up and using CC0, will you do the same with your own Knowledge Base?
A: Good question! At the moment there are no such plans, but we should discuss this. We are not a non-profit organisation, so we make money on Alma, and because we are profitable we were able to develop a good product. We should raise this question in front of the management team.

Q: In which way are you thinking of public libraries? Especially interested in circulation and e-book readers.
A: Most Ex Libris customers are academic, but there are lots of consortia that include more and more other types of libraries, and we know we have to focus on this. In terms of e-book readers, we are talking to some vendors. This is on our roadmap.

History of the World Library 2040-2090

Key note speech to the IGeLU Conference in Berlin
Michael von Cotta-Schonberg, Deputy Director General, the Royal Library, University of Copenhagen
[Note: this is live-blogged and is an imperfect representation of the excellent talk - hopefully it will give some sense of what has been presented]

The major governing idea of this presentation is that modern technology will eventually make the traditional library obsolete. Literature will totally migrate to the e-format and a new library culture will develop to handle this situation.

This is a 50 year Jubilee address of the President of the World Library (WL) to the members of the board, whom you represent. You will have to vote on a crucial issue.

My part is to give a brief overview of the Jubilee in 2090. Before the establishment of the WL in 2040, there were institutions called libraries (from the Latin) or Bibliothek (from the Greek). Let's have a brief look at them.

Prehistory of the WL: we leave that to her Holiness Popess Benedicta the 18th. It was recognised that words also have to be written and read, not only spoken. They had to invent something to write with and to write on. The oldest texts in the world were written on clay. The first known library was Ashurbanipal's palace in Nineveh. For thousands of years they used the skin of calves; then printing was invented, which is a process to reproduce text and images on a support. That was to be short lived, lasting about 500 years. During that period literature was a scarce commodity, due to the difficulty of access to books. The solution to the problem is almost as ancient as books themselves, and that is the library. Libraries as a solution to the scarcity problem were so efficient that they proliferated. There were libraries everywhere in the Middle Ages. When books became more of a mass product, libraries were everywhere, including in universities, and even nations had their own libraries. But as book production became cheaper, more and more people bought books and had their own library. Mostly, books would never be read or never read again - not very efficient...

So within one generation the whole of world literature was digitised. E-books had their own distribution systems. E-book publishers kept their books on their own servers because they wanted to protect their own income and services, but Google made them cheaply available on the internet. There was a period of transition at the beginning of the 21st century, but people weren't aware of it. A great part of non-fiction literature was produced with public money, sold to private companies who would then sell it back even more expensively. The cost of this double system was enormous and wouldn't allow the new e-based system to develop. Academics tried to invent new systems with the support of national governments, but academics didn't have the strength to break the monopoly of the great repository institutions, and that meant that sometimes double costs were paid for content that universities had themselves produced.

Google was the solution! In 2035 they made an offer to the UN to turn over all its assets to the Library, under one condition: that the new library would be called the World Library. This idea was much older; it had been imagined by H.G. Wells, who called it the World Brain, but it's more than a microfilm. Of course Google's offer raised opposition, but by that time Europeana and other counterparts had grown so important that they had enough say. Opposition was overcome and the WL was approved by the UN in 2040.

Mrs Pippa was the first president and the first to receive a rejuvenation procedure, as those became available in the 30's. The WL shouldn't be a centre of texts but a networked structure giving free and easy online access to texts of all kinds stored on servers belonging to their owners; it would provide free access to the content of institutions, libraries and organisations that decided to join the WL. Those were given contributing partnerships and voting rights. At the outset the WL didn't own any collection of literature except the collection belonging to Google, but it gave access. It was one global network where each collection partner gave access to the whole world. It was a daunting task: legal and formalist problems had to be solved. The financial issue was fine because Google gave a huge grant for 25 years.

By 2050 the technological and organisational network was in place. But then crisis struck. A small village in Paraguay lived in splendid isolation from the world at large, but the community grew and proclaimed the overthrow of a civilisation that was corrupt, and this included the WL. The group was led by Raoul. They appeared to be healthy and happy, so the government left them alone, but this was a mistake. On Jan 1 2051 they launched an attack on the system of the WL, a virus that crashed the whole system, and it was closed for 2 months, causing absolute havoc. The digitised content of one of the most-contributing universities was lost because the person who held the access key died. It was therefore necessary to take backups of all the content of the WL. In the end it was decided that copies should be deposited by all members. It cost a fortune but was more efficient than using unsafe procedures. By 2060, the WL had ceased to be a collecting network and had become a central library, not for use but for security.

The hacker problem was solved effectively. Every year the WL organised a competition inviting hackers to break into the library; the best one would get a very well paid job with the WL. Only twice was the library hacked, and thanks to the safety copies it wasn't disastrous. Gradually the WL regained its credibility. It would also comprise commercially produced literature. Print-on-demand copies could be ordered, and this gave the publishers a special niche industry. Legal deposit of printed literature had been replaced by legal deposit of e-literature; however, this was only accessible in the member libraries' own reading rooms. Eventually it dawned on the publishing industry that this was obsolete, so a petition was made that national libraries had to make this collection of e-books available at a price determined by publishers. The authors would get some contribution and the market would be managed by the publishers. The whole procedure was fairly simple, and by 2068 the WL was a seller of world literature.

Publishers' profits rose significantly. But the movie and music industries wanted to be included, and by 2075 the WL had gained such momentum that they were integrated. The LOE was created: the Library of Everything. Merging the global web with the WL was an idea that was raised. The discovery department of the WL is today the largest and employs lots of people cleaning the metadata and enabling full-text searching; it is expected to come out with something workable in the next decade.

The global publishing system is complicated. For fiction, it is easy for anyone to enter the market. Independent publishers flourished as never before. Self-publishers overloaded the market, but the public got tired of badly written novels; small-scale publishing has by now become a viable industry, but only if you're not in it for big profits. About 100 years ago many thought this century would see the death of the book. Untrue: it took off fabulously together with the age of the computer. There are different publishing models. The golden open access system made it tempting to publish more and more articles paid for by the authors themselves. This functioned reasonably well but was unsustainable because of the sheer volume of publications. A new system was developed where pre-publication peer review comprised only the first part, the assessment of methodological soundness; the assessment of scholarly importance was left to new forms of post-publication peer review. This made publishing run more smoothly, but the post-publication peer reviewing didn't work so well, which meant there was an increasing number of scholarly articles that were methodologically sound but otherwise insignificant.

Something had to happen. In 2081 the board of the WL received a request to develop an administrative system for all new scholarly publications covering the fields of the Nobel Prize, grading each publication as A. eminent scientific contribution, B. valuable scientific contribution or C. acceptable scientific contribution. There would only be a small fee per scholarly publication. The WL created a list of quality publications.

In a very short time the first thing you looked at was the profile of quality certification obtained for a publication rather than the work of the author itself. Quality assurance was taken over from the scholarly associations that had initially introduced it years ago. This was the fourth crisis of the WL, and the most controversial. The proposal now is that world citizens should have a direct connection to the WL for retrieval of data straight into the brain, using technologies and an interface developed by the Kulu organisation. The technology has been proven: a three-dimensional spider in the mind, allowing you to look in all directions at once and to have text, pictures and sound all together. The programme can inject new knowledge into the brain. The voting options are:

1. The world needs to know about the WL's efforts to make all knowledge available without profit. Voting yes means giving access to knowledge in its fundamental sense. The empires of the future are the empires of the mind, so giving our brains access to the sum of human knowledge is the best way forward. By accepting direct, unmediated knowledge you will support a new knowledge order and thus a new world based on the right to knowledge. Individuals will achieve a higher level of intelligence. You can steer in any direction you choose. Please vote yes.

2. The risks outweigh the benefits, for two reasons. First, the risk of two-way communication between the library and human brains: we cannot guarantee that partial publication from brain to library can be avoided, and the consequences of a massively intelligent computer system are unclear. The second concerns security. Library computers can be breached, and this would give hackers the ability to upload data to human brains. This could be used to force you to buy new products or to program autonomous functions into the brain, and it could lead to the complete collapse of human civilisation.

Now it is for the members of the board of the WL to vote. The motion has been defeated! God save us all. Maybe librarians were wondering whether libraries had a future. They had a great one. It led up to the fulfilment of the great library, but the defining moment was the transition to the ubiquitous digital library. National libraries have survived to preserve and make national and historical collections available to the community. After the WL came into being, most public libraries were integrated into cultural services under many different names. The profession of librarian is a glorious one. We are excellent researchers.

The past is an ever-changing theatre of interpretation; the future is a stormy sea of potentialities; only the present stands firm, but it just lasts a second... Fortunately!

Opening session

[This is live-blogged from IGeLU 2013 Conference in Berlin, please forgive typos etc.]

Welcome by Jiri Kende, chair

Updates on the Steering Committee's situation, with a special invitation for members to vote. Results will be announced at the closing session on Tuesday afternoon. Next year, the IGeLU conference will be held in Oxford, hosted by the Bodleian Libraries.

Welcome by Matti Shem Tov, President and CEO, Ex Libris Group

Unfortunately I can't join you in person as I broke my leg 3 weeks ago in a very inglorious way... [nice pic of Matti with crutches!]. What have we done since last year? We have 280 new institutions and are now over 5,500, we employ 530 people, run 3 data centres (Chicago, Amsterdam, Singapore), have a revenue of $95m and have made Alma live. We have a new owner: Golden Gate Capital, a San Francisco-based private equity firm with $12 billion in capital under management. Ex Libris remains an independent business, with the same executive team, roadmap and operations.

We are working on the transition to SaaS; there are a few tedious issues, but more and more applications are developed in SaaS. We are developing our technology, as well as our business model and operational structure. It is a priority for us. We want our company to be in the Cloud, although we will continue to support local installations for years to come. In 2008 only 11% of our customers were in the Cloud; today that has risen to 75%.

This past year has been very good for Primo. It now serves 1,916 institutions worldwide. We focus on particular aspects, especially personalised ranking and tighter integration with Aleph, Voyager and Alma. We moved to agile releases, which deliver new functionality bi-monthly. We have introduced Big Data technologies (Cassandra and Hadoop) and use Oracle. We continue to enhance the product and have introduced large-scale multi-tenancy.

It has also been a very good year for Alma: it is live at 33 institutions worldwide and an additional 52 institutions are in the implementation phase.

Our other products such as Aleph and Voyager still serve more than 4,000 customers and we are continuing their development. Rosetta is a unique product for massive digital content, and we have more and more customers using it around the world.

The Singapore data centre is relatively new and currently supports 4 customers; this is going to increase. We've achieved ISO 27001 as a security standard. We take a lot of community initiatives and support various causes, such as open access. We have close collaborations with our customers via the community zone advisory group and the voting process for enhancements or OPAC functionality in Primo. We collaborate with the product working groups, have testing teams for the new CRM system and work with joint focus groups.

Koby Rosenthal, Corporate VP, General Manager Europe, Ex Libris

The partnership within this community is very rare, because the industry is usually very competitive. We should be proud of this because it is really unique, and I find it impressive. Updates for Europe since the last IGeLU: we have more and more institutions using Primo and we are adding more experienced people to our organisation. We try to bring more people from the headquarters to Europe. The UK early adopters programme is very successful. In many places Alma is now live, which doesn't mean there aren't still some issues to resolve, but we can learn from what is done in various places. We have a very experienced team who speak the language of the countries with which they work.

We are moving forward with Alma and are working in close collaboration with institutions that have gone live with it as well as those who are showing an interest. We are transferring a lot of knowledge to our staff in Europe as well as to institutions. We are building a new team that will look at the local specificities and needs in various parts of Europe, for specific tailoring to the different languages. We want to support current and future Alma customers and are adding knowledge and experience to our organisation.

We have appointed a European strategy director and are planning enhanced cooperation by increasing the number of solution days and establishing regional directors' meetings. There is an Alma early adopter programme in the German-speaking countries, including the University of Mannheim as a large German library, and this will kick off in October. We are in the process of implementing similar programmes in other countries such as France, Italy etc.

Wednesday 10 April 2013

Cloud Computing MmIT Conference 2013

Just back from attending the MmIT CloudBusting conference in Sheffield, I took extensive notes which were meant to be "live-blogged", but for various reasons (mostly technical, and being ever so slightly disorganised) I wasn't able to post them on the day. However, the notes are available, so here goes (with the usual caveats, i.e. potential errors etc., so comments/corrections welcome):

Content of this blog post:
  • Welcome by Leo Appleton
  • Keynote presentation by Karen Blakeman
  • Lise Robinson OCLC
  • Rapid fire sessions
  • Bethan Ruddock MIMAS
  • Panel questions and answers
The programme of the conference can be found here.

Welcome by Leo Appleton, Uni of Sheffield, IT director 

Theme of this conference: Cloud busting. Given the number of people here today, we think we've selected a relevant theme. What we intend to get out of this conference is an understanding of the extent of what the Cloud means for libraries. There is a green slip in the welcome pack for writing down questions to submit to the panel for the closing Q&A session.

Welcome to Sheffield under the sun (we had snow one week ago). I want to tell you why we made the move to the Cloud. The Cloud is everywhere: we all have various devices, do our shopping online etc., and this brings a lot of expectations. User interfaces are easy to use and we expect things to work all the time; if something breaks we expect support 24 hours a day, on mobile devices too. Most of the services we deliver are actually in the hands of service providers and we have very little say. These are some of the challenges that we face. We also have to do things overnight, over the weekend etc. It's so much part of our lives, we are so used to it.

We moved to the Cloud first for email. We were one of the first unis to go completely Google for staff and students and we've been running like that for about 2 years. I think we provide a much better service. When Google goes down for one hour it's on the news, whereas we once lost email for 3 days; that would be unacceptable now.
It's not a cost-cutting exercise; we have saved money, but that wasn't the main reason. It's about providing better services that are more available and user-friendly.


Keynote: Karen Blakeman, RBA Information Services
Searching in the Cloud: bright new dawn or stormy weather? (view slides of the presentation)

Delighted to be in sunny Sheffield compared to cloudy Reading! I have my own information services company. I don't have access to large databases, so most of my research has to be done in the Cloud and I have to be creative to gain access to some papers; mostly it's a matter of contacting publishers and asking them. The issues and benefits of the Cloud are very much the same as for a large organisation. We want to see where we're coming from, and in a way it's almost like going back.

Definitions of the Cloud: stuff that is out there; it can be all sorts of things, anything in electronic form available on the web. Another definition is data and services that are served and accessible via the web, so not just publicly available sources but also your own papers and the documentation you need to run your organisation, hosted externally behind a single interface (?).

If you remember when we had tape etc., you had to host it yourself and pick the one data source key to your business. The real progress then was having only one device, like the telephone: it gave you access to everything that was available, and you were no longer restricted to what you had in-house. There was much more data that you could access. It's the same now; the key is how you get to it. The internet was a possible way of accessing these databases. Initially we were still using dial-up methods etc. and a pack of telecom software to access the info. In 1994 Netscape came; until then stuff sat on floppy discs and we were telnetting, so point-and-click was key. Now fast-forward to my beauty, my Android. From this I can access everything; forget the laptop, with my Android I can quickly tweet people etc.

So Cloud computing is not really new; the main difference is publicly available data. E.g. students working at a distant location can access information outside of a physical location. Federated search engines: a lot of you use them. There are different hosts; they go out identifying databases and resources that are relevant and develop easy-to-use interfaces. Not only text, but visualisation, images, clouds of data etc. There's lots going on in the presentation of data. Wonderful links to resources, inside and outside an institution; a large amount of resources outside that you can link to your own info. E.g. repository searches, OpenURL, open access etc. I won't go into details, but the problem is that you may be able to find something and yet not access it. There's a large amount of different types of info.

My wonderful Android is linked to my Google account so everything is synchronised. It also means I've got backups. I have access to all my various accounts, my apps, calendar, Googlemail, Evernote account, Mendeley etc., and it doesn't matter what device I'm using. I can access my events etc. and don't even need to print my tickets because I can just show the screen. Mobile is the way forward; it doesn't matter what platform it is, it's all out there.

My question is: where's my stuff? This wasn't a problem with my desktop: I had a file manager, or a file locator for when you really don't remember where stuff is. But for the Cloud that doesn't work. Google tools etc. are not designed for real-time stuff but for indexing stuff. There are other services designed for that, but they're limited, e.g. the apps limit the amount of data you can pull back. Topsy is a good one, or there's good old-fashioned RSS, but that's not even really real time.

So how do I find my own stuff? CloudMagic does that, but the downside is you have to allow it access to your accounts. I did this for Facebook, Twitter etc., then I panicked: it's a free service, what are they going to do with my data? I pulled the plug and changed my passwords. So what's their reputation, where is the data stored, what security is there etc. are all important questions. If like me you travel a lot, you might get wifi or not, battery is a problem etc., so the Cloud is great, but when you're mobile you will have these kinds of problems. Stuff will be unavailable. You don't know what's going on behind the scenes. Privacy glitches happen. Security is tighter now, but we need to be aware they can occur with any system. There will always be leaks; it's a case of risk assessment.

With broadband we made the assumption everyone has a fast connection, but it’s not always the case. What’s the policy regarding backups? What if everything goes wrong? Do we need to keep local backups? Don’t assume that because your stuff is in the Cloud it will be well taken care of. I always have my own backups just in case.

Obfuscation, fragmentation and isolation are important words. Obfuscation: if you read contracts (do you read them? we don't, because it's a nightmare to try and read them), it's concerning that there is a blurring of the boundaries between public and private. Google is trying to combine your own personal info with institutional info, with search etc. If you search, it will give priority to content from your contacts in Google+. Or there are situations where information only comes from the circles you belong to, and this touches on problems of confidentiality. You have to be very careful even when doing a simple search, especially if you're logged in.

Personal information is another tricky aspect. Again, if you search within your own circles, not all the information may be public, but that's not always obvious. So we need to think about how much we will incorporate into our interfaces for our users. Another example is a local Facebook group: there's a mixture of information and people are not always aware of the implications and differences. Some parts of the pages are publicly available, others are only for members.

There’s also an assumption that we are all well connected on the internet. Having been online doesn’t mean that we are well connected. A big chunk of users don’t have a good connection, in some parts there is no internet at all. This applies to school children, students etc. So we should keep things simple, not everyone has got megabus broadband with powerful downloading capabilities.


Lise Robinson, product manager OCLC
“Something good is going to happen: an overview of developing a cloud based solution”

I am the product manager in the UK and Ireland for OCLC WorldShare. My approach for this workshop is quite personal: my experiences of working at OCLC and concerns about cloud-based data management. So not a sales pitch but a personal approach. I will have questions for you at the end because I want to get feedback from users.

I am a librarian and have been working with LMSs for a number of years. I needed to integrate Dr Who into my talk; I'm very excited after Saturday's episode. Everyone is interested in the Cloud, and the episode was about that, in a scary way. It's very sexy and everyone wants to know about it. But at OCLC we've been doing it for 40 years. Since OCLC was formed, WorldCat has been at the centre of what they do. It's a universal library catalogue and has always been hosted.

Why has OCLC gone for the idea of a hosted platform? Today's library environment looks like a model of interlinked services, data etc. We all use similar systems for acquisitions, finances etc., and the systems have become messy and the libraries isolated. The idea of webscale is that bringing the libraries together gives us better services. I have a positive outlook on what's going on in the Cloud. The model we follow is one that you see being successful with names you know such as eBay, Amazon, Facebook etc. It's the idea that libraries come together to form a community because they want to do something together at webscale, bringing together the data and the applications; the community gives you the power and the sharing abilities. It's made possible by Cloud computing, and it helps sharing and re-use.

When WorldShare Management started from scratch we looked at these big names. We had to face questions such as: what are you going to store, are you going to store my stuff on your server? It's about saving costs. Interesting what Leo said this morning, that it's not about money. You have to be careful talking about costs: you might be talking to IT people, library systems managers etc., and we're telling them we'll take charge of the management of their servers. But we are taking away those mundane tasks in the belief that you can go from spending 70% of your time on managing services to 30%, and spend the rest on developing innovation. It's about freeing people up from the management of hardware to focus instead on services to users.

We did research on library transactions (e.g. back office transactions, OPAC transactions etc.) and looked at how our software could provide the right sort of infrastructure to bring efficiencies to libraries. Another argument is that the Cloud is greener: it uses fewer servers and has a smaller carbon footprint than each library running its own servers. It's maybe not about internal efficiencies, but it's an interesting aside.

So we are pushing the idea of something new and maybe better. It's a struggle when you are talking about the Cloud, because a lot of people ask questions based on the traditional way of working with a locally deployed LMS. We hear: what would happen if? And most of the time the answer is that this won't happen anymore. For example we are asked about offline working and backups, what if the server goes down, but the answer is it won't go down. This is a positive thing about the Cloud: you shouldn't have to worry about that. We are not reliant on one server; we have racks of servers and we can invest in powerful systems. It won't be the library server that goes down, it will be the organisational network.

What about downtime and upgrades? Currently we are obsessed with minimising downtime. But in the Cloud you probably won't even notice when an upgrade is made. We do it regularly, about 4 times a year; it's a tricky moment and we need to be careful, but libraries won't notice.

It's about efficiencies, so not cost savings but efficiencies in the management of services, especially where each library does the same thing. It's the idea of shared data and shared tasks; we are experts in this field in the library world. There are other options for shared data to create efficiencies: e.g. we may use the same vendors, check the serials in the same way. The development process is community driven. The other one is ERM: if one institution creates licence data (or uses data that we've created), the next library can simply re-use what's already there. There is one version of WorldShare and you have an instance of it. It's not just about creating data; analysis also becomes an interesting idea. Others might have specialities. You can analyse holdings in a specific area and compare them with other libraries. When working in silos, you don't have these options.

The big scary elephant is security. What about my data? Can I get it back? Is it safe? I don't want to give you a law lesson. I'm told (I did a bit of research) that a lot of people worry about the US Patriot Act (this is my understanding), which is part of a number of security measures set up after 9/11 and is about the ability to access data (terrorism etc.). People are horrified about the idea of keeping transaction data in the US because of the idea that this Act means data held within the US is accessible; apparently a lot of this is scare stories. What we do at OCLC is abide by local data protection acts. We have to take account of customers' concerns, and we have been thinking about these things. The Patriot Act gets all the publicity, but in the UK there is the Regulation of Investigatory Powers Act (RIPA). So we take security seriously; it's about encrypting the data. There is more information on our website about this.

My questions to participants are (what I’m interested to know about you):
Why do you want to move to the Cloud?
How will your organisation benefit?
What would you move to the cloud? What’s stopping you?

Discussion:
Q: You were talking about offline backups. We have mobile libraries going round. We are thinking of a cloud-based system but not sure how that would work.
A: The default answer is that you can access the systems from anywhere and that’s our default attitude. But in some cases we have to acknowledge that an offline system is more appropriate and then data/transactions can be uploaded to the Cloud. When we are so immersed in this world, we tend to forget situations where it may not work so well. Most of the users we work with are more static.
Q: Is there supply and demand for online video?
A: We provide a platform, one part of which is WorldShare Management Services, so video would have to be streamed; it's not a discrete service we offer.
Q: Can we keep local copies? 
A: I once heard of someone who couldn't imagine that there were libraries with data they didn't want to share. Our approach is that we need to develop the ability to have records hidden from WorldCat. We have the flexibility to catalogue all sorts of types of material, e.g. law material, private papers etc. This is the positive advantage of silos! But it's on our roadmap: it will come, it isn't there yet, but it's recognised as a high priority. WorldShare offers the ability to catalogue in a shared environment. There are many records in WorldCat that are not traditional books; there are other types such as e-resources etc.
Q: We are going with Alma in July in the Cloud. I’m an IT guy in the library. OCLC is out there already. Is there a way that both systems are going to “speak” to each other?
A: Yes, you've got WorldCat, and we've built around it an alternative to Alma, while WorldCat still exists in its own right. So access from Alma to WorldCat should be possible. With WorldShare you don't need to download the records because they're already there, but with other systems we'll need to make sure it's possible.
Q: What we are hoping for from the Cloud is not only library services but all the other stuff: APIs, finance systems, student record systems etc.
A: That's one of the ideas of WorldShare: it's the community data services and apps, and the platform is built using an open architecture. Even what we describe as the LMS is a set of services on the platform, and users can build other services on the platform. E.g. if you use EBSCO as your discovery service, you can build it in on the platform because it's an open system. Apps that are written for WorldShare we will quality-check and make sure the rights are respected, but that's our idea of a community. That's the real beauty of the collaboration.

Rapid fire sessions

Penny of Bailey Solutions: Tracking software
Outlook is not enough to monitor enquiries (there is an article available about this). Why would you want to track enquiries? You want to improve your service; it's also about not missing deadlines, and emails don't give you statistics, so you have to enter them somewhere else. Our software provides the tools, and it's also about the quality of the information (statistics). It is especially useful if you respond to FOI requests. It's about sharing knowledge so that enquiries staff can see how something has been done before, and it lets you store information for complex enquiries. It provides a 24/7 service, as it keeps on recording your enquiries even out of hours.

Andy Tattersall, Uni of Sheffield
We're in the process of moving from Talis to Alma. I want to talk about what we think will happen in the future. My point of view is that of a systems manager, so I haven't looked at the legal angle; mine is about corporate services. We're on a good footing already because our IT people want to move, provided safeguards are in place. We need good relationships with the technical people; don't forget to do that. Cloud stuff is good for the user interface, but you will have local systems needing to talk to your Cloud system, e.g. through firewalls, so you need secure protocols; this is really important. Data security is a biggy. We've all got worries about this: are we legally compliant (e.g. with EU laws)? What the IT people asked me: what about backup data? What if it's stored in the US? The less sensitive data the better. It's a good thing to spread your data around anyway and have disaster plans. I'm not talking about power cuts etc., we all expect companies to respond to that, but what happens if someone pulls the plug, if no money comes in, what plans are in place? And what about technical details: we have specialist people to manipulate the data, so how much technical skill will I need in the future?
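
As a rough illustration of that point about secure protocols (a minimal sketch only; the endpoint, token and field names below are invented and are not any vendor's actual API), local code talking to a hosted service would typically go over HTTPS, with certificate verification left on and an authentication token in the request:

    # Minimal sketch: a local script calling a hypothetical cloud endpoint
    # over HTTPS. The URL and token are made up for illustration only.
    import json
    import urllib.error
    import urllib.request

    req = urllib.request.Request(
        "https://cloud.example.org/api/v1/loans?patron=12345",  # hypothetical endpoint
        headers={"Authorization": "Bearer <token-issued-by-the-vendor>"},
    )
    try:
        # urllib verifies the server's TLS certificate by default; don't turn that off.
        with urllib.request.urlopen(req, timeout=10) as resp:
            print(json.loads(resp.read()))
    except urllib.error.URLError as err:
        # Network problems or an untrusted certificate end up here.
        print("Request failed:", err)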

Jeff Newman, Kaltura
Kaltura is an online video platform solving the problem of who hosts what, for example when YouTube is not the solution. This is for universities. We have a system to stream videos on all devices; it's SaaS (Software as a Service) and works for all types of content. We created an app called MediaSpace, a portal system allowing you to create galleries of video content behind a login, so it's not for the general public. There are different levels of permissions and controls to secure videos on the internet. It's an open platform; it's a matter of choosing the appropriate application. We are open and flexible, so if you have your own system it can be integrated. We already have a lot of customers, including in Denmark.

Dave Parkes, Uni of Staffordshire
I'm not selling anything but want to say how we create the Horizon report, which looks up to 5 years ahead and is a window on the education sector. We usually get it about right: we said that tablets would be happening etc. There are 40 of us worldwide working in the Cloud; we never see each other, but we argue and talk about stuff and start to narrow things down: what are people using in education, what are the challenges etc. We have a watchlist that we look at, anything that's relevant to universities. It takes about 3 months for things to take shape. It's challenging when 40 people have a case to argue, but eventually we produce a little booklet. It's then given away (my son was very impressed with that). Our predictions for the next 2 to 3 years: increasing use of games in education and of analytics; and for the next 4 to 5 years, 3D printing and wearable technology.

Nicola Philpott, Ancoris
We are a partner company working with Google. Organisations can buy directly from Google, but Google doesn't offer services around the project itself; they provide the tools and have partners in the UK, and we are one of them. We work with organisations that, for instance, have a project to move away from locally run email, or companies looking at different ways of storing documents using Google Drive. We look at the project from start to finish, help migrate the data etc. We've just finished a project with Bristol University. We help with change management, with users changing platform and learning to work that way, so we provide training, face to face or via webinars, to help them use their new service. We have an online e-learning platform where users can ask questions, find answers etc. So we help you move your project on, including your customisations. The Google Apps suite includes Gmail, Calendar, G+, Vault, Docs & Drive, Sites (intranet?), Talk (online conferences etc.). It allows collaborative team working.

Robert Bley, Ex Libris
Random thoughts and musings about the Cloud. What is not the Cloud? It's not just hosting legacy apps; it's born in the Cloud, it's not just a Cloud server, it's proper shared services, it's supporting the application beyond hosting. This means big change for libraries and for vendors: we need to re-engineer our business around Cloud applications, e.g. they are multi-language. How to choose a Cloud provider? Do they have global facilities, the manpower to support you and all the other customers, a fully redundant infrastructure, data centres certified to the relevant standards, etc.? Do they offer EU hosting? If it has to be in the States, are they part of the same agreements? Who else uses the service and are they happy with it? Is there physical security? How open is the system, e.g. for doing mashups or taking the data elsewhere? Does the vendor support just the application or the infrastructure as well, and the code? We realised very early on that it's all very entwined. More evidence is needed on whether it costs more. There is agreement that it's green, so there are energy savings. There are implications for libraries in terms of staff, not so much numbers as skills, and if technical skills, are they the traditional ones? Or can we start doing mashups etc.? Budgeting is also an important question. We've done calculations about cost savings and are happy to refine those. It encourages people to be more creative.

Bethan Ruddock, MIMAS
LAMP: your data contains lots of fabulous info, so how do you put it in a Cloud service that allows you to benchmark against other libraries? This project is a partnership between JISC, Huddersfield and Mimas. We had a meeting the other day where we talked about what you might want from this, and we've created a dashboard from those discussions. The idea is that you click on a single thing and it takes you down to further details. It's early days. We have drafted potential use cases for things we think people might want to do with this data, and we'd like to hear from the community.

Liz Robins, OCLC
Why does web scale make sense for libraries? OCLC has a good background in delivering library services. WorldCat is essentially web-based and we have been doing this for 40 years. Sharable data is not just bib data, but vendors, licences, the knowledge base etc. It's innovative: WorldShare is the name of the platform, started from scratch and built on an open platform to allow for other technologies as well. The situation now is that there are lots of silos not speaking to each other, so what if we brought all these services together? We are all over the world: 100 sites, and the first in the UK went live on 4th June.

Bethan Ruddock from MIMAS
Opening up - bibliographic data-sharing and interoperability (view slides of the presentation)

Why would you want to share data? And how would you do it? Your data could be enhanced by being shared. We have developed a lot of services that are about shared data (e.g. COPAC, the Archives Hub etc.). COPAC combines your records with other people's records. With Archives it's different because the records are unique, but they are stored in the same database. With LAMP we've only just started, so we don't really know yet, but we would also like to combine your data.

So you need to think about the format. If it's just for your own library service, it's probably only up to you: whatever works for you. But there is a general caveat about trying to do something that's stable: if you can store things in fairly compact and sustainable ways, it will help, because combination services will want data in certain formats and they may not always do the transformations for you.

Have a look at what formats you can get data in and how you can get it out. If the data is in a locked-down system it will be hard to get it out. Which format is most appropriate for what you want to do? Which version should you choose? XML is one of the best formats: it's text based so it's sustainable, it's compact and it's transformable. If you don't have the expertise in-house, there are services that can help you; we've developed a programme to transform data into EAD for Archives.
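
As a rough sketch of what "compact and transformable" means in practice (this is not the Mimas tool; the record layout and field names below are invented purely for illustration), a few lines of Python are often enough to pull data out of a text-based XML export and reshape it for another service:

    # Sketch: read a made-up XML export and reshape it into a structure
    # another service could ingest. Standard library only; the element
    # names are invented for illustration.
    import json
    import xml.etree.ElementTree as ET

    sample_export = """
    <records>
      <record id="1"><title>Cloud busting</title><creator>Blakeman, K.</creator></record>
      <record id="2"><title>Shared cataloguing</title><creator>Smith, A.</creator></record>
    </records>
    """

    root = ET.fromstring(sample_export)
    reshaped = [
        {
            "id": rec.get("id"),
            "title": rec.findtext("title"),
            "creator": rec.findtext("creator"),
        }
        for rec in root.findall("record")
    ]

    # JSON is just one possible target; the point is that text-based XML
    # stays easy to get out and to transform.
    print(json.dumps(reshaped, indent=2))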

Consistently bad data is better than inconsistently good data; that's "my rule". It's easy to fix something that's wrong in the same way everywhere, whereas it's hard to correct something that's inconsistently wrong in 5% of your records.
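
A tiny, made-up illustration of why that rule holds (the field names and values here are hypothetical): when every record is wrong in the same way, one known mapping repairs the lot in a single pass.

    # Hypothetical records where the language code was consistently entered
    # as "en-GB" although the target scheme expects "eng".
    records = [
        {"title": "Record A", "language": "en-GB"},
        {"title": "Record B", "language": "en-GB"},
        {"title": "Record C", "language": "en-GB"},
    ]

    CONSISTENT_FIX = {"en-GB": "eng"}  # one rule covers every record

    for rec in records:
        rec["language"] = CONSISTENT_FIX.get(rec["language"], rec["language"])

    print(records)  # all corrected in one pass

    # If 5% of records instead held ad hoc values ("english", "ENG", "Eng."),
    # no single rule would apply and each variant would need hunting down by hand.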

Barriers and risks are about licensing and data ownership; for example, there can be legal barriers to sharing. But think also of the risks of not sharing your data, not only the risks of sharing it. What will happen to your data if you don't share it?

Things to think about when choosing a Cloud solution: why do you want to share your data? Does the service meet your needs? Is it something your users have asked for, e.g. they want your data as part of a collaborative service? Is it for resource management? Once you've decided where and why you want to send your data, think of the right format for the purpose. And will you be able to get it out of the Cloud in an appropriate format if you want to send it elsewhere? Will it be kept in a standardised format?

A note on Linked Data: it's the idea that you describe something with triples, and they are reversible. They are built to describe a relationship. This allows you to build a picture that can then be extended. Example: Ruth works at MIMAS; reverse that information and then link it, so you can ask, for instance, who else works at MIMAS, or what else does this person do. Everything has a unique identifier, so things can't conflict even if, say, the name is the same. The idea is that it can go on the internet because of URIs and URLs. We use standardised vocabularies; there are definitions of RDF types and you can look up what they mean.
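
To make the triple idea concrete, here is a minimal sketch in Python (not from the talk; the URIs and the tiny in-memory store are invented for illustration) showing how "Ruth works at MIMAS" becomes a subject-predicate-object triple that can be followed in either direction:

    # Triples as (subject, predicate, object), with made-up URIs as identifiers.
    # A real system would use a triple store and standard vocabularies.
    triples = [
        ("http://example.org/person/ruth", "http://example.org/vocab/worksAt", "http://example.org/org/mimas"),
        ("http://example.org/person/bethan", "http://example.org/vocab/worksAt", "http://example.org/org/mimas"),
        ("http://example.org/person/ruth", "http://example.org/vocab/contributesTo", "http://example.org/project/lamp"),
    ]

    def objects(subject, predicate):
        """Follow a relationship forwards: what does this subject point to?"""
        return [o for s, p, o in triples if s == subject and p == predicate]

    def subjects(predicate, obj):
        """Follow the same relationship backwards: who points at this object?"""
        return [s for s, p, o in triples if p == predicate and o == obj]

    # "Ruth works at MIMAS", read forwards...
    print(objects("http://example.org/person/ruth", "http://example.org/vocab/worksAt"))
    # ...and reversed: who else works at MIMAS?
    print(subjects("http://example.org/vocab/worksAt", "http://example.org/org/mimas"))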

Panel questions and answers

Should we teach Cloud literacy (especially to students)?
We (Staffordshire) provide literacy training about the Cloud and it's all about staying safe online, so it's about raising awareness, including of "digital health": understanding the consequences of having an online self, so as not to be limited in the future. It's about letting go of a level of control. It's also about having the resources there and educating about the risks. It's extraordinary to have tools that allow collaboration. There is often an idea that teachers don't understand technology, but we have to explain the different layers of media. Recommended reading: Marc must die.

140 characters to convince us not to go to the Cloud
They're all watching you. Where's my data? They are going to sell it back to you. Data security issues. Buy in isolation, be off grid. Some people in the Microsoft world send their kids to schools where there are no computers. Don't be part of the technological singularity, where software becomes self-aware. Don't put all your stock in it. Be careful not to give away too much and then not be able to get it back. How much will you outsource? If you exit the Cloud, you have no more infrastructure to fall back on.

Other presentation slides:
And tweets from the Conference on Eventifier