Sheet Music Consortium issues
On Monday I met with Andrew Rouner, the Wash U Digital Library Systems director, and Cassandra Stokes, also of the Digital Library Systems unit, to talk about the options for getting the sheet music record data out of its current format and into one that the Sheet Music Consortium's OAI metadata harvester can harvest.
After learning that we didn't have the add-on for III that would allow export of MARC records in XML and/or Dublin Core, I was thinking I'd end up doing a lot of hand-tagging and that it would be extremely time-consuming. My meeting with Andrew and Cassandra reassured me that a lot more of this process will be automated than I had thought, which is great news. My first order of business is to export the Balmer & Weber MARC records as plain text, and then we'll go from there. I'm still not sure of the best course of action for exporting the records - I'll ask Mark what he knows about exporting tomorrow when we meet. I have a basic understanding of how exporting works: I believe I'll run a list (can I run lists in the supplementary catalog?) of the Balmer & Weber records and export the MARC data as plain text to the hard drive. I guess it will just be one big text file at this point and then it will get broken up into individual records later?
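To make the "one big text file broken up into individual records" idea concrete, here is a minimal sketch in Python. It rests on an assumption I haven't verified: that the plain-text export separates records with a blank line and puts one field per line. The function name `split_export` and the sample data are made up for illustration; the real III export may look quite different.

```python
def split_export(text):
    """Split one large plain-text export into a list of records.

    Assumption (not confirmed for III): records are separated by one
    or more blank lines, and each non-blank line is one field.
    """
    records = []
    current = []
    for line in text.splitlines():
        if line.strip():
            # Non-blank line: another field of the current record.
            current.append(line.rstrip())
        elif current:
            # Blank line: the current record is finished.
            records.append(current)
            current = []
    if current:
        # The file may not end with a blank line.
        records.append(current)
    return records


if __name__ == "__main__":
    # Hypothetical sample data, not real catalog records.
    sample = "245 Example Waltz\n100 Doe, Jane\n\n245 Another March\n"
    for number, record in enumerate(split_export(sample), start=1):
        print(number, record)
```

If the export turns out to use a different separator, only the blank-line check would need to change.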
Most of our meeting focused on the possibilities for exporting and on the steps it would take to get the MARC data into a format that the SMC's harvester can harvest. Andrew did some follow-up research after our meeting and sent me an email with additional information about this. Apparently there are utilities for converting MARC records directly into BibClass (the bibliographic record module of DLXS, the digital library system Wash U will be using), and there is a program to expose OAI-compliant metadata to harvesters (something called broker20, which is Univ. of Michigan's OAI-compatible data provider for shareable metadata). I don't understand where the transformation into unqualified Dublin Core comes in (it's my understanding that the Sheet Music Consortium only harvests data in this specific schema at this point in time). Is that what broker20 does, or is there some other step in there that I'm missing? I feel so clueless about this stuff, but I'm learning, slowly but surely...
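Wherever that transformation ends up happening, the end product is roughly this: each MARC field gets mapped to one of the unqualified Dublin Core elements and wrapped in the `oai_dc` container that OAI harvesters read. The sketch below is only an illustration of that idea. The tag-to-element table is a simplified mapping I chose for the example, not the crosswalk that DLXS, broker20, or the SMC actually uses, and the function name and sample values are hypothetical. The two namespace URIs are the standard ones for `oai_dc` and Dublin Core.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for the oai_dc container and DC elements.
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"

# Simplified, illustrative mapping only (an assumption, not an
# official crosswalk). Real crosswalks work at the subfield level.
TAG_TO_DC = {
    "100": "creator",
    "245": "title",
    "260": "publisher",
    "650": "subject",
}


def marc_fields_to_dc(fields):
    """Turn (tag, value) pairs into an oai_dc XML string."""
    ET.register_namespace("oai_dc", OAI_DC)
    ET.register_namespace("dc", DC)
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for tag, value in fields:
        name = TAG_TO_DC.get(tag)
        if name is None:
            continue  # Skip fields with no mapping in this sketch.
        element = ET.SubElement(root, f"{{{DC}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical record, not real catalog data.
    record = [("245", "Example Waltz"), ("100", "Doe, Jane"), ("999", "local")]
    print(marc_fields_to_dc(record))
```

Fields without an entry in the table are silently dropped here, which is fine for a sketch but would need a deliberate decision in a real conversion.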
The other thing that needs to be figured out is whether there needs to be some sort of public (or perhaps it wouldn't necessarily have to be public) interface through DLXS where the metadata and images would reside. I had asked Andrew if the metadata, including links to the images, could just exist on the server somewhere and be harvested, or if it had to be incorporated into some kind of interface. His impression is that BibClass doesn't exist solely to expose OAI metadata, so we will probably need to put it all together (images, metadata, etc.) in DLXS. I'm still a little fuzzy on this, obviously...
I feel very fortunate that Andrew and Cassandra are willing and able to help me out with this - I would be at a complete loss without their help.
I spent several hours tonight digesting what we had talked about yesterday and looking at DLXS's website and examples of BibClass records to see if I could figure out what's going on. There are still a few holes in my understanding - I need to email Andrew to ask a few more questions. I also sent an email off to Stephen Davison of the SMC to let him know of my project and to see if he had any guidance or suggestions for me at this point. I also need to contact Jenn Riley at Indiana (wish I had been able to meet with her while I was in Bloomington last week!), since they are apparently using DLXS to expose their sheet music collection for the SMC. I have been dragging my feet about contacting Stephen since I really wanted to have somewhat of a clue of what I was talking about so I didn't sound like a complete idiot. Not sure I'm at that point yet, but I decided I couldn't wait any longer to contact him!
After meeting with Mark tomorrow afternoon, I hope to start working on exporting the records.
Hours today (and yesterday): 4
Total hours completed: 29
2 Comments:
Do you know that Jenn Riley has a blog?
http://inquiringlibrarian.blogspot.com/
I met her at the Blogger Bash at ALA in Chicago last summer--she's very easy to talk to!
Thanks, Joy! I added the feed to my Bloglines account.