FYR 2010

Posted on 30 July 2010 | No responses

Right then, it’s been a funny few weeks but here we are again. I’ll be trying to do at least a post a week for August (that’ll be August 2010, just in case this takes another hiatus).

Anyway, this week has been fun. We’ve got an odd financial year here, which ends at the end of July, and as the sys admin it falls to me to run the process that changes the acquisition funds to the new financial year. Great. We’ve got a nice little checklist to run through, and for the last few years everything has gone nice and smoothly, so I wasn’t too worried about it.

So I get towards the end of the preliminaries on Thursday lunchtime and suddenly the last script (which last year’s notes say took 20 minutes) takes 12+ hours. Perfect.

Well, in on Friday, nice and early, looking at what might be causing it, but no joy as yet. I’ve got an open call with the suppliers but they’ve not had much success either, so it should be a nice weekend.

Back to it then…

Posted on 26 March 2010 | No responses

Right, after another substantial break I’m going to take another swing at this (is this the 5th or 6th now?). I still need to do that Marc21 report write-up, but that might have to wait a couple of weeks.

A few important things coming up at work:

  • We are looking at renewing the hardware our ILS runs on.
  • Revisiting how/when we will be deploying our new OPAC (hopefully before the start of the next session).
  • Working on integrating our system into our corporate Business Objects platform.
  • Trying to deploy a couple of library applications on Facebook.
  • I’m working on getting approval for another project but don’t want to say too much just yet.
  • And most importantly I’m off to New York in a few weeks (Woo & Hoo), so expect a couple of countdown-esque posts.

    Splitting Marc Files

    Posted on 14 August 2009 | No responses

    Not doing so well with keeping up the posts but it is time for another one.

    It turns out I needed to split a series of Marc21 files down into more manageable chunks so our importer wouldn’t have to go and have a lie down after 36 hours.

    The input source is four files containing 100,000+ records (which will take 3-ish days to import in total). I could split these down by hand in a text editor, but I thought it’d be easier to quickly script it – and then next time it’ll be even quicker.

    #Start of marcSplit.py
    #Supply a list of files placed in the same directory as the script
    files = ["1.mrc", "2.mrc", "3.mrc", "4.mrc"]
    
    for fileName in files:
        #Set up the integers for iteration
        nextRec = 0
        n = 0
        part = 1
    
        #open file in binary mode (Marc21 is byte-oriented) & read it into memory
        with open(fileName, 'rb') as inFile:
            allRecords = inFile.read()
    
        # split file after every 5000th record terminator ("\x1d")
        while True:
            nextRec = allRecords.find(b"\x1d", nextRec+1)
            if nextRec == -1:
                #find() returns -1 when there are no more terminators
                break
            n = n + 1
            if n == 5000:
                #we have the 5000th record
                n = 0
                with open(fileName + "_" + str(part) + ".mrc", 'wb') as newFile:
                    newFile.write(allRecords[0:nextRec+1])
                allRecords = allRecords[nextRec+1:]
                part = part + 1
                nextRec = 0
        #Write the last segment of the file (also works if less than 5000 records),
        #skipping it when nothing is left over so we don't create an empty file
        if allRecords:
            with open(fileName + "_" + str(part) + ".mrc", 'wb') as newFile:
                newFile.write(allRecords)
    #end
    

    There we have it, bit crap but it was quick and it works – output is a series of new files in the same directory all suffixed with a part number.  I’m sure it could be a bit smarter in places, especially how it handles the last chunk of the file.
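
    For what it’s worth, here is a rough sketch of one way the “bit smarter” version might look. It is not what I actually ran; it assumes Python 3, and the names iter_records and split_marc are made up for illustration. The idea is to read each file in blocks and hand back one record at a time, so the whole file never has to sit in memory and the last chunk needs no special handling.

```python
RECORD_TERMINATOR = b"\x1d"  # Marc21 end-of-record byte


def iter_records(stream, block_size=64 * 1024):
    """Yield one raw record (terminator included) at a time from a binary stream."""
    buffer = b""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        buffer += block
        # Everything before the last terminator is a complete record;
        # the remainder stays in the buffer until more data arrives.
        *complete, buffer = buffer.split(RECORD_TERMINATOR)
        for record in complete:
            yield record + RECORD_TERMINATOR
    if buffer.strip():
        # Leftover bytes with no terminator: pass them through as they are
        # (whitespace-only leftovers are ignored).
        yield buffer


def split_marc(path, chunk_size=5000):
    """Write path_1.mrc, path_2.mrc, ... and return the names of the files created."""
    created = []
    out = None
    with open(path, "rb") as source:
        for count, record in enumerate(iter_records(source)):
            if count % chunk_size == 0:
                # Start a new part file every chunk_size records
                if out:
                    out.close()
                name = "%s_%d.mrc" % (path, len(created) + 1)
                out = open(name, "wb")
                created.append(name)
            out.write(record)
    if out:
        out.close()
    return created
```

    Calling split_marc("1.mrc") for each input file should give the same style of part files as the script above. The file names and the 5000-record chunk size are carried over from that script; the 64 KB block size is just a guess at something sensible.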

    Next time I’ll go through the Marc21 cataloguers’ reports we can now generate with jTDS & MARC4J.


    Inaugural Post 2

    Posted on 13 July 2009 | No responses

    Well one slightly botched upgrade has led to a re-install of WordPress, so I’ve lost all of my posts (when I say that, I mean that I lost the last inaugural post as that was the only one I’d bothered to write in 4 months).

    So here we are again back to the very first post, still not got a lot of ideas about what to write but I shall try to enforce some discipline on myself this time around.


    Copyright © m2m