Dobrica Pavlinusic [Wed, 25 Apr 2012 11:46:07 +0000 (13:46 +0200)]
remove unused module use
Dobrica Pavlinusic [Tue, 24 Apr 2012 17:59:28 +0000 (19:59 +0200)]
correctly fetch next page of results from vuFind
Dobrica Pavlinusic [Tue, 24 Apr 2012 17:36:16 +0000 (19:36 +0200)]
decode utf-8 marc correctly before saving
Dobrica Pavlinusic [Mon, 23 Apr 2012 12:12:08 +0000 (14:12 +0200)]
vuFind scraper which fetches marc records directly
Dobrica Pavlinusic [Mon, 23 Apr 2012 12:11:38 +0000 (14:11 +0200)]
mech wrapper function
Dobrica Pavlinusic [Mon, 23 Apr 2012 11:49:24 +0000 (13:49 +0200)]
cleanup var scoping
Dobrica Pavlinusic [Wed, 18 Apr 2012 13:00:10 +0000 (15:00 +0200)]
all fields optional, cleanup industryIdentifiers
Dobrica Pavlinusic [Wed, 18 Apr 2012 12:59:47 +0000 (14:59 +0200)]
make search query a param to test script
Dobrica Pavlinusic [Tue, 17 Apr 2012 22:25:53 +0000 (00:25 +0200)]
map all Google Books JSON to MARC21
Dobrica Pavlinusic [Tue, 17 Apr 2012 21:31:09 +0000 (23:31 +0200)]
added basic GoogleBooks JSON API
Dobrica Pavlinusic [Tue, 17 Apr 2012 14:37:56 +0000 (16:37 +0200)]
removed COBISS which started serving images instead of HTML table (sigh)
Dobrica Pavlinusic [Fri, 17 Feb 2012 16:42:07 +0000 (17:42 +0100)]
increase dump_nr to files
Dobrica Pavlinusic [Fri, 17 Feb 2012 16:41:47 +0000 (17:41 +0100)]
die on unsupported format
Dobrica Pavlinusic [Fri, 17 Feb 2012 15:04:48 +0000 (16:04 +0100)]
create records in utf-8 encoding
Dobrica Pavlinusic [Mon, 18 Jul 2011 13:48:02 +0000 (15:48 +0200)]
changed URL to NSK Aleph
Dobrica Pavlinusic [Thu, 16 Dec 2010 14:13:31 +0000 (15:13 +0100)]
use cobiss.ba non-ajaxy interface
Dobrica Pavlinusic [Thu, 16 Dec 2010 14:10:14 +0000 (15:10 +0100)]
render queries with asterisk (*) at end
Dobrica Pavlinusic [Thu, 16 Dec 2010 14:02:00 +0000 (15:02 +0100)]
fix ISBN, ISSN and authors search
Dobrica Pavlinusic [Thu, 16 Dec 2010 13:44:48 +0000 (14:44 +0100)]
fix Zoom query encoding
we try to decode query back into utf-8 to generate correct search
string to insert into HTML form
Dobrica Pavlinusic [Mon, 8 Nov 2010 16:01:49 +0000 (17:01 +0100)]
added support for single results
This shows the result page directly, so it needed another regex to recognize it.
Both regexes depend on the Croatian language in the interface, but they
are marked in code with FIXME for easy modification
Dobrica Pavlinusic [Mon, 8 Nov 2010 15:57:31 +0000 (16:57 +0100)]
added ISBN and ISSN mapping
Dobrica Pavlinusic [Mon, 25 Oct 2010 09:55:11 +0000 (11:55 +0200)]
Aleph changed port
Dobrica Pavlinusic [Mon, 25 Oct 2010 09:54:51 +0000 (11:54 +0200)]
use database name for BASENAME
Dobrica Pavlinusic [Sat, 23 Oct 2010 21:58:19 +0000 (23:58 +0200)]
correctly pass usemap to all render calls
Dobrica Pavlinusic [Sat, 23 Oct 2010 21:55:04 +0000 (23:55 +0200)]
rename all $this to $self to be more perl-like
instead of JavaScript I guess?!
Dobrica Pavlinusic [Sat, 23 Oct 2010 21:54:00 +0000 (23:54 +0200)]
some more debug
Dobrica Pavlinusic [Sat, 23 Oct 2010 21:20:57 +0000 (23:20 +0200)]
better mapping to Aleph search syntax
Dobrica Pavlinusic [Sat, 23 Oct 2010 20:49:29 +0000 (22:49 +0200)]
and cleanup code to load correct format
Dobrica Pavlinusic [Sat, 23 Oct 2010 20:19:16 +0000 (22:19 +0200)]
collect results to support offset in fetch record
Dobrica Pavlinusic [Sat, 23 Oct 2010 20:11:48 +0000 (22:11 +0200)]
strip spaces from end of value
Dobrica Pavlinusic [Sat, 23 Oct 2010 19:30:08 +0000 (21:30 +0200)]
fix and cleanup database selection from link
Dobrica Pavlinusic [Sat, 23 Oct 2010 16:57:49 +0000 (18:57 +0200)]
stop at last record
Dobrica Pavlinusic [Sat, 23 Oct 2010 16:48:48 +0000 (18:48 +0200)]
try to select our database from link
Aleph returns database unavailable if we switch to advanced form and
default database isn't there. Now we try to follow link with database
in which we are interested.
Dobrica Pavlinusic [Sat, 23 Oct 2010 16:46:55 +0000 (18:46 +0200)]
allocate session just once
this will prevent our denial of service against Aleph if we get
too many requests (and since it's a random number, I might as well have used
42 for it)
Dobrica Pavlinusic [Sat, 23 Oct 2010 16:27:26 +0000 (18:27 +0200)]
added save_content for debugging
Dobrica Pavlinusic [Sat, 23 Oct 2010 14:17:32 +0000 (16:17 +0200)]
cleanup unimarc/marc parsing
Dobrica Pavlinusic [Sat, 23 Oct 2010 13:52:59 +0000 (15:52 +0200)]
fix number of tests
Dobrica Pavlinusic [Sat, 23 Oct 2010 13:52:03 +0000 (15:52 +0200)]
fix warnings
Dobrica Pavlinusic [Sat, 23 Oct 2010 13:44:31 +0000 (15:44 +0200)]
store hits and don't try to download more records
Dobrica Pavlinusic [Sat, 23 Oct 2010 13:24:02 +0000 (15:24 +0200)]
blurb
Dobrica Pavlinusic [Sat, 23 Oct 2010 12:53:02 +0000 (14:53 +0200)]
select database
this allows us to search in different Aleph databases in a single
web interface
Dobrica Pavlinusic [Sat, 23 Oct 2010 12:51:04 +0000 (14:51 +0200)]
use module name for database if missing
Dobrica Pavlinusic [Sat, 23 Oct 2010 12:46:30 +0000 (14:46 +0200)]
session should be integer
Dobrica Pavlinusic [Sat, 23 Oct 2010 12:45:59 +0000 (14:45 +0200)]
save in correct database named directory
Dobrica Pavlinusic [Sat, 23 Oct 2010 12:29:02 +0000 (14:29 +0200)]
support fields without subfields
Dobrica Pavlinusic [Sat, 23 Oct 2010 12:11:12 +0000 (14:11 +0200)]
test both providers
Dobrica Pavlinusic [Sat, 23 Oct 2010 12:08:43 +0000 (14:08 +0200)]
join multi-line fields
Dobrica Pavlinusic [Sat, 23 Oct 2010 11:58:46 +0000 (13:58 +0200)]
test with new API
Dobrica Pavlinusic [Sat, 23 Oct 2010 11:58:37 +0000 (13:58 +0200)]
use save_marc
Dobrica Pavlinusic [Sat, 23 Oct 2010 11:56:34 +0000 (13:56 +0200)]
cleanup output
Dobrica Pavlinusic [Sat, 23 Oct 2010 11:47:35 +0000 (13:47 +0200)]
move save_marc to Scraper
Dobrica Pavlinusic [Sat, 23 Oct 2010 11:31:36 +0000 (13:31 +0200)]
cleanup field extraction
Dobrica Pavlinusic [Sat, 23 Oct 2010 11:21:59 +0000 (13:21 +0200)]
fix number of fix extraction
Dobrica Pavlinusic [Fri, 22 Oct 2010 23:33:59 +0000 (01:33 +0200)]
remove debug
Dobrica Pavlinusic [Fri, 22 Oct 2010 23:23:34 +0000 (01:23 +0200)]
load Aleph
Dobrica Pavlinusic [Fri, 22 Oct 2010 23:23:20 +0000 (01:23 +0200)]
and test it
Dobrica Pavlinusic [Fri, 22 Oct 2010 23:23:08 +0000 (01:23 +0200)]
fix Aleph scraper
Dobrica Pavlinusic [Fri, 22 Oct 2010 23:18:08 +0000 (01:18 +0200)]
report invalid databases
Dobrica Pavlinusic [Fri, 22 Oct 2010 23:09:55 +0000 (01:09 +0200)]
extract common code into Scraper package
Dobrica Pavlinusic [Fri, 22 Oct 2010 22:57:58 +0000 (00:57 +0200)]
rewrite COBISS into perl object
and modify server to use it
Dobrica Pavlinusic [Fri, 22 Oct 2010 22:10:51 +0000 (00:10 +0200)]
rename to Biblio Z39.50
Dobrica Pavlinusic [Fri, 22 Oct 2010 22:09:02 +0000 (00:09 +0200)]
give render module to pull usemap from
Dobrica Pavlinusic [Fri, 22 Oct 2010 22:05:21 +0000 (00:05 +0200)]
and make eval for usemap work
Dobrica Pavlinusic [Fri, 22 Oct 2010 22:03:49 +0000 (00:03 +0200)]
better error reporting
Dobrica Pavlinusic [Fri, 22 Oct 2010 21:52:07 +0000 (23:52 +0200)]
command to test with yaz-client
dpavlin [Fri, 22 Oct 2010 21:31:08 +0000 (21:31 +0000)]
make usemap configurable
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@14 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Fri, 22 Oct 2010 21:12:46 +0000 (21:12 +0000)]
more debug output and some cleanup
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@13 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Fri, 22 Oct 2010 20:49:16 +0000 (20:49 +0000)]
generate marc record
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@12 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Fri, 22 Oct 2010 20:25:51 +0000 (20:25 +0000)]
basic parser for Aleph html at NSK
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@11 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Fri, 26 Mar 2010 16:26:59 +0000 (16:26 +0000)]
fix mapping for 210 -> 260
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@10 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Sun, 21 Jun 2009 08:16:41 +0000 (08:16 +0000)]
produce unimarc (without conversion) or usmarc (with conversion)
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@9 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Sat, 20 Jun 2009 22:09:33 +0000 (22:09 +0000)]
support multiple results
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@8 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Sat, 20 Jun 2009 20:19:49 +0000 (20:19 +0000)]
cleanup output
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@7 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Sat, 20 Jun 2009 19:42:32 +0000 (19:42 +0000)]
simple server based on Net::Z3950::SimpleServer
which serves out UNIMARC records
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@6 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Sat, 20 Jun 2009 19:42:00 +0000 (19:42 +0000)]
remove debugging content output
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@5 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Sat, 20 Jun 2009 19:28:04 +0000 (19:28 +0000)]
split COBISS into search (from advanced form) and fetch_rec,
create rewriter from Z39.50 to COBISS syntax,
added diag for output
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@4 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Fri, 19 Jun 2009 21:51:56 +0000 (21:51 +0000)]
Module::Build builder
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@3 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Fri, 19 Jun 2009 21:51:28 +0000 (21:51 +0000)]
create MARC record
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@2 ae73d1a6-5fa4-44a9-8f13-f281fb455051
dpavlin [Fri, 19 Jun 2009 17:50:34 +0000 (17:50 +0000)]
scrape COBISS
git-svn-id: svn+ssh://llin.lib/home/dpavlin/private/svn/Z3950-HTML-Scraper@1 ae73d1a6-5fa4-44a9-8f13-f281fb455051