rdmpage / microcitation-lite

SQLIte database of publications

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Microcitation-lite

SQLite database of publications, based on earlier MySQL version.

SQLite

Experimenting with SQLite to keep things simple (e.g., no need to install database server).

Tables

Have decided to have two tables, one for data derived from various sources, and one for data obtained by resolving DOIs. Might help in cases where there are multiple sources of journal data and none are complete (e.g., web pages may have more information than DOI, such as PDFs, other languages, etc.).

Update timestamp

Need a trigger to have time stamps update when record added/modified:

-- DROP TRIGGER publications_updated;
CREATE TRIGGER publications_updated AFTER UPDATE ON publications FOR EACH ROW
BEGIN

UPDATE publications
SET
    updated = CURRENT_TIMESTAMP
WHERE guid = old.guid;

END;

Views

CREATE VIEW IF NOT EXISTS `Korean_journal_of_applied_entomology` AS
SELECT 
    guid, 
    title, 
    journal,
    issn,
    volume,
    issue,
    spage,
    epage,
    year,
    doi
FROM `publications` 
WHERE `publications`.journal="Korean journal of applied entomology";

Journals

Korean Journal of Systematic Zoology

Journal has two titles, DOIs all have title set to Animal Systematics, Evolution And Diversity, metadata harvested from https://koreascience.kr/ and by resolving DOIs. Will need to split DOIs into two sets, add additional multilingual data, archive PDFs, and also handle articles pre-DOI. Wikidata has two versions of the Korean Journal of Systematic Zoology, likely based on ISSN issues.

Mycosystema

Complicated journal with Chinese and English versions, sometimes in parallel. Data from CNKI, magtech, and we also have DOIs. The original Chinese journal is Acta Microbiologica Sinica ISSN 0001-6209, which started in 1953 and is still running. Lots of Wikidata articles with PMID and nothing else, these will also have CNKI DOIs. See also 《菌物学报》三十年回眸与展望 / A glance back at the thirty years’ Mycosystema and prospect. Need to also check PDFs.

Journal ISSN Years Wikidata Notes
Acta Mycologia Sinica 0256-1883 1982 - 1996 Q52380146 vols 1-
Mycosystema: Annual Report of Systematic Mycology & Lichenology Laboratory 1988 - 1997 Q104214469 This journal is in DOI Mycosystema as volumes 7-14, issues Z1 and have URLs and English titles
Mycosystema 1007-3515 1997–2003 Q52380050
Mycosystema 1672-6472 2004- Q15760108

Science

There has been an epic fuck up with DOIs for this journal, massive duplication, 2-3 DOIs for some articles 9typically those on the same page). Some filtering done, everything with flag=1 in the database is likely a bad DOI. See https://community.crossref.org/t/more-multiple-doi-madness/4090

There is a view with a clean(ish) list.

Get PDFs

SELECT """" || pdf || ",""" FROM publications WHERE issn='1123-6787' AND pdf IS NOT NULL;

Wikidata

Add DOIs

SELECT wikidata, "P356", """" || UPPER(doi) || """" , "P2378" , "Q5188229" FROM publications_doi WHERE issn='0459-8113' and updated > '2022-12-29 14:00:00' AND wikidata IS NOT NULL AND doi IS NOT NULL;

Add CNKI

SELECT wikidata, "P6769", """" || cnki || """"  FROM publications_doi WHERE issn='2095-0357' and updated > '2022-12-29 14:00:00' AND wikidata IS NOT NULL AND cnki IS NOT NULL;

Adding PDF with wayback archive URL

Add qualifiers that say it’s a PDF and it’s backed up.

SELECT wikidata, "P953", """" || pdf || """",  "P2701", "Q42332", "P1065", """" || "https://web.archive.org" || waybackmachine || """" FROM publications WHERE issn='1225-0104' AND wikidata IS NOT NULL AND pdf IS NOT NULL AND waybackmachine IS NOT NULL;

Internet Archive

SELECT wikidata, "P724", """" || internetarchive || """" FROM publications WHERE doi LIKE "10.5635/ASED%" AND wikidata IS NOT NULL AND internetarchive IS NOT NULL;

Add volume and issue

SELECT wikidata, "P478", """" || volume || """" FROM publications WHERE issn="2346-9641" AND wikidata IS NOT NULL AND volume IS NOT NULL;
SELECT wikidata, "P433", """" || issue || """" FROM publications WHERE issn="2346-9641" AND wikidata IS NOT NULL AND issue IS NOT NULL;

Add page numbers

SELECT CONCAT(wikidata, "P304", """" || spage "-" epage, """") FROM publications_doi WHERE issn='1988-3196' AND  wikidata IS NOT NULL AND spage IS NOT NULL and epage IS NOT NULL and flag=1;

Add multilingual titles and labels

Titles

SELECT wikidata, "P1476", language || ":" || """" || value || """", "S248", "Q4698727", "S854", """" || guid || """"
FROM publications 
INNER JOIN multilingual USING(guid) 
WHERE issn='1021-5506'
AND year < 2013
AND multilingual.`language` = 'zh'
AND multilingual.`key` = 'title'
AND wikidata IS NOT NULL;

Labels

SELECT wikidata, "L" || language, """" || value || """"
FROM publications 
INNER JOIN multilingual USING(guid) 
WHERE issn='1021-5506'
AND year < 2013
AND multilingual.`language` = 'zh'
AND multilingual.`key` = 'title'
AND wikidata IS NOT NULL;

Translations

SELECT wikidata_en, "P629", wikidata_ru FROM trans WHERE wikidata_en IS NOT NULL and wikidata_ru IS NOT NULL and en IS NULL;

About

SQLIte database of publications

License:MIT License


Languages

Language:PHP 100.0%