SciELO Methodology

Upload: srdiego

Post on 03-Jun-2018


Page 1: Scielo Metodologia En

8/11/2019 Scielo Metodologia En

http://slidepdf.com/reader/full/scielo-metodologia-en 1/30

SciELO Methodology


Methodology

• SciELO PC Programs (Windows/Visual Basic/VBA Word)

 – Server programs: Title Manager, Code Manager, Converter/Parser, XML SciELO

 – Workstation program: Markup/Parser

 – Located in: c:\scielo\bin\

• SciELO Processing (.bat, .sh, Java)

 – GeraPadrao

 – Programs to export data

 – Bibliometrics

 – Etc.

 – Located in: c:\home\scielo\www\proc or c:\scielo\web\proc

• SciELO Web (Apache, PHP, WWWISIS)

 – Located in: c:\home\scielo\www or c:\scielo\web


Computers

Local server (Windows):

• Title Manager

• Code Manager

• Converter

• XML SciELO

• Markup/Parser

• Local web site

• File storage (img, pdf, html, etc.)

1 or more workstations (Windows):

• Markup/Parser

• Microsoft Word

Linux server:

• Processing

• Homologation web site

• Production web site

Note: each of these functions can run on one or more Linux servers.


Files structure in the local server (Windows)

The programs are in: c:\scielo\bin\

The data are in: c:\scielo\serial

The data of each journal are in: c:\scielo\serial\<journal_acronym>

The data of each journal issue are in: c:\scielo\serial\<journal_acronym>\v*n*

Before using the programs, check that all files are in the correct structure.

Under the volume and number folder, the following directories must be created:


Body

Contains all the articles of an issue, each article in its own file, correctly named.

Example: c:\SciELO\serial\sajs\v105n7-8\body\a01v29n1.html

Img

Contains all images, figures, graphics, etc., correctly named.

Example: c:\SciELO\serial\sajs\v105n7-8\img\a01fig01.gif

Markup

Contains the articles to be marked up. The files from the body folder should be copied into this folder.

Example: c:\SciELO\serial\sajs\v105n7-8\markup\a01v29n1.html

PDF

Contains the PDF files, which must be named in the same way as the HTML files.

Example: c:\SciELO\serial\sajs\v105n7-8\pdf\a01v29n1.pdf

Source

Contains the original files (final version), without any last-minute modifications or adjustments.

Example: c:\SciELO\serial\sajs\v105n7-8\source\editorial.pm6
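The structure check described for the issue folders could be automated along these lines. This is only a sketch in Python, not part of the SciELO PC Programs: the function name and the messages are hypothetical, and the only facts taken from the slides are the five folder names and the rule that PDF files are named like the HTML files.

```python
from pathlib import Path

# Folder names taken from the slides; everything else is illustrative.
REQUIRED_FOLDERS = ("body", "img", "markup", "pdf", "source")


def file_stems(folder, pattern):
    """Return the file names without extension that match pattern in folder."""
    if not folder.is_dir():
        return set()
    return {p.stem for p in folder.glob(pattern)}


def check_issue_folder(issue_dir):
    """Return a list of problems found in one issue folder (empty if none)."""
    issue = Path(issue_dir)
    problems = [
        f"missing folder: {name}"
        for name in REQUIRED_FOLDERS
        if not (issue / name).is_dir()
    ]
    # Every article in body should have a PDF with the same base name.
    html = file_stems(issue / "body", "*.html")
    pdf = file_stems(issue / "pdf", "*.pdf")
    for stem in sorted(html - pdf):
        problems.append(f"no PDF for article: {stem}")
    return problems
```

Calling `check_issue_folder(r"c:\scielo\serial\sajs\v105n7-8")` would return an empty list when the folder is complete, and one message per problem otherwise.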


[Figure: structure of the journal’s folder directory on the local server (Windows)]


SciELO PC Programs

Components

• SciELO PC programs are accessed through the Program Files menu


SciELO PC Programs: Local server

• Title Manager:

 – program in Visual Basic

 – to manage the databases title (journals), section (tables of contents of the journals) and issue (issues)

 – located in the local server: c:\scielo\bin\config

• Code Manager:

 – program in Visual Basic

 – to manage the tables of codes (language, country, etc.)

 – located in the local server: c:\scielo\bin\codes

• Converter:

 – program in Visual Basic

 – to convert the marked-up documents into an ISIS database

 – located in the local server: c:\scielo\bin\converter

• XML SciELO:

 – program in batch

 – to generate the XML to export to ISI and PubMed; it can be modified to generate XML for other databases

 – located in the local server: c:\scielo\xml_scielo


SciELO PC Programs: workstation

• Markup:

 – program in VBA Word

 – to guide the identification of the elements of the article

 – located in the local server and WORKSTATION: c:\scielo\bin\markup

• SGML Parser:

 – program in VB and C

 – to parse the markup of the documents

 – located in the local server and WORKSTATION: c:\scielo\bin\sgmlpars


Files structure of web site

• Web site files

 – www/bases

   • pdf

   • translation

   • artigo

   • issue

   • title

   • etc.

 – www/htdocs

   • img

   • revistas

 – www/cgi-bin

   • Wxis

   • IsisScript

• Processing files

 – www/proc

 – www/bases-work


Workflow in the local server

[Diagram: files reception; files preparation (doc/html to .html); Title Manager (Title, Section, Issue); Code Manager (Code, newcode); Markup and Parser in MS-Word (Issue.mds, marked files); Converter (issue of a journal, v<VOL>n<NUM>, located in serial); Local GeraPadrao (scilist; artigo, etc.); local web site; corrections feed back into the earlier steps]


Workflow in the local server

• PEOPLE receive the files of the issues (.html, images, pdf, etc.)

• PEOPLE prepare and archive them in serial/<journal_acronym>/v<VOL>n<NUM>/ in the folders markup, body, img and pdf, with one .html file per article. At this point, the markup and body folders have the same content.

• In parallel, the issue’s data are registered using Title Manager/Create new issue

• When a journal title that is not yet registered arrives, it must be registered using Title Manager/Create new title

• After the issue’s data are registered, Title Manager generates input files for the Markup and Converter programs in their own folders on the computer where Title Manager is running (bin\markup and bin\convert). The files in bin\markup must therefore be copied to the other computers where Markup runs

• The Markup program is used to identify the bibliographic elements of the articles/texts located in the markup folder (serial/<journal_acronym>/v<VOL>n<NUM>/markup)

• The Parser program is used to validate the files processed by the Markup program

• The Converter program reads the files located in the markup and body folders of an issue (serial/<journal_acronym>/v<VOL>n<NUM>) and then generates its database

• All the databases generated by the Converter program are used by GeraPadrao to create the database of the web site. The images, PDFs, etc. have to be copied to the corresponding folders so that the web site can access them.

• Code Manager is rarely used. It manages the tables of codes used by SciELO.

• Whenever mistakes are found, it is possible to go back, correct the data and redo the process

• Finally, the script EnviaBasesScieloPadrao.bat sends the databases to a server to be processed
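Two points of this workflow lend themselves to a small check before Markup starts: the issue path pattern serial/<journal_acronym>/v<VOL>n<NUM>, and the statement that markup and body initially hold the same files. The Python sketch below is illustrative only; the function names are hypothetical and it is not one of the SciELO programs.

```python
from pathlib import Path


def issue_folder(serial_root, journal_acronym, volume, number):
    """Build serial/<journal_acronym>/v<VOL>n<NUM> as described in the workflow."""
    return Path(serial_root) / journal_acronym / f"v{volume}n{number}"


def html_names(folder):
    """Return the names of the .html files in folder (empty set if it is missing)."""
    if not folder.is_dir():
        return set()
    return {p.name for p in folder.glob("*.html")}


def compare_body_and_markup(issue_dir):
    """Return (only_in_body, only_in_markup) as two sorted lists of file names."""
    issue = Path(issue_dir)
    body = html_names(issue / "body")
    markup = html_names(issue / "markup")
    return sorted(body - markup), sorted(markup - body)
```

Both lists being empty means the two folders hold the same articles; a name in the first list is an article that still has to be copied from body to markup.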


Workflow: transferring data from local to processing area

[Diagram: issue of a journal (v<VOL>n<NUM>, located in serial); Code, newcode; Title, newissue (located in bases, resulting from the local GeraPadrao); Local GeraPadrao; scilist; EnviaBasesSciELOPadrao.bat (local: proc/temp/transf2linux); FTP; processing server]


Configuration of the processing to send data and files to the processing area

Local server:

1) Configure the files in C:\scielo\web\proc\transfer\ or C:\home\scielo\www\proc\transfer:

EnviaBasesLogOn.txt

EnviaImgPdfLogOn.txt

EnviaTranslationLogOn.txt

Content of the LogOn files:

Open <server_name>

<user_name>

<password>

cd <path_www>

2) Execute in C:\scielo\web\proc\ or c:\home\scielo\www\proc\:

EnviaBasesSciELOPadrao.bat – it sends the databases from Windows to Linux

EnviaImgPdfSciELOPadrao.bat – it sends the img and pdf files from Windows to Linux

EnviaTranslationSciELOPadrao.bat – it sends the translations from Windows to Linux

Processing server:

3) Execute GeraPadrao.bat
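The four-line layout shown for the logon files (Open, user name, password, cd) could be filled in from a small helper such as the Python sketch below. It assumes those four lines are the whole file, which the slide does not state explicitly; the function name and the sample values are hypothetical.

```python
def render_logon_file(server_name, user_name, password, path_www):
    """Return the text of a logon file in the four-line layout shown on the slide."""
    lines = [
        f"Open {server_name}",  # server that receives the data
        user_name,              # FTP user
        password,               # FTP password
        f"cd {path_www}",       # remote folder of the web site
    ]
    return "\n".join(lines) + "\n"


# Sample values are placeholders, not real SciELO settings.
example = render_logon_file("ftp.example.org", "scielo", "secret", "/home/scielo/www")
```

The returned text can then be written to EnviaBasesLogOn.txt, EnviaImgPdfLogOn.txt or EnviaTranslationLogOn.txt in the transfer folder.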


Workflow in the Linux server

[Diagram: issue of a journal (v<VOL>n<NUM>, located in serial); Code, newcode; Title, newissue (located in bases, resulting from the local GeraPadrao); scilist; GeraPadrao.bat; databases (artigo, etc.) for the web site; homologation web site; copy; production/public web site]


Workflow in the Linux server

• After receiving the databases and files, the GeraPadrao.bat script must be executed to generate the databases for the web site. This is necessary because the Windows and Linux databases have incompatible formats.


Processing in Windows

• Xml_scielo: is part of PC Programs/server

 – PubMed: generates XML for PubMed

 – ISI: generates XML for Web of Science

• EnviaBasesScieloPadrao.bat sends the databases to the processing server

• EnviaImgPdfScieloPadrao.bat sends img and pdf files to the homologation server

• EnviaTranslationScieloPadrao.bat sends the translations to the homologation server

Note: the processing, homologation and public servers can be the same machine


Only for SciELO Brasil

• Health indicators (Brazilian database)

• Curriculum ScienTI / Lattes (Brazilian database, but an adaptation is possible)

• Semantic highlights: a trial of Knewco, interrupted.


Centralized processing

• SciELO.ORG:

 – Bibliometrics

 – Links to Medline, LILACS, etc.

 – Co-authors: not finished

• Centralized in Brazil:

 – DOAJ: not ready; an agreement with DOAJ is necessary

 – Accesses


Processing in the instance

• scieloUpdate: to update the SciELO web site in Linux

• For data exchange:

 – By sending data

   • Envia2Medline.bat: feeds scielo.org. Used for bibliometrics, etc.

   • Crossref

 – By making data available for harvesting

   • Google Scholar

   • Webservices

 – By querying

   • Scimago. Query to http://www.scimagojr.com/journalrank.php

   • databases: related and cited from SciELO.org


In the instance: no processing / exchanging data

• External services:

 – Google Analytics

 – OAI


Installation and configuration of SciELO web site and processing in a Linux server

• http://reddes.bvsalud.org/projects/scielo-metodologia/browser/tags/v5.0-pr/docs/SciELO-Web-5.0_installation_guide_en.pdf


Installation and configuration of the local SciELO web site

• http://webdevcodex.com/tutorial-installing-apache2-php5-mysql5-phpmyadmin3-windows-7-vista/

• Local SciELO web site

 – Version 3 (php4.3.x): http://reddes.bvsalud.org/projects/scielo-metodologia/browser/branches/scielo-web_3.3

 – Version 4 (php4.3.x - php5.2.x): http://reddes.bvsalud.org/projects/scielo-metodologia/browser/tags/


Using VirtualBox to host a Linux server in Windows

• Configuring the network:

 – Bridge

 – To get an IP, move or rename the file: /etc/udev/rules.d/70-persistent-net.rules

• Mapping the database folder in Linux to a folder in Windows, to use the free space in Windows. The space in the VM is limited to 28 GB.

 – Linux: /home/scielo/www/bases => Windows: c:\scielo\bases\linux\public

 – Linux: /home/scielo_homolog/www/bases => Windows: c:\scielo\bases\linux\homolog\

 – VirtualBox:

   • Settings: shared folders

   • add share: C:\scielo\bases\linux\public, named bases

   • add share: C:\scielo\bases\linux\homolog, named bases_homolog

 – /etc/fstab:

   • bases -> /home/scielo/www/bases

   • bases_homolog -> /home/scielo_homolog/www/bases
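The slide gives only the share-to-mount-point mapping for /etc/fstab, not the full entries. The Python sketch below renders one possible form of those entries, assuming the shares are mounted with the `vboxsf` filesystem type that the VirtualBox Guest Additions provide, and with `defaults` as mount options. Both values are assumptions rather than settings taken from the slides, and the function name is hypothetical.

```python
def fstab_lines(shares, fs_type="vboxsf", options="defaults"):
    """Return one /etc/fstab line per (share_name, mount_point) pair."""
    # Fields: device (share name), mount point, type, options, dump, pass.
    return [
        f"{name} {mount_point} {fs_type} {options} 0 0"
        for name, mount_point in shares
    ]


# Share names and mount points as listed on the slide.
SHARES = [
    ("bases", "/home/scielo/www/bases"),
    ("bases_homolog", "/home/scielo_homolog/www/bases"),
]

for line in fstab_lines(SHARES):
    print(line)
```

Depending on the installation, extra options (for example ownership settings for the web server user) may be needed so that the web site can write to the shared folders.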


Configuration in the server (Linux)

Edit /etc/hosts, including the IP and server name:

127.0.0.1 servername

• Example:

 – 127.0.0.1 vm.scielo.br

NOTE: 127.0.0.1 DOES NOT CHANGE


Configuration of the Virtual Host

Edit:

Ubuntu: /etc/apache2/sites-available/scielo

Windows: <Apache_path>/conf/extra/httpd-vhosts.conf

Blue: [email protected] = the e-mail of the web site administrator

Red: /home/scielo = path of the web site

Green: localscielo = url of the web site
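The three highlighted values (administrator e-mail, web site path, URL) correspond to standard Apache virtual host directives. The Python sketch below fills a minimal template with them. It is an illustration only: the real SciELO virtual host file contains more directives than these three, the function and template names are hypothetical, and the sample e-mail address and document root are assumptions.

```python
VHOST_TEMPLATE = """<VirtualHost *:80>
    ServerAdmin {admin_email}
    DocumentRoot {document_root}
    ServerName {server_name}
</VirtualHost>
"""


def render_virtual_host(admin_email, document_root, server_name):
    """Fill the minimal virtual host template with the three slide values."""
    return VHOST_TEMPLATE.format(
        admin_email=admin_email,
        document_root=document_root,
        server_name=server_name,
    )


# Placeholder values: the e-mail and the htdocs sub-path are assumptions.
print(render_virtual_host("admin@example.org", "/home/scielo/www/htdocs", "localscielo"))
```

For a second site such as the homologation one, the same template would be rendered again with a different path and server name.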


New Virtual Host

• Create the file for the virtual host in:

 – /etc/apache2/sites-available/<name_of_the_virtual_host_file>

 – Example: /etc/apache2/sites-available/scielo_homolog

• Copy the scielo virtual host file and edit it to change the configuration as shown in the previous slide:

 – Path

 – Server name


• After changing the configuration of Apache / virtual hosts, you MUST execute:

sudo /etc/init.d/apache2 reload


Configuration in the computers which access the SciELO web sites

Editing the hosts file: edit C:\windows\system32\drivers\etc\hosts, adding the lines:

127.0.0.1 localscielo

<IP_homolog> homologscielo

Where:

<IP_homolog> = IP of the homologation SciELO web site

homologscielo = server name of the homologation SciELO web site

Do this on every computer that has to access localscielo and homologscielo
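Since the same two lines have to be added on every client computer, the edit could be scripted. The Python sketch below works on the text of a hosts file and appends only the names that are not there yet, so running it twice changes nothing. The function name is hypothetical, and 192.0.2.10 is a documentation address standing in for <IP_homolog>.

```python
def add_host_entries(hosts_text, entries):
    """Return hosts_text with the missing (ip, name) entries appended."""
    known = set()
    for line in hosts_text.splitlines():
        # Ignore comments; the first field is the IP, the rest are names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            known.update(fields[1:])
    new_lines = [f"{ip} {name}" for ip, name in entries if name not in known]
    if not new_lines:
        return hosts_text
    separator = "" if hosts_text.endswith("\n") or not hosts_text else "\n"
    return hosts_text + separator + "\n".join(new_lines) + "\n"


# 192.0.2.10 is a placeholder for <IP_homolog>.
ENTRIES = [("127.0.0.1", "localscielo"), ("192.0.2.10", "homologscielo")]
```

To apply it, read C:\windows\system32\drivers\etc\hosts, pass its text through `add_host_entries` and write the result back, which on Windows requires administrator rights.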