

A novel machine learning model that guides graduate students to write more organized and structured texts

Javier Vera¹, Hector Allende-Cid², René Venegas³, Sebastián Rodríguez², Wenceslao Palma², Sofía Zamora³, Fernando Lillo³, Humberto González², Ashley Van Cott¹,⁴, Eduardo N. Fuentes¹,⁴*

Academic writing is one of the most valuable skills a scientist can develop. A primary challenge for graduate students is to coherently and concisely organize and present ideas within a manuscript. Writing a quality research manuscript requires transmitting the most relevant information through precise sentences that fulfill diverse communicational roles, ultimately resulting in a coherent, understandable text connected by cohesive mechanisms (e.g. lexical relationships between pairs of terms). Despite technological advances, the execution and teaching of the writing process have not similarly advanced. Therefore, a top priority for graduate programs is to implement new methodologies and technologies that aid students in communicating research advances. Through our investigation, we developed a novel, unsupervised machine-learning model applied to cell biology and biomedical texts that guides students in writing better organized and more structured texts.

In conclusion, our research proposes an unsupervised machine-learning model that reveals the hierarchy of information within cell biology and biomedical texts and provides automatic cohesion feedback that aids graduate students in writing more coherent, structured Abstracts. Our findings show how computational tools can significantly help young scientists improve their communication skills. Future technologies and tools that provide deeper and more detailed advice for constructing and writing academic texts (e.g. scientific papers, theses, grants) remain to be developed.

¹WriteWise Research Group, Artificial Intelligence Unit, Santiago, Chile; ²Pontificia Universidad Católica de Chile, Escuela de Ingeniería Informática; ³Instituto de Literatura y Ciencias del Lenguaje, Chile; ⁴BioPub, Scientific Writing Unit, Santiago, Chile. *Corresponding author: [email protected]

[Figure 1 panel labels, as extracted]

MACHINE LEARNING MODEL: TEXT; NATURAL LANGUAGE PROCESSING; GRAPH CONSTRUCTION; UNSUPERVISED MACHINE LEARNING; SOFTWARE ANALYSIS; USER INTERACTION

• Key concepts
• Text organization and structure
• Word connectivity and hierarchy
• Text reorganization and feedback

EXPERIMENTAL DESIGN: ORIGINAL ABSTRACT; PRE-TEST; PILOT ACTIVITIES; INTERVENTION: 1° SOFTWARE DEMO (30 MIN), 2° SOFTWARE USE (1 HOUR); REVISED ABSTRACT; POST-TEST

ABSTRACT WRITING QUALITY ASSESSMENT

1. Rubric development (experts in academic discourse)
2. Rubric validation (external reviewer)
3. Rubric improvement
4. Training and induction of rubric raters
5. Minimum agreement between raters
6. Randomized-blinded revision and rating (2 raters/abstract)
7. Average between raters
8. Statistical validation and results

FIG. 1. MACHINE LEARNING MODEL, EXPERIMENTAL DESIGN, AND WRITING QUALITY ASSESSMENT
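Steps 5 to 7 of the assessment list (minimum agreement between raters, two raters per abstract, and the average between raters) can be illustrated with a minimal sketch. The poster gives neither the rubric scale nor the agreement threshold, so the integer scores, the `tolerance` of one point, and the function names `agreement_rate` and `average_ratings` are all assumptions made for illustration; only the criterion names come from the poster.

```python
# Criterion names as listed on the poster. The scoring scale is NOT stated
# there; the integer scores used below are purely hypothetical.
CRITERIA = [
    "Topic adequacy",
    "Audience adequacy",
    "Communicational process",
    "Semantic relationships",
    "Sentence length",
    "Holistic appreciation",
    "Conclusion presentation",
]


def _check(rater_a, rater_b):
    """Reject empty score sheets and sheets that cover different criteria."""
    if not rater_a or rater_a.keys() != rater_b.keys():
        raise ValueError("both raters must score the same, non-empty set of criteria")


def agreement_rate(rater_a, rater_b, tolerance=1):
    """Share of criteria on which the two raters differ by at most `tolerance`.

    `tolerance` is an assumed parameter, not a value taken from the poster.
    """
    _check(rater_a, rater_b)
    close = sum(abs(rater_a[c] - rater_b[c]) <= tolerance for c in rater_a)
    return close / len(rater_a)


def average_ratings(rater_a, rater_b):
    """Per-criterion mean of two raters, plus an overall 'Average' entry."""
    _check(rater_a, rater_b)
    averaged = {c: (rater_a[c] + rater_b[c]) / 2 for c in rater_a}
    averaged["Average"] = sum(averaged.values()) / len(averaged)
    return averaged


# Hypothetical scores for one abstract from two blinded raters.
rater_a = dict(zip(CRITERIA, [3, 2, 3, 2, 4, 3, 2]))
rater_b = dict(zip(CRITERIA, [3, 3, 3, 4, 4, 3, 2]))
print(agreement_rate(rater_a, rater_b))
print(average_ratings(rater_a, rater_b))
```

In a setup like this, an abstract whose agreement rate falls below whatever minimum the study adopted would presumably be re-rated before its scores are averaged; the poster does not say how such cases were handled.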

[Chart labels, as extracted. Headings: INTRODUCTION, CONCLUSION. Rubric criteria: Topic adequacy; Audience adequacy; Communicational process; Semantic relationships; Sentence length; Holistic appreciation; Conclusion presentation; Average.]

All-trans-retinoic acid (AtRA) is the most active metabolite derived from vitamin A metabolism and has been used for treatments of some erythropoietic diseases. Recent studies using human erythrocytes (RBC) have suggested that the interaction mechanism induces structural changes in lipid composition. However, the detail of these changes is unclear. In the present study, the molecular interaction between AtRA and RBC as well as molecular models of membrane structural changes were investigated. The latter consisted of dimyristoylphosphatidylcholine (DMPC) and dimyristoylphosphatidylethanolamine (DMPE), representative of phospholipid classes located in the outer and inner monolayers of the RBC respectively. X-ray diffraction and differential scanning calorimetry (DSC) showed that AtRA induced structural and thermotropic perturbations in multilayers and vesicles of both DMPC and DMPE, particularly at the hydrophobic region of the membranes. Scanning electron microscopy (SEM) observations revealed that AtRA induced morphological alterations in RBC from their normal discoid form to stomatocytes. These outcomes suggested that AtRA molecules were located preferentially in the inner monolayer of the RBC membrane. The results obtained from this study suggest that the location of AtRA molecules into the RBC membrane and the modulation of the membrane properties thus providing deeper insight into the structural biology of these type of cells.

FIG. 3. MACHINE LEARNING MODEL HELPS TO COMMUNICATE KEY CONCEPTS AND ORGANIZE TEXT STRUCTURE (REVISED ABSTRACT)

Ranking of the 3 most important key concepts:

1. AtRA
2. RBC
3. Structural

1% most important and connected concepts. Text represented as an interactive graph.

FIG. 4. UNSUPERVISED MACHINE LEARNING MODEL HELPS TO COMMUNICATE KEY CONCEPTS AND ORGANIZE TEXT STRUCTURE
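The idea of representing a text as a graph and ranking its most important and connected concepts can be illustrated with a minimal sketch. The poster does not describe how the graph is built or how concepts are scored, so everything here is an assumption: terms are linked when they share a sentence, and importance is estimated with a weighted PageRank computed by power iteration. The names `STOP_WORDS`, `tokenize`, `build_graph`, and `rank_concepts` are hypothetical, and the stop-word list is deliberately tiny.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical, deliberately small stop-word list for illustration only.
STOP_WORDS = {
    "the", "of", "and", "in", "to", "a", "is", "that", "as", "from", "were",
    "was", "has", "have", "been", "for", "with", "this", "these", "their",
    "by", "at", "into", "both", "some", "most", "well", "between",
}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic tokens that are not stop words."""
    words = re.findall(r"[a-z][a-z\-]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_graph(text):
    """Link every pair of distinct terms that co-occur in a sentence.

    Edge weights count the number of sentences shared by the two terms.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        terms = sorted(set(tokenize(sentence)))
        for u, v in combinations(terms, 2):
            graph[u][v] += 1
            graph[v][u] += 1
    return graph


def rank_concepts(graph, damping=0.85, iterations=50):
    """Rank terms by weighted PageRank (power iteration) on an undirected graph."""
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return []
    score = {node: 1.0 / n for node in nodes}
    # Total edge weight per node; every node has at least one edge by construction.
    strength = {node: sum(graph[node].values()) for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            incoming = sum(
                score[nb] * weight / strength[nb]
                for nb, weight in graph[node].items()
            )
            updated[node] = (1 - damping) / n + damping * incoming
        score = updated
    return sorted(score.items(), key=lambda item: item[1], reverse=True)


# Hypothetical three-sentence text; prints the three highest-ranked terms.
sample = "AtRA alters membrane structure. AtRA changes RBC shape. RBC membrane binds AtRA."
for term, value in rank_concepts(build_graph(sample))[:3]:
    print(term, round(value, 3))
```

Sentence-level co-occurrence is only one way to define word connectivity; a sliding window over tokens or links between noun phrases would give a different graph and possibly a different ranking.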

FIG. 2. UNSTRUCTURED AND UNORGANIZED ABSTRACT (ORIGINAL)

[Chart: PRE- AND POST-TEST COMPARISON, with eight paired PRE and POST labels.]
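A pre- and post-test comparison of this kind can be summarized with paired gains per student. The poster does not name the statistical test it used, so the exact sign-flip (paired permutation) test below is a swapped-in assumption rather than the authors' method. The function names `paired_gains` and `sign_flip_p_value` are hypothetical, and the scores in the usage lines are invented for illustration.

```python
from itertools import product


def paired_gains(pre, post):
    """Post-test minus pre-test score for each student, in order."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one score per student")
    return [after - before for before, after in zip(pre, post)]


def sign_flip_p_value(gains):
    """Exact one-sided p-value of a paired sign-flip permutation test.

    Under the null hypothesis of no effect, each gain is equally likely to be
    positive or negative. The p-value is the share of all sign assignments
    whose mean gain is at least as large as the observed mean. The number of
    assignments is 2 ** len(gains), so this suits small pilot samples only.
    """
    if not gains:
        raise ValueError("at least one paired gain is required")
    n = len(gains)
    observed = sum(gains) / n
    at_least_as_large = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        total += 1
        flipped_mean = sum(s * g for s, g in zip(signs, gains)) / n
        if flipped_mean >= observed - 1e-12:  # small slack for float rounding
            at_least_as_large += 1
    return at_least_as_large / total


# Invented rubric averages for five students before and after the intervention.
pre_scores = [2.0, 2.5, 3.0, 2.0, 2.5]
post_scores = [3.0, 3.0, 3.5, 2.5, 3.5]
gains = paired_gains(pre_scores, post_scores)
print(gains, sign_flip_p_value(gains))
```

The same comparison could be run once per rubric criterion and once on the overall average, which is the layout the chart labels suggest.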

All-trans-retinoic acid (AtRA) is a metabolite derived from vitamin A metabolism has been used for the treatment of inflammatory skin diseases such as acne or psoriasis and as a potential chemotherapeutic agent in some types of cancer. In the present study, the molecular interaction with human erythrocytes as well as molecular models of its membrane were investigated. The latter consisted of dimyristoylphosphatidylcholine (DMPC) and dimyristoylphosphatidylethanolamine (DMPE), representative of phospholipid classes located in the outer and inner monolayers of the human erythrocyte membrane, respectively. X-ray diffraction and differential scanning calorimetry (DSC) showed that the molecule induced structural and thermotropic perturbations in multilayers and vesicles of both DMPC and DMPE, particularly at the hydrophobic region of the membranes. Scanning electron microscopy (SEM) observations revealed that the retinoid induced morphological alterations from their normal discoid form to stomatocytes. These outcomes suggested that the retinoic were located preferentially in the inner monolayer and suggest that the location of AtRA molecules into the RBC membrane and the modulation of the membrane properties could be an important issue to help clarify the potent biological effects shown by this retinoid.