The Corpus of Business Discourse
A Comparison of Accounting and HR Learners Dr. Alfred Miller
@ACBSPAccredited #ACBSP2016
ACBSP Region 8 The International Council of Business Schools and Programs
Reflecting on HR Industry Project: Word Cloud (Davies, 2010).
OBJECTIVE
•Explore Creating New Knowledge in the Classroom
•Build Awareness of Software-mediated Content Analysis
•Improve Teaching and Learning
Overview of Method
•Mixed-method grounded theory study
•Data mining and content analysis
•KH Coder, prepped with Stanford POS Tagger
•Data is student reflections from Taxation and HR
•Qualitative interpretation of quantitative data
•Five quantitative methods + intuition and induction
•Is KH Coder content analysis effective?
KH Coder Five Quantitative Methods
•Word frequency analysis
•Hierarchical cluster analysis
•Co-occurrence network
•Multi-dimensional scaling
•Self-organizing map
Grounded Theory
•Course reflections = Business discourse data
•HR section included an overseas study tour to Greece
•Taxation Accounting students learn U.S. Tax Code
•Data is coded and tagged; machine interaction then helps identify emergent properties from which a new theory is constructed
•Was new knowledge created?
Individual Reflection, submitted with the Final Project (outline below)
•Introduction
•What did you learn about individual and business taxation in BUS 4163 Taxation or International HRM in BUS 4936?
•Challenges you faced?
•Your experience with the group project?
•Conclusion
Three Groups
•2015 Year 4 International HR: 25 Students
•2015 Year 4 Taxation (Accounting): 23 Students
•2016 Year 4 Taxation (Accounting): 28 Students
Data Preparation
•Assemble reflections into a single master document for each section
•Scrub learner names, headings, bullet points, fonts and special treatments
•Separate each learner's contribution with a unique HTML tag, e.g. <h1>Learner1</h1>
•Save the data sample as a plain-text .txt file
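The data preparation steps could be sketched roughly as follows. This is a minimal illustration, not the author's actual workflow: the function name `build_master_document` and the sample reflections are hypothetical, and learner names are assumed to have been scrubbed by hand beforehand.

```python
import re

def build_master_document(reflections):
    """Join reflections into one plain-text string, one <h1> marker per learner.

    Assumes names were already removed manually; only bullets and
    irregular whitespace are cleaned here.
    """
    parts = []
    for i, text in enumerate(reflections, start=1):
        cleaned = text.replace("•", " ")                # drop bullet characters
        cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse whitespace and line breaks
        parts.append(f"<h1>Learner{i}</h1>\n{cleaned}")
    return "\n".join(parts)

# Hypothetical sample data, not taken from the study.
sample = ["• I learned about  deductions.", "The group project\nwas challenging."]
master = build_master_document(sample)
print(master)
```

The resulting string can then be written to a .txt file with UTF-8 encoding, one master file per class section.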
Research Questions
•Q1. Is text-mining methodology an effective way to reorganize, visualize and analyze text from business student reflections?
•Q2. Is text analysis of student reflections a meaningful way to assess creation of new knowledge in the classroom?
•Q3. Was new knowledge created as a result of the learners' experience in the classroom?
Graduate Outcomes HR
•Global Awareness
• Propose the application of functional knowledge and managerial insight through research on complex challenges facing human resource management.
• Evaluate the relevance of various theories related to challenges in human resource management using criteria derived from industry-based knowledge and skills.
• Analyze challenges facing human resource management using quantitative and qualitative analytical tools (Higher Colleges of Technology, 2015).
Course Learning Outcomes Taxation
•Differentiate the principles and practices in various tax systems
•Critique the effectiveness of tax system compliance
•Calculate the taxable income and tax liability of individuals
•Calculate the taxable income of business entities
•Calculate the taxable capital gains of assets
•Recommend a business model predicated on a tax system
Definition of Terms
•Corpus Linguistics. The study of a language through a large database of native texts, written or spoken. It includes frequency and concordancing techniques (Tang, 2008).
•KH Coder. Open-source text mining and quantitative content analysis software, continuously improved since its introduction in 2001. Originally developed for Japanese text, it now supports several languages (Koichi, H. 2015; Text Analysis, 2015) and has been used in more than 532 studies (Pelet, 2014).
•R. KH Coder utilizes the R programming language, an increasingly popular lingua franca for data analytics in both academia and the corporate world. Open-source R is considered on par with proprietary packages like SPSS, SAS and Stata (Northeastern University, 2016).
•Preprocessing text. Before KH Coder can be used, the text must be preprocessed, usually by a computer program, to remove characters that will not transfer effectively or be read by the coding software (Pelet, Khan, Papadopoulou, & Bernardin, 2014).
•Stanford POS Tagger. An efficient, basic part-of-speech tagger: software that reads text (originally English, since expanded to other languages) and assigns a part of speech, such as noun, verb or adjective, to each word and token. The tagger also performs lemmatization, identifying and grouping similar words according to their root form, use and meaning (Toutanova, Klein, Manning, & Singer, 2003).
•Fruchterman-Reingold algorithm. Force-directed algorithm used to lay out co-occurrence networks in KH Coder. Node placement is determined when the force vectors stabilize, balancing spring-like attraction against electric particle-like repulsion (Fruchterman & Reingold, 1991).
•Jaccard similarity coefficient. A measure of word co-occurrence: the frequency of co-occurrence of words a and b (the intersection) divided by the frequency of a plus the frequency of b minus their co-occurrence frequency (the union). For example, if word a appears 4 times, word b appears 3 times, and a and b co-occur 2 times, the Jaccard coefficient is 2 / (4 + 3 - 2) = 0.4 (Mori, Matsuo, & Ishizuka, 2004, p. 2).
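The worked example for the Jaccard coefficient can be checked with a few lines of Python. This is only a sketch of the formula as defined above; the function name `jaccard` is illustrative and is not part of KH Coder.

```python
def jaccard(freq_a, freq_b, freq_both):
    """Jaccard coefficient from word frequencies.

    freq_a, freq_b: how often each word appears.
    freq_both: how often the two words co-occur.
    """
    union = freq_a + freq_b - freq_both
    if union == 0:
        return 0.0  # neither word appears, so there is nothing to compare
    return freq_both / union

# Values from the definition: a appears 4 times, b 3 times, together 2 times.
print(jaccard(4, 3, 2))  # 0.4
```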
Word Frequency
•Word frequency: First and most basic way to identify themes. The key assumption is that frequently occurring words are more important clues to the major themes of the text than words that occur less frequently (Ryan, 2003).
•Proper Noun. 68 for HR class reflections
•Tier 1: Two were high frequency, Greece (44) and HR (25)
•Tier 2: Beiersdorf, manufacturer of Nivea, and Demo Pharmaceuticals, the firms students toured in Greece, interviewing the management
•Noun. 373 total words; most used noun was course (60)
•Tier 1: course, research, company, project
•Tier 2: student, employee, information, knowledge, experience, problem, organization, time, country, skill, class, interview, way, work, people
•Tier 3: business, industry, opportunity, thing, addition, issue, team, career, example, culture, topic, lot, semester, department, life, manager, responsibility, study, challenge, environment, idea, term, trip, chance, goal, strategy, teacher
•Adjective. 153 total words
•Tier 1: different (43)
•Tier 2: future, new, important, good, right, useful, able, best, international, integrative, confident, great, better, clear, economic, human, interesting, professional, real
•Adverb. 33 total words; most used adverb was abroad (7)
•Tier 1: abroad, especially, finally, furthermore, really
•Tier 2: actually, effectively, exactly, just, likewise, totally
•Tier 3: additionally, basically, briefly, efficiently, externally, firstly, greatly, hard, instead, internally, late, maybe, nowadays, precisely, prior, proficiently, properly, randomly, secondly, short, successfully
•Verb. 189 total words; most used verb, and most used word overall, was learn (86)
•Tier 1: learn, help, make
•Tier 2: know, write, face, gain, improve, work, think, create, study, relate, deal, let, provide, apply, develop, teach, understand, analyze, collect, meet
•Tier 3: choose, follow, interview, notice, organize, recruit, use, visit, benefit, build, expand, happen, manage, need, overcome, solve, support, travel, communicate, cooperate, discover, discuss, encourage, feel, like, live, search, try
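A word frequency count of this kind can be approximated in plain Python. The sketch below is a simplification: it lowercases and splits on letters only, whereas the study used KH Coder with the Stanford POS Tagger for lemmatization and part-of-speech grouping. The sample sentence and stopword set are hypothetical.

```python
import re
from collections import Counter

def word_frequencies(text, stopwords=frozenset()):
    """Count lowercase word tokens, skipping any listed stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

# Hypothetical sample text, not taken from the student reflections.
sample = "I learn tax rules. We learn to calculate tax. Tax is hard."
freq = word_frequencies(sample, stopwords={"i", "we", "to", "is"})
print(freq.most_common(2))  # [('tax', 3), ('learn', 2)]
```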
WORD FREQUENCY ANALYSIS for TAXATION vs HR
Part of Speech   HR 2015   Tax 2015   Tax 2016
Nouns            373       603        617
Proper nouns     60        133        95
Adjectives       153       264        251
Adverbs          33        59         69
Verbs            189       296        288
n=               25        23         28
Word Frequency: Taxation 2015 (Anzai and Matsuzawa, 2013)
tax 754, income 299, taxation 197, deduction 191, C or S corporation 173, individual 146, business 130, government 94, learn 94, pay 90,
rate 79, different 79, project 78, use 73, type 71, course 70, form 69, group 68, country 63, expense 60,
work 60, help 57, taxpayer 56, challenge 54, know 53, people 48, face 47, entity 47, thing 46, understand 45,
liability 44, calculate 43, profit 41, wealth 41, case 40, credit 40, corporate 40, deduct 40, apply 39, time 38,
include 37, make 37, work 37, gross 37, example 36, person 36, partnership 35, service 35, filing 33, important 33,
exemption 32, personal 32, standard 32, married 31, money 31, company 29, information 29, revenue 29, year 29, like 29
Word Frequency: Taxation 2016 (Anzai and Matsuzawa, 2013)
tax 573, taxation 201, learn 147, income 136, business 131, project 119, C or S Corporation 103, course 99, type 97, help (ful) 96,
know 87, deduction 86, understand 80, information 78, entity 74, calculate 73, group 66, work 65, face 63, thing 63,
government 63, different 62, make 60, time 50, country 49, knowledge 45, work 45, mean 43, example 42, challenge 41,
people 41, wealth 41, include 41, rate 40, new 40, individual 39, experience 37, good 37, use 36, member 36,
lot 35, difficult 34, class 33, subject 32, student 31, difficulty 30, need 30, study 30, problem 29, calculation 29,
future 29, provide 28, liability 28, way 28, service 27, payment 26, point 26, value 26, idea 26, consumption 26
Analysis of TAX Word Frequency
•6 of the top 10 words match between 2015 and 2016
•70% similarity between the top 60 words
•Unique to 2015: form, expense, report, taxpayer, liability, case, profit, credit, apply, gross, filing, important, exemption, personal, standard, married, year, like
•The 2015 differences were tax related, e.g. AMT, IRS Publication 17 Individual Tax Guide, Standard Deduction vs Itemized, Foreign Tax Credit
•Unique to 2016: mean, new, experience, good, member, lot, class, subject, student, need, study, problem, future, provide, liability, way, point, idea, consumption
•The 2016 differences were teaching and learning related, e.g. Active, IELTS, Extended Outside the Classroom, Experiential
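The top-10 comparison can be reproduced with a set intersection. The two lists below are the ten most frequent words from the 2015 and 2016 taxation frequency lists; the helper names and the normalization rule (folding case and treating "help (ful)" as "help") are assumptions made for this sketch.

```python
def normalize(word):
    """Fold case and drop the "(ful)" marker so variant spellings compare equal."""
    return word.lower().replace(" (ful)", "").strip()

def top_overlap(words_a, words_b):
    """Return the sorted list of words that appear in both lists."""
    set_a = {normalize(w) for w in words_a}
    set_b = {normalize(w) for w in words_b}
    return sorted(set_a & set_b)

top10_2015 = ["tax", "income", "taxation", "deduction", "C or S corporation",
              "individual", "business", "government", "learn", "pay"]
top10_2016 = ["tax", "taxation", "learn", "income", "business",
              "project", "C or S Corporation", "course", "type", "help (ful)"]

shared = top_overlap(top10_2015, top10_2016)
print(len(shared), shared)
```

With these lists, six words are shared, which agrees with the "6 of the top 10" figure on the slide.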
HIERARCHICAL CLUSTER ANALYSIS
•Produces a tree-shaped dendrogram
•At every step, the two objects that are least dissimilar are merged
•Agglomerative, bottom-up approach: Ward's (1963) minimum variance method is used; average and complete linkage are other methods
•Tree branches can be cut to isolate construct categories
•Often followed by multi-dimensional scaling
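To make the merge rule concrete, here is a small pure-Python sketch of agglomerative clustering with Ward's minimum variance criterion. It is not KH Coder's implementation; the word vectors are hypothetical presence/absence profiles across four documents, chosen only to illustrate the procedure.

```python
def centroid(points):
    """Mean vector of a list of equal-length numeric vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_merges(labels, vectors):
    """Merge the cheapest pair at each step; return the merge history."""
    clusters = {i: ([labels[i]], [vectors[i]]) for i in range(len(labels))}
    merges = []
    while len(clusters) > 1:
        keys = list(clusters)
        cost, i, j = min(
            (ward_cost(clusters[i][1], clusters[j][1]), i, j)
            for pos, i in enumerate(keys)
            for j in keys[pos + 1:]
        )
        merges.append((clusters[i][0], clusters[j][0], cost))
        clusters[i] = (clusters[i][0] + clusters[j][0],
                       clusters[i][1] + clusters[j][1])
        del clusters[j]
    return merges

# Hypothetical word profiles (1 = word appears in that document).
labels = ["tax", "income", "team", "group"]
vectors = [[1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
merges = ward_merges(labels, vectors)
for left, right, cost in merges:
    print(left, "+", right, "cost", round(cost, 3))
```

In this toy data the tax-related words merge first, then the teamwork words, and the two groups join only at the last and most costly step, which is the pattern a dendrogram would display.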
HR Cluster Analysis: Workplace Trends
•Future Workforce = trip, career, future, issue, organization, addition, make
•Business Management = manager, team, work, responsibility, business
•Work/Life Balance = idea, deal, life, study, create, country, lot, work, environment, think
[Figure: Workplace Trends cluster, with branches labeled Future Workforce, Business Management and Work/Life Balance]
HR Cluster Analysis: Global Mobility
•Talent and Ability = relate, term, skill, topic
•Innovation = know, new, way, improve
•Career = opportunity, write, industry, thing, student, semester
•International Student Experience = course, learn, Greece, different, time, project, company, help, research, knowledge, people, gain, information, class, experience, interview, example, culture
•HR Challenges = lot, important, HR, provide, problem, employee, face, department, challenge, good, right
[Figure: Global Mobility cluster, with branches labeled Talent & Ability, Innovation, Career, International Student Experience and HR Challenges]
HIERARCHICAL CLUSTER ANALYSIS of TAXATION 2015 vs 2016
[Figure: two cluster panels. Labels in the first: Cooperative Learning, Business Taxation, Individual Taxation. Labels in the second: Cooperative Learning, Business Taxation, Understanding US Tax Code]
2015 Cluster Analysis
•Married filing separately vs jointly
•Ways to lower taxes
•Itemized Deductions
•Tax Revenue and distribution of benefits
•Tax Deductible Business Expenses
•Sources of Income
•Educating and understanding by non-US students
•Student learning through group work
•Business Entities
[Figure: clusters marked C1, C2, C3]
2016 Cluster Analysis
•Dealing with hard math problems
•Teamwork skills: Being an effective team member
•Business entity taxation
•USA Study for International Students
•Study skills and Learning
•Taxation facts
•Taxation and Justice
•Taxable Income
•Tax deductible business expenses
[Figure: clusters marked C1, C2, C3]
Multi-dimensional Scaling
•Multi-dimensional scaling (MDS) maps observations into a space; it is used for interval or ratio scaled data, or together with correspondence analysis for nominal data.
•A graphical way of finding groupings in the data.
•Preferred in some cases because MDS has relaxed assumptions about normality, scale data, equal variances and covariances, and sample size.
•In this analysis, used mainly to look for clusters and dimensions.
Multi-dimensional Scaling
•Purple = How to Create an Effective Cross Cultural Training Program = trip, problem, culture, HR, class, write, interview, department, topic.
•Orange = Career Aspirations = relate, term, career, team, future, industry.
•Red = Participative Management = organization, study, idea, issue, work, important, business, manager.
•Green = Management of the Classroom Environment = responsibility, deal, semester, study, life, let, environment, create, think.
•Blue = Career Skills that Travel Will Improve = good, improve, opportunity, skill, thing, country.
•Beige = How to Face Challenges = provide, right, lot, face, know, way, people, gain, new, challenge, information, example.
•Aqua = Knowledge Worker = work, addition, make, employee, student, learn, course, project, Greece, different, research, company, help, time, knowledge, experience.
Multi-Dimensional Scaling: International HR 2015
Emergent Trends in Educating Today's Knowledge Worker
Figure 6. HR Industry Project: Interpretation of Multi-dimensional Scaling.
[Figure labels: Knowledge Worker; Participative Management; Career Aspirations; Management of the Classroom Environment; How to Create an Effective Cross Cultural Training Program; Career Skills that Travel Will Improve; How to Face Challenges]
Multi-Dimensional Scaling: Taxation 2015
Two tax-related clusters are delineated and two sets are agglomerated in pairs; all are delineated from the cooperative learning cluster.
Multi-Dimensional Scaling: Taxation 2016
Three distinct tax-related clusters are delineated and one pair is agglomerated; all are delineated from the two cooperative learning clusters.
Co-occurrence Network
•Pink = company node = single most central concept
•Darker blue = next most central concepts after pink. In the project, students researched and proposed solutions to specific HR problems uncovered at particular firms.
•Learn = largest node, also centrally located
•Lines = network relationships; thicker line = stronger connection
•Strongest bond = career + future; 2nd strongest = project + industry
•Greece is the center of its own cluster
•Responsibility and work relate only to each other
•The central cluster of learn, company, knowledge, experience, make, information, research and project appears to be significantly related, as core concepts
•Other words that are darker blue but located more peripherally are topics of tangential importance
•Some words that logically go together are connected, such as face, department and challenge, or important, organization and employee, which connect to company. Interestingly, these two streams first intersect and then connect at the problem node.
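The edges of a co-occurrence network like this one can be approximated by scoring every word pair with the Jaccard coefficient and keeping the strongest pairs. The sketch below assumes each learner's reflection is the unit of co-occurrence; the documents, the threshold and the function name are hypothetical, and the layout step (Fruchterman-Reingold) is not shown.

```python
from itertools import combinations

def cooccurrence_edges(documents, min_jaccard=0.5):
    """Return (word_a, word_b, score) edges whose Jaccard score meets the threshold."""
    doc_sets = [set(d.lower().split()) for d in documents]
    vocab = sorted(set().union(*doc_sets))
    edges = []
    for a, b in combinations(vocab, 2):
        in_a = sum(1 for s in doc_sets if a in s)
        in_b = sum(1 for s in doc_sets if b in s)
        both = sum(1 for s in doc_sets if a in s and b in s)
        score = both / (in_a + in_b - both)  # every vocab word occurs at least once
        if score >= min_jaccard:
            edges.append((a, b, score))
    # Strongest connection first, analogous to the thickest line in the plot.
    return sorted(edges, key=lambda e: -e[2])

# Hypothetical mini-corpus: one string per learner.
docs = ["career future project", "career future industry", "project industry learn"]
edges = cooccurrence_edges(docs)
for a, b, score in edges:
    print(a, b, round(score, 2))
```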
Centrality Co-occurrence Network: International HR 2015
Clear and well networked central theme
Centrality Co-occurrence Network: Taxation 2015
Several strong network ties noted. One well networked central theme with three well networked sub-themes
Centrality Co-occurrence Network: Taxation 2016
Connections not as strong as for 2015. One well networked central theme with four sub-themes, of which two are well networked
Communities Co-occurrence Network: Taxation 2015
Distinct communities of characteristics centered around a central theme
Communities Co-occurrence Network: Taxation 2016
Has four well networked communities and four that are less so. The central theme is relatively weak
Self Organizing Map
•Construct groups in colors closer to pink, like orange or red, indicate a large difference between the vectors of neighboring nodes; they are distant.
•A pink line, if present, denotes a vast gulf dividing clusters.
•Shades closer to blue, like purple and green, are proximally related to their neighbors.
•Shades closer to white, such as gray, are more neutral (Higuchi, 2010).
•Constructs are listed below by color, from proximal to more distant, and correlated with search hits for the construct, shown in italics. These relate to the map on the following page.
Self Organizing Map
•Blue = Challenges of Growing a Business and Employee Retention = experience, challenge, good, employee, important, organization, lot, information, know, problem, company, face
•Purple = General Manager = business, responsibility, country, manager, work, term, study
•Green = The Advanced Way = example, way, issue
•Gray = Travel Strategy = project, industry, new, trip, people, different, improve, create
•Brown = Study Abroad = life, addition, make, idea, student, study, let, think, semester
•Red = Human Resource Management = course, career, future, team, thing, HR, deal, Greece, culture, environment, learn, interview, department
•Orange = How to Write = write, relate, provide, right
•Pink = Research Skills = research, topic, time, skill, gain, help, work, class, opportunity, knowledge
Self Organizing Map: International HR 2015
[Figure labels: Study Abroad; Travel Strategy; Research Skills; General Manager; The Advanced Way; How to Write; Human Resource Management; Challenges of Growing a Business and Employee Retention]
Theoretical model derived using tax constructs and search engine
Self Organizing Map: Taxation 2015
[Figure: map regions labeled Distal, Proximal, Neutral]
Three proximally located constructs, three that are distal and two that are neutral
Self Organizing Map: Taxation 2016
[Figure: map regions labeled Distal, Distal, Neutral, Proximal]
Three proximally located constructs, three that are distal and two that are neutral
Q1. Is text-mining methodology an effective way to reorganize, visualize and analyze text from business student reflections?
•Yes, the five text mining methods from KH Coder contributed new ways to interpret the data.
•The method has been widely deployed: it is approaching 600 published studies.
•The software is open source, works with many languages, and is easy to use.
Q2. Is text analysis of student reflections a meaningful way to assess creation of new knowledge in the classroom?
•Word frequency analysis = understanding mastery and use of language
•Co-occurrence network allowed the identification of differentiated concepts, both central and tangential
•Hierarchical cluster analysis, multi-dimensional scaling and self-organizing map contributed to theoretical model building
•In keeping with grounded theory, these theoretical models appear as emergent properties. They are meaningful ways to assess creation of new knowledge in the classroom.
Q3. Was new knowledge created as a result of the learners' experience in the global classroom?
•Models using hierarchical cluster analysis, multi-dimensional scaling, and the self-organizing map, along with the supporting information gained from word frequency analysis and the co-occurrence network, assessed the creation of new knowledge.
•Models demonstrate alignment with learning outcomes from Higher Colleges of Technology courses.
•New knowledge was created in the classroom via exploratory theoretical models that explain what the students gained from completing the course or program.
Author Bio
•Dr. Alfred Miller, Mizzou alum, entrepreneur & company founder
•US Army veteran; holds a Ph.D. in E-Commerce from NCU and four MAs from Webster; graduate of Wharton's Global Faculty Development Program
•BAC/GRAD Commissioner, Editor ACBSP Region 8 Journal
•Faculty and ACBSP Champion for Higher Colleges of Technology
•Chair-Elect for ACBSP Region 8, representative to the Scholar Practitioner Publication Committee/Editorial Review Board
•Member of the American Center for Mongolian Studies, the Gulf Comparative Education Society, and Higher Education Teaching and Learning's Liaison for UAE and Ecommerce Discipline Officer
Thank You!
•Dr. Alfred Miller (Ph.D.)
•Commissioner, ACBSP BAC/GRAD Board
•Business Faculty/ACBSP Champion/System Representative
•Chairman Elect of the Board of Directors, ACBSP Region 8
•Editor Region 8 Journal
•Higher Colleges of Technology-Fujairah Women's College
•Direct: 971 9 201 1325
•Fax: 971 9 228 1313
•Mobile: 971 50 324 1094