Data Collection and Analysis in Sociolinguistics
Practical elements for research methods in sociolinguistics
Enrico Giai
BA in Translating and Interpreting
MA Student in Translation Studies, Turin University
Email: [email protected]
Turin, 07-08 April 2014
2 Tuesday, April 8th
Main topics
- Inferential statistics: variables, hypothesis, null hypothesis, likelihood, chi square test, ANOVA
- Rbrul for inferential and multivariate statistics
3 Inferential Statistics – Variables
Two types of variables:
- Dependent
- Independent
The independent variable(s) affect the dependent variable in some predictable way.
Another classification (for questions):
- Category type variables (usually dependent variables)
- Ordinal type variables
- Continuous type variables (usually independent variables)
4 Inferential Statistics – Experimental and Null Hypothesis
Experimental hypothesis:
- The hypothesis according to which a certain variable is affected in a predictable and systematic way by some other variable
- Must be tested
Null hypothesis: the exact opposite of the experimental hypothesis, i.e. the variable is not affected by the other variable
5 Inferential Statistics – Likelihood and Statistical Significance
Likelihood, or statistical significance:
- The probability of obtaining results like the observed ones if the null hypothesis were true
- Expressed as a percentage
- As a convention in the humanities and social sciences, we take 5% sure that the null hypothesis is true (p = 0.05) as a cut-off point. Greater than 5% sure (p > 0.05), we cannot reject the null hypothesis; less than or equal to 5% sure (p ≤ 0.05), we reject the null hypothesis (Levon 2010:71)
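The cut-off rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the sample p-values are made up for the example, and the 0.05 threshold is the convention quoted above.

```python
def decide(p_value, alpha=0.05):
    """Apply the conventional cut-off: reject the null hypothesis only if p <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "cannot reject the null hypothesis"


# Hypothetical p-values, one on each side of the cut-off and one exactly on it
for p in (0.01, 0.05, 0.20):
    print(p, "->", decide(p))
```

Note that a value exactly equal to 0.05 still counts as significant under this convention.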
6 Inferential Statistics – Chi Square Test (1)
Related to 2 category type questions.
The test compares the observed frequencies with the expected ones, in order to establish whether the null hypothesis can be rejected.
How to:
1. Calculate the observed frequencies
2. Calculate the expected frequencies
3. Calculate the chi squared values
4. Sum the chi squared values up
5. Calculate the degrees of freedom
6. If the summed chi squared value is lower than the critical value for p = 0.05 at those degrees of freedom, the null hypothesis cannot be rejected
You can use RBRUL or the TEST.CHI.QUAD Excel formula (CHISQ.TEST in English-language Excel).
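The procedure above can be sketched by hand in Python, without Rbrul or Excel. The counts in `table` are hypothetical (they are not the survey data), and `CRITICAL_05` holds only the first rows of a standard chi squared table at p = 0.05.

```python
def chi_square(observed):
    """Return (statistic, degrees_of_freedom, expected) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected frequency of a cell = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # One chi squared value per cell, then all of them summed up
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Critical values of chi squared at p = 0.05 (first rows of a standard table)
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

# Hypothetical counts: rows = code-switching yes/no, columns = three age brackets
table = [[20, 15, 10],
         [10, 15, 20]]

stat, dof, expected = chi_square(table)
significant = stat > CRITICAL_05[dof]
print(f"chi squared = {stat:.3f}, df = {dof}, significant at 0.05: {significant}")
```

In this made-up table the summed value exceeds the critical value, so the null hypothesis would be rejected; with real data the outcome may of course differ.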
7 Inferential Statistics – Chi Square Test (2)
You can use the TEST.CHI.QUAD Excel formula.
Example: occurrences of code-switching in relation to age brackets in the Filipino language survey.
N.B.: Age is treated as a category type question because we consider age brackets!
8 Inferential Statistics – Chi Square Test (3)
Expected frequencies (row total × column total / grand total):
=(E3*B5)/E5 in H3
=(E3*C5)/E5 in I3
=(E3*D5)/E5 in J3
9 Inferential Statistics – Chi Square Test (4)
Chi Square Test in J5:
=TEST.CHI.QUAD(B3:D3;H3:J3)
The value is > 0.05, therefore we cannot reject the null hypothesis: the differences may well be due to chance (NO STATISTICAL SIGNIFICANCE)
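The Excel formula returns a p-value rather than the summed chi squared value, which is why its result is compared with 0.05 directly. A rough Python equivalent for a single row of counts is sketched below. The function names `chi2_sf` and `chisq_test` are invented for this example, the counts are hypothetical, and the choice of degrees of freedom (number of cells minus one) is an assumption about how Excel treats a one-row range.

```python
import math


def chi2_sf(statistic, dof):
    """Upper-tail probability of the chi squared distribution (the p-value)."""
    y = statistic / 2.0
    # Closed-form starting points: dof = 2 (even case) or dof = 1 (odd case)
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < dof / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chisq_test(observed, expected):
    """p-value for one row of observed counts against the expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2_sf(statistic, len(observed) - 1)


# Hypothetical observed and expected counts for three age brackets
p_value = chisq_test([20, 15, 10], [15, 15, 15])
print(f"p = {p_value:.4f}")
```

A p-value above 0.05 from this function would be read exactly like the Excel result: the null hypothesis cannot be rejected.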
10 Inferential Statistics – Scatterplot
Related to 2 continuous type questions.
Shows the correlation between two variables:
- Positive correlation
- Negative correlation
You can use RBRUL – see slide #56
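The direction of a correlation can also be expressed as a number. The sketch below computes Pearson's correlation coefficient, which is one common measure and not necessarily the one Rbrul reports; the ages and language counts are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 perfect positive, -1 perfect negative."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical respondents: age and number of known languages
ages = [20, 30, 40, 50, 60]
known_languages = [4, 3, 3, 2, 2]

r = pearson_r(ages, known_languages)
print("positive correlation" if r > 0 else "negative correlation", round(r, 3))
```

A coefficient below zero means that one variable tends to decrease as the other increases.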
11 Inferential Statistics – ANOVA
ANalysis Of VAriance
Bi/Multivariate Regression Analysis
Related to more than one category type question and more than one continuous type question
You can use RBRUL – see e-book
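As a minimal illustration of the idea behind analysis of variance, the sketch below computes the F ratio for the simplest case only: one category type variable (the groups) and one continuous type variable (the scores). This one-way case is a simplification chosen for the example, not the multivariate procedure run in Rbrul, and the data are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numeric scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the individual scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical data: number of known languages in three age brackets
brackets = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio, df_between, df_within = one_way_anova_f(brackets)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

The larger the F ratio, the more the group means differ relative to the variation inside each group; its significance is then read from an F table for the two degrees of freedom.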
12 Inferential and multivariate statistics
Inferential statistics:
- Formulating and testing hypotheses
- Key concepts: likelihood, dependent and independent variables, hypothesis and null hypothesis
Multivariate statistics, or statistical modelling:
- How a dependent variable changes in relation to two or more independent variables
- Key concept: the three lines of evidence (see Tagliamonte 2012):
  - Statistical significance (p < 0.05)
  - Factor weight (FW → 1)
  - Strength of factor group
13 Rbrul and multivariate statistics
Rbrul:
- Based on R
- Tool for multivariate statistics
- Input: Excel worksheet
- Output: numbers
What for?
- Formulating hypotheses after descriptive analysis of a questionnaire/corpus
- Testing hypotheses with inferential multivariate analysis
What do we need?
- Excel worksheet in .csv format
- R
14 Converting .xlsx format to .csv (1)
Let's consider the Filipino language survey (.xls format).
1. Go to http://www.docspal.com (or another online converter)
15 Converting .xlsx format to .csv (2)
2. Upload the .xls or .xlsx Excel file and select .csv in "convert to"
16 Converting .xlsx format to .csv (3)
3. Click on "Convert"
17 Converting .xlsx format to .csv (4)
4. Click on the output file
18 Converting .xlsx format to .csv (5)
5. Click on "Salva pagina con nome" ("Save page as")
19 Converting .xlsx format to .csv (6)
6. The .csv output file
20 Rbrul: Installation step-by-step (1)
1. Download R: http://cran.r-project.org/bin/windows/base/
21 Rbrul: Installation step-by-step (2)
2. Press "Avanti" ("Next") until the installation process finishes.
22 Rbrul: Installation step-by-step (3)
3. Open R. If you have trouble, right-click and choose "Esegui come amministratore" ("Run as administrator").
23 Rbrul: Installation step-by-step (4)
4. Open R.
24 Rbrul: Installation step-by-step (5)
5. Write: source("http://www.danielezrajohnson.com/Rbrul.R")
25 Rbrul: Installation step-by-step (6)
6. Hit the Enter key
26 Rbrul: Installation step-by-step (7)
7. Write rbrul()
27 Rbrul: Installation step-by-step (8)
8. Hit the Enter key. Now you are in Rbrul.
28 Rbrul: Loading data (1)
1. Write 1 and press the Enter key
29 Rbrul: Loading data (2)
2. Write c and press the Enter key
30 Rbrul: Loading data (3)
3. Open the questionnaire in .csv
31 Rbrul: Loading data (4)
4. Now you are ready
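Variables are later chosen by number (for instance 50, 42 or 46), so it helps to know which number belongs to which column of the .csv file. The sketch below lists the columns of a CSV with running numbers. It assumes that the numbers follow the column order starting from 1, which should be checked against what Rbrul actually displays; the file content and column names are hypothetical.

```python
import csv
import io

# Hypothetical stand-in for the exported questionnaire file
SAMPLE_CSV = """age,known_languages,code_switching
23,3,yes
41,2,no
35,4,yes
"""


def numbered_columns(csv_text):
    """Return (number, column name) pairs, numbering the header from 1."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return list(enumerate(header, start=1))


for number, name in numbered_columns(SAMPLE_CSV):
    print(number, name)
```

With a real file, the text could be read with `open(path, newline="").read()` and passed to the same function.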
32 Example: Linguistic survey and RBRUL (1)
Considered variables:
- Code-switching (category type variable/question)
- Who speaks what language(s) at work, with friends, & with family in IT & PH (continuous type question)
- Who uses what language(s) when watching TV, reading, dreaming, & thinking (category type question)
- Number of known languages (continuous type question)
- Age (continuous type question)
33 Example: Linguistic survey and RBRUL (2)
Hypotheses:
- Code-switching & who speaks what language(s) at work, with friends, & with family in IT & PH (cat+con: bivariate analysis)
- Code-switching & who uses what language(s) when watching TV, reading, dreaming, & thinking (cat+cat: cross tabulation)
- Number of known languages & age (con+con: scatterplot)
34 Hypothesis: Code-switching and language use (1)
Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis.
Is there a relation between the number of languages used to talk with friends in PH and in IT & the occurrences of code-switching?
Average PH: 1.43
Average IT: 1.31
Category+continuous: bivariate analysis
1. Press 5 for bivariate analysis and hit the Enter key.
35 Hypothesis: Code-switching and language use (2)
2. Choose variables (1)
36 Hypothesis: Code-switching and language use (3)
3. Dependent variable (50)
37 Hypothesis: Code-switching and language use (4)
4. Type of response (Enter)
38 Hypothesis: Code-switching and language use (5)
5. Choose application (2 + Enter x3)
39 Hypothesis: Code-switching and language use (6)
6. Choose independent variable (# lang used with Friends in IT/PH) (42 Enter 46 Enter x2)
40 Hypothesis: Code-switching and language use (7)
7. Choose continuous variable (42 Enter 46 Enter x2)
41 Hypothesis: Code-switching and language use (8)
8. Modelling (5 Enter)
42 Hypothesis: Code-switching and language use (9)
8. Modelling (5 Enter)
43 Hypothesis: Code-switching and language use (10)
Log-odds: 0.571 vs 0.292 (if positive, the variable raises the likelihood of code-switching)
Deviance: 142.818 vs 144.821 (the larger the deviance, the less accurately the model fits the data)
P value: 0.0644 vs 0.234 (both > 0.05)
Therefore: the correlation between code-switching and language use with friends is NOT SIGNIFICANT
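Log-odds are easier to read once converted. The sketch below assumes that the two log-odds values above are coefficients of a logistic regression, which the slides do not state explicitly. Under that assumption, `odds_ratio` gives the factor by which the odds of code-switching change for each additional language, and `inverse_logit` maps a log-odds value onto the 0 to 1 scale. Both function names are made up for the example.

```python
import math


def odds_ratio(log_odds):
    """Multiplicative change in the odds for a one-unit increase of the predictor."""
    return math.exp(log_odds)


def inverse_logit(log_odds):
    """Map a log-odds value onto the 0-1 scale (0 log-odds corresponds to 0.5)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# The two log-odds values reported above
for value in (0.571, 0.292):
    print(f"log-odds {value}: odds ratio {odds_ratio(value):.3f}, "
          f"0-1 scale {inverse_logit(value):.3f}")
```

Neither conversion changes the conclusion: with p-values above 0.05, the effect is not statistically significant however large the odds ratio looks.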
44 Hypothesis: Code-switching and language use (11)
The same procedure can be adopted in analysing the relation between code-switching & language used with family & at work in Italy and in the Philippines
45 Hypothesis: Code-switching and language use – TV (1)
Formulate hypothesis on code-switching and language use when watching TV using cross tabulation and Chi Square Test.
Is there a relation between the languages used to watch TV & the occurrences of code-switching?
Category+category: cross tabulation
1. Press 4 for cross tabulation and hit the Enter key.
46 Hypothesis: Code-switching and language use – TV (2)
2. Choose factors for columns (50 Enter)
47 Hypothesis: Code-switching and language use – TV (3)
3. Choose factors for rows (51 Enter x3)
48 Hypothesis: Code-switching and language use – TV (4)
4. Cross tabulation
Do those who watch TV in Italian code-switch more?
49 Hypothesis: Code-switching and language use – TV (5)
5. Chi Square Test in Excel
Observed frequency of Italian/Code-switching: 45
Expected frequency of Italian/Code-switching: 32.09
Multiply the total of the observed frequencies for the first value of the independent variable (= 45) by the total of the observed frequencies for the corresponding value of the dependent variable (= 87). The product is then divided by the grand total of the frequencies (= 122).
$$\text{Exp. freq.} = \frac{87 \times 45}{122} = 32.09$$
50 Hypothesis: Code-switching and language use – TV (6)
5. Chi Square Test in Excel
51 Hypothesis: Code-switching and language use – TV (7)
5. Degrees of freedom: (3-1)*(8-1) = 2*7 = 14
0.1863 is the result of the Chi Square Test formula in Excel, which returns a p-value. It is greater than the cut-off point of p = 0.05, so the null hypothesis cannot be rejected. When working from the summed chi squared value instead, compare it with the critical value for 14 degrees of freedom at p = 0.05, which is 23.685.
Therefore: no significant relationship was found between code-switching and watching TV in other languages.
52 Hypothesis: Age and number of known languages (1)
Formulate hypothesis on age and number of known languages using scatterplot.
Is there a relation between the number of known languages & age?
Continuous+continuous: scatterplot
1. Press 6 for scatterplot and hit the Enter key.
53 Hypothesis: Age and number of known languages (2)
2. Press 1 to enter scatterplot menu and select y-axis variable (2 Enter)
54 Hypothesis: Age and number of known languages (3)
3. Select x-axis variable (12 Enter)
55 Hypothesis: Age and number of known languages (4)
4. Select standard layout and features (Enter x8)
56 Hypothesis: Age and number of known languages (5)
5. Scatterplot appears
57 Hypothesis: Age and number of known languages (6)
The line is higher at the beginning and lower at the end: negative relation.
The older the people, the higher the number of known languages? NO: with a negative correlation, the number of known languages goes down as age goes up.
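Reading the direction of the line by eye can be double-checked numerically. The sketch below fits a straight line by least squares and looks at the sign of its slope. This is an assumption made for illustration: the line drawn in the plot may be a smoother rather than a straight regression line, and the data here are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical respondents: age (x-axis) and number of known languages (y-axis)
ages = [20, 30, 40, 50, 60]
known_languages = [4, 3, 3, 2, 2]

slope, intercept = fit_line(ages, known_languages)
direction = "negative" if slope < 0 else "positive"
print(f"slope = {slope:.3f} ({direction} relation)")
```

A negative slope corresponds to a line that starts high and ends low, which is the pattern described above.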
58 To sum up
1. Descriptive analysis – Hypothesis → Null Hypothesis
- 1 continuous + 1 category type question: comparison of means
- 2 category type questions: crosstabs
- 2 continuous type questions: scatterplot
2. Inferential analysis – Null hypothesis test → Significance
- 1 continuous + 1 category type question: bi/multivariate analysis (ANOVA)
- 2 category type questions: cross tabulation + chi squared test
- 2 continuous type questions: correlation
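The summary above amounts to a small lookup table, sketched here in Python. The names `ANALYSES` and `choose_analysis` are invented for the example; the entries simply restate the pairings listed above.

```python
# (type of question A, type of question B) -> (descriptive step, inferential step)
ANALYSES = {
    ("category", "continuous"): ("comparison of means",
                                 "bi/multivariate analysis (ANOVA)"),
    ("category", "category"): ("crosstabs",
                               "cross tabulation + chi squared test"),
    ("continuous", "continuous"): ("scatterplot", "correlation"),
}


def choose_analysis(type_a, type_b):
    """Look up the descriptive and inferential analysis for two question types."""
    key = tuple(sorted((type_a, type_b)))  # the order of the two types is irrelevant
    if key not in ANALYSES:
        raise ValueError(f"no analysis listed for {type_a!r} + {type_b!r}")
    descriptive, inferential = ANALYSES[key]
    return {"descriptive": descriptive, "inferential": inferential}


print(choose_analysis("continuous", "category"))
print(choose_analysis("category", "category"))
```

Ordinal type questions are not covered by the summary, so the function raises an error for them instead of guessing.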
59 References
BLOOMER, A. & WRAY, A., 2006, Projects in Linguistics, 2nd Edition, Hodder Arnold, London and New York.
HON, K., 2013, "An Introduction to Statistics", retrievable from the World Wide Web: http://www.artofproblemsolving.com/LaTeX/Examples/statistics_firstfive.pdf
JOHNSON, D. E., 2009, "Getting off the GoldVarb Standard: Introducing Rbrul for Mixed-Effects Variable Rule Analysis", in Language and Linguistics Compass 3/1 (2009): 359–383.
TAGLIAMONTE, S. A., 2012, "Quantitative Analysis", in TAGLIAMONTE, S. A., 2012, Change, Observation, Interpretation, Wiley Blackwell, Chichester.
TAMMINGA, M., 2011, "Getting started with Rbrul", retrievable from the World Wide Web: http://www.danielezrajohnson.com/Getting_started_with_Rbrul.pdf
SUNDERLAND, J., 2010, "Research Questions in Linguistics", in Litosseliti, L. (ed.), Research Methods in Linguistics, Continuum, London and New York: 9-28.
See e-book in my blog
60 Thank you