donors choose project (1)
TRANSCRIPT
![Page 1: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/1.jpg)
Funding Education through Donors Choose
General Assembly 2016
Fernando Hidalgo
![Page 2: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/2.jpg)
Problem Description
Task: Predict whether a Donors Choose project will get funded.
Experience: Donors Choose data from September 2002 to the present.
Performance: Classification accuracy, the number of correct predictions out of all predictions made.
![Page 3: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/3.jpg)
The Data
![Page 4: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/4.jpg)
Labels
Completed: 592,757
Expired: 261,536
Class Skewness: Use the F1 score to keep recall and precision in check.
Baseline: .69
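The .69 baseline lines up with the share of completed projects in the label counts, which is the accuracy of always predicting "completed". The slide does not say how the baseline was derived, so the sketch below only shows that arithmetic under that assumption.

```python
# Class counts taken from the Labels slide.
completed = 592_757
expired = 261_536

total = completed + expired

# Accuracy of a classifier that always predicts the majority class ("completed").
# Assumption: this is how the .69 baseline was obtained; the slides do not state it.
majority_share = completed / total

print(f"total projects: {total}")
print(f"majority-class share: {majority_share:.2f}")
```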
![Page 5: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/5.jpg)
Original Features

| Feature Abbreviation | Description |
| --- | --- |
| total_price_excluding_optional_support | Total price of the project (integer, dollars) |
| students_reached | Number of students the project reaches (integer) |
| school_type | Type of school: Charter, magnet, year_round, nlns, kipp, Charter_ready_promise (categorical) |
| date_posted | Day the project was posted (categorical) |
| resource_type | Type of resources the project asks for (categorical) |
| grade_level | Grade level of the project (categorical) |
| poverty_level | Poverty level (categorical) |
| school_state | State the project was posted from (categorical) |
| Eligible_double_your_impact_match | Whether the project was eligible to be matched (categorical) |
| teacher_prefix | Prefix of the teacher posting (categorical) |
| primary_focus_area | The project's primary area of focus (categorical) |
| primary_focus_subject | The project's primary subject of focus (categorical) |
![Page 6: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/6.jpg)
Feature Engineering
| New Feature | Description |
| --- | --- |
| price_per_student | total_price / students_reached |
| project_length | Date_expiration - date_posted |
| month_posted | Extracted from date_posted |
| day_posted | Extracted from date_posted |
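The four engineered features above could be derived with pandas roughly as follows. This is a minimal sketch, not the author's code: the sample rows are made up, the `date_expiration` column name is assumed from the slide's "Date_expiration", and `day_posted` is assumed to mean day of the month.

```python
import pandas as pd

# Hypothetical sample rows; real values would come from the Donors Choose data.
projects = pd.DataFrame({
    "total_price_excluding_optional_support": [500.0, 1200.0, 300.0],
    "students_reached": [25, 100, 30],
    "date_posted": ["2015-09-01", "2016-01-15", "2016-03-10"],
    "date_expiration": ["2016-01-01", "2016-05-15", "2016-07-08"],  # assumed column name
})

# Parse the date columns so date arithmetic and the .dt accessor work.
for col in ("date_posted", "date_expiration"):
    projects[col] = pd.to_datetime(projects[col])

# price_per_student = total price / students reached
projects["price_per_student"] = (
    projects["total_price_excluding_optional_support"] / projects["students_reached"]
)

# project_length = expiration date - posting date, expressed in days
projects["project_length"] = (
    projects["date_expiration"] - projects["date_posted"]
).dt.days

# month_posted and day_posted are both extracted from date_posted.
projects["month_posted"] = projects["date_posted"].dt.month
projects["day_posted"] = projects["date_posted"].dt.day  # assumption: day of month

print(projects[["price_per_student", "project_length", "month_posted", "day_posted"]])
```

Rows where `students_reached` is zero would produce infinite values for `price_per_student`, so real data may need a guard before the division.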
![Page 7: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/7.jpg)
Visualizations
![Page 8: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/8.jpg)
![Page 9: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/9.jpg)
Rate of Projects Funded to Total Projects per Resource
![Page 10: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/10.jpg)
Rate of Projects Funded to Total Projects per Month
![Page 11: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/11.jpg)
Rate of Projects Funded to Total Projects per Grade Level
![Page 12: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/12.jpg)
Rate of Projects Funded to Total Projects per Primary Focus Area
![Page 13: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/13.jpg)
Rate of Projects Funded to Total Projects per Teacher Prefix
![Page 14: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/14.jpg)
Rate of Projects Funded to Total Projects per Poverty Level
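The "rate of projects funded to total projects" charts all rest on the same per-category calculation. A minimal pandas sketch of it, shown for resource type, follows; the rows are invented and `funding_status` is an assumed name for the label column.

```python
import pandas as pd

# Hypothetical rows; "funding_status" is an assumed name for the label column.
projects = pd.DataFrame({
    "resource_type": ["Books", "Books", "Technology", "Technology", "Supplies", "Books"],
    "funding_status": ["completed", "expired", "expired", "completed", "completed", "completed"],
})

# True when the project was funded (label "completed").
projects["funded"] = projects["funding_status"].eq("completed")

# The mean of a boolean column per group equals funded projects / total projects.
funded_rate = (
    projects.groupby("resource_type")["funded"].mean().sort_values(ascending=False)
)

print(funded_rate)
```

Swapping `resource_type` for another categorical column (month, grade level, teacher prefix, poverty level) gives the corresponding rate for that chart.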
![Page 15: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/15.jpg)
Relationship Between Project Length and Funding
![Page 16: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/16.jpg)
Relationship Between Project Price and Funding
![Page 17: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/17.jpg)
Relationship Between Price per Student and Funding
![Page 18: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/18.jpg)
Predictive Model
![Page 19: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/19.jpg)
The 3 Models:
1. AdaBoost
2. Random Forest
3. Logistic Regression
![Page 20: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/20.jpg)
GridSearch Scores
using F1 Score Metric

| Model | F1 Score | Best Parameter |
| --- | --- | --- |
| Random Forest | 0.759 | Criterion: Entropy |
| AdaBoost | 0.7676 | N_estimators: 60 |
| Logistic Regression | 0.811 | Penalty: L2 |

Simplest Model with Best Score: Logistic Regression
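A grid search over the three models with F1 as the scoring metric might look like the sketch below, using scikit-learn's `GridSearchCV`. The data is synthetic and the parameter grids are hypothetical: the slides report only the winning value for each model, not the values that were searched. Newer scikit-learn releases may spell the logistic regression penalty option differently.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data with roughly the 69/31 class split from the slides.
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.31, 0.69], random_state=0
)

# Hypothetical grids: one tuned parameter per model, matching the slide's table.
searches = {
    "Random Forest": (
        RandomForestClassifier(n_estimators=25, random_state=0),
        {"criterion": ["gini", "entropy"]},
    ),
    "AdaBoost": (
        AdaBoostClassifier(random_state=0),
        {"n_estimators": [30, 60]},
    ),
    "Logistic Regression": (
        LogisticRegression(solver="liblinear"),  # liblinear supports both l1 and l2
        {"penalty": ["l1", "l2"]},
    ),
}

results = {}
for name, (model, grid) in searches.items():
    # scoring="f1" ranks parameter settings by cross-validated F1, not accuracy.
    search = GridSearchCV(model, grid, scoring="f1", cv=3)
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)
    print(f"{name}: F1={search.best_score_:.3f}, best={search.best_params_}")
```

The scores printed here come from synthetic data and will not match the numbers in the table.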
![Page 21: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/21.jpg)
Checking Feature Significance:
Using Random Forest Classifier
The top 5 Features Seem to Have Most of the Predictive Power
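One common way to check feature significance with a random forest is to rank the fitted model's `feature_importances_`. The sketch below assumes that is what was done here; the data and feature names are synthetic placeholders, not the project's columns.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the feature names are placeholders.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=4, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = pd.Series(
    forest.feature_importances_, index=feature_names
).sort_values(ascending=False)

top5 = importances.head(5)
print(top5)
print(f"share of total importance in top 5: {top5.sum():.2f}")
```

Impurity-based importances can favor high-cardinality or continuous features, so a ranking like this is a guide rather than proof of predictive power.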
![Page 22: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/22.jpg)
Using Only the 5 Most Significant Features
1. Total_price_excluding_optional_support
2. Eligible_double_your_impact_match
3. Resource_Type_Books
4. Resource_Type_Technology
5. price_per_student
New Score with Logistic Regression:
.8171
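Names such as Resource_Type_Books suggest the categorical columns were one-hot encoded before modeling. The slides do not show that step, so the following is only a sketch of how such columns could be produced with `pandas.get_dummies`; the sample rows are made up.

```python
import pandas as pd

# Hypothetical values for the resource_type column.
projects = pd.DataFrame(
    {"resource_type": ["Books", "Technology", "Supplies", "Books"]}
)

# One indicator column per category, named <prefix>_<category>.
dummies = pd.get_dummies(projects["resource_type"], prefix="Resource_Type")

print(list(dummies.columns))
```

Each resulting column is an indicator that can be passed to a model such as logistic regression alongside the numeric features.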
![Page 23: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/23.jpg)
Overview
● Model improvement of .1271 over the baseline (.8171 vs. .69) using Logistic Regression with F1 Score.
● Most of the predictive power lies in 5 features.
● Ethical Implications:
○ The features with the most predictive power are not ones that can be changed without fabrication.
![Page 24: Donors Choose Project (1)](https://reader035.vdocument.in/reader035/viewer/2022070519/58f292b31a28ab564b8b457d/html5/thumbnails/24.jpg)
Model Improvements
Add Prescriptive Data: Project Essays, Project Materials
Use Data Based on Location: Census
Skewed Data: Find Reasons, Methods