Considering the Techniques for
Wrangling the Transportation Data
with Developing the Data Wrangling
Tool Based on Web Mashup Concept
A dissertation submitted to The University of Manchester for
the degree of Master of Science in the Faculty of Science and
Engineering
2016
Ganibek Zhakimbayev
School of Computer Science
Abstract
Transportation currently attracts considerable attention, as optimizing
transportation routes fosters economic development within local regions.
At the city level, optimizing transportation requires high-quality data. The
number of sources for this type of data has grown considerably along with the
growing capabilities of information technology. In these circumstances, people
working in the transportation field face difficulties in selecting and
preparing suitable datasets, which are frequently large and stored in formats
that are not convenient for analysis. Hence, one of the essential activities is Data
Wrangling, which is the preparation of data in a form suitable for subsequent use.
The project investigates the Data Wrangling techniques that could be
suitable for transportation data. The primary activity for performing this task is the
implementation of a Traffic Data Wrangling Tool prototype. This prototype is
implemented using the Web Mashup concept, which implies combining a variety of data
sources and development tools to obtain an integrated result that
has advantages over using the data and tools separately. After
consideration of the project background and suitable sources of data, the prototype was
designed and implemented as a web application built with open source tools
such as the Django web framework, the Python pandas library for scientific data analysis,
and the D3 JavaScript library for visual representation of data.
The objectives of the implementation were achieved. The experience
demonstrated the effectiveness of mashing up traffic data with open data from
online providers. Combining the strengths of different tools that focus on various aspects
makes it possible to use a domain-specific set of Data Wrangling functions in
one place, including cleaning, transformation, and visualization of the datasets.