
A FRAMEWORK FOR DEMOCRATIZING AI

A PREPRINT

Shakeel Ahmed, Ravi S. Mula, Soma S. Dhavala∗
{shakkeel, ravi, soma}@mlsquare.org

January 6, 2020

ABSTRACT

Machine Learning and Artificial Intelligence are considered an integral part of the Fourth Industrial Revolution. Their impact, and far-reaching consequences, while acknowledged, are yet to be comprehended. These technologies are highly specialized, and only a few organizations and select highly trained professionals have the wherewithal, in terms of money, manpower, and might, to chart the future. However, such concentration of power can lead to marginalization, causing severe inequalities. Regulatory agencies and governments across the globe are creating national policies and laws around these technologies to protect the rights of digital citizens, as well as to empower them. Private and not-for-profit organizations are also contributing to democratizing these technologies by making them accessible and affordable. However, accessibility and affordability are only two of the many facets of democratizing the field; others include portability, explainability, credibility, and fairness. As one can imagine, democratizing AI is a multi-faceted problem, and it requires advancements in science, technology, and policy. At mlsquare, we are developing scientific tools in this space. Specifically, we introduce an opinionated, extensible Python framework that provides a single point of interface to a variety of solutions in each of the categories mentioned above. We present the design details and APIs of the framework, reference implementations, a road map for development, and guidelines for contributions.

1 Introduction

In the 21st century, we take for granted how to harness the potential of electricity. It may seem incomprehensible that, in the early days, specialist engineers had to be hired to operate electric generators in situ at homes just to run light bulbs. It required a coming together of great minds, ripe business opportunities, technological breakthroughs, and an abundance of patience to turn electricity into what we now consider a mere utility. While Machine Learning (ML) and Artificial Intelligence (AI) have been around for a long time, with roots in disciplines such as Statistics, Computer Science, Neuroscience, Physics, and Philosophy, the ecosystem now seems ripe for deriving utilitarian value out of them. We are already witnessing their proven utility in Information Retrieval and e-commerce, to name a few. But this is just the tip of the iceberg. They are part of a much larger, sweeping Fourth Industrial Revolution [1], whose ramifications are too significant to ignore, and too complex to fathom. As it stands today, AI† is a specialized field, and only a select few have the expertise required to shape it for the benefit of society at large. A plausible solution is to democratize the very creation, distribution, and consumption of the technology. A concerted cooperation between public and private entities in creating both scientific and technological tools is necessary. Governments are reacting by creating national policies and task forces [2], and new laws are being formed to protect the rights of data citizens [3, 4]. Many institutions, both academic and private [5, 6, 7, 8, 9], are addressing this problem by investing heavily in advancing the science and technology. Below, we take a cursory look at some desirable attributes of AI that make it responsible and responsive.

• Portable: Separate the creation of AI from its consumption. Java transformed enterprise software by making it portable across platforms – code once written can be run everywhere. Likewise, intelligence (encoded in the form of models), once created, shall run anywhere.

∗corresponding author



Some existing approaches port models to the PMML [10] format, or compile them to the ONNX [11] format, as implemented in WinML [12]. Another approach is to embrace write-once, scale-anywhere paradigms like Spark [13] and Julia [14]. Our approach is discussed in Section 2.1.

• Explainable†: Provide plausible explanations, along with predictions. When decisions are made by AI agents in high-stakes environments, such as those encountered in FinTech, EdTech, and Healthcare, it is imperative that a human being be able to interpret those decisions in the context of the situation, and explain how the agent came to that decision [15, 16]. Tools such as LIME [17], CORELS [18], DoWhy [19], and SHAP (Shapley values) [20] are some practical approaches to providing explanations. For an overview, see [21]. Our approach is outlined in Section 2.2.

• Credible: Provide coverage intervals, and other uncertainty quantification metrics. Statistical software such as SAS®, STATA®, and many libraries in R [22] provide uncertainty estimates along with predictions, in terms of standard errors and p-values. Bayesian counterparts produce credible intervals. The Statistics community places enormous importance on producing such inference summaries. Unfortunately, ML tools rarely provide such reports, as they are primarily concerned with prediction tasks alone. This problem is exacerbated in the Deep Learning space, where models reportedly can make confident mistakes [23]. How can we make AI credible by not only making predictions, but also reporting coverage intervals and other uncertainty estimates around those predictions? For a curated list of papers and tools, see [24].

• Fair: Make AI bias-free and equitable. It is said that what we eat is what we are, and what we think is what we become. AI is no different: what is predicted is based on what data the AI systems are fed. As a result, AI systems can make biased, unethical decisions, putting some sections of the community at risk. How can AI be made fair and equitable? Recent, ongoing work by IBM provides some alternatives for detecting bias and acting on it [25].

• Decentralized and Distributable: Let models go where the data is, instead of the other way round. All internet-first commercial players rely on user-generated data to provide hyper-personalized products. Although such products provide tangible benefits to the end users, they come at the cost of risking private data and security threats. A decentralized system wherein the data remains within the users' premises can potentially avoid such privacy issues. This pivot has led to the popularization of techniques such as Secure Multiparty Computation (SMPC) and Federated Learning. SMPC restricts exposing model weights, while Federated Learning takes care of securely distributing the training process to multiple worker nodes. Frameworks like PySyft [26] make SMPC and Federated Learning intuitive and accessible to machine learning developers. Decentralizing the distribution channel of data, and of intelligence, can mitigate some risks of centralized control.

• Declarative: Say what the model needs and what it should do; do not worry about how it should go about doing it. An important aspect of ML model development is getting the right data, and model management. Many practitioners estimate that more than 80% of developer time goes into ETL (Extract, Transform, Load) jobs. However, developers should only be concerned with what they need, not how they should get the data. This problem has largely been addressed by the databases world with standards like SQL. Many databases, including NoSQL databases, support SQL for querying and retrieving data. Project Morpheus [27] certainly looks very interesting, as it makes graph and tabular data seamlessly interoperable. It is a practical alternative to associative arrays that attempt to unify RDBMS, Graph, and NoSQL databases [28]. That means a Data Scientist can leverage SQL/OpenCypher to ask for what they need and offload the "getting" part to the database's compute engines. While such open standards are not available for model specification, the workhorse 'lm' package in R allows models to be specified in the formula syntax, and SAS's language certainly looks like the SQL analogue for model specification. Can we make SQL/OpenCypher the de facto declarative language for ETL, and develop a standard to declaratively specify the entire pipeline, including model specification and data flow specification? For an outline of the proposal, see [29].

• Reproducible: Reproduce any result, on demand. Model building and Data Science are experimental in nature. Unless carefully managed, reproducing the results is next to impossible, particularly when working in distributed environments. At the very least, all results shall be reproducible. This is possible by ensuring that 1) data, 2) models, and 3) runtime environments are versioned. An open-source ML-as-a-Service platform, daggit [30], promises to handle this responsibility by using DVC [31], git [32], and docker [33] for versioning data, models, and runtimes, respectively.

The above list is neither comprehensive nor exhaustive. There are many other challenges to overcome, such as making AI systems scalable, auditable, actionable, governable, and discoverable, among others. But, as it appears, solutions are fragmented, and a holistic viewpoint is missing.

†AI & ML, Explainability & Interpretability, and Models & Algorithms are used interchangeably for the sake of simplicity.


At mlsquare, we are developing a single point of interface to bring several solutions together. In the next sections, we focus on the specific solutions we are building.

2 Democratizing AI at mlsquare

We provide an extensible Python framework that incorporates the above-mentioned tools where possible, and innovates where necessary. The following are the design goals:

• Bring Your Own Spec First: Use existing, well-documented APIs, such as those in sklearn for ML. Only provide implementations or extend the functionality where lacking.

• Bring Your Own Experience First: Minimal changes shall be introduced in the development workflow. For example, with just one additional line of code added to standard model fitting in sklearn, the resulting model can run on a GPU.

• Consistent: All the quality attributes of AI described earlier shall become first-class methods of a model object with a consistent interface. It should not be necessary to stitch together different, possibly incompatible, functionalities.

• Compositional: Deep Learning, when looked at from a technology perspective, is a lego-block computational framework for composing models. It can support inputs and outputs of varying shape, size, and nature, and it allows many models to be expressed compositionally, without having to implement every piece from scratch. This lego-block architecture gives a lot of expressive power to the model designer. With backing from all major hardware vendors and framework developers, hardware acceleration is an added benefit.

• Modular: Thanks to the inherent modularity of Deep Learning technology, exploit the object-orientedness of many algorithms. For example, when developing a Decision Tree equivalent as a Deep Neural Network (DNN) [34], write the Decision Tree as a DNN layer. Then one can immediately use it in either a classification or a regression task, and one can even develop a Kernel Decision Tree. This modularization amplifies the expressiveness.

• Extensible: All the provided implementations shall be extensible, and default behaviours can be overridden if one chooses to.

In the following subsections, we describe our approach to providing portability and explainability.

2.1 Portability

As ML is practiced today, a Data Scientist or an ML Scientist writes the models in a particular language (e.g., Python), and a Production Engineer rewrites them in a different language or reimplements them in a different tech stack – scalability being the primary concern. This is known as the two-language problem. One way to deal with the two-language problem is to unify the tech stack. Languages like Julia claim that the same piece of code can scale from a laptop to a cluster. Frameworks like Apache Spark also support multiple languages, and scale from a single-CPU system to a cluster. But a developer is then tied to a single ecosystem. Another way to achieve portability is to have an intermediate representation of the models, such as PMML and ONNX. Many Deep Learning frameworks, such as TensorFlow [35], PyTorch [36], and MXNet [37], support ONNX. Of interest is WinML, which allows saving sklearn and xgboost [38] models in ONNX. It does so by reimplementing the models in a target language like TensorFlow. While such exact reimplementations might provide high-fidelity portings, extending and supporting the models can be tedious and time consuming, if not impossible. Rather than providing such one-to-one, operator-level mappings of models, we suggest a more generic semantic mapping of models as an efficient alternative. We utilize three different approaches that enable this transpilation process, each with its own advantages and disadvantages. We focus on supervised tasks.

1. Exact Semantic Map: When an exact equivalence exists between a user-supplied model (referred to as the primal model) and its neural network counterpart (referred to as the proxy model), we can initialize the weights of the neural network with the parameters of the fitted model, and simply save the model (see the first sketch following this list). This can be achieved for many classical models that fall under the Generalized Linear Models umbrella. An advantage of this approach is that the proxy model is faithful to the primal model.

2. Approximate Semantic Map: In this approach, the proxy model is trained on the same data as the primal model, but its target labels are the predictions from the primal model (see the second sketch following this list). Further, the proxy model's architecture is chosen so as to closely fulfill the primal model's intent. Together, these ensure a better semantic approximation. In some ways, the capacity of the proxy model is constrained by the primal model's capacity. In Section 5, we compare the


performance of such transpiled models. On the theoretical side, a Probably Approximately Faithful (PAF) framework has to be developed.

3. Universal Map: In the third approach, both the intent and the implementation are delegated to the proxy model. In some ways, neural networks are used as generic black-box function approximators. This may be useful when the primal models cannot be fit on large data sets, or when there is no theoretical backing for a semantic map. Say a user is interested in training a sklearn Logistic Regression model on a 1TB dataset. This would be extremely hard with existing sklearn implementations. On the other hand, a proxy model can easily be scaled with the underlying hardware. The downside is that fidelity between the primal and proxy models may not be preserved.
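The following minimal sketch illustrates the exact semantic map idea for a logistic regression; it is an illustration under simplifying assumptions (a toy dataset, a single dense layer with a sigmoid activation) and not the mlsquare implementation itself.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from tensorflow import keras

X, y = load_iris(return_X_y=True)
y_bin = (y == 0).astype(int)  # reduce to a binary task for simplicity

# Primal model: a standard sklearn logistic regression.
primal = LogisticRegression(max_iter=1000).fit(X, y_bin)

# Proxy model: a single dense unit with a sigmoid activation is, semantically,
# a logistic regression.
proxy = keras.Sequential([
    keras.Input(shape=(X.shape[1],)),
    keras.layers.Dense(1, activation="sigmoid"),
])
proxy.compile(optimizer="adam", loss="binary_crossentropy")

# Exact semantic map: transfer the fitted parameters (kernel <- coef_, bias <- intercept_).
proxy.layers[-1].set_weights([primal.coef_.reshape(-1, 1), primal.intercept_])

# The proxy now reproduces the primal model's probabilities up to float precision,
# and can be saved, exported to ONNX, or fine-tuned further.
print(np.allclose(primal.predict_proba(X)[:, 1],
                  proxy.predict(X, verbose=0).ravel(), atol=1e-5))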
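The approximate semantic map can be sketched in a similar, distillation-like manner: the proxy is fit to the primal model's predictions rather than the original targets. Again, the dataset, architecture, and training budget below are illustrative assumptions, not mlsquare's defaults.

from sklearn.datasets import load_diabetes
from sklearn.tree import DecisionTreeRegressor
from tensorflow import keras

X, y = load_diabetes(return_X_y=True)

# Primal model: a shallow decision tree regressor.
primal = DecisionTreeRegressor(max_depth=4).fit(X, y)

# Proxy model: a small MLP whose architecture is chosen to mirror the primal intent.
proxy = keras.Sequential([
    keras.Input(shape=(X.shape[1],)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),
])
proxy.compile(optimizer="adam", loss="mse")

# Approximate semantic map: fit the proxy to the primal model's outputs, so its
# effective capacity is constrained by the primal model.
proxy.fit(X, primal.predict(X), epochs=50, verbose=0)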

2.2 Explainability

Explainability of black-box machine learning models is extremely critical while developing machine learning applications. Typically, explainability is focused on explaining what a model did or how a model made its predictions. While such explanations are certainly useful for a Data Scientist, they are not really the explanations that an end user needs. Instead, the explanations shall be about the predictions, and about what an end user can do about them. We approach explainability from this perspective. In particular, we define, at an abstract level, explanations as predicates in a First Order Logic system, represented in Conjunctive Normal Form (CNF). A proxy model shall produce such predicates as its output. In that light, we see producing explanations as a synthetic language generation problem.

The explain method provides explanations of a model's output. explain is model-agnostic and provides interpretations as a decision tree path. We use recurrent neural networks (RNNs) trained on the outputs of a localized decision tree for the given training dataset. When fed with a new, unseen data point, this RNN outputs the corresponding decision tree path traversed. This path gives the decision taken by the model at each feature and acts as an interpretation; a simplified sketch of the underlying surrogate-tree idea follows. explain is ongoing research and is only available as a prototype.
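The sketch below illustrates only the surrogate-tree idea that explain builds on: a local decision tree is fit to the black-box model's outputs, and the conjunction of threshold predicates along a query point's decision path is read off as an explanation. The actual explain prototype goes further and trains an RNN to emit such paths; the helper function, dataset, and tree depth below are illustrative assumptions.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
black_box = LogisticRegression(max_iter=1000).fit(X, y)

# Surrogate tree trained on the black-box model's outputs, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3).fit(X, black_box.predict(X))

def path_predicates(tree, x):
    """Return the conjunction of threshold predicates along x's decision path."""
    t = tree.tree_
    node_ids = tree.decision_path(x.reshape(1, -1)).indices
    preds = []
    for node in node_ids:
        if t.children_left[node] == -1:      # leaf node, no test to report
            continue
        feat, thr = t.feature[node], t.threshold[node]
        op = "<=" if x[feat] <= thr else ">"
        preds.append(f"x[{feat}] {op} {thr:.2f}")
    return " AND ".join(preds)

print(path_predicates(surrogate, X[0]))      # e.g. "x[3] <= 0.80"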

3 Specifics of the Framework

The mlsquare framework utilizes multiple extensible components, as shown in Fig. 1. Portability in the framework is achieved with a single line of additional code. The user is expected to pass their standard machine learning model (the primal model) to the dope function. dope then handles the responsibility of detecting and triggering the corresponding neural network architecture with the help of adapters and optimizers. Internally, a neural network architecture search (NAS) occurs, and dope returns the optimal neural network model. The Python object returned is wrapped in such a way that its interface remains identical to that of the primal model sent to dope. Below, we outline the components that act as the foundation for dope'ing.

3.1 Dope

dope accepts the primal model as an argument (see Fig. 2). Additional model configuration can be passed as keyword arguments. This function handles the responsibility of mapping the primal model to its neural network equivalent. Once the right architectural parameters are obtained, dope delegates the responsibility of wrapping the neural network model with the characteristics of the primal model to adapters. The adapter returns a Python object that behaves like the primal model but, under the hood, utilizes neural networks. As shown in Fig. 1, dope is assisted by two other components – registry and adapters.

3.2 Adapter

Adapters act as a connector between the primal model and the proxy model (i.e., the neural network equivalent). We currently support Keras [39] as the backend for our proxy models and sklearn for primal models. Adapters provide methods (e.g., fit, score, predict) that extend the proxy model's APIs. When the fit method is invoked, it triggers the Optimizer. The Optimizer receives the model configuration and returns a trained model. The Adapter then returns the trained model to dope. The save method exports the trained model as Keras and ONNX models. The Adapter module can hold multiple Adapter classes, one for each type of "primal-proxy" mapping. For example, we currently have two classes in Adapters – SklearnKerasClassifier and SklearnKerasRegressor. A sklearn classifier model that needs to be mapped to a Keras model can utilize the SklearnKerasClassifier adapter. New adapters can be written for interacting with other scientific libraries; a schematic sketch of such an adapter is shown below.
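The schematic class below illustrates the adapter contract described above. The class, method, and attribute names are simplified assumptions for exposition only and do not reproduce mlsquare's actual source.

class SklearnKerasClassifierSketch:
    """Schematic adapter: sklearn-style surface, Keras proxy underneath."""

    def __init__(self, proxy_model, primal_model, optimizer):
        self.proxy_model = proxy_model      # Keras model produced from the registry mapping
        self.primal_model = primal_model    # the original sklearn estimator
        self.optimizer = optimizer          # NAS/tuning backend (e.g., Tune)

    def fit(self, X, y, **kwargs):
        # Delegate architecture/hyperparameter search to the optimizer; keep the best model.
        self.proxy_model = self.optimizer.search(self.proxy_model, X, y, **kwargs)
        return self

    def predict(self, X):
        return self.proxy_model.predict(X).argmax(axis=-1)

    def score(self, X, y):
        return self.proxy_model.evaluate(X, y, verbose=0)

    def save(self, filename):
        # Export a native Keras file; an ONNX export would go through a converter here.
        self.proxy_model.save(filename + ".h5")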


Figure 1: Components in the Framework

Figure 2: Converting a Standard ML Model using dope


3.3 Optimizer

The neural network architecture of the proxy model can depend on the dataset it is being trained on. Hence, we pass the proxy model through a neural architecture search (NAS) process. We use Tune [40], based on Ray [41], for our models' NAS process. The optimizer component searches for the best model and returns it to the adapter. The configuration of Tune can be edited or reconfigured by passing parameters via the fit method of the proxy model; a hedged sketch of this kind of search is given below.
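The snippet below is a sketch of the kind of search the optimizer component delegates to Tune, written against the classic Tune function API (tune.run with a grid search). The toy data, architecture, and search space are illustrative assumptions; this is generic Tune plus Keras usage, not mlsquare's internal code.

import numpy as np
from ray import tune
from tensorflow import keras

def train_proxy(config):
    # Toy data; in mlsquare this would be the user's training set passed through fit().
    rng = np.random.default_rng(0)
    x, y = rng.normal(size=(256, 10)), rng.integers(0, 2, size=256)
    model = keras.Sequential([
        keras.Input(shape=(10,)),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=config["optimizer"], loss="binary_crossentropy")
    history = model.fit(x, y, validation_split=0.2, epochs=5, verbose=0)
    tune.report(val_loss=history.history["val_loss"][-1])

# Grid search over the optimizer choice, mirroring the params dict shown in Listing 2.
analysis = tune.run(train_proxy, config={"optimizer": tune.grid_search(["adam", "nadam"])})
print(analysis.get_best_config(metric="val_loss", mode="min"))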

3.4 Registry

The registry object, as the name suggests, maintains a registry of models supported by mlsquare. registry is a Python dictionary-like object which is initialized while importing the mlsquare library. It expects a tuple of length two as the key – the name of the algorithm and the package name. registry then returns the corresponding proxy model and the adapter. registry can also be used to register a model with the help of a decorator – just add the @registry.register decorator to the model's class definition. The model should contain the adapter, the model name, and the primal model's module name. Then, upon calling dope, a corresponding NAS can be triggered. The toy example below illustrates the mechanism.
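The self-contained toy below illustrates only the mechanism just described – a dictionary keyed by a (package name, algorithm name) tuple, populated through a register decorator. It is an illustration of the idea, not mlsquare's actual registry class, and the attribute names are assumptions.

class Registry:
    def __init__(self):
        self._models = {}                    # {(module_name, name): (proxy_cls, adapter)}

    def register(self, cls):
        # Used as a decorator on a proxy-model class; returns the class unchanged.
        self._models[(cls.module_name, cls.name)] = (cls, cls.adapter)
        return cls

    def __getitem__(self, key):
        return self._models[key]

registry = Registry()

@registry.register
class RidgeProxy:
    module_name = "sklearn"                  # package of the primal model
    name = "Ridge"                           # algorithm name
    adapter = "SklearnKerasRegressor"        # adapter used to wrap the proxy

proxy_cls, adapter = registry[("sklearn", "Ridge")]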

3.5 Architectures

The architectures module in the mlsquare library provides the semantic mappings of the primal models. Each mapping is a Python class with various attributes that define the mapping. These attributes include the corresponding adapter of the model, the module name of the primal model, the name of the algorithm, and the model parameters required to initialize the neural network. To keep track of multiple versions of a given proxy model, an additional attribute called version is supported. There are three levels of class declarations involved in creating a model mapping. BaseClass is the topmost abstract base class (ABC). This class ensures that the basic methods expected for the functioning of a proxy model are not left undeclared. The next level is a generic parent class from which multiple algorithms can inherit common attributes. For example, GeneralizedLinearModel is used as a parent class for LinearRegression, LogisticRegression, RidgeRegression, etc. The final level is the declaration of the proxy model itself as a class. This level includes declaring the key attributes required to create the model.

Taking a closer look at Fig. 3 reveals how the modular aspects of neural networks are utilized in mlsquare to accomplish semantic mappings. Tweaking the activation and loss function set in the model parameters of a proxy model that inherits from GeneralizedLinearModel yields a range of different models, as sketched below. This provides a lot of flexibility for developers in deciding how a model mapping should be declared.
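A toy version of this three-level hierarchy is sketched below (an abstract base class, a generalized linear parent, and concrete proxies that only override the activation/loss pair). The class and attribute names are illustrative assumptions, not mlsquare's exact declarations.

from abc import ABC, abstractmethod
from tensorflow import keras

class BaseClassSketch(ABC):
    @abstractmethod
    def create_model(self, input_dim):
        ...

class GeneralizedLinearModelSketch(BaseClassSketch):
    activation = "linear"        # identity link
    loss = "mse"                 # squared error

    def create_model(self, input_dim):
        model = keras.Sequential([
            keras.Input(shape=(input_dim,)),
            keras.layers.Dense(1, activation=self.activation),
        ])
        model.compile(optimizer="adam", loss=self.loss)
        return model

# Changing only the activation/loss pair yields different members of the GLM family.
class LinearRegressionProxy(GeneralizedLinearModelSketch):
    pass

class LogisticRegressionProxy(GeneralizedLinearModelSketch):
    activation = "sigmoid"
    loss = "binary_crossentropy"

# Example: LogisticRegressionProxy().create_model(input_dim=4) builds a sigmoid proxy.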

4 Workflow

Figure 4 shows a typical user workflow when utilizing mlsquare's dope functionality. Such a workflow involves three touchpoints with dope while transpiling the model:

1. import the dope function

2. pass the primal model to the dope function

3. use the transpiled model to train, score, predict or save it for future use

The following subsections provide details on the different methods available on the transpiled proxy model.

4.1 Developer Experience

The code snippet below demonstrates a typical workflow of training and scoring a Sklearn Linear Regression model.

1 from mlsquare import dope
2 from sklearn.linear_model import LinearRegression
3 from sklearn.preprocessing import StandardScaler
4 from sklearn.model_selection import train_test_split
5 import pandas as pd
6 from sklearn.datasets import load_diabetes
7
8 model = LinearRegression()
9 diabetes = load_diabetes()
10
11 X = diabetes.data
12 sc = StandardScaler()
13 X = sc.fit_transform(X)
14 Y = diabetes.target
15
16 x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.60, random_state=0)
17
18 m = dope(model)
19
20 m.fit(x_train, y_train)
21 m.score(x_test, y_test)

Listing 1: Usage of dope

Figure 3: UML Representation of architecture

Figure 4: Workflow representation

As mentioned earlier, adding one additional line of code (line no. 18) converts the sklearn model into a neural network. The object returned by dope() can then be used like any other sklearn model. The API interface remains consistent, with the only difference being that the training and scoring are performed on a neural network. The capabilities of this extended model are explained in detail in the following sections.

4.1.1 fit() method

Shown below is a code snippet for calling the fit() method.

1 # Train with default mappings
2 m.fit(x_train, y_train)
3
4 # Train with additional parameters
5 # tune, from the ray package, is the NAS backend used by mlsquare
6 from ray import tune
7 # Define the hyperparameters you wish to customize
8 params = {'optimizer': {'grid_search': ['adam', 'nadam']}}
9 m.fit(x_train, y_train, params=params)

Listing 2: Customizing Neural Architecture Search

When the fit method is called, a neural architecture search is performed. If no additional parameters are given (line no. 2), the default mappings are used to create and train the neural network. dope provides the option to customize the hyperparameters used for the neural architecture search (lines no. 8 and 9). In the above example, we customize the "optimizer" of the neural network. This, in turn, performs a grid search over the adam and nadam optimizers and returns the best-performing model.

4.1.2 score() and predict() methods

The dope’d model contains both the score and predict methods like a sklearn model.

# Scoring your models
m.score(x_test, y_test)
# Sample output
# 90/90 [==============================] - 0s 244us/step
# [0.5179819915029737, 0.911111111442248]

# Predict outputs
y_pred = m.predict(x_test)

Listing 3: Score and Predict methods

In the above example, the score method returns a list with two values - loss and accuracy. The predict method returns an array of predicted values.

4.1.3 Saving dope’d model

The save method available on the dope'd model allows exporting the trained model in two formats - ONNX and HDF5. The method expects a file name while saving the model.


# Save method expects a file name as an argument
m.save(filename="my_model")

Listing 4: Saving a model

The ONNX file can then be loaded, converted, or used directly in different runtimes. For instance, we were able to score a sklearn model in the Chrome browser; a minimal consumption sketch follows.
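As an illustration of that portability, the sketch below loads the exported graph with onnxruntime in Python; the same file could equally be served from a browser ONNX runtime. The file name "my_model.onnx" and the x_test array are carried over from the earlier listings and are assumptions about the export, and the input name is read from the session rather than hard-coded.

import numpy as np
import onnxruntime as ort

# Load the exported graph and run inference on the held-out features.
sess = ort.InferenceSession("my_model.onnx")
input_name = sess.get_inputs()[0].name
preds = sess.run(None, {input_name: x_test.astype(np.float32)})[0]
print(preds[:5])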

5 Results

The framework currently contains a collection of seven sklearn models mapped to their neural network equivalents. Table 1 shows the performance comparison for each of these models. The comparisons are done by training and scoring each algorithm on a relevant dataset, using both the standard sklearn model and its transpiled (dope'd) equivalent.

Performance comparison

Algorithm                   Dataset        dope (acc/r2)   sklearn (acc/r2)
Logistic Regression         Iris           0.84            0.82
Linear Regression           UCI Diabetes   0.42            0.40
Ridge Regression            UCI Diabetes   0.44            0.42
Lasso Regression            UCI Diabetes   0.43            0.43
ElasticNet Regression       UCI Diabetes   0.44            0.44
Linear SVC                  Iris           0.84            0.91
Decision Tree Classifier    Iris           0.96            0.93

Table 1: Comparison of dope'd models vs. standard sklearn models

The datasets chosen were UCI Iris for the classification algorithms and UCI Diabetes [42] for the regression algorithms. The performance metrics used were accuracy for classification and the r2 score for regression, respectively. The results in Table 1 clearly show that the performance of models transpiled by dope is on par with that of their sklearn counterparts.

A more exhaustive collection of trial runs and their corresponding results can be found at this link. This spreadsheet contains performance and configuration details for each new algorithm added to mlsquare. Once a new algorithm is added, a Python script trains and scores the algorithm on the relevant datasets. Making the results openly viewable helps us maintain the accountability and quality of newly added algorithms.

6 Roadmap

Table 2 depicts the available features, ongoing work, and future research on the mlsquare framework. One of the goals of building this framework is to propose and demonstrate the feasibility of a machine learning framework that can ease the process of democratizing AI, utilizing existing tools with very minimal changes. The mlsquare framework currently supports the portability feature (the .save() method). It can convert the above-mentioned sklearn models to a neural network model. We used Keras as our primary framework for creating the transpiled neural networks. So far our focus has been on developing the right architecture that can support a wide range of existing frameworks, and on prototyping different custom features like save(), explain(), nas(), and so forth.

The following list provides a gist of our ongoing research:

1. Extending save() to XgBoost, IRT [43], Surprise [44] and DeepCTR [45]: Supporting these widely used frameworks will help us make deep learning techniques more accessible to a broader audience.

2. Adding support for Deep Random Trails (DeRT) in explain(): DeRT is an in-house explainability technique. We are currently working on onboarding this tool into the explain method.

3. Adding support for PyTorch: As mentioned earlier, we currently use Keras to define the neural networks. We will soon be extending support to PyTorch as well.

4. Extending neural architecture search capabilities with AutoKeras[46].

5. Bringing uncertainty quantification as a first class method on a model object.


Framework progress

mlsquare method   Supported module   Algorithm                       Status
.save()           sklearn            Linear Regression               Available
.save()           sklearn            Ridge Regression                Available
.save()           sklearn            Lasso                           Available
.save()           sklearn            ElasticNet                      Available
.save()           sklearn            Support Vector Machines         Available
.save()           sklearn            Decision Tree Classifier        Available
.save()           sklearn            Linear Discriminant Analysis    Ongoing/Planned
.save()           sklearn            Decision Tree Regressor         Ongoing/Planned
.save()           sklearn            Multi-task Lasso                Ongoing/Planned
.save()           sklearn            Multi-task ElasticNet           Ongoing/Planned
.save()           sklearn            Passive Aggressive Classifier   Ongoing/Planned
.save()           sklearn            Robust Regression               Ongoing/Planned
.save()           IRT                                                Ongoing/Planned
.save()           DeepCTR                                            Ongoing/Planned
.save()           XgBoost                                            Ongoing/Planned
.save()           Surprise                                           Ongoing/Planned
.explain()                           DeRT                            Ongoing/Planned
.explain()                           Interpret                       Ongoing/Planned
.explain()                           Shapley                         Ongoing/Planned
.nas()                               Tune                            Available
.nas()                               AutoKeras                       Ongoing/Planned
.ci()                                                                Ongoing/Planned

Table 2: Progress of mlsquare features

Contributing

mlsquare is an open-source framework built by committed and enthusiastic volunteers. We strongly believe in the power of community-driven solutions. If you find mlsquare's mission exciting, you can contribute to the framework in the following ways:

1. Refer to this link to add new algorithms to dope that you think might be useful to other developers.

2. Please reach out to us at [email protected] if you are interested in any of the ongoing research or yet-to-begin projects, marked as Ongoing/Planned in Table 2.

3. The spreadsheet mentioned in Section 5 is an internal tool used by mlsquare to keep track of the performance of every algorithm available in the framework. We would like to add more datasets and create an openly accessible API to execute training and inference sessions for anyone interested in contributing. If you know of an interesting dataset or are interested in extending this tool, please let us know by dropping an email at [email protected].

4. We are open to contributions at all levels (from documentation to architectural design inputs). To suggest changes to the framework, you can raise an issue on our GitHub repository.

7 Conclusion

In this paper, the need for democratizing AI, and several of its facets, are introduced. An extensible Python framework is proposed towards that end. Details of the different components of the framework and their responsibilities are discussed. The framework currently provides support for porting a subset of sklearn models to their approximately faithful Deep Neural Network counterparts, represented in ONNX format. An important take-away is that Deep Learning architectures can be constrained to reflect well-understood modeling paradigms; this debunks a common misconception that Deep Learning requires big data. In order to democratize the very creation of this framework, and to bring transparency into the process, we have created a leader board to assess the quality metrics of the semantic maps. Several enhancements and extensions are in the pipeline, and a process is being put in place to allow the community to contribute.


References

[1] Klaus Schwab. The Fourth Industrial Revolution. Crown Publishing Group, New York, NY, USA, 2017.

[2] NITI Ayog. National Strategy For Artificial Intelligence. http://bit.ly/36haUIy, 2018.

[3] GDPR. https://en.wikipedia.org/wiki/General_Data_Protection_Regulation.

[4] US Congress. FUTURE of Artificial Intelligence Act of 2017. https://www.congress.gov/bill/115th-congress/house-bill/4625/text.

[5] MIT Technology Review. MIT has just announced a $1 Billion plan to build a new college for AI. http://bit.ly/2QeODFN, 2019.

[6] OpenAI. https://openai.com/.

[7] Allen Institute for Artificial Intelligence. https://allenai.org/.

[8] Google. AI for Everyone. https://www.h2o.ai/democratizing-ai/, 2019.

[9] h2o.ai. Machine Learning for all: the democratizing of a technology. https://www.h2o.ai/democratizing-ai/, 2019.

[10] R. Grossman, S. Bailey, A. Ramu, B. Malhi, P. Hallstrom, I. Pulleyn, and Xiao Qin. The management and mining of multiple predictive models using the predictive modeling markup language. Information and Software Technology, 41(9):589 – 595, 1999.

[11] Junjie Bai, Fang Lu, Ke Zhang, et al. Onnx: Open neural network exchange. https://github.com/onnx/onnx, 2019.

[12] Microsoft. Windows ML. https://github.com/microsoft/Windows-Machine-Learning.

[13] Matei Zaharia, Reynold S. Xin, Patrick Wendell, Tathagata Das, Michael Armbrust, Ankur Dave, Xiangrui Meng, Josh Rosen, Shivaram Venkataraman, Michael J. Franklin, Ali Ghodsi, Joseph Gonzalez, Scott Shenker, and Ion Stoica. Apache Spark: A unified engine for big data processing. Communications of the ACM, 59(11):56–65, 2016.

[14] Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B Shah. Julia: A fresh approach to numerical computing. SIAMreview, 59(1):65–98, 2017.

[15] Soma S. Dhavala. AI for education. The Blue Dot Magazine, 9:47–49, 2018.

[16] Cynthia Rudin. Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead. arXiv e-prints, 11 2018.

[17] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. arXiv e-prints, 02 2016.

[18] Elaine Angelino, Nicholas Larus-Stone, Daniel Alabi, Margo Seltzer, and Cynthia Rudin. Learning Certifiably Optimal Rule Lists for Categorical Data. arXiv e-prints, 04 2017.

[19] DoWhy: A Python package for causal inference. https://github.com/microsoft/dowhy, 2019.

[20] Scott M Lundberg, Gabriel Erion, Hugh Chen, Alex DeGrave, Jordan M Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, Nisha Bansal, and Su-In Lee. Explainable AI for trees: From local explanations to global understanding. arXiv preprint arXiv:1905.04610, 2019.

[21] Molnar C. Interpretable Machine Learning: A Guide For Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/.

[22] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna,Austria, 2013.

[23] Anh Nguyen, Jason Yosinski, and Jeff Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on, 2015.

[24] Uncertainty Quantification in Deep Learning. Literature survey, Paper reviews, Experimental Setups. https://github.com/ahmedmalaa/deep-learning-uncertainty, 2019.

[25] Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovic, Seema Nagar, Karthikeyan Natesan Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, and Yunfeng Zhang. AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. https://arxiv.org/abs/1810.01943, October 2018.

[26] Theo Ryffel, Andrew Trask, Morten Dahl, Bobby Wagner, Jason Mancuso, Daniel Rueckert, and Jonathan Passerat-Palmbach.A generic framework for privacy preserving deep learning. arXiv e-prints, 11 2018.

[27] OpenCypher. Morpheus: Bringing Cypher to Apache Spark. https://github.com/opencypher/morpheus.

[28] Jeremy Kepner, William Arcand, William Bergeron, Nadya Bliss, Robert Bond, Chansup Byun, Gary Condon, Kenneth Gregson, Matthew Hubbell, Jonathan Kurz, Andrew McCabe, Peter Michaleas, Andrew Prout, Albert Reuther, Antonio Rosa, and Charles Yee. Dynamic distributed dimensional data model (d4m) database and computation system. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pages 5349–5352, 2012.


[29] Soma S Dhavala. Dagger: A DSL for Declarative DataScience. http://bit.ly/2ZFk13v.

[30] Sunbird. ML Workbench. https://github.com/project-sunbird/sunbird-ml-workbench.

[31] DVC. Open Source Version control for Machine Learning Projects. https://dvc.org/.

[32] Git. https://git-scm.com/.

[33] Dirk Merkel. Docker: Lightweight linux containers for consistent development and deployment. Linux J., 2014(239), March 2014.

[34] Yongxin Yang, Irene Garcia Morillo, and Timothy M. Hospedales. Deep Neural Decision Trees. arXiv e-prints, 06 2018.

[35] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/, 2015. Software available from tensorflow.org.

[36] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. Pytorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32, pages 8024–8035, 2019.

[37] Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. arXiv e-prints, Dec 2015.

[38] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.

[39] François Chollet et al. Keras. https://keras.io, 2015.

[40] Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, and Ion Stoica. Tune: A Research Platform for Distributed Model Selection and Training. arXiv e-prints, 07 2018.

[41] Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, and Ion Stoica. Ray: A Distributed Framework for Emerging AI Applications. arXiv e-prints, 12 2017.

[42] Dheeru Dua and Casey Graff. UCI machine learning repository. http://archive.ics.uci.edu/ml, 2017.

[43] Item Response Theory. https://en.wikipedia.org/wiki/Item_response_theory.

[44] Nicolas Hug. Surprise, a Python library for recommender systems. http://surpriselib.com, 2017.

[45] Shenweichen. Easy-to-use, Modular and Extendible package of deep-learning based CTR models. https://github.com/shenweichen/DeepCTR.

[46] Haifeng Jin, Qingquan Song, and Xia Hu. Auto-Keras: An Efficient Neural Architecture Search System. arXiv e-prints, 062018.
