This time we made great improvements to the platform features and functionality such as Big Data support, Binning, MSSQL Connector etc.
Datrics has changed its architecture from Pandas to Dask. Currently computations are available on any datasets with an infinite size.
All datasets are automatically converted to parquet format that ensures faster computations.
Now you can download the file that describes how your model was trained, which steps were performed to convert the initial data to the train dataset. It also describes all the requirements (python libraries) that were used to train the model.
We've updated the model performance dashboard with additional metrics: ROC, AUC, GINI.
Model performance dashboard was extended with the classification dashboard: confusion matrix, detailed classification report, precision/recall and ROC/AUC curves, and a lot more were added.
We've added the possibility to convert continuous variables to categorical depending on the distribution of the values. Binning functionality supports three auto-binning algorithms - quantile-based, interval-based, and optimal with respect to the Information Value categorization, which is extended with manual adjustment of the bins.
We've added the possibility to take the trained logistic model and calculate the effect of each binary variable to the resulting predicted score. Scorecard functionality supports two types of scores' scaling and allows to perform the adjustment of the scores based on the novel information.
Model "What-If API" now supports any size of models, predictions are performed in the dask cluster and support big data sets.
We've added the possibility to extract data from and store data to MS SQL databases.
We've added the possibility to select independent variables that are characterized by sufficient predictive power with respect to the binary target variable.
We've added the possibility to assess the predictive power of independent variables with respect to the binary dependent variable.
For some bricks, the distribution of the training data should be "memorized" for future runs on the unseen data. Freezing capabilities, we've added, allow to store the configuration of the brick and re-use the values for new data.
We've added additional binary classification model. This model has additional explainability (e.g. variable coefficients and their statistical significance, etc.) compared to other classification models.
We extended the analytics functionality with Exploratory Data Analysis brick, which provides more insights into analyzed data, like central tendency, dispersion, distribution and outliers.