We are happy to bring updates to the Datrics analytics platform. This release enhances dataset and data source management and adds monotone constraints to the LightGBM and XGBoost algorithms. These updates are intended to give users more control, flexibility, and collaboration options on the platform, and to make LightGBM and XGBoost models more interpretable, better at generalizing, and faster to converge.
This update brings several features and improvements that enhance the control and flexibility users have over their datasets. Let's explore the details of each user story included in this release.
In this update, we have implemented a robust rights management system for datasets. Users can now assign viewer and editor roles, allowing for more controlled access to datasets. Edit access is available only to the dataset creator, while any user can view the settings and the loaded data.
We have introduced a new feature that allows data sources to be managed at the group level. Data sources are now created and edited at the group level, so you no longer need to connect to your data warehouse or object storage separately for each project. Data sources previously created in a project are migrated to the group automatically.
Datrics allows users to reuse the same dataset in many pipelines, which is useful when multiple data flows share the same data. You can now duplicate a dataset into the same or a different project. You can create a copy of a dataset from the editor view by creating a new version, or via the context menu action.
The copying functionality has been enhanced to handle different scenarios:
Monotone constraints play a crucial role in XGBoost (Extreme Gradient Boosting) and LightGBM (Light Gradient Boosting Machine) algorithms. These constraints ensure that the model's predictions maintain the same directionality as the corresponding feature values, thereby capturing meaningful relationships between the features and the target variable.
Let's explore their importance in more detail for both XGBoost and LightGBM.
Here are key benefits of using monotone constraints for gradient boosting frameworks:
In the latest release, we have added a monotone constraints setting to the LightGBM (Binary, Multiclass, Regression) and XGBoost (Classification, Regression) algorithms. Monotonicity means that as a feature value increases, the target variable should consistently either increase or decrease.
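To make this concrete, the monotonicity of a fitted model can be checked empirically by sweeping one feature while holding the others fixed and inspecting the resulting predictions. The helper below is a hypothetical sketch of such a check, not part of the Datrics platform or either library:

```python
def is_monotone(values, direction=1):
    """Check that a sequence of predictions is non-decreasing
    (direction=1) or non-increasing (direction=-1)."""
    pairs = list(zip(values, values[1:]))
    if direction == 1:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)

# Example predictions from a model constrained to be
# non-decreasing in the swept feature
preds = [0.10, 0.10, 0.35, 0.62, 0.62, 0.90]
print(is_monotone(preds, direction=1))   # → True
print(is_monotone(preds, direction=-1))  # → False
```

Note that ties are allowed: the constraint only rules out movement in the wrong direction, so flat stretches in the predictions are still considered monotone.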
To train a model, add the relevant model brick to the scene and connect it to the training dataset. All models are available in the Machine Learning section or via search. Monotone constraints are part of the advanced mode settings. Select the features that have a monotone relation to the target variable, of either the non-increasing or non-decreasing type.
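For reference, when working with the libraries directly, the same setting maps to the `monotone_constraints` parameter in both LightGBM and XGBoost, with `1` meaning non-decreasing, `-1` non-increasing, and `0` unconstrained. A minimal sketch of how it is passed, assuming a hypothetical dataset with three features:

```python
# Hypothetical three-feature setup: the first feature is constrained
# to be non-decreasing in the target, the second is unconstrained,
# and the third is constrained to be non-increasing.
constraints = [1, 0, -1]

# LightGBM accepts the constraints as a list, one entry per feature
lgbm_params = {
    "objective": "regression",
    "monotone_constraints": constraints,
}

# XGBoost accepts the same idea as a tuple, e.g. (1, 0, -1)
xgb_params = {
    "objective": "reg:squarederror",
    "monotone_constraints": tuple(constraints),
}
```

The order of the entries must match the column order of the training data, and any feature you leave at `0` is trained without restriction.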
In summary, monotone constraints are a valuable addition to both XGBoost and LightGBM, as they enhance interpretability, improve generalization, facilitate faster convergence, and reduce overfitting. By leveraging known monotonic relationships between features and the target variable, these constraints enable both gradient boosting frameworks to make more reliable predictions and provide valuable insights for decision-making tasks.
All features in the release.