The year is coming to an end, but we are not slowing down! The latest Datrics release brings a new data previewer for faster and more convenient data exploration, optimized dataset usage in the pipeline, and deployment notifications to Slack.
We have upgraded the previewer of the bricks’ data inputs and outputs. The new data previewer allows analysts to explore the data within the pipeline more efficiently. Let’s dig deeper into what is new.
1. View the sample or full data
On each brick you may explore all the inputs and outputs by going to “View data preview”. By default, a sample of the first data output with 1,000 rows is displayed. You may display up to 3 datasets side by side when you need to analyze and compare data visually. It’s possible to customize the sample by defining the number of rows to be displayed, as well as the sampling strategy. Currently, the data previewer supports 3 options: from top, from bottom, from edges. You may also switch to the full data view at any time, so that the entire dataset is loaded.
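As a rough illustration of the three sampling strategies, here is a minimal Python sketch. It is not the Datrics implementation: the function name `sample_rows`, the strategy names, and the rule that “from edges” splits the sample evenly between the first and last rows are all assumptions made for the example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_rows(rows: Sequence[T], n: int = 1000, strategy: str = "top") -> List[T]:
    """Return up to n rows; function and strategy names are hypothetical."""
    if n <= 0:
        return []
    if n >= len(rows):
        # The dataset is smaller than the sample size: show everything.
        return list(rows)
    if strategy == "top":
        return list(rows[:n])
    if strategy == "bottom":
        return list(rows[-n:])
    if strategy == "edges":
        # Assumption: half of the sample comes from the start, half from the end.
        head = (n + 1) // 2
        tail = n - head
        return list(rows[:head]) + (list(rows[-tail:]) if tail else [])
    raise ValueError(f"unknown strategy: {strategy}")


rows = list(range(10))
top = sample_rows(rows, 4, "top")
bottom = sample_rows(rows, 4, "bottom")
edges = sample_rows(rows, 4, "edges")
```

Sampling from the edges is handy for time-ordered data, where the first and last records together give a quick sense of the covered range.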
2. Filtering and sorting
We have added more options for working with the data in the table. Filter and sort the data by one or multiple columns, and define the order in which the column sorts are applied.
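For readers who think in code, filtering plus multi-column sorting behaves roughly like the Python sketch below. The helper names `filter_rows` and `sort_rows` and the sample data are hypothetical; in Datrics these actions are performed in the previewer interface.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def filter_rows(rows: Sequence[Row], predicates: Dict[str, Callable[[Any], bool]]) -> List[Row]:
    """Keep rows for which every per-column predicate holds."""
    return [r for r in rows if all(p(r[c]) for c, p in predicates.items())]


def sort_rows(rows: Sequence[Row], order: Sequence[Tuple[str, bool]]) -> List[Row]:
    """Sort by (column, ascending) pairs, highest priority first."""
    result = list(rows)
    # Python's sort is stable, so apply the keys from lowest to highest priority.
    for column, ascending in reversed(order):
        result.sort(key=lambda r: r[column], reverse=not ascending)
    return result


rows = [
    {"city": "Kyiv", "amount": 30},
    {"city": "Lviv", "amount": 10},
    {"city": "Kyiv", "amount": 50},
    {"city": "Odesa", "amount": 5},
]
filtered = filter_rows(rows, {"amount": lambda v: v >= 10})
ordered = sort_rows(filtered, [("city", True), ("amount", False)])
```

Here the rows are first restricted to amounts of at least 10, then ordered by city ascending and, within each city, by amount descending.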
3. Long string preview
A long string may be opened in a separate window, so that you can review its contents in more detail. JSON is automatically “beautified”. To open the full view of a string, double-click the cell or press the view icon in the cell.
Last but not least, uninterrupted and focused work is crucial for the best results, which is why we are constantly improving the stability and performance of the service. The updated data previewer is optimized to work with big datasets.
For faster experimentation with the pipeline, Datrics caches the data retrieved from databases, which can save substantial time while creating and testing the pipeline.
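“Beautifying” JSON usually means re-serializing it with indentation. A minimal sketch with Python’s standard `json` module is shown below; the helper name `beautify` is hypothetical, and the exact formatting Datrics applies may differ.

```python
import json


def beautify(cell: str) -> str:
    """Pretty-print a cell value if it is valid JSON, else return it unchanged."""
    try:
        parsed = json.loads(cell)
    except (TypeError, ValueError):
        # Not JSON (json.JSONDecodeError is a ValueError): show the string as is.
        return cell
    return json.dumps(parsed, indent=2, ensure_ascii=False)


pretty = beautify('{"id": 1, "tags": ["a", "b"]}')
plain = beautify("just a long string")
```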
After a dataset from a database is created, it automatically retrieves the live data on each run. You may turn on caching on the brick. On the next pipeline run, the dataset cache is created, and it is then used on subsequent runs until you recache the data or switch back to the live connection.
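The behavior described above (live by default, cache filled on the first run after caching is enabled, reused until a recache) can be sketched in Python as follows. The class `CachedDataset` and its members are hypothetical and only model the described logic; they are not part of any Datrics API.

```python
from typing import Callable, List, Optional


class CachedDataset:
    """Toy model of a database dataset with optional caching (hypothetical)."""

    def __init__(self, fetch_live: Callable[[], List[int]], use_cache: bool = False):
        self._fetch_live = fetch_live
        self.use_cache = use_cache
        self._cache: Optional[List[int]] = None

    def load(self) -> List[int]:
        if not self.use_cache:
            # Live connection: query the database on every run.
            return self._fetch_live()
        if self._cache is None:
            # First run after caching was enabled (or after a recache).
            self._cache = self._fetch_live()
        return self._cache

    def recache(self) -> None:
        """Drop the cache so that the next run refreshes it from the database."""
        self._cache = None


calls: List[int] = []


def fetch() -> List[int]:
    calls.append(1)          # count simulated database queries
    return [len(calls)]


dataset = CachedDataset(fetch)
dataset.load()
dataset.load()               # live mode: two queries so far
dataset.use_cache = True
first_cached = dataset.load()    # third query, result is cached
second_cached = dataset.load()   # served from the cache, no query
dataset.recache()
refreshed = dataset.load()       # fourth query
```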
Monitoring production with many deployed pipelines becomes simpler with notifications to Slack. Subscribe to receive Slack notifications about all runs or only failed runs. Coordinate with the team on actions and tasks in the messenger you already use.
We never stop improving the Datrics toolbox for analytics and data wrangling. In this release we have upgraded two commonly used bricks: Math formula and Missing value treatment.
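To make the “all or only failed runs” choice concrete, the sketch below builds a message and posts it to a Slack incoming webhook, which accepts a JSON body with a `text` field. This is only an illustration of the idea: the function names, the `mode` values, the status strings and the message wording are assumptions, and the Datrics subscription itself is configured in the product rather than in code.

```python
import json
import urllib.request
from typing import Optional


def build_message(pipeline: str, status: str, mode: str = "failed") -> Optional[dict]:
    """Return a Slack payload, or None when the subscription mode filters the run out.

    mode is a hypothetical setting: "all" or "failed".
    """
    if mode not in ("all", "failed"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "failed" and status != "failed":
        return None
    return {"text": f"Pipeline '{pipeline}' run finished with status: {status}"}


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload to a Slack incoming webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


failed_only = build_message("churn-scoring", "success", mode="failed")
all_runs = build_message("churn-scoring", "success", mode="all")
alert = build_message("churn-scoring", "failed", mode="failed")
```

`post_to_slack` is defined but deliberately not called here, since it needs a real webhook URL.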
We have added a bunch of new functions to the Math formula brick:
Logical:
Type conversion: ToString, ToNumeric, ToBoolean, ToDatetime
DateTime:
String functions: upper case, lower case
Read more about the brick in the Datrics documentation.
We have extended the missing value treatment options with ‘Choose from another column’.
We have added meta information about the container bricks (Pipeline brick, For loop) to the metafile of the pipeline. You may use this information to analyze the efficiency of the algorithm within the compound bricks and optimize where needed.
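Conceptually, an option like this fills each missing value with the value from the same row of another column. A minimal Python sketch of that idea follows; the helper `fill_from_column`, the use of `None` as the missing marker, and the column names are assumptions, not the brick’s actual interface.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def fill_from_column(rows: Sequence[Row], target: str, source: str) -> List[Row]:
    """Fill missing (None) values in `target` with the same row's `source` value."""
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if new_row.get(target) is None:
            new_row[target] = new_row.get(source)
        filled.append(new_row)
    return filled


rows = [
    {"work_phone": "555-0101", "mobile_phone": "555-0202"},
    {"work_phone": None, "mobile_phone": "555-0303"},
]
result = fill_from_column(rows, target="work_phone", source="mobile_phone")
```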