🔶Dataset details section
The Dataset details section summarizes data quality for the entire dataset as a donut chart. The overall figure is derived from the collective quality of the individual columns.
This section appears when a dataset first loads onto the data preparation screen, and whenever no column is selected.
Dataset details display the following information:
- Sample rows
- Sample strategy (includes Random, Erroneous, Column based, and Initial data samples)
- Total rows
- Number of columns
- Number of data types in the dataset
- Overall dataset data quality as a donut chart
The donut chart shows the percentages of valid data, invalid data, and missing values. Click a section of the donut chart to view only the valid, invalid, or missing values in your dataset.
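To make the three-way split concrete, here is a minimal sketch of how valid, invalid, and missing percentages could be computed across columns. It is an illustration only and not DataPrep's implementation: the function name `quality_breakdown`, the per-column validator functions, and the choice to pool cell counts across all columns are assumptions made for this example.

```python
def quality_breakdown(columns, validators):
    """Return the percentage of valid, invalid, and missing cells.

    columns:    dict mapping column name -> list of cell values
    validators: dict mapping column name -> function(value) -> bool
    Both structures are hypothetical and exist only for this sketch.
    """
    counts = {"valid": 0, "invalid": 0, "missing": 0}
    for name, values in columns.items():
        is_valid = validators[name]
        for value in values:
            # Assumption: None and empty strings count as missing.
            if value is None or value == "":
                counts["missing"] += 1
            elif is_valid(value):
                counts["valid"] += 1
            else:
                counts["invalid"] += 1

    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    # Pool the counts of every column into one dataset-level percentage.
    return {key: 100.0 * count / total for key, count in counts.items()}


sample_columns = {
    "age": ["31", "abc", None, "45"],
    "email": ["a@x.com", "", "b@y.org", "nope"],
}
sample_validators = {
    "age": str.isdigit,
    "email": lambda value: "@" in value,
}
breakdown = quality_breakdown(sample_columns, sample_validators)
print(breakdown)
```

With the sample data above, half of the cells are valid, a quarter are invalid, and a quarter are missing, which is the kind of split the donut chart visualizes.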
🔶Sample strategy
Sampling speeds up the transformations you perform. A sample is taken from the entire dataset using one of several strategies. The Initial sample strategy is used when the dataset is imported for the first time. You can change the strategy at any point during data preparation by clicking the edit icon in the dataset details panel.
The available sample strategies are:
- Initial sample: Generated from the first 5 MB of data in the imported file.
- Random sample: Randomly selected rows from the imported file.
- Erroneous sample: Rows containing invalid or missing entries.
- Column based sample: Generated based on the distinct values from the selected column.
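The four strategies above can be sketched in a few lines each. This is a simplified illustration and not DataPrep's actual sampling logic: all function names are hypothetical, rows are modeled as dictionaries, and the Initial sample is capped by a row count here rather than by the 5 MB size the product uses.

```python
import random


def initial_sample(rows, limit):
    # Assumption: cap by row count; the product caps by data size (5 MB).
    return rows[:limit]


def random_sample(rows, limit, seed=None):
    # Pick rows at random without repeating any row.
    rng = random.Random(seed)
    return rng.sample(rows, min(limit, len(rows)))


def erroneous_sample(rows, is_bad, limit):
    # Keep only rows flagged as having invalid or missing entries.
    return [row for row in rows if is_bad(row)][:limit]


def column_based_sample(rows, column, limit):
    # Keep the first row seen for each distinct value in the chosen column.
    seen = set()
    picked = []
    for row in rows:
        if len(picked) >= limit:
            break
        key = row.get(column)
        if key not in seen:
            seen.add(key)
            picked.append(row)
    return picked


rows = [
    {"id": 1, "city": "Paris", "age": 30},
    {"id": 2, "city": "Paris", "age": None},
    {"id": 3, "city": "Rome", "age": 41},
    {"id": 4, "city": "Oslo", "age": None},
]

print([r["id"] for r in initial_sample(rows, 2)])
print([r["id"] for r in erroneous_sample(rows, lambda r: r["age"] is None, 10)])
print([r["id"] for r in column_based_sample(rows, "city", 10)])
```

Each function returns a smaller list of rows, so later transformations can be previewed on the sample instead of the full dataset.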
🔶Intelligent suggestions
DataPrep suggests transforms based on the imported data, which makes data preparation more effective. Suggestions are shown when one or more columns are selected, and also when a filter is applied.
- When you click one of the suggested transforms, you are taken to the Studio panel with a live preview of the transformation to be applied to your data.
- You may choose to edit the options and conditions in the operation bar before applying the suggested operation.