Profiling Data in Power BI
Data profiling is the process of learning more about your data. Any information that helps you understand the data is useful: for example, how many unique values a column contains, what its minimum and maximum values are, its average and standard deviation, and so on. Data profiling gives you this information.
The data profiling capabilities in Power Query Editor offer simple ways to clean, transform, and understand data. There are three of them:
- Column quality
- Column distribution
- Column profile
After you load your data into Power Query Editor, you can profile it easily. Go to the View tab, where you can select one or more of these options in the “Data Preview” group.
After enabling the options, you will see the changes in Power Query Editor.
The column quality feature labels the values in each column with one of five categories:
- Valid: shown in green.
- Error: shown in red.
- Empty: shown in dark gray.
- Unknown: shown in dashed green. When a column contains errors, the quality of the remaining data is unknown.
- Unexpected Error: shown in dashed red.
You can see these here in the picture below:
Hovering over the quality section of any column displays the numeric distribution of the quality of values throughout that column.
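To make the idea behind these percentages concrete, here is a minimal Python sketch of the same kind of calculation done outside Power BI. It is not how Power Query Editor works internally: the function name column_quality is hypothetical, errors are assumed to be stored as exception objects, and only the valid, error, and empty categories are modeled.

```python
from collections import Counter

def column_quality(values):
    """Return the percentage of valid, error, and empty values in a column."""
    counts = Counter()
    for value in values:
        if isinstance(value, Exception):
            # Assumption: a failed conversion is stored as an exception object.
            counts["error"] += 1
        elif value is None or value == "":
            counts["empty"] += 1
        else:
            counts["valid"] += 1
    total = len(values)
    if total == 0:
        # Avoid dividing by zero for an empty column.
        return {"valid": 0.0, "error": 0.0, "empty": 0.0}
    return {
        name: round(100 * counts[name] / total, 1)
        for name in ("valid", "error", "empty")
    }

sample = [10, 25, None, ValueError("not a number"), 40, "", 55, 60]
quality = column_quality(sample)
print(quality)
```

In this sample, five of the eight values are valid, one is an error, and two are empty, so the three percentages add up to 100.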
The column distribution feature displays a set of visualizations beneath the column names that show the frequency and distribution of the values in each column. The data in these visualizations is sorted in descending order, starting with the most frequent value.
Hovering over the distribution data in any column displays information about the column’s overall data, including the distinct count (the number of different values) and the unique count (the number of values that appear only once).
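The difference between the distinct and unique counts is easy to confuse, so the following Python sketch reproduces both alongside a frequency list sorted from most to least frequent. It is only an illustration outside Power BI; the function name column_distribution and the sample city names are assumptions made for this example.

```python
from collections import Counter

def column_distribution(values):
    """Return value frequencies (most frequent first), distinct count, unique count."""
    counts = Counter(values)
    ordered = counts.most_common()  # sorted in descending order of frequency
    distinct = len(counts)          # how many different values occur
    unique = sum(1 for n in counts.values() if n == 1)  # values that occur exactly once
    return ordered, distinct, unique

cities = ["Oslo", "Paris", "Oslo", "Rome", "Paris", "Oslo", "Lima"]
ordered, distinct, unique = column_distribution(cities)
print(ordered)
print("distinct:", distinct, "unique:", unique)
```

Here the column has four distinct values (Oslo, Paris, Rome, Lima) but only two unique ones (Rome and Lima), because Oslo and Paris appear more than once.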
The column profile feature gives you a more detailed look at the data in a single column. In addition to the column distribution chart, it includes a column statistics chart. As illustrated in the image below, this information is displayed beneath the data preview section.
You can interact with the value distribution chart on the right side by hovering over its bars and selecting any of them.
You can copy the column statistics by clicking the three dots button (…) in the upper-right corner of the column statistics section and selecting “Copy”.
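As a rough illustration of what such column statistics contain, the Python sketch below computes a few of them for a small numeric column. The function name column_statistics and the sample values are hypothetical, and the sketch assumes a sample (not population) standard deviation; the exact set of statistics that Power Query Editor shows depends on the column's data type.

```python
import statistics

def column_statistics(values):
    """Compute a few profile statistics for a numeric column; None marks an empty cell."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "empty": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present),
        "max": max(present),
        "average": statistics.mean(present),
        # Assumption: sample standard deviation (divides by n - 1).
        "standard deviation": statistics.stdev(present),
    }

stats = column_statistics([2, 4, 4, 4, 5, 5, 7, 9, None])
for name, value in stats.items():
    print(f"{name}: {value}")
```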
When you click the three dots button (…) in the upper-right corner of the value distribution chart, you can choose Group by in addition to Copy. Group by regroups the values in the chart according to one of a set of available options.
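The effect of grouping a distribution can be sketched in a few lines of Python. The grouping keys used here, text length and parity, are illustrative assumptions and not necessarily the options the Group by menu offers for a given column type; the function name group_distribution is also hypothetical.

```python
from collections import Counter

def group_distribution(values, key):
    """Count values per group, where key maps each value to its group label."""
    return Counter(key(v) for v in values).most_common()

# Group text values by their length.
names = ["Ann", "Bob", "Carla", "Dave", "Eve", "Frank"]
by_length = group_distribution(names, len)
print(by_length)

# Group whole numbers by parity.
numbers = [1, 2, 3, 4, 5]
by_parity = group_distribution(numbers, lambda n: "even" if n % 2 == 0 else "odd")
print(by_parity)
```

Instead of one bar per value, the chart then shows one bar per group, such as one bar for all three-letter names.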