Profiler Object:Column-Based Statistics
Revision as of 23:51, 14 May 2015
The column-based statistics should only be retrieved after ProfileData is called. These functions return column-specific details.
GetColumnInferredDataType
This function returns a column’s inferred data type in ProfilerDataType form. See ProfilerDataType Enumerations for details. The inferred data type is used to determine if a prevalent data type is seen for the majority of values in this column. For a deviant value to be returned (i.e., a value that differs from the user-specified data type), the count of that detected data type must exceed all other detected data type counts by at least 20%.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the column’s inferred data type. |
Syntax | profiler->GetColumnInferredDataType(columnNameStr); |
---|---|
C | ProfilerDataType = mdProfilerGetColumnInferredDataType(profiler, columnNameStr); |
.Net | ProfilerDataType = profiler.GetColumnInferredDataType(columnNameStr); |
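The 20% rule described above can be illustrated with a small standalone sketch. It does not call the Profiler library. The helper name `inferDataType`, the type names used as map keys, and the reading of "by at least 20%" as "the leading count is at least 1.2 times the runner-up count" are all assumptions made for illustration; the library's exact rule may differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper, not part of the Profiler API.
// counts:   detected data type name -> number of values detected as that type.
// declared: the data type the user specified for the column.
std::string inferDataType(const std::map<std::string, long>& counts,
                          const std::string& declared) {
    std::string best;
    long bestCount = 0;
    long secondCount = 0;
    for (const auto& kv : counts) {
        assert(kv.second >= 0);  // counts cannot be negative
        if (kv.second > bestCount) {
            secondCount = bestCount;
            bestCount = kv.second;
            best = kv.first;
        } else if (kv.second > secondCount) {
            secondCount = kv.second;
        }
    }
    // Assumed reading of the 20% rule: leader >= 1.2 x runner-up.
    // Written with integers (5 * leader >= 6 * runner-up) to avoid rounding.
    if (bestCount > 0 && 5 * bestCount >= 6 * secondCount) {
        return best;
    }
    return declared;  // no dominant type: fall back to the user-specified type
}
```

With counts of 80 integers and 20 strings, the sketch reports the integer type even if the column was declared as a string. With 55 against 50, neither type dominates and the declared type is kept.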
GetColumnInferredColumnType
Along with the data type analysis, the Profiler also analyzes the column type and returns the ProfilerColumnType enumeration that most closely matches the column's contents. This may differ from the column type specified by the user. Possible return values include ColumnTypeInt1, ColumnTypeReal8, ColumnTypeBoolean, ColumnTypeDate, etc.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
FieldName | String | A string value representing the Field Name. |
Syntax | ProfilerColumnType = profiler->GetColumnInferredColumnType(FieldName); |
---|---|
C | ProfilerColumnType = mdProfilerGetColumnInferredColumnType(profiler, FieldName); |
.Net | ProfilerColumnType = profiler.GetColumnInferredColumnType(FieldName); |
GetColumnSortation
This function returns a column's natural sortation, that is, the sort order seen in the values as they were input. For a column to be considered near-sorted, no more than 10% of the input values may be out of order.
This function accepts one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the column’s sortation information. |
Return Value
This function returns one of the following enumerations.
Enum Value | Sortation Type | Description |
---|---|---|
0 | SortUnknown | No sortation detected. |
1 | SortStringAscending | Values are sorted ascending, using a string comparison. |
2 | SortStringDescending | Values are sorted descending, using a string comparison. |
3 | SortNumericAscending | Values are sorted ascending, using a numeric comparison. |
4 | SortNumericDescending | Values are sorted descending, using a numeric comparison. |
5 | SortDateAscending | Values are sorted ascending, using date/time comparison. |
6 | SortDateDescending | Values are sorted descending, using date/time comparison. |
Syntax | profiler->GetColumnSortation(columnNameStr); |
---|---|
C | Sortation = mdProfilerGetColumnSortation(profiler, columnNameStr); |
.Net | Sortation = profiler.GetColumnSortation(columnNameStr); |
GetColumnSortationPercent
This function returns a percentage indicating how well a column is sorted. This is only reported for columns where GetColumnSortation returned a value other than SortUnknown. The sortation percentage is determined by counting the number of re-orderings that would be required to put the list of values into a sorted state, and then dividing this count by the worst-case count (i.e., the re-ordering required for a reverse-sorted list).
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the sortation percentage |
Syntax | profiler->GetColumnSortationPercent(columnNameStr); |
---|---|
C | double = mdProfilerGetColumnSortationPercent(profiler, columnNameStr); |
.Net | double = profiler.GetColumnSortationPercent(columnNameStr); |
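One way to picture the calculation above is to count inversions (pairs of values that appear in the wrong order) and divide by the number of inversions in a reverse-sorted list of the same length. The sketch below is standalone and does not use the Profiler library. The helper name `disorderRatio` and the choice of inversions as the "re-ordering" measure are assumptions; whether the library reports this ratio directly or its complement is not stated in this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of worst-case disorder for an ascending sort.
// 0.0 = already sorted ascending, 1.0 = strictly reverse-sorted.
double disorderRatio(const std::vector<double>& v) {
    const std::size_t n = v.size();
    if (n < 2) return 0.0;  // nothing to re-order
    std::size_t inversions = 0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            if (v[i] > v[j]) ++inversions;  // pair is out of order
        }
    }
    const std::size_t worst = n * (n - 1) / 2;  // inversions in a reversed list
    assert(inversions <= worst);
    return static_cast<double>(inversions) / static_cast<double>(worst);
}
```

The quadratic loop keeps the sketch short; a merge-sort based inversion count would be the usual choice for large columns.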
GetColumnMostPopularCount
This function returns the number of records that contain the most popular value.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the most popular value count. |
Syntax | profiler->GetColumnMostPopularCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnMostPopularCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnMostPopularCount(columnNameStr); |
GetColumnDistinctCount
This function returns the number of distinct values in a column. A distinct value may occur more than once; each group of duplicate values is counted as one distinct value.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the Distinct count. |
Syntax | profiler->GetColumnDistinctCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnDistinctCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnDistinctCount(columnNameStr); |
GetColumnUniqueCount
This function returns the number of unique values in the specified column. Unique values do not have duplicates.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the unique count. |
Syntax | profiler->GetColumnUniqueCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnUniqueCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnUniqueCount(columnNameStr); |
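The difference between the most popular, distinct, and unique counts is easiest to see from a frequency table. The following standalone sketch does not call the Profiler library; `FrequencyStats` and `frequencyStats` are hypothetical names, and exact string comparison is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct FrequencyStats {
    long mostPopular;  // occurrences of the most frequent value
    long distinct;     // number of different values
    long unique;       // number of values that occur exactly once
};

// Hypothetical helper, not part of the Profiler API.
FrequencyStats frequencyStats(const std::vector<std::string>& values) {
    std::map<std::string, long> freq;
    for (const std::string& v : values) ++freq[v];

    FrequencyStats s = {0, 0, 0};
    for (const auto& kv : freq) {
        s.mostPopular = std::max(s.mostPopular, kv.second);
        ++s.distinct;
        if (kv.second == 1) ++s.unique;
    }
    assert(s.unique <= s.distinct);  // every unique value is also distinct
    return s;
}
```

For the values a, b, a, c, a, b the sketch yields a most popular count of 3 (a), a distinct count of 3 (a, b, c), and a unique count of 1 (c).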
GetColumnDefaultValueCount
This function returns the number of records that contained the default value set with the SetColumnDefaultValue function.
This function accepts one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the default value count |
Syntax | profiler->GetColumnDefaultValueCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnDefaultValueCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnDefaultValueCount(columnNameStr); |
GetColumnBelowRangeCount
This function returns the number of records with values that were below the lower bound set with the SetColumnValueRange function.
This function accepts one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the below range count |
Syntax | profiler->GetColumnBelowRangeCount(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnBelowRangeCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnBelowRangeCount(columnNameStr); |
GetColumnAboveRangeCount
This function returns the number of records with values that were above the upper bound set with the SetColumnValueRange function.
This function accepts one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the above range count |
Syntax | profiler->GetColumnAboveRangeCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnAboveRangeCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnAboveRangeCount(ColumnNameStr); |
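The below-range and above-range counts can be thought of as two tallies against the configured bounds. The sketch below is standalone and does not use the Profiler library. `RangeCounts` and `countOutOfRange` are hypothetical names, and treating both bounds as inclusive (a value equal to a bound is in range) is an assumption.

```cpp
#include <cassert>
#include <vector>

struct RangeCounts {
    long below;  // values smaller than the lower bound
    long above;  // values larger than the upper bound
};

// Hypothetical helper, not part of the Profiler API.
RangeCounts countOutOfRange(const std::vector<double>& values,
                            double lower, double upper) {
    assert(lower <= upper);
    RangeCounts c = {0, 0};
    for (double v : values) {
        if (v < lower) {
            ++c.below;
        } else if (v > upper) {
            ++c.above;
        }
    }
    return c;
}
```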
GetColumnAboveSizeCount
This function returns the number of records with values that were longer than the length set with the SetColumnSize function.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the above size count. |
Syntax | profiler->GetColumnAboveSizeCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnAboveSizeCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnAboveSizeCount(columnNameStr); |
GetColumnAbovePrecisionCount
This function returns the number of records with numeric values that have a precision greater than the precision set with the SetColumnPrecision function.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the above precision count. |
Syntax | profiler->GetColumnAbovePrecisionCount(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnAbovePrecisionCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnAbovePrecisionCount(columnNameStr); |
GetColumnAboveScaleCount
This function returns the number of records with numeric values that have a scale larger than the scale set with the SetColumnScale function.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the above scale count |
Syntax | profiler->GetColumnAboveScaleCount(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnAboveScaleCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnAboveScaleCount(columnNameStr); |
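As an illustration of the precision and scale checks, the sketch below measures both for plain decimal strings and counts the values that exceed given limits. It is standalone and does not call the Profiler library. It assumes the common SQL convention (precision is the total number of significant digits, scale is the number of digits after the decimal point); the names `numericShape` and `countAbove` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct NumericShape { int precision; int scale; };

// Assumes text is a plain decimal such as "-123.45": optional sign, digits,
// optional '.' followed by digits. No exponents or thousands separators.
NumericShape numericShape(const std::string& text) {
    NumericShape s = {0, 0};
    bool afterPoint = false;
    bool seenNonZero = false;
    for (char ch : text) {
        if (ch == '.') { afterPoint = true; continue; }
        if (ch < '0' || ch > '9') continue;  // skip the sign
        if (afterPoint) {
            ++s.scale;
            ++s.precision;
        } else if (ch != '0' || seenNonZero) {
            seenNonZero = true;  // leading zeros of the integer part are ignored
            ++s.precision;
        }
    }
    return s;
}

struct AboveCounts { long abovePrecision; long aboveScale; };

// Hypothetical helper, not part of the Profiler API.
AboveCounts countAbove(const std::vector<std::string>& values,
                       int maxPrecision, int maxScale) {
    assert(maxScale >= 0 && maxScale <= maxPrecision);
    AboveCounts c = {0, 0};
    for (const std::string& v : values) {
        const NumericShape s = numericShape(v);
        if (s.precision > maxPrecision) ++c.abovePrecision;
        if (s.scale > maxScale) ++c.aboveScale;
    }
    return c;
}
```

With a precision limit of 5 and a scale limit of 2, the value 99999.1 exceeds the precision limit (6 digits) and 0.125 exceeds the scale limit (3 decimal places).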
GetColumnInvalidRegExCount
This function returns the number of records with values that did not match any of the regular expressions set with the SetColumnCustomPattern function.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the count of records that have not matched with the regular expression set with the SetColumnCustomPattern function. |
Syntax | profiler->GetColumnInvalidRegExCount(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnInvalidRegExCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnInvalidRegExCount(columnNameStr); |
GetColumnEmptyCount
This function returns the number of records with empty values. An empty value is not NULL; it may contain spaces, but it holds no other string or value.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the empty count. |
Syntax | profiler->GetColumnEmptyCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnEmptyCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnEmptyCount(ColumnNameStr); |
GetColumnNullCount
This function returns the number of records with NULL values.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the column null count. |
Syntax | profiler->GetColumnNullCount(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnNullCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnNullCount(columnNameStr); |
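The distinction between empty and NULL values can be sketched as follows. The code is standalone and does not call the Profiler library. Representing NULL as a null pointer, treating only the space character as blank, and the names `BlankCounts` and `countBlanks` are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct BlankCounts {
    long nulls;    // values that are NULL (no value at all)
    long empties;  // values that are present but hold nothing except spaces
};

// Hypothetical helper, not part of the Profiler API.
// A null pointer stands in for a NULL column value.
BlankCounts countBlanks(const std::vector<const char*>& values) {
    BlankCounts c = {0, 0};
    for (const char* v : values) {
        if (v == nullptr) {
            ++c.nulls;
            continue;
        }
        const std::string s(v);
        if (s.find_first_not_of(' ') == std::string::npos) {
            ++c.empties;  // zero-length or spaces only
        }
    }
    return c;
}
```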
GetColumnInvalidDataCount
This function returns the number of records where the value is inconsistent with the column type set with the AddColumn function. (e.g., If you set a column's column type to ColumnTypeInt1 and the input value is "John Smith", that’s considered Invalid Data and therefore the counter for this function will be incremented.)
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the column invalid data count. |
Syntax | profiler->GetColumnInvalidDataCount(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnInvalidDataCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnInvalidDataCount(columnNameStr); |
GetColumnInvalidUTF8Count
This function returns the number of records containing an invalid UTF-8 sequence.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get the invalid UTF8 count. |
Syntax | profiler->GetColumnInvalidUTF8Count(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnInvalidUTF8Count(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnInvalidUTF8Count(columnNameStr); |
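For readers unfamiliar with what makes a UTF-8 sequence invalid, here is a minimal standalone validator. It does not call the Profiler library, and the names `isValidUtf8` and `countInvalidUtf8` are hypothetical. It rejects stray or missing continuation bytes, truncated sequences, overlong encodings, surrogate code points, and code points above U+10FFFF; the library's own checks may be stricter or looser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

bool isValidUtf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) { ++i; continue; }  // single-byte ASCII
        std::size_t len;
        unsigned long cp;
        unsigned long minCp;  // smallest code point allowed for this length
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; minCp = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; minCp = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; minCp = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte
        if (i + len > n) return false;  // sequence is cut off
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < minCp) return false;                    // overlong encoding
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        i += len;
    }
    return true;
}

// Hypothetical helper: number of values containing an invalid sequence.
long countInvalidUtf8(const std::vector<std::string>& values) {
    long count = 0;
    for (const std::string& v : values) {
        if (!isValidUtf8(v)) ++count;
    }
    return count;
}
```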
GetColumnNonPrintingCharCount
This function returns the number of records containing non-printable characters. Printable characters are letters, numbers, punctuation, etc.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get non-printing character count. |
Syntax | profiler->GetColumnNonPrintingCharCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnNonPrintingCharCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnNonPrintingCharCount(columnNameStr); |
GetColumnDiacriticCharCount
This function returns the number of records containing diacritic characters. Diacritic characters are symbols added to letters of the alphabet to indicate different pronunciation than the letters are usually given.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get diacritic character count. |
Syntax | profiler->GetColumnDiacriticCharCount(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnDiacriticCharCount(profiler, ColumnNameStr); |
.Net | integer = profiler.GetColumnDiacriticCharCount(ColumnNameStr); |
GetColumnForeignCharCount
This function returns the number of records containing foreign characters. All diacritic characters are foreign characters, but not all foreign characters are diacritics.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get foreign character count |
Syntax | profiler->GetColumnForeignCharCount(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnForeignCharCount(profiler, ColumnNameStr); |
.Net | integer = profiler.GetColumnForeignCharCount(ColumnNameStr); |
GetColumnAlphaOnlyCount
This function returns the number of records that contain only alphabetic characters. This includes spaces and punctuation but not numbers or symbols.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get Alphabetic character count. |
Syntax | profiler->GetColumnAlphaOnlyCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnAlphaOnlyCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnAlphaOnlyCount(columnNameStr); |
GetColumnNumericOnlyCount
This function returns the number of records that contain only numeric characters. This includes spaces and punctuation but not alphabetic characters or symbols.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get Numeric character count. |
Syntax | profiler->GetColumnNumericOnlyCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnNumericOnlyCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnNumericOnlyCount(columnNameStr); |
GetColumnAlphaNumericCount
This function returns the number of records that contain both alphabetic and numeric characters. This includes spaces and punctuation.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get alphanumeric character count. |
Syntax | profiler->GetColumnAlphaNumericCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnAlphaNumericCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnAlphaNumericCount(columnNameStr); |
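The three character-class counts above partition values by whether they contain letters, digits, or both. The standalone sketch below shows that partition; it does not call the Profiler library. The names `CharClass` and `classifyChars` are hypothetical, only ASCII letters and digits are recognized, and the sketch simplifies by ignoring every other character rather than distinguishing punctuation from symbols.

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class CharClass { Other, AlphaOnly, NumericOnly, AlphaNumeric };

// Hypothetical helper, not part of the Profiler API.
CharClass classifyChars(const std::string& value) {
    bool hasAlpha = false;
    bool hasDigit = false;
    for (unsigned char ch : value) {
        if (std::isalpha(ch)) {
            hasAlpha = true;
        } else if (std::isdigit(ch)) {
            hasDigit = true;
        }
        // Spaces and punctuation do not change the classification.
    }
    if (hasAlpha && hasDigit) return CharClass::AlphaNumeric;
    if (hasAlpha) return CharClass::AlphaOnly;
    if (hasDigit) return CharClass::NumericOnly;
    return CharClass::Other;  // no letters or digits at all
}
```

Under these assumptions "John Smith" is alphabetic only, "555-1234" is numeric only, and "Apt 4B" is alphanumeric.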
GetColumnUpperCaseOnlyCount
This function returns the number of records that only contain upper-case alphabetic characters. This includes spaces, punctuation, and numbers.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get upper-case only count. |
Syntax | profiler->GetColumnUpperCaseOnlyCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnUpperCaseOnlyCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnUpperCaseOnlyCount(columnNameStr); |
GetColumnLowerCaseOnlyCount
This function returns the number of records that only contain lower-case alphabetic characters. This includes spaces, punctuation, and numbers.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get lower-case only count. |
Syntax | profiler->GetColumnLowerCaseOnlyCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnLowerCaseOnlyCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnLowerCaseOnlyCount(columnNameStr); |
GetColumnMixedCaseCount
This function returns the number of records that contain mixed-case (both upper and lower-case characters) alphabetic characters. This includes spaces, punctuation, and numbers.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get mixed-case characters count. |
Syntax | profiler->GetColumnMixedCaseCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnMixedCaseCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnMixedCaseCount(columnNameStr); |
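The upper-case, lower-case, and mixed-case counts follow the same pattern: only the letters decide the category, while spaces, punctuation, and numbers are ignored. A standalone sketch, not using the Profiler library, with the hypothetical names `LetterCase` and `classifyCase` and ASCII-only case detection assumed:

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class LetterCase { None, UpperOnly, LowerOnly, Mixed };

// Hypothetical helper, not part of the Profiler API.
LetterCase classifyCase(const std::string& value) {
    bool hasUpper = false;
    bool hasLower = false;
    for (unsigned char ch : value) {
        if (std::isupper(ch)) {
            hasUpper = true;
        } else if (std::islower(ch)) {
            hasLower = true;
        }
        // Digits, spaces, and punctuation are ignored.
    }
    if (hasUpper && hasLower) return LetterCase::Mixed;
    if (hasUpper) return LetterCase::UpperOnly;
    if (hasLower) return LetterCase::LowerOnly;
    return LetterCase::None;  // no letters at all
}
```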
GetColumnSingleSpaceCount
This function returns the number of records that contain multiple words separated only by a single space.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get single space count. |
Syntax | profiler->GetColumnSingleSpaceCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnSingleSpaceCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnSingleSpaceCount(columnNameStr); |
GetColumnMultiSpaceCount
This function returns the number of records that contain multiple words separated by more than one space at least once.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get multi-space count. |
Syntax | profiler->GetColumnMultiSpaceCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnMultiSpaceCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnMultiSpaceCount(columnNameStr); |
GetColumnLeadingSpaceCount
This function returns the number of records that contain one or more leading spaces.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get leading space count. |
Syntax | profiler->GetColumnLeadingSpaceCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnLeadingSpaceCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnLeadingSpaceCount(columnNameStr); |
GetColumnTrailingSpaceCount
This function returns the number of records that contain one or more trailing spaces.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get trailing space count. |
Syntax | profiler->GetColumnTrailingSpaceCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnTrailingSpaceCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnTrailingSpaceCount(columnNameStr); |
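The four record-level spacing counts (single space, multi-space, leading space, trailing space) can be sketched together. The code is standalone and does not call the Profiler library. `SpacingCounts` and `spacingCounts` are hypothetical names. The sketch assumes that leading and trailing spaces are ignored when deciding between single and multi-space, and that values consisting only of spaces are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct SpacingCounts {
    long singleSpace;  // several words, every gap exactly one space
    long multiSpace;   // at least one gap of two or more spaces
    long leading;      // value starts with a space
    long trailing;     // value ends with a space
};

// Hypothetical helper, not part of the Profiler API.
SpacingCounts spacingCounts(const std::vector<std::string>& values) {
    SpacingCounts c = {0, 0, 0, 0};
    for (const std::string& v : values) {
        const std::size_t first = v.find_first_not_of(' ');
        if (first == std::string::npos) continue;  // empty or spaces only
        const std::size_t last = v.find_last_not_of(' ');
        if (first > 0) ++c.leading;
        if (last + 1 < v.size()) ++c.trailing;
        // Inspect only the text between the first and last non-space character.
        const std::string core = v.substr(first, last - first + 1);
        if (core.find("  ") != std::string::npos) {
            ++c.multiSpace;
        } else if (core.find(' ') != std::string::npos) {
            ++c.singleSpace;
        }
    }
    return c;
}
```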
GetColumnMaxSpaces
This function returns the maximum number of spaces that occurred between words in the column values.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get maximum space count. |
Syntax | profiler->GetColumnMaxSpaces(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnMaxSpaces(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnMaxSpaces(columnNameStr); |
GetColumnMinSpaces
This function returns the minimum number of spaces that occurred between words in the column values.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get minimum space count. |
Syntax | profiler->GetColumnMinSpaces(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnMinSpaces(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnMinSpaces(columnNameStr); |
GetColumnTotalSpaces
This function returns the total number of spaces that occurred between words in the column values. This doesn’t include leading and trailing spaces.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get total spaces. |
Syntax | profiler->GetColumnTotalSpaces(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnTotalSpaces(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnTotalSpaces(columnNameStr); |
GetColumnTotalWordBreaks
This function returns the total number of word breaks found in the column values.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get total word breaks. |
Syntax | profiler->GetColumnTotalWordBreaks(ColumnNameStr); |
---|---|
C | integer = mdProfilerGetColumnTotalWordBreaks(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnTotalWordBreaks(columnNameStr); |
GetColumnAvgSpaces
This function returns the average number of spaces found between words in the column values.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get average spaces. |
Syntax | profiler->GetColumnAvgSpaces(ColumnNameStr); |
---|---|
C | double = mdProfilerGetColumnAvgSpaces(profiler, columnNameStr); |
.Net | double = profiler.GetColumnAvgSpaces(columnNameStr); |
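The maximum, minimum, total, and average space statistics are all derived from the runs of spaces between words. The standalone sketch below shows one plausible way to compute them; it does not call the Profiler library. `SpaceStats` and `spaceStats` are hypothetical names. Treating each run of spaces as one word break and computing the average as total spaces divided by total word breaks are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct SpaceStats {
    long totalSpaces;  // spaces between words, summed over all values
    long wordBreaks;   // number of gaps between words
    long maxSpaces;    // widest gap seen
    long minSpaces;    // narrowest gap seen (0 when there are no gaps)
    double avgSpaces() const {
        return wordBreaks == 0 ? 0.0
                               : static_cast<double>(totalSpaces) / wordBreaks;
    }
};

// Hypothetical helper, not part of the Profiler API.
SpaceStats spaceStats(const std::vector<std::string>& values) {
    SpaceStats st = {0, 0, 0, 0};
    for (const std::string& v : values) {
        const std::size_t first = v.find_first_not_of(' ');
        if (first == std::string::npos) continue;  // empty or spaces only
        const std::size_t last = v.find_last_not_of(' ');
        long run = 0;  // length of the current gap
        for (std::size_t i = first; i <= last; ++i) {
            if (v[i] == ' ') { ++run; continue; }
            if (run > 0) {  // a gap just ended
                st.totalSpaces += run;
                st.maxSpaces = std::max(st.maxSpaces, run);
                st.minSpaces = (st.wordBreaks == 0) ? run
                                                    : std::min(st.minSpaces, run);
                ++st.wordBreaks;
                run = 0;
            }
        }
    }
    assert(st.minSpaces <= st.maxSpaces);
    return st;
}
```

Leading and trailing spaces are excluded because the scan runs only from the first to the last non-space character of each value.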
GetColumnDecorationCharCount
This function returns the number of records with the values containing decorative characters. Decorative characters appear at the beginning or end of the value, and are tab, comma, pipe, and double-quote. This count is useful because it often indicates that field delimiters may have somehow found their way into the data stream.
Decorative character analysis is meant to detect bad data imports by flagging (returning result code QS07) and counting any delimiters that made their way to the values of your table.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get decorative character count. |
Syntax | profiler->GetColumnDecorationCharCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnDecorationCharCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnDecorationCharCount(columnNameStr); |
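The decorative character check described above looks only at the first and last character of each value. A minimal standalone sketch follows; it does not call the Profiler library, and the names `isDecoration` and `countDecorated` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// The four decorative characters listed above: tab, comma, pipe, double-quote.
bool isDecoration(char ch) {
    return ch == '\t' || ch == ',' || ch == '|' || ch == '"';
}

// Hypothetical helper, not part of the Profiler API.
// Counts values that begin or end with a decorative character.
long countDecorated(const std::vector<std::string>& values) {
    long count = 0;
    for (const std::string& v : values) {
        if (!v.empty() && (isDecoration(v.front()) || isDecoration(v.back()))) {
            ++count;
        }
    }
    return count;
}
```

A value such as a|b is not counted, because the pipe appears in the middle rather than at either end.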
GetColumnProfanityCount
This function returns the number of records with values containing profanity.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get profanity count. |
Syntax | profiler->GetColumnProfanityCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnProfanityCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnProfanityCount(columnNameStr); |
GetColumnInconsistentDataCount
This function returns the number of records with values that are inconsistent with the Data Type you set with the AddColumn function. Record inconsistency is evaluated by analyzing a record's column values and determining what type of data each value represents. This determination is not an absolute, as a value can often be mistaken for another type.
This function takes one parameter.
Parameters
Name | Data Type | Description |
---|---|---|
ColumnName | String | Column Name to get Inconsistent data count. |
Syntax | profiler->GetColumnInconsistentDataCount(columnNameStr); |
---|---|
C | integer = mdProfilerGetColumnInconsistentDataCount(profiler, columnNameStr); |
.Net | integer = profiler.GetColumnInconsistentDataCount(columnNameStr); |