Profiler Object:Initialization: Difference between revisions

From Melissa Data Wiki
Revision as of 23:10, 31 December 2014



The following functions set the paths to the necessary data files, the license, and initialize the Profiler Interface:

SetLicenseString

This function sets the license string required to enable Profiler Object’s complete functionality.

The License String is a software key that unlocks the full functionality of the component. Without the License String, the object will only function in DEMO mode.

The license string is normally set using an environment variable, either MD_LICENSE or MD_LICENSE_DEMO. Calling SetLicenseString is an alternative method for setting the license string, but applications developed for a production environment should only use the environment variable. When using an environment variable, it is not necessary to call the SetLicenseString function. For more information on setting the environment variable, see Licensing.

This function accepts one parameter.

Parameters

Name Data Type Description
LicenseString String A string value representing the software license key.

Return Value

The SetLicenseString function returns a value of 0 or 1. It returns 0 if the provided License String is incorrect.

Syntax profiler->SetLicenseString(StringValue);
C integer = mdProfilerSetLicenseString(profiler, LicenseString);
.Net integer = profiler.SetLicenseString(LicenseString);
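The environment-variable approach described above can be sketched as follows. This is only an illustration, not Melissa Data's implementation: the helper name `LicenseFromEnvironment` is hypothetical, and checking MD_LICENSE before MD_LICENSE_DEMO is an assumption. Only the two variable names come from this page, and `std::getenv` from the C++ standard library.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of the Profiler Object API): returns the
// license string found in the environment, or an empty string if neither
// variable is set. The lookup order is an assumption made for illustration.
std::string LicenseFromEnvironment() {
    const char* names[] = {"MD_LICENSE", "MD_LICENSE_DEMO"};
    for (const char* name : names) {
        const char* value = std::getenv(name);
        if (value != nullptr && value[0] != '\0') {
            return std::string(value);
        }
    }
    return std::string();
}
```

An empty result would suggest that the application needs to call SetLicenseString with a key from another source, or that the object will run in DEMO mode.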


SetPathToProfilerDataFiles

This function sets the path to the data files required for profiling.

The required data files are:

  • mdProfiler.dat
  • mdProfiler.mc

This function accepts one parameter.

Parameters

Name Data Type Description
LocationString String A string value representing the location of the Profiler data files.


Syntax profiler->SetPathToProfilerDataFiles(StringValue);
C mdProfilerSetPathToProfilerDataFiles (profiler, LocationString);
.Net profiler.SetPathToProfilerDataFiles(LocationString);


SetFileName (Optional)

Optional. This function sets the path and file name of the profiling output file, where the processing results are stored.

This function accepts one parameter.

Parameters

Name Data Type Description
FileNameString String A string value representing the name of the file.


Syntax profiler->SetFileName(StringValue);
C mdProfilerSetFileName(profiler, FileNameString);
.Net profiler.SetFileName(FileNameString);


SetAppendMode (Optional)

Optional. This function sets how the output file is handled. If this function is not called, the default is overwrite mode.

If a Profiler file is opened in append mode, certain statistics will not be accurate, because some results cannot be carried over from run to run. Specifically: TableExactMatch, TableContactMatch, TableHouseholdMatch, TableAddressMatch, ColumnInferredDataType, ColumnSortation, and ColumnSortationPercent.

If you intend to open the file in append mode, you will want to call SetSortAnalysis, SetMatchUpAnalysis, and SetRightFielderAnalysis appropriately.

The SetAppendMode function takes a value of the enumerated type AppendMode.

Value Name Description
0 Append Add new profiling information to any existing information.
1 Report Open the profile table so that reporting may be done. No new records can be appended to the profile run.
2 Overwrite Overwrite existing output profile table.
3 MustNotExist Output profile table must not exist.


Parameters

Name Data Type Description
appendMode AppendMode This sets the append mode for the output file.


Syntax profiler->SetAppendMode(AppendModeValue);
C mdProfilerSetAppendMode (profiler, appendMode);
.Net profiler.SetAppendMode(appendMode);
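The AppendMode values and the append-mode caveat above can be summarized in a short sketch. The enum below mirrors the documented table, but it is an illustrative stand-in rather than the library's header, and the helper functions are hypothetical. Treating "call the analysis functions appropriately" as "consider turning those analyses off" is an interpretation, not something this page states outright.

```cpp
// Mirrors the AppendMode values listed in the table above. This is a
// stand-in for illustration, not the declaration shipped with the library.
enum class AppendMode {
    Append = 0,
    Report = 1,
    Overwrite = 2,
    MustNotExist = 3
};

// Per the table, Report is the only mode described as accepting no new records.
bool AcceptsNewRecords(AppendMode mode) {
    return mode != AppendMode::Report;
}

// Per the table, MustNotExist requires that the output profile table is absent.
bool AllowsExistingTable(AppendMode mode) {
    return mode != AppendMode::MustNotExist;
}

// Hypothetical rule of thumb: in Append mode the match, inferred data type,
// and sortation statistics are inaccurate, so the analyses that produce them
// are candidates for being switched off.
bool ShouldReviewCrossRunAnalyses(AppendMode mode) {
    return mode == AppendMode::Append;
}
```

A caller could use `ShouldReviewCrossRunAnalyses` to decide whether to pass 0 to SetSortAnalysis, SetMatchUpAnalysis, and SetRightFielderAnalysis before initialization.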


SetUserName (Optional)

Optional. This function sets the user name for a particular run.

This function accepts one parameter.

Parameters

Name Data Type Description
UserNameString String A string value representing the user name.


Syntax profiler->SetUserName(userNameString);
C mdProfilerSetUserName (profiler, UserNameString);


SetTableName

This function sets the name of the table.

This function accepts one parameter.

Parameters

Name Data Type Description
TableNameString String A string value representing the table name.


Syntax profiler->SetTableName(tableNameString);
C mdProfilerSetTableName(profiler, TableNameString);
.Net profiler.SetTableName(TableNameString);


SetJobName (Optional)

Optional. This function sets the job name for a particular run.

This function accepts one parameter.

Parameters

Name Data Type Description
JobNameString String A string value representing the job name.


Syntax profiler->SetJobName(jobNameString);
C mdProfilerSetJobName (profiler, JobNameString);
.Net profiler.SetJobName(JobNameString);


SetJobDescription (Optional)

Optional. This function sets the job description for a particular run.

This function accepts one parameter.

Parameters

Name Data Type Description
JobDescriptionString String A string value representing the job description.


Syntax profiler->SetJobDescription(jobDescriptionString);
C mdProfilerSetJobDescription(profiler, JobDescriptionString);
.Net profiler.SetJobDescription(JobDescriptionString);


SetSortAnalysis (Optional)

Optional. This function omits the sortation analysis, which can increase AddRecord profiling time. This time penalty grows geometrically as more records are added. If you are not interested in this statistic, disable it to decrease your profiling time.

The default value is "true." Passing "false" disables the analysis. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
SortAnalysisValue Int 0-False, 1-True. This sets the analysis on or off.


Syntax profiler->SetSortAnalysis(int);
C mdProfilerSetSortAnalysis(profiler, SortAnalysisValue);
.Net profiler.SetSortAnalysis(SortAnalysisValue);


SetMatchUpAnalysis (Optional)

Optional. This function omits duplicate record detection. Duplicate analysis increases the AddRecord profiling time by under 5% and the ProfileData profiling time by about 30%. These time increases often outweigh the usefulness of having these statistics.

The default value is "true." Passing "false" disables the analysis. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
MatchUpAnalysisValue Int 0-False, 1-True. This sets the analysis on or off.


Syntax profiler->SetMatchUpAnalysis(int);
C mdProfilerSetMatchUpAnalysis(profiler, MatchUpAnalysisValue);
.Net profiler.SetMatchUpAnalysis(MatchUpAnalysisValue);


SetRightFielderAnalysis (Optional)

Optional. This function omits inferred data type analysis. This analysis is responsible for the Inconsistent Data and Inferred Data Type statistics, and it increases the AddRecord profiling time by under 10%. Because this time increase is not very high and these statistics are useful, it is usually best to keep this analysis enabled.

The default value is "true." Passing "false" disables the analysis. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
RightFielderAnalysisValue Int 0-False, 1-True. This sets the analysis on or off.


Syntax profiler->SetRightFielderAnalysis(int);
C mdProfilerSetRightFielderAnalysis(profiler, RightFielderAnalysisValue);
.Net profiler.SetRightFielderAnalysis(RightFielderAnalysisValue);


SetDataAggregation (Optional)

Optional. This function omits all forms of aggregation. Any statistic that cannot be determined incrementally (for example, median or standard deviation) is determined via aggregation. Although the time savings can be rather high (nearly 40%), the aggregation statistics are usually too useful to be sacrificed.

The default value is "true." Passing "false" turns aggregation off, bypassing a time-consuming analysis that you may not require. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
DataAggregationValue Int 0-False, 1-True. This sets the aggregation on or off.


Syntax profiler->SetDataAggregation(int);
C mdProfilerSetDataAggregation(profiler, dataAggregationValue);
.Net profiler.SetDataAggregation(dataAggregationValue);


SetCountGeneration (Optional)

Optional. This function omits all forms of value gathering, the analysis responsible for all value tables (Frequency, Pattern, SoundEx, etc.). With value gathering turned off, all iterators will return no results. Because data aggregation uses these tables, turning it off also suppresses data aggregation statistics. The time savings are very high (over 90%), but the cost is great.

The default value is "true." Passing "false" turns value gathering off, bypassing a time-consuming analysis that you may not require. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
CountGenerationValue Int 0-False, 1-True. This sets the count generation on or off.


Syntax profiler->SetCountGeneration(int);
C mdProfilerSetCountGeneration(profiler, CountGenerationValue);
.Net profiler.SetCountGeneration(CountGenerationValue);
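The interaction between the optional analyses can be sketched as a small settings structure. Everything here is hypothetical scaffolding: the struct and function names are not part of the Profiler Object API, and the logic only restates what this page says, namely that every analysis defaults to on and that turning off count generation also suppresses the aggregation statistics and leaves the iterators empty.

```cpp
// Hypothetical container for the five optional switches described above.
// Each member defaults to true, matching the documented default values.
struct AnalysisOptions {
    bool sortAnalysis = true;
    bool matchUpAnalysis = true;
    bool rightFielderAnalysis = true;
    bool dataAggregation = true;
    bool countGeneration = true;
};

// Aggregation statistics depend on the value tables, so they are expected
// only when both aggregation and count generation are enabled.
bool AggregationStatsAvailable(const AnalysisOptions& options) {
    return options.dataAggregation && options.countGeneration;
}

// Iterators read the value tables, which exist only with count generation on.
bool IteratorsReturnResults(const AnalysisOptions& options) {
    return options.countGeneration;
}
```

In a real program, each member would presumably be passed as 0 or 1 to the matching Set function before InitializeDataFiles is called.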


InitializeDataFiles

This function opens the required data files and prepares the profiler for use. If you do not have the MD_LICENSE environment variable set, you must call the SetLicenseString function prior to calling InitializeDataFiles.

If the InitializeDataFiles function returns a code other than zero, you can call the GetInitializeErrorString function to display a string describing the error.

Required Functions

You must first successfully call the SetPathToProfilerDataFiles and SetFileName functions.

The InitializeDataFiles function returns a value of the enumerated type ProgramStatus.


Value Name Description
0 ErrorNone No error - initialization was successful.
1 ErrorConfigFile Error in configuration file.
2 ErrorDatabaseExpired Database file expired.
3 ErrorLicenseExpired License expired.
4 ErrorProfileFile Error in profile file.
5 ErrorUnknown Unknown error.


Syntax profiler->InitializeDataFiles();
C ProgramStatus = mdProfilerInitializeDataFiles(profiler);
.Net ProgramStatus = profiler.InitializeDataFiles();
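The call order described in this section can be sketched with a stand-in object. `StubProfiler` and `InitializeOrExplain` are hypothetical and exist only so the example compiles without the library; the status codes the stub returns for a missing path or file name are assumptions, not documented behavior. The method names and the ProgramStatus values are taken from this page.

```cpp
#include <string>

// Values copied from the ProgramStatus table above.
enum class ProgramStatus {
    ErrorNone = 0,
    ErrorConfigFile = 1,
    ErrorDatabaseExpired = 2,
    ErrorLicenseExpired = 3,
    ErrorProfileFile = 4,
    ErrorUnknown = 5
};

// Hypothetical stand-in for the Profiler object. Which status the real
// object reports for each failure is not stated on this page; the mapping
// below is invented purely for illustration.
class StubProfiler {
public:
    void SetPathToProfilerDataFiles(const std::string& path) { dataPath_ = path; }
    void SetFileName(const std::string& name) { fileName_ = name; }

    ProgramStatus InitializeDataFiles() {
        if (dataPath_.empty()) {
            error_ = "Path to Profiler data files was not set.";
            return ProgramStatus::ErrorConfigFile;
        }
        if (fileName_.empty()) {
            error_ = "Output file name was not set.";
            return ProgramStatus::ErrorProfileFile;
        }
        error_.clear();
        return ProgramStatus::ErrorNone;
    }

    std::string GetInitializeErrorString() const { return error_; }

private:
    std::string dataPath_;
    std::string fileName_;
    std::string error_;
};

// Sets the required values, initializes, and returns an empty string on
// success or the error description when initialization fails.
std::string InitializeOrExplain(StubProfiler& profiler,
                                const std::string& dataPath,
                                const std::string& outputFile) {
    profiler.SetPathToProfilerDataFiles(dataPath);
    profiler.SetFileName(outputFile);
    if (profiler.InitializeDataFiles() != ProgramStatus::ErrorNone) {
        return profiler.GetInitializeErrorString();
    }
    return std::string();
}
```

The point of the sketch is the ordering: the path and file name are set first, the status is compared against ErrorNone, and the error string is requested only after a failure.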


GetInitializeErrorString

This function returns a string describing the error that occurred when the InitializeDataFiles function failed. This is useful for outputting a quick message to the user.

Syntax profiler->GetInitializeErrorString();
C string = mdProfilerGetInitializeErrorString(profiler);
.Net string = profiler.GetInitializeErrorString();