Profiler Object:Initialization

From Melissa Data Wiki
Revision as of 18:41, 6 January 2015


The following functions set the paths to the necessary data files, set the license, and initialize the Profiler Interface:

SetLicenseString

This function sets the license string required to enable Profiler Object’s complete functionality.

The License String is a software key that unlocks the full functionality of the component. Without the License String, the object will only function in DEMO mode.

The license string is normally set using an environment variable, either MD_LICENSE or MD_LICENSE_DEMO. Calling SetLicenseString is an alternative method for setting the license string, but applications developed for a production environment should only use the environment variable. When using an environment variable, it is not necessary to call the SetLicenseString function. For more information on setting the environment variable, see Licensing.

This function accepts one parameter.

Parameters

Name Data Type Description
LicenseString String A string value representing the software license key.

Return Value

The SetLicenseString function returns an integer: 0 if the provided License String is incorrect, and 1 otherwise.

Syntax profiler->SetLicenseString(StringValue);
C integer = mdProfilerSetLicenseString(profiler, LicenseString);
.Net integer = profiler.SetLicenseString(LicenseString);
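As a rough illustration of the two licensing routes described above, the following sketch decides whether an explicit SetLicenseString call is still needed. The helper names `choose_license` and `needs_explicit_license` are hypothetical and not part of Profiler Object, and preferring the environment value over an explicit key is an assumption based on the recommendation above, not documented library behavior.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Profiler Object: pick the license
   string to use. The environment value wins when it is set and
   non-empty (an assumption); otherwise the explicit key is returned. */
const char *choose_license(const char *env_value, const char *explicit_key)
{
    if (env_value != NULL && env_value[0] != '\0') {
        return env_value;
    }
    return explicit_key;
}

/* Returns 1 when MD_LICENSE is missing or empty in the environment,
   meaning SetLicenseString would still have to be called; 0 otherwise. */
int needs_explicit_license(void)
{
    const char *env_value = getenv("MD_LICENSE");
    return (env_value == NULL || env_value[0] == '\0') ? 1 : 0;
}
```

An application could call `needs_explicit_license()` at startup and only pass a key to SetLicenseString when it returns 1.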


SetPathToProfilerDataFiles

This function sets the path to the data files required for profiling.

The required data files are:

  • mdProfiler.dat
  • mdProfiler.mc

This function accepts one parameter.

Parameters

Name Data Type Description
LocationString String A string value representing the location of the Profiler data files.


Syntax profiler->SetPathToProfilerDataFiles(StringValue);
C mdProfilerSetPathToProfilerDataFiles (profiler, LocationString);
.Net profiler.SetPathToProfilerDataFiles(LocationString);


SetFileName (Optional)

Optional. This function sets the path/file name of the profiling output file. The processing results will be stored in the output file.

This function accepts one parameter.

Parameters

Name Data Type Description
FileNameString String A string value representing the name of the file.


Syntax profiler->SetFileName(StringValue);
C mdProfilerSetFileName(profiler, FileNameString);
.Net profiler.SetFileName(FileNameString);


SetAppendMode (Optional)

Optional. This function sets the mode for how the output file needs to be handled. The default is overwrite mode if this function is not called.

If a Profiler file is opened in append mode, certain statistics will not be accurate, because some results cannot be carried over from run to run. Specifically: TableExactMatch, TableContactMatch, TableHouseholdMatch, TableAddressMatch, ColumnInferredDataType, ColumnSortation, and ColumnSortationPercent.

If you intend to open the file in Append mode, you will want to call SetSortAnalysis, SetMatchUpAnalysis and SetRightFielderAnalysis appropriately.

The SetAppendMode function takes a value of the enumerated type AppendMode.

Value Name Description
0 Append Add new profiling information to any existing information.
1 Report Open the profile table so that reporting may be done. No new records can be appended to the profile run.
2 Overwrite Overwrite existing output profile table.
3 MustNotExist Output profile table must not exist.


Parameters

Name Data Type Description
appendMode AppendMode This sets the append mode for the output file.


Syntax profiler->SetAppendMode(AppendModeValue);
C mdProfilerSetAppendMode (profiler, appendMode);
.Net profiler.SetAppendMode(appendMode);
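The mode values in the table above can be mirrored in a small sketch like the one below. The numeric values come from the table, but the identifier spellings (`SketchAppendMode`, `APPEND_MODE_*`) and both helper functions are illustrative assumptions, not the library's own declarations. The first helper assumes every mode except Report accepts new records, which the table states only for Report.

```c
/* Values copied from the AppendMode table; names are illustrative only. */
typedef enum {
    APPEND_MODE_APPEND = 0,
    APPEND_MODE_REPORT = 1,
    APPEND_MODE_OVERWRITE = 2,
    APPEND_MODE_MUST_NOT_EXIST = 3
} SketchAppendMode;

/* Returns 1 if new records may be added in this mode. Report mode is
   read-only per the table; treating the others as writable is assumed. */
int mode_accepts_records(SketchAppendMode mode)
{
    return mode != APPEND_MODE_REPORT;
}

/* Returns 1 if the match, inferred data type, and sortation statistics
   listed above may be inaccurate, which the text says of append mode. */
int mode_has_unreliable_stats(SketchAppendMode mode)
{
    return mode == APPEND_MODE_APPEND;
}
```

A caller could use `mode_has_unreliable_stats` to decide whether to turn off the sort, MatchUp, and RightFielder analyses before initializing.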


SetUserName (Optional)

Optional. This function sets the user name for a particular run.

This function accepts one parameter.

Parameters

Name Data Type Description
UserNameString String A string value representing the user name.


Syntax profiler->SetUserName(userNameString);
C mdProfilerSetUserName (profiler, UserNameString);


SetTableName

This function sets the name of the table.

This function accepts one parameter.

Parameters

Name Data Type Description
TableNameString String A string value representing the table name.


Syntax profiler->SetTableName(tableNameString);
C mdProfilerSetTableName(profiler, TableNameString);
.Net profiler.SetTableName(TableNameString);


SetJobName (Optional)

Optional. This function sets the job name for a particular run.

This function accepts one parameter.

Parameters

Name Data Type Description
JobNameString String A string value representing the Job name.


Syntax profiler->SetJobName(jobNameString);
C mdProfilerSetJobName (profiler, JobNameString);
.Net profiler.SetJobName(JobNameString);


SetJobDescription (Optional)

Optional. This function sets the job description for a particular run.

This function accepts one parameter.

Parameters

Name Data Type Description
JobDescriptionString String A string value representing the job description.


Syntax profiler->SetJobDescription(jobDescriptionString);
C mdProfilerSetJobDescription (profiler, JobDescriptionString);
.Net profiler.SetJobDescription(JobDescriptionString);


SetSortAnalysis (Optional)

Optional. This function handles the sortation analysis, which can increase AddRecord profiling time. This time penalty grows geometrically as more records are added. If you are not interested in this statistic, disable it to decrease your profiling time.

The default value is "true." A "false" value will disable this function. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
SortAnalysisValue Int 0-False, 1-True. This Sets the analysis on or off.


Syntax profiler->SetSortAnalysis(int);
C mdProfilerSetSortAnalysis(profiler, SortAnalysisValue);
.Net profiler.SetSortAnalysis(SortAnalysisValue);


SetMatchUpAnalysis (Optional)

Optional. This function handles duplicate record detection. Duplicate analysis increases the AddRecord profiling time by under 5% and ProfileData profiling time by about 30%. These time increases often outweigh the usefulness of having these statistics.

The default value is "true." A "false" value will disable this function. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
MatchUpAnalysisValue Int 0-False, 1-True. This Sets the analysis on or off.


Syntax profiler->SetMatchUpAnalysis(int);
C mdProfilerSetMatchUpAnalysis(profiler, MatchUpAnalysisValue);
.Net profiler.SetMatchUpAnalysis(MatchUpAnalysisValue);


SetRightFielderAnalysis (Optional)

Optional. This function handles inferred data type analysis. This analysis is responsible for the Inconsistent Data and Inferred Data Type statistics. It increases the AddRecord profiling time by under 10%. It is usually worth keeping this useful analysis, even with the increased profiling time.

The default value is "true." A "false" value will disable this function. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
RightFielderAnalysisValue Int 0-False, 1-True. This sets the analysis on or off.


Syntax profiler->SetRightFielderAnalysis(int);
C mdProfilerSetRightFielderAnalysis(profiler, RightFielderAnalysisValue);
.Net profiler.SetRightFielderAnalysis(RightFielderAnalysisValue);


SetDataAggregation (Optional)

Optional. This function handles all forms of aggregation. Any statistic that cannot be determined incrementally (for example, median, standard deviation, etc.) is determined via aggregation. This increases profiling time by nearly 40%. These statistics are usually useful enough to justify the increased profiling time.

The default value is "true." A "false" value will disable this function. This property should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
DataAggregationValue Int 0-False, 1-True. This sets the aggregation on or off.


Syntax profiler->SetDataAggregation(int);
C mdProfilerSetDataAggregation(profiler, dataAggregationValue);
.Net profiler.SetDataAggregation(dataAggregationValue);


SetCountGeneration (Optional)

Optional. This function handles all forms of value gathering. This analysis is responsible for all value tables (Frequency, Pattern, SoundEx, etc.). All iterators and data aggregation statistics depend on this analysis. It increases profiling time by over 90%.

The default value is "true." A "false" value will disable this function and by dependency disable all iterators and data aggregation statistics. This should be set before InitializeDataFiles is called.

Parameters

Name Data Type Description
CountGenerationValue Int 0-False, 1-True. This Sets the count generation on or off.


Syntax profiler->SetCountGeneration(int);
C mdProfilerSetCountGeneration(profiler, CountGenerationValue);
.Net profiler.SetCountGeneration(CountGenerationValue);
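The five analysis switches and the dependency described above (disabling count generation also disables data aggregation) can be modeled with a small sketch. The `AnalysisFlags` struct and both functions are hypothetical bookkeeping on the application side, not part of Profiler Object. They only show which analyses would effectively run for a given set of requested values.

```c
/* Hypothetical record of the five optional analysis switches. */
typedef struct {
    int sort_analysis;
    int matchup_analysis;
    int rightfielder_analysis;
    int data_aggregation;
    int count_generation;
} AnalysisFlags;

/* Every switch defaults to "true" (1), as documented above. */
AnalysisFlags default_flags(void)
{
    AnalysisFlags flags = { 1, 1, 1, 1, 1 };
    return flags;
}

/* Normalizes each value to 0 or 1 and applies the documented dependency:
   with count generation off, data aggregation is effectively off too. */
AnalysisFlags effective_flags(AnalysisFlags requested)
{
    AnalysisFlags out;
    out.sort_analysis = requested.sort_analysis ? 1 : 0;
    out.matchup_analysis = requested.matchup_analysis ? 1 : 0;
    out.rightfielder_analysis = requested.rightfielder_analysis ? 1 : 0;
    out.data_aggregation = requested.data_aggregation ? 1 : 0;
    out.count_generation = requested.count_generation ? 1 : 0;
    if (!out.count_generation) {
        out.data_aggregation = 0;
    }
    return out;
}
```

The resulting values would then be passed to the corresponding Set functions before InitializeDataFiles is called.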


InitializeDataFiles

This function opens the required data files and prepares the profiler for use. Unless the MD_LICENSE environment variable is set, you must call the SetLicenseString function before calling InitializeDataFiles.

If the InitializeDataFiles function returns a code other than zero, you can call the GetInitializeErrorString function to display a string describing the error.

Required Functions

You must successfully call the SetPathToProfilerDataFiles and SetFileName functions before calling InitializeDataFiles.

The InitializeDataFiles function returns a value of the enumerated type ProgramStatus.


Value Name Description
0 ErrorNone No error - initialization was successful.
1 ErrorConfigFile Error in configuration file.
2 ErrorDatabaseExpired Database file expired.
3 ErrorLicenseExpired License expired.
4 ErrorProfileFile Error in profile file.
5 ErrorUnknown Unknown error.


Syntax profiler->InitializeDataFiles();
C ProgramStatus = mdProfilerInitializeDataFiles(profiler);
.Net ProgramStatus = profiler.InitializeDataFiles();
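The status values above and the required-call ordering can be sketched without the real library. Everything below is a stand-in: `SketchProgramStatus`, `MockProfiler`, and `mock_initialize` are hypothetical, and the page does not say which code the real library returns when the data path or file name is missing, so that mapping is a guess made only for illustration. The description strings are taken from the ProgramStatus table.

```c
#include <stddef.h>

/* Values copied from the ProgramStatus table; names are illustrative. */
typedef enum {
    STATUS_ERROR_NONE = 0,
    STATUS_ERROR_CONFIG_FILE = 1,
    STATUS_ERROR_DATABASE_EXPIRED = 2,
    STATUS_ERROR_LICENSE_EXPIRED = 3,
    STATUS_ERROR_PROFILE_FILE = 4,
    STATUS_ERROR_UNKNOWN = 5
} SketchProgramStatus;

/* Maps a status code to the description given in the table. */
const char *status_description(int status)
{
    switch (status) {
    case STATUS_ERROR_NONE: return "No error - initialization was successful.";
    case STATUS_ERROR_CONFIG_FILE: return "Error in configuration file.";
    case STATUS_ERROR_DATABASE_EXPIRED: return "Database file expired.";
    case STATUS_ERROR_LICENSE_EXPIRED: return "License expired.";
    case STATUS_ERROR_PROFILE_FILE: return "Error in profile file.";
    default: return "Unknown error.";
    }
}

/* Stand-in for the settings that must be supplied before initializing. */
typedef struct {
    const char *data_path;  /* what SetPathToProfilerDataFiles would receive */
    const char *file_name;  /* what SetFileName would receive */
} MockProfiler;

/* Mock initialization: fails unless both required values were set.
   The specific error codes chosen here are assumptions. */
SketchProgramStatus mock_initialize(const MockProfiler *profiler)
{
    if (profiler == NULL) {
        return STATUS_ERROR_UNKNOWN;
    }
    if (profiler->data_path == NULL || profiler->data_path[0] == '\0') {
        return STATUS_ERROR_CONFIG_FILE;
    }
    if (profiler->file_name == NULL || profiler->file_name[0] == '\0') {
        return STATUS_ERROR_PROFILE_FILE;
    }
    return STATUS_ERROR_NONE;
}
```

In real code, a nonzero return from InitializeDataFiles would be followed by a call to GetInitializeErrorString rather than a local lookup such as `status_description`.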


GetInitializeErrorString

This function returns a descriptive string to describe an error from the InitializeDataFiles function. This can be used to output a message to the user.

Syntax profiler->GetInitializeErrorString();
C string = mdProfilerGetInitializeErrorString(profiler);
.Net string = profiler.GetInitializeErrorString();