Similarity Search (Spatial Statistics)

License Level: Basic, Standard, Advanced

Summary

Identifies which candidate features are most similar or most dissimilar to one or more input features based on feature attributes.

Learn more about how Similarity Search works

Illustration

Similarity Search

Usage

Syntax

SimilaritySearch_stats (Input_Features_To_Match, Candidate_Features, Output_Features, Collapse_Output_To_Points, Most_Or_Least_Similar, Match_Method, Number_Of_Results, Attributes_Of_Interest, {Fields_To_Append_To_Output})
Parameter | Explanation | Data Type
Input_Features_To_Match

The layer (or a selection on a layer) containing the features you want to match; you are searching for other features that look like these features. When more than one feature is provided, matching is based on attribute averages.

Tip: When your Input Features To Match and Candidate Features come from a single dataset:

  • Right-click the layer and choose Selection followed by Create Layer From Selected Features. Use the new layer created for this parameter.
  • Next, right-click the layer again and choose Selection followed by Switch Selection to get the layer you will use for your Candidate Features.

Feature Layer
Candidate_Features

The layer (or a selection on a layer) containing candidate matching features. The tool will look for the features most like (or most unlike) the Input Features To Match among these candidates.

Tip: When your Input Features To Match and Candidate Features come from a single dataset:

  • Right-click the layer and choose Selection followed by Create Layer From Selected Features. Use the new layer for the Input Features To Match parameter.
  • Next, right-click the layer again and choose Selection followed by Switch Selection to get the layer you will use for this parameter.

Feature Layer
Output_Features

The output feature class contains a record for each of the Input Features To Match and for all of the solution matching features found.

Feature Class
Collapse_Output_To_Points

Specify whether you want the geometry for the Output_Features to be points or to match the geometry (lines or polygons) of the input features. This option is only available when the Input_Features_To_Match and the Candidate_Features are both lines or both polygons. Choosing COLLAPSE for large line or polygon datasets will improve tool performance.

  • NO_COLLAPSE: The output geometry will match the line or polygon geometry of the input features. This is the default.
  • COLLAPSE: The line and polygon features will be represented as feature centroids (points).
Boolean
Most_Or_Least_Similar

Choose whether you are interested in features that are most like or most different from the Input Features To Match.

  • MOST_SIMILAR: Find the features that are most alike.
  • LEAST_SIMILAR: Find the features that are most different.
  • BOTH: Find both the features that are most alike and the features that are most different.
String
Match_Method

Choose whether matching should be based on values, ranks, or cosine relationships.

  • ATTRIBUTE_VALUES: Similarity or dissimilarity will be based on the sum of squared standardized attribute value differences for all of the Attributes Of Interest.
  • RANKED_ATTRIBUTE_VALUES: Similarity or dissimilarity will be based on the sum of squared rank differences for all of the Attributes Of Interest.
  • ATTRIBUTE_PROFILES: Similarity or dissimilarity will be computed as a function of cosine similarity for all of the Attributes Of Interest.
String
Number_Of_Results

The number of solution matches to find. Entering zero or a number larger than the total number of Candidate Features will return rankings for all of the candidate features.

Long
Attributes_Of_Interest
[field,...]

A list of numeric attributes representing the matching criteria.

Field
Fields_To_Append_To_Output
[field,...]
(Optional)

An optional list of attributes to include with the Output Features. You might want to include a name identifier, categorical field, or date field, for example. These fields are not used to determine similarity; they are only included in the Output Features for your reference.

Field
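
The three Match_Method options described above can be illustrated with a small pure-Python sketch. It is not the tool's implementation, and it does not use arcpy. All function names are hypothetical. Standardization is assumed to use a population standard deviation computed over the averaged target plus all candidates, tied values are assumed to share an average rank, and ATTRIBUTE_PROFILES is approximated as 1 minus the cosine similarity of the raw attribute values. The sketch also follows two documented rules: multiple Input Features To Match are reduced to attribute averages, and a Number_Of_Results of zero (or one larger than the candidate count) ranks every candidate.

```python
import math

def _mean(values):
    return sum(values) / float(len(values))

def _zscores(values):
    # Assumption: population standard deviation; constant columns map to 0.
    m = _mean(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / float(len(values)))
    return [(v - m) / sd if sd else 0.0 for v in values]

def _ranks(values):
    # 1-based ranks; tied values share their average rank (an assumption).
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1
        i = j + 1
    return result

def _by_column(rows, func):
    # Apply func to each attribute column, then rebuild the rows.
    columns = [func(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

def _sum_sq_diff(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def _cosine_similarity(a, b):
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def similarity_scores(inputs_to_match, candidates,
                      match_method="ATTRIBUTE_VALUES"):
    """Return one dissimilarity score per candidate (smaller = more alike)."""
    # Several input features are collapsed to their attribute averages.
    target = [_mean(col) for col in zip(*inputs_to_match)]
    rows = [target] + [list(c) for c in candidates]
    if match_method == "ATTRIBUTE_VALUES":
        rows = _by_column(rows, _zscores)
    elif match_method == "RANKED_ATTRIBUTE_VALUES":
        rows = _by_column(rows, _ranks)
    elif match_method == "ATTRIBUTE_PROFILES":
        return [1.0 - _cosine_similarity(rows[0], r) for r in rows[1:]]
    else:
        raise ValueError("Unknown Match_Method: %s" % match_method)
    return [_sum_sq_diff(rows[0], r) for r in rows[1:]]

def pick_matches(scores, number_of_results, most_or_least="MOST_SIMILAR"):
    """Return candidate indexes under the keys 'most' and/or 'least'."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = number_of_results
    if n == 0 or n > len(order):
        n = len(order)  # zero or too large: rank every candidate
    picked = {}
    if most_or_least in ("MOST_SIMILAR", "BOTH"):
        picked["most"] = order[:n]
    if most_or_least in ("LEAST_SIMILAR", "BOTH"):
        picked["least"] = order[::-1][:n]
    if not picked:
        raise ValueError("Unknown Most_Or_Least_Similar: %s" % most_or_least)
    return picked

# Made-up attribute values, for illustration only.
inputs = [[10, 100], [20, 200]]
candidates = [[16, 155], [100, 5], [12, 140]]
scores = similarity_scores(inputs, candidates, "ATTRIBUTE_VALUES")
print(pick_matches(scores, 2, "BOTH"))
```

The returned indexes refer to positions in the candidates list. The tool itself writes its results to the Output Features, so this sketch is only a way to reason about how the method keywords differ.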

Code Sample

SimilaritySearch example 1 (Python window)

The following Python window script demonstrates how to use the SimilaritySearch tool.

import arcpy
import arcpy.stats as SS
arcpy.env.workspace = r"C:\Analysis"
SS.SimilaritySearch("Crime_selection", "AllCrime", r"C:\Analysis\CrimeMatches",
                    "NO_COLLAPSE", "MOST_SIMILAR", "ATTRIBUTE_VALUES", 4,
                    "HEIGHT;WEIGHT;SEVERITY;DST2CHPSHP", "Name;WEAPON")

SimilaritySearch example 2 (stand-alone Python script)

The following stand-alone Python script demonstrates how to use the SimilaritySearch tool.

# Similarity Search of crime data in a metropolitan area

# Import system modules
import arcpy
import arcpy.stats as SS

# Set the environment property that allows existing output to be overwritten
arcpy.env.overwriteOutput = True

try:
    # Set the current workspace (to avoid having to specify the full path to
    # the feature classes each time)
    arcpy.env.workspace = r"C:\Analysis"

    # Make a layer from the crime feature class
    arcpy.MakeFeatureLayer_management("AllCrime", "Crime_selection") 

    # Select the target crime to match
    # Process: Select By Attribute
    arcpy.SelectLayerByAttribute_management("Crime_selection","NEW_SELECTION",
                                            '"OBJECTID" = 1230043')

    # Use Similarity Search to find the four candidate crimes most similar
    # to the selected crime, based on the attributes of interest
    # Process: Similarity Search
    SS.SimilaritySearch("Crime_selection","AllCrime","CJMatches","NO_COLLAPSE",
                        "MOST_SIMILAR","ATTRIBUTE_VALUES",4,
                        "HEIGHT;WEIGHT;SEVERITY;DST2CHPSHP","Name;WEAPON")
    
except arcpy.ExecuteError:
    # If an error occurred when running the tool, print out the error message.
    print(arcpy.GetMessages())
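
Because the keyword parameters in the samples above are plain strings, a misspelled keyword only surfaces when the tool runs. The following is a minimal sketch of a pre-flight check, not part of arcpy: build_similarity_args and the three constant tuples are hypothetical names, and the allowed keywords are copied from the Syntax section. It also joins field lists into the semicolon-delimited form used in the samples, and it omits the optional Fields_To_Append_To_Output argument when no fields are given.

```python
# Keyword values as listed in the Syntax section of this page.
COLLAPSE_OPTIONS = ("NO_COLLAPSE", "COLLAPSE")
MOST_OR_LEAST_OPTIONS = ("MOST_SIMILAR", "LEAST_SIMILAR", "BOTH")
MATCH_METHODS = ("ATTRIBUTE_VALUES", "RANKED_ATTRIBUTE_VALUES",
                 "ATTRIBUTE_PROFILES")

def build_similarity_args(input_features, candidate_features, output_features,
                          number_of_results, attributes_of_interest,
                          fields_to_append=(), collapse="NO_COLLAPSE",
                          most_or_least="MOST_SIMILAR",
                          match_method="ATTRIBUTE_VALUES"):
    """Hypothetical helper: validate keywords and return positional arguments
    in the order given by the tool syntax."""
    if collapse not in COLLAPSE_OPTIONS:
        raise ValueError("Collapse_Output_To_Points: %r" % collapse)
    if most_or_least not in MOST_OR_LEAST_OPTIONS:
        raise ValueError("Most_Or_Least_Similar: %r" % most_or_least)
    if match_method not in MATCH_METHODS:
        raise ValueError("Match_Method: %r" % match_method)
    if number_of_results < 0:
        raise ValueError("Number_Of_Results must be zero or positive")
    if not attributes_of_interest:
        raise ValueError("At least one attribute of interest is required")

    args = [input_features, candidate_features, output_features, collapse,
            most_or_least, match_method, number_of_results,
            ";".join(attributes_of_interest)]
    if fields_to_append:
        # Optional parameter: only passed when fields were requested.
        args.append(";".join(fields_to_append))
    return tuple(args)

args = build_similarity_args(
    "Crime_selection", "AllCrime", "CJMatches", 4,
    ["HEIGHT", "WEIGHT", "SEVERITY", "DST2CHPSHP"], ["Name", "WEAPON"])
print(args)
```

With arcpy available, the resulting tuple could be unpacked into the tool call, for example SS.SimilaritySearch(*args), which reproduces the call in example 2.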

Environments

Related Topics

Licensing Information

ArcGIS for Desktop Basic: Yes
ArcGIS for Desktop Standard: Yes
ArcGIS for Desktop Advanced: Yes
8/26/2014