Crawled Properties
Crawled properties are metadata that is extracted from content sources to make the data available for searching. Crawled properties are typically reported by the Content SSA or other FAST Search Server 2010 for SharePoint connectors, but can also be created during item processing by an IFilter or a property extractor.
A crawled property is uniquely defined by the parameters of Name, Propset, and VariantType.
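As a minimal sketch of what this uniqueness means, the three parameters can be modeled as a composite key. The class name CrawledPropertyKey, the placeholder property set GUID, and the variant type codes below are assumptions made for illustration; this is not the FAST Search Server 2010 for SharePoint API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawledPropertyKey:
    """Hypothetical identity of a crawled property (illustration only)."""
    name: str
    propset: str       # property set GUID, written as a string
    variant_type: int  # variant type code (illustrative values below)


# Placeholder GUID, not a real property set.
EXAMPLE_PROPSET = "00000000-0000-0000-0000-000000000001"

# Same name and property set, but different variant types:
# these are two distinct crawled properties.
title_text = CrawledPropertyKey("Title", EXAMPLE_PROPSET, 31)
title_int = CrawledPropertyKey("Title", EXAMPLE_PROPSET, 3)

# Because the key is frozen (hashable), it can index a registry.
registry = {
    title_text: "string-valued Title",
    title_int: "integer-valued Title",
}
```

Changing any one of the three parameters yields a different key, so the registry above holds two separate entries.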
Two specific managed properties are populated with the crawled property names and values discovered for the given item, as follows:
crawledpropertynames: Holds discovered crawled properties that have a value for a specified item.
crawledpropertiescontent: Holds the value of every crawled property in crawledpropertynames.
Some discovered crawled properties are not mapped into these managed properties, because automatically indexing the content of every discovered crawled property has a disadvantage: not all content is relevant for searching. For example, a crawled property might expose sensitive information or contain data that can adversely affect relevance or recall. Whether a crawled property maps to crawledpropertiescontent is determined as follows:
The crawled property has variant types that map to a string or list of strings.
Crawled properties that are known to provide unwanted content in the search index are excluded by setting their IsMappedToContents property to False.
Because every crawled property belongs to a category (determined by its Propset), the category has a Boolean property (MapToContents) that sets the default value of the IsMappedToContents property of new crawled properties.
So, if the crawled property is a string and its IsMappedToContents property is True, the content of the crawled property should be searchable in crawledpropertiescontent.
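The rule above can be sketched as follows. This is a simplified model under stated assumptions, not the product's object model: the Python class and function names, the example property names, and the set of variant type codes treated as strings are all hypothetical, since the text does not enumerate them.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed variant type codes that count as a string or a list of strings.
# Illustrative only; the documentation does not list them.
STRING_VARIANT_TYPES = {8, 30, 31, 4127}


@dataclass
class Category:
    name: str
    map_to_contents: bool  # default for new crawled properties


@dataclass
class CrawledProperty:
    name: str
    variant_type: int
    is_mapped_to_contents: bool


def create_crawled_property(
    category: Category,
    name: str,
    variant_type: int,
    is_mapped_to_contents: Optional[bool] = None,
) -> CrawledProperty:
    """Create a property that inherits the category default unless overridden."""
    if is_mapped_to_contents is None:
        is_mapped_to_contents = category.map_to_contents
    return CrawledProperty(name, variant_type, is_mapped_to_contents)


def maps_to_contents(prop: CrawledProperty) -> bool:
    """True when the property is string-typed and flagged for mapping."""
    return prop.variant_type in STRING_VARIANT_TYPES and prop.is_mapped_to_contents


# Hypothetical properties in a category whose default is True.
office = Category("Office", map_to_contents=True)
author = create_crawled_property(office, "Author", 31)        # string, default flag
page_count = create_crawled_property(office, "PageCount", 3)  # not a string type
internal_id = create_crawled_property(                        # explicitly excluded
    office, "InternalId", 31, is_mapped_to_contents=False
)
```

In this model only author would be searchable through crawledpropertiescontent: page_count fails the type check, and internal_id was excluded explicitly despite the category default.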
Each crawled property belongs to a crawled property category, which is a high-level grouping of crawled properties based on the IFilter and content source that is used to extract the metadata from the content.
The following are examples of categories:
Business Data: Metadata that is associated with content retrieved by using the Business Data Connectivity (BDC) service.
Mail: Metadata that is associated with Microsoft Exchange Server.
Office: Metadata that is contained in Microsoft Office documents such as Microsoft Word, Microsoft Excel, and Microsoft PowerPoint.
People: Metadata that is associated with the people profiles in SharePoint Server 2010. The majority of these are also mapped to various managed properties from Active Directory and SharePoint information.
Web: HTML metadata that is associated with web pages.
A crawled property category may contain multiple property sets. Table 1 describes the interfaces that are related to crawled properties.
Interface | Description |
---|---|
CrawledProperty | Specifies a crawled property. |
Category | You can use the Category interface to specify default mapping behavior that is common to all crawled properties within the category. You can use the AllCategories property of the Schema interface to retrieve a collection of property categories. You can retrieve a collection of CrawledProperty objects for a given category by using the Category.GetAllCrawledProperties method. You can create a crawled property by using the Category.CreateCrawledProperty method. |
ManagedProperty | Managed properties are metadata that can be searched or retrieved in query results. You can retrieve a collection of CrawledProperty objects that represent the crawled properties mapped to a specific managed property by using the ManagedProperty.GetMappedCrawledProperties method. You can configure crawled property mappings by using the ManagedProperty.SetCrawledPropertyMappings method. |