我创建了一个类来保存每个对象的id和文本(在我的例子中,我称它为sku(项目编号)和描述)。这将创建一个较小的对象,该对象使用的内存较少,因为它仅用于搜索。在找到匹配项后,我仍然会从数据库中获取完整的对象。
public class SmallItem
{
private int _sku;
public int Sku
{
get { return _sku; }
set { _sku = value; }
}
// Size of max description size + 1 for null terminator.
private char[] _description = new char[36];
public char[] Description
{
get { return _description; }
set { _description = value; }
}
public SmallItem()
{
}
}
以下是自定义包含函数的示例:
public static int[] Contains(string[] descriptionTerms, int maxResults, List<SmallItem> itemList)
{
// Don't allow more than the maximum allowable results constant.
int[] matchingSkus = new int[maxResults];
// Indexes and counters.
int matchNumber = 0;
int currentWord = 0;
int totalWords = descriptionTerms.Count() - 1; // - 1 because it will be used with 0 based array indexes
bool matchedWord;
try
{
/* Character array of character arrays. Each array is a word we want to match.
* We need the + 1 because totalWords had - 1 (We are setting a size/length here,
* so it is not 0 based... we used - 1 on totalWords because it is used for 0
* based index referencing.)
* */
char[][] allWordsToMatch = new char[totalWords + 1][];
// Character array to hold the current word to match.
char[] wordToMatch = new char[36]; // Max allowable word size + null terminator... I just picked 36 to be consistent with max description size.
// Loop through the original string array or words to match and create the character arrays.
for (currentWord = 0; currentWord <= totalWords; currentWord++)
{
char[] desc = new char[descriptionTerms[currentWord].Length + 1];
Array.Copy(descriptionTerms[currentWord].ToUpper().ToCharArray(), desc, descriptionTerms[currentWord].Length);
allWordsToMatch[currentWord] = desc;
}
// Offsets for description and filter(word to match) pointers.
int descriptionOffset = 0, filterOffset = 0;
// Loop through the list of items trying to find matching words.
foreach (SmallItem i in itemList)
{
// If we have reached our maximum allowable matches, we should stop searching and just return the results.
if (matchNumber == maxResults)
break;
// Loop through the "words to match" filter list.
for (currentWord = 0; currentWord <= totalWords; currentWord++)
{
// Reset our match flag and current word to match.
matchedWord = false;
wordToMatch = allWordsToMatch[currentWord];
// Delving into unmanaged code for SCREAMING performance ;)
unsafe
{
// Pointer to the description of the current item on the list (starting at first char).
fixed (char* pdesc = &i.Description[0])
{
// Pointer to the current word we are trying to match (starting at first char).
fixed (char* pfilter = &wordToMatch[0])
{
// Reset the description offset.
descriptionOffset = 0;
// Continue our search on the current word until we hit a null terminator for the char array.
while (*(pdesc + descriptionOffset) != '\0')
{
// We've matched the first character of the word we're trying to match.
if (*(pdesc + descriptionOffset) == *pfilter)
{
// Reset the filter offset.
filterOffset = 0;
/* Keep moving the offsets together while we have consecutive character matches. Once we hit a non-match
* or a null terminator, we need to jump out of this loop.
* */
while (*(pfilter + filterOffset) != '\0' && *(pfilter + filterOffset) == *(pdesc + descriptionOffset))
{
// Increase the offsets together to the next character.
++filterOffset;
++descriptionOffset;
}
// We hit matches all the way to the null terminator. The entire word was a match.
if (*(pfilter + filterOffset) == '\0')
{
// If our current word matched is the last word on the match list, we have matched all words.
if (currentWord == totalWords)
{
// Add the sku as a match.
matchingSkus[matchNumber] = i.Sku.ToString();
matchNumber++;
/* Break out of this item description. We have matched all needed words and can move to
* the next item.
* */
break;
}
/* We've matched a word, but still have more words left in our list of words to match.
* Set our match flag to true, which will mean we continue continue to search for the
* next word on the list.
* */
matchedWord = true;
}
}
// No match on the current character. Move to next one.
descriptionOffset++;
}
/* The current word had no match, so no sense in looking for the rest of the words. Break to the
* next item description.
* */
if (!matchedWord)
break;
}
}
}
}
};
// We have our list of matching skus. We'll resize the array and pass it back.
Array.Resize(ref matchingSkus, matchNumber);
return matchingSkus;
}
catch (Exception ex)
{
// Handle the exception
}
}
一旦获得匹配SKU的列表,就可以遍历数组并构建一个仅返回匹配SKU的查询命令。
-
搜索约171000个项目
-
创建所有匹配项的列表
-
查询数据库,仅返回匹配项
-
构建完整的项目(类似于SmallItem类,但有更多字段)
-
用完整的blow项对象填充datagrid。
我也尝试过在没有非托管代码的情况下使用String.IndexOf(并尝试了String.Contains…与IndexOf的性能相同)。那条路慢多了。。。大约25秒。
我还尝试使用StreamReader和包含[Sku编号]|[Description]行的文件。该代码类似于非托管代码示例。整个扫描过程用了大约15秒。速度不算太差,但也不算太好。与我向您展示的方式相比,file and StreamReader方法有一个优势。文件可以提前创建。我向您展示的方式需要内存和应用程序启动时加载列表的初始时间。对于我们的171000件商品,这大约需要2分钟。如果你可以在每次应用启动时等待初始加载(当然可以在单独的线程上完成),那么用这种方法搜索是最快的方法(至少我已经找到了)。
希望有帮助。
PS-感谢Dolch对一些非托管代码的帮助。