【Question Title】: Count occurrences of a whole word in a string
【Posted】: 2015-09-14 04:55:01
【Question】:

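I want to find the number of occurrences of a particular word in a string.

I have searched online and found many similar answers,

but none of them gives me an accurate result.

What I want is:

Input: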

I have asked the question in StackOverflow. Therefore i can expect answer here.

Output for the keyword "The":

The keyword count: 2

Note: the "The" inside "Therefore" should not be counted.

Basically, I want to match the whole word and get the count.

【Comments】:

    Tags: c# asp.net


    【Solution 1】:

    Try it like this (approach 1):

    // Note: the padded pattern " the " only matches when the word has a space on both sides.
    string SpecificWord = " the ";
    string sentence = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
    int count = 0;
    foreach (Match match in Regex.Matches(sentence, SpecificWord, RegexOptions.IgnoreCase))
    {
        count++;
    }
    Console.WriteLine("{0} Found {1} Times", SpecificWord, count);
    

    Or like this (approach 2):

    string SpecificWord = " the ";
    string sentence = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
    int WordPlace = sentence.IndexOf(SpecificWord);
    Console.WriteLine(sentence);
    int TimesRep;
    for (TimesRep = 0; WordPlace > -1; TimesRep++)
    {
        sentence = (sentence.Substring(0, WordPlace) +sentence.Substring(WordPlace +SpecificWord.Length)).Replace("  ", " ");
        WordPlace = sentence.IndexOf(SpecificWord);
    }
    
    Console.WriteLine("this word Found " + TimesRep + " time");
    

    【Comments】:

      【Solution 2】:

      Try this; it also works for structured data.

          var splitStr = inputStr.Split(' ');
          int result_count = splitStr.Count(str => str.Contains("userName"));
      

      【Comments】:

        【Solution 3】:

        How about this (it seems to be more efficient than the other solutions):

        public static int CountOccurences(string haystack, string needle)
        {
            return (haystack.Length - haystack.Replace(needle, string.Empty).Length) / needle.Length;
        }
        

        【Comments】:

          【Solution 4】:

          There are many ways to count occurrences of a whole word in a string.
          For example:

          First:

          string name = "pappu kumar sdffnsd sdfnsdkfbsdf sdfjnsd fsdjkn fsdfsd sdfsd pappu kumar";
          var res= name.Contains("pappu kumar");
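          // The string[] overload of Split is used because it is available on every .NET version.
          var splitval = name.Split(new[] { "pappu kumar" }, StringSplitOptions.None).Length - 1;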
          

          Second:

          var r = Regex.Matches(name, "pappu kumar").Count;
          

          【Comments】:

            【Solution 5】:

            Try it like this:

            string Text = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
            Text = Text.ToLower();
            Dictionary<string, int> frequencies = new Dictionary<string, int>();
            // Split on runs of non-word characters to get the individual words.
            string[] words = Regex.Split(Text, "\\W+");
            foreach (string word in words)
            {
                // The trailing "." leaves an empty last element; skip it.
                if (word.Length == 0) continue;
                if (frequencies.ContainsKey(word))
                {
                    frequencies[word] += 1;
                }
                else
                {
                    frequencies[word] = 1;
                }
            }

            // Print every word together with its frequency.
            foreach (KeyValuePair<string, int> entry in frequencies)
            {
                string word = entry.Key;
                int frequency = entry.Value;
                Response.Write(word + "," + frequency + "<br/>");
            }
            

            To search for a specific word, try this:

            string Text = "I have asked the question in StackOverflow. Therefore the i can expect answer here.";
            Text = Text.ToLower();
            string searchtext = "the";
            searchtext = searchtext.ToLower();
            int count = 0; // was missing in the original snippet
            string[] words = Regex.Split(Text, "\\W+");
            foreach (string word in words)
            {
                // Only exact word matches are counted, so "therefore" is ignored.
                if (searchtext.Equals(word))
                {
                    count = count + 1;
                }
            }
            Response.Write(count);
            

            【Comments】:

              【Solution 6】:

              Try it like this:

              var searchText=" the ";
              var input="I have asked the question in StackOverflow. Therefore i can expect answer here.";
              var arr=input.Split(new char[]{' ','.'});
              var count=Array.FindAll(arr, s => s.Equals(searchText.Trim())).Length;
              Console.WriteLine(count);
              

              DOTNETFIDDLE

              Edit

              For searching a phrase within the sentence:

              var sentence = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
              var searchText = "have asked";
              char[] split = new char[] { ',', ' ', '.' };
              var splitSentence = sentence.ToLower().Split(split);
              var splitText = searchText.ToLower().Split(split);
              Console.WriteLine("Search Sentence {0}", splitSentence.Length);
              Console.WriteLine("Search Text {0}", splitText.Length);
              var count = 0;
              for (var i = 0; i < splitSentence.Length; i++)
              {
                  if (splitSentence[i] == splitText[0])
                  {
                      var index = i;
                      var found = true;
                      var j = 0;
                      for (j = 0; j < splitText.Length; j++)
                      {
                          // Stop if the sentence ends before the whole phrase has been compared.
                          if (index >= splitSentence.Length || splitSentence[index++] != splitText[j])
                          {
                              found = false;
                              break;
                          }
                      }
                      if (found)
                      {
                          Console.WriteLine("Index J {0} ", j);
                          count++;
                          // Skip past the words that were just matched.
                          i = index > i ? index - 1 : i;
                      }
                  }
              }
              Console.WriteLine("Total found {0} substring", count);
              

              DOTNETFIDDLE

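              【Comments】:

              • Thanks for posting an answer, but it gives me a count of 2.
              • This will not work if my keyword contains a space.
              • Give me some of the keywords you want to search for.
              • Consider the same example: "The" should give 1, and "I can" should also give 1.
              【Solution 7】: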
              string input = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
              string pattern = @"\bthe\b";
              var matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);
              Console.WriteLine(matches.Count);
              

              See the regular expression Anchors documentation for "\b".
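The `\b` pattern above can be wrapped in a small reusable helper. What follows is a minimal sketch, not part of the original answer: the names `WordCounter` and `CountWholeWord` are made up for illustration. It assumes the keyword starts and ends with a letter, digit, or underscore, because `\b` only matches between a word character and a non-word character. `Regex.Escape` is applied so that any regex metacharacters in the keyword are taken literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordCounter
{
    // Hypothetical helper: counts case-insensitive whole-word matches of keyword in text.
    public static int CountWholeWord(string text, string keyword)
    {
        // Escape the keyword, then require a word boundary on both sides.
        string pattern = @"\b" + Regex.Escape(keyword) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        string input = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
        Console.WriteLine(CountWholeWord(input, "the"));   // prints 1 ("Therefore" is not counted)
        Console.WriteLine(CountWholeWord(input, "i can")); // prints 1 (multi-word keyword)
    }
}
```

With this sketch, both keywords raised in the comments on Solution 6 ("The" and "I can") give a count of 1 for the sample sentence.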

              【Comments】:

                【Solution 8】:

                This solution should work wherever the word appears in the string:

                var str = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
                var numMatches = Regex.Matches(str.ToUpper(), "THE")
                    .Cast<Match>()
                    .Count(match => 
                        (match.Index == 0 || str[match.Index - 1] == ' ') && 
                        (match.Index + match.Length == str.Length || 
                            !Regex.IsMatch(
                                str[match.Index + match.Length].ToString(),
                                "[a-zA-Z]")));
                

                .NET Fiddle

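                【Comments】:

                  【Solution 9】:

                  The problem is not as simple as you might think; several issues need attention, such as punctuation, letter case, and how to identify word boundaries. Using the N_Gram concept, though, I offer the following solution:

                  1- Determine how many words the key contains. Call it N.

                  2- Extract all sequences of N consecutive words (N_Grams) from the text.

                  3- Count the occurrences of the key among the N_Grams.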

                      string text = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
                      string key = "the question";
                      // N is the number of words in the key.
                      int gram = key.Split(' ').Length;
                      var parts = text.Split(' ');
                      List<string> n_grams = new List<string>();
                      // Build every run of N consecutive words, joined by a single space.
                      for (int i = 0; i <= parts.Length - gram; i++)
                      {
                          n_grams.Add(string.Join(" ", parts, i, gram));
                      }

                      // The result
                      int count = n_grams.Count(p => p == key);
                  

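                  For example, for key = the question, and treating a single space as the word boundary, the following bigrams are extracted:

                  I have
                  have asked
                  asked the
                  the question
                  question in
                  in StackOverflow.
                  StackOverflow. Therefore
                  Therefore i
                  i can
                  can expect
                  expect answer
                  answer here.

                  The number of occurrences of "the question" in the text is then evident: 1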

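                  【Comments】:

                    【Solution 10】:

                    You can use a while loop: search for the index of the first occurrence, then search again starting from the found index + 1, and increment a counter at the end of each iteration. The while loop continues until index == -1.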
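The loop described above can be sketched as follows. This is only an illustration under stated assumptions: the names `IndexOfCounter` and `CountWithIndexOf` are hypothetical, and the whole-word check (a hit counts only when it is not surrounded by letters or digits) is an addition, since the answer itself describes a plain substring search.

```csharp
using System;

public static class IndexOfCounter
{
    // Hypothetical helper: counts whole-word occurrences using repeated IndexOf calls.
    public static int CountWithIndexOf(string text, string word)
    {
        if (string.IsNullOrEmpty(word)) return 0;

        int count = 0;
        int index = text.IndexOf(word, StringComparison.OrdinalIgnoreCase);
        while (index != -1)
        {
            int end = index + word.Length;
            // Assumption: a hit is a whole word when no letter or digit touches it on either side.
            bool startOk = index == 0 || !char.IsLetterOrDigit(text[index - 1]);
            bool endOk = end == text.Length || !char.IsLetterOrDigit(text[end]);
            if (startOk && endOk) count++;

            // Search again, starting one position after the last hit.
            index = text.IndexOf(word, index + 1, StringComparison.OrdinalIgnoreCase);
        }
        return count;
    }

    public static void Main()
    {
        string input = "I have asked the question in StackOverflow. Therefore i can expect answer here.";
        Console.WriteLine(CountWithIndexOf(input, "the")); // prints 1
    }
}
```

Dropping the two boundary checks gives the plain substring count the answer describes, which would also count the "The" in "Therefore".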

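                    【Comments】:

                      【Solution 11】:

                      One possible solution is to use a regular expression:

                      // A verbatim string is required: in a regular string literal, "\b" is a backspace character.
                      var count = Regex.Matches(input.ToLower(), String.Format(@"\b{0}\b", "the")).Count;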
                      

                      【Comments】:

                      • The regex fails if the word is "R[". I tested this while trying to replicate Notepad++.
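The comment above points at a real limitation: a keyword such as "R[" contains a regex metacharacter, so an unescaped pattern built from it is not a valid regular expression. Even after escaping, `\b` does not help after `[`, because `[` is not a word character. A possible workaround, offered here as a sketch rather than as the answerer's method, is to escape the keyword and replace `\b` with lookarounds that only require that no word character touches the match. The names `LiteralWordCounter` and `CountLiteral` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralWordCounter
{
    // Hypothetical helper: counts occurrences of keyword that are not glued to a word character.
    public static int CountLiteral(string text, string keyword)
    {
        // (?<!\w) = no word character right before, (?!\w) = none right after.
        string pattern = @"(?<!\w)" + Regex.Escape(keyword) + @"(?!\w)";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        // "XR[" is skipped because a word character precedes the match.
        Console.WriteLine(CountLiteral("R[ is odd, but R[ and r[ are found; XR[ is not", "R[")); // prints 3
    }
}
```

For keywords made only of letters and digits, this behaves the same as the `\b` version.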